Regex101

2023.05.27 19:40 code_hunter_cc Apache mod_alias RedirectMatch everything except specific pattern

Apache
Good old regular expressions are driving me nuts.
I need to redirect all traffic in Apache 2.4 from HTTP to HTTPS, except for "/bt/sub/[a_few_endings]", using Redirect from mod_alias (can't use mod_rewrite).
I tested the following regular expression in all online testers I know (e.g. http://regex101.com/) and all confirm that the regex should indeed match everything except the URLs I don't want it to match:
^/(?!bt/sub/(went_active|success|cancel|expired)).*$
As far as I can tell, this should match everything in http://local.mysite.com and redirect it to https://local.mysite.com, except for the following four:
Still, Apache redirects everything, including the above URLs I don't want redirected.
I found several similar questions in SO but most of them are answered in the light of mod_rewrite, which is not what I want/need, and the ones that people say have worked have not worked for me.
Here's my virtual host configuration as it currently stands:
ServerName local.mysite.com
RedirectMatch 302 ^/(?!bt/sub/(went_active|success|cancel|expired)).*$ https://local.mysite.com
DocumentRoot /home/borfast/projects/www/mysite/public
#Header set Access-Control-Allow-Origin *
SetEnv LARAVEL_ENV localdev
Options All
DirectoryIndex index.php
AllowOverride All
Require all granted
Please help and prevent me from going crazy :)
UPDATE: There's something weird going on: apparently when the requested URL/path can be found, Apache ignores the expression in RedirectMatch and redirects the client, even though the RedirectMatch tells it not to.
To test this I created a new virtual host from scratch inside a separate VM freshly installed with Ubuntu Trusty 64, loaded with Apache 2.4. This new virtual host contained just the ServerName, RedirectMatch and DocumentRoot directives, like this:
ServerName testing.com
RedirectMatch 302 ^/(?!bt/sub/(went_active|success)$).*$ https://othersite.com/
DocumentRoot /home/vagrant/www
I created the directory /home/vagrant/www/bt/sub/went_active to make sure Apache could get to at least one of the two possible URLs. When trying to access http://testing.com:8080, I get redirected, as expected.
Then the weirdness comes: when accessing http://testing.com:8080/bt/sub/went_active, the URL that matches the directory I created, I am still redirected, even though I shouldn't be, but when accessing http://testing.com:8080/bt/sub/success, I don't get redirected and instead get a 403 Forbidden.
I may be losing my sanity over this but it seems that when Apache sees that it could serve the request and it matches the regular expression in RedirectMatch that should prevent the redirect, it decides to ignore the regular expression and do the redirect anyway. Three letters for this: WTF?!?!?!
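For reference, the intended alternation can be checked outside Apache. This is a minimal sketch in Python, assuming the patterns originally contained `|` between the endings (the pipes appear to have been lost in the post) and that Python's `re` treats this lookahead the same way PCRE does. It says nothing definitive about how Apache evaluates RedirectMatch.

```python
import re

# Assumed reconstruction of the two patterns from the post, with "|" restored.
LOOSE = re.compile(r"^/(?!bt/sub/(went_active|success|cancel|expired)).*$")
STRICT = re.compile(r"^/(?!bt/sub/(went_active|success)$).*$")


def would_redirect(pattern: re.Pattern, path: str) -> bool:
    """Return True when the pattern matches, i.e. RedirectMatch would fire."""
    return pattern.match(path) is not None


for path in ["/", "/bt/sub/success", "/bt/sub/went_active", "/bt/sub/went_active/"]:
    print(path, would_redirect(LOOSE, path), would_redirect(STRICT, path))
```

Note that the pattern with `$` inside the lookahead matches `/bt/sub/went_active/` (with a trailing slash). If Apache first appends a trailing slash to a path that maps to an existing directory, that might explain why only the URL with a real directory behind it was redirected in the second test, but that is a guess rather than a confirmed diagnosis.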
Answer link : https://codehunter.cc/a/apache/apache-mod-alias-redirectmatch-everything-except-specific-pattern
submitted by code_hunter_cc to codehunter [link] [comments]


2023.05.22 22:01 MikeMichalko Help with pulling multiple values out of one cell in KQL using REGEX

I have a table in KQL (Sentinel) with one column containing several values that start with "cpe:/a:", followed by the data I want to extract, ending with a space. I attempted to use the regex (cpe:\/a:) (.*?=) (?=\s), which checks out OK in Regex101.
I tried using double backslashes as recommended for escape characters by MS. When I do that, I get a message that the syntax of my query is invalid, with the correct regex given.
I'm attempting to use extract_all() to pull out all the answers and mv-expand to break them into separate rows. I am getting a message that argument #2 must be a dynamic array.
I suspect this may be confusing. If you want any further information, tell me what you need and I'll do my best to supply it.
Thanks in advance for any help you can provide. I'm not married to REGEX. If there is a better way to approach this, LMK.
Mike
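The "find every value, then expand into rows" idea can be sketched in Python. This is not KQL, and the cell contents below are hypothetical because the real column values are not shown in the post. The pattern uses a single capture group that runs from the prefix to the next whitespace.

```python
import re

# Hypothetical cell contents; the real data is not shown in the post.
cell = "cpe:/a:openssl:openssl:1.1.1 cpe:/a:apache:http_server:2.4.54 other text"

# One capture group: everything after the prefix up to the next whitespace.
CPE_VALUE = re.compile(r"cpe:/a:(\S+)")

# findall returns the captured text of every non-overlapping match.
values = CPE_VALUE.findall(cell)

# One row per extracted value, analogous to expanding an array into rows.
rows = [{"cpe": value} for value in values]
print(rows)
```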
submitted by MikeMichalko to AzureSentinel [link] [comments]


2023.05.18 22:18 J_K_M_A_N Another weird one I am not sure is possible - Trying to get an "Alternate Code" from an order

So, here is a sample of the data.
1 60ea ABC A1234-16-32 Description here 8.88/ea 532.80
Possible Extended Description here - do not need this
UPC: 1234567890
2 20ea DEF 866 1562PL Description here 4.44/ea 88.80
UPC: 2234567890
3 10ea GHI 34-12-66-12 Description here 2.22/ea 22.20
Possible extended description
The first number is the line number. I do not care about that. The next is the quantity. I want that. Then is a manufacturer code the customer uses (ABC or DEF or GHI). They are always the same for each manufacturer. After that code is a manufacturer part number. The problem I am running into is, one manufacturer has possible spaces in it (well, a maximum of 1 space) but they always end with PL, CP or EG (some others too but I am simplifying). The other codes COULD end with PL, CP or EG but they may not and they will not have a space. Here is what I have for the items without a space.
^\d+\s(?\d+?)(?:EARLBX)\s(?:(:ABCDEFGHI) (?.*?) )?(?.*?) (?\d+\.\d+)\/ea \d+\.\d+(?:(?:(?!(?:^\d+\s\d+eaUPC:))(?:.\n))+)?(?:UPC:\s?(?(?!^).*?))?$ 
https://regex101.com/p0gUKY/1
I am not sure how to allow up to 1 space on the code for DEF and capture until it sees the PL, CP or EG. I know I will need something like this maybe: (?:PLCPEG)? but I am not sure how to handle it if it is one of the others that won't end in that (I need to capture the PL, CP and EG as part of the code).
Hopefully I explained that well enough that someone could come up with an answer. Thanks for looking.
submitted by J_K_M_A_N to regex [link] [comments]


2023.05.18 16:33 4bjmc881 Utilizing regex groups correctly in C

I am trying to get capture groups working properly in C. I am pretty comfortable with how regex works, but for some reason, I can't get capture groups to work like I expect them to. Surely, it is me, doing something wrong, but I don't quite know what. See the following example:
#include <stdio.h>
#include <stdlib.h>
#include <regex.h>

void extract_info(char* input, const char *pattern);

char* input1 = "8096 SHA256:RoWJBGWIRLhNM01nNAVtMKN+b6AoKc7IzIBLjESd3Lc [email protected] (RSA)";
char* input2 = "3072 SHA256:3KM6bS6rLN2ErpKZ/q6w9ofPoclxC1NIQas3ngZoR1A no comment (RSA)";
char* input3 = "1024 SHA256:5alMYgak0SlJOljkZCrSYhQbu5RpmsFtv3aSx+2irNU [email protected] (DSA)";
char* input4 = "256 SHA256:lx/aoiwuLcHSmEwk5+gfokM+6BJ1HLPRbAh9ItgDWNs [email protected] (ED25519)";

int main(int argc, char** argv) {
    char* pattern2 = "^(\\d+)\\s\\S+:\\S+\\s.*\\((\\S+)\\)$";
    extract_info(input1, pattern2);
    return 0;
}

void extract_info(char* input, const char *pattern) {
    regex_t regex;
    regmatch_t match[2];

    // Compile regex
    int rc = regcomp(&regex, pattern, REG_EXTENDED);
    if (rc != 0) {
        exit(1);
    }

    // Run regex
    if (regexec(&regex, input, 2, match, 0) == 0) {
        printf("Match!\n");
        // ToDo: Print capture groups
    }
    regfree(&regex);
}
I would like to capture the first number from every input, and the algorithm in the brackets at the end. Using the same pattern on regex101 for example, works perfectly fine. But in C, I don't get any match in the first place.
The expected output would be two capture groups for every input like so:
input1: 8096 and RSA
input2: 3072 and RSA
input3: 1024 and DSA
input4: 256 and ED25519
What exactly am I doing wrong?
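One likely cause, offered as a guess: POSIX extended regular expressions do not define `\d`, and `\s`/`\S` are extensions that only some implementations accept, so a pattern that works on regex101 may never match under `regcomp`. Separately, `match[2]` with `nmatch` set to 2 only has room for the whole match plus one group, so two capture groups would need three slots. The sketch below is in Python rather than C and only checks a pattern restricted to bracket expressions and literal spaces, constructs that POSIX ERE also accepts. The name `PORTABLE` is made up for illustration.

```python
import re

# Only bracket expressions, literal spaces and escaped parentheses:
# constructs shared by POSIX ERE and Python's re module.
PORTABLE = r"^([0-9]+) [^ ]+:[^ ]+ .*\(([^ ]+)\)$"

inputs = [
    "8096 SHA256:RoWJBGWIRLhNM01nNAVtMKN+b6AoKc7IzIBLjESd3Lc [email protected] (RSA)",
    "3072 SHA256:3KM6bS6rLN2ErpKZ/q6w9ofPoclxC1NIQas3ngZoR1A no comment (RSA)",
    "1024 SHA256:5alMYgak0SlJOljkZCrSYhQbu5RpmsFtv3aSx+2irNU [email protected] (DSA)",
    "256 SHA256:lx/aoiwuLcHSmEwk5+gfokM+6BJ1HLPRbAh9ItgDWNs [email protected] (ED25519)",
]


def extract_info(line):
    """Return (key size, algorithm) or None when the line does not match."""
    m = re.match(PORTABLE, line)
    return (m.group(1), m.group(2)) if m else None


for line in inputs:
    print(extract_info(line))
```

If the same pattern string is passed to `regcomp` with `REG_EXTENDED`, the match array would still need to be declared with three elements to receive both groups.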
submitted by 4bjmc881 to C_Programming [link] [comments]


2023.05.18 12:43 MeIsALaugher Stack or Better: An *Actual* Negative Lookbehind with Boolean OR

Yes, I had a similar problem at https://www.reddit.com/AutoModeratocomments/13kq513/stumped_regex_negative_lookbehind_with_boolean_o?utm_source=share&utm_medium=web2x&context=3 and u/001Guy001's solution did work. I now have an ever so slightly different problem. This time it's with a negative lookbehind that won't let me use the Boolean OR: (?https://regex101.com/Ykr33Y/10. I'm trying to match:
without matching:
TL;DR: Stacking them seems to work but it doesn't look like it should work. Is there a better way of accomplishing this task?
submitted by MeIsALaugher to AutoModerator [link] [comments]


2023.05.18 08:03 MeIsALaugher Stumped: Regex Negative Lookbehind With Boolean OR in Automod?

So, I'm trying to do something like (?!letsgo\.shop\.advertise\.static\.www\.)((\w-)+\.)(tumblr\.comtmblr\.co)\/?(?!\S) in my subreddit's automoderator but regex101 keeps matching links I don't want to match like https://shop.tumblr.com/. You can see the regex in action at https://regex101.com/SKaOqZ/12.
Is regex101 buggy or is there an alternative?
Edit: The title is supposed to say Lookahead but I wouldn't mind using a lookbehind, instead. I've tried switching to negative lookbehind, ((\w-)+\.)(? Edit: I changed it to body+title (regex, includes):.
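A possible explanation, sketched in Python (AutoModerator regexes are Python-flavored): a negative lookahead only blocks a match that starts at that exact position, so the engine can simply start one character later, at "hop.tumblr.com". The patterns below are an assumed reconstruction of the one in the post, with `|` restored between the alternatives and `[\w-]+` for the subdomain; adding a lookbehind that forbids starting in the middle of a label is one way to close the gap.

```python
import re

# Assumed reconstruction: alternation restored, character class for the label.
EXCLUDED = r"(?:letsgo|shop|advertise|static|www)\."
TAIL = r"[\w-]+\.(?:tumblr\.com|tmblr\.co)/?(?!\S)"

# Lookahead only: the match may begin in the middle of "shop".
unanchored = re.compile(r"(?!" + EXCLUDED + r")" + TAIL)

# Extra lookbehind: the match must begin at the start of a label.
anchored = re.compile(r"(?<![\w-])(?!" + EXCLUDED + r")" + TAIL)


def first_match(pattern, text):
    """Return the matched text, or None when nothing matches."""
    m = pattern.search(text)
    return m.group(0) if m else None


print(first_match(unanchored, "https://shop.tumblr.com/"))
print(first_match(anchored, "https://shop.tumblr.com/"))
print(first_match(anchored, "https://myblog.tumblr.com/"))
```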
submitted by MeIsALaugher to AutoModerator [link] [comments]


2023.05.11 22:14 BigfootWhiteBoy Regex return all matches

When using regex101 in golang mode there is a global flag to select (returning all matches)
My expression uses a Boolean OR (pipe) for two discrete searches over the same text and should return two matches per string of text.
The functionality of the regex is perfect on regex101 environment.
When I copy the line to go implement it, the search is not global and only returns the first match.
How do I maintain the global functionality of the expression outside of the regex101 environment?
Any help would be much appreciated.
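The "global" flag in a tester is usually a setting of the tool rather than part of the pattern, so in code it tends to become a choice between a "find first" and a "find all" call. The sketch below shows that distinction in Python with a hypothetical two-branch pattern (the poster's expression is not shown). In Go, the corresponding call would presumably be one of the `FindAll...` methods of the `regexp` package, such as `FindAllString(text, -1)`.

```python
import re

# Hypothetical two-branch pattern; the actual expression is not in the post.
pattern = re.compile(r"id=\d+|user=\w+")
text = "id=42 user=alice"

# "Find first": stops after the first match, like a non-global search.
first_only = pattern.search(text).group(0)

# "Find all": returns every non-overlapping match, like the global flag.
all_matches = pattern.findall(text)

print(first_only)
print(all_matches)
```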
submitted by BigfootWhiteBoy to golang [link] [comments]


2023.05.10 21:51 chad917 Regextract woes

I'm trying to extract pagenames from hyperlinks in a set of cells. There are multiple hyperlinks in most, and the number varies between 1-20+ links per cell contents.
I have a pattern working in the regex101 builder, (?<=\/products\/)(.*?)(?=\") though it only matches the first occurrence there (unsure why, but I assume my settings there)
However when I try to use this in a =regexmatch formula in Sheets, it's stumbling over the quotes (translating part of the regex to string?) and I can't figure out how to escape them or if my formula will even work. I've put up a sample sheet, with a sample data in cell A2 and sample output (manually entered) in B2:E2
Can anyone help me figure out how to fix my regex entry so it will run in the Sheet?
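Google Sheets uses the RE2 engine, which does not support lookbehind or lookahead, so a pattern built on `(?<=...)` and `(?=...)` may need to be rewritten around a capture group. The sketch below shows that rewrite in Python; the cell text and URLs are hypothetical, since the sample sheet is not included in the post.

```python
import re

# Hypothetical cell contents with two product links.
cell = (
    '<a href="https://example.com/products/red-shirt">Red</a> '
    '<a href="https://example.com/products/blue-hat">Blue</a>'
)

# No lookarounds: match the prefix literally and capture up to the closing quote.
PAGE_NAME = re.compile(r'/products/([^"]+)"')

names = PAGE_NAME.findall(cell)
print(names)
```

Inside a Sheets formula, a double quote within a string literal is normally written as two double quotes, which may be the part the formula was stumbling over.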
submitted by chad917 to sheets [link] [comments]


2023.05.09 17:16 J_K_M_A_N I think I need a negative lookahead here but I cannot figure it out for sure

I am trying to get a regex for something like this.
1 5ea This is the description 12.234/ea 61.17
extended description may or may not be here
UPC: 1234567890
2 4ea Description goes here 1.12/ea 4.48
extended description may or may not be here
3 2ea Description goes here 4.10/ea 8.20
extended description may or may not be here
UPC: 0987654321
I want something like this.
^\d+ (?\d+?)EA (?.*?) (?\d+\.\d+).*?\d+\.\d+[\s\S]*?UPC: (?.*?)(?:\s$) 
That works for some (https://regex101.com/qYMWFA/1) but it is a problem if they don't have the UPC part (it basically combines two lines). Is it possible to use a negative lookahead or something to still get the quantity, description and price and just have an empty partnum if they don't have the UPC code listed without combining two lines? I tried this (which of course did not work).
^\d+ (?\d+?)EA (?.*?) (?\d+\.\d+).*?\d+\.\d+[\s\S]*?(?!\d+\s\d+)(?:UPC: (?.*?)(?:\s$))? 
I would appreciate any help, especially to let me know if it is not possible. I don't want to keep pulling my hair out. Thanks.
Edit to add: This is what I am hoping to get.
Quantity  Description              Price  UPC
5         This is the description  12.34  1234567890
4         Description goes here    1.12
2         Description goes here    4.10   0987654321
Not sure if it is possible. Also, I do not need the extended description at all but there could be 0, 1 or 2 lines of that before the UPC line (that is the killer part IMO).
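If a single regex is not a hard requirement, one alternative is to walk the text line by line: an item line starts a record, and a later UPC line fills in the most recent record. The sketch below is in Python and assumes the sample is laid out with each item, extended description and UPC on its own line. The names `ITEM`, `UPC` and `parse_order` are made up for illustration.

```python
import re

SAMPLE = """\
1 5ea This is the description 12.234/ea 61.17
extended description may or may not be here
UPC: 1234567890
2 4ea Description goes here 1.12/ea 4.48
extended description may or may not be here
3 2ea Description goes here 4.10/ea 8.20
extended description may or may not be here
UPC: 0987654321
"""

# Item line: line number, quantity, description, unit price, extended price.
ITEM = re.compile(
    r"^\d+ (?P<qty>\d+)ea (?P<desc>.*?) (?P<price>\d+\.\d+)/ea \d+\.\d+$",
    re.IGNORECASE,
)
UPC = re.compile(r"^UPC:\s*(?P<upc>\S+)")


def parse_order(text):
    """Return one dict per item; 'upc' stays empty when no UPC line follows."""
    items = []
    for line in text.splitlines():
        line = line.strip()
        item = ITEM.match(line)
        if item:
            items.append({**item.groupdict(), "upc": ""})
            continue
        upc = UPC.match(line)
        if upc and items:
            items[-1]["upc"] = upc.group("upc")
        # Any other line is treated as extended description and ignored.
    return items


for record in parse_order(SAMPLE):
    print(record)
```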
submitted by J_K_M_A_N to regex [link] [comments]


2023.05.08 22:55 mulberrymerlangius Variable Search Replace

I'm trying to use variable search replace to get the task name, step number, and profile name when I get specific Tasker action errors.
I copied and pasted the error message text to Regex101, so in theory the regex is matching properly. I have search replace set for multiple lines, one match only, and to store matches in an array.
I'm guessing I am experiencing an id10t error, as I can get search replace to work when I'm completing both search and replace.
In this scenario - searching only using regex - the match arrays are empty. What am I missing, or what have I misunderstood about variable search replace?
Thanks.
submitted by mulberrymerlangius to tasker [link] [comments]


2023.05.07 09:14 _EggTart_ Regex test website that work with python re.sub?

I've tried regex101.com and regexr.com, but I couldn't get Python's capture-group references for re.sub (the \g<> and \1 syntax) to work on these websites. Am I doing something wrong on these websites, or do they just not work with re.sub syntax?
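A quick local check avoids depending on how a given tester handles replacement templates. The sketch below uses a made-up date string to show the two reference styles that `re.sub` accepts; both use backslashes, and raw strings keep them intact.

```python
import re

text = "2023-05-07"
DATE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")

# Numbered references: \1, \2, \3.
by_number = DATE.sub(r"\3/\2/\1", text)

# Named references: \g<name>.
by_name = DATE.sub(r"\g<day>/\g<month>/\g<year>", text)

# \g<1>0 means "group 1 followed by a literal 0"; \10 would mean group 10.
unambiguous = re.sub(r"(\d)", r"\g<1>0", "7")

print(by_number, by_name, unambiguous)
```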
submitted by _EggTart_ to learnpython [link] [comments]


2023.05.04 23:04 Ethiack Explainer on the REcollapse technique (for zero-interaction takeovers, bypasses for webapp firewalls, and more)

Hi there.
Wanted to share a technique we’ve been researching for the past few years. It’s called REcollapse. This technique can be used to perform zero-interaction account takeovers, uncover new bypasses for web application firewalls, and more.
This post is mostly based on André’s BSidesLisbon 2022 talk and insights from researching this technique at Ethiack.
We’ll explain more about it and how it works. TLDR: you can watch the original talk on Youtube.

First, the issue with user input

It all starts with unexpected input. Modern applications and APIs rely on validation, sanitization, and normalization. This is usually done by custom regular expressions and widely used libraries that validate and transform typical user input formats, such as email addresses, URLs, and more. Like this:

Validation (Python)
The goal is always about preventing dangerous user input from being stored in the first place. Let’s consider an application that rejects special characters in the name of a user on a /signup endpoint. An attacker can’t inject payloads in the name but this doesn’t necessarily mean that, later on, the name would not be sanitized somewhere, resulting in vulnerabilities, such as XSS.
In this case, we can try to find alternative endpoints that are more permissive and accept special characters in the same parameter. On the other hand, normalization is used to make user input consistent. It’s handy for applications with multiple account flows to avoid duplicate email addresses, such as [email protected] vs [email protected] vs á@ª.com and so on. The normalization libraries have different outputs, as you can see in these examples, which can be helpful to detect technologies used by the backend.

What’s the problem?

Regexes are usually reused from StackOverflow, GitHub, or other sources. Developers typically don’t test them properly and sometimes paste different regular expressions across backend endpoints. For instance, the aforementioned regex "^\S+@\S+\.\S+$" doesn’t work well for proper email validation:
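To make the weakness concrete, here is a small sketch in Python that runs the regex quoted above against a few made-up inputs. The candidate strings are our own illustrations, not taken from the talk.

```python
import re

# The email regex quoted above.
EMAIL_RE = re.compile(r"^\S+@\S+\.\S+$")

candidates = [
    "user@example.com",               # ordinary address
    "a@@b..c",                        # doubled "@" and dots
    "<script>alert(1)</script>@x.y",  # markup in the local part
    "a b@example.com",                # contains whitespace
]

# Anything without whitespace that has "@" and a later "." gets through.
accepted = [c for c in candidates if EMAIL_RE.match(c)]
print(accepted)
```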

regex101.com
Things also get interesting with GitHub Copilot. Generating code to validate if an URL is part of a whitelisted domain gives the following result in Python:

Code Generation with Copilot
Fuzzing this regex with the REcollapse tool presented below gives an input https://example՟com that will be accepted for example.com as the domain argument, but it’s translated to xn--examplecom-ehl (punycode), allowing an attacker to bypass the validation, as an example.
In terms of normalization, confusion and duplicate states can sometimes be reached if normalization is not used consistently in all endpoints and flows. In addition, the core regex libraries of different programming languages can have slight differences while processing the same regular expression.

Using the REcollapse technique

So, how to bypass the current validation or sanitization? Also, how can we leverage user input transformations? Fuzz the parameters in a smart way.
Consider the following scenario:
https://example.com/redirect?url=https://legit.example.com
https://example.com/redirect?url=https://evil.com
We can’t redirect to an attacker-controlled URL at first glance. Trying a bunch of payloads also doesn’t work. What can we do?
  1. Identify the regex pivot positions
  • Starting & termination positions (in red)
    • Beginning and end of the input
  • Separator positions (in green)
    • Before and after special characters
  • Normalization positions (in blue)
    • Typically vowels ª > a
https://preview.redd.it/o6afygoppvxa1.png?width=1600&format=png&auto=webp&s=8f2f28e2d8ede54662a53423d429d4440635d6f7
https://preview.redd.it/fvlrogoppvxa1.png?width=1600&format=png&auto=webp&s=970d93f841b4095f76ba21a785da6f0609d1c1fc
https://preview.redd.it/ka55lgoppvxa1.png?width=1600&format=png&auto=webp&s=5bc3bf726e9280e119e4b64e0a7d7dbc2c0bc94c
2) Fuzz positions with all possible bytes %00 to %ff. Here you can see more examples:
https://preview.redd.it/gngl3z7tpvxa1.png?width=1445&format=png&auto=webp&s=764a287443099c8fd2db992cb63b22d59482b7f7
3) Analyze the results: sort by response codes or response length.
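The first two steps can be sketched in a few lines of Python. This is a simplified illustration under our own assumptions, not the code of the recollapse tool: it only covers the start, end and separator positions (normalization positions are left out), and the function names are made up.

```python
def pivot_positions(value):
    """Start, end, and the positions before and after each special character."""
    positions = {0, len(value)}
    for i, ch in enumerate(value):
        if not ch.isalnum():
            positions.update({i, i + 1})
    return sorted(positions)


def fuzz_variants(value):
    """Yield the value with one percent-encoded byte (%00 to %ff) at each pivot."""
    for pos in pivot_positions(value):
        for byte in range(0x100):
            yield f"{value[:pos]}%{byte:02x}{value[pos:]}"


variants = list(fuzz_variants("legit.example.com"))
print(len(variants), variants[:3])
```

Each variant would then be sent as the parameter value, and the responses compared by status code or length as described in step 3.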
And that’s it. André built a tool for this, in case you want to try it out. Github repo here: https://github.com/0xacb/recollapse
submitted by Ethiack to hacking [link] [comments]


2023.05.04 15:26 Pb_Blimp Is it possible to match a newline with egrep?

Hello,
I am tailing a log to monitor specific events on a CentOS 8.5 server. There is a ton of information that shows up, so I am also using egrep to help weed out the clutter.
This is my command...
tail -F  egrep "HandoverPreparationUEContextSetupRequestUEContextReleaseCompletephysCellIdmeasResult.+CellMeasurementReport\\sRRCReconfiguration^\\s*rsrp\\s[0-9]+measResults.*$.*measId\\s[0-9]" 
As you can see, I have 10 different patterns I egrep for.
My issue is with the last pattern in the command...
.*measId\\s[0-9] 
This turns up more than I care to see. Here is an example of when I want to see it...
https://i.imgur.com/8TE2UNB.png
Here is an example of when I don't want to see it...
https://i.imgur.com/GLFkX7Q.png
In the example of when I want to see it, it always immediately follows "measResults {", which is also a pattern I egrep for...
measResults.*$ 
Ultimately I would like to combine the last two patterns in my egrep, but cannot figure out for the life of me how to get it to work in linux with my other egrep patterns². I can get it to work in regex101.com, but not in linux.
² I say this because if I remove all my other patterns, I can get it to match with the following grep, but now I am unsure of how to add all the other patterns I need. I would rather get the other method to work if possible.
grep -zoP "measResults.*\\n.*measId\\s\\d" 
I believe that is everything. Please let me know if you have any suggestions or questions.
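Because grep works one line at a time, a condition such as "only when the previous line was measResults" may be easier to express in a small filter that remembers the previous line. The sketch below is in Python; the pattern list is shortened for illustration and the sample lines are hypothetical, since the log is only shown as screenshots.

```python
import re

# Shortened alternation for illustration; the remaining patterns would go here.
ALWAYS = re.compile(r"HandoverPreparation|physCellId|measResults.*$")
MEAS_RESULTS = re.compile(r"measResults")
MEAS_ID = re.compile(r"measId\s[0-9]")


def filter_log(lines):
    """Yield lines that match ALWAYS, plus measId lines right after measResults."""
    previous = ""
    for line in lines:
        line = line.rstrip("\n")
        if ALWAYS.search(line):
            yield line
        elif MEAS_ID.search(line) and MEAS_RESULTS.search(previous):
            yield line
        previous = line


# Hypothetical log excerpt.
log = ["measResults {", "  measId 3,", "other {", "  measId 7,", "physCellId 12"]
kept = list(filter_log(log))
print(kept)
```

The same function could be fed from standard input (for example `for line in filter_log(sys.stdin)`) so that it sits at the end of a `tail -F` pipeline in place of egrep.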
submitted by Pb_Blimp to linuxquestions [link] [comments]


2023.05.04 09:58 G0nz0uk Help with simple regex (I think)

Hello,
I've spent far too long trying to get this to work on https://regex101.com/ .
I want to match the URLs below; the number after ab or nls could be anything.
http://ab01-pre-net.ourdomain.com/health
http://ab02-pre-net.ourdomain.com/health
http://ab03-pre-net.ourdomain.com/health
and
http://nls01-pre-net.ourdomain.com/health
http://nls02-pre-net.ourdomain.com/health
http://nls03-pre-net.ourdomain.com/health
I'm useless at this; this was my attempt:
http://(ab*|nls*)-pre-net.ourdomain.com/health 
What am I doing wrong? I don't know why I find Regex so hard.
EDIT: I think this might work:
http://(ab.*|nls.*)-pre-net.ourdomain.com/health 
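A tighter variant, sketched in Python and assuming the attempt was meant as an alternation between "ab" and "nls": `*` repeats only the preceding character (so `ab*` means "a" followed by any number of "b"), whereas `\d+` asks for one or more digits, and `\.` keeps the dots literal.

```python
import re

# "ab" or "nls", then one or more digits, then the fixed remainder of the URL.
HEALTH_URL = re.compile(r"^http://(?:ab|nls)\d+-pre-net\.ourdomain\.com/health$")

urls = [
    "http://ab01-pre-net.ourdomain.com/health",
    "http://nls03-pre-net.ourdomain.com/health",
    "http://xy01-pre-net.ourdomain.com/health",
    "http://ab-pre-net.ourdomain.com/health",
]

matching = [u for u in urls if HEALTH_URL.match(u)]
print(matching)
```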
Thanks
submitted by G0nz0uk to regex [link] [comments]


2023.05.03 19:43 kilroy1937 Replicating Ruby Regex in JavaScript

I'm trying to replicate the behavior of the Ruby file in the new JavaScript file. In each file, I'm trying to categorize natural language as an opinion or a fact using regexes.
When I give each of the scripts the test case found in test_case.csv, the Ruby returns this match from the fourth regex in the regex array (labeled 'fp4'): "S government or international affairs; I can't begin to fathom how he will". The JavaScript does not return this match or anything similar. When I use regex101 to test the regex from the JavaScript (also labeled fp4), regex101 says the regex should match "S government or international affairs; I can't begin to fathom how he will".
I'm new to JS, Ruby, and regexes so I'd be very appreciative of any insight into this discrepancy.

https://preview.redd.it/js97brc2lnxa1.png?width=1380&format=png&auto=webp&s=a5e9bef9653f75502889cd02ae7b8ca2e0bb78d6
Ruby file:
require 'csv' require 'pp' require 'active_support' FILE_NAME = "study2.csv" RESPONSE_COL_NAME = 'open_response' FILE_HEADERS = [ 'part_id', RESPONSE_COL_NAME, 'fact_phrases', 'opinion_phrases', 'fact_phrases_label', 'opinion_phrases_label', 'fact_phrases_t2', 'opinion_phrases_t2', 'total_words_t2' ] DONT_PHRASES = / dont don't do not can not cant can't/ PRONOUNS = /hesheitthey/i PRESIDENT_NAMES = /candidateclintondonaldgophillaryhilarytrumptrum/i SKIP_WORDS = / also really very much/ AMBIGUOUS_WORDS = /seemedprefe I_OPINION_WORDS = /agreebelieveconsiderdisagreehopefeelfeltfindopposethinkthoughtsupport/ OPINION_PHRASES = /in my opinionit seems to mefrom my perspectivein my viewfrom my viewfrom my standpointfor me/ OPINION_PHRASE_REGEXES = [ /(i(?:#{DONT_PHRASES}#{SKIP_WORDS})? #{I_OPINION_WORDS})/, /(i'm [a-z]+ to #{I_OPINION_WORDS})/, /#{OPINION_PHRASES},? /, ].freeze STRONG_FACT_WORDS = /arecan'tdemonstratedemontratedidhadisneedsshouldwillwould/ WEAKER_FACT_WORDS = /werewashas/ FACT_WORDS = /#{STRONG_FACT_WORDS}#{WEAKER_FACT_WORDS}/ FACT_PHRASES = // FACT_PHRASE_REGEXES = [ [/[tT]he [^\.]*[A-Z][a-z]+ #{FACT_WORDS}/, false], #fp1 [/(?:^.+\. )[A-Z][a-z]+ #{FACT_WORDS}/, false], #fp2 [/[tT]he [^\.]*[A-Z][a-z]+'s? [a-z]+ #{FACT_WORDS}/, false], #fp3 [/[^\.]*#{PRONOUNS} #{STRONG_FACT_WORDS}/, true], #fp4 [/(?:^.+\. )#{PRONOUNS} #{FACT_WORDS}/, true], #fp5 [/(?:^[^.]* )#{PRESIDENT_NAMES} #{FACT_WORDS}/, true], #fp6 [/(?:^[^.]* )(?:#{PRONOUNS}#{PRESIDENT_NAMES}) [a-z]+(?:ed[^ia]s) /, true], #fp7 [/(?:^[^.]* )(?:#{PRONOUNS}#{PRESIDENT_NAMES}) [a-z]+ [a-z]+(?:ed[^ia]s) /, true], #fp8 [/(?:$\. )(?:She'sHe's)/, true], #fp9 ].freeze CSV.open("C:/wd/CohenLab/post_Qintegrat/output_ruby_labels.csv", "w") do csv csv << FILE_HEADERS CSV.foreach(FILE_NAME, :headers => true , :encoding => 'ISO-8859-1') do row id = row['part_id'] response = row[RESPONSE_COL_NAME] if response.nil? 
csv << [id, response, 'NA', 'NA', 'NA'] next end response_words = response.to_s.split.map(&:downcase).map { w w.gsub(/[\W]/, '') } opinion_phrases = [] OPINION_PHRASE_REGEXES.each_with_index do p, index if response.downcase.match(p) found_phrases = response.downcase.scan(p) # Store the matched phrases along with the index of the regex in an inner array found_phrases.each do ph opinion_phrases << [ph, index] end end end opinion_phrases_t2 = opinion_phrases.length # Replace fact_phrases array with a hash fact_phrases = [] FACT_PHRASE_REGEXES.each_with_index do (p, allow_pres), index if response.match(p) found_phrases = response.scan(p) found_phrases.select! { ph ph if allow_pres !ph.match(/#{PRONOUNS}#{PRESIDENT_NAMES}/) } # Store the matched phrases along with the index of the regex in an inner array found_phrases.each do ph fact_phrases << [ph, index] end end end # Update the select! block to filter based on the phrase part of the inner array fact_phrases.select! do p, _ OPINION_PHRASE_REGEXES.none? { ph p.downcase.match(ph) } && !p.downcase.match(AMBIGUOUS_WORDS) end fact_phrases_t2 = fact_phrases.length output = [ id, response, fact_phrases.map(&:first).join('] '), opinion_phrases.map(&:first).join('] '), fact_phrases.map { _, v "regex#{v+1}" }.join(', '), opinion_phrases.map { _, v "regex#{v+1}" }.join(', '), fact_phrases_t2, opinion_phrases_t2, response_words.length ] csv << output end end 
JS File:
const history = []; // Ref: https://www.bennadel.com/blog/1504-ask-ben-parsing-csv-strings-with-javascript-exec-regular-expression-command.htm function parseCSV( strData, strDelimiter ){ strDelimiter = (strDelimiter ","); var objPattern = new RegExp( ( // Delimiters. "(\\" + strDelimiter + "\\r?\\n\\r^)" + // Quoted fields. "(?:\"([^\"]*(?:\"\"[^\"]*)*)\"" + // Standard fields. "([^\"\\" + strDelimiter + "\\r\\n]*))" ), "gi" ); var arrData = [[]]; var arrMatches = null; var header = null; while (arrMatches = objPattern.exec( strData )){ var strMatchedDelimiter = arrMatches[ 1 ]; if ( strMatchedDelimiter.length && (strMatchedDelimiter != strDelimiter) ){ arrData.push( [] ); } if (arrMatches[ 2 ]){ var strMatchedValue = arrMatches[ 2 ].replace( new RegExp( "\"\"", "g" ), "\"" ); } else { var strMatchedValue = arrMatches[ 3 ]; } if (arrData.length === 1) { header = arrData[0]; } // Now that we have our value string, let's add // it to the data array. arrData[ arrData.length - 1 ].push( strMatchedValue ); } var data = arrData.slice(1).map(function (row) { var obj = {}; for (var i = 0; i < header.length; i++) { obj[header[i]] = row[i]; } return obj; }); // Return the parsed data. 
return( data ); } const input = fetch("study2.csv"); function analyze(input) { console.log(input) input.then(response => response.text()) .then(csvText => { const fileData_raw = parseCSV(csvText,","); console.log(fileData_raw) const data = fileData_raw.filter(entry => entry.open_response && entry.open_response !== 'NA'); console.log(data) let response; for (let i = 0; i < data.length; i++) { const response = data[i].open_response; let response_words = response.toString().split(' ') .map((w) => w.toLowerCase().replace(/[\W]/g, '')); console.log('Response: ', response) const DONT_PHRASES_ARR = ["dont"," don't"," do not"," can not"," cant"," can't"]; const DONT_PHRASES = DONT_PHRASES_ARR.join(""); const PRONOUNS_ARR = ["he","she","it","they"]; const PRONOUNS = PRONOUNS_ARR.join(""); const PRESIDENT_NAMES_ARR = ["candidate","clinton","donald","gop","hillary","hilary","trump","trum"]; const PRESIDENT_NAMES = PRESIDENT_NAMES_ARR.join(""); const SKIP_WORDS_ARR = ["also"," really"," very much"]; const SKIP_WORDS = SKIP_WORDS_ARR.join(""); const AMBIGUOUS_WORDS_ARR = ["seemed","prefer"]; const AMBIGUOUS_WORDS = new RegExp(AMBIGUOUS_WORDS_ARR.join(""), 'i'); const I_OPINION_WORDS_ARR = ["agree","believe","consider","disagree","hope","feel","felt","find","oppose","think","thought","support"]; const I_OPINION_WORDS = I_OPINION_WORDS_ARR.join(""); const OPINION_PHRASES_ARR = ["in my opinion","it seems to me","from my perspective","in my view","from my view","from my standpoint","for me"]; const OPINION_PHRASES = OPINION_PHRASES_ARR.join(""); const OPINION_FRAME_REGEXES = [ {op_label: "op1", op_regex: new RegExp(`(?:i(?: dont don't do not can not cant can'talso really very much)? 
\\b(?:agreebelieveconsiderdisagreehopefeelfeltfindopposethinkthoughtsupport)\\b)`, 'gmi')}, {op_label: "op2", op_regex: new RegExp(`(?:i'm [a-z]+ to \\b(?:agreebelieveconsiderdisagreehopefeelfeltfindopposethinkthoughtsupport)\\b)`, 'gmi')}, {op_label: "op3", op_regex: new RegExp(`(?:in my opinionit seems to mefrom my perspectivein my viewfrom my viewfrom my standpointfor me),? `, 'gmi')} ]; const FACT_FRAME_REGEXES = [ {f_label: "fp1", f_regex: new RegExp(`(?:[tT]he [^\.]*[A-Z][a-z]+ \\b(?:arecan'tdemonstratedemonstratesdidhadisneedsshouldwillwouldwerewashas)\\b)`, 'gm')}, {f_label: "fp2", f_regex: new RegExp(`(?:(?:^.+\. )[A-Z][a-z]+ (?:arecan'tdemonstratedemonstratesdidhadisneedsshouldwillwouldwerewashas))`, 'gm')}, {f_label: "fp3", f_regex: new RegExp(`(?:[tT]he [^\.]*[A-Z][a-z]+?:(\'s)? [a-z]+ \\b(?:arecan'tdemonstratedemonstratesdidhadisneedsshouldwillwouldwerewashas)\\b )`, 'gm')}, {f_label: "fp4", f_regex: new RegExp(`(?:[^\.]*(?:hesheitthey) (?:arecan'tdemonstratedemonstratesdidhadisneedsshouldwillwould))`, 'gmi')}, {f_label: "fp5", f_regex: new RegExp(`(?:(?:^\. )?:(hesheitthey) \\b(?:arecan'tdemonstratedemonstratesdidhadisneedsshouldwillwouldwerewashas)\\b)`, 'gmi')}, {f_label: "fp6", f_regex: new RegExp(`(?:(?:^[^.]* )\\b(?:candidateclintondonaldgophillaryhilarytrumptrum)\\b \\b(?:arecan'tdemonstratedemonstratesdidhadisneedsshouldwillwouldwerewashas)\\b)`, 'gmi')}, {f_label: "fp7", f_regex: new RegExp(`(?:(?:^[^.]* )(?:hesheittheycandidateclintondonaldgophillaryhilarytrumptrum) [a-z]+(?:ed[^ia]s) )`, 'gmi')}, {f_label: "fp8", f_regex: new RegExp(`(?:(?:^[^.]* )(?:hesheittheycandidateclintondonaldgophillaryhilarytrumptrum) [a-z]+ [a-z]+(?:ed[^ia]s) )`, 'gmi')}, {f_label: "fp9", f_regex: new RegExp(`(?:(?:$\. 
)(?:She\'sHe\'s))`, 'g')} ]; let fact_frames = []; let opinion_frames = []; // Check for opinion frames OPINION_FRAME_REGEXES.forEach(({ op_label, op_regex }) => { let op_match = response.match(op_regex); if (op_match) { opinion_frames.push({ match: op_match[0], label: op_label }); } }); // Check for fact frames FACT_FRAME_REGEXES.forEach(({ f_label, f_regex }) => { let fact_match = response.match(f_regex); if (fact_match) { fact_frames.push({ match: fact_match[0], label: f_label }); fact_frames = fact_frames.filter((frameObj) => { const lowerCaseFrame = frameObj.match.toLowerCase(); return ( OPINION_FRAME_REGEXES.every(({ op_regex }) => !op_regex.test(lowerCaseFrame)) && !AMBIGUOUS_WORDS.test(lowerCaseFrame) ); }); } }); console.log('Op Frames :', opinion_frames) let opinion_frames_t2 = opinion_frames.length; console.log('Op Fr Num: ', opinion_frames_t2) console.log('Fact Frames :', fact_frames) let fact_frames_t2 = fact_frames.length; let net_score = opinion_frames_t2 - fact_frames_t2; let id = data[i].part_id const result = { part_id: id, input: response, net_score: net_score, opinion_frames_t2: opinion_frames_t2, fact_frames_t2: fact_frames_t2, opinion_frames: opinion_frames, fact_frames: fact_frames }; const op_txt = opinion_frames.map(arr => arr.match); const fact_txt = fact_frames.map(arr => arr.match); const out_net = result.net_score const out_op_num = result.opinion_frames_t2 const out_fp_num = result.fact_frames_t2 const out_op = op_txt const out_fp = fact_txt const out_op2 = op_txt.join("; ") const out_fp2 = fact_txt.join("; ") var feedback_net = result.net_score var feedback_op_num = result.opinion_frames_t2 var feedback_fp_num = result.fact_frames_t2 var feedback_op = op_txt.join("; ") var feedback_fp = fact_txt.join("; ") // Update history history.push(result); updateHistory(); // Display result const output = document.getElementById('output'); output.textContent = `Net score: ${net_score}\nOpinion frames: ${opinion_frames_t2}\nFact frames: 
${fact_frames_t2}`; }; }); }; var i = 0; function updateHistory() { const historyTable = document.getElementById('historyTable'); historyTable.innerHTML = ''; const headerRow = historyTable.insertRow(0); const headers = ['pid', 'input', 'net_score', 'op_fram_num', 'fact_fram_num', 'op_frames', 'fact_frames']; for (const header of headers) { const th = document.createElement('th'); th.textContent = header; headerRow.appendChild(th); } history.forEach((result, i) => { const row = historyTable.insertRow(); const cellId = row.insertCell(); cellId.textContent = result.part_id; const cellInput = row.insertCell(); cellInput.textContent = result.input; // cellInput.textContent = result.input.slice(0,50); const cellNetScore = row.insertCell(); cellNetScore.textContent = result.net_score; const cellOpinionFramesT2 = row.insertCell(); cellOpinionFramesT2.textContent = result.opinion_frames_t2; const cellFactFramesT2 = row.insertCell(); cellFactFramesT2.textContent = result.fact_frames_t2; const cellOpinionFrames = row.insertCell(); cellOpinionFrames.textContent = result.opinion_frames.map(obj => JSON.stringify(obj)).join(", "); const cellFactFrames = row.insertCell(); cellFactFrames.textContent = result.fact_frames.map(obj => JSON.stringify(obj)).join(", "); historyTable.appendChild(row); }); // center align table contents const tableElements = document.querySelectorAll('table, th, td'); tableElements.forEach(el => el.style.textAlign = 'center'); const firstColumnElements = document.querySelectorAll('th:first-child, td:first-child'); firstColumnElements.forEach(el => el.style.textAlign = 'left'); } analyze(input) 
submitted by kilroy1937 to regex [link] [comments]


2023.04.29 12:35 Adventurous-Hair-355 Finding Duplicated Lines - What is wrong with my Vim Search Pattern ?

I'm trying to find all duplicated lines in a text file without sorting or any external commands. This is the text file and regex pattern I have prepared so far.
one one sadadsadad one one two two two one one three three one one three three four abc asadad ddsada abcdefghi 
My vim search pattern is "\v^(.*)$(\_.*\1\_.*)@="
I'm trying to match lines 6 and 7 too, but it doesn't work as expected. Can anyone help me find the correct pattern? Thanks
Edit:
I am trying to accomplish the same result as the example below
https://regex101.com/aF3xD0/1
Edit 2:
My Partial and Ugly Solution So Far
g/\v^(.*)$(\_.*\1\_.*)@=/exe ':%s/' . getline('.') . '//'
https://preview.redd.it/wreai81swswa1.png?width=1008&format=png&auto=webp&v=enabled&s=44e6c67817f84bb141c902bbc9657967704f21e6
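For comparison, here is a minimal Python sketch of what "all duplicated lines" could mean, assuming a line counts as duplicated when its exact text occurs more than once anywhere in the file. The function name `duplicated_lines` is hypothetical. As far as I can tell, the Vim pattern above only looks forward from the current line, so the last occurrence of a repeated line has nothing left to match against; counting occurrences first avoids that.

```python
from collections import Counter


def duplicated_lines(text):
    """Return (line_number, line) for every line whose exact text occurs more than once.

    Order is preserved and nothing is sorted; line numbers are 1-based.
    """
    lines = text.splitlines()
    counts = Counter(lines)  # how often each exact line occurs
    return [(n, line) for n, line in enumerate(lines, start=1) if counts[line] > 1]


sample = "one\ntwo\none\nthree\ntwo\nfour"
for number, line in duplicated_lines(sample):
    print(number, line)
```

Unlike the search pattern, this compares whole lines, so a short line is not reported merely because its text appears inside a longer one.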
submitted by Adventurous-Hair-355 to vim [link] [comments]


2023.04.25 03:49 AbideOutside Not matching on anything that is commented out ("--" before the string match)

https://regex101.com/GGRH0k/1 line 3 needs to also match, but everything else is working.
In the regex linked above, I'm attempting to match on "abc" but not if it is preceded by "--". I am close, but struggling to match on situations where there should be a match, but the "--" occurs after, such as "text abc --abc". Ideally the expression would still match on the first, non-commented "abc".
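One way to sidestep the lookbehind problem is to cut each line at the first "--" and search only what is left. The sketch below is Python and assumes that "--" starts a comment running to the end of the line; the function name `find_uncommented` and the sample text are made up for illustration and are not taken from the linked regex.

```python
import re


def find_uncommented(text, word="abc"):
    """Return (line_number, column) for each `word` that appears before any `--` on its line."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        code = line.split("--", 1)[0]  # drop everything from the first `--` onward
        for match in re.finditer(re.escape(word), code):
            hits.append((line_no, match.start()))
    return hits


sample = "abc\n--abc\ntext abc --abc\n-- text abc"
print(find_uncommented(sample))
```

With this approach "text abc --abc" still reports the first "abc", because only the part after "--" is discarded.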
submitted by AbideOutside to regex [link] [comments]


2023.04.24 11:46 sukur55 Generating regex pattern automatically

We are trying to come up with logic that generates the following kind of regex from a list of metrics.
data
alertmanager_notifications_total
alertmanager_notifications_failed_total
alertmanager_cluster_failed_peers
alertmanager_cluster_reconnections_total
alertmanager_cluster_reconnections_failed_total
alertmanager_cluster_messages_received_total
alertmanager_cluster_messages_sent_size_total
go_memstats_other_sys_bytes
go_memstats_next_gc_bytes
regex
/alertmanager_(notifications_(total|failed_total)|cluster_(failed_peers|messages_(received_total|sent_size_total)|reconnections_(total|failed_total)))|go_(memstats_(other_sys_bytes|next_gc_bytes))/gm
https://regex101.com/sLXbAx/1
Maybe there is already a library for this, or existing code somewhere?
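One possible approach is a prefix tree (trie): split each metric name at its underscores, insert the pieces into a nested dict, and render every branching point as an alternation. The following is a minimal sketch under that assumption, not an existing library; `build_trie`, `render` and `END` are hypothetical names, and the output uses non-capturing groups `(?:...)` instead of the capturing groups shown above.

```python
import re

END = ""  # marker key: a complete metric name ends at this node


def build_trie(names):
    """Insert every name into a nested dict, one underscore-terminated token per level."""
    root = {}
    for name in names:
        node = root
        # "a_b_c" -> ["a_", "b_", "c"]; joining the tokens gives the name back
        for token in re.findall(r"[^_]*_|[^_]+", name):
            node = node.setdefault(token, {})
        node[END] = {}
    return root


def render(node):
    """Turn a trie node into a regex fragment that matches all of its suffixes."""
    alternatives = []
    optional = False  # True when a name ends here but longer names continue
    for token, child in node.items():
        if token == END:
            optional = True
        else:
            alternatives.append(re.escape(token) + render(child))
    if not alternatives:
        return ""
    if len(alternatives) == 1 and not optional:
        return alternatives[0]
    group = "(?:" + "|".join(alternatives) + ")"
    return group + "?" if optional else group


METRICS = [
    "alertmanager_notifications_total",
    "alertmanager_notifications_failed_total",
    "alertmanager_cluster_failed_peers",
    "alertmanager_cluster_reconnections_total",
    "alertmanager_cluster_reconnections_failed_total",
    "alertmanager_cluster_messages_received_total",
    "alertmanager_cluster_messages_sent_size_total",
    "go_memstats_other_sys_bytes",
    "go_memstats_next_gc_bytes",
]

pattern_text = "^" + render(build_trie(METRICS)) + "$"
print(pattern_text)
```

Splitting at underscores keeps the groups aligned with whole words; a character-level trie would also work but can split names in the middle of a word.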
submitted by sukur55 to learnpython [link] [comments]


2023.04.19 18:40 Fast-Cardiologist705 Terraform regex trouble

Hi,
I want to perform the following check and reject any resource groups that would not match the pattern:

variable "name" {
  type        = string
  description = "The name of the Azure Resource Group to create"
  validation {
    condition     = can(regex("^EU-(PRDNPDNPUUATNPT)-RSG-TTT-(PLNODESECP)-[A-Z0-9]+-[A-Z0-9]+$", upper(var.name)))
    error_message = "Invalid name format. The name must match the pattern: EU-(PRDNPDNPUUATNPT)-RSG-TTT-(PLNODESECP)--"
  }
}
Checked the regex on Regex 101 and it matches as expected
https://preview.redd.it/enkfioqwcvua1.png?width=1007&format=png&auto=webp&s=809b987865aa86206910ec16458d2f60a5a64068
and rejects what is not matching

https://preview.redd.it/o4hvnco2dvua1.png?width=1047&format=png&auto=webp&s=c88b54c1b259b7875544ff77d88e7f89d7929cb0
However, when I run terraform plan, it accepts a string that does not match in Regex101 and creates the resource group.
Any idea why?

EDIT:
https://preview.redd.it/sy8j1rq8h6va1.png?width=969&format=png&auto=webp&s=0b5f7d43047de0c96b786fa0c5612c5fd1b11569
submitted by Fast-Cardiologist705 to Terraform [link] [comments]


2023.04.17 13:26 MeIsALaugher Regex Generator?

Is there a regex generator for Reddit's Automod or Python? I've already tried Googling "regex generator python" but I only came up with https://regex-generator.olafneumann.org/, https://pythex.org/, https://regex101.com/, and a whole bunch of build/testers. Olaf Neumann's generator seemed the most promising, but I couldn't get it to work because I didn't know how to separate each phrase, i.e. "you're dumb," "your dumb," "youre dumb," and "you are dumb." I was able to make sense of https://www.javainuse.com/rexgenerator but it's only for java.
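If the phrases are already known, a generator may not be needed: the list can be joined into one alternation. The sketch below is Python and assumes the target accepts Python-style regex syntax; `phrases_to_regex` and `COMPACT` are hypothetical names, and `COMPACT` is a hand-written equivalent for these four phrases only.

```python
import re


def phrases_to_regex(phrases):
    """Build one case-insensitive pattern that matches any of the given phrases as whole words."""
    # Longest first, so a longer phrase wins over a shorter one that shares its prefix.
    ordered = sorted(set(phrases), key=len, reverse=True)
    body = "|".join(re.escape(phrase) for phrase in ordered)
    return re.compile(r"\b(?:" + body + r")\b", re.IGNORECASE)


PHRASES = ["you're dumb", "your dumb", "youre dumb", "you are dumb"]
generated = phrases_to_regex(PHRASES)

# Hand-written version of the same four phrases: "you", then 're / re / r / " are", then " dumb".
COMPACT = re.compile(r"\byou(?:'?re|r| are) dumb\b", re.IGNORECASE)

print(generated.pattern)
print(bool(COMPACT.search("lol YOU ARE DUMB")))
```

The generated form is easier to maintain as the phrase list grows; the compact form is shorter but has to be rewritten by hand whenever a phrase is added.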
submitted by MeIsALaugher to AutoModerator [link] [comments]


2023.04.16 12:57 kevuwk Struggling with matching a string but only if it doesn't include an exclamation mark

https://regex101.com/ucW4xd/1

This is for streamelements on twitch.

In the example I want it to pick out when somebody says "test" but not "!test". The problem I am having is that if I try and negate the "!" then it seems to start the match 1 character before it should. \btest\b works but obviously matches "!test".

In the link provided I should match the middle lines, but only the "test" text and not the previous character.

Is this even possible?
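A negative lookbehind is the usual tool for this: it checks the preceding character without making it part of the match, whereas a negated class such as [^!] consumes one character and shifts the start of the match. The sketch below is Python and only illustrates the idea; whether the regex engine behind the bot supports lookbehind is an assumption, and `find_test` is a hypothetical helper name.

```python
import re

# (?<!!) asserts that the character just before "test" is not "!", without consuming it.
TEST_WORD = re.compile(r"(?<!!)\btest\b")


def find_test(message):
    """Return the (start, end) span of every standalone "test" not preceded by "!"."""
    return [match.span() for match in TEST_WORD.finditer(message)]


for message in ["test", "!test", "a test here", "!test test"]:
    print(repr(message), find_test(message))
```

The reported spans cover only the four letters of "test", so the previous character is never included.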
submitted by kevuwk to regex [link] [comments]


2023.04.13 23:04 Gewerd_Strauss Regex while-loop match replacement: The bane of my existence continues (4)

Yep,
it's this time again.
At least I have made some progress in this regard, even if it is mostly shit probably.
Assume I have a variable containing the following string:
.1 = GuiControlLoadImage(): On MetaData: 20221009
.2 = ISO8601: 20230413
.3 = Displayed Format: 13 04 2023_2023 April 13
.4 = Intended Format: dd MM yyyy_yyy MMMM d
.5 = Intended Format > BackTransformed: 20230413
I now want to remove all occurrences of . from the front of any line within that string. This led me to the following attempt:
p:=1
while (p:=RegExMatch(text, "miO)(^((\.(\d\s\w)+\=\s)\s))", match, p)) {
    match_length:=StrLen(match[0])
    mtc:=match[0]
    text:=StrReplace(text, match[0])
    p-=match_length
    if (p<1) {
        p:=1 ;; in case the replacement would frameshift the position into the negative, reset it.
    }
    OutputDebug, % text
    OutputDebug, % p
}
I mean, it doesn't work, and I am obviously incapable of figuring out the flaw in my pattern. So... yay, here I am. AGAIN :/
Here's a testing ground of all the edge cases I can think of right now
Anyone willing to help me out on this?
Thank you, Sincerely, ~Gw
P.S.: This got to be the bane of my coding hobby.
It's not like u/anonymous1184 didn't write what is essentially a step-by-step explanation of _this very exact problem_™ and I am somehow still unable to translate that. I'm sorry my friend ._.
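For illustration, here is the same clean-up as a single multi-line substitution in Python, which avoids the manual position bookkeeping of the loop. It assumes the prefix to strip is a dot, one or more digits, and an equals sign with optional spaces around it (as in ".1 = "); that reading of the goal is a guess from the sample string, and `strip_prefixes` is a hypothetical name. AutoHotkey's RegExReplace with the m) option should allow the same single-pass approach.

```python
import re

# ^ with re.MULTILINE anchors at the start of every line.
# [ \t]* is used instead of \s* so the pattern can never swallow a line break.
PREFIX = re.compile(r"^\.\d+[ \t]*=[ \t]*", re.MULTILINE)


def strip_prefixes(text):
    """Remove a leading ".N = " marker from every line that has one."""
    return PREFIX.sub("", text)


sample = (
    ".1 = GuiControlLoadImage(): On MetaData: 20221009\n"
    ".2 = ISO8601: 20230413\n"
    "no prefix here"
)
print(strip_prefixes(sample))
```

Lines without the marker are left untouched, and a marker in the middle of a line is ignored because the pattern is anchored to the line start.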
submitted by Gewerd_Strauss to AutoHotkey [link] [comments]