I have a text file like the following, and I am trying to write a regex for it in Python:
CR INFO
CR INFO
Wed Aug 17
foo-bar name_10_Name-Child_test
foo-bar name_25_Name-out
foo-bar name_1000_Name-test_out
CR INFO
CR INFO
Wed Aug 17
foo-bar name_10_Name-Child_test
foo-bar name_25_Name-out
foo-bar name_1000_Name-test_out
Now I’m fairly new to regex so apologies if this is very simple.
I’m trying to capture the lines starting with foo-bar and group them together. For example, the first 3 foo-bar lines should go into one group, and the 3 below the second date should go into another.
So far I have the regex (^foo-bar\s+[A-z0-9-]+), but it matches every foo-bar line as an individual group rather than putting 3 in one group. The regex flags on regex101.com are gm.
How can I group the 3 lines together until it meets either the "CR" string, or a double new line?
Many thanks.
>Solution :
You can use
^foo-bar\s+[A-Za-z0-9-].*(?:\n.+)*
Or, to make sure each subsequent line also starts with foo-bar followed by whitespace (so the match stops at a CR line as well as at a blank line):
^foo-bar\s+[A-Za-z0-9-].*(?:\nfoo-bar\s.*)*
See the regex demo / regex demo #2. Use it with re.M / re.MULTILINE to make sure ^ matches the start of any line.
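As a minimal sketch of how the second pattern might be used from Python, the snippet below runs it over the sample text from the question with re.MULTILINE and splits each match into its lines. The variable names (text, pattern, groups) are only illustrative and are not part of the original answer.

```python
import re

# Sample input copied from the question.
text = """CR INFO
CR INFO
Wed Aug 17
foo-bar name_10_Name-Child_test
foo-bar name_25_Name-out
foo-bar name_1000_Name-test_out
CR INFO
CR INFO
Wed Aug 17
foo-bar name_10_Name-Child_test
foo-bar name_25_Name-out
foo-bar name_1000_Name-test_out
"""

# re.MULTILINE makes ^ match at the start of every line, not just the string.
pattern = re.compile(
    r"^foo-bar\s+[A-Za-z0-9-].*(?:\nfoo-bar\s.*)*",
    re.MULTILINE,
)

# Each match is one run of consecutive foo-bar lines; split it into lines.
groups = [m.group(0).splitlines() for m in pattern.finditer(text)]

for number, block in enumerate(groups, start=1):
    print(number, block)
```

With this input, groups should contain two lists of three lines each, one per block of consecutive foo-bar lines.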
Details:
^ – start of a line
foo-bar – a literal string
\s+ – one or more whitespaces
[A-Za-z0-9-] – an alphanumeric or hyphen
.* – the rest of the line
(?:\n.+)* – zero or more non-empty lines
(?:\nfoo-bar\s.*)* – zero or more non-empty lines that start with foo-bar and whitespace.
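Note that [A-z] matches more than just letters: the range also covers the six ASCII characters that sit between Z and a, namely [, \, ], ^, _ and `.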
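To see what [A-z] actually matches, here is a short check. It is only a sketch; the names between, loose, strict and sample are made up for the illustration. It also suggests why the original [A-z0-9-]+ appeared to handle names containing underscores: the underscore falls inside the A-z range.

```python
import re

# The ASCII characters that sit between "Z" and "a".
between = "".join(chr(code) for code in range(ord("Z") + 1, ord("a")))

# [A-z] matches every one of them; [A-Za-z] matches none.
loose = re.findall(r"[A-z]", between)
strict = re.findall(r"[A-Za-z]", between)

# A name from the question: the underscore is accepted only by [A-z].
sample = "name_10_Name-out"
loose_ok = re.fullmatch(r"[A-z0-9-]+", sample) is not None
strict_ok = re.fullmatch(r"[A-Za-z0-9-]+", sample) is not None

print(between, loose, strict, loose_ok, strict_ok)
```

If underscores should be allowed on purpose, listing them explicitly, as in [A-Za-z0-9_-], makes that intent clear.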