Regex capture group extracting elements of a list in a sentence

September 4, 2023

I have a list of sentences, some of which contain a comma-separated list of items:

index	sentence
0	You can get cars, trucks, planes, and boats.
1	You can get the car, truck, and plane.
2	You should ignore this sentence.

I only want to extract elements from sentences that start with "You can get" or "You can get the". I hope to do this with the pandas extractall method, extracting each individual element of the list in those sentences.
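To make the setup reproducible, here is a minimal sketch of the sample data. It assumes the frame is called df and the column is called sentence, matching the table header above; the variable name wanted is only illustrative.

```python
import pandas as pd

# Sample data from the table above; the default integer index plays the role of "index".
df = pd.DataFrame({
    "sentence": [
        "You can get cars, trucks, planes, and boats.",
        "You can get the car, truck, and plane.",
        "You should ignore this sentence.",
    ]
})

# Boolean mask of the sentences that should be processed at all.
wanted = df["sentence"].str.startswith("You can get ")
print(df[wanted])
```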

Desired output:

index	match	object
0	0	car
	1	truck
	2	plane
	3	boat
1	0	car
	1	truck
	2	plane

I have three main questions:

1. How do I use a lookbehind such as (?<=[Y|y]ou can get ) so that it does not capture the word "the"?
2. How do I include the lookahead \w+(?=s)? so that both plural and singular forms of the elements are captured?
3. Is it possible to write a capture group that also extracts each word as an individual element, or should I extract the list in the sentence first (e.g. cars, trucks, planes, and boats) and then run another regex?

>Solution :

What about using:

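# Keep only the matching sentences, then capture each word that is followed by a comma or the final period
df.loc[df['sentence'].str.startswith('You can get '),
       'sentence'].str.extractall(r'(?P<object>\S+?)s?\b(?:,|\.$)')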

Output:

        object
  match       
0 0        car
  1      truck
  2      plane
  3       boat
1 0        car
  1      truck
  2      plane
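On the third question, a two-step variant is also possible. The sketch below is not part of the answer above, just one way it might be done: it first captures the list portion of each matching sentence, then runs extractall on that portion. Python's re module only accepts fixed-width lookbehinds, so an optional "the" cannot go inside one; here it is consumed by a non-capturing group instead. The names items and result are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "sentence": [
        "You can get cars, trucks, planes, and boats.",
        "You can get the car, truck, and plane.",
        "You should ignore this sentence.",
    ]
})

# Step 1: capture only the list part of sentences with the wanted prefix.
# (?:the )? consumes an optional leading "the" so it never reaches the capture.
# Sentences that do not match become NaN and are dropped.
items = df["sentence"].str.extract(
    r"^[Yy]ou can get (?:the )?(?P<items>.+?)\.?$", expand=False
).dropna()

# Step 2: pull out each element, skipping a leading "and " and an optional plural "s".
result = items.str.extractall(r"(?:and )?(?P<object>\w+?)s?(?:, |$)")
print(result)
```

As with the one-pass pattern, stripping a trailing "s" is a heuristic: a singular word that ends in "s", such as "bus", would be shortened to "bu".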

byMR
