How to find an exact match and the 5 words before and after it in Python

by MR

March 31, 2022

I have the following text in a dataframe in Python:

Text = provide written informed consent healthy male or female age between 31 to 59 years fluent in german language

I need to search for the word age and extract the 5 words before and after it.
Target value = age
My desired output:

result = healthy male or female age between 31 to 59 years

my code:

 import re

 Text = "provide written informed consent healthy male or female age between 31 to 59 years fluent in german language"
 r1 = re.search(r"(?:[a-zA-Z'-]+[^a-zA-Z'-]+){0,3} age (?:[^a-zA-Z'-]+[a-zA-Z'-]+){0,3}", Text)
 r1.group()

my result is

 age 16 years old

My data also contains words like manage or agent, which should be ignored.

thanks

>Solution :

One way to do this without regex is to split the text into words and look up the position of age in the word list. Because the lookup compares whole tokens, words like manage or agent do not match.

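Text = "provide written informed consent healthy male or female age between 31 to 59 years fluent in german language"
words = Text.split()

# list.index compares whole tokens, so "manage" or "agent" are not hits (ValueError if "age" is absent).
idx = words.index("age")
# 4 words before and 5 after reproduce the desired output; max() keeps the start from going negative.
result = " ".join(words[max(0, idx - 4):idx + 6])
print(result)  # healthy male or female age between 31 to 59 years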
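
If the window size needs to vary, or the target may be missing from some rows, the same idea can be wrapped in a small helper. The sketch below is only an illustration: the function names `context_window` and `context_window_regex` and the `before`/`after` parameters are made up for this example and are not part of the original answer. It assumes words are separated by whitespace and returns only the first occurrence. The second function shows a regex variant that uses whitespace lookarounds so that manage and agent are not matched.

```python
import re


def context_window(text, target, before=5, after=5):
    """Return the first whole-word match of `target` plus its neighbours, or None."""
    words = text.split()
    if target not in words:
        return None
    idx = words.index(target)
    start = max(0, idx - before)  # avoid a negative slice start
    return " ".join(words[start:idx + after + 1])


def context_window_regex(text, target, before=5, after=5):
    """Regex variant: (?<!\\S) and (?!\\S) require whitespace or a string edge around `target`."""
    pattern = (
        rf"(?:\S+\s+){{0,{before}}}"
        rf"(?<!\S){re.escape(target)}(?!\S)"
        rf"(?:\s+\S+){{0,{after}}}"
    )
    match = re.search(pattern, text)
    return match.group() if match else None


Text = "provide written informed consent healthy male or female age between 31 to 59 years fluent in german language"
print(context_window(Text, "age", before=4, after=5))
print(context_window_regex(Text, "age", before=4, after=5))
```

Both calls use 4 words before and 5 after because that is the window shown in the desired output. For a dataframe column, the helper could be applied row by row, for example with `apply`.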