Extracting first numerical value occurring after some token in text in Python

October 31, 2022

I have sentences in the following form. I want to extract the first numeric value occurring after a given token. For example, I want to extract the numeric value that follows the phrase "tangible net worth".

Example sentences:

"A company must maintain a minimum tangible net worth of $100000000 and leverage ratio of 0.5"
"Minimum required tangible net worth the firm needs to maintain is $50000000".

From both of these sentences, I want to extract "$100000000" and "$50000000" and create a dictionary like this:

{
    "tangible net worth": "$100000000"
}

I am unsure how to use Python's re module to achieve this. One needs to be careful here: a significant portion of the sentences contain multiple numeric values, so I want to extract only the value that immediately follows the match. I have tried the following expressions, but none of them gives the desired result:

re.search(r'net worth.*(\d+)', sent)
re.search(r'(net worth)(.*)(\d+)', sent)
re.search(r'(net worth)(.*)(\d?)', sent)
re.findall(r'tangible net worth (.*)?(\d* )', sent)
re.findall(r'tangible net worth (.*)?( \d* )', sent)
re.findall(r'tangible net worth (.*)?(\d)', sent)

A little help with the regular expression will be highly appreciated. Thanks.

Solution:

You could use this regex:

tangible net worth\D*(\d+)

which will skip any non-digit characters after tangible net worth before capturing the first digits that occur after it.
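To see why the `\D*` matters, here is a small comparison (a sketch using the first example sentence) between the greedy `.*` attempt from the question and the `\D*` pattern. With `.*`, the regex engine consumes as much as it can and backtracks only far enough to leave a single digit for `(\d+)`, so it captures the last digit in the sentence rather than the first number.

```python
import re

sent = ("A company must maintain a minimum tangible net worth "
        "of $100000000 and leverage ratio of 0.5")

# Greedy .* runs to the end of the string, then backtracks just enough
# for (\d+) to match one digit: the final "5" of "0.5".
greedy = re.search(r'net worth.*(\d+)', sent)

# \D* can only cross non-digit characters, so it stops at the first digit
# and (\d+) captures the whole run of digits that follows.
first = re.search(r'tangible net worth\D*(\d+)', sent)

print(greedy.group(1))  # 5
print(first.group(1))   # 100000000
```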

You can then place the result into a dict. Note I would recommend storing a number rather than a string as you can always format it on output (adding $, comma thousands separators etc).

import re

strs = [
    "A company must maintain a minimum tangible net worth of $100000000 and leverage ratio of 0.5",
    "Minimum required tangible net worth the firm needs to maintain is $50000000"
]

result = []
for sent in strs:
    # \D* skips non-digit characters; (\d+) captures the first run of digits
    m = re.search(r'tangible net worth\D*(\d+)', sent)
    if m:
        result.append({'tangible net worth': int(m.group(1))})

print(result)

Output:

[
 {'tangible net worth': 100000000},
 {'tangible net worth': 50000000}
]
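If the token varies, the same idea can be wrapped in a small helper. This is only a sketch: the function name `first_number_after` is made up for illustration, and it assumes the value is a plain run of digits. A decimal such as 0.5 would come back as 0, and a number written with thousands separators would be cut at the first comma.

```python
import re

def first_number_after(token, text):
    """Return the first run of digits after `token` in `text` as an int, or None."""
    # re.escape guards against tokens that contain regex metacharacters.
    m = re.search(re.escape(token) + r'\D*(\d+)', text)
    return int(m.group(1)) if m else None

sent = ("A company must maintain a minimum tangible net worth "
        "of $100000000 and leverage ratio of 0.5")

value = first_number_after("tangible net worth", sent)
missing = first_number_after("current ratio", sent)  # token not in the sentence

print({"tangible net worth": value})  # {'tangible net worth': 100000000}
print(missing)                        # None
print(f"${value:,}")                  # $100,000,000
```

Because the stored value is an int, the `$` and thousands separators are added only when printing, as the last line shows.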

python-re

by MR
