pandas str.match is not working after "+"

pandas str.match stops matching my string once the pattern goes past the "+".

import pandas as pd

df = pd.DataFrame()

df['a'] = ["Huwaei p30 4GB+256GB"]

b = df.loc[df['a'].str.match(f"Huwaei p30 4GB+256GB")]
c = df.loc[df['a'].str.match(f"Huwaei p30 4GB+")]

print('b: ', b)
print('c: ', c)

b doesn't work: it returns an empty DataFrame.

c works: it seems to detect the "+" and returns the row.

Any ideas? This is driving me crazy. How can I match the whole string, including the "+"?

Thanks

>Solution :

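Escape the "+" with a backslash so that the regex treats it as a literal character. A raw string (r"...") keeps the backslash intact and avoids an invalid-escape warning:

b = df.loc[df['a'].str.match(r"Huwaei p30 4GB\+256GB")]
c = df.loc[df['a'].str.match(r"Huwaei p30 4GB\+")]

Output: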
b:                        a
0  Huwaei p30 4GB+256GB
c:                        a
0  Huwaei p30 4GB+256GB

Your original patterns do not match a literal "+": an unescaped + is a regex quantifier, not a plain character, so they are not doing the matching you expect. (ref)

To make this clearer with your original patterns: if you change the entry to df['a'] = ["Huwaei p30 4GB256GB"], both b and c return the row, because + in a regex means "match the preceding element one or more times". With the original data, pattern b matches Huwaei p30 4GB (the B+ consumes the single B), but it then expects a 2 where the string has a literal +, so the match fails. Pattern c ends right after B+, and str.match only anchors at the start of the string, so it succeeds without ever matching the + itself.
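As a small sketch of the explanation above, the following compares the unescaped pattern with a few ways of matching the "+" literally. The alternatives shown here (re.escape, Series.str.fullmatch, plain equality, and str.contains with regex=False) are not part of the original answer, and the variable names are only illustrative. It assumes a reasonably recent pandas version, since str.fullmatch is not available in old releases.

```python
import re

import pandas as pd

df = pd.DataFrame({"a": ["Huwaei p30 4GB+256GB"]})
target = "Huwaei p30 4GB+256GB"

# Unescaped: "+" is a quantifier on the preceding "B",
# so the literal "+" in the data is never matched.
unescaped = df["a"].str.match(target)

# re.escape backslash-escapes regex metacharacters such as "+",
# which is handy when the text to match comes from a variable.
escaped = df["a"].str.match(re.escape(target))

# str.match only anchors at the start; str.fullmatch requires
# the whole string to match the pattern.
full = df["a"].str.fullmatch(re.escape(target))

# If no regex features are needed, skip regex entirely.
exact = df["a"] == target
has_literal = df["a"].str.contains("4GB+", regex=False)

print(unescaped.tolist())    # [False]
print(escaped.tolist())      # [True]
print(full.tolist())         # [True]
print(exact.tolist())        # [True]
print(has_literal.tolist())  # [True]
```

If the goal is simply to find rows equal to a known string, the plain == comparison is probably the least surprising choice; regex matching is only needed when the pattern really is a pattern.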
