pandas str.match not working after "+".
import pandas as pd
df = pd.DataFrame()
df['a'] = ["Huwaei p30 4GB+256GB"]
b = df.loc[df['a'].str.match(f"Huwaei p30 4GB+256GB")]
c = df.loc[df['a'].str.match(f"Huwaei p30 4GB+")]
print('b: ', b)
print('c: ', c)
b doesn’t work, returns empty.
c works, detects the "+", returns the row
Any idea? This is driving me crazy. How can I match the whole string, including the "+"?
Thanks
>Solution :
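Escape the + with a backslash so that it is matched literally. A raw string (r"...") keeps Python from treating \+ as an invalid string escape:
b = df.loc[df['a'].str.match(r"Huwaei p30 4GB\+256GB")]
c = df.loc[df['a'].str.match(r"Huwaei p30 4GB\+")]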
Output
b: a
0 Huwaei p30 4GB+256GB
c: a
0 Huwaei p30 4GB+256GB
str.match interprets its argument as a regular expression, not as a literal string, so the unescaped + in your patterns was treated as a quantifier rather than as a plus sign. (ref)
To make the behaviour of your original patterns clearer: if you change the original entry to df['a'] = ["Huwaei p30 4GB256GB"], both patterns return the row, because + in a regex means "match the preceding element one or more times". In the original case, pattern b first matches Huwaei p30 4GB, then B+ looks for further B characters and finds none, and then the next pattern character, 2, fails against the literal + in the data, so nothing matches. Pattern c succeeds only because str.match anchors at the start of the string but not at the end: it matches the prefix Huwaei p30 4GB and ignores the rest, so it never actually detected the +.
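As a minimal sketch of the points above, the snippet below compares the unescaped pattern with a few literal-matching alternatives. The extra sample strings and the variable names are hypothetical, added only for illustration. re.escape is used here instead of typing the backslash by hand, and Series.str.fullmatch assumes a reasonably recent pandas (1.1 or later).

```python
import re

import pandas as pd

# Hypothetical sample data: the original entry plus three variations.
s = pd.Series([
    "Huwaei p30 4GB+256GB",       # the original entry
    "Huwaei p30 4GB256GB",        # no plus sign
    "Huwaei p30 4GBBB256GB",      # repeated B instead of a plus sign
    "Huwaei p30 4GB+256GB blue",  # original entry with trailing text
])
target = "Huwaei p30 4GB+256GB"

# Unescaped: "B+" means "one or more B", so a literal "+" in the data breaks the match.
unescaped = s.str.match(target)              # [False, True, True, False]

# re.escape backslash-escapes regex metacharacters such as "+".
escaped = s.str.match(re.escape(target))     # [True, False, False, True]

# fullmatch also requires the pattern to cover the entire string.
full = s.str.fullmatch(re.escape(target))    # [True, False, False, False]

# For a purely literal comparison, no regex is needed at all.
exact = s == target                          # [True, False, False, False]

print(unescaped.tolist(), escaped.tolist(), full.tolist(), exact.tolist())
```

If the goal is only to find rows equal to a fixed string, the plain == comparison is probably the simplest choice. str.fullmatch is the option to reach for when the whole string has to match a real pattern.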