I am trying to test if a string in one column starts with a string in another column like so:
>>> import pandas as pd
>>>
>>> df = pd.DataFrame( {'A': ['Sam', 'Ham', 'Pam'], 'B': ['Samuelson', 'Mike', 'Pamela']})
>>> df
     A          B
0  Sam  Samuelson
1  Ham       Mike
2  Pam     Pamela
>>> df.B.str.startswith(df.A)
0   NaN
1   NaN
2   NaN
Name: B, dtype: float64
>>>
Apparently this does not work. Does anyone know how to accomplish this kind of string comparison?
>Solution :
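str.startswith expects a single pattern string, not a Series, so it cannot compare the two columns row by row. You can use apply instead: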
df.apply(lambda row: row['B'].startswith(row['A']), axis=1)
which gives:
0     True
1    False
2     True
dtype: bool
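As a self-contained sketch of the apply approach: the frame is rebuilt from the question, and the names `mask` and `matches` are only illustrative. The result is a boolean Series that shares the index of df, so it can also be used to filter rows:

```python
import pandas as pd

df = pd.DataFrame({'A': ['Sam', 'Ham', 'Pam'],
                   'B': ['Samuelson', 'Mike', 'Pamela']})

# Row-wise check: does the string in B start with the string in A?
mask = df.apply(lambda row: row['B'].startswith(row['A']), axis=1)

# The boolean Series is aligned with df's index, so it works as a row filter.
matches = df[mask]
```

Note that apply with axis=1 calls the lambda once per row in Python, so it tends to be slower than the alternatives on large frames.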
Or just use a list comprehension with zip:
[y.startswith(x) for x,y in zip(df['A'], df['B'])]
If you want a new column:
df['C'] = [y.startswith(x) for x,y in zip(df['A'], df['B'])]
Output:
     A          B      C
0  Sam  Samuelson   True
1  Ham       Mike  False
2  Pam     Pamela   True
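Both approaches call the plain Python str.startswith on each value, so they raise an error if either column contains a missing value (None or NaN). A minimal sketch of one way to guard against that is shown here. The helper name `starts_with` is hypothetical, and treating missing values as False is an assumption that may not suit every case:

```python
import pandas as pd

def starts_with(prefixes, values):
    """Return a list of bools: True where values[i] starts with prefixes[i].

    Non-string entries (e.g. None or NaN) give False instead of raising.
    """
    return [
        isinstance(p, str) and isinstance(v, str) and v.startswith(p)
        for p, v in zip(prefixes, values)
    ]

# Same data as above, but with a missing prefix in the last row.
df = pd.DataFrame({'A': ['Sam', 'Ham', None],
                   'B': ['Samuelson', 'Mike', 'Pamela']})
df['C'] = starts_with(df['A'], df['B'])
```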