pandas: cannot set column with substring extracted from other column

I’m doing something wrong when attempting to set a column for a masked subset of rows to the substring extracted from another column.

Here is some example code that illustrates the problem I am facing:

import pandas as pd

data = [
    {'type': 'A', 'base_col': 'key=val'},
    {'type': 'B', 'base_col': 'other_val'},
    {'type': 'A', 'base_col': 'key=val'},
    {'type': 'B', 'base_col': 'other_val'}
]

df = pd.DataFrame(data)
mask = df['type'] == 'A'
df.loc[mask, 'derived_col'] = df[mask]['base_col'].str.extract(r'key=(.*)')

print("df:")
print(df)
print("mask:")
print(mask)
print("extraction:")
print(df[mask]['base_col'].str.extract(r'key=(.*)'))

The output I get from the above code is as follows:

df:
  type   base_col  derived_col
0    A    key=val          NaN
1    B  other_val          NaN
2    A    key=val          NaN
3    B  other_val          NaN
mask:
0     True
1    False
2     True
3    False
Name: type, dtype: bool
extraction:
     0
0  val
2  val

The boolean mask is as I expect, and the substrings extracted from the masked rows (indexes 0 and 2) are also as I expect, yet the new derived_col comes out as all NaN. I would expect derived_col to be ‘val’ for indexes 0 and 2, and NaN for the other two rows.

Please clarify what I am getting wrong here. Thanks!

>Solution :

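You should assign a Series, not a DataFrame. By default (expand=True), str.extract returns a DataFrame whose single column is labeled 0. When a DataFrame is assigned through .loc, pandas aligns it on column labels as well as on the index, and the label 0 does not match 'derived_col', so every value comes out as NaN. Selecting column 0 yields a Series, which is aligned on the index only: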

mask = df['type'] == 'A'
df.loc[mask, 'derived_col'] = df[mask]['base_col'].str.extract(r'key=(.*)')[0]

df
Out[449]: 
  type   base_col derived_col
0    A    key=val         val
1    B  other_val         NaN
2    A    key=val         val
3    B  other_val         NaN 
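
As an alternative to selecting column 0, str.extract accepts expand=False, which returns a Series directly when the pattern has exactly one capture group. The sketch below rebuilds the example frame from the question; the variable names extracted_frame and extracted_series are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'type': ['A', 'B', 'A', 'B'],
    'base_col': ['key=val', 'other_val', 'key=val', 'other_val'],
})
mask = df['type'] == 'A'

# Default (expand=True): a DataFrame with one column labeled 0.
extracted_frame = df.loc[mask, 'base_col'].str.extract(r'key=(.*)')

# expand=False with a single capture group: a Series indexed like the masked rows.
extracted_series = df.loc[mask, 'base_col'].str.extract(r'key=(.*)', expand=False)

print(type(extracted_frame).__name__)
print(type(extracted_series).__name__)

# The Series is aligned on the index only, so rows 0 and 2 are filled in.
df.loc[mask, 'derived_col'] = extracted_series
print(df)
```

Using df.loc[mask, 'base_col'] instead of df[mask]['base_col'] selects the same rows in a single indexing step; either form works for reading the values here.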