Change specific values based on loc condition

Sample data:

import numpy as np
import pandas as pd

# np.nan marks cases that have no parent
sample_data = [
  {'Case #': 'A25', 'Parent Case #': 'A24', 'Data': 'Blah blah'},
  {'Case #': 'B46', 'Parent Case #': np.nan, 'Data': 'Waka waka'},
  {'Case #': 'B89', 'Parent Case #': 'B46', 'Data': 'Moo moo'},
  {'Case #': 'C12', 'Parent Case #': np.nan, 'Data': 'Meow'},
  {'Case #': 'C44', 'Parent Case #': np.nan, 'Data': 'Woof'},
  {'Case #': 'C77', 'Parent Case #': 'C12', 'Data': 'Hiss'},
  {'Case #': 'D55', 'Parent Case #': 'D2', 'Data': 'Ribbet'}
]

df = pd.DataFrame(sample_data)

The data consists of cases that may or may not have a parent case (i.e., each case is either a child or not). There are no grandchildren, so the maximum depth is 1.

However, some of the referenced parents are not present in this data set, and so these cases are effectively orphans.

For the purposes of my data, simply removing the reference to the parent will suffice for orphans. I can identify these orphans like so:

df.loc[~df["Parent Case #"].isna() & ~df2["Parent Case #"].isin(df2["Case #"].values)]
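The condition above may be easier to follow when each half gets its own name. This is a minimal sketch, assuming that `df2` in the snippet is meant to be the same frame as `df`; the names `has_parent`, `parent_known` and `orphans` are illustrative and not part of the original post.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Case #": ["A25", "B46", "B89", "C12", "C44", "C77", "D55"],
    "Parent Case #": ["A24", np.nan, "B46", np.nan, np.nan, "C12", "D2"],
})

# True where the row references some parent at all
has_parent = df["Parent Case #"].notna()
# True where the referenced parent exists as a case in this data set
parent_known = df["Parent Case #"].isin(df["Case #"])

# Orphans reference a parent that is not present in the data
orphans = df.loc[has_parent & ~parent_known]
print(orphans)
```

With the sample data, this selects the rows for cases A25 and D55, whose parents (A24 and D2) do not appear in "Case #".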

For these two matching rows, I want to remove the "Parent Case #" reference (make that value nan / empty for only these two rows). How do I do this? I feel like I am just missing one final step. I’m not sure how to do assignment using my condition with its & logic.

>Solution :

You only need to pass the column name as the second argument to loc, so that the missing value is assigned to that column alone for the matching rows:

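# changed df2 to df so both halves of the condition use the same frame
# rows: have a parent AND that parent is not among the known cases; column: 'Parent Case #'
df.loc[df["Parent Case #"].notna() & ~df["Parent Case #"].isin(df["Case #"]), 'Parent Case #'] = np.nan
print(df)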
  Case # Parent Case #       Data
0    A25           NaN  Blah blah
1    B46           NaN  Waka waka
2    B89           B46    Moo moo
3    C12           NaN       Meow
4    C44           NaN       Woof
5    C77           C12       Hiss
6    D55           NaN     Ribbet

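Another idea: drop the notna() check. Rows whose parent is already missing are then selected as well and simply get NaN assigned again, so the result is the same (as long as "Case #" itself contains no missing values):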

df.loc[~df["Parent Case #"].isin(df["Case #"]), 'Parent Case #'] = np.nan
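
The same result can probably also be reached without loc, using `Series.where`, which keeps values where a condition is True and puts NaN everywhere else. This is a sketch of an alternative, not part of the original answer; it assumes "Case #" has no missing values, and the name `parent_known` is illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Case #": ["A25", "B46", "B89", "C12", "C44", "C77", "D55"],
    "Parent Case #": ["A24", np.nan, "B46", np.nan, np.nan, "C12", "D2"],
    "Data": ["Blah blah", "Waka waka", "Moo moo", "Meow", "Woof", "Hiss", "Ribbet"],
})

# True where the referenced parent exists as a case in this data set
parent_known = df["Parent Case #"].isin(df["Case #"])

# Keep the parent reference only where the parent is known; NaN otherwise
df["Parent Case #"] = df["Parent Case #"].where(parent_known)
print(df)
```

Unlike the loc assignment, this rebuilds the whole column rather than modifying selected rows in place, which may matter if other code holds a reference to the original column.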