Flag column values that are not present in another dataframe

I have a benchmark df_1:

Col_1   insight_id    Col_2    Col_n
24249       ABC123      656      AAA
24249       ABC123      670      AXA
22549       ABC124      656      AAC
24249       ABC124      656      ADA
24236       ABC125      656      AAA

And a dataset df_2:

Col_a   insight_id    Col_b    Col_x
24299       ABC123      956      XAA
24299       ABC123      970      AXX
24299       ABC125      954      AAX
24299       ABC125      956      AXX

How do I mark the insight_ids that are not present in the second dataset? I know about:

df_1.loc[df_1['insight_id'].isin(df_2['insight_id'])]

But that returns the rows whose insight_id is present in df_2, not my expected output, which in this case is:

insight_id
    ABC124

Solution:

You can negate the condition with the ~ operator, so the mask keeps the rows whose insight_id does not appear in df_2:

cond = df_1["insight_id"].isin(df_2["insight_id"])
df_1.loc[~cond, "insight_id"].drop_duplicates()
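
For reference, here is a minimal self-contained sketch of the same idea, rebuilding the two frames from the question so it can be run as-is. The last step, which adds a boolean flag column, is only one possible reading of "flag" in the title; the column name missing_in_df_2 is a hypothetical name chosen for illustration and does not come from the question.

```python
import pandas as pd

# Sample data copied from the question.
df_1 = pd.DataFrame({
    "Col_1": [24249, 24249, 22549, 24249, 24236],
    "insight_id": ["ABC123", "ABC123", "ABC124", "ABC124", "ABC125"],
    "Col_2": [656, 670, 656, 656, 656],
    "Col_n": ["AAA", "AXA", "AAC", "ADA", "AAA"],
})
df_2 = pd.DataFrame({
    "Col_a": [24299, 24299, 24299, 24299],
    "insight_id": ["ABC123", "ABC123", "ABC125", "ABC125"],
    "Col_b": [956, 970, 954, 956],
    "Col_x": ["XAA", "AXX", "AAX", "AXX"],
})

# True where the insight_id of df_1 also occurs in df_2.
cond = df_1["insight_id"].isin(df_2["insight_id"])

# Negate the mask to keep only the ids absent from df_2, then deduplicate.
missing = df_1.loc[~cond, "insight_id"].drop_duplicates()
print(missing.tolist())  # ['ABC124']

# Optional: keep every row of df_1 and flag the absent ids instead.
# "missing_in_df_2" is a hypothetical column name, not from the question.
flagged = df_1.assign(missing_in_df_2=~cond)
print(flagged)
```

The flagged frame keeps all five rows of df_1, with missing_in_df_2 set to True only for the two ABC124 rows.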