Assume I have two DataFrames:
DF1: DATA1, DATA1, DATA2, DATA2
DF2: DATA2
I want to remove from DF1 every row whose value appears in DF2, while keeping the duplicates that remain in DF1. How can I do this?
Expected result: DATA1, DATA1
>Solution :
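Use a left anti join.
When you join two DataFrames with a left anti join (left_anti, also accepted as leftanti), the result contains only the columns of the left DataFrame, and only the rows that have no match in the right DataFrame. Duplicate rows in the left DataFrame are kept, so DATA1 still appears twice.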
# 'id' stands for whichever column holds the values to compare (DATA1, DATA2, ...)
df3 = df1.join(df2, df1['id'] == df2['id'], how='left_anti')
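To see why the duplicates survive, here is a minimal plain-Python model of what a left anti join does. This is not Spark code, only a sketch of the semantics: the helper name left_anti, the row dictionaries, and the "id" column are assumptions made for illustration.

```python
def left_anti(left_rows, right_rows, key):
    """Model of a left anti join: keep each left row whose key has no match on the right."""
    # Null keys never compare equal in a join condition, so they are left out
    # of the lookup set and can never cause a left row to be dropped.
    right_keys = {row[key] for row in right_rows if row[key] is not None}
    # Every left row is tested on its own, so duplicates are not collapsed.
    return [row for row in left_rows if row[key] not in right_keys]


# Hypothetical rows mirroring DF1 and DF2 from the question.
df1_rows = [{"id": "DATA1"}, {"id": "DATA1"}, {"id": "DATA2"}, {"id": "DATA2"}]
df2_rows = [{"id": "DATA2"}]

result = left_anti(df1_rows, df2_rows, "id")
print([row["id"] for row in result])  # ['DATA1', 'DATA1']
```

Each left row is checked independently against the set of right keys, which is why both copies of DATA1 remain while both copies of DATA2 are dropped.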