Remove duplicate node combinations in a pandas DataFrame

I have a dataframe that is an edge list for an undirected graph. It looks like this:

    node 1 node 2 doc
0   Kn  Kn  doc5477 
1   TS  Kn  doc5477 
2   Kn  TS  doc5477 
3   TS  TS  doc5477 
4   Kn  Kn  doc10967
5   Kn  TS  doc10967
6   TS  TS  doc10967
7   TS  Kn  doc10967    

How can I make sure that each combination of nodes appears only once per document? For example, rows 1 and 2 contain the same pair of nodes (the graph is undirected, so TS/Kn equals Kn/TS), and I only want one of them to remain. The same goes for rows 5 and 7.

So that my dataframe looks like this:

    node 1 node 2 doc
0   Kn  Kn  doc5477 
1   TS  Kn  doc5477     
3   TS  TS  doc5477 
4   Kn  Kn  doc10967
5   Kn  TS  doc10967
6   TS  TS  doc10967

Solution:

First, select the columns that define a unique combination (node 1, node 2 and doc in your case). Then sort the values within each row with apply(sorted, axis=1), which returns a Series of sorted lists, so that (TS, Kn) and (Kn, TS) produce the same list. Finally, negate the result of duplicated() to build a boolean mask that keeps only the first row of each combination.

Try this:

out = df.loc[~df[['node 1', 'node 2', 'doc']].apply(sorted, axis=1).duplicated()]

# Output :

print(out)

  node 1 node 2        doc
0     Kn     Kn    doc5477
1     TS     Kn    doc5477
3     TS     TS    doc5477
4     Kn     Kn   doc10967
5     Kn     TS   doc10967
6     TS     TS   doc10967
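
For reference, here is a self-contained sketch of the same idea that can be run as-is. It is a variant, not the exact one-liner above: it sorts only the two node columns with NumPy and keeps doc as a separate key column, so a document name can never swap places with a node name during sorting. The names `pairs`, `key`, `a` and `b` are illustrative only.

```python
import numpy as np
import pandas as pd

# Rebuild the example edge list from the question.
df = pd.DataFrame({
    "node 1": ["Kn", "TS", "Kn", "TS", "Kn", "Kn", "TS", "TS"],
    "node 2": ["Kn", "Kn", "TS", "TS", "Kn", "TS", "TS", "Kn"],
    "doc": ["doc5477"] * 4 + ["doc10967"] * 4,
})

# Sort each node pair within its row so (TS, Kn) and (Kn, TS) share one key.
pairs = np.sort(df[["node 1", "node 2"]].to_numpy(dtype=object), axis=1)

# Helper frame holding the order-independent pair plus the document.
key = pd.DataFrame(pairs, columns=["a", "b"], index=df.index)
key["doc"] = df["doc"]

# Keep only the first row of each (sorted pair, doc) combination.
out = df.loc[~key.duplicated()]
print(out)
```

With the sample data this keeps the rows with index 0, 1, 3, 4, 5 and 6, which matches the desired result. Pass keep='last' to duplicated() if the later of two mirrored rows should be kept instead.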