I have a dataframe that is an edge list for an undirected graph. It looks like this:

```
  node 1 node 2       doc
0     Kn     Kn   doc5477
1     TS     Kn   doc5477
2     Kn     TS   doc5477
3     TS     TS   doc5477
4     Kn     Kn  doc10967
5     Kn     TS  doc10967
6     TS     TS  doc10967
7     TS     Kn  doc10967
```

How can I make sure that each combination of nodes appears only once per document? For example, rows 1 and 2 describe the same edge, so I only want to keep one of them. The same goes for rows 5 and 7.

The resulting dataframe should look like this:

```
  node 1 node 2       doc
0     Kn     Kn   doc5477
1     TS     Kn   doc5477
3     TS     TS   doc5477
4     Kn     Kn  doc10967
5     Kn     TS  doc10967
6     TS     TS  doc10967
```

### Solution:

First, select the columns on which you need a unique combination (`node 1`, `node 2` and `doc` in your case), then sort the values within each row so that `TS, Kn` and `Kn, TS` produce the same combination. This returns a Series with one sorted combination per row. Finally, negate `Series.duplicated` with `~` and use it as a boolean mask to keep only the first row of each combination.

Try this:

```
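# Sort each row into a tuple (hashable, so duplicated() can compare rows), then keep first occurrences.
out = df.loc[~df[['node 1', 'node 2', 'doc']].apply(lambda r: tuple(sorted(r)), axis=1).duplicated()]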
```

#### Output:

```
print(out)
  node 1 node 2       doc
0     Kn     Kn   doc5477
1     TS     Kn   doc5477
3     TS     TS   doc5477
4     Kn     Kn  doc10967
5     Kn     TS  doc10967
6     TS     TS  doc10967
```
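
If you would rather not sort the `doc` value together with the node names (that only works as long as a doc id can never be mistaken for a node name), one possible variant is to sort just the two node columns with NumPy and treat `doc` as a separate part of the key. This is only a sketch: the dataframe is rebuilt from the sample in the question, and the helper column names `a` and `b` are placeholders I made up.

```python
import numpy as np
import pandas as pd

# Rebuild the example edge list from the question.
df = pd.DataFrame({
    "node 1": ["Kn", "TS", "Kn", "TS", "Kn", "Kn", "TS", "TS"],
    "node 2": ["Kn", "Kn", "TS", "TS", "Kn", "TS", "TS", "Kn"],
    "doc": ["doc5477"] * 4 + ["doc10967"] * 4,
})

# Sort only the two node columns within each row, so (TS, Kn) and (Kn, TS)
# map to the same pair. "a" and "b" are placeholder column names.
nodes = np.sort(df[["node 1", "node 2"]].to_numpy(), axis=1)
key = pd.DataFrame(nodes, index=df.index, columns=["a", "b"])

# Add the document id so duplicates are only detected within the same doc.
key["doc"] = df["doc"]

# Keep the first row for each (sorted node pair, doc) key.
out = df.loc[~key.duplicated()]
print(out)
```

On the sample data this keeps the rows with index 0, 1, 3, 4, 5 and 6, and the original `node 1` / `node 2` values are left untouched in `out`.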