Drop duplicates and keep first in R data.table

March 25, 2022

I am not familiar with R, so apologies if this has been asked before; I could not find an existing answer.

Suppose I have network data of IPs like this:

library(data.table)
toy_data = data.table(from=c("A","B","A","C","D","C"), to=c("B","A","C","B","A","A"))

from	to
A	B
B	A
A	C
C	B
D	A
C	A

I cannot load the whole network into igraph, so I am trying to compute statistics on chunks. Since the network is undirected, I would like to drop every row that repeats an earlier edge with from and to swapped (rows 2 and 6).

I originally thought that something like this would work:
unique(toy_data[,.(c(from,to)|c(to,from))])
but unfortunately it does not.

I then thought of using two auxiliary columns:

toy_data[,orig:=paste(from,to,sep="")]
toy_data[,reverse:=paste(to,from,sep="")]

and then working with something like:
unique(toy_data[,.(?)])

but my guess is that there is a much easier way than what I am doing.

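Solution:

Instead of creating temporary columns, paste the row-wise minimum (pmin) together with the row-wise maximum (pmax), so that both directions of an edge produce the same key. Then keep only the first occurrence of each key by negating (!) the result of duplicated.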

toy_data[!duplicated(paste(pmin(from, to), pmax(from, to)))]

Output:

    from     to
   <char> <char>
1:      A      B
2:      A      C
3:      C      B
4:      D      A
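
To see why this works, it can help to look at the intermediate key. The following is a minimal sketch rather than part of the original answer; the names edge_key, deduped and deduped_no_paste are made up for illustration. It also shows a variant that, as far as I can tell, avoids paste altogether by calling duplicated on a two-column table, which may be safer if node labels can contain spaces (paste joins with a single space by default, so "a b" + "c" and "a" + "b c" would produce the same key).

```r
library(data.table)

toy_data <- data.table(from = c("A", "B", "A", "C", "D", "C"),
                       to   = c("B", "A", "C", "B", "A", "A"))

# Direction-independent key: smaller endpoint first, larger endpoint second.
# Rows 1 and 2 both give "A B"; rows 3 and 6 both give "A C".
edge_key <- toy_data[, paste(pmin(from, to), pmax(from, to))]

# Keep the first row seen for each key (same result as the one-liner above).
deduped <- toy_data[!duplicated(edge_key)]

# Variant without paste: compare the (min, max) pair as two separate columns,
# so no separator is involved.
deduped_no_paste <- toy_data[!duplicated(data.table(pmin(from, to), pmax(from, to)))]
```

Both variants keep the first row encountered for each undirected edge, so the surviving rows retain their original from/to orientation. When working in chunks, the same edge can still appear in two different chunks, so the chunk results would need one more deduplication pass after they are combined.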

data-manipulation

byMR

Published March 25, 2022
