How to remove duplicates in a loop in R

I have a loop which goes through a large number of .tsv files and runs a function to output the results to one file. The loop works, but some of the .tsv files have duplicate values in one of the columns, which stops the loop from working. I need to remove the rows with duplicate values in column V5. I have tried commands from previous answers on this site, but they are not working for some reason.

My input .tsv files look like this once read in (other_trait):

V1 V2 V3 V4 V5
10 201874235 G T rs389130213

10 201876195 G C rs121467298
10 201876295 T A rs121467298

My code starts as below; it formats each file before running it through the function.

files <- list.files(path = ".", pattern = ".tsv")
files
datalist <- list()
for (i in 1:length(files)) {
  other_trait <- read.table(files[i])
  colnames(other_trait)[which(names(other_trait) == "V2")] <- "BP"
  other_trait <- merge(other_trait, subset_1[, c("BP", "MAF")], by = "BP")
  other_trait <- unique(other_trait$V5)

I have tried using unique() as above, and also:
other_trait <- other_trait[!(duplicated(other_trait$V5)), ]
unique() drops the other columns of the data frame and keeps only the unique values of V5, and !(duplicated) doesn’t seem to do anything!

Solution:

df <- read.table(text = "V1 V2 V3 V4 V5
10 201874235 G T rs389130213
10 201876195 G C rs121467298
10 201876295 T A rs121467298", header = TRUE)

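library(dplyr)

# subset_1 is the asker's own data frame with BP and MAF columns.
df %>%
  rename(BP = V2) %>%
  # merge() in the question is an inner join; swap in inner_join() to
  # drop rows whose BP has no match in subset_1.
  left_join(subset_1[, c("BP", "MAF")], by = "BP") %>%
  # Keep the first row for each V5 value, retaining all other columns.
  distinct(V5, .keep_all = TRUE)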
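For a base R route that slots into the original loop, the sketch below may help. It first shows why assigning unique(other_trait$V5) back to other_trait loses the other columns, then applies the !duplicated() row filter inside a loop. Everything needed to run it is invented for illustration: the temporary files a.tsv and b.tsv, and the stand-in subset_1 with made-up MAF values. It also assumes the real files are tab-separated with no header row, which is what the default V1..V5 names suggest. Whether this addresses the reason !duplicated() appeared to do nothing in the original run cannot be told from the question, but it should work as long as the unique() line is removed and the result is assigned back.

```r
# Toy data mirroring the question's input (read.table assigns V1..V5).
other_trait <- read.table(text = "
10 201874235 G T rs389130213
10 201876195 G C rs121467298
10 201876295 T A rs121467298
")

# unique() on one column returns a plain vector, so assigning it back to
# other_trait would throw away every other column.
ids <- unique(other_trait$V5)
is.data.frame(ids)

# duplicated() flags the second and later occurrences of each V5 value;
# negating it as a row index keeps whole rows, first occurrence only.
deduped <- other_trait[!duplicated(other_trait$V5), ]

# Hypothetical stand-in for the asker's subset_1 (MAF values are invented).
subset_1 <- data.frame(
  BP  = c(201874235L, 201876195L, 201876295L),
  MAF = c(0.12, 0.30, 0.05)
)

# Write two small example files to a temporary directory (assumed layout:
# tab-separated, no header row).
tmp <- tempfile("tsv_demo_")
dir.create(tmp)
rows <- c("10\t201874235\tG\tT\trs389130213",
          "10\t201876195\tG\tC\trs121467298",
          "10\t201876295\tT\tA\trs121467298")
writeLines(rows, file.path(tmp, "a.tsv"))
writeLines(rows[1:2], file.path(tmp, "b.tsv"))

# "\\.tsv$" matches only names ending in .tsv; the unescaped ".tsv" is a
# regular expression in which "." matches any character.
files <- list.files(path = tmp, pattern = "\\.tsv$", full.names = TRUE)
datalist <- vector("list", length(files))

for (i in seq_along(files)) {
  other_trait <- read.table(files[i], sep = "\t")
  names(other_trait)[names(other_trait) == "V2"] <- "BP"
  other_trait <- merge(other_trait, subset_1[, c("BP", "MAF")], by = "BP")
  # Drop rows whose V5 value has already been seen, keeping all columns.
  other_trait <- other_trait[!duplicated(other_trait$V5), ]
  datalist[[i]] <- other_trait
}

# Number of rows kept per file.
sapply(datalist, nrow)
```

seq_along(files) is used instead of 1:length(files) so the loop body is skipped when no files match, rather than running with i = 1 and i = 0.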