How can you convert duplicates across multiple columns to be NA in R?

I have a dataset in which I want to set any values that are duplicated across several columns to NA. I’ve found answers that look for duplicates in a single column, and ways to remove duplicate rows entirely (e.g., distinct()), but neither does what I need. I have this data:

library(dplyr)
test <- tibble(job = c(1:6), 
               name = c("j", "j", "j", "c", "c", "c"),
               id = c(1, 1, 2, 1, 5, 1))

And want this result:

library(dplyr)

answer <- tibble(job = c(1:6), 
                 name = c("j", NA, "j", "c", "c", NA),
                 id = c(1, NA, 2, 1, 5, NA))

And I’ve tried a solution like this using duplicated(), but it fails:

# Attempted solution
library(dplyr)
test %>%
  mutate_at(vars(id, name), ~case_when(
    duplicated(id, name) ~ NA,
    TRUE ~ .
  ))

I’d prefer to use tidy solutions, but I can be flexible as long as the answer can be piped.

>Solution :

We could create a helper column that pastes id and name together, flag repeated helper values with duplicated(), and replace the flagged rows with NA using ifelse() inside across():

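library(dplyr)
test %>% 
  # helper combines both columns so each (id, name) pair becomes one string
  mutate(helper = paste(id, name)) %>% 
  # set both columns to NA where the pair was already seen; .keep = "unused" drops helper
  mutate(across(c(name, id), ~ifelse(duplicated(helper), NA, .)), .keep = "unused")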

    job name     id
  <int> <chr> <dbl>
1     1 j         1
2     2 NA       NA
3     3 j         2
4     4 c         1
5     5 c         5
6     6 NA       NA
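
As a possible variation, the pasted helper could be swapped for a logical flag computed from both columns at once. This is only a sketch, not part of the answer above: the column name dup is a hypothetical choice, and replace() stands in for ifelse(). It relies on duplicated() accepting a data frame and comparing whole rows. The attempt in the question did not test pairs because the second positional argument of duplicated() is incomparables, not a second column.

```r
library(dplyr)

test <- tibble(job = c(1:6),
               name = c("j", "j", "j", "c", "c", "c"),
               id = c(1, 1, 2, 1, 5, 1))

result <- test %>%
  # dup (hypothetical name) is TRUE when the (name, id) pair appeared in an earlier row
  mutate(dup = duplicated(data.frame(name, id))) %>%
  # replace() overwrites only the flagged positions and keeps each column's type
  mutate(across(c(name, id), ~ replace(.x, dup, NA))) %>%
  # drop the flag so only the original columns remain
  select(-dup)

result
```

Computing the flag in its own step, before either column is overwritten, keeps it independent of the order in which across() modifies name and id. Comparing rows rather than pasted strings also avoids relying on the separator that paste() inserts.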