R: How to remove duplicated entry across columns within each row

I have a data frame that looks like the following. Within each row, I would like to remove duplicated entries across the columns X1 to Xn.

df <- data.frame(ID = c("100", "101", "102"),
                 X1 = c("C23.2", "C23.2", "A79.1"),
                 X2 = c("C23.2", NA, "A79.1"),
                 X3 = c("A19.2", NA, "A79.1"))

The output would look something like this:

   ID    X1    X2    X3
1 100 C23.2 A19.2  <NA>
2 101 C23.2  <NA>  <NA>
3 102 A79.1  <NA>  <NA>

Solution:

Using pmap_dfr from purrr:

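library(dplyr)
library(purrr)

df %>%
  select(-ID) %>%                                           # work on the X columns only
  pmap_dfr(~ c(...) %>% replace(., duplicated(.), NA)) %>%  # set repeats within each row to NA
  bind_cols(select(df, ID), .)                              # put ID back in front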

Output (duplicates become NA in place; the remaining values are not shifted left):

   ID    X1   X2    X3
1 100 C23.2 <NA> A19.2
2 101 C23.2 <NA>  <NA>
3 102 A79.1 <NA>  <NA>
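
The expected output in the question has the surviving values moved to the left, whereas the pmap_dfr approach leaves NA where the duplicate was. If the left-aligned layout is what you need, here is a minimal base R sketch. The helper name dedupe_row and the variables x_cols and shifted are made up for illustration, and the code assumes every column except ID is a character column to be de-duplicated and that there are at least two such columns.

```r
df <- data.frame(ID = c("100", "101", "102"),
                 X1 = c("C23.2", "C23.2", "A79.1"),
                 X2 = c("C23.2", NA, "A79.1"),
                 X3 = c("A19.2", NA, "A79.1"))

# Assumption: every column other than ID should be de-duplicated.
x_cols <- setdiff(names(df), "ID")

# Hypothetical helper: keep the first occurrence of each non-NA value,
# then pad with NA on the right so the row keeps its original length.
dedupe_row <- function(row) {
  kept <- unique(row[!is.na(row)])
  c(kept, rep(NA_character_, length(row) - length(kept)))
}

# apply() returns one column per input row, so transpose before assigning back.
shifted <- df
shifted[x_cols] <- t(apply(df[x_cols], 1, dedupe_row))

print(shifted)
```

For the sample data, row 100 ends up as C23.2, A19.2, NA, and rows 101 and 102 keep a single value followed by NA. Because unique() keeps first occurrences, the original left-to-right order of the values is preserved.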