How to remove duplicate values based on another column in R?

I have a dataset that looks like this:

   Study_ID       Stage
1       100 Early Stage
2       100      Stable
3       200      Stable
4       300 Early Stage
5       400 Early Stage
6       400      Stable
7       500 Early Stage
8       500      Stable
9       600      Stable
10      700 Early Stage

I would like to remove duplicate Study IDs, keeping the entry where the patient is "Stable". In other words, whenever a Study ID appears more than once, I want to drop its "Early Stage" row. Study IDs that appear only once should be kept regardless of stage.

My desired output would look something like this:

  Study_ID       Stage
1      100      Stable
2      200      Stable
3      300 Early Stage
4      400      Stable
5      500      Stable
6      600      Stable
7      700 Early Stage

How can I go about doing this?

Reproducible data:

data <- data.frame(
  Study_ID = c("100", "100", "200", "300", "400", "400", "500", "500", "600", "700"),
  Stage = c("Early Stage", "Stable", "Stable", "Early Stage", "Early Stage",
            "Stable", "Early Stage", "Stable", "Stable", "Early Stage")
)

Solution:

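You can use dplyr's filter() together with duplicated(). A row is kept if it is the last occurrence of its Study_ID or if its Stage is not "Early Stage". Note that this relies on the "Stable" row coming after the "Early Stage" row for each duplicated ID, as it does in your data: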

data <- data.frame(
  Study_ID = c("100", "100", "200", "300", "400", "400", "500", "500", "600", "700"),
  Stage = c("Early Stage", "Stable", "Stable", "Early Stage", "Early Stage",
            "Stable", "Early Stage", "Stable", "Stable", "Early Stage")
)

library(dplyr)

# Keep the last row of each Study_ID, plus any row that is not "Early Stage"
filter(data, !duplicated(Study_ID, fromLast = TRUE) | Stage != "Early Stage")
#>   Study_ID       Stage
#> 1      100      Stable
#> 2      200      Stable
#> 3      300 Early Stage
#> 4      400      Stable
#> 5      500      Stable
#> 6      600      Stable
#> 7      700 Early Stage

Created on 2022-06-30 by the reprex package (v2.0.1)
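
If the rows are not guaranteed to be ordered with "Stable" last, counting occurrences per ID may be safer than duplicated(). The following is a minimal base R sketch of that idea, not part of the original answer; the function name drop_early_duplicates is hypothetical, and it assumes the columns are named Study_ID and Stage as in the question.

```r
# Hypothetical helper (not from the original answer): drop "Early Stage" rows
# for any Study_ID that occurs more than once, regardless of row order.
drop_early_duplicates <- function(df) {
  # Number of rows sharing each row's Study_ID
  n_per_id <- ave(seq_along(df$Study_ID), df$Study_ID, FUN = length)

  # Keep a row if its ID is unique, or if it is not "Early Stage"
  out <- df[n_per_id == 1 | df$Stage != "Early Stage", ]

  # Renumber the rows so the printed output starts at 1
  rownames(out) <- NULL
  out
}

data <- data.frame(
  Study_ID = c("100", "100", "200", "300", "400", "400", "500", "500", "600", "700"),
  Stage = c("Early Stage", "Stable", "Stable", "Early Stage", "Early Stage",
            "Stable", "Early Stage", "Stable", "Stable", "Early Stage"),
  stringsAsFactors = FALSE
)

result <- drop_early_duplicates(data)
print(result)
```

With dplyr, the same idea would be group_by(Study_ID) followed by filter(n() == 1 | Stage != "Early Stage"). One caveat with either form: if an ID has several "Early Stage" rows and no "Stable" row, all of its rows are dropped, so check whether that case can occur in your data.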
