Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

reconciling duplicates in one column with various values in another column

I have a df that operates like this:

ID <- c("A", "B", "B", "C", "D", "D", "E", "F")
FLD<-c("GOOD","BAD","TERRIBLE", "BAD", "BAD","NA", "TERRIBLE", "NA")

df<-data.frame(ID, FLD)

I need to get it down so there’s only rows with unique ID’s. If ID’s are duplicated, and the FLD value for one duplicate is BAD and one value is TERRIBLE, I want the one that says TERRIBLE to remain and the one that says BAD to drop. If, however, in the case of ID "D", the FLD values are BAD and NA, I want to keep BAD.

I’m not sure how to deal with it, because the actual dataset has tens of thousands of entries, so I can’t just fix it by looking. Basically the dummy script I want is along the lines of:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

If duplicated, and one duplicate is BAD is TERRIBLE, drop the row with BAD. But if one duplicate is BAD and one is NA, keep the one with BAD. So the outcome would be:

  ID      FLD
1  A     GOOD
2  B TERRIBLE
3  C      BAD
4  D      BAD
5  E TERRIBLE
6  F       NA

And I am lost.

>Solution :

If we turn FLD into an ordered factor we can then use aggregate to get the minimum value per ID as follows:

df$FLD <- factor(df$FLD, levels=c("TERRIBLE", "BAD", "GOOD", "NA"), ordered = TRUE)
aggregate(data=df, FLD ~ ID, min)

  ID      FLD
1  A     GOOD
2  B TERRIBLE
3  C      BAD
4  D      BAD
5  E TERRIBLE
6  F       NA
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading