Filter for unique rows by ID

I have a dataset that looks like this:

 ID           countries        income
1                x                 HIC
1                x                 HIC
1                y                 LIC
1                z                 UIC
2                a                 HIC
3                y                 LIC
3                z                 UIC

I want to filter to keep unique income rows PER ID such that I would eliminate only the repeat income values within each ID, but not across all IDs; thus the data would look like:

    ID           countries        income
    1                x                 HIC
    1                y                 LIC
    1                z                 UIC
    2                a                 HIC
    3                y                 LIC
    3                z                 UIC

the current command I have now is this, but this removes all repeats of income values. What can I tweak so that it’s only unique by ID?

dat2 <- dat1 %>% distinct(income, .keep_all = TRUE)

2nd question, if I wanted to further merge the rows such that the above data would look like this, how could I condense the countries and income rows by ID? Could also create a new variable to achieve this?

    ID           countries        income
    1                x,y,z         HIC,LIC, UIC
    2                a             HIC
    3                y,z           LIC,UIC

Thank you in advance!

>Solution :

You may try

library(dplyr)
df %>%
  group_by(ID) %>%
  distinct(income, .keep_all = TRUE) %>%
  summarise(across(everything(), ~paste0(.x, collapse = ",")))


     ID countries income     
  <int> <chr>     <chr>      
1     1 x,y,z     HIC,LIC,UIC
2     2 a         HIC        
3     3 y,z       LIC,UIC    

Leave a Reply