Say I have a dataframe:
Name <- c("Jon", "Jon", "Maria", "Maria", "Tina", "Tina")
Score <- c(23, 23, 32, 32, 26, 78)
df <- data.frame(Name, Score)
I would like to check whether the Score column is the same or different within each name. In theory, I expect the score to be the same for every row with a given name, but it could be the case that they’re different (like with Tina), and I would like to check.
What might be an efficient way to do this? (My dataframe has over 150 000 rows).
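Try this to get the counts of each Name/Score combination, then check whether any Name appears more than once (meaning it has more than one distinct Score):
library(dplyr)
df %>%
  count(Name, Score) %>%                # one row per distinct Name/Score pair
  add_count(Name, name = "name_n") %>%  # number of distinct scores per Name
  filter(name_n > 1)                    # keep only names with conflicting scores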
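If you only need the list of names whose score is not constant, rather than the individual conflicting rows, a grouped summary with `n_distinct()` may be a bit more direct. This is a minimal sketch using the example data from the question; the names `score_check`, `n_scores`, and `inconsistent` are made up for illustration.

```r
library(dplyr)

# Example data from the question
Name <- c("Jon", "Jon", "Maria", "Maria", "Tina", "Tina")
Score <- c(23, 23, 32, 32, 26, 78)
df <- data.frame(Name, Score)

# One row per Name, with the number of distinct Score values seen for it
score_check <- df %>%
  group_by(Name) %>%
  summarise(n_scores = n_distinct(Score), .groups = "drop")

# Names whose Score is not the same on every row
inconsistent <- score_check %>%
  filter(n_scores > 1)

print(inconsistent)
```

With the example data, only Tina should come back, with `n_scores` equal to 2. Note that `n_distinct()` counts `NA` as a value by default, so a name with one real score and one missing score would also be flagged unless you pass `na.rm = TRUE`.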