
Finding the groups in which a variable takes more than one unique value

In DATA below, how can I find each unique study_id for which the variable scale takes on more than one unique value?

The expected answer is Li (scale for Li takes both Other and MBTI). How can I find it with base R or dplyr code?

m="
study_id   year es_id       r     se     n pub_type  context  ed_setting  age_grp L1    L2    prof  scale outcome
Dreyer     1992   130  0      0.0574   305 DocDisse~ Foreign~ CollegeUni~ Adult   Afri~ Engl~ NA    Other Listen~
Dreyer     1992   131  0.04   0.0574   305 DocDisse~ Foreign~ CollegeUni~ Adult   Afri~ Engl~ NA    Other Writing
Dreyer     1992   132 -0.03   0.0574   305 DocDisse~ Foreign~ CollegeUni~ Adult   Afri~ Engl~ NA    Other Reading
Dreyer     1992   133  0      0.0574   305 DocDisse~ Foreign~ CollegeUni~ Adult   Afri~ Engl~ NA    Other Overall
Ghapanchi  2011    89  0.31   0.0806   141 JournalA~ Foreign~ CollegeUni~ Adult   Pers~ Engl~ NA    Other Overall
Hassan     2001   177  0.25   0.117     71 NA        Foreign~ CollegeUni~ NA      Arab~ Engl~ NA    Other Speaki~
Kralova    2012   137  0.0252 0.117     75 JournalA~ Foreign~ CollegeUni~ Adult   Slov~ Engl~ Inte~ Other Speaki~
Li         2009    55 -0.04   0.132     59 JournalA~ Foreign~ CollegeUni~ Adult   Chin~ Engl~ NA    Other Grammar
Li         2009    56  0.355  0.124     59 JournalA~ Foreign~ CollegeUni~ Adult   Chin~ Engl~ NA    Other Pragma~
Li         2003    57  0.039  0.0735   187 JournalA~ Foreign~ CollegeUni~ Multip~ Chin~ Engl~ NA    MBTI  Overall
"

DATA <- read.table(text = m, header = TRUE)


Solution:

Here's a way to do it in dplyr as well as in base R.

The idea is to keep only the rows whose study_id group has more than one unique scale value, and then reduce those rows to the unique study_id values.

library(dplyr)

DATA %>%
  group_by(study_id) %>%
  dplyr::filter(n_distinct(scale) > 1) %>%
  ungroup() %>%
  distinct(study_id)

# study_id
#  <chr>   
#1 Li      

Base R –

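# Encode scale as integers first: ave() writes the counts back into its first
# argument, so a character scale would turn them into strings.
unique(subset(DATA, ave(as.integer(factor(scale)), study_id,
       FUN = function(x) length(unique(x))) > 1, select = study_id))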

#  study_id
#8       Li
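
If only the group names are needed rather than a data frame, a shorter base R variant is to count the distinct scale values per study_id with tapply() and keep the names whose count exceeds one. This is a minimal sketch, not part of the original answer: the small DATA frame below is a cut-down stand-in for the question's data (only the two relevant columns), and n_scales and multi_scale_ids are names chosen here for illustration.

```r
# Cut-down stand-in for DATA: only the two columns that matter here.
DATA <- data.frame(
  study_id = c("Dreyer", "Dreyer", "Ghapanchi", "Li", "Li", "Li"),
  scale    = c("Other", "Other", "Other", "Other", "Other", "MBTI")
)

# Number of distinct scale values within each study_id.
n_scales <- tapply(DATA$scale, DATA$study_id, function(x) length(unique(x)))

# Names of the groups that have more than one distinct scale value.
multi_scale_ids <- names(n_scales)[n_scales > 1]
multi_scale_ids
# [1] "Li"
```

The result is a plain character vector, which is convenient for a follow-up step such as DATA[DATA$study_id %in% multi_scale_ids, ].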