How to create new variable based on whether values in three columns are equal in R

I have a large dataframe where three variables have the following structure:

library(tibble)

author1_gender <- c("Men", "Men", "Women")
author2_gender <- c("Women", "Men", "Women")
author3_gender <- c("Men", "Men", "Women")

genders <- tibble(author1_gender, author2_gender, author3_gender)

which produces

# A tibble: 3 × 3
  author1_gender author2_gender author3_gender
  <chr>          <chr>          <chr>         
1 Men            Women          Men           
2 Men            Men            Men           
3 Women          Women          Women         

I wish to create a new column based on whether the genders in each row are mixed, i.e. whether the three values in each row are equal or not. Ideally, I wish to add a column that indicates whether the three columns contain only women, only men, or mixed genders, i.e.,

# A tibble: 3 × 4
  author1_gender author2_gender author3_gender gender_mix
  <chr>          <chr>          <chr>          <chr>     
1 Men            Women          Men            mix       
2 Men            Men            Men            men       
3 Women          Women          Women          women  

If I had two values, I could do this with identical(); however, I can't seem to figure out how to do it with three values. Can anyone help with this question, which is probably quite trivial?

Solution:

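You could find the row-wise min and max (with pmin() and pmax(), which compare strings alphabetically) across the columns whose names end with 'gender'. If the min equals the max, every value in the row is the same, so return that value; otherwise return 'mix'. Note that this returns the value as stored ('Men', 'Women') rather than the lowercase labels shown in the question.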
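To see why comparing the row-wise min and max works, here is a small base R sketch on three plain vectors. The names `a`, `b`, `c3`, `row_max`, `row_min` and `all_equal` are only illustrative and do not appear in the original answer.

```r
# Three parallel character vectors, one element per row
a  <- c("Men", "Men", "Women")
b  <- c("Women", "Men", "Women")
c3 <- c("Men", "Men", "Women")

# pmax()/pmin() work element-wise, so each position is compared across a, b and c3
row_max <- pmax(a, b, c3)
row_min <- pmin(a, b, c3)

# If the smallest and largest string in a row are the same,
# every value in that row must be the same
all_equal <- row_max == row_min
print(all_equal)
```

For the sample data, only the first position contains both "Men" and "Women", so only that comparison is FALSE.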

library(dplyr, warn.conflicts = FALSE)
author1_gender <- c("Men", "Men", "Women")
author2_gender <- c("Women", "Men", "Women")
author3_gender <- c("Men", "Men", "Women")
genders <- tibble(author1_gender, author2_gender, author3_gender)

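genders %>% 
  mutate(
    # pmax/pmin return the row-wise (alphabetical) max and min of the gender columns
    gender_mix =  
      lapply(c(pmax, pmin), do.call, across(ends_with('gender'))) %>% 
        # max == min means all values in the row are equal; otherwise the row is mixed
        {if_else(Reduce('==', .), .[[1]], 'mix')}
  )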
#> # A tibble: 3 × 4
#>   author1_gender author2_gender author3_gender gender_mix
#>   <chr>          <chr>          <chr>          <chr>     
#> 1 Men            Women          Men            mix       
#> 2 Men            Men            Men            Men       
#> 3 Women          Women          Women          Women

Created on 2021-12-07 by the reprex package (v2.0.1)
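If the lowercase labels from the question ("men", "women", "mix") are wanted, one possible variation is sketched below in base R only. This is not the answerer's method: it swaps the min/max comparison for a comparison of every column against the first one, and it uses a plain data.frame rather than a tibble. The helper names `cols`, `first` and `all_same` are assumptions made for illustration.

```r
genders <- data.frame(
  author1_gender = c("Men", "Men", "Women"),
  author2_gender = c("Women", "Men", "Women"),
  author3_gender = c("Men", "Men", "Women"),
  stringsAsFactors = FALSE
)

# Select every column whose name ends with "gender"
cols <- grep("gender$", names(genders), value = TRUE)

# Compare each selected column with the first one, row by row
first <- genders[[cols[1]]]
all_same <- Reduce(`&`, lapply(genders[cols], function(x) x == first))

# Rows where all columns match get the lowercased value, the rest get "mix"
genders$gender_mix <- ifelse(all_same, tolower(first), "mix")
print(genders)
```

Because the columns are selected by name pattern, the same code should work for any number of author columns. Rows containing NA would need separate handling, since `==` returns NA for them.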
