How can one create a new variable in the main dataframe using a conditional (if else) statement based on nested data in a list-column?
Take the mtcars dataset, nested by cyl:
library(dplyr)
library(tidyr)  # nest() comes from tidyr
cars_nest <- mtcars %>%
  group_by(cyl) %>%
  nest()
I want to create a binary variable that equals 1 if any value of carb in the nested data is greater than 2, and 0 otherwise. I tried the following, but it does not work:
cars_nest <- cars_nest |>
  mutate(test = ifelse(any(cars_nest$carb) > 2, 1, 0))
>Solution :
A few problems in your code:
- Your any() should include the condition (i.e. any(... > 2)).
- You don't have a cars_nest variable to use in mutate(). That is the name of the dataframe object, not the column (it should be the data column).
- You need rowwise() for this type of operation.
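To see why the second and third points matter, here is a small base R sketch. The list nested and its contents are made up for illustration and only stand in for the data list-column.

```r
# Hypothetical stand-in for a list-column: a plain list of two data frames
nested <- list(
  data.frame(carb = c(1, 2)),
  data.frame(carb = c(4, 1))
)

# `$` on the list itself looks for an element named "carb" and finds none
whole <- nested$carb  # NULL

# Each element has to be inspected on its own, one nested data frame at a time
per_row <- vapply(nested, function(d) any(d$carb > 2), logical(1))
```

Under rowwise(), mutate() evaluates the expression once per row and works with the list-column element itself, so data$carb refers to one nested tibble at a time.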
One suggestion:
- Since the output of a logical comparison (... > 2) is already logical, you can use as.integer() to coerce it to an integer without using ifelse().
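As a quick illustration of that coercion, with made-up carb values rather than the real data:

```r
carb <- c(1, 2, 4)            # made-up values for illustration
flag <- any(carb > 2)         # a single logical value
as_int <- as.integer(flag)    # logical coerced to integer
as_dbl <- ifelse(flag, 1, 0)  # same value, but stored as a double
```

Both give the same number; the only difference is the storage type, integer versus double.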
So the code should be:
library(tidyverse)
cars_nest <- mtcars %>%
  group_by(cyl) %>%
  nest()
cars_nest %>%
  rowwise() %>%
  mutate(test = as.integer(any(data$carb > 2)))
  # or mutate(test = ifelse(any(data$carb > 2), 1, 0))
# A tibble: 3 × 3
# Rowwise:  cyl
    cyl data                test
  <dbl> <list>             <int>
1     6 <tibble [7 × 10]>      1
2     4 <tibble [11 × 10]>     0
3     8 <tibble [14 × 10]>     1
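An alternative that avoids rowwise() is to map over the list-column with purrr. This is a sketch of a different technique from the one above, assuming dplyr, tidyr and purrr are installed; result is just an illustrative name.

```r
library(dplyr)
library(tidyr)
library(purrr)

cars_nest <- mtcars %>%
  group_by(cyl) %>%
  nest() %>%
  ungroup()  # drop the grouping so mutate() sees the whole list-column

# map_int() calls the function once per nested tibble and returns an integer vector
result <- cars_nest %>%
  mutate(test = map_int(data, ~ as.integer(any(.x$carb > 2))))
```

Here the result is an ordinary (non-rowwise) tibble, which may be more convenient if further ungrouped operations follow.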