Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Unnest dataframe that has lists for some values

I am trying to unnest a dataframe where some columns have lists for values. Ideally, the values would be unnested in such a way that each value creates a new row and values get filled with NA if there are fewer values than those in the list.

Example Data:

dat <- data.frame(matrix(ncol = 4, nrow = 2))
colnames(dat)[1:4] <- c("Date","value1","value2","value3")
dat$Date <- c(as.Date('1968-06-13'), as.Date('1968-09-17'))
dat$value1 <- c(list(c(79,78)),list(c(55,56,57)))
dat$value2 <- c(list(c(7.3,7.2)),list(c(6.6,6.7)))
dat$value3 <- c(0.27,0.55)
View(dat)

I have tried:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

library(tidyverse)

dat %>%
  unnest(cols = c(value1,value2,value3))

# Error in `unnest()`: ! In row 2, can't recycle input of size 3 to size 2.

ChatGPT suggested:

library(tidyr)

# First suggestion
dat %>%
  unnest_longer(cols = starts_with("value"), indices_to = "row") %>%
  pivot_wider(names_from = "row", values_from = starts_with("value"))

# Second suggestion
dat %>%
  unnest_wider(cols = starts_with("value"), names_sep = "_") %>%
  mutate(across(starts_with("value"), ~ ifelse(is.na(.), NA, as.numeric(.))))

# both produce same error - Error in unnest_wider(., cols = starts_with("value"), names_sep = "_") : unused argument (cols = starts_with("value"))

Ideal Output:

        Date value1 value2 value3
1 1968-06-13     79    7.3   0.27
2 1968-06-13     78    7.2     NA
3 1968-09-17     55    6.6   0.55
4 1968-09-17     56    6.7     NA
5 1968-09-17     57     NA     NA

>Solution :

maybe not the most concise solution, but works:

library(dplyr)
library(tidyr)

dat |>
  pivot_longer(starts_with('value'), values_transform = as.list) |>
  unnest_longer(value) |>
  group_by(Date, name) |>
  mutate(i = row_number()) |>
  pivot_wider() |>
  select(-i)
+ # A tibble: 5 x 4
# Groups:   Date [2]
  Date       value1 value2 value3
  <date>      <dbl>  <dbl>  <dbl>
1 1968-06-13     79    7.3   0.27
2 1968-06-13     78    7.2  NA   
3 1968-09-17     55    6.6   0.55
4 1968-09-17     56    6.7  NA   
5 1968-09-17     57   NA    NA  
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading