I have a registration database, that looks like that
reg <- tibble(
name = c("John","Bill","Chuck"),
email= c("john@john.com","bill@acme.com","chuck@someplace.org"),
option1=c("meal","meal","drink"),
value1=c("vegetarian","vegetarian","beer"),
option2=c(NA,"drink","room"),
value2=c(NA,"beer","single"),
option3=c(NA,"room",NA),
value3=c(NA,"single",NA)
)
in other words, each of the three options (more in the real data, of course) can end up in different columns, and each registrant may have none, one or many options.
This is not a very sensible format, and I would like to make it more practical, like this:
name email meal drink room
<chr> <chr> <chr> <chr> <chr>
1 John john@john.com vegetarian NA NA
2 Bill bill@acme.com vegetarian beer single
3 Chuck chuck@someplace.org NA beer single
What would be the relevant pivot_wider approach ?
>Solution :
I think we need to pivot longer, filter, then pivot wider:
library(dplyr)
library(tidyr)
pivot_longer(reg, -c(name, email), names_pattern = "(.*)[0-9]+$",
names_to = ".value") %>%
filter(!is.na(option)) %>%
pivot_wider(names_from = "option", values_from = "value")
# # A tibble: 3 × 5
# name email meal drink room
# <chr> <chr> <chr> <chr> <chr>
# 1 John john@john.com vegetarian NA NA
# 2 Bill bill@acme.com vegetarian beer single
# 3 Chuck chuck@someplace.org NA beer single