I am trying to build a tally data frame. My starting data frame looks like this:
sample bike
1: 1 gazelle
2: 1 batavus
3: 2 cortina
4: 2 Cube
5: 3 Giant
And what I need is as follows:
sample gazelle batavus cortina Cube Giant
1: 1 1 1 0 0 0
2: 2 0 0 1 1 0
3: 3 0 0 0 0 1
So each cell should be 1 if the bike is present in that sample and 0 if it is not.
I thought:
df %>% group_by(sample, bike) %>%
summarize(count = n(), .group = "drop" %>%
pivot_wider(names_from = "bike", values_from = "count", values_fill = 0)
but that did not do the trick.
>Solution :
library(dplyr)
library(tidyr)
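# Spread the bike names into columns. values_fn = length counts the rows for
# each sample/bike pair; values_fill = 0L puts an integer 0 where a pair is absent.
pivot_wider(
  df,
  names_from = bike, values_from = bike, values_fn = length, values_fill = 0L
)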
# # A tibble: 3 × 6
# sample gazelle batavus cortina cube giant
# <int> <int> <int> <int> <int> <int>
# 1 1 1 1 0 0 0
# 2 2 0 0 1 1 0
# 3 3 0 0 0 0 1
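As a side note on the attempt in the question: it looks like it fails for two reasons. The `summarize(` call is never closed, and the argument is spelled `.group` rather than `.groups`, so dplyr would treat it as a new summary column. A minimal sketch of a repaired version, assuming the same `df` as in the Data section (repeated here so the block runs on its own), might look like this. The column order can differ from the `pivot_wider()`-only answer because `summarize()` returns its rows ordered by the grouping keys.

```r
library(dplyr)
library(tidyr)

# Same example data as in the Data section, repeated so this runs standalone
df <- data.frame(
  sample = c(1L, 1L, 2L, 2L, 3L),
  bike = c("gazelle", "batavus", "cortina", "cube", "giant")
)

tally <- df %>%
  group_by(sample, bike) %>%
  # .groups (with an "s") drops the grouping; note the closing parenthesis
  summarize(count = n(), .groups = "drop") %>%
  # Absent sample/bike pairs become integer 0
  pivot_wider(names_from = bike, values_from = count, values_fill = 0L)

print(tally)
```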
Data
df = data.frame(
sample = c(1L,1L,2L,2L,3L),
bike = c("gazelle", "batavus", "cortina", "cube", "giant")
)
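One caveat, in case it matters for your real data: `values_fn = length` counts rows, so a bike listed twice in the same sample would give 2 rather than 1. If you want strict presence/absence, one possible sketch is to drop duplicates first. The data frame `df_dup` and the helper column `present` below are made up for illustration and are not part of the original question.

```r
library(dplyr)
library(tidyr)

# Hypothetical data where "gazelle" appears twice in sample 1
df_dup <- data.frame(
  sample = c(1L, 1L, 1L, 2L),
  bike = c("gazelle", "gazelle", "batavus", "cortina")
)

presence <- df_dup %>%
  distinct(sample, bike) %>%   # keep one row per sample/bike pair
  mutate(present = 1L) %>%     # helper flag column (illustrative name)
  pivot_wider(names_from = bike, values_from = present, values_fill = 0L)

print(presence)
```

With this approach every cell is either 0 or 1, regardless of how often a bike is repeated within a sample.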