Let’s say I have a dataframe like this. Each sample represents an area, and any of ten habitat classes (1:10) can be present in a sample. The dataframe only has rows for the classes that are present in each sample, not one row per class.
data <- data.frame(
  sample.label = c(1, 1, 1, 2, 3, 3, 4, 4, 4, 4),
  hclass = c(1, 2, 7, 6, 5, 7, 1, 4, 7, 10),
  cover = c(0.2, 0.6, 0.2, 1, 0.7, 0.3, 0.2, 0.4, 0.1, 0.3))

   sample.label hclass cover
1             1      1   0.2
2             1      2   0.6
3             1      7   0.2
4             2      6   1.0
5             3      5   0.7
6             3      7   0.3
7             4      1   0.2
8             4      4   0.4
9             4      7   0.1
10            4     10   0.3
I just need to reshape the dataframe so it looks like this, with a column for each habitat class and added 0s where the class isn’t present in the sample:
  sample.label class1 class2 class3 class4 class5 class6 class7 class8 class9 class10
1            1    0.2    0.6      0    0.0    0.0      0    0.2      0      0     0.0
2            2    0.0    0.0      0    0.0    0.0      1    0.0      0      0     0.0
3            3    0.0    0.0      0    0.0    0.7      0    0.3      0      0     0.0
4            4    0.2    0.0      0    0.4    0.0      0    0.1      0      0     0.3
>Solution :
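You could use complete() + pivot_wider() from {tidyr}. complete() adds a row with cover = 0 for every sample.label/hclass combination that is missing (hclass = 1:10 makes sure all ten classes are considered, including those that never occur in the data), and pivot_wider() then spreads hclass into one column per class.
library(tidyr)

data %>%
  # add the missing sample/class combinations, filling cover with 0
  complete(sample.label, hclass = 1:10, fill = list(cover = 0)) %>%
  # one column per habitat class, named class1 ... class10
  pivot_wider(names_from = hclass, names_prefix = "class", values_from = cover)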
# # A tibble: 4 × 11
#   sample.label class1 class2 class3 class4 class5 class6 class7 class8 class9 class10
#          <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>   <dbl>
# 1            1    0.2    0.6      0    0      0        0    0.2      0      0     0
# 2            2    0      0        0    0      0        1    0        0      0     0
# 3            3    0      0        0    0      0.7      0    0.3      0      0     0
# 4            4    0.2    0        0    0.4    0        0    0.1      0      0     0.3
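
If you would rather avoid the extra dependency, the same reshape can probably be done in base R. This is only a sketch of an alternative, not part of the tidyr approach above: it assumes each sample.label/hclass pair occurs at most once (xtabs() sums cover, so duplicates would be added together), and the names data2 and wide are made up for illustration.

```r
data <- data.frame(
  sample.label = c(1, 1, 1, 2, 3, 3, 4, 4, 4, 4),
  hclass = c(1, 2, 7, 6, 5, 7, 1, 4, 7, 10),
  cover = c(0.2, 0.6, 0.2, 1, 0.7, 0.3, 0.2, 0.4, 0.1, 0.3))

# Work on a copy (hypothetical name) so the original data is left untouched.
data2 <- data
# Fix the factor levels at 1:10 so classes absent from the data still get a column.
data2$hclass <- factor(data2$hclass, levels = 1:10)

# Cross-tabulate: rows are samples, columns are classes, cells are summed cover.
# Combinations that do not occur come out as 0.
wide <- as.data.frame.matrix(xtabs(cover ~ sample.label + hclass, data = data2))

# Rename the columns to class1 ... class10 and turn the row names into a column.
names(wide) <- paste0("class", names(wide))
wide <- cbind(sample.label = as.numeric(rownames(wide)), wide)
rownames(wide) <- NULL

wide
```

The result is a plain data.frame rather than a tibble, with the same eleven columns as the tidyr output.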