I have the following data:
df <- structure(list(automatic = c("organismo", "bolha", "organismo",
"organismo", "cosc_multiplo", "cosc_multiplo", "coscinodiscus",
"detritos", "mult_organismos", "multiplos", "organismo", "sombra",
"detritos", "mult_organismos", "detritos", "mult_organismos",
"detritos", "org_partes", "detritos", "organismo", "organismo",
"detritos", "organismo", "organismo", "organismo", "bolha", "coral_falso",
"coscinodiscus", "detritos", "LRaw", "multiplos", "organismo",
"sombra"), validated = c("appendicularia", "bolha", "cnidaria",
"copepodo", "cosc_multiplo", "coscinodiscus", "coscinodiscus",
"coscinodiscus", "coscinodiscus", "coscinodiscus", "coscinodiscus",
"coscinodiscus", "detritos", "detritos", "langanho", "mult_organismos",
"multiplos", "org_partes", "organismo", "organismo", "palmeria",
"pelotas_mix", "phyto", "phyto_cadeia", "phyto_espiral", "sombra",
"sombra", "sombra", "sombra", "sombra", "sombra", "sombra", "sombra"
), N = c(2L, 1L, 2L, 1L, 2L, 1L, 1229L, 3L, 2L, 4L, 5L, 57L,
1569L, 1L, 87L, 31L, 1L, 7L, 1L, 75L, 2L, 11L, 4L, 1L, 1L, 1L,
10L, 25L, 536L, 25L, 30L, 562L, 3678L)), row.names = c(NA, -33L
), class = c("tbl_df", "tbl", "data.frame"))
I would like to show all combinations of the values in the automatic and validated columns.
For example, the combination bolha (in the automatic column) with appendicularia (in the validated column) does not occur in my data. I would like this combination, and all the other missing ones, to appear with a value of 0 in column N.
Combinations that already exist must keep their value in column N. For example, bolha (in the automatic column) with bolha (in the validated column) has N = 1, and that must not change.
Thanks all
>Solution :
If you want to get all unique combinations and maintain the original values for N, then you can first use crossing from tidyr to get all unique combinations. Then, we can do a left join to add in the N values from the original dataframe, and finally change NA to 0 for N.
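library(tidyverse)
# Build every automatic x validated pair, attach the known counts,
# then turn the NA left by unmatched pairs into 0
left_join(crossing(automatic = df$automatic, validated = df$validated),
          df,
          by = c("automatic", "validated")) %>%
  replace_na(list(N = 0))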
A shorter option is to start every combination at 0 and use rows_update instead of a join:
# N = 0L (integer) keeps N the same type as in df
crossing(automatic = df$automatic, validated = df$validated, N = 0L) %>%
  rows_update(df, by = c("automatic", "validated"))
Output
# A tibble: 198 × 3
   automatic validated           N
   <chr>     <chr>           <int>
 1 bolha     appendicularia      0
 2 bolha     bolha               1
 3 bolha     cnidaria            0
 4 bolha     copepodo            0
 5 bolha     cosc_multiplo       0
 6 bolha     coscinodiscus       0
 7 bolha     detritos            0
 8 bolha     langanho            0
 9 bolha     mult_organismos     0
10 bolha     multiplos           0
# … with 188 more rows
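Another possibility, not part of the answer above, is tidyr's complete(), which expands the data to all combinations and fills the missing values in one call. Here is a minimal sketch on a reduced sample of the data. The name small and the choice of three rows are only for illustration.

```r
library(tidyr)
library(tibble)

# Reduced sample: three rows taken from the question's data
small <- tibble(
  automatic = c("organismo", "bolha", "organismo"),
  validated = c("appendicularia", "bolha", "cnidaria"),
  N = c(2L, 1L, 2L)
)

# complete() adds every automatic x validated pair that is absent
# and fills its N with 0L; pairs already present keep their N
result <- complete(small, automatic, validated, fill = list(N = 0L))
result
```

The sample has 2 distinct automatic values and 3 distinct validated values, so result has 6 rows. On the full data the same call would be complete(df, automatic, validated, fill = list(N = 0L)), which should give the same 198 combinations (11 × 18) as the approaches above.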