Show all combinations of columns

I have the following data:

df <- structure(list(automatic = c("organismo", "bolha", "organismo", 
"organismo", "cosc_multiplo", "cosc_multiplo", "coscinodiscus", 
"detritos", "mult_organismos", "multiplos", "organismo", "sombra", 
"detritos", "mult_organismos", "detritos", "mult_organismos", 
"detritos", "org_partes", "detritos", "organismo", "organismo", 
"detritos", "organismo", "organismo", "organismo", "bolha", "coral_falso", 
"coscinodiscus", "detritos", "LRaw", "multiplos", "organismo", 
"sombra"), validated = c("appendicularia", "bolha", "cnidaria", 
"copepodo", "cosc_multiplo", "coscinodiscus", "coscinodiscus", 
"coscinodiscus", "coscinodiscus", "coscinodiscus", "coscinodiscus", 
"coscinodiscus", "detritos", "detritos", "langanho", "mult_organismos", 
"multiplos", "org_partes", "organismo", "organismo", "palmeria", 
"pelotas_mix", "phyto", "phyto_cadeia", "phyto_espiral", "sombra", 
"sombra", "sombra", "sombra", "sombra", "sombra", "sombra", "sombra"
), N = c(2L, 1L, 2L, 1L, 2L, 1L, 1229L, 3L, 2L, 4L, 5L, 57L, 
1569L, 1L, 87L, 31L, 1L, 7L, 1L, 75L, 2L, 11L, 4L, 1L, 1L, 1L, 
10L, 25L, 536L, 25L, 30L, 562L, 3678L)), row.names = c(NA, -33L
), class = c("tbl_df", "tbl", "data.frame"))

I would like to show all combinations of the values in the automatic and validated columns.
For example, my data has no row combining bolha (in the automatic column) with appendicularia (in the validated column). I would like to show this combination, and all the other missing ones, with a value of 0 in column N.

Combinations that already exist must keep their value in column N. For example, bolha (in the automatic column) with bolha (in the validated column) has an N of 1, and that should not change.

Thanks all

>Solution :

If you want all combinations while keeping the original values of N, you can first use crossing from tidyr, which returns every combination of the unique values in the two columns. Then left-join the original data frame to bring in the existing N values, and finally replace the resulting NA values in N with 0.

library(tidyverse)

left_join(crossing(automatic = df$automatic, validated = df$validated), 
          df,
          by = c("automatic", "validated")) %>% 
  replace_na(list(N = 0))

Or a shorter option is to simply use rows_update instead of doing a join:

# N = 0L (integer) keeps N the same type as in df, matching the <int> column in the output
crossing(automatic = df$automatic, validated = df$validated, N = 0L) %>% 
  rows_update(df, by = c("automatic", "validated"))

Output

# A tibble: 198 × 3
   automatic validated           N
   <chr>     <chr>           <int>
 1 bolha     appendicularia      0
 2 bolha     bolha               1
 3 bolha     cnidaria            0
 4 bolha     copepodo            0
 5 bolha     cosc_multiplo       0
 6 bolha     coscinodiscus       0
 7 bolha     detritos            0
 8 bolha     langanho            0
 9 bolha     mult_organismos     0
10 bolha     multiplos           0
# … with 188 more rows
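
As a possible alternative that is not part of the answer above, tidyr's complete() does the expand and fill steps in one call. The sketch below uses a small made-up data frame (toy) rather than the df from the question, so its names and values are assumptions for illustration only.

```r
library(dplyr)
library(tidyr)

# Toy data (hypothetical, much smaller than the df in the question)
toy <- tibble(
  automatic = c("bolha", "organismo", "sombra"),
  validated = c("bolha", "copepodo", "sombra"),
  N         = c(1L, 1L, 3678L)
)

# complete() expands to every automatic x validated pair built from the
# values present in each column; fill = list(N = 0L) sets N to integer 0
# in the rows that had to be created, and existing rows keep their N
result <- toy %>%
  complete(automatic, validated, fill = list(N = 0L))

result
```

With the real data, the same call would be df %>% complete(automatic, validated, fill = list(N = 0L)), which should give the same 198 combinations as the approaches above.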
