I want to count the occurrences of instances that can have different orders in the given row.
df:
Col1
A;B;C
C;B;A
A;D
A;C;B
D;A
D;C
df_result:
Col freq
A;B;C 3
A;D 2
D;C 1
>Solution :
Split the column at ;, loop over the list, sort, and paste the elements, get the frequency count with table and convert to data.frame in base R
as.data.frame(table(sapply(strsplit(df$Col1, ";"), \(x)
paste(sort(x), collapse = ";"))))
-output
Var1 Freq
1 A;B;C 3
2 A;D 2
3 C;D 1
Or use separate_rows with count in tidyverse
library(dplyr)
library(tidyr)
library(stringr)
df %>%
mutate(rn = row_number()) %>%
separate_rows(Col1) %>%
group_by(rn) %>%
summarise(Col1 = str_c(sort(Col1), collapse = ";")) %>%
count(Col1, name = 'Freq')
# A tibble: 3 × 2
Col1 Freq
<chr> <int>
1 A;B;C 3
2 A;D 2
3 C;D 1
data
df <- structure(list(Col1 = c("A;B;C", "C;B;A", "A;D", "A;C;B", "D;A",
"D;C")), class = "data.frame", row.names = c(NA, -6L))