I have a data frame similar to the one below, and I need to count how many times each row pattern repeats in it.
start_id | end_id | type | id
1 | 2 | a | 1
2 | 5 | a | 2
1 | 3 | b | 3
2 | 5 | a | 4
1 | 3 | b | 5
The result I want is this:
start_id | end_id | type | n
1 | 2 | a | 1
2 | 5 | a | 2
1 | 3 | b | 2
I tried the following code, but it does not merge the records. It returns the same rows unchanged and only adds a new column with the counter, which does not work for my analysis:
Sumary <- clear_filt_trip %>%
  group_by(start_id, end_id, type) %>%
  add_count(across(everything()))
I tried using summarize but it’s just repeating the columns.
What can I do about it?
Solution:
dplyr
library(dplyr)
dat %>%
  group_by(start_id, end_id, type) %>%
  tally() %>%
  ungroup()
# # A tibble: 3 x 4
# start_id end_id type n
# <dbl> <dbl> <chr> <int>
# 1 1 2 a 1
# 2 1 3 b 2
# 3 2 5 a 2
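As a side note, add_count() in the question behaves like mutate(): it keeps every row and only appends the count, which is why nothing gets merged. A shorter way to write the pipeline above is count(), which groups, tallies and ungroups in one step. A minimal sketch, assuming the same sample data as in the Data section (the name res is only for illustration):

```r
library(dplyr)

# Same sample data as in the Data section of this answer
dat <- data.frame(
  start_id = c(1, 2, 1, 2, 1),
  end_id   = c(2, 5, 3, 5, 3),
  type     = c("a", "a", "b", "a", "b"),
  id       = 1:5
)

# count() collapses each distinct (start_id, end_id, type) combination
# into one row and stores the group size in a column named n
res <- dat %>%
  count(start_id, end_id, type)

res
```

The id column is dropped here because it is not one of the grouping columns, which matches the desired result in the question.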
base R
# FUN = length counts the rows per group; the count is returned in the `id` column
aggregate(. ~ start_id + end_id + type, data = dat, FUN = length)
# start_id end_id type id
# 1 1 2 a 1
# 2 2 5 a 2
# 3 1 3 b 2
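If the count column should be called n as in the desired result, it can be renamed after aggregating. A minimal sketch, assuming the same sample data (res is a hypothetical name, and id is written out on the left-hand side of the formula instead of the dot):

```r
# Same sample data as in the Data section of this answer
dat <- data.frame(
  start_id = c(1, 2, 1, 2, 1),
  end_id   = c(2, 5, 3, 5, 3),
  type     = c("a", "a", "b", "a", "b"),
  id       = 1:5
)

# Count the rows in each (start_id, end_id, type) group;
# the count comes back in the id column
res <- aggregate(id ~ start_id + end_id + type, data = dat, FUN = length)

# Rename the count column from id to n
names(res)[names(res) == "id"] <- "n"

res
```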
Data
dat <- structure(list(
  start_id = c(1, 2, 1, 2, 1),
  end_id = c(2, 5, 3, 5, 3),
  type = c("a", "a", "b", "a", "b"),
  id = 1:5
), row.names = c(NA, -5L), class = "data.frame")