I have a data frame containing different activities/events, the day they occurred, and the duration of each activity. I now want to create a new data frame containing the number of occurrences of event B only, grouped by day (so two columns: day and number of occurrences). Daily_Duration, the total duration of all events by day, is an example of the format I want for the other data frame as well.
library(dplyr)
df <- data.frame(
  Event = c("A", "B", "C", "B", "C", "B", "B"),
  Day = c("Day 1", "Day 1", "Day 1", "Day 2", "Day 2", "Day 2", "Day 2"),
  Duration = c(1, 2, 4, 5, 1, 3, 5))
Daily_Duration <- aggregate(df$Duration, list(df$Day), FUN = sum)
I tried
Event_B_by_day<- df[df$Event == 'B', ]%>%
group_by(df$Day) %>%
summarise(Freq = length(df$Event))
which gives me the following error:
Error: Problem adding computed columns in `group_by()`.
x Problem with `mutate()` input `..1`.
i `..1 = df$Day`.
i `..1` must be size 4 or 1, not 7.
and
Event_B_by_day<- aggregate(df[df$Event=="B"], list(df$Day), FUN=length )
which returns a data frame that is not filtered by event B but instead counts the occurrences of all three events by day, so it is identical to:
Event_B_by_day<- aggregate(df$Event, list(df$Day), FUN=length )
So where is the mistake and how do I actually get the data frame I want?
>Solution :
base R
Use a formula.
aggregate(Duration ~ Day, data = df[df$Event == "B",], FUN = sum)
dplyr
df %>%
filter(Event == "B") %>%
group_by(Day) %>%
summarise(Duration = sum(Duration))
output
    Day Duration
1 Day 1        2
2 Day 2       13
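Note that the code above sums Duration, whereas the question asks for the number of occurrences of event B per day. The sketch below adapts the same two approaches to count rows instead. It assumes the df from the question, and the names event_b_dplyr and event_b_base are only illustrative. It also shows the likely cause of the original error: group_by(df$Day) refers to the full 7-row column, while the filtered data has only 4 rows, so using the bare column name Day inside the pipeline avoids the size mismatch.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  Event = c("A", "B", "C", "B", "C", "B", "B"),
  Day = c("Day 1", "Day 1", "Day 1", "Day 2", "Day 2", "Day 2", "Day 2"),
  Duration = c(1, 2, 4, 5, 1, 3, 5)
)

# dplyr: filter first, then refer to columns by bare name (Day, not df$Day)
# so that grouping uses the filtered rows; n() counts rows per group
event_b_dplyr <- df %>%
  filter(Event == "B") %>%
  group_by(Day) %>%
  summarise(Freq = n())

# base R: same formula interface as above, but with FUN = length;
# the result column is still called Duration, so rename it to Freq
event_b_base <- aggregate(Duration ~ Day, data = df[df$Event == "B", ], FUN = length)
names(event_b_base)[2] <- "Freq"

print(event_b_dplyr)
print(event_b_base)
```

With the sample data, both results should have one row per day, with Freq equal to 1 for Day 1 and 3 for Day 2.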