Adding a column based on counts of another column value in R

March 20, 2024

I have a data frame as follows:

comment_id <- c(1, 2, 2, 3, 4, 5, 6, 7, 8, 9, 10)
cat <- c("acc_sp", "acc_lex", "org_gen", "acc_gen", "ran_lex", "arg_rel", "len", "org_lay", "org_spe", "org_gen", "coh_link")
df <- data.frame(comment_id, cat)

You’ll notice that there are two items with comment_id = 2.

I need to create a new column that uniquely numbers each occurrence of a comment_id. The first five lines of the new column would be as follows:

comment_cat_id
1_1
2_1
2_2
3_1
4_1

I’m thinking I can use:

df$comment_cat_id <- paste(comment_id, ?????, sep = "_")

to handle the creation of the new column. But I don’t know how to generate the unique count of each occurrence of each comment_id to place into the ????? slot in the above code.

Can anyone help?

Solution:

Here is one way to do it: count how often each comment_id occurs, then loop over the unique ids and number their rows.

df$newcol <- NA  # new column, filled in below
freq <- as.data.frame(table(df$comment_id))  # number of occurrences of each comment_id

# For each id, fill the new column with id_1, id_2, ...
for (i in unique(df$comment_id)) {
  df[df$comment_id == i, "newcol"] <- paste(i, seq_len(freq[freq$Var1 == i, "Freq"]), sep = "_")
}

> df
   comment_id      cat newcol
1           1   acc_sp    1_1
2           2  acc_lex    2_1
3           2  org_gen    2_2
4           3  acc_gen    3_1
5           4  ran_lex    4_1
6           5  arg_rel    5_1
7           6      len    6_1
8           7  org_lay    7_1
9           8  org_spe    8_1
10          9  org_gen    9_1
11         10 coh_link   10_1
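
If you would rather avoid the loop, a vectorized version is possible with base R's ave(), which applies a function within each group and returns the results in the original row order. The following is only a sketch under the assumption that the running count should follow row order within each comment_id; the helper name occurrence is mine, not from the question, and the result goes into comment_cat_id as the question asked rather than newcol.

```r
# Rebuild the example data from the question
comment_id <- c(1, 2, 2, 3, 4, 5, 6, 7, 8, 9, 10)
cat <- c("acc_sp", "acc_lex", "org_gen", "acc_gen", "ran_lex", "arg_rel",
         "len", "org_lay", "org_spe", "org_gen", "coh_link")
df <- data.frame(comment_id, cat)

# Running count within each comment_id: seq_along() is applied per group
# ("occurrence" is an illustrative name, not part of the original post)
occurrence <- ave(df$comment_id, df$comment_id, FUN = seq_along)

# Combine the id and its running count, e.g. "2_1", "2_2"
df$comment_cat_id <- paste(df$comment_id, occurrence, sep = "_")
```

This fills the ????? slot from the question with the occurrence vector. It should also work when rows with the same comment_id are not adjacent, since ave() groups by value rather than by position.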