I have a dataset that consists of unique identifiers for a group of raters and ratees, plus a rating. I would like to get the interrater reliability for each item, but I am running into a problem with how the data is structured. Because each ratee was rated 4-5 times, I can group the data by ratee ID. Unfortunately, because every rater ID is unique, I can't pivot the dataset into the layout the irr package expects.
My data looks something like this:
| Rater | Ratee | Rating |
|---|---|---|
| 11111 | 12345 | 1 |
| 12112 | 12345 | 1 |
| 12232 | 12345 | 0 |
| 12457 | 12345 | 0 |
| 16794 | 12345 | 1 |
| 55555 | 16454 | 0 |
| 66666 | 16454 | 1 |
| 77777 | 16454 | 1 |
| 88888 | 16454 | 0 |
| 99999 | 16454 | 1 |
I would like some way to go through each group and rename the rater's unique identifier to something I can use to pivot the data into the right format. For example, within each group of ratee IDs, assign the rater a new value: r1 for the first row, r2 for the second, and so on, starting over at r1 when a new group begins. The end result would hopefully look something like this:
| Rater | Ratee | Rating |
|---|---|---|
| r1 | 12345 | 1 |
| r2 | 12345 | 1 |
| r3 | 12345 | 0 |
| r4 | 12345 | 0 |
| r5 | 12345 | 1 |
| r1 | 16454 | 0 |
| r2 | 16454 | 1 |
| r3 | 16454 | 1 |
| r4 | 16454 | 0 |
| r5 | 16454 | 1 |
Can anyone help me do this? I am at a loss and have exhausted my R repertoire.
>Solution :
I think you want this:
```r
library(dplyr)

your_data %>%
  group_by(Ratee) %>%                                          # restart numbering for each ratee
  mutate(new_rater_column = paste0("r", row_number())) %>%    # r1, r2, ... within the group
  ungroup()
```
I used a new column name instead of overwriting the old Rater column just in case the information there is useful.
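If the goal is the wide layout mentioned in the question (one row per ratee, one column per rater slot), the relabelled column can be used as the pivot key. The following is only a sketch: it assumes the tidyr package is available, rebuilds the sample data from the question as `your_data`, and uses the names `wide` and `ratings_matrix` purely for illustration.

```r
library(dplyr)
library(tidyr)

# Sample data copied from the question
your_data <- data.frame(
  Rater  = c(11111, 12112, 12232, 12457, 16794,
             55555, 66666, 77777, 88888, 99999),
  Ratee  = rep(c(12345, 16454), each = 5),
  Rating = c(1, 1, 0, 0, 1, 0, 1, 1, 0, 1)
)

wide <- your_data %>%
  group_by(Ratee) %>%
  mutate(new_rater_column = paste0("r", row_number())) %>%
  ungroup() %>%
  select(-Rater) %>%   # drop the original ID so each ratee collapses to one row
  pivot_wider(names_from = new_rater_column, values_from = Rating)

# Subjects in rows, rater slots in columns
ratings_matrix <- as.matrix(select(wide, -Ratee))
```

Ratees with only four ratings will end up with `NA` in the `r5` column, so it is worth checking how the irr function you plan to use treats missing values. Also note that `r1` for one ratee is not the same person as `r1` for another, which matters for statistics that assume a fixed set of raters.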