I have a column of IDs in a data frame that sometimes contains duplicates. For example:
| ID |
|---|
| 209 |
| 315 |
| 109 |
| 315 |
| 451 |
| 209 |
I want to add a second column that indicates which ID each row belongs to, numbering the IDs in order of first appearance. The result should look like this:
| ID | ID Category |
|---|---|
| 209 | 1 |
| 315 | 2 |
| 109 | 3 |
| 315 | 2 |
| 451 | 4 |
| 209 | 1 |
Essentially, I want to loop through the IDs: if an ID equals one seen earlier, it gets the same indicator as that earlier row, and if it is a new ID, it gets a new indicator.
Does anyone know whether there is a quick function in R for this? Or have any other suggestions?
>Solution :
Convert the IDs to a factor whose levels are given by `unique()` (which keeps the order of first appearance in the data set), then convert the factor to numeric to get its integer codes:
data$IDCategory <- as.numeric(factor(data$ID, levels = unique(data$ID)))
#> data
#    ID IDCategory
# 1 209          1
# 2 315          2
# 3 109          3
# 4 315          2
# 5 451          4
# 6 209          1
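For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the example data from the question (the construction of `data` is an assumption, since the original data frame is not shown) and compares the factor approach with an alternative using base R's `match()`, which looks up each ID's position among the unique IDs. The column name `IDCategory2` is only for illustration.

```r
# Example data from the question; how the real `data` was built is an assumption.
data <- data.frame(ID = c(209, 315, 109, 315, 451, 209))

# Factor approach: levels follow order of first appearance,
# and as.integer() extracts the underlying level codes.
data$IDCategory <- as.integer(factor(data$ID, levels = unique(data$ID)))

# Alternative: match() returns the position of each ID within unique(data$ID).
# `IDCategory2` is a hypothetical column name used only for comparison.
data$IDCategory2 <- match(data$ID, unique(data$ID))

# Both approaches should yield the same integer codes.
same <- identical(data$IDCategory, data$IDCategory2)
print(data)
print(same)
```

The `match()` version skips the intermediate factor, which may be a little simpler if the factor itself is never needed.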