Encode unique observations using identifier


I have a data frame in which one column consists of strings, each of which is a unique identifier for a journey. A reproducible data frame:

df <- data.frame(tours = c("ansc123123", "ansc123123", "ansc123123", "baa3999", "baa3999", "baa3999"),
                 order = rep(c(1, 2, 3), 2))

My real data is much larger, with many more observations and unique identifiers. I would like output in the format below, but generated automatically rather than encoded by hand, so that rows with the same tours value get the same journey number.

df$journey <- c(1, 1, 1, 2, 2, 2)

>Solution :

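You can convert the column to a factor and take its integer codes. Note that factor sorts its levels, so the codes follow the sorted order of the identifiers.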

df$journey <- as.integer(factor(df$tours))

df$journey
#[1] 1 1 1 2 2 2

Or use match and unique, which numbers the identifiers in order of first appearance.

df$journey <- match(df$tours, unique(df$tours))
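
The two approaches give the same result on the example data only because the identifiers already appear in alphabetical order. The following is a small sketch of where they differ; the `tours` vector here is made-up illustration data, not taken from the question.

```r
# Hypothetical data: identifiers appear in non-alphabetical order
tours <- c("baa3999", "ansc123123", "baa3999", "ansc123123")

# factor() sorts its levels, so "ansc123123" is coded as 1
by_sorted <- as.integer(factor(tours))

# match() against unique() numbers by first appearance, so "baa3999" is coded as 1
by_appearance <- match(tours, unique(tours))

# Passing explicit levels to factor() reproduces the first-appearance numbering
by_levels <- as.integer(factor(tours, levels = unique(tours)))
```

If the numbering should follow the order in which journeys occur in the data, the match version or the explicit levels version is probably the safer choice.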
