Reshape strings within a dataframe in R

April 19, 2023

I have a dataframe that looks like this

data.frame(service=c("1,2,3,4","2,4,5","1,3,4"),
                      score=c("1,2,1,3","1,1,3","1,1,1"))
        a       b
1 1,2,3,4 1,2,1,3
2   2,4,5   1,1,3
3   1,3,4   1,1,1

The variable service refers to a categorical variable from 1-5, while the score, is also a categorical variable from 1-3 for each of the previous service. Note that each respondent have taken different services, so the length is not the same for each variable is not the same

I need to reshape this dataframe to associate the score to each service. The final result would look like this

data_frame(ind=c(1,1,1,1,2,2,2,3,3,3),
                 serv=c(1,2,3,4,2,4,5,1,3,4),
                 score=c(1,2,1,3,1,1,3,1,1,1))

    ind  serv score
   <dbl> <dbl> <dbl>
 1     1     1     1
 2     1     2     2
 3     1     3     1
 4     1     4     3
 5     2     2     1
 6     2     4     1
 7     2     5     3
 8     3     1     1
 9     3     3     1
10     3     4     1

I first split the variable service to create all the categories in this way

library("qdapTools","tidyverse")

lst1 <- lapply(strsplit(df$serv, ","), function(x) 
  replace(x, (! x %in% c("1", "2", "3","4","5")) & !is.na(x), "other"))

lst1_tab<-mtabulate(lst1)%>% setNames(paste0('serv_', names(.)))
df<-cbind(df,lst1_tab)

     serv   score serv_1 serv_2 serv_3 serv_4 serv_5
1 1,2,3,4 1,2,1,3      1      1      1      1      0
2   2,4,5   1,1,3      0      1      0      1      1
3   1,3,4   1,1,1      1      0      1      1      0

I did that to later reshape the df in long form. However, not all individuals took all services. For instance, individual 1 did not take service 5. Therefore, I didn’t know how to split also the variable score to associate it to each service

>Solution :

We could use

library(dplyr)
library(tidyr)
df1 %>% 
 mutate(ind = row_number(), .before = 1) %>% 
 separate_longer_delim(c(service, score), delim = ",")

-output

   ind service score
1    1       1     1
2    1       2     2
3    1       3     1
4    1       4     3
5    2       2     1
6    2       4     1
7    2       5     3
8    3       1     1
9    3       3     1
10   3       4     1