Shorten number of elements in comma-separated strings by vector

I have data with columns such as Area_bsl that contain strings of comma-separated values and a column diffr that states the number of elements by which Area_bsl must be shortened:

df <- data.frame(
  id = 1:3,
  Area_bsl = c("155,199,198,195,100,112,177,199,188,144",
               "100,99,98,95,100,112,111,99",                        
               "131,166,155,111,100,117,166,188,101,101,105,166"),
  diffr = c(3,0,6)
)

So what I need to do is cut off …

  • the last 3 elements of Area_bsl where id == 1
  • 0 elements of Area_bsl where id == 2
  • the last 6 elements of Area_bsl where id == 3
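
As a quick sanity check of these targets, the number of elements that should remain in each row can be computed in base R. This is only a sketch; it rebuilds df from above so that it runs on its own, and the names n_elements and n_keep are made up for illustration.

```r
df <- data.frame(
  id = 1:3,
  Area_bsl = c("155,199,198,195,100,112,177,199,188,144",
               "100,99,98,95,100,112,111,99",
               "131,166,155,111,100,117,166,188,101,101,105,166"),
  diffr = c(3, 0, 6),
  stringsAsFactors = FALSE  # only matters on R < 4.0
)

# number of comma-separated elements in each string
n_elements <- lengths(strsplit(df$Area_bsl, ",", fixed = TRUE))

# number of elements left after dropping the last `diffr` of them
n_keep <- n_elements - df$diffr
n_keep
# [1] 7 8 6
```

These counts match the cutoff column in the desired result further down.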

I’ve been approaching this task like this; the last part using slice_head throws an error:

library(tidyverse)
df %>%
  # separate comma-separated values into rows:
  separate_rows(Area_bsl) %>%
  # for each `id`...:
  group_by(id) %>%
  #... create a row counter:
  mutate(rowid = row_number()) %>%
  # ...create the cutoff point:
  mutate(cutoff = last(rowid) - diffr) %>%
  # ...slice out as many as `cutoff` rows: <--- does not work! 
  slice_head(n = cutoff[1])
Error in `slice_head()`:
! `n` must be a constant.
Caused by error in `force()`:
! object 'cutoff' not found

The desired result is this:

      id Area_bsl diffr rowid cutoff
   <int> <chr>    <dbl> <int>  <dbl>
 1     1 155          3     1      7
 2     1 199          3     2      7
 3     1 198          3     3      7
 4     1 195          3     4      7
 5     1 100          3     5      7
 6     1 112          3     6      7
 7     1 177          3     7      7
11     2 100          0     1      8
12     2 99           0     2      8
13     2 98           0     3      8
14     2 95           0     4      8
15     2 100          0     5      8
16     2 112          0     6      8
17     2 111          0     7      8
18     2 99           0     8      8
19     3 131          6     1      6
20     3 166          6     2      6
21     3 155          6     3      6
22     3 111          6     4      6
23     3 100          6     5      6
24     3 117          6     6      6

Solution:

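First we drop the last diffr elements from each Area_bsl string: strsplit() splits the string at the commas, head() with a negative n keeps everything except the last diffr elements, and paste() collapses the rest back into one string. Rows with diffr == 0 are left unchanged. Finally, separate_rows() puts each element in its own row: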

library(dplyr)
library(tidyr)

df %>% 
  rowwise() %>% 
  mutate(Area_bsl = ifelse(diffr == 0, Area_bsl, paste(head(strsplit(Area_bsl, ",")[[1]], -diffr), collapse = ","))) %>% 
  separate_rows(Area_bsl, sep = ",") %>% 
  data.frame()
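
To see what the mutate() step does, the same calls can be tried on a single string. The sketch below also shows why the diffr == 0 case needs its own branch. The helper drop_last() is hypothetical, not part of the answer or of any package; it is one possible way to avoid the special case.

```r
x <- strsplit("155,199,198,195,100,112,177,199,188,144", ",", fixed = TRUE)[[1]]

# a negative n drops that many elements from the end
kept <- head(x, -3)
paste(kept, collapse = ",")
# [1] "155,199,198,195,100,112,177"

# head(x, -0) is the same as head(x, 0) and returns an empty vector,
# so diffr == 0 has to be handled separately
length(head(x, -0))
# [1] 0

# hypothetical helper: asking for "length minus n" elements works for n = 0 too
drop_last <- function(s, n) {
  parts <- strsplit(s, ",", fixed = TRUE)[[1]]
  paste(head(parts, length(parts) - n), collapse = ",")
}

drop_last("100,99,98,95,100,112,111,99", 0)
# [1] "100,99,98,95,100,112,111,99"
```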

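Or, with separate_longer_delim() in place of separate_rows() (this function needs tidyr 1.3.0 or later):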

library(dplyr)
library(tidyr)

df %>% 
  rowwise() %>% 
  mutate(Area_bsl = ifelse(diffr == 0, Area_bsl, paste(head(strsplit(Area_bsl, ",")[[1]], -diffr), collapse = ","))) %>% 
  separate_longer_delim(Area_bsl, delim = ",")

Output:

   id Area_bsl diffr
1   1      155     3
2   1      199     3
3   1      198     3
4   1      195     3
5   1      100     3
6   1      112     3
7   1      177     3
8   2      100     0
9   2       99     0
10  2       98     0
11  2       95     0
12  2      100     0
13  2      112     0
14  2      111     0
15  2       99     0
16  3      131     6
17  3      166     6
18  3      155     6
19  3      111     6
20  3      100     6
21  3      117     6
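
For comparison, the pipeline from the question can also be kept almost as it is. The error message says that n in slice_head() must be a constant, so it cannot refer to a column such as cutoff. One possible workaround, sketched below, is to replace slice_head() with filter(), which evaluates column expressions within each group. This variant is not part of the answer above; it rebuilds df so that it runs on its own, and the name res is made up for illustration.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  id = 1:3,
  Area_bsl = c("155,199,198,195,100,112,177,199,188,144",
               "100,99,98,95,100,112,111,99",
               "131,166,155,111,100,117,166,188,101,101,105,166"),
  diffr = c(3, 0, 6),
  stringsAsFactors = FALSE  # only matters on R < 4.0
)

res <- df %>%
  # one row per comma-separated value
  separate_rows(Area_bsl, sep = ",") %>%
  group_by(id) %>%
  # row counter and cutoff point within each id
  mutate(rowid = row_number(),
         cutoff = n() - diffr) %>%
  # keep only the rows up to the cutoff
  filter(rowid <= cutoff) %>%
  ungroup()

res
```

This should return 21 rows (7, 8 and 6 for id 1, 2 and 3) and, unlike the answer above, it keeps the rowid and cutoff helper columns from the desired result.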
