I have data with columns such as Area_bsl that contain strings of comma-separated values, and a column diffr that states the number of elements by which Area_bsl must be shortened:
df <- data.frame(
  id = 1:3,
  Area_bsl = c("155,199,198,195,100,112,177,199,188,144",
               "100,99,98,95,100,112,111,99",
               "131,166,155,111,100,117,166,188,101,101,105,166"),
  diffr = c(3,0,6)
)
So what I need to do is cut off …
- the last 3 elements in Area_bsl where id == 1
- 0 elements in Area_bsl where id == 2
- the last 6 elements in Area_bsl where id == 3
I’ve been approaching this task like this; the last part, using slice_head, throws an error:
library(tidyverse)
df %>%
  # separate comma-separated values into rows:
  separate_rows(Area_bsl) %>%
  # for each `id`...:
  group_by(id) %>%
  # ...create a row counter:
  mutate(rowid = row_number()) %>%
  # ...create the cutoff point:
  mutate(cutoff = last(rowid) - diffr) %>%
  # ...slice out as many as `cutoff` rows: <--- does not work!
  slice_head(n = cutoff[1])
Error in `slice_head()`:
! `n` must be a constant.
Caused by error in `force()`:
! object 'cutoff' not found
The desired result is this:
id Area_bsl diffr rowid cutoff
<int> <chr> <dbl> <int> <dbl>
1 1 155 3 1 7
2 1 199 3 2 7
3 1 198 3 3 7
4 1 195 3 4 7
5 1 100 3 5 7
6 1 112 3 6 7
7 1 177 3 7 7
11 2 100 0 1 8
12 2 99 0 2 8
13 2 98 0 3 8
14 2 95 0 4 8
15 2 100 0 5 8
16 2 112 0 6 8
17 2 111 0 7 8
18 2 99 0 8 8
19 3 131 6 1 6
20 3 166 6 2 6
21 3 155 6 3 6
22 3 111 6 4 6
23 3 100 6 5 6
24 3 117 6 6 6
Solution:
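First we remove the last diffr elements from each Area_bsl string: strsplit() splits the string on commas, head() with a negative count drops the trailing elements, and paste() collapses the rest back into one string. Rows with diffr == 0 are left unchanged. Finally we use separate_rows() to spread the values over rows: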
library(dplyr)
library(tidyr)
df %>%
  rowwise() %>%
  mutate(Area_bsl = ifelse(
    diffr == 0,
    Area_bsl,
    paste(head(strsplit(Area_bsl, ",")[[1]], -diffr), collapse = ",")
  )) %>%
  separate_rows(Area_bsl, sep = ",") %>%
  data.frame()
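A note on why the diffr == 0 case is handled separately: in R, -0 is just 0, so head(x, -0) returns an empty vector rather than all of x. The short base R sketch below illustrates this and shows one possible way to avoid the guard. The helper name drop_last is hypothetical and not part of the answer above.

```r
# Split one of the example strings into its elements
x <- strsplit("100,99,98", ",")[[1]]

# A negative count drops elements from the end
head(x, -1)

# -0 equals 0, so this keeps zero elements instead of all of them
head(x, -0)

# Hypothetical helper (an assumption, not from the answer):
# keep the first length(x) - n elements, which also works for n = 0
drop_last <- function(x, n) head(x, length(x) - n)

drop_last(x, 0)
drop_last(x, 1)
```

With a helper like this, the ifelse() guard could be replaced by a single paste(drop_last(...), collapse = ",") call.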
Or, using separate_longer_delim() (available in tidyr 1.3.0 and later) in place of separate_rows():
library(dplyr)
library(tidyr)
df %>%
  rowwise() %>%
  mutate(Area_bsl = ifelse(
    diffr == 0,
    Area_bsl,
    paste(head(strsplit(Area_bsl, ",")[[1]], -diffr), collapse = ",")
  )) %>%
  separate_longer_delim(Area_bsl, delim = ",")
id Area_bsl diffr
1 1 155 3
2 1 199 3
3 1 198 3
4 1 195 3
5 1 100 3
6 1 112 3
7 1 177 3
8 2 100 0
9 2 99 0
10 2 98 0
11 2 95 0
12 2 100 0
13 2 112 0
14 2 111 0
15 2 99 0
16 3 131 6
17 3 166 6
18 3 155 6
19 3 111 6
20 3 100 6
21 3 117 6
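If the rowid and cutoff columns from the desired result are wanted as well, the pipeline from the question can probably be kept almost as it is. The error message suggests that slice_head() does not look up n among the data frame's columns, so the sketch below swaps it for filter(), which does see per-group columns. This is an alternative under that assumption, not the method of the answer above.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  id = 1:3,
  Area_bsl = c("155,199,198,195,100,112,177,199,188,144",
               "100,99,98,95,100,112,111,99",
               "131,166,155,111,100,117,166,188,101,101,105,166"),
  diffr = c(3, 0, 6)
)

res <- df %>%
  # one row per comma-separated value
  separate_rows(Area_bsl, sep = ",") %>%
  group_by(id) %>%
  # row counter and cutoff point within each id
  mutate(rowid = row_number(),
         cutoff = n() - diffr) %>%
  # keep only the rows up to the cutoff
  filter(rowid <= cutoff) %>%
  ungroup()

res
```

Because filter() compares rowid and cutoff row by row, no special case is needed for diffr == 0.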