Sorry for the post title; I could not think of a more succinct way to describe my problem. Say I have a dataset with three sets of outcome variables for each participant. Each outcome variable has two columns, one signifying a group each observation belongs to, and another signifying the score or value for that observation. something like this

```
set.seed(1)
d <- tibble(id = factor(rep(c("tb_10",
"ah_04",
"ck_17"), each = 3)),
out1Fact = factor(sample(x = letters[1:5],
size = 9,
replace = T)),
out1Num = rnorm(9),
out2Fact = factor(sample(x = letters[1:5],
size = 9,
replace = T)),
out2Num = rnorm(9),
out3Fact = factor(sample(x = letters[1:5],
size = 9,
replace = T)),
out3Num = rnorm(9))
d
# output
# # A tibble: 9 x 7
# id out1Fact out1Num out2Fact out2Num out3Fact out3Num
# <fct> <fct> <dbl> <fct> <dbl> <fct> <dbl>
# 1 tb_10 a 0.487 b 0.0746 b -0.832
# 2 tb_10 d 0.738 a -1.99 e -1.17
# 3 tb_10 a 0.576 d 0.620 b -1.07
# 4 ah_04 b -0.305 a -0.0561 a -1.56
# 5 ah_04 e 1.51 d -0.156 c 1.16
# 6 ah_04 c 0.390 c -1.47 c 0.832
# 7 ck_17 b -0.621 b -0.478 d -0.227
# 8 ck_17 c -2.21 b 0.418 c 0.266
# 9 ck_17 c 1.12 d 1.36 a -0.377
```

Now what I need to do is collapse this into three variables: the first that indicates the outcome contained in the first part of the column name (i.e. either `out1`

, `out2`

, or `out3`

), the second the value of the Factor for the observation (i.e. contained in all the columns ending in ‘Fact’), the third the value of the numerical observation (i.e. contained in all the columns ending in ‘Num’).

It should look something like this (note the values in `factVal`

and `numVal`

will not match the corresponding values in the original dataframe, this is just to show you the shape I need)

```
# # A tibble: 27 x 4
# id outType factVal numVal
# <fct> <int> <chr> <dbl>
# 1 tb_10 1 a 1.10
# 2 tb_10 1 e 0.144
# 3 tb_10 1 e -0.118
# 4 tb_10 2 a -0.912
# 5 tb_10 2 a -1.44
# 6 tb_10 2 c -0.797
# 7 tb_10 3 b 1.25
# 8 tb_10 3 b 0.772
# 9 tb_10 3 c -0.220
# 10 ah_04 1 b -0.425
# # ... with 17 more rows
```

Now `pivot_longer()`

and `pivot_wider()`

can do some miraculous stuff but this one is beyond me. I tried using the ‘names_pattern` argument but this particular problem was beyond me. Any help much appreciated.

### >Solution :

```
pivot_longer(d, -id, names_pattern = "out([0-9]+)(.*)", names_to = c("outType", ".value"))
# # A tibble: 27 x 4
# id outType Fact Num
# <fct> <chr> <fct> <dbl>
# 1 tb_10 1 a 0.487
# 2 tb_10 2 b 0.0746
# 3 tb_10 3 b -0.832
# 4 tb_10 1 d 0.738
# 5 tb_10 2 a -1.99
# 6 tb_10 3 e -1.17
# 7 tb_10 1 a 0.576
# 8 tb_10 2 d 0.620
# 9 tb_10 3 b -1.07
# 10 ah_04 1 b -0.305
# # ... with 17 more rows
```

In the `names_pattern`

, we identify two portions of the column names that we want to work off of: the numeric portion (`"(0-9]+)"`

) and all characters after that number (`"(.*)"`

). Those two patterns correspond to the two components of `names_to`

, and the `".value"`

special value maps to one or more different columns (in this case two).

Said different, the non-special `"outType"`

maps the numbers (1, 2, 3) in the column names to a single column names `"outType"`

(obviously).

If we could imagine mapping something like

```
names_pattern = "out([0-9]+)(Fact|Num)"
names_to = c("outType", "Fact", "Num")
# \_ ".value" _/
```

then it might help to understand how the *dynamic* nature of the output columns are determined and mapped.