dplyr & dataset: how to pivot?

I’m trying to pivot this dataframe without success


> head( data_shape, 15)
# A tibble: 15 x 5
COD_REG    apqual   num_pen    pens_mean   pens_mean_real
<int>       <int>    <int>     <dbl>         <dbl>
1            16      116       843.           7.69
1            15      674      1309.          11.9 
2            15       36      1496.          13.6 
2            16        3      1200           10.9 
3            15      917      1494.          13.6 
3            16       99       915.           8.35
4            15      314      1380.          12.9 
4            16       36       734.           6.84
5            16       68       724.           6.76
5            15      556      1405.          13.1 
6            15      195      1642.          15.3 
6            16       26       952            8.88
7            15      269      1386.          12.6 
7            16       52       946.           8.63
8            15      655      1425.          13.3 

I need to have the apqual as columns such that all the other values of the other columns referred to the same COD_REG lie on a single row ( in my dataset you can notice that they are distributed on 2 rows)

I mean this

COD_REG  apqual15   apqual16  num_pen15  num_pen16   pens_mean15  pens_mean16  pens_mean_real15 pens_mean_real16

1            15        16       674          116      1309.            843.          11.9         7.69           
2            15        16      36             3       1496.            1200          13.6         10.9             
3            15        16     917            99       1494.            915.          13.6         8.35           
4            15        16     314            36       1380.            734.          12.9         6.84                      
5            15        16     556            68       1405.            724.          13.1         6.76
6            15        16     195            26       1642.            952           15.3         8.88            
7            15        16     269            52       1386.            946.          12.6         8.63

I’ve tried different pivot_longer, but it seems I’m not able to set it correctly.. Can anyone help me???

>Solution :

Since you’re increasing the number of columns, you want to pivot_wider:

pivot_wider(data_shape,
            names_from = 'apqual',
            values_from = c('apqual', 'num_pen', 'pens_mean', 'pens_mean_real'))

# A tibble: 8 × 7
  COD_REG apqual_16 apqual_15 num_pen_16 num_pen_15 pens_mean_16 pens_mean_15 pens_mean_real_16 pens_mean_real_15
    <int>     <int>     <int>      <int>      <int>        <dbl>        <dbl>             <dbl>             <dbl>
1       1        16        15        116        674          843         1309              7.69              11.9
2       2        16        15          3         36         1200         1496             10.9               13.6
3       3        16        15         99        917          915         1494              8.35              13.6
4       4        16        15         36        314          734         1380              6.84              12.9
5       5        16        15         68        556          724         1405              6.76              13.1
6       6        16        15         26        195          952         1642              8.88              15.3
7       7        16        15         52        269          946         1386              8.63              12.6
8       8        NA        15         NA        655           NA         1425             NA                 13.3

You can select all the columns to spread by name, as above, or use other tidyverse methods to specify the columns. Both of the below functions will return the same thing:

# Specify all columns not to spread wider:
pivot_wider(data_shape,
            names_from = 'apqual',
            values_from = -'COD_REG')

# Use tidyverse selector functions
pivot_wider(data_shape,
            names_from = 'apqual',
            values_from = c('apqual', matches('pen')))

Leave a Reply