Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to properly use pivot_wider() to align the values of two variables?

I have one dataset as below.

library(dplyr)
library(tidyr)

df= tibble::tibble(
    variety=rep(c("CV1", "CV2", "CV3"), each=16L),
    irrigation=rep(rep(c("yes", "no"), 3), each=8L),
    fertilizer=rep(rep(c("Organic", "Urea"), 6), each=4L),
    reps=c(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 3, 4, 3, 1, 3, 4, 3, 2, 1, 4,
    2, 3, 1, 4, 1, 3, 1, 2, 2, 4, 1, 2, 1, 4, 2, 3, 1, 4, 2, 3, 2, 4),
    yield=c(8.379842, 8.058658, 9.73285, 9.224371999999999, NA, 6.996108000000001,
    9.865782, 7.112071666666666, 5.968758, 8.976471666666667, 7.980724, 9.35065,
    5.5111574999999995, 6.998728, 6.164252, 5.118412857142857, 7.748125, 8.58071,
    NA, NA, 7.673354999999999, 7.91948, NA, NA, 11.190445, 8.463484999999999,
    9.61818, 10.89841, 7.83943, 8.44905, 9.844165, 9.98026, 10.130675, 9.59432,
    NA, NA, 9.502525, 9.216965, NA, NA, 7.807259999999999, 9.94434, 7.92808,
    11.88664, 10.700185000000001, 10.723835000000001, 11.363140000000001,
    11.846934999999998),
    nutrients=c(0.42549600000000004, 0.417924, 0.47264, 0.45002, NA, 0.381154, 0.484084,
    0.3597316666666666, 0.32555, 0.45681666666666665, 0.38164600000000004,
    0.456822, 0.30655, 0.363892, 0.350876, 0.30200857142857146, 0.26754,
    0.30954499999999996, NA, NA, 0.328395, 0.30893, NA, NA, 0.37877, 0.33532,
    0.40417000000000003, 0.4581, 0.32077500000000003, 0.33331500000000003,
    0.39925, 0.40179000000000004, 0.40585499999999997, 0.339465, NA, NA, 0.339545,
    0.34077500000000005, NA, NA, 0.3227, 0.37770000000000004, 0.34663, 0.48564,
    0.43601500000000004, 0.38200500000000004, 0.47248500000000004, 0.506255),
)
head(df,5)
variety irrigation  fertilizer  reps  yield     nutrients
CV1     yes         Organic     1     8.379842  0.425496
CV1     yes         Organic     2     8.058658  0.417924
CV1     yes         Organic     3     9.732850  0.472640
CV1     yes         Organic     4     9.224372  0.450020
CV1     yes         Urea        1     NA        NA
.
.
.

I want to align the yields for Organic fertilizer with those for Urea fertilizer, as well as the nutrients between the two fertilizers to create regression graph between two different yield and nutrients. Initially, I attempted to use pivot_wider().

df2= data.frame(df %>%
                group_by(variety, irrigation) %>%
                pivot_wider(names_from=fertilizer, values_from=nutrients))
head(df2,8)
variety irrigation  reps     nutrients  Organic   Urea
1   CV1     yes         1    0.4254960  8.379842  NA
2   CV1     yes         2    0.4179240  8.058658  NA
3   CV1     yes         3    0.4726400  9.732850  NA
4   CV1     yes         4    0.4500200  9.224372  NA
5   CV1     yes         1    NA         NA        NA
6   CV1     yes         2    0.3811540  NA        6.996108
7   CV1     yes         3    0.4840840  NA        9.865782
8   CV1     yes         4    0.3597317  NA        7.112072
    .
    .
    .

Currently, the yields for Organic and Urea are not aligned next to each other. What I’m aiming for is a layout like the following.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

variety irrigation  reps     nutrients  Organic   Urea
1   CV1     yes         1    0.4254960  8.379842  NA
2   CV1     yes         2    0.4179240  8.058658  6.996108
3   CV1     yes         3    0.4726400  9.732850  9.865782
4   CV1     yes         4    0.4500200  9.224372  7.112072
    .
    .
    .

How can I solve this problem? Also, is there any way to transpose both yield and nutrients at once?

Thanks,

>Solution :

Here’s my best guess. It looks like in a few cases you have multiple observations with the same variety/irrigation/reps/fertilizer. To address that, I add a variable obs to distinguish these.

df |>
  arrange(variety, irrigation, reps) %>% 
  mutate(obs = row_number(), .by = c(variety, irrigation, reps, fertilizer)) %>%
  pivot_wider(names_from = fertilizer, values_from = c(yield, nutrients))

Result

# A tibble: 28 × 8
   variety irrigation  reps   obs yield_Organic yield_Urea nutrients_Organic nutrients_Urea
   <chr>   <chr>      <dbl> <int>         <dbl>      <dbl>             <dbl>          <dbl>
 1 CV1     no             1     1          5.97       5.51             0.326          0.307
 2 CV1     no             2     1          8.98       7.00             0.457          0.364
 3 CV1     no             3     1          7.98       6.16             0.382          0.351
 4 CV1     no             4     1          9.35       5.12             0.457          0.302
 5 CV1     yes            1     1          8.38      NA                0.425         NA    
 6 CV1     yes            2     1          8.06       7.00             0.418          0.381
 7 CV1     yes            3     1          9.73       9.87             0.473          0.484
 8 CV1     yes            4     1          9.22       7.11             0.450          0.360
 9 CV2     no             1     1         11.2        7.84             0.379          0.321
10 CV2     no             1     2         NA          9.84            NA              0.399
# ℹ 18 more rows
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading