My data look like this
| Person ID | Treatment code |
|---|---|
| A | 00001 |
| A | 00002 |
| A | 00003 |
| B | 00002 |
| B | 00012 |
| C | 00003 |
| C | 00015 |
Each patient can have an infinite number of treatment.
What I would like is a line for each patient with as much columns as the maximum number of treatment a patient can have. And, in each column of treatment, the corresponding code.
| Pateint Id | Treatment 1 | Treatment 2 | Treatment 3 |
|---|---|---|---|
| A | 00001 | 00002 | 00003 |
| B | 00002 | 00012 | missing |
| C | 00003 | 00015 | missing |
I used to do this very easily in Stata but I am now struggling with R. I have tried to use the tidyr package and the pivot_wider function but did not succeed.
Thanks a lot for your help!
>Solution :
While this could look like a clear duplicate of How to reshape data from long to wide format, there is a missing link. Before we start pivoting we need to obtain the row number per group.
E.g.
library(dplyr)
library(tidyr)
df |>
group_by(PersonID) |>
mutate(name = row_number()) |>
ungroup() |>
pivot_wider(id_cols = "PersonID",
values_from = "Treatmentcode",
names_prefix = "Treatment")
Output:
# A tibble: 3 × 4
PersonID Treatment1 Treatment2 Treatment3
<chr> <chr> <chr> <chr>
1 A 00001 00002 00003
2 B 00002 00012 NA
3 C 00003 00015 NA
Data:
library(readr)
df <- read_table("PersonID Treatmentcode
A 00001
A 00002
A 00003
B 00002
B 00012
C 00003
C 00015")