Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Reformat data frame in R for hourly values

I have some data in my R data frame like this:

CharDate         loc  t_0h  t_1h       t_2h     t_3h     t_4h     t_5h     t_6h
xyz_20110101     A    2     20.00000    5       12.00000 13.00000 25.00000  2.00000
xyz_20110101     B    3     26.66667    7       14.33333 15.33333 28.33333 14.66667
xyz_20110102     A    4     33.33333    9       16.66667 17.66667 31.66667 27.33333
xyz_20110102     B    5     40.00000   11       19.00000 20.00000 35.00000 40.00000

The first two columns are in character format and the rest are numeric. The columns t_0h.. represents the value at a specific hour.

I want to reformat the above data like this: date, hour (separate column, 0 – 6), hourly values for A and B as separate columns, and so on (next date..).

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

CharData        Hour t_A  t_B
xyz_20110101    0    2    3
xyz_20110101    1    20   26.6
xyz_20110101    2    5    7
xyz_20110101    3    12   14.3
xyz_20110101    4    13   15.3
xyz_20110101    5    25   28.3
xyz_20110101    6    2    14.6
xyz_20110102    0    4    5
xyz_20110102    1    33.3 40
xyz_20110102    2    9    11
xyz_20110102    3    16.6 19
xyz_20110102    4    17.6 20
xyz_20110102    5    31.6 35
xyz_20110102    6    27.3 40

I have tried the followings:

# A simple transpose
df2 <- data.frame(t(df[-c(1,2)]))
# Melt via data.table
long <- melt(setDT(df), id.vars = c("CharDate"))
and pivot_longer() from tidyr package

but no success so far. Can someone help here (or give hints) to achieve the desired output?

Here is the sample data for reproducibility:

CharDate<- c("xyz_20110101", "xyz_20110101", "xyz_20110102", "xyz_20110102")
loc <- c("A", "B", "A", "B")
t_0h <- c(seq(2,5,length.out=4))
t_1h <- c(seq(20,40,length.out=4))
t_2h <- c(seq(5,11,length.out=4))
t_3h <- c(seq(12,19,length.out=4))
t_4h <- c(seq(13,20,length.out=4))
t_5h <- c(seq(25,35,length.out=4))
t_6h <- c(seq(2,40,length.out=4))

df <- (data.frame(CharDate,loc,t_0h,t_1h,t_2h,t_3h,t_4h,t_5h,t_6h))
df

>Solution :

We can use pivot_longer/pivot_wider

library(tidyr)
library(dplyr)
 pivot_longer(df, cols = starts_with("t_"),
    names_to = c(".value", "Hour"), names_pattern = "^(.)_(\\d+)h") %>% 
  pivot_wider(names_from = loc, values_from = t)

-output

# A tibble: 14 × 4
   CharDate     Hour      A     B
   <chr>        <chr> <dbl> <dbl>
 1 xyz_20110101 0       2     3  
 2 xyz_20110101 1      20    26.7
 3 xyz_20110101 2       5     7  
 4 xyz_20110101 3      12    14.3
 5 xyz_20110101 4      13    15.3
 6 xyz_20110101 5      25    28.3
 7 xyz_20110101 6       2    14.7
 8 xyz_20110102 0       4     5  
 9 xyz_20110102 1      33.3  40  
10 xyz_20110102 2       9    11  
11 xyz_20110102 3      16.7  19  
12 xyz_20110102 4      17.7  20  
13 xyz_20110102 5      31.7  35  
14 xyz_20110102 6      27.3  40  
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading