I have some data where time is nested within individuals:
set.seed(124)
x <- rnorm(25)
df1 <- data.frame(id = rep(1:5, each = 5), time = 1:5, x = x)
What would be a base R solution to append a column containing each observation’s deviation from that same person’s mean across time (i.e., centering around the person’s mean)? The output should look like this (x.c is the appended column holding the deviations from the person’s mean):
id time x x.c
1 1 1 -1.38507062 3.814056e-07
2 1 2 0.03832318 1.423394e+00
3 1 3 -0.76303016 6.220408e-01
4 1 4 0.21230614 1.597377e+00
5 1 5 1.42553797 2.810609e+00
6 2 1 0.74447982 2.233398e-08
7 2 2 0.70022940 -4.425040e-02
8 2 3 -0.22935461 -9.738344e-01
9 2 4 0.19709386 -5.473859e-01
10 2 5 1.20715377 4.626740e-01
11 3 1 0.31833673 2.642477e-08
12 3 2 -1.42379885 -1.742136e+00
13 3 3 -0.40509086 -7.234276e-01
14 3 4 0.99538657 6.770499e-01
15 3 5 0.95881779 6.404811e-01
16 4 1 0.91808790 -3.680049e-09
17 4 2 -0.15096960 -1.069058e+00
18 4 3 -1.22306879 -2.141157e+00
19 4 4 -0.86882429 -1.786912e+00
20 4 5 -1.04248536 -1.960573e+00
21 5 1 -1.10363778 2.169331e-07
22 5 2 0.44418506 1.547823e+00
23 5 3 -0.20495061 8.986874e-01
24 5 4 1.67563243 2.779270e+00
25 5 5 -0.13132225 9.723158e-01
I know the tidyverse approach uses group_by(), but I would like a base R solution. Thank you!
Solution:
A base R solution is to compute the mean of ‘x’ within each ‘id’ with ave(), which returns a vector the same length as ‘x’ holding each row’s group mean, and subtract it from the individual observations of ‘x’:
df1$x.c <- with(df1, x - ave(x, id))
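For anyone who wants to check the result, here is a minimal self-contained sketch of the same idea. It rebuilds the data from the question as df1 and stores the group means in a helper column, x.mean, which is a name introduced here purely for illustration. It relies on ave() using mean as its default FUN.

```r
# Rebuild the example data from the question
set.seed(124)
df1 <- data.frame(id = rep(1:5, each = 5), time = 1:5, x = rnorm(25))

# ave() returns each id's mean, repeated on every row belonging to that id
# (x.mean is an illustrative helper column, not part of the original answer)
df1$x.mean <- ave(df1$x, df1$id)

# Person-mean centering: observation minus that person's mean
df1$x.c <- df1$x - df1$x.mean

# Sanity check: centered values should sum to (numerically) zero within each id
tapply(df1$x.c, df1$id, sum)
```

If x can contain missing values, one option is to pass a custom function, e.g. ave(x, id, FUN = function(v) mean(v, na.rm = TRUE)), so that an NA does not turn the whole group mean into NA.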