model.matrix works for "month" column of df but gives unexpected output for "week" column

I am trying to construct a model matrix using model.matrix. Here’s my data, stored as a data frame called wILI:

date       value      week month year
1997-10-01  0.002734167 1   10  1997
1997-10-08  0.003612784 2   10  1997
1997-10-15  0.004757731 3   10  1997
1997-10-22  0.006238000 4   10  1997
1997-10-29  0.008132015 5   10  1997
1997-11-05  0.010522688 6   11  1997
1997-11-12  0.013487294 7   11  1997
1997-11-19  0.017080349 8   11  1997
1997-11-26  0.021308731 9   11  1997
1997-12-03  0.026101156 10  12  1997
1997-12-10  0.031279133 11  12  1997
1997-12-17  0.036542190 12  12  1997
1997-12-24  0.041482753 13  12  1997
1997-12-31  0.045640193 14  12  1997
1998-01-07  0.048587584 15  01  1998
1998-01-14  0.050025386 16  01  1998
1998-01-21  0.049847167 17  01  1998
1998-01-28  0.048152678 18  01  1998
1998-02-04  0.045207680 19  02  1998
1998-02-11  0.041371773 20  02  1998
1998-02-18  0.037022686 21  02  1998
1998-02-25  0.032498271 22  02  1998
1998-03-04  0.028064335 23  03  1998
1998-03-11  0.023905745 24  03  1998
1998-03-18  0.020133246 25  03  1998
1998-03-25  0.016798043 26  03  1998
1998-04-01  0.013908254 27  04  1998
1998-04-08  0.011443810 28  04  1998
1998-04-15  0.009368329 29  04  1998
1998-04-22  0.007637759 30  04  1998
1998-04-29  0.006206186 31  04  1998
1998-05-06  0.005029414 32  05  1998
1998-05-13  0.004066965 33  05  1998
1998-05-20  0.003282970 34  05  1998
1998-05-27  0.002646398 35  05  1998 

I am testing two models for the wILI data, one with a month regressor and the other with a week regressor. That is, I want a coefficient for each month (model 1), and each week (model 2). For the above data, the possible months are 1,2,3,4,5,10,11,12 and the possible weeks are 1,2,…,35. When I use model.matrix(~ 0 + month, wILI), it works as expected:

month01 month02 month03 month04 month05 month10 month11 month12
0   0   0   0   0   1   0   0
0   0   0   0   0   1   0   0
0   0   0   0   0   1   0   0
0   0   0   0   0   1   0   0
0   0   0   0   0   1   0   0
0   0   0   0   0   0   1   0
0   0   0   0   0   0   1   0
0   0   0   0   0   0   1   0
0   0   0   0   0   0   1   0
0   0   0   0   0   0   0   1
0   0   0   0   0   0   0   1
0   0   0   0   0   0   0   1
0   0   0   0   0   0   0   1
0   0   0   0   0   0   0   1
1   0   0   0   0   0   0   0
1   0   0   0   0   0   0   0
1   0   0   0   0   0   0   0
1   0   0   0   0   0   0   0
0   1   0   0   0   0   0   0
0   1   0   0   0   0   0   0
0   1   0   0   0   0   0   0
0   1   0   0   0   0   0   0
0   0   1   0   0   0   0   0
0   0   1   0   0   0   0   0
0   0   1   0   0   0   0   0
0   0   1   0   0   0   0   0
0   0   0   1   0   0   0   0
0   0   0   1   0   0   0   0
0   0   0   1   0   0   0   0
0   0   0   1   0   0   0   0
0   0   0   1   0   0   0   0
0   0   0   0   1   0   0   0
0   0   0   0   1   0   0   0
0   0   0   0   1   0   0   0
0   0   0   0   1   0   0   0

The ith row has a 1 in the column of its corresponding month and zeros in all the other columns, just like I want. But when I try the same thing using "week" instead of "month", I get this:

week
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35

…Huh?? Why am I getting a 35×1 vector? I want a 35×35 matrix where row i has a 1 in column i and zeros everywhere else (i.e. the 35×35 identity matrix). Any suggestions for how to accomplish this? And why is the output so different after simply changing "month" to "week"?

Solution:

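Ensure that week and month are factors (or character vectors). A numeric predictor becomes a single column in the model matrix, whereas a factor generates a column for each level, or for all levels except one if there is an intercept. That explains the difference: judging by the leading zeros in the data and the column names month01, month02, …, month is already stored as character or factor, while week is numeric. If a column is already a factor or character, the factor(…) surrounding the variable can be omitted.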

model.matrix(~ factor(month) + 0, wILI)
model.matrix(~ factor(week) + 0, wILI)

Another way to write this, which gives nicer coefficient names (e.g. week1 rather than factor(week)1), is:

model.matrix(~ month + 0, transform(wILI, month = factor(month)))
model.matrix(~ week + 0, transform(wILI, week = factor(week)))
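
The difference between numeric and factor predictors can be checked on a small example. The following is a minimal sketch using a made-up data frame named toy (hypothetical, not the asker's wILI data), with a numeric week column and a character month column:

```r
# Toy data (hypothetical stand-in for wILI): week is numeric, month is character
toy <- data.frame(
  week  = 1:6,
  month = c("10", "10", "11", "11", "12", "12")
)

# Numeric predictor: a single column holding the raw values
m_num <- model.matrix(~ 0 + week, toy)
dim(m_num)   # 6 rows, 1 column

# Factor predictor, no intercept: one indicator column per level
m_fac <- model.matrix(~ 0 + factor(week), toy)
dim(m_fac)   # 6 rows, 6 columns (an identity pattern here, one row per week)

# With an intercept, the first level's indicator is replaced by "(Intercept)"
m_int <- model.matrix(~ factor(week), toy)
colnames(m_int)[1]

# A character (or factor) column is expanded into indicators automatically
m_chr <- model.matrix(~ 0 + month, toy)
colnames(m_chr)
```

Running class(wILI$week) and class(wILI$month) on the real data should confirm which columns need converting.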
