Suppose I have the following variables in a data frame, Data1:
STUDENT COURSE GRADE TIME FEUW GH X1 Y6 U9 W3 Q0
and I wish to fit this model:
model = lm(GRADE ~ COURSE + TIME + GH + X1 + Y6 + W3 + Q0, data = Data1)
Is there a way to avoid typing out every variable name, so that I can write something like
model = lm(GRADE ~ COURSE + TIME + GH/Y6 + W3/Q0) ?
>Solution :
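lm formulas have no range shorthand (inside a formula, GH/Y6 means nesting, not "GH through Y6"). Instead, pick the columns first and use . on the right-hand side of the formula, which stands for every column in data other than the response. dplyr::select gives you flexibility to include or exclude variables by name, by position (e.g. 1 is the first column), by ranges of names or positions, or by names that start with, end with, or contain a given string.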
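A quick sketch of a few of those selection styles, using the built-in mtcars data. This assumes dplyr is installed; the names on the left (by_position and so on) are just illustrative labels, not anything dplyr defines.

```r
library(dplyr)

# By position: the first three columns
by_position <- names(select(mtcars, 1:3))                 # mpg, cyl, disp

# By prefix / suffix / substring of the column name
by_prefix <- names(select(mtcars, starts_with("d")))      # disp, drat
by_suffix <- names(select(mtcars, ends_with("t")))        # drat, wt
by_substr <- names(select(mtcars, contains("ar")))        # gear, carb

# By exclusion: every column except wt and qsec
by_exclusion <- names(select(mtcars, -wt, -qsec))

# Any of these can feed lm() directly, with . meaning "all remaining columns"
fit <- lm(mpg ~ ., data = select(mtcars, mpg, starts_with("d")))
```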
For instance, for ranges of columns:
lm(mpg ~ ., data = dplyr::select(mtcars, mpg, cyl:drat, gear:carb))
> mtcars
                mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Mazda RX4      21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag  21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
Datsun 710     22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive 21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
...
> dplyr::select(mtcars, mpg, cyl:drat, gear:carb)
                mpg cyl  disp  hp drat gear carb
Mazda RX4      21.0   6 160.0 110 3.90    4    4
Mazda RX4 Wag  21.0   6 160.0 110 3.90    4    4
Datsun 710     22.8   4 108.0  93 3.85    4    1
Hornet 4 Drive 21.4   6 258.0 110 3.08    3    1
...
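Applied to the question's own columns, the same idea might look like the sketch below. Data1 here is a made-up stand-in filled with random numbers (the real data is not shown in the question); only the column names and their order are taken from the post. The range GH:Y6 picks up GH, X1 and Y6, and W3:Q0 picks up W3 and Q0, so U9, FEUW and STUDENT are left out, matching the long-hand formula.

```r
library(dplyr)

# Hypothetical stand-in for Data1: random values, column order as in the question
set.seed(1)
cols <- c("STUDENT", "COURSE", "GRADE", "TIME", "FEUW",
          "GH", "X1", "Y6", "U9", "W3", "Q0")
Data1 <- as.data.frame(
  matrix(rnorm(30 * length(cols)), nrow = 30, dimnames = list(NULL, cols))
)

# Keep the response plus the wanted predictors, using name ranges
model_data <- select(Data1, GRADE, COURSE, TIME, GH:Y6, W3:Q0)
model <- lm(GRADE ~ ., data = model_data)

# Base R alternative without dplyr: subset() also accepts name ranges in select
model_base <- lm(GRADE ~ .,
                 data = subset(Data1, select = c(GRADE, COURSE, TIME, GH:Y6, W3:Q0)))

# Long-hand version from the question, for comparison
model_long <- lm(GRADE ~ COURSE + TIME + GH + X1 + Y6 + W3 + Q0, data = Data1)
same_fit <- isTRUE(all.equal(coef(model), coef(model_long)))
```

Note that name ranges depend on the column order of the data frame, so reordering Data1 would change which variables GH:Y6 selects.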