Why does model.matrix not create the dummy column for females?
suppressMessages(library(tidyverse))
data(titanic_train, package = "titanic")
titanicTib <- as_tibble(titanic_train) %>%
  mutate_at(.vars = c("Survived", "Sex", "Pclass"), .funs = factor) %>%
  mutate(FamSize = SibSp + Parch) %>%
  select(Pclass, Sex, Age, Fare, FamSize)
X <- model.matrix(~ -1 + Pclass + Sex + Age + Fare + FamSize, data = titanicTib)
head(X, 5)
Solution:
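With the intercept removed (-1), model.matrix expands only the first factor in the formula (Pclass) into a full set of indicator columns. Every later factor, including Sex, is coded with the default treatment contrasts, which omit the reference level (here female). To have all levels of variable Sex in the model matrix, add
contrasts.arg = list(Sex = contrasts(titanicTib$Sex, contrasts = FALSE))
to the model.matrix call: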
X <- model.matrix(~ -1 + Pclass + Sex + Age + Fare + FamSize, data = titanicTib,
                  contrasts.arg = list(Sex = contrasts(titanicTib$Sex, contrasts = FALSE)))
head(X, 5)
#> Pclass1 Pclass2 Pclass3 Sexfemale Sexmale Age Fare FamSize
#> 1 0 0 1 0 1 22 7.2500 1
#> 2 1 0 0 1 0 38 71.2833 1
#> 3 0 0 1 1 0 26 7.9250 0
#> 4 1 0 0 1 0 35 53.1000 1
#> 5 0 0 1 0 1 35 8.0500 0
Created on 2023-02-21 with reprex v2.0.2
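The same behaviour can be reproduced without the titanic data. The sketch below uses a small made-up data frame (the names df, cls and sex are placeholders, not from the question) to compare the default coding with the contrasts = FALSE coding, and to check one likely side effect: with full indicator columns for two factors, the columns are linearly dependent.

```r
# Toy data with two factors; base R only, no packages required
df <- data.frame(
  cls = factor(c("a", "b", "c", "a", "b", "c")),
  sex = factor(c("male", "female", "female", "female", "male", "male"))
)

# Default: without an intercept only the first factor (cls) gets one column
# per level; sex uses treatment contrasts and loses its reference level
default_cols <- colnames(model.matrix(~ -1 + cls + sex, data = df))
default_cols
#> [1] "clsa"    "clsb"    "clsc"    "sexmale"

# Identity contrast matrix for sex: one indicator column per level
full <- model.matrix(
  ~ -1 + cls + sex,
  data = df,
  contrasts.arg = list(sex = contrasts(df$sex, contrasts = FALSE))
)
full_cols <- colnames(full)
full_cols
#> [1] "clsa"      "clsb"      "clsc"      "sexfemale" "sexmale"

# The cls columns and the sex columns each sum to 1 in every row,
# so the matrix is rank deficient
rank_deficient <- qr(full)$rank < ncol(full)
rank_deficient
#> [1] TRUE
```

The rank deficiency is usually harmless for regularised or tree-based models, but an unpenalised fit such as lm would not be able to estimate a separate coefficient for every column.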