I have the dataframe:
sub3 <- df1[, c('Attrition', "Age", "DistanceFromHome", "MonthlyIncome", "NumCompaniesWorked", "PercentSalaryHike", "TotalWorkingYears", "TrainingTimesLastYear", "YearsAtCompany", "YearsSinceLastPromotion", "YearsWithCurrManager")]
sub3
Where Attrition is the response variable.
I am trying to run a loop of ANOVA tests in R to check the relationship between my response variable and each of the other variables. My code is:
df_num <- function(x) {
  aov <- aov(as.numeric(sub3$Attrition) ~ sub3[, x], data = sub3)
  res <- data.frame('row' = 'Attrition'
                    , 'column' = colnames(sub3)[x]
                    , "p.value" = summary(aov)[[1]][["Pr(>F)"]]
  )
  return(res)
}
num_df <- do.call(rbind, lapply(seq_along(sub3)[-1], df_num))
head(num_df)
But my result is:
        row           column      p.value
1 Attrition              Age 1.996802e-26
2 Attrition              Age           NA
3 Attrition DistanceFromHome 5.182860e-01
4 Attrition DistanceFromHome           NA
5 Attrition    MonthlyIncome 3.842748e-02
6 Attrition    MonthlyIncome           NA
I do not understand why the code does not seem to run for all variables in the dataset, or why Age, DistanceFromHome and MonthlyIncome are duplicated.
>Solution :
Your code probably runs for all the variables, but you’re only displaying the first 6 rows because you call head(num_df). Try running print(num_df), which displays all rows of a plain data frame (for a tibble, use print(num_df, n = nrow(num_df)) instead).
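The reason for the duplicated values in num_df is that the summary table of the aov object you’re creating has 2 rows: one for the predictor and one for Residuals. Subsetting the column Pr(>F) therefore returns two values, the p-value and an NA for the Residuals row, and data.frame() recycles 'row' and 'column' to match, giving two rows per variable. You can check this yourself with the following, which computes the ANOVA for the pair of Attrition and Age: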
aov <- aov(as.numeric(sub3$Attrition) ~ sub3[, 2], data = sub3)
summary(aov)[[1]][["Pr(>F)"]] # this will report the p-value, and a NA value
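If you don't have the data at hand, the same behaviour can be reproduced on a built-in dataset. The sketch below uses mtcars as a stand-in (it is not the data from the question, and mpg and wt are just placeholder columns):

```r
# Stand-in data: mtcars ships with R; mpg plays the role of Attrition here.
fit <- aov(mpg ~ wt, data = mtcars)
tab <- summary(fit)[[1]]   # the ANOVA table for this fit

nrow(tab)                  # 2: one row for the term wt, one for Residuals
rownames(tab)              # the row labels (padded with spaces by summary.aov)
tab[["Pr(>F)"]]            # a p-value for wt, then NA for the Residuals row
```

The Residuals row has no F statistic, so its entry in Pr(>F) is always NA.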
To fix the duplication, you need to extract only the first value from the Pr(>F) column, like so:
df_num <- function(x) {
  aov <- aov(as.numeric(sub3$Attrition) ~ sub3[, x], data = sub3)
  res <- data.frame('row' = 'Attrition'
                    , 'column' = colnames(sub3)[x]
                    , "p.value" = summary(aov)[[1]][["Pr(>F)"]][1] # use only the first value of the p-value column
  )
  return(res)
}
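As a possible variation, the same loop can be written so that it does not depend on a global sub3 and builds each formula by column name. This is only a sketch: anova_pvalues is a hypothetical helper name, mtcars is stand-in data, and it assumes the response column is already numeric (Attrition would need converting first).

```r
# Hypothetical helper (not from the original post): one-predictor ANOVA
# p-values for every column of `data` other than `response`.
anova_pvalues <- function(data, response) {
  predictors <- setdiff(names(data), response)
  p <- vapply(predictors, function(v) {
    # reformulate("wt", response = "mpg") builds the formula mpg ~ wt
    fit <- aov(reformulate(v, response = response), data = data)
    summary(fit)[[1]][["Pr(>F)"]][1]   # first row holds the predictor's p-value
  }, numeric(1))
  data.frame(row = response, column = predictors, p.value = unname(p))
}

# Stand-in data: mpg plays the role of the response variable.
res <- anova_pvalues(mtcars[, c("mpg", "wt", "hp", "qsec")], "mpg")
print(res)   # one row per predictor, no NA rows
```

Using vapply with numeric(1) makes the function fail loudly if a fit ever returns more than one value, instead of silently producing extra rows.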