I have a problem using a variable's value inside the subset function. My code gives the warning "Warning: Error in -: invalid argument to unary operator", because "val" in "-c(val)" inside subset is not treated as the variable defined above.
cname <- c("A1","A2","A3","A4","A5","A6","A7","A8","A9","A10",
"A11","A12","A13","A14","A15","A16","A17","A18","A19","A20",
"A21","A22","A23","A24","A25","A26","A27","A28","A29","A30","A31")
for (i in 15:length(cname)) {
val <- cname[i]
ifelse(sum(!is.na(df2$val))==0,
df2 <- subset(df2, select = -c(val)),
df2)
}
The df2 result and my expected result were posted as images. My expected result is to remove the unnecessary columns that have only NA values.
Could anyone please tell me how to get the value from val, so I can remove the columns that have only NA values? I'm still a beginner with R.
>Solution :
We can use subset without a loop: apply the vectorized colSums to the logical matrix is.na(df2) to get the count of NAs in each column, compare it (!=) with the number of rows (nrow(df2)) to create a logical vector, use that vector to subset the column names, and pass the result to the select argument of subset.
subset(df2, select = names(df2)[colSums(is.na(df2)) != nrow(df2)])
-output
  A1 A2 A4 A5
1  1  1 NA 10
2  2  2 NA 10
3  3  3 NA 10
4  4 NA  3 10
5  5  5  2 10
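To see what each piece of that one-liner returns, here is a small sketch that splits it into steps on the same sample data. The names na_count, keep and result are only illustrative and are not part of the original code.

```r
# Same sample data as in the "data" section of this answer
df2 <- data.frame(A1 = 1:5, A2 = c(1:3, NA, 5), A3 = NA_integer_,
                  A4 = c(NA, NA, NA, 3, 2), A5 = 10)

# Number of NA values in each column
na_count <- colSums(is.na(df2))

# TRUE for columns that have at least one non-NA value
keep <- na_count != nrow(df2)

# Keep only those columns (illustrative name; same columns as the subset() call)
result <- df2[, keep, drop = FALSE]
```

Here drop = FALSE is only a precaution so that the result stays a data frame even if a single column remains.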
Or with the tidyverse: use select with where to keep only the columns that have at least one non-NA element.
library(dplyr)
df2 %>%
select(where(~ any(!is.na(.x))))
-output
  A1 A2 A4 A5
1  1  1 NA 10
2  2  2 NA 10
3  3  3 NA 10
4  4 NA  3 10
5  5  5  2 10
data
df2 <- data.frame(A1 = 1:5, A2 = c(1:3, NA, 5), A3 = NA_integer_,
A4 = c(NA, NA, NA, 3, 2), A5 = 10)
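As for why the loop in the question fails: df2$val looks for a column literally named "val", so it returns NULL and the NA check is always TRUE. Inside subset, -c(val) then tries to negate the character string stored in val, which gives the "invalid argument to unary operator" error. If you do want to keep a loop, a minimal sketch could look like the following. It assumes you want to check every column of the sample data (the question starts at column 15), uses [[ ]] to look up a column by the name stored in val, and replaces ifelse with a plain if, since ifelse is meant for vectorized values and not for control flow.

```r
# Same sample data as above
df2 <- data.frame(A1 = 1:5, A2 = c(1:3, NA, 5), A3 = NA_integer_,
                  A4 = c(NA, NA, NA, 3, 2), A5 = 10)

# Assumption: loop over all column names of the sample data
cname <- names(df2)

for (val in cname) {
  # df2[[val]] uses the string stored in val as the column name
  if (all(is.na(df2[[val]]))) {
    df2[[val]] <- NULL  # drop the column that contains only NA values
  }
}

names(df2)
```

On the sample data this should leave the columns A1, A2, A4 and A5, the same as the two approaches above.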