Convert numerical variable to categorical variable

I have a list of columns that contain 0 and 1 as values. Right now they are treated as numerical variables but I want them to be treated as categorical.

I tried

as.factor(df[,"diseasesA":"diseaseM"], exclude = NULL)

but received the following error message:

Error in as.factor(df[,"diseasesA":"diseaseM"],  : 
  unused argument (exclude = NULL)

Omitting "exclude = NULL" gave me the following error message:

Error in "diseasesA":"diseaseM" : NA/NaN argument
In addition: Warning messages:
1: In eval(jsub, setattr(as.list(seq_along(x)), "names", names_x),  :
  NAs introduced by coercion
2: In eval(jsub, setattr(as.list(seq_along(x)), "names", names_x),  :
  NAs introduced by coercion

Solution:

factor() or as.factor() works on a single column, not a data frame. So you need to apply that function to the columns you want to convert. Here are a few equivalent methods:

cols = paste0("disease", LETTERS[1:13]) # assuming your naming pattern is consistent

## base R with lapply
df[cols] = lapply(df[cols], factor)

## base R with for loop
for (col in cols) {
  df[[col]] = factor(df[[col]])  # index by column name, not by position
}

## dplyr
library(dplyr)
df = df %>%
  mutate(across(diseaseA:diseaseM, factor))

Note that your question is inconsistent in its column naming pattern: diseasesA vs. diseaseM. In the base R methods I assumed that's a typo and that you want to convert columns diseaseA, diseaseB, diseaseC, …, diseaseM. In dplyr, across() accepts X:Z to operate on every column from X through Z, but there are many other ways to select which columns to work on, e.g., starts_with("disease").
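To check that the conversion did what you expect, here is a small self-contained sketch in base R. The toy data frame, its values, and the id column are made up for illustration. It selects the columns by name pattern with grep() rather than typing them out, which is a base R counterpart to starts_with().

```r
# Toy data frame; column names and values are hypothetical.
df = data.frame(
  id       = 1:4,
  diseaseA = c(0, 1, 1, 0),
  diseaseB = c(1, 1, 0, 0),
  diseaseC = c(0, 0, 0, 1)
)

# Select every column whose name starts with "disease".
cols = grep("^disease", names(df), value = TRUE)

# Convert each selected column to a factor, one column at a time.
df[cols] = lapply(df[cols], factor)

# Inspect the result: which columns are factors, and what levels they have.
sapply(df, is.factor)
levels(df$diseaseA)
```

The id column is left numeric because it does not match the pattern. If your real data contains NA values, they stay NA after factor() and are not turned into a level by default.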
