Error in rowSums(., na.rm = TRUE) : 'x' must be numeric – despite verifying variables are numeric

When I tried to compute row sums across 24 specific columns in my data frame, R returned

Error in rowSums(., na.rm = TRUE) : 'x' must be numeric 

I tried various methods to determine whether the columns of interest were numeric.

x_isnum <- select_if(x2009, is.numeric)
names(x_isnum)
# Check data type of every variable in data frame
str(x2009)

All columns of interest were listed as numeric. Then I even opened the data frame and hovered over each column to verify they were numeric; they were.
I acknowledge that since the df is so large, it’s possible I overlooked something. So I subset the data to learn about just the columns in question.

p = x2009[,c(48,49, 70:91)]
is.numeric(p)

FALSE

Since it returned false, I ran

str(p)

'data.frame':   17090 obs. of  24 variables:
 $ poss_cannabis_female_over_64 : num  0 0 0 0 0 0 0 0 0 0 ...
 $ poss_cannabis_female_under_10: num  0 0 0 0 0 0 0 0 0 0 ...
 $ poss_cannabis_male_over_64   : num  0 0 0 0 0 0 0 0 0 0 ...
 $ poss_cannabis_male_under_10  : num  0 0 0 0 0 0 0 0 0 0 ...
 $ poss_cannabis_tot_10_12      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ poss_cannabis_tot_13_14      : num  0 1 0 0 0 0 1 0 0 0 ...
 $ poss_cannabis_tot_15         : num  0 1 0 3 0 0 0 1 0 0 ...
 $ poss_cannabis_tot_16         : num  1 0 3 2 1 0 2 2 2 1 ...
 $ poss_cannabis_tot_17         : num  1 0 1 3 1 2 0 3 2 1 ...
 $ poss_cannabis_tot_18         : num  0 0 1 2 2 1 1 1 0 0 ...
 $ poss_cannabis_tot_19         : num  0 2 0 4 1 0 3 0 0 0 ...
 $ poss_cannabis_tot_20         : num  0 1 0 2 0 0 2 1 1 3 ...
 $ poss_cannabis_tot_21         : num  0 0 0 1 1 0 0 0 1 0 ...
 $ poss_cannabis_tot_22         : num  0 2 0 1 0 0 2 0 1 0 ...
 $ poss_cannabis_tot_23         : num  1 0 0 3 2 0 1 1 0 0 ...
 $ poss_cannabis_tot_24         : num  1 0 0 0 1 0 0 0 0 0 ...
 $ poss_cannabis_tot_25_29      : num  0 0 2 3 2 1 0 0 1 2 ...
 $ poss_cannabis_tot_30_34      : num  0 0 0 1 0 1 0 1 0 0 ...
 $ poss_cannabis_tot_35_39      : num  1 0 0 1 1 0 0 1 0 0 ...
 $ poss_cannabis_tot_40_44      : num  0 1 0 0 0 0 0 1 0 0 ...
 $ poss_cannabis_tot_45_49      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ poss_cannabis_tot_50_54      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ poss_cannabis_tot_55_59      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ poss_cannabis_tot_60_64      : num  0 0 0 0 1 0 0 0 0 0 ...

I also ran

sapply(p, is.numeric)

poss_cannabis_female_over_64 
                         TRUE 
poss_cannabis_female_under_10 
                         TRUE 
   poss_cannabis_male_over_64 
                         TRUE 
  poss_cannabis_male_under_10 
                         TRUE 
      poss_cannabis_tot_10_12 
                         TRUE 
      poss_cannabis_tot_13_14 
                         TRUE 
         poss_cannabis_tot_15 
                         TRUE 
         poss_cannabis_tot_16 
                         TRUE 
         poss_cannabis_tot_17 
                         TRUE 
         poss_cannabis_tot_18 
                         TRUE 
         poss_cannabis_tot_19 
                         TRUE 
         poss_cannabis_tot_20 
                         TRUE 
         poss_cannabis_tot_21 
                         TRUE 
         poss_cannabis_tot_22 
                         TRUE 
         poss_cannabis_tot_23 
                         TRUE 
         poss_cannabis_tot_24 
                         TRUE 
      poss_cannabis_tot_25_29 
                         TRUE 
      poss_cannabis_tot_30_34 
                         TRUE 
      poss_cannabis_tot_35_39 
                         TRUE 
      poss_cannabis_tot_40_44 
                         TRUE 
      poss_cannabis_tot_45_49 
                         TRUE 
      poss_cannabis_tot_50_54 
                         TRUE 
      poss_cannabis_tot_55_59 
                         TRUE 
      poss_cannabis_tot_60_64 
                         TRUE 

Finally, I ran sapply(p, class), which again displayed numeric for each variable. I again hovered over each column in the subset data frame, and again, each column was shown as numeric.

There must be something I am missing if R is telling me it's not numeric. I doubt the code is the problem because I ran it on a smaller, made-up df with no issues, but just in case, here is what I ran to sum the rows of specific columns.

x2009 = x2009 %>%
  mutate(poss_cannabis_juv_tot = select(., c(49,71:76))) %>% 
  rowSums(na.rm = TRUE) %>% 
  mutate(poss_cannabis_adult_tot = select(., c(48,70,77:91))) %>%
  rowSums(na.rm = TRUE) %>% 
  relocate(poss_cannabis_juv_tot, .after = poss_cannabis_male_17) %>% 
  relocate(poss_cannabis_adult_tot, .after = poss_cannabis_male_over_64) 

What is going on??

>Solution :

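The issue is in creating a column from select: that column is itself a data.frame, and the rowSums call that follows it in the pipe receives the whole data frame rather than the selected columns. Instead, select the columns within across and call rowSums on the result inside mutate.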

library(dplyr)
x2009 %>%
    mutate(poss_cannabis_juv_tot = rowSums(across(where(is.numeric)), 
        na.rm = TRUE))

Or, if the columns should be selected by index:

x2009 %>%
    mutate(poss_cannabis_juv_tot = rowSums(across(c(49,71:76)), na.rm = TRUE),
     poss_cannabis_adult_tot = rowSums(across(c(48,70,77:91)), na.rm = TRUE)) %>%
    relocate(poss_cannabis_juv_tot, .after = poss_cannabis_male_17) %>% 
    relocate(poss_cannabis_adult_tot, .after = poss_cannabis_male_over_64) 
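
For comparison, the same row sums can be computed without dplyr. This is only a minimal sketch on a made-up data frame: df, juv_cols, and adult_cols are hypothetical stand-ins for x2009 and its column indexes. The point it illustrates is that rowSums should receive only the numeric columns being summed.

```r
# Hypothetical stand-in for x2009: one non-numeric column plus numeric counts.
df <- data.frame(
  agency = c("a", "b", "c"),
  age_16 = c(1, 0, 3),
  age_17 = c(1, NA, 1),
  age_18 = c(0, 0, 1),
  age_19 = c(0, 2, 0)
)

juv_cols   <- 2:3  # hypothetical indexes for the juvenile total
adult_cols <- 4:5  # hypothetical indexes for the adult total

# Subset to the numeric columns first, then sum across each row.
df$juv_tot   <- rowSums(df[, juv_cols], na.rm = TRUE)
df$adult_tot <- rowSums(df[, adult_cols], na.rm = TRUE)

df
```

Because the subsetting happens before rowSums is called, the character column agency never reaches it.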

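In the OP's code, the data frame returned by mutate is piped straight into rowSums, so rowSums receives every column, not just the selected ones. That includes the other non-numeric columns as well as the new column created with select, which is itself a data.frame: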

head(iris) %>%
    mutate(new = select(., 2:4)) %>%
    str
'data.frame':   6 obs. of  6 variables:
 $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4
 $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9
 $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7
 $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4
 $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1
 $ new         :'data.frame':   6 obs. of  3 variables:
  ..$ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9
  ..$ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7
  ..$ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4

head(iris) %>% 
   mutate(new = select(., 2:4)) %>%
  rowSums(na.rm = TRUE)
Error in rowSums(., na.rm = TRUE) : 'x' must be numeric

Instead, with across

head(iris) %>%
    mutate(new = rowSums(across(2:4), na.rm = TRUE))
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species new
1          5.1         3.5          1.4         0.2  setosa 5.1
2          4.9         3.0          1.4         0.2  setosa 4.6
3          4.7         3.2          1.3         0.2  setosa 4.7
4          4.6         3.1          1.5         0.2  setosa 4.8
5          5.0         3.6          1.4         0.2  setosa 5.2
6          5.4         3.9          1.7         0.4  setosa 6.0
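
As a side note on the diagnostics in the question: is.numeric(p) returns FALSE for any data frame, because a data frame is a list of columns rather than a numeric vector, so that check says nothing about the columns themselves. A small sketch with the built-in iris data (num_cols and res are names introduced here for illustration) shows the difference between checking the container and checking each column:

```r
num_cols <- head(iris)[, 2:4]      # three numeric columns

is.numeric(num_cols)               # FALSE: the data frame itself is a list
all(sapply(num_cols, is.numeric))  # TRUE: every column is numeric

# rowSums works when every column it receives is numeric...
rowSums(num_cols, na.rm = TRUE)

# ...but fails once a non-numeric column (here the factor Species) is included.
res <- tryCatch(
  rowSums(head(iris), na.rm = TRUE),
  error = function(e) conditionMessage(e)
)
res
```

So the FALSE from is.numeric(p) was expected and unrelated to the error; what matters is which columns actually reach rowSums.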