R – Problem Converting Multiple Columns to Numeric

I have several dataframes whose columns contain numeric values but are stored as type chr. They look like this:

   IDs     FCC_Faces FCNC_Faces          FCC_Objects         FCNC_Objects  FCC_Scenes FCNC_Scenes FCC_Words  FCNC_Words    Age Group
   <chr>       <dbl> <chr>               <chr>               <chr>         <chr>      <chr>       <chr>      <chr>       <dbl> <dbl>
 1 CON_L01      1    0.58330000000000004 0.83330000000000004 0.5833000000~ 0.6        0.6         0.9165999~ 0.66659999~    63     0
 2 CON_L03      0.92 0.83                1                   0.92          0.9        0.9         0.83       1              37     0
 3 CON_L04      0.75 0.83                0.92                0.92          0.9        0.8         0.67       0.92           48     0
 4 CON_L05      0.75 1                   0.92                1             0.9        1           1          1              49     0
 5 CON_L07      0.58 0.17                0.57999999999999996 0.75          0.8        0.7         0.83       0.67           69     0
 6 CON_L10      0.92 0.67                0.83                0.5799999999~ 0.8        0.9         0.33       0.57999999~    58     0
 7 CON_L14      0.83 0.83                0.83                0.75          0.8        0.9         0.92       0.67           62     0
 8 CON_L16      1    0.92                NA                  NA            0.9        0.9         1          1              40     0
 9 CON_L17      0.83 0.57999999999999996 0.75                0.83          0.9        0.6         1          0.83           48     0
10 CON_L18      0.75 0.75                0.75                0.75          0.9        0.5         0.75       0.67           55     0
# ... with 70 more rows

I wanted to write a function that takes the dataframe and the column names and converts all of those columns to numeric. My first attempt was just using lapply():

cols_to_numeric <- function(dframe, columns) {
    dframe[ , columns] <- lapply(dframe[ , columns], numeric)
    return(dframe)
}

However, I get this error:

Error in FUN(X[[i]], ...) : invalid 'length' argument 
3. FUN(X[[i]], ...)
2. lapply(dframe[, columns], numeric)

So, looking for another solution, this thread suggested a nice approach using Tidyverse:

cols_to_numeric <- function(dframe, columns) {
    require(magrittr)
    require(tidyverse)
    
    dframe %<>% mutate_at(columns, numeric)
    return(dframe)
}

But I get a very similar error:

 Error: Problem with `mutate()` input `FCC_Faces`.
x invalid 'length' argument
i Input `FCC_Faces` is `(function (length = 0L) ...`.

But the length of this variable is very clearly not zero:

> length(CON_data$FCC_Faces)
[1] 80

Further, if I just do the coercion for each column manually, it works without complaints:

> CON_data$FCC_Faces <- as.numeric(CON_data$FCC_Faces)

> str(CON_data)
'data.frame':   80 obs. of  11 variables:
 $ IDs         : chr  "CON_L01" "CON_L03" "CON_L04" "CON_L05" ...
 $ FCC_Faces   : num  1 0.92 0.75 0.75 0.58 0.92 0.83 1 0.83 0.75 ...
 $ FCNC_Faces  : chr  "0.58330000000000004" "0.83" "0.83" "1" ...
 $ FCC_Objects : chr  "0.83330000000000004" "1" "0.92" "0.92" ...
 $ FCNC_Objects: chr  "0.58330000000000004" "0.92" "0.92" "1" ...
 $ FCC_Scenes  : chr  "0.6" "0.9" "0.9" "0.9" ...
 $ FCNC_Scenes : chr  "0.6" "0.9" "0.8" "1" ...
 $ FCC_Words   : chr  "0.91659999999999997" "0.83" "0.67" "1" ...
 $ FCNC_Words  : chr  "0.66659999999999997" "1" "0.92" "1" ...
 $ Age         : num  63 37 48 49 69 58 62 40 48 55 ...
 $ Group       : num  0 0 0 0 0 0 0 0 0 0 ...
> 

What on earth am I doing wrong here? I don’t want to have to manually coerce every single column for each dataframe every time I make any changes.

Solution:

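A possible solution is to use as.numeric instead of numeric. numeric() creates a new vector of a given length (its only argument is length), so handing it a whole column fails with "invalid 'length' argument". as.numeric() is the function that coerces existing values: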

library(dplyr)

df <- tibble(
  x = c("0.2", "0.4", "0.9"), 
  y = c("1.2", "2.4", "3.9"))

df %>% 
  mutate(across(x:y, as.numeric))
#> # A tibble: 3 × 2
#>       x     y
#>   <dbl> <dbl>
#> 1   0.2   1.2
#> 2   0.4   2.4
#> 3   0.9   3.9
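
The same idea can be folded back into the function from the question, so that the column names are passed as a character vector. The following is a minimal sketch, not a definitive implementation: cols_to_numeric_dplyr and the small CON_small data frame are hypothetical names made up for illustration, and the dplyr variant assumes dplyr 1.0.0 or later, where across() and all_of() are available.

```r
library(dplyr)

# Base R variant: coerce the named columns in place.
# Works for both plain data frames and tibbles.
cols_to_numeric <- function(dframe, columns) {
  dframe[columns] <- lapply(dframe[columns], as.numeric)
  dframe
}

# dplyr variant (hypothetical name): all_of() selects columns
# from a character vector of names.
cols_to_numeric_dplyr <- function(dframe, columns) {
  dframe %>% mutate(across(all_of(columns), as.numeric))
}

# Hypothetical miniature of the data shown in the question.
CON_small <- data.frame(
  IDs        = c("CON_L01", "CON_L03"),
  FCC_Faces  = c("1", "0.92"),
  FCNC_Faces = c("0.58330000000000004", "0.83"),
  stringsAsFactors = FALSE
)

cols <- c("FCC_Faces", "FCNC_Faces")
converted       <- cols_to_numeric(CON_small, cols)
converted_dplyr <- cols_to_numeric_dplyr(CON_small, cols)

str(converted)
```

Either variant leaves columns that are not named in columns (here IDs) untouched, so the same call can be reused on each dataframe with its own list of columns.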