Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Change Character Value Into Negative Number Using R

I have the following dataframe. All the columns are character columns, but except the 1st column, rest of the columns contain numeric values:

df <- data.frame(col1=c('', 'assets', 'loss', 'liability'),
                 col2=c("", "5,000", "(1,400)", "300"),
                 col3=c("", "4,500", "(1,100)", "500"))

df

col1       col2      col3
<chr>      <chr>    <chr>
        
assets     5,000    4,500
loss      (1,400)  (1,100)
liability   300      500 
                                                                  ​               ​        ​

The task is — if a value is within () [for example, loss value], I need to make that value as a negative number.

I am trying the following script:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

clean_columns <- function(dataframe){

  dataframe_new <- copy(dataframe)

  for (idx in 2:length(names(dataframe))) {
      
      # Check if text contains -- ( )
      if (grepl(is.na(dataframe_new[, idx]), "(", fixed = TRUE)) {
        
      # Remove characters -- () , -- and multiply -1
      dataframe_new[, idx] <- gsub("[(,)]", "", dataframe_new[, idx]) %>%
        as.numeric(dataframe_new[, idx])*(-1)
      
      } else 
      # Remove characters -- ,
      dataframe_new[, idx] <- gsub("[,]", "", dataframe_new[, idx]) %>%
        as.numeric(dataframe_new[, idx])
} 
  return(dataframe_new)
}

The function removes the characters correctly, but make the loss value NA.

clean_columns(df)

col1       col2     col3
<chr>      <dbl>    <dbl>
            NA       NA
assets     5000     4500
loss        NA       NA
liability  300      500

Desired Output

col1       col2     col3
<chr>      <dbl>    <dbl>
            NA       NA
assets     5000     4500
loss      -1400    -1100
liability   300      500

Any suggestions would be appreciated. Thanks!

>Solution :

We loop across the columns having at least some digits, remove the ), replace the ( with - and parse it with parse_number

library(dplyr)
library(stringr)
 df %>% 
  mutate(across(-1, 
   ~ readr::parse_number(str_replace(str_remove(.x, "\\)"), fixed("("), "-") )))

-output

       col1  col2  col3
1              NA    NA
2    assets  5000  4500
3      loss -1400 -1100
4 liability   300   500

Or instead of removing the (, we can detect the ( to multiply by -1

df %>%
   mutate(across(-1, ~ c(1, -1)[1 + str_detect(.x, fixed("("))] * 
         readr::parse_number(.x) ))
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading