Home Removing specific values in R from entire dataframe

Questions

Removing specific values in R from entire dataframe

November 15, 2021

I need to remove specific text/values from all cells it is present within a dataset. This should be the equivalent of the ‘Find and Replace All’ tool within Excel. This cannot be simply removing/omitting all ‘NA’ values as several entries are NA and need to remain such, thus meaning it needs to be removing "NA |". This also cannot be limited to just the columns listed below and must be applied to the entire data frame, as there are 20+ additional columns to the right of V6.

Input:

V1            V2           V3           V4          V5         V6
NA|TR1000     NA|chr1      NA|1000      1200        A          + 
TR1000        chr1         1100         NA|1200     B          + 
NA|TR1000     NA|chr1      NA|1000      NA          C          +  
TR2000        NA|chr2      2000         NA          D          +   
TR2000        NA|chr2      2100         2500        E          + 
TR3000        NA           3000         3500        F          + 
NA|TR3000     chr3         NA|3000      3500        F          +

I Need:

V1         V2         V3         V4         V5        V6
TR1000     chr1       1000       1200       A          + 
TR1000     chr1       1100       1200       B          + 
TR1000     chr1       1000       NA         C          +  
TR2000     chr2       2000       NA         D          +   
TR2000     chr2       2100       2500       E          + 
TR3000     NA         3000       3500       F          + 
TR3000     chr3       3000       3500       F          +

>Solution :

This should do it:

df[] = lapply(df, sub, pattern = "NA|", replacement = "", fixed = TRUE)

Demo:

df = read.table(text = 'V1            V2           V3           V4          V5         V6
NA|TR1000     NA|chr1      NA|1000      1200        A          + 
TR1000        chr1         1100         NA|1200     B          + 
NA|TR1000     NA|chr1      NA|1000      NA          C          +  
TR2000        NA|chr2      2000         NA          D          +   
TR2000        NA|chr2      2100         2500        E          + 
TR3000        NA           3000         3500        F          + 
NA|TR3000     chr3         NA|3000      3500        F          + ', header = T)

df[] = lapply(df, sub, pattern = "NA|", replacement = "", fixed = TRUE)

df
  #       V1   V2   V3   V4 V5 V6
  # 1 TR1000 chr1 1000 1200  A  +
  # 2 TR1000 chr1 1100 1200  B  +
  # 3 TR1000 chr1 1000 <NA>  C  +
  # 4 TR2000 chr2 2000 <NA>  D  +
  # 5 TR2000 chr2 2100 2500  E  +
  # 6 TR3000 <NA> 3000 3500  F  +
  # 7 TR3000 chr3 3000 3500  F  +

If you like using dplyr, here’s a method there:

df %>% 
  mutate(across(everything(), sub, pattern = "NA|", replacement = "", fixed = TRUE))

dataframe

byMR

Published November 15, 2021

Add a comment

scipy.minimize- missing 1 required positional argument

byMR

November 15, 2021

Questions

Typescript: How to define a function that references an object using generics

byMR

November 15, 2021

Questions

How do I convert a BufferedReader-Input (String) to Integer and save it in an Integer List in java?

byMR

November 15, 2021

Questions

How to implode and foreach get

byMR

November 15, 2021

Questions

Transform column headers to values and respective counts in Python

byMR

November 15, 2021

Questions

Manipulating Private variables of parent class from child class

byMR

November 15, 2021

Removing specific values in R from entire dataframe

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Like this:

Leave a ReplyCancel reply

Read more

scipy.minimize- missing 1 required positional argument

Typescript: How to define a function that references an object using generics

How do I convert a BufferedReader-Input (String) to Integer and save it in an Integer List in java?

How to implode and foreach get

Transform column headers to values and respective counts in Python

Manipulating Private variables of parent class from child class

Keep Up to Date with the Most Important News

Removing specific values in R from entire dataframe

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Share this:

Like this:

Leave a ReplyCancel reply

Keep Up to Date with the Most Important News

Read more

scipy.minimize- missing 1 required positional argument

Typescript: How to define a function that references an object using generics

How do I convert a BufferedReader-Input (String) to Integer and save it in an Integer List in java?

How to implode and foreach get

Transform column headers to values and respective counts in Python

Manipulating Private variables of parent class from child class

Discover more from Dev solutions