Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

R: Change value to NA in one column if in another column the datapoint is also NA

I have the following data:

structure(list(Name = c("A", "A", "A", "A", "A", "A", "B", "B", 
"B", "B", "B", "B", "C", "C", "C", "C", "C", "C"), Date = c("01.09.2018", 
"02.09.2018", "03.09.2018", "05.11.2021", "06.11.2021", "07.11.2021", 
"01.09.2018", "02.09.2018", "03.09.2018", "05.11.2021", "06.11.2021", 
"07.11.2021", "01.09.2018", "02.09.2018", "03.09.2018", "05.11.2021", 
"06.11.2021", "07.11.2021"), Category = c(1L, 1L, NA, NA, 2L, 
2L, 3L, 3L, 3L, NA, NA, 4L, 4L, 2L, NA, 2L, 2L, NA), Size = c(34L, 
23L, 12L, 53L, 23L, 53L, 23L, 54L, 65L, 75L, 67L, 45L, 45L, 23L, 
23L, 12L, 12L, NA), Sum_Size = c(34L, 23L, 35L, 128L, 35L, 53L, 
23L, 54L, 65L, 128L, 67L, 45L, 45L, 23L, 35L, 12L, 35L, NA)), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -18L), groups = structure(list(
    Category = c(1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 
    NA, NA, NA, NA), Date = c("01.09.2018", "02.09.2018", "02.09.2018", 
    "05.11.2021", "06.11.2021", "07.11.2021", "01.09.2018", "02.09.2018", 
    "03.09.2018", "01.09.2018", "07.11.2021", "03.09.2018", "05.11.2021", 
    "06.11.2021", "07.11.2021"), .rows = structure(list(1L, 2L, 
        14L, 16L, c(5L, 17L), 6L, 7L, 8L, 9L, 13L, 12L, c(3L, 
        15L), c(4L, 10L), 11L, 18L), ptype = integer(0), class = c("vctrs_list_of", 
    "vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -15L), .drop = TRUE))

How can I change the data in column "Sum_Size" to NA if in the column "Category" there is also NA? So for instance, in row 3, "Sum_Size" should be NA.

I really appreciate any help!

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

Here is a possible solution using an ifelse statement:

library(dplyr)

df %>% 
  mutate(Sum_Size= ifelse(is.na(Category), NA_real_, Sum_Size))
   Name  Date       Category  Size Sum_Size
   <chr> <chr>         <int> <int>    <dbl>
 1 A     01.09.2018        1    34       34
 2 A     02.09.2018        1    23       23
 3 A     03.09.2018       NA    12       NA
 4 A     05.11.2021       NA    53       NA
 5 A     06.11.2021        2    23       35
 6 A     07.11.2021        2    53       53
 7 B     01.09.2018        3    23       23
 8 B     02.09.2018        3    54       54
 9 B     03.09.2018        3    65       65
10 B     05.11.2021       NA    75       NA
11 B     06.11.2021       NA    67       NA
12 B     07.11.2021        4    45       45
13 C     01.09.2018        4    45       45
14 C     02.09.2018        2    23       23
15 C     03.09.2018       NA    23       NA
16 C     05.11.2021        2    12       12
17 C     06.11.2021        2    12       35
18 C     07.11.2021       NA    NA       NA
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading