I have a df like the one below.
I would like to do the following two steps:
- If the first letter of the column name matches the value in Grade and the current value is NA, replace it with 0.
- Update each value in A1:B2 to show the current value followed by its percentage of the column total: current value (current value * 100 / colsums()).
How can I make this happen?
df <- structure(list(Grade = c("A", "A", "B", "B"), Pass = c("Y", "N",
"Y", "N"), A1 = c(7, 8, NA, NA), A2 = c(4, 5, NA, NA), A3 = c(9,
NA, NA, NA), B1 = c(NA, NA, 8, NA), B2 = c(NA, NA, 3, 4)), row.names = c(NA,
-4L), class = c("tbl_df", "tbl", "data.frame"))
>Solution :
With the dplyr package (plus stringr for str_extract()), we can use two across() calls: the first sets the relevant missing values to zero, and the second calculates the percentage and displays it next to the value.
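library(dplyr)
library(stringr)  # provides str_extract()
df %>%
  mutate(
    # Step 1: NA -> 0 where the first letter of the column name matches Grade
    across(A1:B2, ~ifelse(str_extract(cur_column(), "^.") == Grade & is.na(.x), 0, .x)),
    # Step 2: format as "value (percent%)"; remaining NAs stay NA
    across(A1:B2, ~ifelse(is.na(.x), NA, paste0(.x, " (", round(.x * 100/(sum(.x, na.rm = TRUE)), digits = 2), "%)"))))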
# A tibble: 4 × 7
  Grade Pass  A1         A2         A3       B1       B2
  <chr> <chr> <chr>      <chr>      <chr>    <chr>    <chr>
1 A     Y     7 (46.67%) 4 (44.44%) 9 (100%) NA       NA
2 A     N     8 (53.33%) 5 (55.56%) 0 (0%)   NA       NA
3 B     Y     NA         NA         NA       8 (100%) 3 (42.86%)
4 B     N     NA         NA         NA       0 (0%)   4 (57.14%)
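
If you would rather not load stringr, the same two steps can probably be written with base R's substr() to pick out the first letter of the column name. This is only a sketch under that assumption: the result name `res` is made up for illustration, and the data is rebuilt with tibble() so the snippet runs on its own.

```r
library(dplyr)

# Same data as in the question
df <- tibble(
  Grade = c("A", "A", "B", "B"),
  Pass  = c("Y", "N", "Y", "N"),
  A1 = c(7, 8, NA, NA),
  A2 = c(4, 5, NA, NA),
  A3 = c(9, NA, NA, NA),
  B1 = c(NA, NA, 8, NA),
  B2 = c(NA, NA, 3, 4)
)

res <- df %>%
  mutate(
    # Step 1: NA -> 0 where the column's first letter equals Grade
    across(A1:B2, ~ ifelse(substr(cur_column(), 1, 1) == Grade & is.na(.x), 0, .x)),
    # Step 2: "value (percent%)"; cells that are still NA stay NA
    across(A1:B2, ~ ifelse(
      is.na(.x),
      NA_character_,
      paste0(.x, " (", round(.x * 100 / sum(.x, na.rm = TRUE), 2), "%)")
    ))
  )

res
```

Using NA_character_ instead of a bare NA makes it explicit that the columns end up as character. If you need to keep computing on the percentages, it may be simpler to keep them numeric and only paste the "%" on for display.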
