Add value Y to Column B if Column A is within range(X)

I have the following data frame:

df <- data.frame(id=1:56,
                 x_1 = runif(56), x_2 = runif(56), x_3 = runif(56),    
                 x_4 = runif(56))

I am trying to add a value (let’s say 0.5) to a new column of this df if id is equal to any of the following numbers: 1:27, 32, 44:50, 54, 55, 56, and then add another value (let’s say 0.4) to all of the remaining rows that have an id value not included in the aforementioned range. Then I will multiply each other cell in the row by these new values.

I know I can do this with an ifelse or mutate statement, but that would require typing out each individual number that id could be equal to, which I’m hoping is unnecessary.

Any help would be appreciated!

Solution:

A suggestion using dplyr:

library(dplyr)
IDs <- c(1:27, 32, 44:50, 54, 55, 56)
set.seed(42)
df <- data.frame(id=1:56,
                 x_1 = runif(56), x_2 = runif(56), x_3 = runif(56),
                 x_4 = runif(56))

We can first assign multi to be one of 0.4 or 0.5 based on the id:

tibble(df) %>%
  mutate(multi = if_else(id %in% IDs, 0.5, 0.4)) %>%
  tail()
# # A tibble: 6 x 6
#      id    x_1     x_2   x_3   x_4 multi
#   <int>  <dbl>   <dbl> <dbl> <dbl> <dbl>
# 1    51 0.333  0.740   0.602 0.547   0.4
# 2    52 0.347  0.733   0.197 0.893   0.4
# 3    53 0.398  0.536   0.535 0.490   0.4
# 4    54 0.785  0.00227 0.180 0.172   0.5
# 5    55 0.0389 0.609   0.452 0.543   0.5
# 6    56 0.749  0.837   0.317 0.961   0.5

From this, we can easily multiply a subset of columns by that value using dplyr::across.

tibble(df) %>%
  mutate(multi = if_else(id %in% IDs, 0.5, 0.4)) %>%
  mutate(across(x_1:x_4, ~ multi * .)) %>%
  tail()
# # A tibble: 6 x 6
#      id    x_1     x_2    x_3    x_4 multi
#   <int>  <dbl>   <dbl>  <dbl>  <dbl> <dbl>
# 1    51 0.133  0.296   0.241  0.219    0.4
# 2    52 0.139  0.293   0.0788 0.357    0.4
# 3    53 0.159  0.214   0.214  0.196    0.4
# 4    54 0.392  0.00114 0.0898 0.0858   0.5
# 5    55 0.0195 0.304   0.226  0.272    0.5
# 6    56 0.374  0.418   0.159  0.481    0.5

The two steps could be combined into a single mutate. If you no longer need multi afterwards, you could even skip the helper column and do the multiplication in one call.
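As a minimal sketch of that last idea (not part of the original answer), the multiplier can be computed inside the across() call so that no multi column is ever created. The name scaled is only an illustrative choice, and the data frame is rebuilt here so the snippet runs on its own:

```r
library(dplyr)

# Same ids and example data as above
IDs <- c(1:27, 32, 44:50, 54, 55, 56)
set.seed(42)
df <- data.frame(id = 1:56,
                 x_1 = runif(56), x_2 = runif(56), x_3 = runif(56),
                 x_4 = runif(56))

# Scale x_1..x_4 by 0.5 for ids in IDs and by 0.4 otherwise,
# without storing the multiplier as a column
scaled <- df %>%
  mutate(across(x_1:x_4, ~ .x * if_else(id %in% IDs, 0.5, 0.4)))
```

This keeps the result to the original five columns. The trade-off is that the multiplier used for each row is no longer visible in the output, so the two-step version may be easier to check by eye.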
