R: How to perform calculation by rows using data stored in defined columns?

The example data is shown below:

id a b n1 n2
1 1 1 10 20
2 2 2 20 40
3 0 0 10 20
4 1 0 20 40
5 0 1 10 20

I need to calculate two scores, k1 and k2, in R.

Assume C is a constant.

k1=(a/b)/(n1/n2+C)

k2=(a/b)/(n1+n2+C)

Because row 3 has zeros in both arms (a = 0 and b = 0), a/b is 0/0, so k1 and k2 come out as NaN, which is.na() reports as NA. Whenever k1 or k2 is NA, an alternative formula is used instead:

k1=n1/(n1+n2)

k2=n2/(n1+n2)
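
A quick check of which rows trigger the fallback, as a minimal sketch in base R. The short vectors a and b below are illustrative values copied from rows 1, 3 and 4 of the example data, not new data:

```r
# Rows 1, 3 and 4 of the example data (a and b only)
a <- c(1, 0, 1)
b <- c(1, 0, 0)

ratio <- a / b            # 1/1 = 1, 0/0 = NaN, 1/0 = Inf

na_flag  <- is.na(ratio)        # TRUE only for the 0/0 row
inf_flag <- is.infinite(ratio)  # TRUE only for the 1/0 row

print(ratio)
print(na_flag)
print(inf_flag)
```

So a row with a = 0 and b = 0 is caught by is.na(), while a row with only b = 0 produces Inf and keeps the original formula.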

So far I have used a for loop that reads each cell individually, but this is very slow on a large dataset. apply() looks like a faster option, but I have not managed to write a working function for apply(data, 1, function), and I don't know what input that function receives. Is there an elegant, faster way to do this without a for loop? Thank you so much.

My code is pasted below:

k1 = c()
k2 = c()
C = 0.25

for (i in 1:nrow(data)){
  k1[i] = (data[i,"a"]/data[i,"b"])/(data[i,"n1"]/data[i,"n2"]+C)
  k2[i] = (data[i,"a"]/data[i,"b"])/(data[i,"n1"]+data[i,"n2"]+C)
  
  if (is.na(k1[i])){
    k1[i] = data[i,"n1"]/(data[i,"n1"]+data[i,"n2"])
  }
  
  if (is.na(k2[i])){
    k2[i] = data[i,"n2"]/(data[i,"n1"]+data[i,"n2"])
  }
}

Solution:

You can use the mutate() function from {dplyr}:

library(dplyr)   # provides mutate() and the %>% pipe

C <- 0.25        # same constant as in the question

# Calculate k1 and k2
data <- data %>% 
    # Perform calculation
    mutate(k1 = (a/b)/(n1/n2+C),                     # k1
           k2 = (a/b)/(n1+n2+C),                     # k2
           k1 = ifelse(is.na(k1), n1/(n1+n2), k1),   # Other formula for k1 if k1 is NA
           k2 = ifelse(is.na(k2), n2/(n1+n2), k2))   # Other formula for k2 if k2 is NA

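This returns the same values as your loop, but because mutate() works on whole columns at once it is much faster on large data. Note that row 4 gives Inf rather than NA (1/0 is Inf), so the alternative formula is not applied there: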

# A tibble: 5 × 6
      a     b    n1    n2      k1       k2
  <dbl> <dbl> <dbl> <dbl>   <dbl>    <dbl>
1     1     1    10    20   1.33    0.0331
2     2     2    20    40   1.33    0.0166
3     0     0    10    20   0.333   0.667 
4     1     0    20    40 Inf     Inf     
5     0     1    10    20   0       0    
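
If you would rather not depend on {dplyr}, the same idea can be sketched in base R. This is only an illustration under stated assumptions: the data frame is rebuilt from the table in the question, C is taken as 0.25 from the question's code, and score_row is a hypothetical helper name used to show what apply(data, 1, ...) passes to its function.

```r
# Example data rebuilt from the question
data <- data.frame(
  id = 1:5,
  a  = c(1, 2, 0, 1, 0),
  b  = c(1, 2, 0, 0, 1),
  n1 = c(10, 20, 10, 20, 10),
  n2 = c(20, 40, 20, 40, 20)
)
C <- 0.25  # constant from the question's code

# Vectorized: each expression is evaluated on whole columns at once
k1 <- with(data, (a / b) / (n1 / n2 + C))
k2 <- with(data, (a / b) / (n1 + n2 + C))

# Replace NA/NaN results with the alternative formulas
na1 <- is.na(k1)
na2 <- is.na(k2)
k1[na1] <- with(data, n1 / (n1 + n2))[na1]
k2[na2] <- with(data, n2 / (n1 + n2))[na2]

data$k1 <- k1
data$k2 <- k2

# Row-wise apply(), for comparison only.
# With MARGIN = 1 the function receives one row as a named numeric vector.
score_row <- function(r) {
  k1 <- (r[["a"]] / r[["b"]]) / (r[["n1"]] / r[["n2"]] + C)
  k2 <- (r[["a"]] / r[["b"]]) / (r[["n1"]] + r[["n2"]] + C)
  if (is.na(k1)) k1 <- r[["n1"]] / (r[["n1"]] + r[["n2"]])
  if (is.na(k2)) k2 <- r[["n2"]] / (r[["n1"]] + r[["n2"]])
  c(k1 = k1, k2 = k2)
}
# apply() returns one column per row of the input, so transpose the result
by_row <- t(apply(data[, c("a", "b", "n1", "n2")], 1, score_row))
```

The apply() version still calls score_row once per row, so it is usually not much faster than the original loop. The vectorized version is the one that avoids per-row work.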