Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Using an if else loop to populate a dataframe in R

I am working with a dataset of about 2000 rows and 22 columns.
I need to assign a score of 0-4 into a new column, based on the contents of two others. Here’s the example:

SAMP_SMPT_USER_REFERENCE MEAS_ANAL_METH_CODE oP.Res TP.Res OP_METH TP_METH
0031 21 0.02 0.04
0032 24 0.12 0.003

(with apologies for how the database generated that heading)

I am trying to build a loop that looks at MEAS_ANAL_METH_CODE and oP.Res. First it will see if MEAS_ANAL_METH_CODE is 21 or 24, then based on that answer it will look at the value in oP.Res and add a category (0,1,2,3,4) to the OP_METH column. Then it will so it again for TP.Res, looking at MEAS_ANAL_METH_CODE, TP.Res and assigning a value to TP_Meth.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

I’m new at loops!

This is what I have tried:


for(oP.Res in Merged){
  
  if(Merged$MEAS_ANAL_METH_CODE == 24 && Merged$oP.Res < 0.01) {
    "0"
  } else if(Merged$MEAS_ANAL_METH_CODE == 24 && Merged$oP.Res >= 0.01 && Merged$oP.Res < 0.0375) {
    "1"
  } else if(Merged$MEAS_ANAL_METH_CODE == 24 && Merged$oP.Res >= 0.0375 && Merged$oP.Res < 0.0725) {
    "2"
  } else if(Merged$MEAS_ANAL_METH_CODE == 24 && Merged$oP.Res >= 0.0725 && Merged$oP.Res < 0.1) {
    "3"
  } else if(Merged$MEAS_ANAL_METH_CODE == 24 && Merged$oP.Res >= 0.1) {
    "4"
  } else if(Merged$MEAS_ANAL_METH_CODE == 21 && Merged$oP.Res < 0.01) {
    "4"
  } else if(Merged$MEAS_ANAL_METH_CODE == 21 && Merged$oP.Res >= 0.01 && Merged$oP.Res < 0.0375) {
    "3"
  } else if(Merged$MEAS_ANAL_METH_CODE == 21 && Merged$oP.Res >= 0.0375 && Merged$oP.Res < 0.0725) {
    "2"
  } else if(Merged$MEAS_ANAL_METH_CODE == 21 && Merged$oP.Res >= 0.0725 && Merged$oP.Res < 0.1) {
    "1"
  } else {"0"} 
}

I can see there is probably an issue about R not knowing where to put the outputs, but I cannot figure it out.

>Solution :

A solution using loops and a lookup list.

First store the cut breaks and labels for each code in a list.

tmp=list(
  "21"=list(
    "brk"=c(0,0.01,0.0375,0.0725,0.1,1),
    "lab"=0:4
  ),
  "24"=list(
    "brk"=c(0,0.01,0.0375,0.0725,0.1,1),
    "lab"=4:0
  )
)

Then loop over the columns of interest and for each code apply the cut function.

for(cc in c("oP.Res","TP.Res")) {
  Merged[paste0(cc,"_cut")]=NA
  for (ctg in unique(Merged$MEAS_ANAL_METH_CODE)) {
    Merged[Merged$MEAS_ANAL_METH_CODE==ctg,paste0(cc,"_cut")]=
      as.character(
        cut(
          Merged[Merged$MEAS_ANAL_METH_CODE==ctg,cc],
          tmp[[as.character(ctg)]][["brk"]],
          tmp[[as.character(ctg)]][["lab"]]
        )
      )
  }
}
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading