Iterating over each row for every column within a data frame

I’m working with a large data set in R in which the * character is used to denote cells with missing values. I am trying to replace every cell that contains * with NA. To do this, I am trying to iterate over every row (per column) using the following code:

for (i in 1:nrow(mydata)){
  if (i == "*"){
    mydata[i,] <- NA
  }
}

The code runs but the data frame remains unchanged. Can someone help me understand why it doesn’t work and help with different ways to get the intended result?

Solution:

You can just do this:

mydata[mydata == '*'] <- NA

It is generally bad form in R to loop over a data frame, since it is computationally inefficient; it should only be done if a vectorized alternative is not available. Most operations on a data frame can be done without looping.
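As a rough illustration of the vectorized replacement, here is a minimal sketch on a made-up toy data frame (the column names a and b and their values are assumptions, not your data). The final conversion step assumes every column is meant to be numeric, which may not hold for your data set.

```r
# Toy data frame standing in for mydata (values are made up for illustration)
mydata <- data.frame(
  a = c("1", "*", "3"),
  b = c("*", "5", "6"),
  stringsAsFactors = FALSE
)

# mydata == "*" yields a logical matrix with the same shape as the data frame;
# indexing with it selects every matching cell at once
mydata[mydata == "*"] <- NA

# Columns containing "*" are stored as character, so convert them to numeric
# (assumption: all columns here are supposed to be numeric)
mydata[] <- lapply(mydata, as.numeric)
```

If the data comes from a file, it may be simpler to pass na.strings = "*" to read.csv so the values are read as NA in the first place.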

Also, your code doesn’t work because you’re checking whether the iterator i equals "*", not whether the value in the cell equals "*". Your code is checking:

1 == "*"
2 == "*"
3 == "*"
etc.

Each of these comparisons returns FALSE, so no changes are made. To loop, you’d need to loop over both rows and columns, checking the value mydata[i, j] == '*', where i is your row index and j is your column index.
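For completeness, the explicit double loop described above could look something like the sketch below. The toy data frame is made up for illustration, and the isTRUE() guard is an addition of mine to cope with cells that are already NA. The vectorized one-liner is still the better choice in practice.

```r
# Toy data frame standing in for mydata (values are made up for illustration)
mydata <- data.frame(
  a = c("1", "*", "3"),
  b = c("*", "5", "6"),
  stringsAsFactors = FALSE
)

# Visit every cell: j indexes columns, i indexes rows
for (j in seq_len(ncol(mydata))) {
  for (i in seq_len(nrow(mydata))) {
    # isTRUE() guards against cells that are already NA, where == returns NA
    if (isTRUE(mydata[i, j] == "*")) {
      mydata[i, j] <- NA
    }
  }
}
```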
