I’m working with a large data set in R in which the * character is used to denote cells with missing values. I am trying to replace every cell that contains * with NA. To do this, I am trying to iterate over every row (per column) using the following code:
for (i in 1:nrow(mydata)){
  if (i == "*"){
    mydata[i,] <- NA
  }
}
The code runs but the data frame remains unchanged. Can someone help me understand why it doesn’t work and help with different ways to get the intended result?
>Solution :
You can just do this:
mydata[mydata == '*'] <- NA
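To see why that one-liner works, here is a minimal sketch on a made-up data frame. The column names a and b, the values, and the helper variable is_star are assumptions for illustration only; they are not from the question.

```r
# Toy data frame standing in for mydata (values are made up for illustration)
mydata <- data.frame(
  a = c("1", "*", "3"),
  b = c("x", "y", "*"),
  stringsAsFactors = FALSE
)

# Comparing a data frame to a scalar gives a logical matrix of the same shape
is_star <- mydata == "*"

# Indexing with that matrix assigns NA to every cell where it is TRUE
mydata[is_star] <- NA

print(mydata)
```

Note that a column which held * is normally read in as character, so after the replacement a numeric column would probably still need an explicit conversion such as as.numeric(mydata$a).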
It is generally bad form in R to loop over a data frame, since it is computationally inefficient. It should only be done if a vectorized alternative is not available. Most operations on a data frame can be done without looping.
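Another loop-free option, if the data comes from a delimited text file (an assumption, since the question does not say how mydata was loaded), is to declare * as a missing-value marker at import time through the na.strings argument of read.csv. The inline csv_text below is a made-up stand-in for a real file.

```r
# Made-up CSV content standing in for the real file
csv_text <- "a,b\n1,x\n*,y\n3,*"

# na.strings tells read.csv which strings to treat as missing values
mydata <- read.csv(text = csv_text, na.strings = "*", stringsAsFactors = FALSE)

print(mydata)
str(mydata)
```

A side benefit of this approach is that a column such as a is parsed as numeric directly, because the * never reaches the data frame as a string.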
Also, your code doesn’t work because you’re checking whether the iterator i equals "*", not whether the value in a cell equals "*". In effect, your code is checking
1 == "*"
2 == "*"
3 == "*"
etc.
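Each of these comparisons returns FALSE, so no changes are made. (Had one of them been TRUE, mydata[i,] <- NA would have set the entire row to NA, not just one cell.) To loop, you’d need to iterate over both rows and columns and check the value with mydata[i, j] == '*', where i is your row index and j is your column index.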
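For completeness, here is a sketch of what such a nested loop could look like, again on a made-up data frame. The vectorized one-liner remains the better choice. The is.na() guard is an addition not mentioned above; it is there because comparing an existing NA with == yields NA, which if() cannot handle.

```r
# Toy data frame standing in for mydata (values are made up for illustration)
mydata <- data.frame(
  a = c("1", "*", "3"),
  b = c("x", "y", "*"),
  stringsAsFactors = FALSE
)

for (i in seq_len(nrow(mydata))) {      # i is the row index
  for (j in seq_len(ncol(mydata))) {    # j is the column index
    # Skip cells that are already NA, then compare the cell value itself
    if (!is.na(mydata[i, j]) && mydata[i, j] == "*") {
      mydata[i, j] <- NA
    }
  }
}

print(mydata)
```

Using seq_len() instead of 1:nrow(mydata) avoids iterating over c(1, 0) when the data frame has no rows.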