I have a list of data.frames of equal size. Each data.frame has missing data in different rows and columns. I would like to remove a row from every data.frame whenever any one of the data.frames has a NaN in that row. My current lapply and na.omit code only removes rows from the data.frame that actually contains the NaN, which makes sense, since it processes each data.frame in the list independently before moving on to the next one. However, I would like a NaN in a given row of one data.frame to cause that row to be removed from all the other data.frames as well.
Some example code:
#Make list
ls <- list(x1=data.frame(a=c(1,2,3,4),b=c(2,3,4,5),c=c(3,4,NaN,6)),
x2=data.frame(a=c(1,NaN,3,4),b=c(2,3,4,5),c=c(3,4,5,6)))
#Desired output
lscalc <- list(x1=data.frame(a=c(1,4),b=c(2,5),c=c(3,6)),
x2=data.frame(a=c(1,4),b=c(2,5),c=c(3,6)))
>Solution :
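Assuming all the datasets have the same number of rows, first collect the indices of the rows that contain a missing value in any of the datasets, then loop over the list and remove those rows from each one. Note that is.na() returns TRUE for both NA and NaN, so both are caught.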
un1 <- unique(unlist(lapply(ls, function(x) which(is.na(x), arr.ind = TRUE)[,1])))
lapply(ls, function(x) x[!seq_len(nrow(x)) %in% un1, ])
$x1
  a b c
1 1 2 3
4 4 5 6

$x2
  a b c
1 1 2 3
4 4 5 6
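
The same idea can also be sketched with a logical mask instead of row indices. This is only an alternative illustration, not the answer's original method: it assumes, as above, that every data.frame has the same number of rows in the same order, and the names `complete_by_df`, `keep` and `res` are made up for this example. `complete.cases()` marks the rows without missing values in each data.frame, and `Reduce()` combines those marks so that a row is kept only if it is complete everywhere.

```r
# Example input from the question
ls <- list(
  x1 = data.frame(a = c(1, 2, 3, 4), b = c(2, 3, 4, 5), c = c(3, 4, NaN, 6)),
  x2 = data.frame(a = c(1, NaN, 3, 4), b = c(2, 3, 4, 5), c = c(3, 4, 5, 6))
)

# One logical vector per data.frame: TRUE where the row has no NA/NaN
complete_by_df <- lapply(ls, complete.cases)

# A row is kept only if it is complete in every data.frame
keep <- Reduce(`&`, complete_by_df)

# drop = FALSE keeps each result a data.frame even if it has a single column
res <- lapply(ls, function(x) x[keep, , drop = FALSE])
```

For the example list, `keep` marks rows 1 and 4, so `res` should match the desired output in the question. The `drop = FALSE` argument is a small safeguard that may also be worth adding to the index-based version if any data.frame could have only one column.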