Remove a row from all data frames in a list if any of them has an NA in that row

May 24, 2022

I have a list of data.frames of equal size. Each data.frame has missing data in different rows and columns. If a row contains a NaN in any one of the data.frames, I would like to remove that row from every data.frame in the list. My current lapply and na.omit code only removes rows from the data.frame that actually contains the NaN, which makes sense, since it processes each data.frame in the list independently before moving on to the next one. Instead, I would like a NaN in a given row of one data.frame to cause that row to be removed from all the other data.frames as well.

Some example code:

#Make list
ls <- list(x1=data.frame(a=c(1,2,3,4),b=c(2,3,4,5),c=c(3,4,NaN,6)),
           x2=data.frame(a=c(1,NaN,3,4),b=c(2,3,4,5),c=c(3,4,5,6)))
#Desired output
lscalc <- list(x1=data.frame(a=c(1,4),b=c(2,5),c=c(3,6)),
               x2=data.frame(a=c(1,4),b=c(2,5),c=c(3,6)))

Solution:

Assuming all the datasets have the same number of rows, first collect the indices of the rows that contain an NA in any dataset, then loop over the list and remove those rows from each one:

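# Row indices that contain an NA/NaN in at least one data frame of the list
un1 <- unique(unlist(lapply(ls, function(x) which(is.na(x), arr.ind = TRUE)[, 1])))
# Drop those rows from every data frame (drop = FALSE keeps one-column results as data frames)
lapply(ls, function(x) x[!seq_len(nrow(x)) %in% un1, , drop = FALSE])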
$x1
  a b c
1 1 2 3
4 4 5 6

$x2
  a b c
1 1 2 3
4 4 5 6
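
An alternative that may read more directly is to build a single logical mask with complete.cases() and apply it to every data frame. This is a minimal sketch, not part of the original answer; it assumes, as above, that all data frames have the same number of rows. The names `dfs`, `keep` and `cleaned` are illustrative only (`dfs` is used instead of `ls` to avoid masking base::ls).

```r
# Example list from the question, renamed to `dfs` (illustrative name)
dfs <- list(
  x1 = data.frame(a = c(1, 2, 3, 4), b = c(2, 3, 4, 5), c = c(3, 4, NaN, 6)),
  x2 = data.frame(a = c(1, NaN, 3, 4), b = c(2, 3, 4, 5), c = c(3, 4, 5, 6))
)

# complete.cases() gives one logical per row: TRUE when the row has no NA/NaN.
# Reduce(`&`, ...) combines the vectors so a row is kept only if it is
# complete in every data frame of the list.
keep <- Reduce(`&`, lapply(dfs, complete.cases))

# Apply the same mask to each data frame; drop = FALSE preserves the
# data.frame class even if only one column remains.
cleaned <- lapply(dfs, function(x) x[keep, , drop = FALSE])
cleaned
```

Because the mask is logical rather than a set of indices, it can also be reused on any other object with the same number of rows.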