Remove Rows With Same Value Across All Columns Using R

December 20, 2022

I have the following dataset, and I need to remove rows that are entirely empty or that have the same value across all columns:

df <- data.frame(players=c('', 'Uncredited', 'C', 'D', 'E'),
                 assists=c("", "Uncredited", 4, 4, 3),
                 ratings=c("", "Uncredited", 4, 7, ""))


df

players      assists      ratings
<chr>         <chr>        <chr>
        
Uncredited  Uncredited  Uncredited
  C             4            4
  D             4            7
  E             3

In this example, the first row is entirely empty and the second row has the same value, Uncredited, in every column. Hence, the first two rows should be removed.

Desired Output

players assists ratings
 <chr>  <dbl>   <chr>
  C       4       4
  D       4       7
  E       3

Any suggestions would be appreciated. Thanks!

>Solution :

You can use apply to loop over all rows and keep only those that have more than one distinct value. Note that if all values in a row are empty, that row also has only one distinct value (the empty string), so the first condition is a special case of the second.

df[apply(df,
         MARGIN = 1, # rowwise
         FUN = function(x) length(unique(x)) > 1), ]
#>   players assists ratings
#> 3       C       4       4
#> 4       D       4       7
#> 5       E       3
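
As a possible alternative, the same filter can be sketched without an explicit per-row function by comparing every column against the first one. This is only a sketch under a few assumptions: the data contains no NA values (a comparison with NA would yield NA rather than TRUE or FALSE), and the names m, keep and result are introduced here for illustration. The last step is optional: filtering leaves assists as a character column, so it is converted explicitly to get the numeric column shown in the desired output.

```r
# Same data as in the question; stringsAsFactors = FALSE keeps the
# columns as character vectors on R versions before 4.0 as well
df <- data.frame(players = c("", "Uncredited", "C", "D", "E"),
                 assists = c("", "Uncredited", 4, 4, 3),
                 ratings = c("", "Uncredited", 4, 7, ""),
                 stringsAsFactors = FALSE)

# Work on a character matrix (apply() performs the same conversion)
m <- as.matrix(df)

# m[, 1] is recycled down each column, so element [i, j] of the
# comparison is TRUE when m[i, j] differs from the first value in row i
keep <- rowSums(m != m[, 1]) > 0

# Keep only rows where at least one value differs from the first column
result <- df[keep, ]

# Optional: the remaining assists values are all numeric strings here
result$assists <- as.numeric(result$assists)

result
```

A row is dropped only when every value equals the value in its first column, which covers both the all-empty row and the all-Uncredited row.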