Drop columns whose names aren't certain values

I have two data frames, and I am trying to drop the columns of the first whose names are not found among the names in the second.

For example:

DF1

      ahmed  emad  ali
--    --     --    --
--    --     --    --
--    --     --    --

DF2

names
emad
ahmed
ibrahim
saad
hassan

I am trying to drop the DF1 columns whose names aren’t in the names of DF2.

My code so far:

library(dplyr)

`%notin%` <- Negate(`%in%`)

for ( i in seq_along(colnames(DF1))){
  if (colnames(DF1)[i] %notin% rownames(DF2)) {
    DF1=select(DF1,-i)
  }
}

It gets the job done; however, it raises this error:

Error: Can’t subset columns that don’t exist.

If I run the code again, it drops the "ahmed" and "emad" columns even though they exist in DF2!

>Solution :

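The issue with your loop is that it is set up to iterate over every original column position, but you delete columns as you go. Each deletion shifts the remaining columns one position to the left, so later values of i no longer point at the columns you meant to check. By the time you get to, say, column 20, you have deleted several columns and there is no longer a column 20!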
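To make the shifting positions concrete, here is a minimal base R sketch. The toy data frame `df` and the vector `wanted` are made up for illustration and are not the asker's data. The `is.na()` guard is added only so the sketch runs to completion; in the question's code, that out-of-range position is passed to `select()`, which appears to be where the error comes from.

```r
# Toy data (hypothetical): five columns, only "a" and "c" should survive.
df <- data.frame(a = 1, b = 2, c = 3, d = 4, e = 5)
wanted <- c("a", "c")

# The positions are fixed before the loop starts: 1, 2, 3, 4, 5.
positions <- seq_along(colnames(df))

visited <- character(0)
for (i in positions) {
  name_at_i <- colnames(df)[i]   # NA once i exceeds the current column count
  visited <- c(visited, name_at_i)
  if (!is.na(name_at_i) && !(name_at_i %in% wanted)) {
    df <- df[-i]                 # every later column shifts one place left
  }
}

print(visited)       # a, b, d, NA, NA: columns "c" and "e" were never examined
print(colnames(df))  # a, c, e: "e" survives although it is not in `wanted`
```

Column "e" is skipped because it slides into a position the loop has already passed, and the last two iterations look at positions that no longer exist.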

But you don’t need a loop at all for this.

cols_to_keep = intersect(colnames(DF1), rownames(DF2))
DF1 = DF1[cols_to_keep]
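
As a self-contained illustration of the same idea, the sketch below rebuilds small stand-ins for the question's data. The cell values and the `score` column are invented, and it assumes the names in DF2 are row names, as the question's code implies. If they were instead stored in a column (assumed here to be called `names`), the second variant would apply.

```r
# Toy stand-ins for the question's data; the cell values are made up.
DF1 <- data.frame(ahmed = 1:3, emad = 4:6, ali = 7:9)
DF2 <- data.frame(score = 1:5,
                  row.names = c("emad", "ahmed", "ibrahim", "saad", "hassan"))

# Names present in both places, in the order they appear in DF1.
cols_to_keep <- intersect(colnames(DF1), rownames(DF2))

# drop = FALSE keeps a data frame even if only one column survives.
DF1_kept <- DF1[, cols_to_keep, drop = FALSE]

print(colnames(DF1_kept))  # "ahmed" "emad"

# Variant: the names live in a column rather than in the row names.
DF2_alt <- data.frame(names = c("emad", "ahmed", "ibrahim", "saad", "hassan"))
cols_alt <- intersect(colnames(DF1), DF2_alt$names)
```

Because nothing is deleted while iterating, no position can shift or go out of range, and running the code a second time leaves the result unchanged.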