select rows of a dataframe from other dataframe

April 29, 2022

I have a dataframe like this

 tab A
                X                 y         z              w
tom              1.0049838    1.9803902    2.144567    1.6365186
genny            2.8837235    2.6227277    2.840155    0.1565288
sam             4.0138902     4.5632588    5.600432     7.3532597
Monn             86.4080397   88.2598639   88.375718   88.8078334
Lyla            30.8195037     23.6123446  30.683800    28.8027282
giul            68.7287429     68.9194569   65.954704    76.2295242

         
tab B
                    A                  B               
tom               1.71               0.3  
genny             1.6                0.15
sam               2.0                9.57

and my goal is to obtain a tab C with columns X,Y,Z,W but selecting only the rows that are present in tab B. I thought that I should use dplyr and setting "rownames", but I will appreciate your helps.

>Solution :

In base R, we can subset tabA to just the rownames that are present in tabB.

tabC <- tabA[row.names(tabA) %in% row.names(tabB),]

# Or with subset
# tabC <- subset(tabA, row.names(tabA) %in% row.names(tabB))

Or with dplyr:

library(dplyr)

tabC <- tabA %>% 
  filter(row.names(tabA) %in% row.names(tabB))

Output

tabC
             X        y        z         w
tom   1.004984 1.980390 2.144567 1.6365186
genny 2.883723 2.622728 2.840155 0.1565288
sam   4.013890 4.563259 5.600432 7.3532597

Or a slightly longer tidyverse version would be to convert the rownames for each dataframe, then join the dataframes together, then select the desired columns from tabA.

library(tidyverse)

list(tabA, tabB) %>%
  map(~ .x %>%
        as.data.frame %>%
        rownames_to_column('rn')) %>%
  reduce(right_join, by = 'rn') %>%
  column_to_rownames('rn') %>% 
  select(names(tabA))

Data

tabA < structure(list(X = c(1.0049838, 2.8837235, 4.0138902, 86.4080397, 
30.8195037, 68.7287429), y = c(1.9803902, 2.6227277, 4.5632588, 
88.2598639, 23.6123446, 68.9194569), z = c(2.144567, 2.840155, 
5.600432, 88.375718, 30.6838, 65.954704), w = c(1.6365186, 0.1565288, 
7.3532597, 88.8078334, 28.8027282, 76.2295242)), class = "data.frame", row.names = c("tom", 
"genny", "sam", "Monn", "Lyla", "giul"))

tabB < structure(list(A = c(1.71, 1.6, 2), B = c(0.3, 0.15, 9.57)), class = "data.frame", row.names = c("tom", 
"genny", "sam"))