What type of join should I use?

I have two data frames with different numbers of columns. All columns of the second data frame are included in the first. The patients in the two data frames are also different. I need to combine them. I don't think merge (or the *_join functions of dplyr) will work, since I have to stack the data frames on top of each other rather than match rows. Row binding (rbind) should not work either, because the columns differ. What is the simplest way to do it?

mydata<-data.frame(
  ID=c(1,1,1,2,2),B=rep("b",5),C=rep("c",5),D=rep("d",5)
)

mydata2<-data.frame(ID=c(3,4),B=c("b2","b2"),C=c("c2","c2"))

The expected dataset is:

  ID  B  C    D
1  1  b  c    d
2  1  b  c    d
3  1  b  c    d
4  2  b  c    d
5  2  b  c    d
6  3 b2 c2 <NA>
7  4 b2 c2 <NA>

Solution:

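A plain merge with all = TRUE should suffice. By default, merge joins on the columns the two data frames share (here ID, B and C), and all = TRUE keeps the unmatched rows from both sides. Since no patient appears in both data frames, every row is kept exactly once and D is filled with NA for the rows coming from mydata2.

merge(mydata, mydata2, all = TRUE)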
  ID  B  C    D
1  1  b  c    d
2  1  b  c    d
3  1  b  c    d
4  2  b  c    d
5  2  b  c    d
6  3 b2 c2 <NA>
7  4 b2 c2 <NA>
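
If you would rather stack the rows explicitly than rely on the join, here is a minimal base R sketch. It assumes, as in the question, that every column of mydata2 also exists in mydata; the names missing_cols, padded and stacked are made up for illustration.

```r
mydata <- data.frame(
  ID = c(1, 1, 1, 2, 2), B = rep("b", 5), C = rep("c", 5), D = rep("d", 5)
)
mydata2 <- data.frame(ID = c(3, 4), B = c("b2", "b2"), C = c("c2", "c2"))

# Columns present in mydata but absent from mydata2 (here just "D")
missing_cols <- setdiff(names(mydata), names(mydata2))

# Add the missing columns to a copy of mydata2, filled with NA
padded <- mydata2
padded[missing_cols] <- NA

# Both data frames now have the same columns, so rbind() can stack them;
# indexing by names(mydata) puts the columns in the same order
stacked <- rbind(mydata, padded[names(mydata)])
stacked
```

Unlike merge, this keeps the original row order and never matches rows against each other, so it stays safe even if an ID were to appear in both data frames. If dplyr is available, bind_rows(mydata, mydata2) should give the same result, since it fills columns missing from one input with NA.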