Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Concatenating two (or more) dataframes while adding column in python

I’m trying to generate the input dataset for multi-task learning, but I faced some problems during concatenating the datasets.
This is the example dataframes for merging.

import pandas as pd

df1 = pd.DataFrame({
    'sample': ['A', 'B', 'C', 'D'],
    'group': [1,0,1,0],
    'value': [123, 64, 534, 873]
})

df2 = pd.DataFrame({
    'sample': ['A', 'D', 'E'],
    'group': [1,1,0],
    'value': [372, 981, 23]
})

df1
# sample    group      value
#0     A        1        123
#1     B        0         64
#2     C        1        534
#3     D        0        873

df2
# sample    group      value
#0     A        1        372
#1     D        1        981
#2     E        0         23

Expected result

#group_x and group_y columns do not need to be the same name as the expected result.
df3
# sample    group_x   group_y      value
#0     A          1       NaN        123
#1     B          0       NaN         64
#2     C          1       NaN        534
#3     D          0       NaN        873
#4     A        NaN         1        372
#5     D        NaN         1        981
#6     E        NaN         0         23

I tried pd.concat and pd.merge for appending two dataframes because they share the same columns.
But 1) with pd.concat, I could not append group columns horizontally, and 2) pd.merge expand columns widely.
Is there any recommended function for this job?

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

You need a outer merge but on columns [‘sample’, ‘value’]

out = df1.merge(df2, how='outer', on=['sample', 'value'])

print(out)

  sample  group_x  value  group_y
0      A      1.0    123      NaN
1      B      0.0     64      NaN
2      C      1.0    534      NaN
3      D      0.0    873      NaN
4      A      NaN    372      1.0
5      D      NaN    981      1.0
6      E      NaN     23      0.0
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading