Home Combine two dataframes with scalable columns by placing the second dataframe directly below the first dataframe without deleting the keys

Questions

Combine two dataframes with scalable columns by placing the second dataframe directly below the first dataframe without deleting the keys

byMR

April 14, 2024

Here are my 2 dataframes that I have:

Code:

import pandas as pd

data1 = {'Key1': ['Value1'], 'Key2': ['Value2'], 'Key3': ['Value3'], 
         'Key4': ['Value4'], 'Key5': ['Value5']}

data2 = {'Key1': ['Value1', 'Value4'], 
         'Key4': ['Value2', 'Value5'], 
         'Key7': ['Value3', 'Value6']}

df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)

Dataframe 1:

Key1    Key2    Key3    Key4    Key5
Value1  Value2  Value3  Value4  Value5

Dataframe 2:

Key1    Key4    Key7
Value1  Value2  Value3
Value4  Value5  Value6

Expected Results:

Key1    Key2    Key3    Key4    Key5
Value1  Value2  Value3  Value4  Value5
Key1    Key4    Key7    NaN     NaN
Value1  Value2  Value3  NaN     NaN
Value4  Value5  Value6  NaN     NaN

How can I join both of these dataframes without merging the keys at all?

>Solution :

Assuming your input DataFrames have Key1/Key2/… as column headers, you can first move those to rows, then concat.

For that, two options, transpose and reset_index (which can be problemantic depending on the data types):

dfs = [df1, df2]

out = pd.concat([d.T.reset_index().T for d in dfs], ignore_index=True)

Or use a double concat:

dfs = [df1, df2]

out = pd.concat([pd.concat([pd.DataFrame([d.columns]),
                            pd.DataFrame(d.to_numpy())])
                 for d in dfs],
                ignore_index=True)

Reproducible input:

df1 = pd.DataFrame({'Key1': ['Value1'], 'Key2': ['Value2'], 'Key3': ['Value3'],
                    'Key4': ['Value4'], 'Key5': ['Value5']})
df2 = pd.DataFrame({'Key1': ['Value1', 'Value4'],
                    'Key2': ['Value2', 'Value5'],
                    'Key3': ['Value3', 'Value6']})

Output:

        0       1       2       3       4
0    Key1    Key2    Key3    Key4    Key5
1  Value1  Value2  Value3  Value4  Value5
2    Key1    Key2    Key3     NaN     NaN
3  Value1  Value2  Value3     NaN     NaN
4  Value4  Value5  Value6     NaN     NaN