Here are my 2 dataframes that I have:
Code:
import pandas as pd
data1 = {'Key1': ['Value1'], 'Key2': ['Value2'], 'Key3': ['Value3'],
'Key4': ['Value4'], 'Key5': ['Value5']}
data2 = {'Key1': ['Value1', 'Value4'],
'Key4': ['Value2', 'Value5'],
'Key7': ['Value3', 'Value6']}
df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)
Dataframe 1:
Key1 Key2 Key3 Key4 Key5
Value1 Value2 Value3 Value4 Value5
Dataframe 2:
Key1 Key4 Key7
Value1 Value2 Value3
Value4 Value5 Value6
Expected Results:
Key1 Key2 Key3 Key4 Key5
Value1 Value2 Value3 Value4 Value5
Key1 Key4 Key7 NaN NaN
Value1 Value2 Value3 NaN NaN
Value4 Value5 Value6 NaN NaN
How can I join both of these dataframes without merging the keys at all?
>Solution :
Assuming your input DataFrames have Key1/Key2/… as column headers, you can first move those to rows, then concat.
For that, two options, transpose and reset_index (which can be problemantic depending on the data types):
dfs = [df1, df2]
out = pd.concat([d.T.reset_index().T for d in dfs], ignore_index=True)
Or use a double concat:
dfs = [df1, df2]
out = pd.concat([pd.concat([pd.DataFrame([d.columns]),
pd.DataFrame(d.to_numpy())])
for d in dfs],
ignore_index=True)
Reproducible input:
df1 = pd.DataFrame({'Key1': ['Value1'], 'Key2': ['Value2'], 'Key3': ['Value3'],
'Key4': ['Value4'], 'Key5': ['Value5']})
df2 = pd.DataFrame({'Key1': ['Value1', 'Value4'],
'Key2': ['Value2', 'Value5'],
'Key3': ['Value3', 'Value6']})
Output:
0 1 2 3 4
0 Key1 Key2 Key3 Key4 Key5
1 Value1 Value2 Value3 Value4 Value5
2 Key1 Key2 Key3 NaN NaN
3 Value1 Value2 Value3 NaN NaN
4 Value4 Value5 Value6 NaN NaN