I have two Dataframes
df1 = pd.DataFrame(
{
"A": ["1", "3", "22", "43"],
"B": ["6", "19", "4", "31"],
"C": ["47", "15", "8", "19"],
},
index=[0, 1, 2, 3],
)
df2 = pd.DataFrame(
{
"A": ["65", "47", "6", "13"],
"B": ["29", "5", "2", "21"],
"C": ["69", "9", "11", "80"],
},
index=[4, 5, 6, 7],
)
By using pandas, the final result should be:
A B C Ti ID
0 1 6 47 am 01
1 3 19 15 am 01
2 22 4 8 am 01
3 43 31 19 am 01
4 65 29 69 pm 01
5 47 5 9 pm 01
6 6 2 11 pm 01
7 13 21 80 pm 01
I went through Pandas Documentation, and I am trying to merge these two Dataframes by using pd.concat. The code is:
new_df = pd.concat([df1, df2], keys=['am', 'pm']).reset_index()
However, the new Dataframe came out with an extra column level_1 that I don’t want it to be there:
level_0 level_1 A B C
0 am 0 1 6 47
1 am 1 3 19 15
2 am 2 22 4 8
3 am 3 43 31 19
4 pm 4 65 29 69
5 pm 5 47 5 9
6 pm 6 6 2 11
7 pm 7 13 21 80
I know reset_index() created that unwanted column. But why?
What else to do to get the same final Dataframe using pandas?
>Solution :
After you concatenate the DataFrames,
new_df = pd.concat([df1, df2], keys=['am', 'pm'])
new_df looks like
A B C
am 0 1 6 47
1 3 19 15
2 22 4 8
3 43 31 19
pm 4 65 29 69
5 47 5 9
6 6 2 11
7 13 21 80
If we look at the index new_df.index, it’s a MultiIndex where the first level is the keys and second level is the old index:
MultiIndex([('am', 0),
('am', 1),
('am', 2),
('am', 3),
('pm', 4),
('pm', 5),
('pm', 6),
('pm', 7)],
)
Then first we can rename the MultiIndex levels using rename_axis, and reset_index but only remove the first level from the index (which then becomes a column with its name). Note that by default, reset_index removes all levels from the index. That’s why you see level_0 and level_1 columns added after reset_index.
new_df = new_df.rename_axis(['Ti', None]).reset_index(level=0)
You can rearrange the columns by reassigning the DataFrame with a list of the columns with the desired order.
cols = new_df.columns.tolist()
new_df = new_df[cols[1:]+[cols[0]]]
new_df['ID'] = '01'
Output:
A B C Ti ID
0 1 6 47 am 01
1 3 19 15 am 01
2 22 4 8 am 01
3 43 31 19 am 01
4 65 29 69 pm 01
5 47 5 9 pm 01
6 6 2 11 pm 01
7 13 21 80 pm 01