I have multiple dataframes and would like a dataframe that contains all column names from said multiple dataframes.
For example :
# Existing Dataframes
df1 =
df1_colA df1_colB df1_colC
0 1 2 3
1 4 5 6
2 7 8 9
df2 =
df2_colA df2_colB df3_colC
0 10 11 12
1 13 14 15
2 16 17 18
df3 =
df3_colA df3_colB df3_colC
0 30 31 32
1 33 34 35
2 36 37 38
I would like to get a dataframe like this :
names =
df_names col_names
0 df1 df1_colA
1 df1 df1_colB
2 df1 df1_colC
3 df2 df2_colA
4 df2 df2_colB
5 df2 df2_colC
6 df3 df3_colA
7 df3 df3_colB
8 df3 df3_colC
Help would be very appreciated and thank you in advance!
>Solution :
If possible extract DataFrame names fom columns names use list comprehension with concat and last for new column in first position use DataFrame.insert with Series.str.extractSeries.str.extractall for values from columnsnames before _:
dfs = [df1, df2, df3]
df = pd.concat([df.columns.to_frame(name='col_names') for df in dfs], ignore_index=True)
df.insert(0, 'df_names', df['col_names'].str.extract('^(.*)_'))
print (df)
df_names col_names
0 df1 df1_colA
1 df1 df1_colB
2 df1 df1_colC
3 df2 df2_colA
4 df2 df2_colB
5 df3 df3_colC
6 df3 df3_colA
7 df3 df3_colB
8 df3 df3_colC
Similar ide with flatten list comprehension:
dfs = [df1, df2, df3]
df = pd.DataFrame({'col_names': [x for df in dfs for x in df.columns]})
df.insert(0, 'df_names', df['col_names'].str.extract('^(.*)_'))
print (df)
df_names col_names
0 df1 df1_colA
1 df1 df1_colB
2 df1 df1_colC
3 df2 df2_colA
4 df2 df2_colB
5 df3 df3_colC
6 df3 df3_colA
7 df3 df3_colB
8 df3 df3_colC
Alternative is create dictionary of DataFrames and in dict comprehension use concat, keys of dict create first level of MultiIndex, so not necessary parse columns names:
dfs = {'df1':df1, 'df2':df2, 'df3':df3}
df = (pd.concat({k:v.columns.to_frame(name='col_names') for k, v in dfs.items()})
.droplevel(1)
.rename_axis('df_names')
.reset_index())
print (df)
df_names col_names
0 df1 df1_colA
1 df1 df1_colB
2 df1 df1_colC
3 df2 df2_colA
4 df2 df2_colB
5 df2 df3_colC
6 df3 df3_colA
7 df3 df3_colB
8 df3 df3_colC