Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

flatten dataframe columns to multi-indexed

I want to go from a previously flattened dataframe with single-level columns back to a multi-indexed dataframe.

Here is an example:

import pandas as pd

# Create a sample dataframe with multi-indexed columns
df = pd.DataFrame({('A', 'a'): [1, 2, 3], ('A', 'b'): [4, 5, 6], ('B', 'a'): [7, 8, 9], ('B', 'b'): [10, 11, 12]})

print(df)

The multi-indexed dataframe:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

   A     B    
   a  b  a   b
0  1  4  7  10
1  2  5  8  11
2  3  6  9  12

Than the flattening:

# Flatten the columns using the to_flat_index() method
df.columns = df.columns.to_flat_index()

print(df)

flattened dataframe with single-level columns:

   (A, a)  (A, b)  (B, a)  (B, b)
0       1       4       7      10
1       2       5       8      11
2       3       6       9      12

How to go from a flattened dataframe with single-level columns back to a multi-indexed dataframe?

>Solution :

First Case

Taking the flattened dataframe from the example above:

# Create a new MultiIndex from the columns tuples names
new_columns = pd.MultiIndex.from_tuples(df.columns)

# Assign the new MultiIndex to the columns
df.columns = new_columns

print(df)

Results in:

   A     B    
   a  b  a   b
0  1  4  7  10
1  2  5  8  11
2  3  6  9  12

Because the df.columns = df.columns.to_flat_index() method returns a new Index object with a single level, which is the result of flattening the multi-level column names into tuples.

Therefore, to create a new multi-indexed dataframe from a flattened dataframe, you need to extract the tuples of the original multi-indexed column names and pass them to the pd.MultiIndex.from_tuples method.

Second Case

You might encounter differently flattened dataframes like in this example:

import pandas as pd

# Create a sample dataframe with multi-indexed columns
df = pd.DataFrame({('A', 'a'): [1, 2, 3], ('A', 'b'): [4, 5, 6], ('B', 'a'): [7, 8, 9], ('B', 'b'): [10, 11, 12]})

# Flatten the columns
df.columns = ['_'.join(col) for col in df.columns]

print(df)

Which results in:

   A_a  A_b  B_a  B_b
0    1    4    7   10
1    2    5    8   11
2    3    6    9   12

In this case you can use the following code to get a multi-indexed dataframe agian:

# Create a new MultiIndex from a list of tuples
new_columns = pd.MultiIndex.from_tuples([tuple(col.split('_')) for col in df.columns])

# Assign the new MultiIndex to the columns
df.columns = new_columns

print(df)

which results in:

   A     B    
   a  b  a   b
0  1  4  7  10
1  2  5  8  11
2  3  6  9  12
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading