Make index and columns the same set (their union) in Pandas dataframe

In our problem, rows (index) and columns belong to the same category of objects. We want to enlarge a Pandas DataFrame by adding rows and columns filled with NaNs or predefined values, so that the index and the columns both become the union of the original index and column labels.
For example, transform

   A  C
B  0  1
C  1  1

into

     A    B    C
A  NaN  NaN  NaN
B    0  NaN    1
C    1  NaN    1

Practical example: constructing the adjacency matrix of a directed graph, with vertex labels on both rows and columns. At some stage, the rows and columns for vertices with no outgoing or incoming edges have to be filled in.

The core question is how to do this efficiently. For such a basic operation, it feels like there should be a standard method. Is there one?

The simple solution is to iterate over all the entries in index and columns that are not in the other set and add columns/rows (respectively) to the dataframe.
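For comparison, the iterative approach described above might look like the following minimal sketch. The example frame is the one from the question; the variable name `naive` and the final sorting step are assumptions added for illustration.

```python
import pandas as pd

# Example frame from the question: rows B, C and columns A, C
df = pd.DataFrame([[0, 1], [1, 1]], index=["B", "C"], columns=["A", "C"])

naive = df.copy()
# Add a NaN-filled column for every row label that is not yet a column
for label in df.index.difference(df.columns):
    naive[label] = float("nan")
# Add a NaN-filled row for every column label that is not yet a row
for label in df.columns.difference(df.index):
    naive.loc[label] = float("nan")
# New labels were appended at the end, so sort both axes afterwards
naive = naive.sort_index(axis=0).sort_index(axis=1)
```

Each loop iteration enlarges the frame by one label, and a separate sort is needed to put the new labels in place, which is why this feels heavier than it should for a large frame.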

The difficulty with a plain reindex and similar methods is that both axes have to be enlarged at once, and the missing labels can fall between existing columns rather than at the end.

Solution:

I would get the index union and reindex:

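# Union of the row and column labels (sorted when the labels are comparable)
idx = df.index.union(df.columns)
# Reindex both axes at once; cells for newly added labels are filled with NaN
out = df.reindex(index=idx, columns=idx)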

Output:

     A   B    C
A  NaN NaN  NaN
B  0.0 NaN  1.0
C  1.0 NaN  1.0
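The question also asks about filling with predefined values instead of NaN. A minimal, self-contained sketch of that variant, using the `fill_value` parameter of `DataFrame.reindex`, could look like this. The example frame is the one from the question, the name `adj` is illustrative, and the choice of 0 as the fill value assumes the adjacency-matrix use case.

```python
import pandas as pd

# Example frame from the question: rows B, C and columns A, C
df = pd.DataFrame([[0, 1], [1, 1]], index=["B", "C"], columns=["A", "C"])

# Union of row and column labels
idx = df.index.union(df.columns)

# fill_value replaces the default NaN in cells created by the reindex
adj = df.reindex(index=idx, columns=idx, fill_value=0)
print(adj)
```

With an integer fill value the frame keeps its integer dtype, whereas the NaN-filled version is upcast to float. Note that `fill_value` only applies to cells introduced by the reindex; NaNs already present in the original frame are left untouched.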