Pandas Dataframe from matrix-like dictionary where keys are tuples of indices

I have a dictionary whose keys are tuples of the form (i,j) and whose values are matrix entries.

So if you think of a mathematical matrix $A = (a_{i,j})$ then matrix_dict[(i,j)] would give the value of row i and column j.

I would like a pandas DataFrame in which the values matrix_dict[(i,0)] for i in range(1, m+1) are the row names, the values matrix_dict[(0,j)] for j in range(1, n+1) are the column names, and every value whose indices (i,j) are both nonzero is the entry at the corresponding row and column.

The dictionary would look like this:

matrix_dict = {
    (0, 0): 'RowIndex\\ColumnIndex',
    (0, 1): 'Column1',
    (0, 2): 'Column2',
    (1, 0): 'Row1',
    (1, 1): 1,
    (1, 2): 2,
    (2, 0): 'Row2',
    (2, 1): 3,
    (2, 2): 4
}

I thought it would be easy to convert that into a pandas dataframe as the structure already matches in a way, but the solutions I found on here using pd.DataFrame.from_dict are for different problems where the key tuple is supposed to become part of the dataframe or multi-indices.

Solution:

If I understood correctly, use pandas.Series and unstack:

import pandas as pd

dic = {(0, 0): 1, (0, 1): 2, (1, 0): 3, (1, 1): 4, (2, 2): 5}

df = pd.Series(dic).unstack(fill_value=0)

Output:

   0  1  2
0  1  2  0
1  3  4  0
2  0  0  5
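
Why this works: a dictionary with tuple keys becomes a Series with a two-level MultiIndex, and unstack pivots the inner level into columns. A minimal sketch of those two steps, reusing the dic from above (the intermediate name s is only for illustration):

```python
import pandas as pd

dic = {(0, 0): 1, (0, 1): 2, (1, 0): 3, (1, 1): 4, (2, 2): 5}

# Tuple keys become a two-level MultiIndex:
# level 0 is the row index i, level 1 is the column index j.
s = pd.Series(dic)
print(s.index.nlevels)

# unstack() moves the innermost level (j) into the columns;
# (i, j) pairs missing from the dictionary are filled with fill_value.
df = s.unstack(fill_value=0)
print(df.shape)
```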

You can also reindex to force a fixed shape of m rows and n columns:

m, n = 4, 5

df = (pd.Series(dic).unstack(fill_value=0)
        .reindex(index=range(m), columns=range(n), fill_value=0)
     )

Output:

   0  1  2  3  4
0  1  2  0  0  0
1  3  4  0  0  0
2  0  0  5  0  0
3  0  0  0  0  0

For the updated question:

matrix_dict = {
    (0, 0): 'RowIndex\\ColumnIndex',
    (0, 1): 'Column1',
    (0, 2): 'Column2',
    (1, 0): 'Row1',
    (1, 1): 1,
    (1, 2): 2,
    (2, 0): 'Row2',
    (2, 1): 3,
    (2, 2): 4
}

m, n = 2, 2

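df = (pd.Series(matrix_dict).unstack(fill_value=0)
        .reindex(index=range(m+1), columns=range(n+1), fill_value=0)
        .set_index(0)  # column 0 holds the row names
        # use the first row as the column names, then drop that row
        .pipe(lambda d: d.set_axis(d.iloc[0], axis=1).iloc[1:])
        .rename_axis(index=None, columns=None)  # drop leftover axis names
     )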

Output:

     Column1 Column2
Row1       1       2
Row2       3       4
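
As an alternative to the method chain (not part of the answer above, just a sketch), the labels and the data can be pulled out of the dictionary with plain comprehensions and passed to the DataFrame constructor. This assumes every key (i, j) with 0 <= i <= m and 0 <= j <= n is present; the names rows, cols and data are only illustrative:

```python
import pandas as pd

matrix_dict = {
    (0, 0): 'RowIndex\\ColumnIndex',
    (0, 1): 'Column1',
    (0, 2): 'Column2',
    (1, 0): 'Row1',
    (1, 1): 1,
    (1, 2): 2,
    (2, 0): 'Row2',
    (2, 1): 3,
    (2, 2): 4,
}
m, n = 2, 2

# Row names live in column 0, column names in row 0.
rows = [matrix_dict[(i, 0)] for i in range(1, m + 1)]
cols = [matrix_dict[(0, j)] for j in range(1, n + 1)]

# The actual entries are all keys where neither index is 0.
data = [[matrix_dict[(i, j)] for j in range(1, n + 1)]
        for i in range(1, m + 1)]

df = pd.DataFrame(data, index=rows, columns=cols)
print(df)
```

Because the labels never share a column with the numbers here, the data columns keep a numeric dtype. A missing key would raise a KeyError, so matrix_dict.get((i, j), 0) may be preferable for sparse input.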

Bonus, using the two halves of matrix_dict[(0, 0)] as the axis names:

df = (pd.Series(matrix_dict).unstack(fill_value=0)
        .reindex(index=range(m+1), columns=range(n+1), fill_value=0)
        .set_index(0)
        .pipe(lambda d: d.set_axis(d.iloc[0], axis=1).iloc[1:])
        .rename_axis(**dict(zip(('index', 'columns'),
                                matrix_dict[(0, 0)].split('\\'))))
     )

Output:

ColumnIndex Column1 Column2
RowIndex                   
Row1              1       2
Row2              3       4
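
One thing to be aware of: because the Series mixes strings (the labels) with numbers, the columns produced by the chain above should end up with object dtype. If numeric dtypes are wanted, infer_objects() is one way to recover them. A sketch, assuming all data cells are plain integers as in the example (the name df_num is only illustrative):

```python
import pandas as pd

matrix_dict = {
    (0, 0): 'RowIndex\\ColumnIndex',
    (0, 1): 'Column1',
    (0, 2): 'Column2',
    (1, 0): 'Row1',
    (1, 1): 1,
    (1, 2): 2,
    (2, 0): 'Row2',
    (2, 1): 3,
    (2, 2): 4,
}
m, n = 2, 2

df = (pd.Series(matrix_dict).unstack(fill_value=0)
        .reindex(index=range(m + 1), columns=range(n + 1), fill_value=0)
        .set_index(0)
        .pipe(lambda d: d.set_axis(d.iloc[0], axis=1).iloc[1:])
        .rename_axis(index=None, columns=None)
     )

# Let pandas re-infer a better dtype for each object column.
df_num = df.infer_objects()
print(df_num.dtypes)
```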