How to efficiently create a matrix of index and column name combinations in Python/pandas

I want to create a dataframe/matrix where each entry is the concatenation of the corresponding index and column names. I can do it with a nested for loop (see below), but is there a better/faster way?

Part of my problem is not knowing the right terminology for what I am trying to do. It seems analogous to matrix multiplication, but with string concatenation in place of multiplication. (Any comments on how to phrase this problem better are welcome too.)

I found that itertools has a product() function that does roughly what I want, but it returns an iterator of tuples rather than a matrix/dataframe.
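For reference, the output of product() can still be turned into the desired grid. The following is a minimal sketch, not part of the original question: it assumes the pairs arrive in row-major order (which is how product() iterates) and uses hypothetical names `codes`, `grid`, and `wells_from_product`.

```python
from itertools import product

import numpy as np
import pandas as pd

rows = list("ABCDEFGH")
columns = [str(x) for x in range(1, 13)]

# product() yields (row, column) pairs in row-major order:
# ("A", "1"), ("A", "2"), ..., ("H", "12")
codes = [r + c for r, c in product(rows, columns)]

# Reshape the flat list of 96 codes into an 8 x 12 grid.
grid = np.array(codes).reshape(len(rows), len(columns))
wells_from_product = pd.DataFrame(grid, index=rows, columns=columns)
```

This still loops in Python over all 96 pairs, so it is mainly a readability improvement over the nested loop rather than a vectorized solution.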

[Motivation: in biology we have plates with 96 wells for sample storage etc. They have columns 1-12 and rows A-H. Some machines return the well code (A3, G12, etc.). I want to generate the codes to map between plates and experiment outputs.]

import pandas as pd

rows = [x for x in "ABCDEFGH"]
columns = [str(x) for x in range(1, 13)]
wells = pd.DataFrame(columns=columns, index=rows)
for i in rows:
    for j in columns:
        wells.loc[i, j] = "".join([i, j])

>>> wells
    1   2   3   4   5   6   7   8   9   10   11   12
A  A1  A2  A3  A4  A5  A6  A7  A8  A9  A10  A11  A12
B  B1  B2  B3  B4  B5  B6  B7  B8  B9  B10  B11  B12
C  C1  C2  C3  C4  C5  C6  C7  C8  C9  C10  C11  C12
D  D1  D2  D3  D4  D5  D6  D7  D8  D9  D10  D11  D12
E  E1  E2  E3  E4  E5  E6  E7  E8  E9  E10  E11  E12
F  F1  F2  F3  F4  F5  F6  F7  F8  F9  F10  F11  F12
G  G1  G2  G3  G4  G5  G6  G7  G8  G9  G10  G11  G12
H  H1  H2  H3  H4  H5  H6  H7  H8  H9  H10  H11  H12

Solution:

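You can use np.char.add() together with NumPy broadcasting for this. Turn the row labels into a column vector and the column labels into a row vector; the element-wise string concatenation then fills in the whole grid: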

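import numpy as np
import pandas as pd

rows = list("ABCDEFGH")
columns = [str(x) for x in range(1, 13)]
# Column vector of shape (8, 1) and row vector of shape (1, 12) broadcast to (8, 12)
data = np.char.add(np.array(rows)[:, None], np.array(columns)[None, :])
wells = pd.DataFrame(data=data, columns=columns, index=rows)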
#     1   2   3   4   5   6   7   8   9   10   11   12
# A  A1  A2  A3  A4  A5  A6  A7  A8  A9  A10  A11  A12
# B  B1  B2  B3  B4  B5  B6  B7  B8  B9  B10  B11  B12
# C  C1  C2  C3  C4  C5  C6  C7  C8  C9  C10  C11  C12
# D  D1  D2  D3  D4  D5  D6  D7  D8  D9  D10  D11  D12
# E  E1  E2  E3  E4  E5  E6  E7  E8  E9  E10  E11  E12
# F  F1  F2  F3  F4  F5  F6  F7  F8  F9  F10  F11  F12
# G  G1  G2  G3  G4  G5  G6  G7  G8  G9  G10  G11  G12
# H  H1  H2  H3  H4  H5  H6  H7  H8  H9  H10  H11  H12
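The same broadcasting idea can be sketched without np.char by using object arrays, where `+` falls back to ordinary Python string concatenation. This is an alternative illustration, not the answer's method, and the names `row_arr`, `col_arr`, `wells_obj`, and `well_to_position` are hypothetical. The sketch also builds a reverse lookup from well code to plate position, which is one way the mapping described in the question's motivation might be done.

```python
import numpy as np
import pandas as pd

rows = list("ABCDEFGH")
columns = [str(x) for x in range(1, 13)]

# Object arrays hold ordinary Python strings, so "+" concatenates elementwise.
row_arr = np.array(rows, dtype=object)[:, None]     # shape (8, 1)
col_arr = np.array(columns, dtype=object)[None, :]  # shape (1, 12)

# Broadcasting stretches both operands to the common shape (8, 12).
grid = row_arr + col_arr
wells_obj = pd.DataFrame(grid, index=rows, columns=columns)

# Reverse lookup: well code -> (row label, column label).
well_to_position = {
    code: (r, c)
    for r, row_codes in zip(rows, grid)
    for c, code in zip(columns, row_codes)
}
```

With this in place, a code reported by a machine, such as "G12", can be resolved with `well_to_position["G12"]`, which gives the row and column labels as a tuple.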