Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Split a python pandas column using positional string in another column

I have a dataframe with the structure:

ID Split Data
1 GT:RC:BC:CN 1:4:5:3
2 GT:RC:CN 1:7:0
3 GT:BC 4:2

I would like to create n new columns and populate with the data in the Data column, where n is the total number of unique fields split by a colon in the Split column (in this case, this would be 4 new columns: GT, RC, BC, CN). The new columns should be populated with the corresponding data in the Data column, so for ID 3, only column GT and BC should be populated. I have tried using string splitting, but that doesn’t take into account the correct column to move the data to.

The output should look like this:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

ID Split Data GT RC BC CN
1 GT:RC:BC:CN 1:4:5:3 1 4 5 3
2 GT:RC:CN 1:7:0 1 7 0
3 GT:BC 4:2 4 2

Thanks in advance!

>Solution :

You can use:

out = df.join(pd.concat([pd.Series(d.split(':'), index=s.split(':'))
                         for s,d in zip(df['Split'], df['Data'])], axis=1).T)

output:

   ID        Split     Data GT   RC   BC   CN
0   1  GT:RC:BC:CN  1:4:5:3  1    4    5    3
1   2     GT:RC:CN    1:7:0  1    7  NaN    0
2   3        GT:BC      4:2  4  NaN    2  NaN
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading