Combine different columns into a new column in a dataframe using pandas

Below is a small sample of a much larger dataframe.

import pandas as pd
import numpy as np

NaN = np.nan

data = {'Start_x':['Tom', NaN, NaN, NaN,NaN],
    'Start_y':[NaN, 'Nick', NaN, NaN, NaN],
    'Start_z':[NaN, NaN, 'Alison', NaN, NaN],
    'Start_a':[NaN, NaN, NaN, 'Mark',NaN],
    'Start_b':[NaN, NaN, NaN, NaN, 'Oliver'],
    'Sex': ['Male','Male','Female','Male','Male']}

df = pd.DataFrame(data)
df

I want to merge the five Start columns into a single new column, so that each row holds its one non-null name, while the 'Sex' column stays as it is.

Any help is greatly appreciated. Thank you!

Solution:

One option is to backfill the Start columns along each row (axis=1), which pulls the first non-null value in the row into the leftmost Start column, and then take that first column:

df['New_Column'] = df.filter(like='Start').bfill(axis=1).iloc[:, 0]

df
  Start_x Start_y Start_z Start_a Start_b     Sex New_Column
0     Tom     NaN     NaN     NaN     NaN    Male        Tom
1     NaN    Nick     NaN     NaN     NaN    Male       Nick
2     NaN     NaN  Alison     NaN     NaN  Female     Alison
3     NaN     NaN     NaN    Mark     NaN    Male       Mark
4     NaN     NaN     NaN     NaN  Oliver    Male     Oliver
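
The same idea can be wrapped in a small helper to make the edge cases visible. This is only a sketch: the name coalesce_columns and the shortened sample frame are made up for illustration and are not part of the original answer. It shows what happens when a row has more than one non-null Start value (the leftmost one wins) or none at all (the result stays NaN).

```python
import numpy as np
import pandas as pd


def coalesce_columns(frame: pd.DataFrame, prefix: str) -> pd.Series:
    """Return, per row, the first non-null value among the selected columns.

    Hypothetical helper for illustration. Note that filter(like=...) keeps
    every column whose name *contains* `prefix`, not only those starting with it.
    """
    subset = frame.filter(like=prefix)
    # Backfill along each row, then read the leftmost column.
    return subset.bfill(axis=1).iloc[:, 0]


# Assumed sample data: row 2 has two names, row 3 has none.
sample = pd.DataFrame({
    'Start_x': ['Tom', np.nan, 'Ann', np.nan],
    'Start_y': [np.nan, 'Nick', 'Bob', np.nan],
    'Sex': ['Male', 'Male', 'Female', 'Male'],
})

sample['New_Column'] = coalesce_columns(sample, 'Start')
print(sample)
```

If other column names in the real dataframe happen to contain 'Start', a stricter selection such as df.filter(regex='^Start_') may be safer.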