Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Create 2534 new columns in a dataframe based on the median values in the existing 2534 columns

I have a dataframe (df) similar to the one mentioned below:

PATID Gene1 Gene2 Gene3
1001 5.899 67.87 9.87
1002 6.899 60.87 10.87
1003 7.899 63.87 8.87

I need a dataframe as mentioned below by looping through the code

Gene1_med = df['Gene1'].map(lambda exp: 'low' if exp <= df['Gene1'].median() else 'high')

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

PATID Gene1 Gene1_med Gene2 Gene2_med Gene3 Gene3_med
1001 5.899 low 67.87 high 9.87 high
1002 6.899 low 60.87 low 10.87 high
1003 7.899 high 63.87 low 8.87 low

I am new to python and any help is highly appreciated. Thanks in advance!

>Solution :

Use numpy.where with selecting columns of Genes, compare by DataFrame.le and assign changed columns names, last sorting columns:

cols = ['Gene1','Gene2','Gene3']

#columns PATID to index
df = df.set_index('PATID')

df[pd.Index(cols) + '_med'] = np.where(df[cols].le(df[cols].median()), 'low','high')

df1 = df.sort_index(axis=1).reset_index()

Or seelct all columns with substring Gene by DataFrame.filter, last sorting all columns without first and append to first column:

df1 = df.filter(like='Gene')

df[df1.columns + '_med'] = np.where(df1.le(df1.median()), 'low','high')

df1 = df.iloc[:, :1].join(df.iloc[:, 1:].sort_index(axis=1))

Or use list for columns names:

cols = ['Gene1','Gene2','Gene3']

df[pd.Index(cols) + '_med'] = np.where(df[cols].le(df[cols].median()), 'low','high')

df1 = df.iloc[:, :1].join(df.iloc[:, 1:].sort_index(axis=1))

print (df1)

   PATID  Gene1 Gene1_med  Gene2 Gene2_med  Gene3 Gene3_med
0   1001  5.899       low  67.87      high   9.87       low
1   1002  6.899       low  60.87       low  10.87      high
2   1003  7.899      high  63.87       low   8.87       low
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading