Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

python code to extract a record from a data frame from excel based on condition and create and input as column value

I am trying to extract a record from dataframe basing on condition and make it as a column. I have tried to work on below code .

import pandas as pd
df = pd.read_excel("C:/Users/0403 - Copy (3).xlsx")

def add_action(row): 
    if row["FILE TYPE"] == 'NaN': 
        return row['Row Labels']   

df = df.assign(Label_header=df.apply(add_action, axis=1)) 

Input dataframe

       |FILE TYPE |Row Labels               |TOTAL COUNT|    

0      | NaN      | CHANGE                  | 668   |
1      | 87I      | ei.cfc.h87.n5.20211     | 98    |   
2      | 87P      | ep.cfc.m87.n5.2023      | 570   |   
3      | NaN      | CHN                     | 5642  |   
4      | 87P      | NMMCMS.AC.2021_1_R.txt  | 1     |   
5      | 87P      | NS.AC.201_1_R.txt       | 1     |

Expected output dataframe

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

       |FILE TYPE |Row Labels               |TOTAL COUNT|    Label_header |

0      | 87I      | ei.cfc.h87.n5.20211     | 98    |        CHANGE
1      | 87P      | ep.cfc.m87.n5.2023      | 570   |        CHANGE
2      | 87P      | NMMCMS.AC.2021_1_R.txt  | 1     |        CHN
3      | 87P      | NS.AC.201_1_R.txt       | 1     |        CHN

>Solution :

Use Series.isna for test missing values with Series.where and Series.ffill for repeat values of Row Labels by mask, last use boolean indexing with inverted mask by ~:

m = df["FILE TYPE"].isna()

df['Label_header'] = df['Row Labels'].where(m).ffill()

out = df[~m].reset_index(drop=True)
print (out)
  FILE TYPE              Row Labels  TOTAL COUNT Label_header
0       87I     ei.cfc.h87.n5.20211           98       CHANGE
1       87P      ep.cfc.m87.n5.2023          570       CHANGE
2       87P  NMMCMS.AC.2021_1_R.txt            1          CHN
3       87P       NS.AC.201_1_R.txt            1          CHN

m = df["FILE TYPE"].isna()

out = df.assign(Label_header=df['Row Labels'].where(m).ffill())[~m].reset_index(drop=True)
print (out)
  FILE TYPE              Row Labels  TOTAL COUNT Label_header
0       87I     ei.cfc.h87.n5.20211           98       CHANGE
1       87P      ep.cfc.m87.n5.2023          570       CHANGE
2       87P  NMMCMS.AC.2021_1_R.txt            1          CHN
3       87P       NS.AC.201_1_R.txt            1          CHN
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading