Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

An apply function in pandas to create a column based on conditions in a column

I’m working up some data where I’ve looked at data and created a flag column because I want to provide a column (I’ve called it result although this is my aim) where it is empty top_tier is empty or returns the parent region.

I’ve tried ffill and apply but I can’t get the result.

Update: My original thinking was to use an apply function that if top_tier was null to return the region above that row and then have a ffill, after removing all the top_tier rows.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

region           admissions_total   year    top_tier    result
England          292404             2014    TRUE    
North East       17409              2014    TRUE    
North East       17409              2014    TRUE    
County Durham    2240               2014                North East
Darlington       283                2014                North East
North West       41358              2014    TRUE    
North West       41358              2014    TRUE    
Ashton, Leigh & Wigan   1943    2014                    North West
Blackburn With Darwen Care Trust Plus   865 2014        North West
East Midlands    19256              2014    TRUE    
East Midlands    19256              2014    TRUE    
Bassetlaw        428                2014                East Midlands
Derby            2524               2014                East Midlands

>Solution :

Example Code

import pandas as pd
data1 = {'region': ['England', 'North East', 'North East', 'County Durham', 'Darlington', 'North West', 
                    'North West', 'Ashton', 'Blackburn', 
                    'East Midlands', 'East Midlands', 'Bassetlaw', 'Derby'], 
         'admissions_total': ['292404', '17409', '17409', '2240', '283', '41358', '41358', '1943', 
                              '865', '19256', '19256', '428', '2524'], 
         'year': ['2014', '2014', '2014', '2014', '2014', '2014', '2014', '2014', '2015', '2014', 
                  '2014', '2014', '2014'], 
         'top_tier': ['TRUE', 'TRUE', 'TRUE', None, None, 'TRUE', 'TRUE', None, None, 'TRUE', 
                      'TRUE', None, None]}
df = pd.DataFrame(data1)

df

    region          admissions_total    year    top_tier
0   England         292404              2014    TRUE
1   North East      17409               2014    TRUE
2   North East      17409               2014    TRUE
3   County Durham   2240                2014    None
4   Darlington      283                 2014    None
5   North West      41358               2014    TRUE
6   North West      41358               2014    TRUE
7   Ashton          1943                2014    None
8   Blackburn       865                 2015    None
9   East Midlands   19256               2014    TRUE
10  East Midlands   19256               2014    TRUE
11  Bassetlaw       428                 2014    None
12  Derby           2524                2014    None

Code

use where and ffill

cond = df['top_tier'].eq('TRUE')
df.assign(result=df['region'].where(cond).ffill().where(~cond))

result:

    region          admissions_total    year    top_tier    result
0   England         292404              2014    TRUE        NaN
1   North East      17409               2014    TRUE        NaN
2   North East      17409               2014    TRUE        NaN
3   County Durham   2240                2014    None        North East
4   Darlington      283                 2014    None        North East
5   North West      41358               2014    TRUE        NaN
6   North West      41358               2014    TRUE        NaN
7   Ashton          1943                2014    None        North West
8   Blackburn       865                 2015    None        North West
9   East Midlands   19256               2014    TRUE        NaN
10  East Midlands   19256               2014    TRUE        NaN
11  Bassetlaw       428                 2014    None        East Midlands
12  Derby           2524                2014    None        East Midlands
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading