How to rename a column based on a condition in Polars (Python)?

I am trying to rename a column based on a condition in Polars (Python), but I am getting errors.

Data:

import polars as pl

test_df = pl.DataFrame({'Id': [100118647578,
  100023274028,100023274028,100023274028,100118647578,
  100118647578,100118647578,100023274028,100023274028,
  100023274028,100118647578,100118647578,100023274028,
  100118647578,100118647578,100118647578,100118647578,
  100118647578,100118647578,100023274028,100118647578,
  100118647578,100118647578,100118647578,100023274028,
  100118647578,100118647578,100118647578,100023274028,
  100118647578,100118647578,100023274028],

 'Age': [49,22,25,18,41,45,42,30,28,
  20,44,56,26,53,40,35,29,
  8,55,23,54,36,52,33,29,
  10,34,39,27,51,19,31],

 'Status': [2,1,1,1,1,1,1,3,2,1,1,
  1,2,1,1,1,1,1,1,2,1,1,1,1,2,1,1,
  1,1,1,1,4]})

The code below filters the data based on the value passed as an argument and should rename the column on the same basis:

def Age_filter(status_filter_value = 1):
    return (
        test_df
        .filter(pl.col('Status') == status_filter_value)
        .sort(['Id','Age'])
        .groupby('Id')
        .agg( pl.col('Age').first())
        .sort('Id')

        # below part of code is giving error
        .rename({'Age' : pl.when(status_filter_value == 1)
                            .then('30_DPD_MOB')
                            .otherwise(pl.when(status_filter_value == 2)
                                       .then('60_DPD_MOB')
                                       .otherwise(pl.when(status_filter_value == 3)
                                                  .then('90_DPD_MOB')
                                                  .otherwise('120_DPD_MOB')
                                                  )
                                        )
                })
    )

Age_filter()

This gives an error: TypeError: argument 'new': 'Expr' object cannot be converted to 'PyString'

I have also tried the code below, but it does not work either:

def Age_filter1(status_filter_value = 1):
    {
    renamed_value = pl.when(status_filter_value == 1)
                            .then('30')
                            .otherwise(pl.when(status_filter_value == 2)
                                       .then('60')
                                       .otherwise(pl.when(status_filter_value == 3)
                                                  .then('90')
                                                  .otherwise('120')
                                                  )
                                        )


    return (
        test_df
        .filter(pl.col('Status') == status_filter_value)
        .sort(['Id','Age'])
        .groupby('Id')
        .agg( pl.col('Age').first())
        .sort('Id')
        .rename({'Age' : renamed_value
                })
    )
    }

Age_filter1()

Solution:

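As the error states, the rename method takes a dict mapping strings to strings only, so the new name has to be a plain Python string rather than a Polars expression. No complicated expressions are needed. In fact, pl.when expects an expression as its condition, not a static Python boolean such as status_filter_value == 1.

For your case, you can build the new name programmatically with an f-string: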
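
Because status_filter_value is an ordinary Python integer, the choice of name can be made in plain Python before Polars is involved. A minimal sketch, assuming a hypothetical helper dpd_column_name and lookup table DPD_NAMES (neither is part of Polars or of the question), that reuses the labels from the question's when/then chain:

```python
# Hypothetical lookup table: status value -> new column name.
DPD_NAMES = {1: "30_DPD_MOB", 2: "60_DPD_MOB", 3: "90_DPD_MOB"}


def dpd_column_name(status_filter_value):
    """Return the new column name as a plain Python string."""
    # Any status other than 1, 2 or 3 falls back to the 120-day label,
    # like the final .otherwise(...) in the question.
    return DPD_NAMES.get(status_filter_value, "120_DPD_MOB")


# The resulting dict is str -> str, which is what rename() accepts.
rename_map = {"Age": dpd_column_name(2)}
print(rename_map)  # {'Age': '60_DPD_MOB'}
```

The dict built this way can be passed straight to .rename(...) at the end of the query.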

.rename({'Age' : f'{status_filter_value*30}_DPD_MOB'})