Define data type in creating custom function

I have the custom function below:

import pandas as pd
def MyFn(DF : pd.DataFrame) -> float :
  return DF['Col_A'].values[1] - DF['Col_B'].values[1]

However, I want to force the user to supply a DataFrame with exactly two columns, named 'Col_A' and 'Col_B'.

Any insight into how I can do this would be much appreciated.

>Solution :

You should be able to use assert and DF.columns.tolist() == ['Col_A', 'Col_B']:

def MyFn(DF : pd.DataFrame) -> float :
    assert DF.columns.tolist() == ['Col_A', 'Col_B'], 'DF must have two columns named "Col_A" and "Col_B"'
    return DF['Col_A'].values[1] - DF['Col_B'].values[1]

MyFn(pd.DataFrame([[1, 2], [3, 5]], columns=['Col_A', 'Col_B']))
# -2

MyFn(pd.DataFrame([[1, 2], [3, 5]], columns=['Col_A', 'Col_C']))
# AssertionError: DF must have two columns named "Col_A" and "Col_B"

MyFn(pd.DataFrame([[1, 2, 3], [4, 6, 7]], columns=['Col_A', 'Col_B', 'Col_C']))
# AssertionError: DF must have two columns named "Col_A" and "Col_B"

Note that this would require exactly two columns, "Col_A" and "Col_B", in this particular order.
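If the column order should not matter, one possible variation is to compare the column names as a set. The sketch below also raises a ValueError instead of using assert, since assert statements are skipped when Python runs with the -O flag. The function name my_fn_any_order and the constant REQUIRED_COLUMNS are hypothetical, chosen only for illustration:

```python
import pandas as pd

# Hypothetical constant for illustration; not part of the original question
REQUIRED_COLUMNS = {"Col_A", "Col_B"}


def my_fn_any_order(df: pd.DataFrame) -> float:
    # Compare as a set so the order of the columns does not matter;
    # the length check rejects duplicated column names
    if len(df.columns) != 2 or set(df.columns) != REQUIRED_COLUMNS:
        raise ValueError('DF must have exactly two columns named "Col_A" and "Col_B"')
    # Second row of Col_A minus second row of Col_B, as in the original function
    return float(df["Col_A"].values[1] - df["Col_B"].values[1])
```

With this version, a DataFrame whose columns are ['Col_B', 'Col_A'] is accepted, while one with a third column is still rejected.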

A more flexible option is to use try/except, which also lets you handle another potential error separately: passing a DataFrame with fewer than two rows triggers an IndexError:

def MyFn(DF : pd.DataFrame) -> float :
    try:
        return DF['Col_A'].values[1] - DF['Col_B'].values[1]
    except KeyError as e:
        raise Exception('DF must contain "Col_A" and "Col_B"') from e
    except IndexError as e:
        raise Exception('DF must have at least two rows') from e

MyFn(pd.DataFrame([[1, 2], [3, 5]], columns=['Col_A', 'Col_B']))
# -2

MyFn(pd.DataFrame([[1, 2], [3, 5]], columns=['Col_A', 'Col_C']))
# Exception: DF must contain "Col_A" and "Col_B"

MyFn(pd.DataFrame([[1, 2, 3], [4, 6, 7]], columns=['Col_A', 'Col_B', 'Col_C']))
# -2

MyFn(pd.DataFrame([[1, 2]], columns=['Col_A', 'Col_B']))
# Exception: DF must have at least two rows
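
As an alternative to catching the errors after the fact, the same two conditions can be checked up front, which makes it possible to report exactly which columns are missing. This is only a sketch under the same assumptions as the answer above (extra columns are allowed, at least two rows are needed); the name my_fn_checked is hypothetical:

```python
import pandas as pd


def my_fn_checked(df: pd.DataFrame) -> float:
    # Collect the required columns that are absent from the DataFrame
    missing = [col for col in ("Col_A", "Col_B") if col not in df.columns]
    if missing:
        raise ValueError(f"DF is missing required column(s): {missing}")
    # The calculation reads the second row, so at least two rows are needed
    if len(df) < 2:
        raise ValueError("DF must have at least two rows")
    return float(df["Col_A"].iloc[1] - df["Col_B"].iloc[1])
```

Raising ValueError rather than a bare Exception lets callers catch invalid input specifically without also swallowing unrelated errors.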