I have the following custom function:
import pandas as pd
def MyFn(DF: pd.DataFrame) -> float:
    return DF['Col_A'].values[1] - DF['Col_B'].values[1]
However, I want to force the user to supply a DataFrame with exactly two columns, named 'Col_A' and 'Col_B'.
Any insight into how I can do this would be much appreciated.
>Solution :
You should be able to use assert to check that DF.columns.tolist() == ['Col_A', 'Col_B']:
def MyFn(DF: pd.DataFrame) -> float:
    # Fail early unless the columns are exactly Col_A, Col_B (in that order)
    assert DF.columns.tolist() == ['Col_A', 'Col_B'], 'DF must have two columns named "Col_A" and "Col_B"'
    return DF['Col_A'].values[1] - DF['Col_B'].values[1]
MyFn(pd.DataFrame([[1, 2], [3, 5]], columns=['Col_A', 'Col_B']))
# -2
MyFn(pd.DataFrame([[1, 2], [3, 5]], columns=['Col_A', 'Col_C']))
# AssertionError: DF must have two columns named "Col_A" and "Col_B"
MyFn(pd.DataFrame([[1, 2, 3], [4, 6, 7]], columns=['Col_A', 'Col_B', 'Col_C']))
# AssertionError: DF must have two columns named "Col_A" and "Col_B"
Note that this requires exactly two columns, "Col_A" and "Col_B", in this particular order.
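If the column order should not matter, one possible variant compares the column names as a set. It also raises a ValueError rather than relying on assert, since assert statements are skipped when Python runs with the -O flag. This is only a sketch: the names my_fn_checked and REQUIRED_COLUMNS are hypothetical and not part of the original code.

```python
import pandas as pd

# Assumed requirement: exactly these two column names, in any order
REQUIRED_COLUMNS = {"Col_A", "Col_B"}


def my_fn_checked(df: pd.DataFrame) -> float:
    # The length check rejects extra or duplicated columns;
    # the set comparison ignores column order.
    if len(df.columns) != 2 or set(df.columns) != REQUIRED_COLUMNS:
        raise ValueError('DF must have exactly two columns named "Col_A" and "Col_B"')
    # Same computation as MyFn: second row of Col_A minus second row of Col_B
    return float(df["Col_A"].values[1] - df["Col_B"].values[1])
```

With this version, a DataFrame whose columns are ['Col_B', 'Col_A'] is accepted, while one with a third column is still rejected.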
A more flexible option is to use try/except, which also lets you handle another potential error independently: passing a DataFrame with fewer than two rows triggers an IndexError:
def MyFn(DF: pd.DataFrame) -> float:
    try:
        return DF['Col_A'].values[1] - DF['Col_B'].values[1]
    except KeyError as e:
        # Raised when either column is missing
        raise Exception('DF must contain "Col_A" and "Col_B"') from e
    except IndexError as e:
        # Raised when there is no second row
        raise Exception('DF must have at least two rows') from e
MyFn(pd.DataFrame([[1, 2], [3, 5]], columns=['Col_A', 'Col_B']))
# -2
MyFn(pd.DataFrame([[1, 2], [3, 5]], columns=['Col_A', 'Col_C']))
# Exception: DF must contain "Col_A" and "Col_B"
MyFn(pd.DataFrame([[1, 2, 3], [4, 6, 7]], columns=['Col_A', 'Col_B', 'Col_C']))
# -2
MyFn(pd.DataFrame([[1, 2]], columns=['Col_A', 'Col_B']))
# Exception: DF must have at least two rows
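The same conditions could also be checked up front instead of being caught afterwards, which makes it possible to report exactly which columns are missing. The following is a minimal sketch under that assumption; my_fn_validated is a hypothetical name, and ValueError is used here in place of the bare Exception above.

```python
import pandas as pd


def my_fn_validated(df: pd.DataFrame) -> float:
    # Collect the required columns that are absent (extra columns are allowed)
    missing = [col for col in ("Col_A", "Col_B") if col not in df.columns]
    if missing:
        raise ValueError(f"DF is missing required column(s): {missing}")
    # The computation reads the second row, so at least two rows are needed
    if len(df) < 2:
        raise ValueError("DF must have at least two rows")
    return float(df["Col_A"].values[1] - df["Col_B"].values[1])
```

Raising a specific exception type such as ValueError lets callers catch input problems without also swallowing unrelated errors.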