Could someone please tell me how to apply a function with 2 parameters to a DataFrame? I have tried a lot of solutions, but none have worked. Here is my code:
import pandas as pd
df=pd.DataFrame({'tran_amt_lcy':[40,500,60],'tran_amt_usd':[30,40,50],'client_id':['2001','2033','2045']})
df.dtypes
def test_func(col1,col2):
    if col1>30 & col2<500:
        tran_status='approved'
    else:
        tran_status='declined'
    return tran_status
df['tran_stat']=df.apply(lambda x:test_func(df['tran_amt_usd'],df['tran_amt_lcy']),axis=1)
An error message still pops up: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I don't know why it still fails. Can anyone suggest a way to make this work?
Thanks a lot.
>Solution :
For a binary condition, you can use numpy.where:
import numpy as np
# Boolean mask
m = (df['tran_amt_usd'] > 30) & (df['tran_amt_lcy'] < 500)
df['tran_stat'] = np.where(m, 'approved', 'declined')
print(df)
# Output
   tran_amt_lcy  tran_amt_usd  client_id  tran_stat
0            40            30       2001   declined
1           500            40       2033   declined
2            60            50       2045   approved
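There are many posts that explain this error. Comparing a Series to a scalar works element-wise and returns a Series of booleans, and an if statement cannot reduce that to a single True or False. Your lambda passes the whole columns (df['tran_amt_usd'], df['tran_amt_lcy']) instead of the values of the current row, so the condition inside test_func is evaluated on entire columns. Note also that & binds more tightly than > and <, so each comparison needs its own parentheses. In your case, you try to evaluate (column values shown as lists for illustration):
([30, 40, 50] > 30) & ([40, 500, 60] < 500)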
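As a minimal sketch of where the error comes from (the names usd, lcy, mask and raised are only illustrative), the following builds the same element-wise mask and then forces it into a single truth value, which is what an if statement does behind the scenes:

```python
import pandas as pd

# Same values as the two columns in the question
usd = pd.Series([30, 40, 50])
lcy = pd.Series([40, 500, 60])

# Element-wise comparison: the result is a boolean Series, not one bool.
# The parentheses matter because & binds more tightly than > and <.
mask = (usd > 30) & (lcy < 500)
print(mask.tolist())

# Forcing a multi-element Series into a single truth value raises ValueError
try:
    bool(mask)
    raised = False
except ValueError as err:
    raised = True
    print(err)
```

The mask itself is fine to pass to np.where or to use for indexing; it only fails when something needs a single True or False.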
Update
The same thing wrapped in a def function:
def test_func(col1, col2):
    m = (col1 > 30) & (col2 < 500)
    return np.where(m, 'approved', 'declined')
# You don't need apply here
df['tran_stat'] = test_func(df['tran_amt_usd'], df['tran_amt_lcy'])
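If you do want to keep a row-by-row apply with a two-parameter function, a possible sketch is shown here. It assumes the same DataFrame as in the question; row_status is a hypothetical name, and the key changes are reading the values from the row passed to the lambda and using the plain and operator on scalars. This is typically slower than the vectorized version on large frames.

```python
import pandas as pd

df = pd.DataFrame({
    'tran_amt_lcy': [40, 500, 60],
    'tran_amt_usd': [30, 40, 50],
    'client_id': ['2001', '2033', '2045'],
})

def row_status(usd, lcy):
    # usd and lcy are scalars here, so a plain "and" is fine
    if usd > 30 and lcy < 500:
        return 'approved'
    return 'declined'

# With axis=1 the lambda receives one row at a time;
# index into that row, not into the whole DataFrame
df['tran_stat'] = df.apply(
    lambda row: row_status(row['tran_amt_usd'], row['tran_amt_lcy']),
    axis=1,
)
print(df)
```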