Apply function to all rows in pandas dataframe (lambda)

I have the following function, which returns the column name of the last non-zero value in a given row:

import pandas as pd

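def myfunc(X, Y):
    # X is the dataframe, Y is the position of the row to inspect
    row = X.iloc[Y]
    counter = len(row) - 1
    # Walk backwards from the last column until a non-zero value is found
    while counter >= 0:
        # .iloc makes the positional lookup explicit; plain row[counter]
        # relies on a deprecated fallback when the labels are strings
        if row.iloc[counter] == 0:
            counter -= 1
        else:
            break
    return X.columns[counter]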

With the following example data:

data = {'id':  ['1', '2', '3', '4', '5', '6'],
        'name': ['AAA', 'BBB', 'CCC', 'DDD', 'EEE', 'GGG'],
        'A1': [1, 1, 1, 0, 1, 1],
        'B1': [0, 0, 1, 0, 0, 1],
        'C1': [1, 0, 1, 1, 0, 0],
        'A2': [1, 0, 1, 0, 1, 0]}

df = pd.DataFrame(data)
df

myfunc(df, 5) # 'B1'

I would like to know how I can apply this function to all rows of a dataframe and put the results into a new column of df.

I am thinking about looping over all rows (which is probably not a good approach) or using a lambda with the apply function. However, I have not succeeded with the latter approach. Any help?

>Solution :

I’ve modified your function a little bit to work across rows:

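def myfunc(row):
    counter = len(row) - 1
    # Walk backwards from the last column until a non-zero value is found
    while counter >= 0:
        # Use .iloc for positional access; row[counter] on a string-labelled
        # Series relies on a deprecated positional fallback
        if row.iloc[counter] == 0:
            counter -= 1
        else:
            break
    return row.index[counter]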

Now pass the function to df.apply with axis=1 so that it is called once for each row of the dataframe:

>>> df.apply(myfunc, axis=1)
0    A2
1    A1
2    A2
3    C1
4    A2
5    B1
dtype: object
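
To put the result into a new column, as the question asks, the Series returned by apply can be assigned directly. The following is a minimal sketch, not the only way to do it: the column name last_nonzero and the helper last_nonzero_column are made up for illustration, and restricting the search to the four value columns is an assumption about what is wanted.

```python
import pandas as pd

data = {'id':  ['1', '2', '3', '4', '5', '6'],
        'name': ['AAA', 'BBB', 'CCC', 'DDD', 'EEE', 'GGG'],
        'A1': [1, 1, 1, 0, 1, 1],
        'B1': [0, 0, 1, 0, 0, 1],
        'C1': [1, 0, 1, 1, 0, 0],
        'A2': [1, 0, 1, 0, 1, 0]}
df = pd.DataFrame(data)

# Hypothetical helper: same idea as myfunc, written as a loop over labels
def last_nonzero_column(row):
    # Scan the labels from right to left and return the first non-zero one
    for label in row.index[::-1]:
        if row[label] != 0:
            return label
    return None  # the row contains only zeros

# Assumption: only these columns should be searched, not 'id' and 'name'
value_cols = ['A1', 'B1', 'C1', 'A2']

# apply(..., axis=1) returns a Series aligned with df's index,
# so it can be stored as a new column ('last_nonzero' is an arbitrary name)
df['last_nonzero'] = df[value_cols].apply(last_nonzero_column, axis=1)
```

Because the assignment aligns on the index, each row of df receives its own result even if the dataframe has been filtered or reordered beforehand.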

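However, you can drop the custom function and get the same result more concisely, and typically faster, with vectorized operations. The expression below takes a running sum along each row and returns the first column where that sum reaches its maximum, so it assumes the value columns (everything after the first two) contain no negative numbers: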

>>> df[df.columns[2:]].T.cumsum().idxmax()
0    A2
1    A1
2    A2
3    C1
4    A2
5    B1
dtype: object
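
The cumsum trick works because idxmax returns the first label at which the maximum occurs, and the running sum stops growing after the last non-zero entry. If the data might contain negative values or all-zero rows, one possible variant is sketched below. It is an alternative technique rather than the expression above: it flags the non-zero cells, reverses the column order, and takes the first flagged column. The names value_cols and last_nonzero are assumptions chosen for illustration.

```python
import pandas as pd

data = {'id':  ['1', '2', '3', '4', '5', '6'],
        'name': ['AAA', 'BBB', 'CCC', 'DDD', 'EEE', 'GGG'],
        'A1': [1, 1, 1, 0, 1, 1],
        'B1': [0, 0, 1, 0, 0, 1],
        'C1': [1, 0, 1, 1, 0, 0],
        'A2': [1, 0, 1, 0, 1, 0]}
df = pd.DataFrame(data)

# Assumption: these are the columns to search
value_cols = ['A1', 'B1', 'C1', 'A2']

# Boolean frame: True wherever the value is non-zero (sign does not matter)
nonzero = df[value_cols].ne(0)

# Reverse the column order so that the first True in each row corresponds
# to the last non-zero column in the original order
last = nonzero.iloc[:, ::-1].idxmax(axis=1)

# Rows with no non-zero value would otherwise report a misleading label,
# so blank them out (they become NaN)
last = last.where(nonzero.any(axis=1))

df['last_nonzero'] = last
```

On the example data this should agree with the cumsum version, since every row there has at least one non-zero value and none are negative.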
