Apply function to several rows and columns of a pandas dataframe using .loc

March 24, 2022

I have a pandas dataframe that has incorrect values at certain rows and columns. An example dataframe would be something like this:

import pandas as pd

dict1 = {'Total Points': [10, 20, 30, 40, 50],
         'A pts': [8, 'X', 20, 30, 40],
         'A pts percentage': [0.80, 0.75, 0.67, 0.75, 0.80],
         'B pts': [2, 5, 'X', 'X', 10],
         'B pts percentage': [0.20, 0.25, 0.33, 0.25, 0.20]
        }

df = pd.DataFrame(dict1)
cols = ["A pts", "B pts"]
pct_cols = [2, 4]

bad_rows = df.index[df[cols].isin(['X']).any(axis=1)]

I can calculate which rows have the value I want to remove. What I want to do is calculate the actual points scored by multiplying the next column (the percentage) by the total points scored for that row. Ideally, I would do this only for the cells that hold the bad value. I know I can use

df.loc[bad_rows, cols] = value

to set all values at those rows and columns to a particular value, but I don’t know how to apply different values. I tried doing

df.loc[bad_rows, cols] = df.iloc[bad_rows, pct_cols] * df.loc[bad_rows, "Total Points"]

but that doesn’t work. Any input would be greatly appreciated.
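A likely reason the attempt above fails is label alignment: multiplying a DataFrame by a Series with `*` matches the Series index against the DataFrame's column labels, not its rows. The sketch below rebuilds only the numeric columns of the example (the variable names `pct`, `totals`, `misaligned` and `aligned` are made up for illustration) and compares `*` with `mul(..., axis=0)`.

```python
import pandas as pd

# Numeric columns from the example; the 'X' columns are not needed here.
df = pd.DataFrame({
    "Total Points": [10, 20, 30, 40, 50],
    "A pts percentage": [0.80, 0.75, 0.67, 0.75, 0.80],
    "B pts percentage": [0.20, 0.25, 0.33, 0.25, 0.20],
})
bad_rows = pd.Index([1, 2, 3])  # the rows flagged in the question
pct_cols = ["A pts percentage", "B pts percentage"]

pct = df.loc[bad_rows, pct_cols]
totals = df.loc[bad_rows, "Total Points"]

# `*` matches the Series index (1, 2, 3) against the column labels.
# Nothing matches, so every cell of the result is NaN.
misaligned = pct * totals

# axis=0 matches the Series index against the row labels instead.
aligned = pct.mul(totals, axis=0)

print(misaligned.isna().all().all())
print(aligned)
```

Depending on the pandas version, the first multiplication may also emit a warning about sorting mixed label types.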

Edit: I know I could simply apply a function to the entire dataframe for certain columns, but I’m trying to avoid doing that if possible, particularly because the dataset I’m working with is quite large and that would be incredibly inefficient. I know I could simply use for loops to iterate through everything and set each value one at a time, but that also seems inefficient.

Solution:

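Use mul with axis=0 so that the totals align on the row index rather than on the column labels. The trailing .values drops the labels, so the result is assigned by position. Note that this overwrites both points columns in every bad row, including cells that already held a valid number: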

df.loc[bad_rows, cols] = df.iloc[bad_rows, pct_cols].mul(df.loc[bad_rows, 'Total Points'], axis=0).values
print(df)

# Output
   Total Points A pts  A pts percentage B pts  B pts percentage
0            10     8              0.80     2              0.20
1            20  15.0              0.75   5.0              0.25
2            30  20.1              0.67   9.9              0.33
3            40  30.0              0.75  10.0              0.25
4            50    40              0.80    10              0.20
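The assignment above recomputes both points columns in every bad row, so row 2's valid `20` becomes `20.1` in the output. If only the `'X'` cells should change, one option is to handle each points column with its own boolean mask. This is a minimal sketch rather than part of the original answer. It assumes each points column is paired with a percentage column by name (here `pct_cols` holds labels, not positions), and it loops over columns, not rows.

```python
import pandas as pd

df = pd.DataFrame({
    "Total Points": [10, 20, 30, 40, 50],
    "A pts": [8, "X", 20, 30, 40],
    "A pts percentage": [0.80, 0.75, 0.67, 0.75, 0.80],
    "B pts": [2, 5, "X", "X", 10],
    "B pts percentage": [0.20, 0.25, 0.33, 0.25, 0.20],
})
cols = ["A pts", "B pts"]
pct_cols = ["A pts percentage", "B pts percentage"]  # assumed name pairing

for pts_col, pct_col in zip(cols, pct_cols):
    bad = df[pts_col] == "X"  # boolean mask for this column only
    # Both operands are Series restricted to the same rows, so the
    # product and the assignment align on the row index.
    df.loc[bad, pts_col] = df.loc[bad, pct_col] * df.loc[bad, "Total Points"]

# Optional: the columns are still object dtype, so convert them to numeric.
df[cols] = df[cols].apply(pd.to_numeric)
print(df)
```

Each pass only computes values for the rows where that column holds `'X'`, which keeps the work proportional to the number of bad cells.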