pandas series name conflict with column name

I was working through the pandas problems on LeetCode and hit an error on this question: "int not subscriptable". I found out that the cause is a conflict between the column name and the Series name, so I can't access the column properly. Is there any way to avoid this kind of problem?

import pandas as pd

data = [[2, 'Meir', 3000], [3, 'Michael', 3800], [7, 'Addilyn', 7400], [8, 'Juan', 6100], [9, 'Kannon', 7700]]
employees = pd.DataFrame(data, columns=['employee_id', 'name', 'salary']).astype({'employee_id':'int64', 'name':'object', 'salary':'int64'})
def get_bonus(row):
    # this is the line that raises the error
    if (row.name[0] == "M") and (row.employee_id %2 == 0):
        return row.salary
    else:
        return 0

def calculate_special_bonus(employees: pd.DataFrame) -> pd.DataFrame:
    employees["bonus"] = employees.apply(get_bonus, axis="columns")
    return employees[["employee_id", "bonus"]]

calculate_special_bonus(employees)

employee_id       2
name           Meir
salary         3000
bonus          None
Name: 0, dtype: object
employee_id          3
name           Michael
salary            3800
bonus             None
Name: 1, dtype: object
employee_id          7
name           Addilyn
salary            7400
bonus             None
Name: 2, dtype: object
employee_id       8
name           Juan
salary         6100
bonus          None
Name: 3, dtype: object
employee_id         9
name           Kannon
salary           7700
bonus            None
Name: 4, dtype: object

This is the row (a Series) going into the apply function.

Solution:

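Don’t use dot notation here; use bracket notation, which is the standard indexing. On a Series, row.name returns the name attribute of the Series itself (for a row passed in by apply, that is the row's index label, an integer here), not the value of the name column. That is why row.name[0] fails with "int not subscriptable".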

From the documentation:

  • You can use this access only if the index element is a valid Python identifier, e.g. s.1 is not allowed.

  • The attribute will not be available if it conflicts with an existing method name, e.g. s.min is not allowed, but s['min'] is possible.

  • Similarly, the attribute will not be available if it conflicts with any of the following list: index, major_axis, minor_axis, items.

  • In any of these cases, standard indexing will still work, e.g. s['1'], s['min'], and s['index'] will access the corresponding element or column.
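As a minimal sketch of the rules quoted above, the snippet below builds a small row-like Series by hand. The Series contents and the variable names (attr_name, item_name, item_min) are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical row-like Series: its labels "name" and "min" clash with
# an existing Series attribute and an existing Series method.
row = pd.Series({"employee_id": 2, "name": "Meir", "min": 10}, name=0)

attr_name = row.name      # the Series' own name (the row label 0), not the "name" element
item_name = row["name"]   # the element stored under the label "name"
item_min = row["min"]     # the element stored under "min"; row.min is still the method

print(attr_name, item_name, item_min, callable(row.min))
# prints: 0 Meir 10 True
```

Bracket indexing always resolves against the index labels, so it keeps working regardless of which attributes or methods a Series happens to define.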

So you should replace:

if (row.name[0] == "M") and ...

with:

if (row["name"][0] == "M") and ...
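Put together, a corrected version of the question's code could look like the sketch below. It keeps the question's data and function names and uses bracket indexing for every column; the result variable and the type hints on get_bonus are additions for illustration.

```python
import pandas as pd

data = [[2, "Meir", 3000], [3, "Michael", 3800], [7, "Addilyn", 7400],
        [8, "Juan", 6100], [9, "Kannon", 7700]]
employees = pd.DataFrame(data, columns=["employee_id", "name", "salary"])


def get_bonus(row: pd.Series) -> int:
    # Bracket indexing reads the value of the "name" column,
    # not the Series' name attribute (the row label).
    if row["name"][0] == "M" and row["employee_id"] % 2 == 0:
        return row["salary"]
    return 0


def calculate_special_bonus(employees: pd.DataFrame) -> pd.DataFrame:
    # apply with axis="columns" passes each row to get_bonus as a Series.
    employees["bonus"] = employees.apply(get_bonus, axis="columns")
    return employees[["employee_id", "bonus"]]


result = calculate_special_bonus(employees)
print(result["bonus"].tolist())
# prints: [3000, 0, 0, 0, 0]
```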

You can also vectorize this and avoid apply entirely:

m1 = employees['name'].str[0] == 'M'
m2 = employees['employee_id'].mod(2) == 0
employees['bonus'] = employees['salary'].where(m1 & m2, other=0)
print(employees)

# Output
   employee_id     name  salary  bonus
0            2     Meir    3000   3000
1            3  Michael    3800      0
2            7  Addilyn    7400      0
3            8     Juan    6100      0
4            9   Kannon    7700      0