Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Python 3.8 Errorr while trying to define a range of values on a Pandas column

Hello I’m learning Python and Pandas, and I’m working on an exercise. I’m loading in 2 csv files and merging them into one dataframe.

import pandas as pd
# File to Load (Remember to Change These)
school_data_to_load = "Resources/schools_complete.csv"
student_data_to_load = "Resources/students_complete.csv"

# Read School and Student Data File and store into Pandas DataFrames
school_data_df = pd.read_csv(school_data_to_load)
student_data_df = pd.read_csv(student_data_to_load)

# Combine the data into a single dataset.  
school_data_complete_df = pd.merge(student_data_df, school_data_df, how="left", on=["school_name", "school_name"])
school_data_complete_df.head()

enter image description here

The output looks like the picture above.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

I’m trying to:

  • Calculate the percentage of students with a passing math score (70 or greater)

  • Calculate the percentage of students with a passing reading score (70 or greater)

  • Calculate the percentage of students who passed math and reading (% Overall Passing)

I’m looking to populate a new dataframe by looking at students who only got 70 or greater on their math and reading scores by using the loc command

enter image description here

I got this error. I don’t understand because the values in the columns should all be integers so why is it saying I’m trying to pass strings in there as well?

>Solution :

You are not comparing the values in the column. You are just comparing "math_score" >= 70. There’s a string on the left, and an integer on the right, hence your problem.

Fix the location of your angle brackets, and you should be good to go:

passing_maths_total = school_data_complete_df.loc[school_data_complete_df["math score"] >= 70]

Pandas broadcasts the result of the >= comparison, so comparing the Pandas Series school_data_complete_df["math score"] with 70 results in a boolean Pandas Series which can be used for indexing, e.g. in .loc.

The colon is unnecessary because the row index comes first in .loc anyways.

This solution is not tested.

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading