Pandas scoring system: sort_values

My task is pretty simple.
I have a CSV file with the results of a decathlon competition. The results need to be converted into points, ranked, and assigned places. Everything works apart from one line:

modified_data.sort_values(by=["Total points"])

Why doesn’t it sort the result for me?

My work below:

import pandas as pd
import numpy as np

# Modification of CSV file data by adding header names and splitting data
data = pd.read_csv("static/data/Decathlon.csv", delimiter=';', header=None)
data = data.assign(Total_points=0)
data = data.assign(Ranking=0)
header_list = ['Player', '100 metres', 'Long jump', 'Short put', 'High jump', '400 metres', '110 metres hurdles',
               'Discus throw', 'Pole vault', 'Javelin throw', '1500 metres', 'Total points', 'Ranking']
data.to_csv("static/data/Decathlon_modified.csv", header=header_list, index=False)
modified_data = pd.read_csv("static/data/Decathlon_modified.csv", delimiter=',')
print(modified_data)

# Conversion of CSV data into the necessary units of measurement,
# so that it can be applied to the calculation of the resulting formulas:
temporary_list = []
changed_list = []
for time in modified_data["1500 metres"]:
    temporary_list.append(time.split('.'))
for new_value in temporary_list:
    value = (int(new_value[0]) * 60) + int(new_value[1]) + int(new_value[2]) * 0.01
    changed_list.append(value)
for index, new_value in enumerate(changed_list):
    modified_data.loc[index, "1500 metres"] = new_value

# Results are calculated according to formulas:
# Points = INT(A * (B - P) ** C) for track events (a faster time produces a higher score)
modified_data["100 metres"] = round((25.4347 * (18 - modified_data["100 metres"]) ** 1.81))
modified_data["400 metres"] = round(1.53775 * (82 - modified_data["400 metres"]) ** 1.81)
modified_data["110 metres hurdles"] = round(5.74352 * (28.5 - modified_data["110 metres hurdles"]) ** 1.92)
modified_data["1500 metres"] = round(0.03768 * (480 - modified_data["1500 metres"].astype(float)) ** 1.85)

# Points = INT(A * (P - B) ** C) for field events (a greater distance or height produces a higher score)
modified_data["Long jump"] = round(0.14354 * ((modified_data["Long jump"] * 100) - 220) ** 1.4)
modified_data["Short put"] = round(51.39 * (modified_data["Short put"] - 1.5) ** 1.05)
modified_data["High jump"] = round(0.8465 * ((modified_data["High jump"] * 100) - 75) ** 1.42)
modified_data["Discus throw"] = round(12.91 * (modified_data["Discus throw"] - 4) ** 1.1)
modified_data["Pole vault"] = round(0.2797 * (modified_data["Pole vault"] * 100 - 100) ** 1.35)
modified_data["Javelin throw"] = round(10.14 * (modified_data["Javelin throw"] - 7) ** 1.08)

# Total calculation and rewriting of each player's result in a common table
total_points = modified_data["100 metres"] + modified_data["Long jump"] + modified_data["Short put"] + \
               modified_data["High jump"] + modified_data["400 metres"] + modified_data["110 metres hurdles"] + \
               modified_data["Discus throw"] + modified_data["Pole vault"] + modified_data["Javelin throw"] \
               + modified_data["1500 metres"]
for index, new_value in enumerate(total_points):
    modified_data.loc[index, "Total points"] = new_value


# Ranking according to collected points
modified_data.reset_index(drop=False)
modified_data.index = np.arange(1, len(modified_data) + 1)

# TODO
modified_data.sort_values(by=["Total points"])
print(modified_data)

modified_data["Ranking"] = modified_data["Total points"]. \
    apply(lambda score:
          modified_data.index[modified_data["Total points"] == score].astype(str)).str.join("-")
print(modified_data)

modified_data.to_json(r'static/data/Decathlon.json')

I tried:

modified_data["Total points"] = modified_data["Total points"].astype(int)
modified_data.sort_values(by=["Total points"])

and

modified_data["Total points"] = modified_data["Total points"].astype(int)
modified_data.sort_values('Total points')

I also consulted the documentation:
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.sort_values.html

Solution:

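By default, sort_values returns a new, sorted DataFrame and leaves the original unchanged, so calling it without using the result has no visible effect. Either pass inplace=True or assign the result back to the same variable: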

modified_data.sort_values(by=["Total points"], inplace=True)
# Or alternatively
modified_data = modified_data.sort_values(by=["Total points"])
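
To illustrate the difference, here is a minimal sketch on a small hypothetical table (the players and scores are made up and only stand in for the computed "Total points" column). It also assumes that the highest score should come first and that rows should be renumbered after sorting so the index reflects the final places; neither is stated in the question, so adjust as needed.

```python
import pandas as pd

# Hypothetical scores standing in for the computed "Total points" column.
scores = pd.DataFrame({
    "Player": ["A", "B", "C", "D"],
    "Total points": [7000, 8200, 7000, 6500],
})

# sort_values returns a new DataFrame; discarding the result leaves
# the original row order untouched.
scores.sort_values(by="Total points")
unchanged_order = scores["Player"].tolist()

# Assign the result back. ascending=False puts the highest score first
# (an assumption about the desired ranking order).
ranked = scores.sort_values(by="Total points", ascending=False, kind="stable")

# Renumber the rows 1..n after sorting, so index labels match the places.
ranked.index = range(1, len(ranked) + 1)

# Players with equal scores share their places, e.g. "2-3".
ranked["Ranking"] = ranked["Total points"].apply(
    lambda score: "-".join(
        ranked.index[(ranked["Total points"] == score).to_numpy()].astype(str)
    )
)
print(ranked)
```

The same rule applies to reset_index in the question's code: without inplace=True or an assignment, its result is discarded. Note also that sorting keeps each row's original index label, so if the index is renumbered before the sort, as in the question, the labels will no longer correspond to the sorted positions.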