How to keep only the newest forecast per bus in a pandas DataFrame

I have a DataFrame with bus stop arrival forecasts:

path_id | forecast | forecast_made_at | bus_id
 int    | datetime |  datetime        | int

We make predictions every 5 minutes, so the same bus can appear in several rows. For example:

At 11:50 we predict bus #11544 will arrive at 11:59
At 11:50 we predict bus #95447 will arrive at 11:55
...
At 11:55 we predict bus #11544 will arrive at 12:02

For each bus I want to keep only the newest prediction, i.e. the row with the largest forecast_made_at value. My current code:

import pandas as pd

rows = []
for _, row in t_data.iterrows():
  prediction = dict(**row)
  forecasts = t_data[t_data["bus_id"] == prediction["bus_id"]]  # Forecasts with the same bus_id
  prediction["best"] = (prediction["forecast_made_at"] == max(forecasts["forecast_made_at"]))
  rows.append(prediction)  # DataFrame.append was removed in pandas 2.0, so collect rows in a list

res = pd.DataFrame(rows)
res = res[res["best"]]

This code works with dictionaries instead of pandas objects and filters the whole DataFrame once per row, so it is very slow. How can I do this with pandas tools?

>Solution :

You need to sort by forecast_made_at, group by bus_id, and keep the most recent row for each bus.

One option: drop duplicates by bus_id, keeping only the most recent record:

t_data.sort_values('forecast_made_at').drop_duplicates(subset=['bus_id'], keep='last')
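
To make the first option concrete, here is a minimal runnable sketch. The sample rows mirror the example in the question; the path_id values and the calendar date are made up for illustration, since the question does not give them.

```python
import pandas as pd

# Hypothetical sample data modelled on the example in the question
# (path_id values and the date are assumptions for illustration).
t_data = pd.DataFrame(
    {
        "path_id": [1, 1, 1],
        "forecast": pd.to_datetime(
            ["2024-01-01 11:59", "2024-01-01 11:55", "2024-01-01 12:02"]
        ),
        "forecast_made_at": pd.to_datetime(
            ["2024-01-01 11:50", "2024-01-01 11:50", "2024-01-01 11:55"]
        ),
        "bus_id": [11544, 95447, 11544],
    }
)

# Sort so the newest forecast for each bus comes last,
# then keep only that last row per bus_id.
latest = (
    t_data.sort_values("forecast_made_at")
    .drop_duplicates(subset=["bus_id"], keep="last")
)
print(latest)
```

With this data, the 11:50 forecast for bus 11544 is dropped and the 11:55 one is kept, while bus 95447 keeps its single row.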

Another option: group by bus_id and take the last record of each group:

t_data.sort_values('forecast_made_at').groupby('bus_id').last().reset_index()
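
One caveat with groupby(...).last(): by default it returns the last non-null value of each column separately, so if some columns contain missing values the result for a bus may combine values from different rows. A possible alternative, not part of the original answer, is to look up the row with the maximum forecast_made_at per bus using idxmax. The sketch below assumes the DataFrame index is unique and forecast_made_at has no missing values; the sample data is hypothetical, as the question does not provide any.

```python
import pandas as pd

# Hypothetical sample data modelled on the example in the question
# (path_id values and the date are assumptions for illustration).
t_data = pd.DataFrame(
    {
        "path_id": [1, 1, 1],
        "forecast": pd.to_datetime(
            ["2024-01-01 11:59", "2024-01-01 11:55", "2024-01-01 12:02"]
        ),
        "forecast_made_at": pd.to_datetime(
            ["2024-01-01 11:50", "2024-01-01 11:50", "2024-01-01 11:55"]
        ),
        "bus_id": [11544, 95447, 11544],
    }
)

# For each bus, find the index label of the row with the newest forecast_made_at.
newest_idx = t_data.groupby("bus_id")["forecast_made_at"].idxmax()

# Select those rows whole, so values are never mixed across rows.
latest = t_data.loc[newest_idx].reset_index(drop=True)
print(latest)
```

This avoids sorting the full DataFrame and always returns complete original rows, at the cost of relying on unique index labels.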