Python: Selecting days in-between with input date by user

I am trying to extract some values from a Covid dataset. I wrote the following code, which works as I want, but I have a question after the code:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def main():

  pd.set_option('display.max_rows', None)          
  df = pd.read_csv('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv')

  df=df[df["Country/Region"]=="Italy"]

  df=df.drop(columns=["Province/State","Lat","Long","Country/Region"])
  # DataFrame.append was removed in pandas 2.0; pd.concat builds the same frame
  df = pd.concat([df.columns.to_frame().T, df], ignore_index=True)
  df.columns = range(len(df.columns))
  df=df.T    
  df = df.rename(columns={0: 'date', 1: 'nuovi_casi'})
  df['nuovi_casi'] = df['nuovi_casi'].diff(periods=1).fillna(1)
  df = df[(df['date'] > '11/26/21') & (df['date'] <= '12/8/21')]

  print(df)

  dati_giornalieri=list(df.nuovi_casi)
  sommatoriaitalia=(sum(dati_giornalieri)/1390000000)*100
  
  print(sommatoriaitalia)
  print(dati_giornalieri)

Now I want to add this part to ask the user for the start date and the end date:

    def main():
      
      start_date=str(input("Enter starting date in format mm/dd/yy"))          
      end_date=str(input("Enter ending date in format mm/dd/yy"))             

      pd.set_option('display.max_rows', None)          
      df = pd.read_csv('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv')

      df=df[df["Country/Region"]=="Italy"]
      df=df.drop(columns=["Province/State","Lat","Long","Country/Region"])
      # DataFrame.append was removed in pandas 2.0; pd.concat builds the same frame
      df = pd.concat([df.columns.to_frame().T, df], ignore_index=True)
      df.columns = range(len(df.columns))
      df=df.T    
      df = df.rename(columns={0: 'date', 1: 'nuovi_casi'})
      df['nuovi_casi'] = df['nuovi_casi'].diff(periods=1).fillna(1)
      df = df[(df['date'] > start_date) & (df['date'] <= end_date)]

but the line df = df[(df['date'] > start_date) & (df['date'] <= end_date)] raises an error because it cannot compare a date to a string. I also tried importing datetime:

start_date = datetime.strptime(input('Enter Start date in the format m/d/y'), '%m/%d/%y')

but the result was still wrong. For some reason it seems to consider only one day per month, or something similar. Either way, the output is not what I want.

How can I solve this and select the days in between? Thanks.

Solution:

Convert the values to datetime before comparing. Dates stored as text are compared character by character, so "12/10/21" sorts before "12/8/21":

start_date = pd.to_datetime(start_date, format="%m/%d/%y")
end_date = pd.to_datetime(end_date, format="%m/%d/%y")
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%y")

df = df[df["date"].between(start_date, end_date, inclusive="right")]
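
To see why the conversion matters, here is a minimal sketch on a small made-up frame rather than the real CSV. The helper name `select_between` and the sample values are assumptions for illustration only. It uses an explicit `>` / `<=` mask, which gives the same bounds as `between(..., inclusive="right")` and also runs on older pandas versions where `inclusive` only accepts a boolean.

```python
import pandas as pd


def select_between(df, start_text, end_text, fmt="%m/%d/%y"):
    """Return rows with start < date <= end, comparing real dates.

    Hypothetical helper: it parses the user's text and the 'date'
    column with the same format before comparing them.
    """
    start = pd.to_datetime(start_text, format=fmt)
    end = pd.to_datetime(end_text, format=fmt)
    dates = pd.to_datetime(df["date"], format=fmt)
    # Same bounds as between(start, end, inclusive="right")
    mask = (dates > start) & (dates <= end)
    return df.loc[mask]


# Made-up sample in the same m/d/yy text format as the question's date column.
sample = pd.DataFrame({
    "date": ["11/26/21", "11/27/21", "12/8/21", "12/10/21", "2/1/21"],
    "nuovi_casi": [10, 20, 30, 40, 50],
})

# Text comparison works character by character, so it admits a wrong row.
as_text = sample[(sample["date"] > "11/26/21") & (sample["date"] <= "12/8/21")]

# Date comparison keeps only the rows that really fall in the range.
as_dates = select_between(sample, "11/26/21", "12/8/21")

print(as_text["date"].tolist())
print(as_dates["date"].tolist())
```

On this sample the text comparison also lets 12/10/21 through, while the date comparison keeps only 11/27/21 and 12/8/21. In the question's code, the two strings returned by `input()` could be passed as `start_text` and `end_text`.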