Insert dates starting from 00:00:00 and ending 23:59:59 in pandas

I have the following code, which groups values into 5-minute intervals from 07:10:00 to 17:30:00.

import pandas as pd


data = {"datetime": ["2022-10-26 07:12:13", "2022-10-26 09:14:43", "2022-10-26 17:25:19"], "value": [1, 3, 5]}
df = pd.DataFrame(data)
df["time"] = pd.to_datetime(df["datetime"])

df = (
    df.groupby([pd.Grouper(freq="5Min", key="time")])["value"]
    .sum()
    .reset_index(name="value")
)

df = df.rename(columns={"time": "interval start"})
df.insert(1, "interval end", df["interval start"] + pd.Timedelta("5Min"))

But how can I extend the grouping so that it covers the whole day, from 00:00:00 to 23:59:59?

Solution:

A simple way to achieve this:

  • First, take the minimum and maximum timestamps and snap them to the start (00:00:00) and end (23:59:59) of the day to get the actual start and end times.
  • Create a master DataFrame of 5-minute intervals covering that range, then left-merge the grouped data into it and fill the empty intervals with 0.

The code with these changes:

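import pandas as pd


# "data" instead of "dict" avoids shadowing the built-in type
data = {"datetime": ["2022-10-26 07:12:13", "2022-10-26 09:14:43", "2022-10-26 17:25:19"], "value": [1, 3, 5]}
df = pd.DataFrame(data)

df["time"] = pd.to_datetime(df["datetime"])

# Master frame: two rows marking the start and end of the day(s) covered by the data;
# grouping them into 5-minute bins yields every bin start in between
m_df = pd.DataFrame(data={"time": [df["time"].min().replace(hour=0, minute=0, second=0), df["time"].max().replace(hour=23, minute=59, second=59)]})
m_df = m_df.groupby(pd.Grouper(freq="5Min", key="time")).sum().reset_index()

# Sum the values per 5-minute bin, as in the question
df = (
    df.groupby([pd.Grouper(freq="5Min", key="time")])["value"]
    .sum()
    .reset_index(name="value")
)

# Left merge keeps every bin of the master frame; bins without data become NaN,
# which fillna(0) turns into 0 (the "value" column ends up as float)
df = m_df.merge(df, on="time", how="left").fillna(0)

df = df.rename(columns={"time": "interval start"})
df.insert(1, "interval end", df["interval start"] + pd.Timedelta("5Min"))

print(df.head())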
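
As an alternative to the merge, the same full-day grid can probably be built directly with pd.date_range and applied with reindex. This is only a sketch of that swapped-in technique, not the approach from the answer above; the names binned, full_index and result are made up for illustration, and it assumes the data is the same sample as in the question.

```python
import pandas as pd

data = {
    "datetime": ["2022-10-26 07:12:13", "2022-10-26 09:14:43", "2022-10-26 17:25:19"],
    "value": [1, 3, 5],
}
df = pd.DataFrame(data)
df["time"] = pd.to_datetime(df["datetime"])

# Sum the values per 5-minute bin, as in the question (Series indexed by bin start).
binned = df.groupby(pd.Grouper(freq="5min", key="time"))["value"].sum()

# Bin starts for the whole day(s): midnight of the first day up to 23:55 of the last day.
day_start = df["time"].min().normalize()
last_bin = df["time"].max().normalize() + pd.Timedelta(days=1) - pd.Timedelta("5min")
full_index = pd.date_range(day_start, last_bin, freq="5min")

# Reindex onto the full grid; bins without data get 0 and the dtype stays integer.
result = (
    binned.reindex(full_index, fill_value=0)
    .rename_axis("interval start")
    .reset_index()
)
result.insert(1, "interval end", result["interval start"] + pd.Timedelta("5min"))

print(result.head())
```

For a single day this gives 288 rows (24 hours × 12 intervals), and because reindex fills the gaps with fill_value=0 instead of NaN, the value column is not converted to float.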