Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to create pivot table by 6 month interval rather than year?

I have simple daily dataframe df:

date_rng = pd.date_range(start='2023-01-01', end='2024-01-05', freq='D')
data = np.random.rand(len(date_rng), 3)
df = pd.DataFrame(data, columns=['Column1', 'Column2', 'Column3'], index=date_rng)

And instead of using date_rng.dt.year to pivot by year, I would like it in 6 month date increments.

pivot_df = pd.pivot_table(df, values='Vessel', index=date_rng.dt.year, columns='Col1',     aggfunc=pd.Series.nunique)

Any suggestions?

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

I tried using dt.quarter but this just shows the unique counts by quarter, not be year and quarter. Really any custom grouping other than just yearly would be ideal.

>Solution :

You can try (note I’ve added Vessel column to make as you have in your example in the question):

date_rng = pd.date_range(start="2023-01-01", end="2024-01-05", freq="D")
data = np.random.rand(len(date_rng), 3)
df = pd.DataFrame(data, columns=["Column1", "Column2", "Column3"], index=date_rng)

# added Vessel column
df["Vessel"] = np.random.randint(1, 5, size=len(date_rng))

pivot_df = pd.pivot_table(
    df,
    index=[df.index.year, np.where(df.index.month <= 6, "H1", "H2")],
    columns="Vessel",
    values=["Column1", "Column2", "Column3"],
    aggfunc="nunique",
)
print(pivot_df)

Prints:

        Column1                   Column2                   Column3                  
Vessel        1     2     3     4       1     2     3     4       1     2     3     4
2023 H1    39.0  41.0  59.0  42.0    39.0  41.0  59.0  42.0    39.0  41.0  59.0  42.0
     H2    43.0  53.0  34.0  54.0    43.0  53.0  34.0  54.0    43.0  53.0  34.0  54.0
2024 H1     NaN   1.0   3.0   1.0     NaN   1.0   3.0   1.0     NaN   1.0   3.0   1.0

EDIT: To convert index back to dates:

pivot_df.index = [
    pd.to_datetime(f'{year}-{"01-01" if half == "H1" else "06-01"}')
    for year, half in pivot_df.index
]
print(pivot_df)

Prints:

           Column1                   Column2                   Column3                  
Vessel           1     2     3     4       1     2     3     4       1     2     3     4
2023-01-01    48.0  44.0  43.0  46.0    48.0  44.0  43.0  46.0    48.0  44.0  43.0  46.0
2023-06-01    49.0  41.0  48.0  46.0    49.0  41.0  48.0  46.0    49.0  41.0  48.0  46.0
2024-01-01     1.0   1.0   NaN   3.0     1.0   1.0   NaN   3.0     1.0   1.0   NaN   3.0
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading