Plot year over year data with month-date on X axis

I have a pandas DataFrame containing daily data for a variable over 3 years. I need to plot a line chart where the X axis runs from 1/1 to 12/31 (month and day only, no year) and has 3 lines, one for each year. How do I plot this in matplotlib?

        DateTime    price
10      2007-10-06  35756.0
11      2007-10-06  34747.0
12      2007-10-07  35748.0
13      2007-10-07  34743.0
14      2007-10-08  35740.0
... ... ...
1519    2009-10-29  29564.0
1520    2009-10-30  32035.0
1521    2009-10-30  29397.0
1522    2009-10-31  32003.0
1523    2009-10-31  29256.0

For the above data, I need the X axis to show dates from 1/1 to 12/31, with 3 lines, one each for 2007, 2008 and 2009.

Solution:

1- using month-day strings for the X-axis

You can extract the year and month-day (with strftime), then pivot and plot:

import pandas as pd
import numpy as np

# create a dummy dataset
df = pd.DataFrame({'DateTime': pd.date_range('2020-01-01', '2022-12-31'),
                   'price': np.sin(np.arange(1096)/80)
                  })

# if needed, ensure we have datetime
df['DateTime'] = pd.to_datetime(df['DateTime'])

# create new columns with year and month-day
# pivot and plot
(df.assign(year=lambda d: d['DateTime'].dt.year,
           day=lambda d: d['DateTime'].dt.strftime('%m-%d')
          )
   .pivot(index='day', columns='year', values='price')
   .plot()
)

Output:

[Figure: pandas plot of the time series with month-day on the X-axis and one line per year]
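One caveat: the sample in the question has two rows per date, and `pivot` refuses to reshape when an index/column pair occurs more than once. A minimal sketch of one way around that, assuming that averaging the readings of a day is acceptable (that choice is mine, not something stated in the question); the 2008 prices below are made up for illustration:

```python
import pandas as pd

# Two readings per date, as in the question. The 2007 prices come from the
# question's sample; the 2008 prices are hypothetical.
df = pd.DataFrame({
    "DateTime": pd.to_datetime([
        "2007-10-06", "2007-10-06", "2007-10-07", "2007-10-07",
        "2008-10-06", "2008-10-06", "2008-10-07", "2008-10-07",
    ]),
    "price": [35756.0, 34747.0, 35748.0, 34743.0,
              32000.0, 31000.0, 32500.0, 31500.0],
})

keyed = df.assign(year=df["DateTime"].dt.year,
                  day=df["DateTime"].dt.strftime("%m-%d"))

# pivot() raises because each (day, year) pair appears twice
try:
    keyed.pivot(index="day", columns="year", values="price")
except ValueError as err:
    print("pivot failed:", err)

# pivot_table() aggregates the duplicates instead (assumption: daily mean)
wide = keyed.pivot_table(index="day", columns="year",
                         values="price", aggfunc="mean")
print(wide)
```

Calling `wide.plot()` then draws one line per year, as above. Swap `aggfunc` for `"first"`, `"max"`, etc. if the mean is not what you want.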

Variant with seaborn.lineplot:

import seaborn as sns
import matplotlib.ticker as ticker

ax = sns.lineplot(data=df, x=df['DateTime'].dt.strftime('%m-%d'), y='price',
                  hue=df['DateTime'].dt.year)

ax.xaxis.set_major_locator(ticker.LinearLocator(10))

Output:

[Figure: seaborn plot of the time series with month-day on the X-axis and one line per year]

Time series with the extracted year and month-day:

       DateTime     price  year    day
0    2020-01-01  0.000000  2020  01-01
1    2020-01-02  0.012500  2020  01-02
2    2020-01-03  0.024997  2020  01-03
3    2020-01-04  0.037491  2020  01-04
4    2020-01-05  0.049979  2020  01-05
...         ...       ...   ...    ...
1091 2022-12-27  0.877742  2022  12-27
1092 2022-12-28  0.883663  2022  12-28
1093 2022-12-29  0.889445  2022  12-29
1094 2022-12-30  0.895088  2022  12-30
1095 2022-12-31  0.900592  2022  12-31

[1096 rows x 4 columns]

2- using a real datetime X-axis with a common year, then changing the formatting

Another, arguably nicer, option is to move every date to a common leap year (e.g. 2000) so that the plot can use the MonthLocator ticks of a real datetime axis, then change the tick format to hide the year:

import matplotlib.dates as mdates

ax = (df
   .assign(year=lambda d: d['DateTime'].dt.year,
           day=lambda d: d['DateTime'].add(pd.DateOffset(year=2000))
          )
   .pivot(index='day', columns='year', values='price')
   .plot()
)

# add a tick for each month
ax.xaxis.set_major_locator(mdates.MonthLocator(interval=1))
# change the format to the month name
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b'))

Output:

[Figure: pandas plot of the time series on a common-year datetime X-axis, one line per year, with monthly ticks labelled by month name]

Same logic with seaborn:

import seaborn as sns
import matplotlib.dates as mdates

ax = sns.lineplot(data=df, x=df['DateTime'].add(pd.DateOffset(year=2000)),
                  y='price', hue=df['DateTime'].dt.year)

ax.xaxis.set_major_locator(mdates.MonthLocator(interval=1))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b'))

Output:

[Figure: seaborn plot of the time series on a common-year datetime X-axis, one line per year, with monthly ticks labelled by month name]
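Since the question asks for matplotlib specifically, the same idea can also be written with plain matplotlib calls. This is only a sketch under the same assumptions as the dummy dataset above (one row per day); here the common-year dates are built with `pd.to_datetime` from year/month/day columns rather than with `DateOffset`, and the name `common` is mine:

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Same dummy dataset as above
df = pd.DataFrame({"DateTime": pd.date_range("2020-01-01", "2022-12-31"),
                   "price": np.sin(np.arange(1096) / 80)})

# Move every date to the leap year 2000, keeping month and day
common = pd.to_datetime(pd.DataFrame({"year": 2000,
                                      "month": df["DateTime"].dt.month,
                                      "day": df["DateTime"].dt.day}))

fig, ax = plt.subplots()
# One line per calendar year, all sharing the common-year X values
for year, grp in df.groupby(df["DateTime"].dt.year):
    ax.plot(common.loc[grp.index], grp["price"], label=str(year))

# One tick per month, labelled with the abbreviated month name
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b"))
ax.legend(title="year")
# plt.show()  # uncomment when running interactively
```

Plotting each year separately means the lines are never joined across a missing February 29th, and you keep full control over colors and labels per year.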

NB. In all the graphs you can notice a small break in the data between February 28th and March 1st for 2021 and 2022, since those are not leap years and have no February 29th.
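This is also why the common year should itself be a leap year. A small sketch of what happens to February 29th under each choice (2001 is just an arbitrary non-leap year picked for comparison):

```python
import pandas as pd

feb29 = pd.Timestamp("2020-02-29")

# Leap reference year: February 29th exists, so the day is kept
on_leap = feb29 + pd.DateOffset(year=2000)

# Non-leap reference year: February 29th does not exist there,
# so the day is clamped to the last valid day of the month
on_common = feb29 + pd.DateOffset(year=2001)

print(on_leap.date(), on_common.date())
```

With a non-leap reference year, February 28th and 29th of 2020 would land on the same X value, which gives a duplicated index entry for that year and makes the `pivot` step fail.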
