I have a pandas DataFrame containing daily data for a variable over 3 years. I need to plot a line chart where the X axis runs from 1/1 to 12/31 (month and day only, no year) with 3 lines, one for each year. How do I plot this in matplotlib?
        DateTime    price
10    2007-10-06  35756.0
11    2007-10-06  34747.0
12    2007-10-07  35748.0
13    2007-10-07  34743.0
14    2007-10-08  35740.0
...          ...      ...
1519  2009-10-29  29564.0
1520  2009-10-30  32035.0
1521  2009-10-30  29397.0
1522  2009-10-31  32003.0
1523  2009-10-31  29256.0
For the above data, I need the X axis to show dates from 1/1 to 12/31, with 3 lines, one each for 2007, 2008 and 2009.
>Solution :
1- using month-day strings for the X-axis
You can extract the year and month-day (with strftime), then pivot and plot:
import pandas as pd
import numpy as np
# create a dummy dataset
df = pd.DataFrame({'DateTime': pd.date_range('2020-01-01', '2022-12-31'),
                   'price': np.sin(np.arange(1096)/80)
                   })
# if needed, ensure we have datetime
df['DateTime'] = pd.to_datetime(df['DateTime'])
# create new columns with year and month-day,
# then pivot and plot
(df.assign(year=lambda d: d['DateTime'].dt.year,
           day=lambda d: d['DateTime'].dt.strftime('%m-%d')
           )
   .pivot(index='day', columns='year', values='price')
   .plot()
)
Output:
Variant with seaborn.lineplot:
import seaborn as sns
import matplotlib.ticker as ticker
ax = sns.lineplot(data=df, x=df['DateTime'].dt.strftime('%m-%d'), y='price',
                  hue=df['DateTime'].dt.year)
ax.xaxis.set_major_locator(ticker.LinearLocator(10))
Output:
Time series with the extracted year and month-day:
       DateTime     price  year    day
0    2020-01-01  0.000000  2020  01-01
1    2020-01-02  0.012500  2020  01-02
2    2020-01-03  0.024997  2020  01-03
3    2020-01-04  0.037491  2020  01-04
4    2020-01-05  0.049979  2020  01-05
...         ...       ...   ...    ...
1091 2022-12-27  0.877742  2022  12-27
1092 2022-12-28  0.883663  2022  12-28
1093 2022-12-29  0.889445  2022  12-29
1094 2022-12-30  0.895088  2022  12-30
1095 2022-12-31  0.900592  2022  12-31
[1096 rows x 4 columns]
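One caveat for the data in the question: it shows two rows per date, and `pivot` raises a ValueError when an index/column pair occurs more than once. A minimal sketch of one way around this, assuming that averaging the readings of a given day is acceptable (the 2008 rows below are made up for illustration; only the 2007 values come from the question):

```python
import pandas as pd

# Sample with two readings per day in 2007, as in the question.
# The 2008 rows are hypothetical, added so that there are two years to compare.
df = pd.DataFrame({
    'DateTime': pd.to_datetime(['2007-10-06', '2007-10-06',
                                '2007-10-07', '2007-10-07',
                                '2008-10-06', '2008-10-07']),
    'price': [35756.0, 34747.0, 35748.0, 34743.0, 30000.0, 31000.0],
})

# pivot_table aggregates duplicates (here with the mean) instead of raising
wide = (df.assign(year=lambda d: d['DateTime'].dt.year,
                  day=lambda d: d['DateTime'].dt.strftime('%m-%d'))
          .pivot_table(index='day', columns='year', values='price',
                       aggfunc='mean')
        )
print(wide)
```

Calling `wide.plot()` then draws one line per year as before. Any other aggregation (`'first'`, `'max'`, ...) can be passed as `aggfunc` if the mean is not what you want.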
2- using a real datetime X-axis with a common year, then changing the formatting
Another, arguably nicer, option is to map all dates to a common leap year (e.g. 2000). This keeps a real datetime axis, so you benefit from the MonthLocator ticks, and you can then change the tick format to hide the year:
import matplotlib.dates as mdates
ax = (df
      .assign(year=lambda d: d['DateTime'].dt.year,
              # replace the year of every date with 2000 (a leap year)
              day=lambda d: d['DateTime'].add(pd.DateOffset(year=2000))
              )
      .pivot(index='day', columns='year', values='price')
      .plot()
      )
# add a tick for each month
ax.xaxis.set_major_locator(mdates.MonthLocator(interval=1))
# change the format to the month name
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b'))
Output:
Same logic with seaborn:
import seaborn as sns
import matplotlib.dates as mdates
ax = sns.lineplot(data=df, x=df['DateTime'].add(pd.DateOffset(year=2000)),
                  y='price', hue=df['DateTime'].dt.year)
ax.xaxis.set_major_locator(mdates.MonthLocator(interval=1))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b'))
Output:
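A note on `pd.DateOffset(year=2000)` as used in this second approach: the singular `year=` keyword replaces the year, whereas the plural `years=` shifts by a number of years. The small sketch below illustrates the difference on single timestamps (the example dates are arbitrary), and shows why a leap year is a convenient common year: with a non-leap target, February 29th should be clipped to February 28th and would then collide with the real February 28th in the pivot.

```python
import pandas as pd

ts = pd.Timestamp('2021-03-15')

# singular `year=` replaces the year component
replaced = ts + pd.DateOffset(year=2000)

# plural `years=` shifts by a number of years instead
shifted = ts + pd.DateOffset(years=1)

# Feb 29 is preserved only if the common year is itself a leap year
leap_ok = pd.Timestamp('2020-02-29') + pd.DateOffset(year=2000)
leap_clipped = pd.Timestamp('2020-02-29') + pd.DateOffset(year=2001)

print(replaced, shifted, leap_ok, leap_clipped, sep='\n')
```

Depending on the pandas version, applying this kind of offset to a whole Series may emit a PerformanceWarning because it is not vectorized; the result is still correct.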
NB. In all the graphs there is a small gap between February 28th and March 1st for 2021 and 2022, since those are not leap years and have no February 29th.
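If that gap is unwanted, two simple options are to drop the leap day or to fill it in. This is only a sketch using the month-day pivot from the first approach on the same dummy dataset; the names `wide`, `no_leap` and `filled` are illustrative:

```python
import numpy as np
import pandas as pd

# same dummy dataset as above
df = pd.DataFrame({'DateTime': pd.date_range('2020-01-01', '2022-12-31'),
                   'price': np.sin(np.arange(1096)/80)
                   })

wide = (df.assign(year=lambda d: d['DateTime'].dt.year,
                  day=lambda d: d['DateTime'].dt.strftime('%m-%d'))
          .pivot(index='day', columns='year', values='price')
        )

# 02-29 exists only in 2020, so the 2021 and 2022 columns hold NaN there
print(wide.loc['02-29'])

# option A: drop the leap day so every year has the same 365 points
no_leap = wide.drop(index='02-29')

# option B: keep the row and fill the missing point by linear interpolation
filled = wide.interpolate()
```

Either `no_leap.plot()` or `filled.plot()` then gives continuous lines for all three years. Dropping the row discards one real data point for 2020, while interpolating invents one value each for 2021 and 2022, so the choice depends on what the chart is used for.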



