pandas boxplot by names of day of week: how to re-order the day-of-week names?

Using pandas boxplot(by="name_of_day"), I'd like to control the order in which the weekday names appear on the plot. They are currently displayed in the wrong order; I want them in calendar order, starting with Monday and ending with Sunday.

Here is a simple reproducible example:

import pandas as pd

#create DataFrame
df = (
    # we create a 2 columns df: date and sales
    pd.DataFrame({'date': pd.date_range(start='1/5/2022', freq='D', periods=15),
                   'sales': [6, 8, 9, 5, 4, 8, 8, 3, 5, 9, 8, 3, 4, 7, 7]})
    # we create a new column to get the name of the day of the week
    .assign(name_of_day = lambda df: df.date.dt.day_name())
    )

The code above creates the DataFrame df. Here are its first 3 rows:

        date  sales name_of_day
0 2022-01-05      6   Wednesday
1 2022-01-06      8    Thursday
2 2022-01-07      9      Friday

Now plot the boxplot with:

df.boxplot(by="name_of_day");

It returns the plot:

(Figure: boxplot grouped by weekday name; the day names are not in weekday order.)

I would like the plot to show the weekday names in the right order.

How can I do this with pandas boxplot() or with pandas plot.box()?

Note: yes, this could be done with seaborn, but my question is about pandas boxplot() or pandas plot.box().

>Solution :

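The days come out in alphabetical order because name_of_day holds plain strings, and the grouping sorts its keys. Convert the column to an ordered CategoricalDtype whose categories run from Monday to Sunday, and the groups follow that order instead: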

from calendar import day_name

days = pd.CategoricalDtype(list(day_name), ordered=True)

df.astype({'name_of_day': days}).boxplot(by="name_of_day")
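To see why this works without drawing anything, here is a minimal sketch that compares the group order for the string column and for the categorical column. It assumes that boxplot(by=...) groups the data the same way groupby does, and that calendar.day_name returns English names (the default unless the process locale has been changed), so that they match the output of dt.day_name(). The variable names string_order and categorical_order are only for illustration.

```python
from calendar import day_name

import pandas as pd

# Same data as in the question
df = pd.DataFrame({
    "date": pd.date_range(start="1/5/2022", freq="D", periods=15),
    "sales": [6, 8, 9, 5, 4, 8, 8, 3, 5, 9, 8, 3, 4, 7, 7],
})
df["name_of_day"] = df["date"].dt.day_name()

# Plain strings: the group keys are sorted alphabetically
string_order = list(df.groupby("name_of_day")["sales"].median().index)

# Ordered categorical: the group keys follow the category order
days = pd.CategoricalDtype(list(day_name), ordered=True)
categorical_order = list(
    df.astype({"name_of_day": days})
    .groupby("name_of_day", observed=False)["sales"]
    .median()
    .index
)

print(string_order)       # alphabetical, starting with Friday
print(categorical_order)  # weekday order, starting with Monday
```

If the locale is not English, the names from calendar.day_name may not match the values in the column; in that case, passing an explicit list of English day names is likely the safer choice.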

Or in your original pipeline:

from calendar import day_name

import pandas as pd

df = (
    # build a two-column DataFrame: date and sales
    pd.DataFrame({'date': pd.date_range(start='1/5/2022', freq='D', periods=15),
                   'sales': [6, 8, 9, 5, 4, 8, 8, 3, 5, 9, 8, 3, 4, 7, 7]})
    # add the weekday name as an ordered categorical, Monday first
    .assign(name_of_day=lambda df: pd.Categorical(df.date.dt.day_name(),
                                                  categories=list(day_name),
                                                  ordered=True))
)

# plot as before; the groups now follow the category order
df.boxplot(by="name_of_day")

Output:

(Figure: boxplots ordered by weekday name, Monday to Sunday.)
