Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Convert MultiIndex to DatetimeIndex in grouped dataframe

I have a pandas DataFrameGroupBy (df_groups) that I have created by grouping another dataframe (df_pub) containing a list of publications by their day/month/year index.

df_groups = df_pub.groupby(by=df_pub.index.day, df_pub.index.month,df_pub.index.year],sort=False)

I then want to check how many unique publications exist in each group, so I use:

n_unique_pub = df_groups.Title.nunique()

This is a pandas series with a MutiIndex that looks like the following:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

MultiIndex([( 1,  7, 2020),
            ( 2,  7, 2020),
            ( 3,  7, 2020),
            ( 4,  7, 2020),
            ( 5,  7, 2020),
            ( 6,  7, 2020),
            ( 7,  7, 2020),
            ( 8,  7, 2020),
            ( 9,  7, 2020),
            (10,  7, 2020),
            ...
            ( 8, 11, 2021),
            ( 9, 11, 2021),
            (10, 11, 2021),
            (11, 11, 2021),
            (12, 11, 2021),
            (13, 11, 2021),
            (14, 11, 2021),
            (15, 11, 2021),
            (16, 11, 2021),
            (17, 11, 2021)],
           names=['Date', 'Date', 'Date'], length=497)

I would like to convert this MultiIndex into a DatetimeIndex so that it would look like:

DatetimeIndex(['2020-07-01', '2020-07-02', '2020-07-03', '2020-07-04',
               '2020-07-05', '2020-07-06', '2020-07-07', '2020-07-08',
               '2020-07-09', '2020-07-10',
               ...
               '2021-11-08', '2021-11-09', '2021-11-10', '2021-11-11',
               '2021-11-12', '2021-11-13', '2021-11-14', '2021-11-15',
               '2021-11-16', '2021-11-17'],
              dtype='datetime64[ns]', name='Date', length=505, freq='D')

Is there a simple way to do it? So far I’ve tried several approaches but nothing has worked. For example if I do pd.to_datetime(n_unique_pub.index) I have an error: TypeError: <class 'tuple'> is not convertible to datetime.

>Solution :

Use pd.to_datetime:

# mi is your MultiIndex instance, like mi = df.index
>>> pd.DatetimeIndex(pd.to_datetime(mi.rename(['day', 'month', 'year']).to_frame()))

DatetimeIndex(['2020-07-01', '2020-07-02', '2020-07-03', '2020-07-04',
               '2020-07-05', '2020-07-06', '2020-07-07', '2020-07-08',
               '2020-07-09', '2020-07-10', '2021-11-08', '2021-11-09',
               '2021-11-10', '2021-11-11', '2021-11-12', '2021-11-13',
               '2021-11-14', '2021-11-15', '2021-11-16', '2021-11-17'],
              dtype='datetime64[ns]', freq=None)

How to replace a MultiIndex to DatetimeIndex:

idx = pd.to_datetime(df.index.rename(['day', 'month', 'year']).to_frame())
df = df.set_index(idx)
print(df)

# Output:
                   A
2020-07-01  0.961038
2020-07-02  0.098132
2020-07-03  0.406996
2020-07-04  0.008376
2020-07-05  0.568059
2020-07-06  0.576610
2020-07-07  0.137144
2020-07-08  0.672219
2020-07-09  0.142874
2020-07-10  0.509231
2021-11-08  0.368762
2021-11-09  0.249107
2021-11-10  0.136282
2021-11-11  0.119291
2021-11-12  0.052388
2021-11-13  0.434899
2021-11-14  0.770705
2021-11-15  0.850914
2021-11-16  0.621283
2021-11-17  0.379888
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading