pandas reset_index of certain level removes entire level of multiindex

I have a DataFrame like this:

                          performance
year      month     week
2015      1         2     4.170358
                    3     3.423766
                    4    -1.835888
                    5     8.157457
          2         6    -3.276887
...                            ...
2018      7         30   -1.045241
                    31   -0.870845
          8         31    0.950555
                    32    6.757876
                    33   -2.203334

I want week to count from 0 (or 1) up to n, where n is the number of weeks in the current year and month.

The easy way, I thought, was to use

df.reset_index(level=2, drop=True)

But I realized later that this was a mistake. In the best case I would get

                          performance
year      month     week
2015      1         0     4.170358
                    1     3.423766
                    2    -1.835888
                    3     8.157457
          2         4    -3.276887
...                            ...
2018      7         n-4  -1.045241
                    n-3  -0.870845
          8         n-2   0.950555
                    n-1   6.757876
                    n    -2.203334

But when I ran it, I got unexpected behaviour:

                        close
timestamp timestamp
2015      1          4.170358
          1          3.423766
          1         -1.835888
          1          8.157457
          2         -3.276887
...                       ...
2018      7         -1.045241
          7         -0.870845
          8          0.950555
          8          6.757876
          8         -2.203334

I lost the entire level 2 (week) of the index! Why? I thought it would run from 0 to n within each ‘cluster’ (yes, that was a mistake, as I mentioned above)…
I solved my problem with something like this:

df.groupby(level=[0, 1]).apply(lambda x: x.reset_index(drop=True))

And got the DataFrame in the form I wanted:

                 performance
year month
2015 1     0  4.170358
           1  3.423766
           2 -1.835888
           3  8.157457
     2     0 -3.276887
...                ...
2018 7     3 -1.045241
           4 -0.870845
     8     0  0.950555
           1  6.757876
           2 -2.203334
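As an alternative to the groupby(...).apply(...) approach, the per-group numbering can also be sketched with cumcount, which numbers the rows inside each group starting from 0. This is not the asker's method, just one possible way to get the same shape. The frame below is a small stand-in built from the first five rows of the question, and the names position and renumbered are made up for this illustration.

```python
import pandas as pd

# Stand-in data: the first five rows shown in the question.
index = pd.MultiIndex.from_tuples(
    [(2015, 1, 2), (2015, 1, 3), (2015, 1, 4), (2015, 1, 5), (2015, 2, 6)],
    names=["year", "month", "week"],
)
df = pd.DataFrame(
    {"performance": [4.170358, 3.423766, -1.835888, 8.157457, -3.276887]},
    index=index,
)

# Position of each row inside its (year, month) group: 0, 1, 2, ...
position = df.groupby(level=["year", "month"]).cumcount()

# Rebuild the index, swapping the original week numbers for the positions.
renumbered = df.copy()
renumbered.index = pd.MultiIndex.from_arrays(
    [
        df.index.get_level_values("year"),
        df.index.get_level_values("month"),
        position.to_numpy(),
    ],
    names=["year", "month", "week"],
)

print(renumbered)
```

Unlike the apply version, this keeps the name week on the third level, because the index is rebuilt explicitly.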

But why? Why does reset_index on a single level just drop it? That’s the main question!

Solution:

reset_index with drop=True adds a default index only when you are resetting the whole index. If you’re resetting just a single level of a multi-level index, it simply removes that level. The remaining levels still form a valid index, so nothing has to replace the dropped one.
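The difference is easy to see side by side. The sketch below uses a small stand-in frame built from the first five rows of the question; the variable names partial and full are made up for this illustration.

```python
import pandas as pd

# Stand-in data: the first five rows shown in the question.
index = pd.MultiIndex.from_tuples(
    [(2015, 1, 2), (2015, 1, 3), (2015, 1, 4), (2015, 1, 5), (2015, 2, 6)],
    names=["year", "month", "week"],
)
df = pd.DataFrame(
    {"performance": [4.170358, 3.423766, -1.835888, 8.157457, -3.276887]},
    index=index,
)

# Resetting one level with drop=True: the level is removed and
# the other two levels stay as the index.
partial = df.reset_index(level=2, drop=True)
print(list(partial.index.names))

# Resetting the whole index with drop=True: every level is removed,
# so pandas falls back to a default integer index.
full = df.reset_index(drop=True)
print(type(full.index).__name__)
```

In the first call the frame still has an index made of year and month, so no default numbering appears. Only the second call, which leaves no index levels at all, produces the 0 to n-1 range.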
