Slice MultiIndex by multiple tuples

I have a DataFrame with multiple index levels. I define a subset by selecting several combinations of all levels but the last. Then I want to slice the original DataFrame with that subset, but I cannot figure out how. A simple example shows it best:

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({'a': ['A', 'B', 'A', 'B'], 'b': ['X', 'X', 'X', 'Y'],
   ...:                    'c': ['S', 'T', 'T', 'T'], 'd': [1, 2, 3, 1]}).set_index(['a', 'b', 'c'])

In [3]: print(df.to_string())
       d
a b c
A X S  1
B X T  2
A X T  3
B Y T  1

In [4]: sel = df.index.droplevel('c')[df.d == 1]  # Some selection on multiple index levels.

In [5]: print(sel)
MultiIndex([('A', 'X'),
            ('B', 'Y')],
           names=['a', 'b'])

Now I would like all rows from df where (a, b) in sel, in this case all but the second row. I tried .loc, .xs and more.

I’m sure I can manipulate the index (drop level c, select, then add level c again), but that feels like a workaround. The same goes for an inner join. I must be overlooking some method…?

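Solution:

One option is to use Index.isin with boolean indexing. Dropping level c leaves an (a, b) index that is still aligned with the rows of df, so isin(sel) yields one boolean per row: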

df = df[df.index.droplevel('c').isin(sel)]
print(df)
       d
a b c   
A X S  1
    T  3
B Y T  1
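
If the same selection is needed in several places, the idea can be wrapped in a small function that leaves df untouched. This is only a sketch: select_by_levels is a hypothetical helper name, not a pandas method, and it assumes the levels of keys appear in the same order as in the frame's index.

```python
import pandas as pd

df = pd.DataFrame({'a': ['A', 'B', 'A', 'B'], 'b': ['X', 'X', 'X', 'Y'],
                   'c': ['S', 'T', 'T', 'T'], 'd': [1, 2, 3, 1]}).set_index(['a', 'b', 'c'])

# Same selection as in the question; to_numpy() passes a plain boolean array.
sel = df.index.droplevel('c')[(df.d == 1).to_numpy()]


def select_by_levels(frame, keys):
    """Hypothetical helper: keep rows whose index, restricted to the
    levels named in keys, matches one of the tuples in keys."""
    # Levels of the frame that keys does not cover (here: 'c').
    extra = [name for name in frame.index.names if name not in keys.names]
    # Dropping them keeps one entry per row, so the mask lines up with frame.
    mask = frame.index.droplevel(extra).isin(keys)
    return frame[mask]


result = select_by_levels(df, sel)
print(result)
```

Because the function returns a new DataFrame instead of reassigning df, the original four rows remain available for further selections.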