I have a DataFrame with 3 index levels:
       d
a b c
1 9 4  1
2 8 2  4
3 7 5  2
4 6 4  5
5 5 6  3
6 4 5  6
7 3 7  4
8 2 6  7
9 1 8  5
and I have a MultiIndex object with only 2 levels:
MultiIndex([(1, 9),
            (2, 8),
            (3, 7),
            (4, 6),
            (9, 1)],
           names=['a', 'b'])
How can I select the rows of the DataFrame whose a and b index levels match this MultiIndex?
Toy code:
import pandas
df1 = pandas.DataFrame(
    dict(
        a = [1,2,3,4,5,6,7,8,9],
        b = [9,8,7,6,5,4,3,2,1],
        c = [4,2,5,4,6,5,7,6,8],
        d = [1,4,2,5,3,6,4,7,5],
    )
).set_index(['a','b','c'])
select_this = multi_idx = pandas.MultiIndex.from_tuples([(1, 9), (2, 8), (3, 7), (4, 6), (9, 1)], names=['a', 'b'])
selected = df1.loc[select_this]
print(select_this)
print(df1)
print(selected)
which produces ValueError: operands could not be broadcast together with shapes (5,2) (3,) (5,2).
What I want to do can be achieved with
selected = df1.reset_index('c').loc[select_this].set_index('c', append=True)
However, this requires an extra reset_index followed by a set_index, which I would like to avoid.
Solution:
You can find the index levels that are missing from multi_idx with the difference method of df1.index.names, drop them with droplevel, and then filter by boolean indexing with Index.isin:
lvl = df1.index.names.difference(multi_idx.names)
out = df1[df1.index.droplevel(lvl).isin(multi_idx)]
print(out)
       d
a b c
1 9 4  1
2 8 2  4
3 7 5  2
4 6 4  5
9 1 8  5
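To make the intermediate steps easier to follow, here is a minimal sketch that rebuilds the toy data from the question and splits the one-liner into named pieces. The names extra_levels, reduced and mask are illustrative only, and the list comprehension is a substitute for the names.difference call, chosen here so that the original order of the level names is kept explicitly.

```python
import pandas as pd

# Toy data from the question
df1 = pd.DataFrame(
    dict(
        a=[1, 2, 3, 4, 5, 6, 7, 8, 9],
        b=[9, 8, 7, 6, 5, 4, 3, 2, 1],
        c=[4, 2, 5, 4, 6, 5, 7, 6, 8],
        d=[1, 4, 2, 5, 3, 6, 4, 7, 5],
    )
).set_index(["a", "b", "c"])
multi_idx = pd.MultiIndex.from_tuples(
    [(1, 9), (2, 8), (3, 7), (4, 6), (9, 1)], names=["a", "b"]
)

# Step 1: levels that df1's index has but multi_idx does not (here: ['c'])
extra_levels = [name for name in df1.index.names if name not in multi_idx.names]

# Step 2: a copy of df1's index reduced to the shared levels (a, b);
# df1 itself is not modified
reduced = df1.index.droplevel(extra_levels)

# Step 3: boolean array, True where the (a, b) pair occurs in multi_idx
mask = reduced.isin(multi_idx)

# Step 4: boolean indexing keeps the full 3-level index on the result
out = df1[mask]
print(out)
```

This sketch assumes the shared levels appear in the same order in both indexes (a before b), since the pairs are compared as tuples. If the orders differed, one of the two indexes would need to be reordered first, for example with reorder_levels.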