Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

python pandas: using pd.IndexSlice for both rows and columns in a double multiindex dataframe

I have a double Multiindex dataframe as follows. I slice the rows with idx = pd.IndexSlice but I dont know how to do the same with the columns
so provided this data:

df = pd.DataFrame(data=pd.DataFrame(data=np.random.randint(0, 10, size=(9, 5))))
# rows
list1 = ['2021-01-01','2022-02-01','2022-03-01']
list2 = ['PHOTO', 'QUE','TXR']
combinations = [(x, y) for x in list1 for y in list2]     
df.index = pd.MultiIndex.from_tuples(combinations, names = ["DATE","DB"])
df.index.set_names(["DATE","DB"], inplace=True)
#columns
list1c = [('AB30','ACTIVE','A2'),('CD55','ACTIVE','A1'),('ZT52','UNACTIVE','A2'),('MIKE','PENSIONER','A2'),('ZZ00001','ACTIVE','A1')]
df.columns = pd.MultiIndex.from_tuples(list1c, names = ["UserID","KIND","DEPARTMENT"])

I slice the rows as follows:

# filtering in rows
idx = pd.IndexSlice

###### ROWS #######
# slicing dates
date_start = '2021-01-01'
date_end   = '2021-02-01'
# slicing databases
databases = ['PHOTO','QUE']

# creating the index sclice for rows
i_s = idx[date_start:date_end, databases]

###### COLUMNS ######
# ??? here mask for the columns i_c = ???

df.loc[i_s, ] 

My goal is to use the same method to slice the columns
So how I generate the IndexSlice for columns that give me for example:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

pseudocode:
KIND= ACTIVE
DEPARTMENT = A2

I would like to use the same approach, for each multilevel defining a mask

>Solution :

You can use:

i_c = idx[: , 'ACTIVE', 'A2']

df.loc[i_s, i_c] 

Output:

UserID             AB30
KIND             ACTIVE
DEPARTMENT           A2
DATE       DB          
2021-01-01 PHOTO      5
           QUE        9

ACTIVE or PENSIONER

i_c = idx[: , ['ACTIVE', 'PENSIONER'], 'A2']

df.loc[i_s, i_c] 

Output:

UserID             AB30      MIKE
KIND             ACTIVE PENSIONER
DEPARTMENT           A2        A2
DATE       DB                    
2021-01-01 PHOTO      5         3
           QUE        9         2
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading