How to divide specific entries in a multi-index pandas DataFrame by a single value

I have a multi-index pandas DataFrame. It has 31 columns, plus an extra index level, File, that records which file the data comes from.

I want to divide the values in certain columns by a specific number (they are pixel values, and I want to convert them to mm by dividing by a scale factor).

The data in each of these columns are floats, and px_to_mm is an int.

Instead of producing floats as I would expect, the division below leaves NaN values across all entries of each column.

My code is as follows:

unique_animals = df.index.get_level_values('File').unique()
px_to_mm = 791
columns_in_px = ['mouth_x', 'mouth_y',
                 'stomach_centre_x', 'stomach_centre_y',
                 'aboral_organ_x', 'aboral_organ_y',
                 'tentacle_1_x', 'tentacle_1_y',
                 'tentacle_2_x', 'tentacle_2_y',
                 'cilia_1_x', 'cilia_1_y',
                 'cilia_2_x', 'cilia_2_y',
                 'X_diff_stomach', 'Y_diff_stomach']

for animal in unique_animals:
    for column in columns_in_px:
        df.loc[animal, column] = df.loc[animal, column] / px_to_mm

This is what the df index looks like:

MultiIndex([('CtenoEgg230801_',     0),
        ('CtenoEgg230801_',     1),
        ('CtenoEgg230801_',     2),
        ('CtenoEgg230801_',     3),
        ('CtenoEgg230801_',     4),
        ('CtenoEgg230801_',     5),
        ('CtenoEgg230801_',     6),
        ('CtenoEgg230801_',     7),
        ('CtenoEgg230801_',     8),
        ('CtenoEgg230801_',     9),
        ...
        ('CtenoEgg230802_', 66240),
        ('CtenoEgg230802_', 66241),
        ('CtenoEgg230802_', 66242),
        ('CtenoEgg230802_', 66243),
        ('CtenoEgg230802_', 66244),
        ('CtenoEgg230802_', 66245),
        ('CtenoEgg230802_', 66246),
        ('CtenoEgg230802_', 66247),
        ('CtenoEgg230802_', 66248),
        ('CtenoEgg230802_', 66249)],
       names=['File', None], length=106632)

and a sample of the first few rows:

mouth_x mouth_y mouth_likelihood    stomach_centre_x    stomach_centre_y    stomach_centre_likelihood   aboral_organ_x  aboral_organ_y  aboral_organ_likelihood tentacle_1_x    ... X_diff_stomach  Y_diff_stomach  Velocity_stomach    Acceleration_stomach    Theta_mouth_stomach Theta_Velocity_mouth_stomach    Theta_Acceleration_mouth_stomach    Theta_deg_mouth_stomach Height_Index    Frame
File                                                                                        
0   231.626724  233.873352  0.999196    200.364288  191.369202  0.998929    168.946747  140.374954  0.996564    202.717392  ... NaN NaN NaN NaN 0.936630    NaN NaN 53.692184   86.915467   0
1   230.637405  234.197998  0.999158    200.186630  191.611725  0.998900    169.261520  140.385788  0.997088    203.156342  ... -0.177658   0.242523    NaN NaN 0.950049    0.013419    NaN 54.461426   87.094083   1
2   230.883316  233.928162  0.999064    200.056335  191.886490  0.999025    169.505844  139.894012  0.997208    205.199158  ... -0.130295   0.274765    0.304093    NaN 0.938103    -0.011946   -0.025365   53.776596   87.748322   2
3   229.841034  234.385590  0.999249    199.935638  191.638977  0.999073    170.242477  139.233582  0.995122    203.273712  ... -0.120697   -0.247513   0.275373    -0.028720   0.960341    0.022238    0.034185    55.051394   87.569965   3
4   229.045685  234.782135  0.999314    200.159286  191.692688  0.999104    169.480316  138.349838  0.994976    203.242462  ... 0.223648    0.053711    0.230007    -0.045366   0.980226    0.019885    -0.002353   56.191290   88.285329   4

Solution:

The outer loop seems unnecessary here, since it visits every file and therefore ends up covering all rows:

for animal in unique_animals:  # all rows?
    for column in columns_in_px:
        df.loc[animal, column] = df.loc[animal, column] / px_to_mm
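
A likely reason for the NaN values is label alignment: selecting a single file with .loc drops the File level, so the Series on the right-hand side no longer carries the same labels as the rows being assigned to. The sketch below uses a hypothetical toy frame (the name toy and its values are made up for illustration) to show the mismatch, and one way around it by assigning raw values instead of a labelled Series.

```python
import pandas as pd

# Hypothetical toy frame: two files, three frames each, one pixel column.
index = pd.MultiIndex.from_product(
    [["CtenoEgg230801_", "CtenoEgg230802_"], range(3)], names=["File", None]
)
toy = pd.DataFrame(
    {"mouth_x": [791.0, 1582.0, 2373.0, 395.5, 791.0, 1186.5]}, index=index
)

# Selecting one file drops the 'File' level from the result's index ...
selected = toy.loc["CtenoEgg230801_", "mouth_x"]
print(selected.index.nlevels)

# ... while the rows being assigned to still carry the full two-level index.
is_first_file = toy.index.get_level_values("File") == "CtenoEgg230801_"
target_rows = toy.index[is_first_file]
print(target_rows.nlevels)

# Assigning a plain NumPy array skips label alignment, so values are
# written by position instead of being matched on index labels.
toy.loc["CtenoEgg230801_", "mouth_x"] = selected.to_numpy() / 791
print(toy)
```

This per-file workaround would only be needed if each file had its own scale factor. In the question the factor is the same for every file.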

So just divide the selected columns in a single vectorized step:

df[columns_in_px] /= px_to_mm
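
As a minimal, self-contained sketch of that one-liner, the example below builds a small stand-in frame with the same kind of index (the data values and the shortened column list are assumptions for illustration, not the asker's data) and divides only the pixel columns.

```python
import pandas as pd

px_to_mm = 791  # scale factor from the question
columns_in_px = ["mouth_x", "mouth_y"]  # shortened list for this sketch

# Stand-in for the real data: two files, two frames each.
index = pd.MultiIndex.from_product(
    [["CtenoEgg230801_", "CtenoEgg230802_"], range(2)], names=["File", None]
)
df = pd.DataFrame(
    {
        "mouth_x": [791.0, 1582.0, 395.5, 0.0],
        "mouth_y": [1582.0, 791.0, 0.0, 395.5],
        "mouth_likelihood": [0.99, 0.98, 0.97, 0.96],
    },
    index=index,
)

# Divide every listed column by the scalar; rows from all files are
# handled at once and columns outside the list are left untouched.
df[columns_in_px] /= px_to_mm
print(df)
```

Because the divisor is a scalar, no index alignment is involved, so the MultiIndex plays no role in the operation.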