Slicing numpy arrays in a for loop

March 21, 2023

I have a multidimensional NumPy array of elasticities, with one dimension being "age-groups" (below/above 18 years) and the other "income-groups" (low/high income).

I would like to create a table with the mean elasticities for each combination of subgroups using a for loop.

My code is as follows:

import numpy as np

elasticity = np.random.rand(2,92)
print(elasticity.shape)

income = ['i0','i1']
age_gr= [':18','18:']

table = {}
for i in range(len(age_gr)):
    for j in range(len(income)):
        key = age_gr[i]+"_"+income[j]
        table[key] = np.mean(elasticity[age_gr[i],j])
print(table)

My problem is that "age_gr[i]" gives me the error "IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices". In reality I have many more age groups, so I can’t do this manually.

I would like to have something like this as a result:

with … representing the mean of elasticities for the sub-group.

>Solution :

The error you’re seeing occurs because you’re using a string (such as "18:") as an index into the NumPy array. NumPy does not parse strings as slices, even when they look like slice syntax. Instead, map each age group to a valid index, such as an integer.
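To make the difference concrete, here is a minimal sketch. The small array `a` and the variable names are made up for illustration and are not part of the original code. It shows that a string index raises the IndexError, while an integer index works:

```python
import numpy as np

# Hypothetical stand-in for the elasticity array (2 rows, 3 columns).
a = np.arange(6).reshape(2, 3)

# A string is never interpreted as a slice, even if it looks like one.
try:
    a["18:", 0]
    string_index_failed = False
except IndexError:
    string_index_failed = True

# An integer index selects the row directly.
first_row = a[0]
print(string_index_failed, first_row)
```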

An example could be:

age_gr_idx = {'<18': 0, '18+': 1}

Then, in your loop, you can use this mapping to get the correct integer index for each age group:

import numpy as np

elasticity = np.random.rand(2, 92)
print(elasticity.shape)

income = ['i0', 'i1']
age_gr = ['<18', '18+']
age_gr_idx = {'<18': 0, '18+': 1}

table = {}
for i in range(len(age_gr)):
    for j in range(len(income)):
        key = age_gr[i] + "_" + income[j]
        table[key] = np.mean(elasticity[age_gr_idx[age_gr[i]], j])
print(table)

Of course, this is not the only way to get this result, but it stays quite close to your original code.
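One alternative, if each age group is meant to cover a range of positions rather than a single index, is to store Python slice objects instead of strings. The sketch below rests on an assumption that the question does not confirm: axis 0 is taken to be the income group, and axis 1 is taken to hold one column per year of age. The name `age_slices` is introduced here for illustration only.

```python
import numpy as np

# Assumption (not stated in the question): axis 0 is the income group and
# axis 1 holds one column per year of age, so ages 0-17 are columns 0..17.
rng = np.random.default_rng(0)
elasticity = rng.random((2, 92))

income = ['i0', 'i1']
# slice(None, 18) is what ":18" means inside brackets; slice(18, None) is "18:".
age_slices = {'<18': slice(None, 18), '18+': slice(18, None)}

table = {}
for age_label, age_slice in age_slices.items():
    for j, income_label in enumerate(income):
        key = age_label + "_" + income_label
        # Mean over all columns that belong to this age group.
        table[key] = elasticity[j, age_slice].mean()
print(table)
```

With more age groups, only the `age_slices` dictionary would need new entries; the loop itself stays the same.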