I have a multidimensional numpy array of elasticities, with one of the dimensions being "age-groups" (below/above 18 years) and the other "income-groups" (low/high income).
I would like to create a table with the mean elasticities for each combination of subgroups using a for loop.
My code is as follows:
import numpy as np
elasticity = np.random.rand(2,92)
print(elasticity.shape)
income = ['i0','i1']
age_gr = [':18','18:']
table = {}
for i in range(len(age_gr)):
    for j in range(len(income)):
        key = age_gr[i]+"_"+income[j]
        table[key] = np.mean(elasticity[age_gr[i],j])
print(table)
My problem is that "age_gr[i]" gives me an error: "IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices".
In reality I have many more age-groups, so I can’t do this manually.
I would like to have something like this as a result:
with … representing the mean of elasticities for the sub-group.
>Solution :
The error you’re seeing is because you’re trying to use a string value ("18:") as an index for the numpy array. Instead, you should use the corresponding integer indices for the age groups.
An example could be:
age_gr_idx = {'<18': 0, '18+': 1}
Then, in your loop, you can use this mapping to get the correct integer index for each age group:
import numpy as np
elasticity = np.random.rand(2, 92)
print(elasticity.shape)
income = ['i0', 'i1']
age_gr = ['<18', '18+']
# Map each age-group label to its integer position in the array
age_gr_idx = {'<18': 0, '18+': 1}
table = {}
for i in range(len(age_gr)):
    for j in range(len(income)):
        key = age_gr[i] + "_" + income[j]
        # Index with the integer looked up from the label, not the label itself
        table[key] = np.mean(elasticity[age_gr_idx[age_gr[i]], j])
print(table)
Of course, this is not the only way to get this kind of result, but I think it stays quite close to your original code.
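One caveat: with a (2, 92) array, an expression like elasticity[0, j] selects a single element, so the mean is just that element. If the strings ':18' and '18:' in the question were meant as slices along the 92-element axis, a possible alternative is to store slice objects instead of strings. This is only a sketch under assumptions that the question does not confirm: that axis 0 is the income group and axis 1 is age in years. The name age_slices is made up for illustration.

```python
import numpy as np

# Assumption: axis 0 = income group (2 entries), axis 1 = age in years (92 entries).
rng = np.random.default_rng(0)
elasticity = rng.random((2, 92))

income = ['i0', 'i1']
# Hypothetical mapping: each age-group label points to a slice object,
# which numpy accepts as an index (unlike a string such as ':18').
age_slices = {'<18': slice(None, 18), '18+': slice(18, None)}

table = {}
for age_label, age_slice in age_slices.items():
    for j, income_label in enumerate(income):
        key = age_label + "_" + income_label
        # Mean over all ages in the group, for income group j
        table[key] = float(np.mean(elasticity[j, age_slice]))

print(table)
```

Adding more age groups then only means adding entries to age_slices, for example '18-64': slice(18, 65); the loop itself stays unchanged.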
