Count occurrences in one NumPy array, using another array to define the bins

I have the following code, which uses np.bincount to count occurrences:

categories = df[attribute].cat.categories
print(categories)
>>> Int64Index([0, 1, 2], dtype='int64')
print(df[attribute].to_numpy())
>>> [0 1 0 1 1]
partition = np.bincount(df[attribute].to_numpy())
print(partition)
>>> [2 3]

What I want is for the counts to use bins based on the categories array, so that the result would be [2 3 0], because there are no 2's in the array. Is there any way to do this? My dataframes are always set up so that categorical columns are integer-encoded, starting from 0 and running up to the number of classes. I want to avoid using df[attribute].value_counts() because profiling suggests it is a bottleneck, though I'm not entirely sure.


Solution:

You can use np.unique with return_counts=True:

import numpy as np
import pandas as pd

df = pd.DataFrame({'attribute': [0, 0, 1, 1, 1]})
df = df.astype({'attribute': pd.CategoricalDtype([0, 1, 2])})

cat, count = np.unique(df['attribute'], return_counts=True)

Output:

>>> cat, count
(array([0, 1]), array([2, 3]))

Suggested by @jezrael, to get your expected output, you can use:

>>> pd.Series(count, index=cat).reindex(df['attribute'].cat.categories, fill_value=0)
0    2
1    3
2    0
dtype: int64
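
Since the question starts from np.bincount, another option worth trying is its minlength parameter, which pads the result with zeros up to a given number of bins. The following is a minimal sketch rather than a tested drop-in: it assumes the column has no missing values, and it counts the categorical codes (cat.codes), which match the values themselves when the categories are integers 0, 1, 2, ... as described in the question.

```python
import numpy as np
import pandas as pd

# Same toy data as above: category 2 is declared but never observed.
df = pd.DataFrame({'attribute': [0, 0, 1, 1, 1]})
df = df.astype({'attribute': pd.CategoricalDtype([0, 1, 2])})

categories = df['attribute'].cat.categories

# cat.codes holds each value's position in `categories`.
# Missing values get the code -1, which np.bincount rejects,
# so this sketch assumes the column contains no NaNs.
codes = df['attribute'].cat.codes.to_numpy()

# minlength pads the result with zeros up to the number of categories.
partition = np.bincount(codes, minlength=len(categories))
print(partition)
# [2 3 0]
```

The result is a plain NumPy array ordered by category position, so it skips building a Series and reindexing it.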

You should still compare the performance against value_counts, which already returns a zero count for unused categories:

>>> df['attribute'].value_counts(sort=False)
0    2
1    3
2    0
Name: attribute, dtype: int64
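
One way to make that comparison is with the standard-library timeit module. The sketch below is only illustrative: the data size, the random test column, and the helper names with_unique and with_value_counts are assumptions, not part of the original question, and the timings will vary with your data and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical test data: 100,000 rows drawn from categories 0 and 1,
# with category 2 declared but never observed.
rng = np.random.default_rng(0)
values = rng.integers(0, 2, size=100_000)
s = pd.Series(pd.Categorical(values, categories=[0, 1, 2]), name='attribute')


def with_unique(s):
    # np.unique only reports observed values, so reindex to add the zeros.
    cat, count = np.unique(s.to_numpy(), return_counts=True)
    return pd.Series(count, index=cat).reindex(s.cat.categories, fill_value=0)


def with_value_counts(s):
    # For a categorical column, unused categories are reported with count 0.
    return s.value_counts(sort=False)


# Sanity check: both approaches agree before timing them.
assert with_unique(s).tolist() == with_value_counts(s).tolist()

runs = 20
for fn in (with_unique, with_value_counts):
    seconds = timeit.timeit(lambda: fn(s), number=runs)
    print(f"{fn.__name__}: {seconds / runs * 1e3:.3f} ms per call")
```

Running this on data shaped like your real dataframes should show whether value_counts is actually the bottleneck.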