Sorting a 2D array into bins and adding the weights in each bin

Suppose I have a series of 2d coordinates (x,y) each corresponding to a weight, and after I sort them into bins (i.e. a little square area) I want to find the weight in each bin, which should be the added weights of points that fall into the bin. I used np.digitize to find which bins my data fall into, then I added weights in each bin using a loop. My code is like this:

import numpy as np

x=np.random.uniform(low=0.0, high=10.0, size=(5000,)) #x variable
y=np.random.uniform(low=0.0, high=10.0, size=(5000,)) #y variable
w=np.random.uniform(low=0.0,high=10.0,size=(5000,)) #weight at each (x,y)

binx=np.arange(0,10,1)
biny=np.arange(0,10,1)

indx=np.digitize(x,binx)
indy=np.digitize(y,biny)

#initialise empty list
weight=[[0]*len(binx) for _ in range(len(biny))]

for n in range(0,len(x)):
    for i in range(0,len(binx)):
        for j in range(0,len(biny)):
            if indx[n]==binx[i] and indy[n]==biny[j]:
                weight[i][j]=+w[n]

But the first line of the output weight is empty, which doesn’t make sense. Why does this happen? Is there a more efficient way to do what I want?

Edit: I have a good answer below (one I accepted), but I wonder how it works if I have bins as floats?
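
On the "why does this happen?" part, a minimal sketch of what np.digitize returns may help. It assumes the same integer bin edges as in the question; the three sample points and their weights are made up for illustration.

```python
import numpy as np

binx = np.arange(0, 10, 1)           # edges 0, 1, ..., 9, as in the question
sample = np.array([0.5, 1.5, 9.5])   # hypothetical points, for illustration only
weights = np.array([2.0, 3.0, 4.0])  # hypothetical weights

# For increasing edges, digitize returns i such that bins[i-1] <= v < bins[i],
# so the results here run from 1 to len(binx) and are never 0.
ind = np.digitize(sample, binx)
print(ind)  # indices 1, 2 and 10

# Subtracting 1 gives zero-based positions that can index the output directly.
total = np.zeros(len(binx))
for n in range(len(sample)):
    total[ind[n] - 1] += weights[n]  # "+=" accumulates; "=+" assigns +weights[n]
print(total)
```

Under these assumptions, the condition indx[n]==binx[i] in the loop compares a 1-based index with an edge value, and since the edge 0 is never matched, row 0 stays at zero. Separately, weight[i][j]=+w[n] overwrites the cell with the latest weight instead of adding to it.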

Solution:

You can do this with simple indexing. First get the bin number in each direction. You don’t need np.digitize for evenly spaced bins:

xbin = np.floor_divide(x, 1, dtype=int, casting='unsafe')
ybin = np.floor_divide(y, 1, dtype=int, casting='unsafe')

For the non-negative values used here, this is equivalent to (but faster than) xbin = (x // 1).astype(int). Now make an output grid:

grid = np.zeros_like(w, shape=(xbin.max() + 1, ybin.max() + 1))

Now the trick to getting the addition done correctly when the same bin appears more than once is to do it in unbuffered mode. Ufuncs like np.add have a method, at, for just this purpose:

np.add.at(grid, (xbin, ybin), w)
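
For bins with a float width, as asked in the edit to the question, the same idea can be sketched as follows. This is only a sketch under stated assumptions: the width of 0.5 and the range 0 to 10 are illustrative choices, and it uses the (x // width).astype(int) form rather than the dtype=int call, on the assumption that casting a fractional width to an integer would not preserve it. The result is checked against np.histogram2d with weights, which computes the same per-bin sums.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=5000)
y = rng.uniform(0.0, 10.0, size=5000)
w = rng.uniform(0.0, 10.0, size=5000)

lo, hi = 0.0, 10.0   # assumed data range
width = 0.5          # assumed (float) bin width
nbins = int(round((hi - lo) / width))

# Divide in floating point first, then convert the bin number to an integer.
xbin = ((x - lo) // width).astype(int)
ybin = ((y - lo) // width).astype(int)

# Guard against a value that lands exactly on the upper edge.
xbin = np.clip(xbin, 0, nbins - 1)
ybin = np.clip(ybin, 0, nbins - 1)

# Unbuffered addition: every occurrence of a repeated (xbin, ybin) pair counts.
grid = np.zeros((nbins, nbins))
np.add.at(grid, (xbin, ybin), w)

# Plain fancy-index "+=" is buffered: a repeated pair is only added once.
buffered = np.zeros_like(grid)
buffered[xbin, ybin] += w

# Cross-check against NumPy's weighted 2D histogram on the same edges.
edges = np.linspace(lo, hi, nbins + 1)
expected, _, _ = np.histogram2d(x, y, bins=[edges, edges], weights=w)

print(np.allclose(grid, expected))      # the two approaches should agree
print(np.allclose(buffered, expected))  # the buffered version should not
```

If the bins are not evenly spaced, the floor-division shortcut no longer applies; np.digitize(x, edges) - 1 could supply the indices for np.add.at instead, or np.histogram2d can be used directly.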