Faster way of converting a dataframe of x,y,z values into an image?

I have a simple dataframe structure that looks like this:

print(scene_2d_df.head())

     x       y  z
0  963  1691.0  0
1  911  1881.0  0
2  837   864.0  1
3  785  1054.0  0
4  897    59.0  0

print(scene_2d_df.shape)

(2294591, 3)

Every row represents a white or black dot in an image (z is 1 or 0). The x and y columns are the pixel positions. The image is approx. 1200 x 1800 in this case. I have code which I believe works, but it runs very slowly even on a modern machine. The approach is a bit brute-force.

def construct_image_from_df(df_1):
    xmax = int(df_1.max(axis=0)['x'])
    xmin = int(df_1.min(axis=0)['x'])
    ymax = int(df_1.max(axis=0)['y'])
    ymin = int(df_1.min(axis=0)['y'])
    zmax = int(df_1.max(axis=0)['z'])
    zmin = int(df_1.min(axis=0)['z'])
    
    print("xmin :: " + str(xmin) + " // xmax :: " + str(xmax)) # 1200-something
    print("ymin :: " + str(ymin) + " // ymax :: " + str(ymax)) # 1800-something
    print("zmin :: " + str(zmin) + " // zmax :: " + str(zmax)) # 1, all values 0 or 1
    
    img = np.zeros((xmax, ymax))
    
    length = df_1.shape[0] # number of rows
    for i in range(0, length):
        x, y, z = int(df_1.iloc[i]['x']), int(df_1.iloc[i]['y']), int(df_1.iloc[i]['z'])
        img[x - 1, y - 1] = z

    return img

Basically I am grabbing every row of the dataframe, and manually doing a pixel write into my 2D img array. It is very slow.

Is there a faster (maybe vectorized) way to do this?

Solution:

You can use the coordinates almost directly in an indexing expression.

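First, don’t compute min and max multiple times. Because y is stored as float, taking the max of both columns together yields float results, so cast them to int before using them as array dimensions:

x_max, y_max = df[['x', 'y']].max().astype(int)
x_min, y_min = df[['x', 'y']].min().astype(int)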

Then, place the z values into an image buffer directly:

img = np.zeros((x_max + 1, y_max + 1), dtype=df['z'].dtype)
img[df['x'].to_numpy(dtype=int), df['y'].to_numpy(dtype=int)] = df['z'].to_numpy()
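
Putting the two steps together, a minimal self-contained sketch might look like the following. The function name image_from_df and the small sample frame are made up for illustration; only the column names x, y, z come from the question. It assumes all coordinates are non-negative and contain no missing values.

```python
import numpy as np
import pandas as pd


def image_from_df(df):
    """Scatter the z values of df into a 2D array indexed by (x, y)."""
    # Cast coordinates to integers once; y is stored as float in the question.
    xs = df['x'].to_numpy(dtype=int)
    ys = df['y'].to_numpy(dtype=int)
    zs = df['z'].to_numpy()

    # Size the buffer so that the largest coordinate is a valid index.
    img = np.zeros((xs.max() + 1, ys.max() + 1), dtype=zs.dtype)

    # One vectorized write replaces the per-row Python loop.
    img[xs, ys] = zs
    return img


# Hypothetical sample data in the same layout as scene_2d_df.
sample = pd.DataFrame({
    'x': [0, 2, 1, 2],
    'y': [1.0, 0.0, 3.0, 3.0],
    'z': [0, 1, 1, 0],
})
img = image_from_df(sample)
print(img.shape)
```

If the same (x, y) pair occurs in more than one row, only one of the z values ends up in the image, so it may be worth deduplicating the frame first when that matters.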

Passing dtype=int to to_numpy is necessary because y appears to contain floats with integer values, and index arrays must be integers. You can also shrink the dtype of img to the minimum required to hold z with np.min_scalar_type:

np.zeros((x_max + 1, y_max + 1), dtype=np.min_scalar_type(df['z'].max()))

If you want a boolean mask and you know that z represents True/False values, force the dtype of img to bool.
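
As a rough illustration of both dtype choices, the sketch below uses a hypothetical four-element z array and made-up coordinates rather than the real data. It only shows how the buffer dtype can be picked; the indexing step is the same as before.

```python
import numpy as np

# Hypothetical z values and coordinates standing in for the dataframe columns.
z = np.array([0, 1, 1, 0], dtype=np.int64)
xs = np.array([0, 1, 1, 0])
ys = np.array([0, 0, 1, 1])

# Smallest dtype able to hold the largest z value.
small = np.min_scalar_type(z.max())
compact = np.zeros((2, 2), dtype=small)
compact[xs, ys] = z

# Boolean mask: nonzero z values become True on assignment.
mask = np.zeros((2, 2), dtype=bool)
mask[xs, ys] = z

print(compact.dtype, compact.nbytes, mask.dtype)
```

For an image of roughly 1200 x 1800 pixels, a one-byte dtype needs about an eighth of the memory of the default 8-byte integers or floats.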
