I have a simple dataframe structure that looks like this:
print(scene_2d_df.head())
     x       y  z
0  963  1691.0  0
1  911  1881.0  0
2  837   864.0  1
3  785  1054.0  0
4  897    59.0  0
print(scene_2d_df.shape)
(2294591, 3)
Every row represents a white or black dot (1 or 0) in an image. The x and y columns are the pixel positions. The image is approx. 1200 x 1800 in this case. I have code that I believe works, but it runs very slowly even on a modern machine. The approach is a bit brute-force.
def construct_image_from_df(df_1):
    xmax = int(df_1.max(axis=0)['x'])
    xmin = int(df_1.min(axis=0)['x'])
    ymax = int(df_1.max(axis=0)['y'])
    ymin = int(df_1.min(axis=0)['y'])
    zmax = int(df_1.max(axis=0)['z'])
    zmin = int(df_1.min(axis=0)['z'])
    print("xmin :: " + str(xmin) + " // xmax :: " + str(xmax))  # 1200-something
    print("ymin :: " + str(ymin) + " // ymax :: " + str(ymax))  # 1800-something
    print("zmin :: " + str(zmin) + " // zmax :: " + str(zmax))  # 1, all values 0 or 1
    img = np.zeros((xmax, ymax))
    length = df_1.shape[0]  # number of rows
    for i in range(0, length):
        x, y, z = int(df_1.iloc[i]['x']), int(df_1.iloc[i]['y']), int(df_1.iloc[i]['z'])
        img[x - 1, y - 1] = z
    return img
Basically I am grabbing every row of the dataframe, and manually doing a pixel write into my 2D img array. It is very slow.
Is there a faster (maybe vectorized) way to do this?
>Solution :
You can use the coordinates almost directly in an indexing expression.
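First, don’t compute min and max multiple times. Because y is stored as float, the combined result comes back as float too, so cast it to int before using it as an array dimension:
x_max, y_max = df[['x', 'y']].max().astype(int)
x_min, y_min = df[['x', 'y']].min().astype(int)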
Then, place the z values into an image buffer directly:
img = np.zeros((x_max + 1, y_max + 1), dtype=df['z'].dtype)
img[df['x'].to_numpy(dtype=int), df['y'].to_numpy(dtype=int)] = df['z'].to_numpy()
Changing the dtype is necessary because y appears to contain floats with integer values, and index arrays must be integers. You can also shrink the image dtype to the minimum required to hold z with np.min_scalar_type:
np.zeros((x_max + 1, y_max + 1), dtype=np.min_scalar_type(df['z'].max()))
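To get a feel for what np.min_scalar_type picks, here is a small sketch. The sample values and the image shape are made up, and it assumes z is never negative (checking only the maximum would miss a negative minimum, which needs a signed type).

```python
import numpy as np

# For a non-negative scalar, NumPy picks the smallest unsigned integer type
# that can hold the value.
small = np.min_scalar_type(1)    # z is only ever 0 or 1
large = np.min_scalar_type(300)  # does not fit in a single byte

print(small, large)

# A one-byte buffer takes an eighth of the memory of np.zeros' float64 default.
img = np.zeros((1201, 1801), dtype=small)
print(img.nbytes)
```

For an image of roughly 1200 x 1800 pixels this is a modest saving, but it also keeps the result in an integer type instead of float.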
If you want a boolean mask and you know that z represents True/False values, force the dtype of img to bool.
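Putting the pieces together, a minimal self-contained sketch could look like this. The function name image_from_points, the as_bool flag, and the tiny sample frame are hypothetical and only there for illustration; the sketch assumes all coordinates are non-negative and that any float coordinates hold whole numbers.

```python
import numpy as np
import pandas as pd


def image_from_points(df, as_bool=False):
    """Scatter the z values of df into a 2D array indexed by (x, y).

    Hypothetical helper: assumes non-negative coordinates and that any
    float coordinates hold whole numbers.
    """
    # Cast once; float columns cannot be used as indices or as a shape.
    xs = df["x"].to_numpy(dtype=int)
    ys = df["y"].to_numpy(dtype=int)
    zs = df["z"].to_numpy()

    dtype = bool if as_bool else zs.dtype
    img = np.zeros((xs.max() + 1, ys.max() + 1), dtype=dtype)

    # One vectorized write replaces the Python-level loop over rows.
    img[xs, ys] = zs
    return img


# Made-up sample in the same layout as the question's frame.
sample = pd.DataFrame({"x": [0, 2, 1], "y": [1.0, 3.0, 0.0], "z": [1, 1, 0]})
img = image_from_points(sample)
mask = image_from_points(sample, as_bool=True)
print(img.shape, mask.dtype)
```

If the same (x, y) pair appears in more than one row, only one of its z values ends up in the image, and which one survives is not something to rely on.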