Tensor repeat for image patches

October 29, 2022

I have a batch of 20 flattened tensors representing 256×256 images.

>>> imgs.shape
(20, 65536)

Each image was split into 32×32 patches (a total of 64 patches per image). I have calculated a score for each patch, which gives a tensor of shape (20, 64).

I would like to multiply each pixel with the corresponding patch score.

imgs * score raises an error, and score.repeat(1,1,64) doesn't repeat the scores in a way that lines each pixel up with the score of its patch.

How can this be achieved?

EDIT:

A simple example would be:

import torch
img_size = 4
patch_size = 2
img = torch.rand((2,img_size,img_size)) # (2,4,4)
score = torch.tensor([[1,2,3,4],[5,6,7,8]]) # (2,4)

And trying to achieve

score = [[1,1,3,3],[2,2,4,4],[5,5,6,6],[7,7,8,8]]

>Solution :

I would suggest reshaping your scores into a 2D grid of patches per image, so that each score keeps its position relative to the original image, then using repeat_interleave twice, once along each spatial axis.

Example:

import torch
img_size = 4
patch_size = 2
patches_per_axis = img_size // patch_size
num_images = 2
img = torch.rand((num_images,img_size,img_size)) # (2,4,4)
score = torch.tensor([[1,2,3,4],[5,6,7,8]]) # (2,4)

def expand_scores(scores):
    # Unflatten scores into a (num_images, rows, cols) grid of patches
    scores = scores.reshape((num_images, patches_per_axis, patches_per_axis))
    # Repeat scores to match dimensions of image, in vertical direction
    scores = scores.repeat_interleave(repeats=patch_size, dim=1)
    # Repeat scores to match dimensions of image, in horizontal direction
    scores = scores.repeat_interleave(repeats=patch_size, dim=2)
    # Optional: use reshape() to re-flatten scores. If you do that here, you'll need to do it to the image tensor too.
    return scores

(I added two constants at the top of your example: num_images and patches_per_axis. For the tensors in your question, these would be set to 20 and 8, respectively.)
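For the sizes in the question, the same steps might look like the sketch below. It assumes each image was flattened row by row and that the 64 scores are ordered row by row over the 8×8 grid of patches. Random tensors stand in for the real data, and the names grid, per_pixel and weighted are made up for illustration.

```python
import torch

num_images, img_size, patch_size = 20, 256, 32
patches_per_axis = img_size // patch_size  # 8 patches along each axis

# Stand-in data with the shapes from the question (assumption: random values)
imgs = torch.rand(num_images, img_size * img_size)      # (20, 65536)
score = torch.rand(num_images, patches_per_axis ** 2)   # (20, 64)

# Unflatten the scores into an 8x8 grid per image
grid = score.reshape(num_images, patches_per_axis, patches_per_axis)

# Repeat each score 32 times vertically, then 32 times horizontally
per_pixel = grid.repeat_interleave(patch_size, dim=1)
per_pixel = per_pixel.repeat_interleave(patch_size, dim=2)  # (20, 256, 256)

# Re-flatten so the scores line up with the flattened images, then multiply
weighted = imgs * per_pixel.reshape(num_images, -1)         # (20, 65536)
```

If the patches or pixels were ordered differently (for example column by column), the reshape steps would need to change to match.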

When you call expand_scores(score), you get the following output:

tensor([[[1, 1, 2, 2],
         [1, 1, 2, 2],
         [3, 3, 4, 4],
         [3, 3, 4, 4]],

        [[5, 5, 6, 6],
         [5, 5, 6, 6],
         [7, 7, 8, 8],
         [7, 7, 8, 8]]])

You can multiply that by the pixel values:
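A minimal sketch of that multiplication on the small example, assuming the image is kept in its (2, 4, 4) shape so that it matches the expanded scores element for element. The names expanded and weighted are illustrative, not part of the original example.

```python
import torch

img_size, patch_size, num_images = 4, 2, 2
patches_per_axis = img_size // patch_size

img = torch.rand(num_images, img_size, img_size)      # (2, 4, 4)
score = torch.tensor([[1, 2, 3, 4], [5, 6, 7, 8]])    # (2, 4)

# Expand each patch score to cover its 2x2 block of pixels
expanded = (
    score.reshape(num_images, patches_per_axis, patches_per_axis)
    .repeat_interleave(patch_size, dim=1)
    .repeat_interleave(patch_size, dim=2)
)  # (2, 4, 4)

# Element-wise product; the integer scores are promoted to the image's float dtype
weighted = img * expanded
print(weighted.shape)  # torch.Size([2, 4, 4])
```

Because both tensors have the same shape, no broadcasting is involved: every pixel is scaled by the score of the patch it belongs to.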