Suppose I have a tensor like
[0.6, 0.7, 0.4]
and a mask like:
[1, 0, 0]
How can I normalize it to:
[1, 0, 0]
My attempt:
import torch.nn.functional as F

normalized_attn_scores = F.softmax(attn_scores, 1)
normalized_attn_scores = normalized_attn_scores.mul(attn_mask)
But it doesn’t produce the desired output.
Solution:
You can normalize after masking by dividing the masked tensor by its sum, like this:
import torch
attn_scores = torch.tensor([0.6, 0.7, 0.4])
attn_mask = torch.tensor([1, 0, 0])
# zero out the masked positions
normalized_attn_scores = attn_scores * attn_mask
# rescale so the remaining entries sum to 1
normalized_attn_scores = normalized_attn_scores / normalized_attn_scores.sum()
print(normalized_attn_scores)
This should produce the output:
tensor([1., 0., 0.])
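For reference, here is a quick check (with the same example tensors) of why softmax followed by multiplying with the mask, as in the original attempt, is not enough on its own: the masked entries become zero, but the surviving entries no longer sum to 1, which is exactly what the extra division by the sum fixes.
import torch
import torch.nn.functional as F

attn_scores = torch.tensor([0.6, 0.7, 0.4])
attn_mask = torch.tensor([1, 0, 0])

probs = F.softmax(attn_scores, dim=0)   # ~[0.342, 0.378, 0.280], sums to 1
masked = probs * attn_mask              # ~[0.342, 0.000, 0.000]
print(masked.sum())                     # ~0.342, so it is no longer a valid distribution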
Update: keeping the softmax, as the user requested
If you want to preserve the softmax weighting, apply the softmax first, then zero out the masked entries and renormalize the result. Here’s an example:
import torch
attn_scores = torch.tensor([0.6, 0.7, 0.4])
attn_mask = torch.tensor([1, 0, 0])
# softmax over the full score vector
softmax_attn_scores = torch.softmax(attn_scores, dim=0)
# zero out the masked positions
masked_softmax = softmax_attn_scores * attn_mask
# rescale so the remaining entries sum to 1 again
normalized_masked_softmax = masked_softmax / masked_softmax.sum()
print(normalized_masked_softmax)
This still produces the same output as the version without softmax, tensor([1., 0., 0.]), because renormalizing the masked softmax is equivalent to taking the softmax over only the unmasked entries. So you get the desired result while preserving the softmax weighting whenever more than one position is unmasked.
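As a side note beyond the answer above, a common alternative in attention code is to mask before the softmax by filling the masked positions with -inf, so a single softmax call already returns a normalized distribution over the unmasked entries. A minimal sketch with the same example tensors:
import torch

attn_scores = torch.tensor([0.6, 0.7, 0.4])
attn_mask = torch.tensor([1, 0, 0])

# set masked positions to -inf so softmax assigns them zero probability
masked_scores = attn_scores.masked_fill(attn_mask == 0, float('-inf'))
print(torch.softmax(masked_scores, dim=0))  # tensor([1., 0., 0.])
Both approaches give tensor([1., 0., 0.]) for this example.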