Converting matrix of strings to PyTorch tensor

I wanted to convert the following matrix into a PyTorch tensor:

[['SELF', '', '', '', ''],
 ['nsubj', 'SELF', '', '', ''],
 ['', 'compound', 'SELF', '', ''],
 ['dobj', '', '', 'SELF', ''],
 ['pobj', '', '', '', 'SELF']]

I want a mask of the same shape with a 1 wherever the string is non-empty and a 0 otherwise. This should be easy, but I cannot find an approach that avoids iterating through the matrix and building the tensor one cell at a time.

The solution I have:

import torch

# sample["edges"] is the square matrix of strings shown above
size = len(sample["edges"])
edge_mask = torch.zeros([size, size])

for i, row in enumerate(sample["edges"]):
    for j, v in enumerate(row):
        if v != "":
            edge_mask[i, j] = 1

Solution:

You can convert the matrix to a boolean NumPy array, so that empty strings become False and every other string becomes True, then wrap it with torch.from_numpy and cast the result to int:

torch.from_numpy(np.array(sample["edges"], dtype=bool)).to(int)
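For reference, here is a self-contained sketch of the same idea that can be run as-is. It is not the exact one-liner above: it builds the boolean array with an explicit comparison against the empty string rather than relying on dtype=bool, which makes the intent visible. The variable name edges is a stand-in for sample["edges"], and the data is copied from the question.

```python
import numpy as np
import torch

# Stand-in for sample["edges"]; values copied from the question.
edges = [
    ["SELF", "", "", "", ""],
    ["nsubj", "SELF", "", "", ""],
    ["", "compound", "SELF", "", ""],
    ["dobj", "", "", "SELF", ""],
    ["pobj", "", "", "", "SELF"],
]

# Elementwise comparison on a NumPy string array yields a boolean array:
# True where the cell is non-empty, False where it is "".
mask_np = np.array(edges) != ""

# from_numpy wraps the boolean array as a torch.bool tensor;
# the cast to int64 then produces a new tensor of 0s and 1s.
edge_mask = torch.from_numpy(mask_np).to(torch.int64)

print(edge_mask)
```

Note that the loop in the question produces a float tensor (torch.zeros defaults to float32), whereas this produces int64. If a float mask is needed downstream, casting with .to(torch.float32) instead should give the same values.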
