Understanding gradient computation using backward() in PyTorch

I’m trying to understand the basics of the PyTorch autograd system:

import torch

x = torch.tensor(10., requires_grad=True)
print('tensor:', x)
x.backward()
print('gradient:', x.grad)

output:

tensor: tensor(10., requires_grad=True)
gradient: tensor(1.)

Since x is a scalar constant and no function is applied to it, I expected 0. as the gradient. Why is the gradient 1. instead?


Solution:

Whenever you call value.backward(), you compute the derivative of value (in your case value == x) with respect to all of your parameters (in your case, just x). Roughly speaking, that means every tensor involved in the computation that has requires_grad=True. So this means

x.grad = dx / dx = 1
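
For comparison, here is a minimal sketch of the same rule when an actual function of x is involved (the function y = 3 * x is just an assumed example, not taken from the question): backward() fills x.grad with the derivative of the output with respect to x.

import torch

x = torch.tensor(10., requires_grad=True)
y = 3 * x        # y is built from x, so x is a "parameter" of this graph
y.backward()     # computes dy/dx and stores it in x.grad
print(x.grad)    # tensor(3.)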

To add to that: with automatic differentiation you only ever compute with concrete values. Your functions or networks are always evaluated at a specific point, and the gradient you get is the gradient evaluated at that same point. There is no symbolic computation taking place; all the information needed to compute the gradient is encoded in the computation graph.
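
To illustrate the "evaluated at a point" part, here is a small sketch (the function x ** 2 is an assumed example, not from the question): the gradient comes back as a number, not a symbolic expression.

import torch

x = torch.tensor(10., requires_grad=True)
y = x ** 2       # f(x) = x^2
y.backward()     # evaluates df/dx = 2x at the concrete point x = 10
print(x.grad)    # tensor(20.) -- the numeric value 2 * 10, not the symbol 2x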
