Variables
Deep learning algorithms are often represented as computation graphs. The following figure shows a simple example of the variable computation graph that we built earlier:
Each circle in the preceding computation graph represents a variable. A variable forms a thin wrapper around a tensor object, its gradients, and a reference to the function that created it. The following figure shows the components of the Variable class:
Gradients refer to the rate of change of the loss function with respect to various parameters (W, b). For example, if the gradient of a is 2, then any change in the value of a modifies the value of Y by twice that amount. If that is not clear, do not worry; most deep learning frameworks take care of calculating gradients for us. In this chapter, we learn how to use these gradients to improve the performance of our model.
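To make this concrete, here is a minimal sketch (the variable a and the function y = 2 * a are our own illustration, not part of the preceding figure): since y doubles a, the gradient of a is 2, and PyTorch computes exactly that:
import torch
from torch.autograd import Variable

a = Variable(torch.ones(1), requires_grad=True)
y = 2 * a      # y changes by twice any change in a, so the gradient of a is 2
y.backward()
print(a.grad)  # Variable containing: 2 [torch.FloatTensor of size 1]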
Apart from gradients, a variable also holds a reference to the function that created it, which tells us how the variable came to be. For example, the variable a carries the information that it was generated as the result of the product of X and W.
Let's look at an example where we create variables and check the gradients and the function reference:
import torch
from torch.autograd import Variable

x = Variable(torch.ones(2, 2), requires_grad=True)
y = x.mean()
y.backward()
x.grad
Variable containing:
 0.2500  0.2500
 0.2500  0.2500
[torch.FloatTensor of size 2x2]
x.grad_fn
None
x.data
1 1
1 1
[torch.FloatTensor of size 2x2]
y.grad_fn
<torch.autograd.function.MeanBackward at 0x7f6ee5cfc4f8>
In the preceding example, we called the backward operation on the variable y to compute the gradients. By default, the grad attribute of a variable is None; it is populated only once backward is called.
The grad_fn of a variable points to the function that created it. If the variable was created by a user, as x was in our case, then the function reference is None. For the variable y, grad_fn refers to MeanBackward, the function that produced it.
The data attribute accesses the raw tensor associated with the variable.
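To tie these pieces together, here is a short sketch mirroring the product of X and W described earlier (the names x, w, and a are our own illustration):
import torch
from torch.autograd import Variable

x = Variable(torch.ones(2, 2), requires_grad=True)
w = Variable(torch.ones(2, 2), requires_grad=True)
a = x.mm(w)        # a is created by a matrix multiplication of x and w
print(a.grad_fn)   # a backward function reference, e.g. MmBackward
print(x.grad_fn)   # None, because x was created by the user
print(a.data)      # the raw tensor wrapped by the variable a
Here a.grad_fn is populated because a was produced by an operation, while x.grad_fn stays None, exactly as with x and y in the earlier example.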