Saturday, 15 May 2021

Using PyTorch's autograd efficiently with tensors by calculating the Jacobian

In my previous question I found out how to use PyTorch's autograd with tensors:

import torch
from torch.autograd import grad
import torch.nn as nn
import torch.optim as optim

class net_x(nn.Module):
    def __init__(self):
        super(net_x, self).__init__()
        self.fc1 = nn.Linear(1, 20)
        self.fc2 = nn.Linear(20, 20)
        self.out = nn.Linear(20, 4) #a,b,c,d

    def forward(self, x):
        x = torch.tanh(self.fc1(x))
        x = torch.tanh(self.fc2(x))
        x = self.out(x)
        return x

nx = net_x()

#input
t = torch.tensor([1.0, 2.0, 3.2], requires_grad = True) #input vector
t = torch.reshape(t, (3,1)) #reshape for batch

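#method: full Jacobian of shape (3, 4, 3, 1), where dx[i, k, j, 0] = d out[i, k] / d t[j]
dx = torch.autograd.functional.jacobian(nx, t)
#the inner diagonal's offset selects output k; the outer diagonal keeps only the i == j entries
dx = torch.diagonal(torch.diagonal(dx, 0, -1), 0)[0] #1st vector (output a)
#dx = torch.diagonal(torch.diagonal(dx, 1, -1), 0)[0] #2nd vector (output b)
#dx = torch.diagonal(torch.diagonal(dx, 2, -1), 0)[0] #3rd vector (output c)
#dx = torch.diagonal(torch.diagonal(dx, 3, -1), 0)[0] #4th vector (output d)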
dx 
>>> 
tensor([-0.0142, -0.0517, -0.0634])

The issue is that grad, when called without grad_outputs, can only propagate gradients from a scalar tensor (which my network's output is not), which is why I had to calculate the Jacobian.
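As a minimal sketch of that limitation (using a small stand-in model that is an assumption for illustration, not the network above): calling grad on a non-scalar output raises a RuntimeError, while passing grad_outputs makes it return a vector-Jacobian product rather than the Jacobian itself.

```python
import torch
from torch.autograd import grad

# Stand-in model (an assumption for illustration): maps (N, 1) to (N, 4).
model = torch.nn.Sequential(
    torch.nn.Linear(1, 20), torch.nn.Tanh(), torch.nn.Linear(20, 4)
)
t = torch.tensor([[1.0], [2.0], [3.2]], requires_grad=True)

# Without grad_outputs, grad refuses an output that is not a scalar.
y = model(t)  # shape (3, 4)
try:
    grad(y, t)
    raised = False
except RuntimeError:
    raised = True

# With grad_outputs, grad computes v^T J for the supplied v (here all ones),
# which has the shape of the input, not of the full Jacobian.
y = model(t)
v = torch.ones_like(y)
(vjp,) = grad(y, t, grad_outputs=v)
print(raised, vjp.shape)
```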

However, this is inefficient: the Jacobian is large, so computing all of it takes a while, and I only use a small part of it.

Is there a way to calculate only the diagonals of the Jacobian (to get the 4 vectors in this example)?
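One possible approach, sketched here under the assumption that each output row depends only on its own input row (true for a plain feed-forward network like this one, but not for layers that mix samples, such as batch norm): summing one output column over the batch gives a scalar whose gradient with respect to t is exactly that column's diagonal. The helper name per_sample_derivatives and the stand-in model are hypothetical, and the check against the full Jacobian is only there to confirm the two agree.

```python
import torch
from torch.autograd import grad
from torch.autograd.functional import jacobian

torch.manual_seed(0)

# Stand-in with the same layer sizes as net_x (an assumption for illustration).
model = torch.nn.Sequential(
    torch.nn.Linear(1, 20), torch.nn.Tanh(),
    torch.nn.Linear(20, 20), torch.nn.Tanh(),
    torch.nn.Linear(20, 4),
)
t = torch.tensor([1.0, 2.0, 3.2]).reshape(3, 1).requires_grad_(True)

def per_sample_derivatives(model, t):
    """Return d y[i, k] / d t[i] for every sample i and output k, shape (N, K)."""
    y = model(t)
    columns = []
    for k in range(y.shape[1]):
        # y[:, k].sum() is a scalar; since y[i, k] depends only on t[i],
        # its gradient w.r.t. t holds d y[i, k] / d t[i] in row i.
        (g,) = grad(y[:, k].sum(), t, retain_graph=True)
        columns.append(g.squeeze(-1))
    return torch.stack(columns, dim=1)

dx_all = per_sample_derivatives(model, t)  # column 0 is the "first vector"

# Cross-check against the diagonals of the full (3, 4, 3, 1) Jacobian.
full = jacobian(model, t).squeeze(-1)      # shape (3, 4, 3)
idx = torch.arange(t.shape[0])
reference = full[idx, :, idx]              # shape (3, 4)
print(torch.allclose(dx_all, reference, atol=1e-6))
```

This needs one backward pass per output column (4 here) instead of one per output element, which is where any saving would come from.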

There appears to be an open feature request for this, but it does not seem to have received much attention.

Update 1:
I tried what @iacob suggested: calling torch.autograd.functional.jacobian with vectorize=True. However, this seems to be slower. To test this, I changed my network's output size from 4 to 400 and my input t to:

val = 100
t = torch.rand(val, requires_grad = True) #input vector
t = torch.reshape(t, (val,1)) #reshape for batch

Without vectorize=True:

Wall time: 10.4 s

With:

Wall time: 14.6 s
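A comparison like this could be reproduced with a small timing harness such as the sketch below. The stand-in model, the helper name timed_jacobian, and the reduced sizes (10 inputs, 40 outputs) are assumptions chosen so that it runs quickly; the timings it prints will depend on the machine and the PyTorch version.

```python
import time
import torch
from torch.autograd.functional import jacobian

torch.manual_seed(0)

# Stand-in model (an assumption); 40 outputs instead of the 400 used above.
model = torch.nn.Sequential(
    torch.nn.Linear(1, 20), torch.nn.Tanh(),
    torch.nn.Linear(20, 20), torch.nn.Tanh(),
    torch.nn.Linear(20, 40),
)
t = torch.rand(10, 1)  # 10 inputs instead of 100

def timed_jacobian(vectorize):
    """Compute the full Jacobian and return it with the elapsed seconds."""
    start = time.perf_counter()
    jac = jacobian(model, t, vectorize=vectorize)
    return jac, time.perf_counter() - start

jac_loop, seconds_loop = timed_jacobian(False)
jac_vec, seconds_vec = timed_jacobian(True)

print(f"vectorize=False: {seconds_loop:.4f} s")
print(f"vectorize=True:  {seconds_vec:.4f} s")
```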


from Using PyTorch's autograd efficiently with tensors by calculating the Jacobian