
How to prevent memory use from growing when updating weights and biases in a PyTorch model

Published on 2020-11-25 07:54:22

I'm trying to build a VGG16 model with PyTorch in order to export it to ONNX. I want to force my own set of weights and biases onto the model, but in the process my computer quickly runs out of memory.

Here is how I want to do it (this is only a test; in the real version I read the weights and biases from a set of files). This example simply forces all values to 0.5:

# Create an empty VGG16 model (random weights)
from torchvision import models
from torchsummary import summary

vgg16 = models.vgg16()
# the structure is: vgg16.__dict__
summary(vgg16, (3, 224, 224))

# Convolutional layers
for layer in vgg16.features:
    print()
    print(layer)
    if hasattr(layer, 'weight'):
        dim = layer.weight.shape
        print(dim)
        print(str(dim[0]*(dim[1]*dim[2]*dim[3]+1)) + ' params')

        # Replace weights and biases
        for i in range(dim[0]):
            layer.bias[i] = 0.5
            for j in range(dim[1]):
                for k in range(dim[2]):
                    for l in range(dim[3]):
                        layer.weight[i][j][k][l] = 0.5

# Dense layers
for layer in vgg16.classifier:
    print()
    print(layer)
    if hasattr(layer, 'weight'):
        dim = layer.weight.shape
        print(str(dim) + ' --> ' + str(dim[0]*(dim[1]+1)) + ' params')
        for i in range(dim[0]):
            layer.bias[i] = 0.5
            for j in range(dim[1]):
                layer.weight[i][j] = 0.5

When I look at the computer's memory usage, it grows linearly and saturates the 16 GB of RAM while the first dense layer is being processed. Then Python crashes...

Is there a better way to do this, keeping in mind that I want to export the model to ONNX afterwards? Thanks for your help.
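For the real use case of reading weights and biases from files, a common pattern is to build a state_dict and load it in one call, which bypasses autograd entirely. A minimal sketch, assuming the files have already been parsed into tensors of the right shapes (torch.full_like stands in for the tensors that would really come from the files):

import torch
from torchvision import models

vgg16 = models.vgg16()

# Build a replacement state_dict; every entry here is a constant tensor,
# but in practice each one would be read from the corresponding file.
new_state = {name: torch.full_like(param, 0.5)
             for name, param in vgg16.state_dict().items()}
vgg16.load_state_dict(new_state)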

Questioner: Fabrice Auzanneau

Answer by Poe Dator, 2020-11-25 18:00:42

The memory growth is caused by autograd tracking every weight and bias change so that gradients can be computed, which keeps enlarging the computation graph. Try setting the .requires_grad attribute to False before the update and restoring it afterwards. Example:

for layer in vgg16.features:
    print()
    print(layer)
    if hasattr(layer, 'weight'):

        # suppress .requires_grad
        layer.bias.requires_grad = False
        layer.weight.requires_grad = False

        dim = layer.weight.shape
        print(dim)
        print(str(dim[0]*(dim[1]*dim[2]*dim[3]+1)) + ' params')

        # Replace weights and biases
        for i in range(dim[0]):
            layer.bias[i] = 0.5
            for j in range(dim[1]):
                for k in range(dim[2]):
                    for l in range(dim[3]):
                        layer.weight[i][j][k][l] = 0.5

        # restore .requires_grad
        layer.bias.requires_grad = True
        layer.weight.requires_grad = True
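An alternative to toggling .requires_grad is to wrap the whole update in torch.no_grad(), which disables gradient tracking for everything inside the block, and to use the vectorized in-place fill_ instead of Python loops. A minimal sketch, covering both the convolutional and dense layers and finishing with the ONNX export (the dummy input shape and output file name are illustrative):

import torch

# No autograd graph is built inside this block
with torch.no_grad():
    for layer in list(vgg16.features) + list(vgg16.classifier):
        if hasattr(layer, 'weight'):
            layer.weight.fill_(0.5)   # vectorized, no Python loops
            layer.bias.fill_(0.5)

# Export to ONNX with a dummy input matching VGG16's expected shape
dummy_input = torch.randn(1, 3, 224, 224)
torch.onnx.export(vgg16, dummy_input, "vgg16.onnx")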