Help needed implementing Bilinear CNN in PyTorch

Following the TensorFlow version (https://github.com/abhaydoke09/Bilinear-CNN-TensorFlow), I am trying to implement Bilinear-CNN with a two-stage training strategy: first freeze the early layers and train only the last FC layer, then fine-tune the whole network.
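
For reference, a two-stage schedule of this kind could be sketched roughly as below. This is a minimal sketch, not the actual training script: `TinyNet` and `set_stage` are hypothetical names used only for illustration, the tiny network stands in for VGG16 so the snippet runs quickly, and plain SGD is an assumption. The learning rates are the ones quoted in this post.

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """Hypothetical stand-in with the same features/classifier split as the real model."""
    def __init__(self, num_classes=5):
        super().__init__()
        self.features = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())
        self.classifier = nn.Linear(8 * 8, num_classes)

    def forward(self, x):
        x = self.features(x).flatten(2)                    # [B, 8, H*W]
        x = torch.bmm(x, x.transpose(1, 2)) / x.size(2)    # [B, 8, 8]
        return self.classifier(x.flatten(1))

def set_stage(model, stage):
    """Stage 1 trains only the classifier; stage 2 trains every layer.

    Returns the list of parameters that should be handed to the optimizer.
    """
    for p in model.features.parameters():
        p.requires_grad = (stage == 2)
    for p in model.classifier.parameters():
        p.requires_grad = True
    return [p for p in model.parameters() if p.requires_grad]

model = TinyNet()

# Stage 1: frozen feature extractor, large learning rate on the FC layer only.
stage1_params = set_stage(model, stage=1)
opt1 = torch.optim.SGD(stage1_params, lr=1.0)

# Stage 2: unfreeze everything and fine-tune with a smaller learning rate.
stage2_params = set_stage(model, stage=2)
opt2 = torch.optim.SGD(stage2_params, lr=1e-2)
```

Building a fresh optimizer for each stage keeps frozen parameters out of the optimizer entirely, which makes it easy to verify which layers are actually being updated.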

In my PyTorch implementation, the accuracy rises to 55% in the first stage (lr=1, lr_stepsize=10, total_epoch=20). In the second stage (starting from epoch 20, lr=1e-2), the accuracy ends at 63%, which is unsatisfactory. Training losses and accuracies are as follows:

[Figure: loss_acc]

The key code is as follows:

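import torch
import torch.nn as nn
import torch.nn.functional as F

# The definition of Bilinear CNN
# input: [batch, channel, height, width]
class VggBasedNet_bilinear(nn.Module):
    def __init__(self, originalModel):
        super(VggBasedNet_bilinear, self).__init__()
        # feature extraction from Conv5_3 with relu (drops the final max-pool);
        # uses the originalModel argument instead of a global original_vgg16
        self.features = nn.Sequential(*list(originalModel.features)[:-1])

        # args is defined elsewhere in the training script
        self.classifier = nn.Linear(512 * 512, args.numClasses)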

    def forward(self, x):
        # feature extraction from Conv5_3 with relu
        x = self.features(x).view(-1,512,784)

        # outer product of the features at each position, averaged over height*width (784 positions)
        x = torch.matmul(x, x.permute(0,2,1)).view(-1,512*512)/784.0

        # signed sqrt
        x = torch.mul(torch.sign(x),torch.sqrt(torch.abs(x)+1e-12)) 

        # L2 normalization
        x = F.normalize(x, p=2, dim=1)

        # final FC layer
        x = self.classifier(x)

        return x
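
The pooling steps in forward can also be checked in isolation. Below is a minimal sketch, assuming a hypothetical helper `bilinear_pool` (not part of my script) that reads the spatial size from the tensor instead of hard-coding 784. It uses a small random feature map rather than real VGG16 activations.

```python
import torch
import torch.nn.functional as F

def bilinear_pool(feat, eps=1e-12):
    """Bilinear pooling of a [B, C, H, W] feature map with itself.

    Returns a [B, C*C] descriptor after signed sqrt and L2 normalization.
    """
    b, c, h, w = feat.shape
    x = feat.reshape(b, c, h * w)
    # Sum of outer products over all positions, averaged over H*W.
    x = torch.bmm(x, x.transpose(1, 2)) / (h * w)       # [B, C, C]
    x = x.reshape(b, c * c)
    # Signed square root, then L2 normalization per sample.
    x = torch.sign(x) * torch.sqrt(torch.abs(x) + eps)
    return F.normalize(x, p=2, dim=1)

# Small random non-negative features standing in for post-ReLU activations.
feat = torch.relu(torch.randn(2, 16, 7, 7))
desc = bilinear_pool(feat)
print(desc.shape)  # torch.Size([2, 256])
```

With 512 channels the descriptor has 512 * 512 entries, which matches the input size of the final FC layer above.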

I am fairly sure the rest of the code is correct, because I only changed the network structure in an existing VGG16 fine-tuning script.

Could anyone who is familiar with both Bilinear-CNN and PyTorch help me?
