Asking for help with implementing Bilinear CNN in PyTorch
Following the TensorFlow version (https://github.com/abhaydoke09/Bilinear-CNN-TensorFlow), I am trying to implement Bilinear-CNN with a two-stage training strategy: first freeze the earlier layers and train only the last FC layer, then fine-tune the whole network.
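For reference, here is a minimal sketch of how the freeze-then-fine-tune switch could be written in PyTorch. It is not my actual training script: `TinyNet` and `make_optimizer` are hypothetical names, and the choice of SGD with momentum 0.9 is an assumption made only for illustration. The point is that stage 1 should hand only the classifier parameters to the optimizer, and stage 2 should hand it all of them.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for the real network: conv features + one FC layer."""

    def __init__(self, num_classes=5):
        super().__init__()
        self.features = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())
        self.classifier = nn.Linear(8, num_classes)

    def forward(self, x):
        x = self.features(x).mean(dim=(2, 3))  # global average pool -> [batch, 8]
        return self.classifier(x)


def make_optimizer(model, stage, lr):
    """Stage 1 trains only the classifier; stage 2 trains every parameter."""
    for p in model.features.parameters():
        p.requires_grad = (stage == 2)
    for p in model.classifier.parameters():
        p.requires_grad = True
    # Only parameters that currently require gradients go to the optimizer.
    trainable = [p for p in model.parameters() if p.requires_grad]
    # SGD with momentum is an assumption, not something stated in the post.
    return torch.optim.SGD(trainable, lr=lr, momentum=0.9)


if __name__ == "__main__":
    model = TinyNet()
    opt = make_optimizer(model, stage=1, lr=1.0)
    print(sum(p.numel() for g in opt.param_groups for p in g["params"]))
```

Rebuilding the optimizer when switching stages keeps the list of trainable parameters explicit, which makes it easy to confirm that the conv layers really are frozen in stage 1 and unfrozen in stage 2.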
When I implement it in PyTorch, the accuracy rises to 55% in the first stage (lr=1, lr_stepsize=10, total_epoch=20). In the second stage (starting from epoch 20, lr=1e-2), the accuracy ends at 63%, which is unsatisfactory. Training losses and accuracies are as follows:
The key code is as follows:
import torch
import torch.nn as nn
import torch.nn.functional as F

# The definition of Bilinear CNN
# input: [batch, channel, height, width]
class VggBasedNet_bilinear(nn.Module):
    def __init__(self, originalModel):
        super(VggBasedNet_bilinear, self).__init__()
        # feature extraction from Conv5_3 with relu (drops the final max-pool)
        self.features = nn.Sequential(*list(originalModel.features)[:-1])
        # args.numClasses comes from the training script's parsed arguments
        self.classifier = nn.Linear(512 * 512, args.numClasses)

    def forward(self, x):
        # feature extraction from Conv5_3 with relu; 784 = 28 * 28, i.e. 448x448 inputs
        x = self.features(x).view(-1, 512, 784)
        # outer product of the features at each position, average-pooled over height*width
        x = torch.matmul(x, x.permute(0, 2, 1)).view(-1, 512 * 512) / 784.0
        # signed sqrt
        x = torch.mul(torch.sign(x), torch.sqrt(torch.abs(x) + 1e-12))
        # L2 normalization
        x = F.normalize(x, p=2, dim=1)
        # final FC layer
        x = self.classifier(x)
        return x
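As a sanity check on the pooling step, the sketch below compares the batched-matmul form with an explicit average of per-position outer products on a small random tensor. `bilinear_pool` and `bilinear_pool_reference` are hypothetical helper names, and the tiny 4-channel 3x3 input is only an illustration, not the real Conv5_3 feature map.

```python
import torch
import torch.nn.functional as F


def bilinear_pool(feat):
    """feat: [batch, C, H, W] -> [batch, C*C] bilinear descriptor."""
    b, c, h, w = feat.shape
    x = feat.reshape(b, c, h * w)
    # Sum of outer products over all positions, then average.
    x = torch.bmm(x, x.transpose(1, 2)) / (h * w)
    x = x.reshape(b, c * c)
    # Signed square root followed by L2 normalization.
    x = torch.sign(x) * torch.sqrt(torch.abs(x) + 1e-12)
    return F.normalize(x, p=2, dim=1)


def bilinear_pool_reference(feat):
    """Same descriptor, written as an explicit loop over spatial positions."""
    b, c, h, w = feat.shape
    pooled = torch.zeros(b, c, c)
    for i in range(h):
        for j in range(w):
            v = feat[:, :, i, j]                       # [b, c]
            pooled += v.unsqueeze(2) * v.unsqueeze(1)  # [b, c, c] outer product
    pooled = (pooled / (h * w)).reshape(b, c * c)
    pooled = torch.sign(pooled) * torch.sqrt(torch.abs(pooled) + 1e-12)
    return F.normalize(pooled, p=2, dim=1)


if __name__ == "__main__":
    torch.manual_seed(0)
    feat = torch.relu(torch.randn(2, 4, 3, 3))  # non-negative, like post-ReLU features
    out = bilinear_pool(feat)
    print(out.shape)
    print(torch.allclose(out, bilinear_pool_reference(feat), atol=1e-5))
```

If the two versions agree and each output row has unit L2 norm, the pooling arithmetic itself is probably not the source of the accuracy gap.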
I am sure there is nothing wrong with the rest of the code, because I only changed the network structure in a VGG16 fine-tuning script.
Can anyone who is familiar with both Bilinear-CNN and PyTorch help me?