Drastic difference in accuracy for varying batch sizes?

August 30, 2022

When training my CNN image classifier in PyTorch, I noticed a difference of nearly 20 percentage points in accuracy between a batch size of 4 and a batch size of 32. What might be causing such a drastic difference?

batch_size 4

100%|██████████| 10/10 [02:50<00:00, 17.04s/it, TestAcc=71%, TrainAcc=74%, loss=0.328]

batch_size 32

100%|██████████| 10/10 [02:38<00:00, 15.85s/it, TestAcc=53%, TrainAcc=57%, loss=0.208]

Model:

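import torch
import torch.nn as nn
import torch.nn.functional as F

# num_classes (the number of target classes) is defined elsewhere in the script.

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)   # 3 input channels, 6 output channels, 5x5 kernel
        self.pool = nn.MaxPool2d(2, 2)    # halves height and width
        self.conv2 = nn.Conv2d(6, 16, 5)
        # 16 feature maps of size 5x5 reach fc1 (e.g. for 32x32 inputs)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, num_classes)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = torch.flatten(x, 1)  # flatten all dimensions except batch
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)  # raw logits, one per class
        return x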

Solution:

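You can try adjusting your learning rate too, since the right value depends on the batch size.
If the parameter update is computed as the sum of the loss gradients of all samples in a batch, a larger batch size leads to a larger update step, so you should use a smaller learning rate.
If the loss is averaged over the batch instead (the default reduction='mean' of PyTorch losses such as nn.CrossEntropyLoss), the step size stays at a similar scale, but a batch size of 32 makes about 8 times fewer updates per epoch than a batch size of 4. In that case a larger learning rate or more epochs may be needed to reach the same accuracy.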
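
To see which case applies to your own training loop, you can compare gradient norms for both batch sizes. Below is a minimal sketch, not the asker's actual setup: the single linear layer, the random data, and the dataset size of 10,000 are all hypothetical stand-ins, and the helper names grad_norm and updates_per_epoch are made up for illustration. It only relies on the reduction argument of nn.CrossEntropyLoss.

```python
import math

import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the real network and data: one linear layer
# and 32 random samples with 10 features and 3 classes.
model = nn.Linear(10, 3)
inputs = torch.randn(32, 10)
targets = torch.randint(0, 3, (32,))


def grad_norm(batch_size, reduction):
    """Norm of the loss gradient over the first `batch_size` samples."""
    model.zero_grad()
    loss_fn = nn.CrossEntropyLoss(reduction=reduction)
    loss = loss_fn(model(inputs[:batch_size]), targets[:batch_size])
    loss.backward()
    grads = [p.grad.flatten() for p in model.parameters()]
    return torch.cat(grads).norm().item()


def updates_per_epoch(num_samples, batch_size):
    """Number of optimizer steps in one pass over the data."""
    return math.ceil(num_samples / batch_size)


# Assumed dataset size of 10,000 samples, purely for illustration.
for bs in (4, 32):
    print(
        f"batch_size={bs:2d}  "
        f"sum: {grad_norm(bs, 'sum'):.3f}  "
        f"mean: {grad_norm(bs, 'mean'):.3f}  "
        f"updates/epoch: {updates_per_epoch(10_000, bs)}"
    )
```

For a given batch, the summed gradient is exactly batch_size times the averaged one, so with reduction='sum' the step typically grows with the batch size, while with reduction='mean' it stays at a similar scale and only the number of updates per epoch drops.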