Incorrect accuracy calculation with DistributedDataParallel - examples/imagenet/main.py #905

Open
@numpee

The accuracy() function in examples/imagenet/main.py divides the number of correct samples by the batch size. The top-1 accuracy is then tracked with an AverageMeter, and top1.avg is returned. This is incorrect: the value passed to the top1 AverageMeter is already a per-batch accuracy, and it is passed together with n=images.size(0). In effect, the number of correct samples is divided by the batch size twice.
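One way to make sure the division happens exactly once is to accumulate raw counts instead of per-batch ratios. The following is only a minimal sketch, not the code in main.py: `CorrectCounter` and its method names are hypothetical, and it assumes the number of correct predictions per batch is available as an integer.

```python
class CorrectCounter:
    """Hypothetical helper: accumulates raw counts and divides only when read."""

    def __init__(self):
        self.correct = 0  # correctly classified samples seen so far
        self.total = 0    # all samples seen so far

    def update(self, num_correct, batch_size):
        # Store counts, not ratios, so no per-batch division takes place here.
        self.correct += num_correct
        self.total += batch_size

    @property
    def accuracy(self):
        # Percentage over every sample seen so far; 0.0 before any update.
        return 100.0 * self.correct / self.total if self.total else 0.0


counter = CorrectCounter()
counter.update(num_correct=3, batch_size=4)
counter.update(num_correct=1, batch_size=2)
print(counter.accuracy)  # 4 correct out of 6 samples, as a percentage
```

Keeping counts also handles a smaller final batch without any extra weighting.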

Furthermore, when using DDP, the validate() function returns a different value on each process. The results should be combined across all GPUs before they are reported.
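To illustrate what combining across GPUs could look like, here is a rough sketch that simulates the per-process results as a plain Python list. The function name `combine_across_ranks` and the example counts are made up for illustration. In an actual DDP run, the two sums would presumably be computed with a collective such as `torch.distributed.all_reduce` using `ReduceOp.SUM` on a tensor holding the local counts, which is not shown here.

```python
def combine_across_ranks(per_rank_counts):
    """Combine (num_correct, num_samples) pairs, one pair per process.

    Hypothetical stand-in for a sum all-reduce: every process would end up
    with the same global totals and therefore the same accuracy.
    """
    total_correct = sum(correct for correct, _ in per_rank_counts)
    total_samples = sum(samples for _, samples in per_rank_counts)
    if total_samples == 0:
        return 0.0
    return 100.0 * total_correct / total_samples


# Illustrative counts for two processes, each seeing its own 50 samples.
per_rank = [(45, 50), (30, 50)]

# What each process would report on its own: the values disagree.
local = [100.0 * correct / samples for correct, samples in per_rank]
print(local)     # [90.0, 60.0]

# A single value shared by all processes after combining the counts.
combined = combine_across_ranks(per_rank)
print(combined)  # 75.0
```

Summing counts rather than averaging the per-process accuracies keeps the result correct even when processes see different numbers of samples.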
