In Faster R-CNN, the boundary box is predicted relative to anchors, and we do put some constraints on the values of delta x and delta y. But YOLOv2 switched to the anchor box idea too, so the authors' claim is more of a theoretical benefit than a solid proof.
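
To make "predicted relative to anchors" concrete, here is a minimal sketch of the two box parameterizations as I understand them from the Faster R-CNN and YOLOv2 papers. The function names and tuple layouts are my own, chosen for illustration, and are not taken from either codebase.

```python
import math


def sigmoid(v):
    """Logistic function, maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def decode_faster_rcnn(t, anchor):
    """Anchor-relative decoding in the style of the Faster R-CNN paper.

    t      = (tx, ty, tw, th): raw network outputs (hypothetical layout)
    anchor = (xa, ya, wa, ha): anchor center and size
    """
    tx, ty, tw, th = t
    xa, ya, wa, ha = anchor
    x = xa + tx * wa          # center shift is scaled by the anchor size
    y = ya + ty * ha
    w = wa * math.exp(tw)     # size is a multiplicative change of the anchor
    h = ha * math.exp(th)
    return x, y, w, h


def decode_yolov2(t, cell, prior):
    """Grid-cell-relative decoding in the style of the YOLOv2 paper.

    t     = (tx, ty, tw, th): raw network outputs (hypothetical layout)
    cell  = (cx, cy): top-left corner of the grid cell, in grid units
    prior = (pw, ph): anchor (prior) width and height
    """
    tx, ty, tw, th = t
    cx, cy = cell
    pw, ph = prior
    bx = cx + sigmoid(tx)     # sigmoid keeps the center inside its grid cell
    by = cy + sigmoid(ty)
    bw = pw * math.exp(tw)
    bh = ph * math.exp(th)
    return bx, by, bw, bh
```

In this sketch, a large raw offset moves the first decoder's center far from its anchor, while the second decoder's center cannot leave its grid cell. Whether that difference matters much in practice is a separate question.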

But back to your real question: I don’t think we can find the real answer easily, since there are too many moving parts. I personally suspect one possible reason is that YOLO makes fewer predictions.

