I assume you are getting the blue box rather than the red box. I think the red box is semantically richer. If you cannot get the red one, the model was most likely not trained to produce it. The red one also comes with its own challenges. If you want to detect individual lanes, the rectangular regions for different lanes may partially overlap, and it is not easy to tell whether an overlapping box is another lane or a duplicate prediction from a neighboring grid cell. It is hard to give you a definite answer without getting my hands dirty and checking whether a deep network can actually handle it.
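
To make the overlap problem concrete, here is a minimal sketch. It assumes axis-aligned boxes in `(x1, y1, x2, y2)` form and the usual intersection-over-union (IoU) test that non-maximum suppression relies on. The box coordinates and the 0.5 threshold are made up for illustration and do not come from your model.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp to zero when the boxes do not overlap
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical boxes: two genuinely different, neighboring lanes ...
lane_left = (100, 200, 300, 400)
lane_right = (150, 200, 350, 400)
# ... and a second guess at the left lane from an adjacent grid cell
duplicate_of_left = (110, 205, 310, 400)

IOU_THRESHOLD = 0.5  # assumed value, commonly used as a default

for name, box in [("lane_right", lane_right),
                  ("duplicate_of_left", duplicate_of_left)]:
    score = iou(lane_left, box)
    print(name, round(score, 3), "suppressed" if score > IOU_THRESHOLD else "kept")
```

With these numbers both the real neighboring lane and the duplicate exceed the threshold, so an IoU test alone would throw away a valid lane. That is the ambiguity I mean: the overlap by itself does not say which case you are in.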
