What do we learn from region based object detectors (Faster R-CNN, R-FCN, FPN)?

Sliding-window detectors

Sliding-windows (From right to left, up and down)
Warp each image patch to a fixed-size image.
System flow for the sliding-window detector.
for window in windows:
    patch = get_patch(image, window)
    results = detector(patch)
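The loop above leaves `windows` and `get_patch` undefined. The following is a minimal sketch of what they could look like, assuming the image is a plain list of pixel rows and that windows have a single fixed size and stride. The helper `sliding_windows`, its parameters, and the left-to-right, top-to-bottom scan order are illustrative assumptions, not taken from the article. A real detector would also repeat the scan at several window sizes and aspect ratios.

```python
from typing import Iterator, List, Tuple

Window = Tuple[int, int, int, int]  # (x, y, width, height)


def sliding_windows(img_w: int, img_h: int, win_w: int, win_h: int,
                    stride: int) -> Iterator[Window]:
    """Yield every window position that fits fully inside the image."""
    for y in range(0, img_h - win_h + 1, stride):
        for x in range(0, img_w - win_w + 1, stride):
            yield (x, y, win_w, win_h)


def get_patch(image: List[List[int]], window: Window) -> List[List[int]]:
    """Crop the window from an image stored as a list of rows."""
    x, y, w, h = window
    return [row[x:x + w] for row in image[y:y + h]]


# Toy 4x6 "image": each pixel value encodes its (row, column) position.
image = [[10 * r + c for c in range(6)] for r in range(4)]
windows = list(sliding_windows(img_w=6, img_h=4, win_w=3, win_h=2, stride=2))
patches = [get_patch(image, w) for w in windows]
```

Each patch would then be warped to the classifier's input size and scored. Because the number of patches grows with every extra scale and stride, this gets expensive quickly, which is the motivation for region proposals.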

Selective Search

(Image source: van de Sande et al. ICCV’11)

R-CNN

Use region proposals, a CNN, and fully connected (affine) layers to locate objects.
System flow for R-CNN
ROIs = region_proposal(image)
for ROI in ROIs:
    patch = get_patch(image, ROI)
    results = detector(patch)

Boundary box regressor

Use regression to refine the original ROI in blue to the red one.
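As a rough illustration of that refinement, the sketch below applies four predicted offsets to an ROI using the center/size parameterization common in the R-CNN family: the center shifts in proportion to the box size, and the width and height scale exponentially. The function name `refine_box` and the delta values are made up for the example. In practice the deltas come from a trained regressor.

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]  # (center_x, center_y, width, height)


def refine_box(roi: Box, deltas: Tuple[float, float, float, float]) -> Box:
    """Apply predicted offsets (dx, dy, dw, dh) to an ROI."""
    px, py, pw, ph = roi
    dx, dy, dw, dh = deltas
    cx = px + pw * dx       # shift the center by a fraction of the width
    cy = py + ph * dy       # shift the center by a fraction of the height
    w = pw * math.exp(dw)   # scale the width (exp keeps it positive)
    h = ph * math.exp(dh)   # scale the height
    return (cx, cy, w, h)


# Hypothetical ROI and regressor output, for illustration only.
roi = (50.0, 40.0, 20.0, 10.0)
refined = refine_box(roi, (0.1, -0.2, 0.0, math.log(2.0)))
```

Here the center moves from (50, 40) to (52, 38), the width is unchanged, and the height doubles.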

Fast R-CNN

Apply region proposal on feature maps and form fixed size patches using ROI pooling.
feature_maps = process(image)
ROIs = region_proposal(image)
for ROI in ROIs:
    patch = roi_pooling(feature_maps, ROI)
    results = detector2(patch)
  • Top left: our feature maps.
  • Top right: we overlap the ROI (blue) with the feature maps.
  • Bottom left: we split ROIs into the target dimension. For example, with our 2×2 target, we split the ROIs into 4 sections with similar or equal sizes.
  • Bottom right: find the maximum for each section and the result is our warped feature maps.
Input feature map (top left), output feature map (bottom right), blue box is the ROI (top right).
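The four steps above can be sketched in plain Python. This is a simplified version that assumes a single-channel feature map, an ROI given in integer feature-map coordinates as (x0, y0, x1, y1) with exclusive end, and an ROI at least as large as the target size in each dimension. The function name `roi_max_pool` and the toy values are illustrative.

```python
from typing import List, Tuple


def roi_max_pool(feature_map: List[List[float]],
                 roi: Tuple[int, int, int, int],
                 out_h: int, out_w: int) -> List[List[float]]:
    """Max-pool the ROI of a 2-D feature map down to out_h x out_w."""
    x0, y0, x1, y1 = roi
    roi_h, roi_w = y1 - y0, x1 - x0
    pooled = []
    for i in range(out_h):
        # Integer section edges: sections differ by at most one cell.
        r0 = y0 + (i * roi_h) // out_h
        r1 = y0 + ((i + 1) * roi_h) // out_h
        row = []
        for j in range(out_w):
            c0 = x0 + (j * roi_w) // out_w
            c1 = x0 + ((j + 1) * roi_w) // out_w
            # Keep the maximum activation inside this section.
            row.append(max(feature_map[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled


feature_map = [
    [0.1, 0.9, 0.3, 0.2, 0.5],
    [0.4, 0.2, 0.8, 0.1, 0.7],
    [0.6, 0.5, 0.3, 0.9, 0.2],
    [0.3, 0.7, 0.1, 0.4, 0.8],
]
# A 3x3 ROI (columns 1..3, rows 0..2) pooled to the 2x2 target.
pooled = roi_max_pool(feature_map, roi=(1, 0, 4, 3), out_h=2, out_w=2)
# pooled == [[0.9, 0.3], [0.5, 0.9]]
```

Since 3 does not divide evenly by 2, the sections have sizes 1 and 2 along each axis, which is the "similar or equal sizes" case described above.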

Faster R-CNN

feature_maps = process(image)
ROIs = region_proposal(image)         # Expensive!
for ROI in ROIs:
    patch = roi_pooling(feature_maps, ROI)
    results = detector2(patch)
Network flow is the same as the Fast R-CNN.
The external region proposal is replaced by an internal deep network.

Performance for R-CNN methods

Region-based Fully Convolutional Networks (R-FCN)

# Faster R-CNN
feature_maps = process(image)
ROIs = region_proposal(feature_maps)
for ROI in ROIs:
    patch = roi_pooling(feature_maps, ROI)
    class_scores, box = detector(patch)         # Expensive!
    class_probabilities = softmax(class_scores)

# R-FCN
feature_maps = process(image)
ROIs = region_proposal(feature_maps)
score_maps = compute_score_map(feature_maps)
for ROI in ROIs:
    V = region_roi_pool(score_maps, ROI)
    class_scores, box = average(V)              # Much simpler!
    class_probabilities = softmax(class_scores)
Create a new feature map from the left to detect the top left corner of an object.
Generate 9 score maps
Apply ROI onto the feature maps to output a 3 x 3 array.
Overlay a portion of the ROI onto the corresponding score map to calculate V[i][j]
ROI pool
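To make the pooling step concrete, here is a minimal sketch of position-sensitive ROI pooling for a single class. It assumes k×k score maps stored as plain 2-D lists, an ROI in integer coordinates (x0, y0, x1, y1) with exclusive end, and an ROI of at least k cells per side. The names `position_sensitive_pool` and `vote` are illustrative. A full R-FCN keeps one such group of k×k maps per class, plus background, and a similar group for the box offsets.

```python
from typing import List, Tuple

Map2D = List[List[float]]


def position_sensitive_pool(score_maps: List[Map2D],
                            roi: Tuple[int, int, int, int],
                            k: int) -> Map2D:
    """Fill V[i][j] from section (i, j) of score map number i*k + j."""
    x0, y0, x1, y1 = roi
    roi_h, roi_w = y1 - y0, x1 - x0
    V = []
    for i in range(k):
        r0 = y0 + (i * roi_h) // k
        r1 = y0 + ((i + 1) * roi_h) // k
        row = []
        for j in range(k):
            c0 = x0 + (j * roi_w) // k
            c1 = x0 + ((j + 1) * roi_w) // k
            # Each section reads only from its own dedicated score map.
            score_map = score_maps[i * k + j]
            cells = [score_map[r][c]
                     for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        V.append(row)
    return V


def vote(V: Map2D) -> float:
    """Average all entries of V into one class score."""
    flat = [v for row in V for v in row]
    return sum(flat) / len(flat)


# Toy input: 9 score maps of size 6x6, map m filled with the value m,
# so the output shows which map each section was read from.
k = 3
score_maps = [[[float(m)] * 6 for _ in range(6)] for m in range(k * k)]
V = position_sensitive_pool(score_maps, roi=(0, 0, 6, 6), k=k)
class_score = vote(V)
```

With these toy maps, V[i][j] equals i*3 + j, confirming that the top-left section is taken from the first map and the bottom-right from the ninth. The per-ROI work is only cropping and averaging, with no learned layers, which is why this step is much cheaper than running a detector head on every ROI.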

Our journey so far

# Sliding-window detector
for window in windows:
    patch = get_patch(image, window)
    results = detector(patch)

# R-CNN
ROIs = region_proposal(image)
for ROI in ROIs:
    patch = get_patch(image, ROI)
    results = detector(patch)

Further reading on FPN, R-FCN and Mask R-CNN

Resources
