In this lecture, we cover Sliding Windows Detection — the foundational algorithm for detecting objects anywhere in a full scene image using a trained Convolutional Neural Network.
You will learn:
— The difference between object localization and sliding windows detection
— How to train a ConvNet binary classifier on closely cropped images
— How the sliding window scans the full image region by region
— Why multiple window sizes are needed to detect objects at different scales
— The stride tradeoff: how a fine stride versus a coarse stride affects accuracy and speed
— Why the naive sliding windows approach is too computationally expensive with deep networks, since every window needs its own forward pass
— What comes next: the convolutional implementation that solves the speed problem
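
As a rough illustration of the scanning loop, window sizes, and stride tradeoff listed above, here is a minimal Python sketch. It is not the lecture's own code: the function names (`sliding_windows`, `detect`, `count_windows`), the square windows, the 0.5 threshold, and the mean-brightness stand-in for a trained ConvNet are all assumptions made for the example. A real detector would also resize each crop to the network's input size before classifying it.

```python
import numpy as np


def sliding_windows(image, window_size, stride):
    """Yield (row, col, crop) for every square window that fits inside the image."""
    height, width = image.shape[:2]
    for row in range(0, height - window_size + 1, stride):
        for col in range(0, width - window_size + 1, stride):
            yield row, col, image[row:row + window_size, col:col + window_size]


def detect(image, classifier, window_sizes, stride, threshold=0.5):
    """Run the classifier on every window at every size; keep high-scoring ones.

    `classifier` is a hypothetical callable mapping a crop to a score in [0, 1].
    Each window costs one classifier call, which is the expense of the naive method.
    """
    detections = []
    for size in window_sizes:
        for row, col, crop in sliding_windows(image, size, stride):
            score = classifier(crop)
            if score >= threshold:
                detections.append((row, col, size, score))
    return detections


def count_windows(height, width, window_size, stride):
    """Number of classifier calls needed for one window size."""
    rows = max(0, (height - window_size) // stride + 1)
    cols = max(0, (width - window_size) // stride + 1)
    return rows * cols


# Toy scene: a dark 100x100 image with one bright 20x20 "object".
scene = np.zeros((100, 100))
scene[40:60, 40:60] = 1.0


def toy_classifier(crop):
    # Stand-in for a trained ConvNet: the score is the crop's mean brightness.
    return float(crop.mean())


found = detect(scene, toy_classifier, window_sizes=[20, 40], stride=20)

# Stride tradeoff: a finer stride means many more windows to classify.
coarse = count_windows(100, 100, window_size=20, stride=20)  # 25 windows
fine = count_windows(100, 100, window_size=20, stride=4)     # 441 windows
```

In this toy scene only the 20-pixel window aligned with the bright patch scores above the threshold. The 40-pixel windows cover too much background, which hints at why several window sizes are tried in practice. The jump from 25 to 441 windows when the stride shrinks from 20 to 4 shows how quickly the classifier call count grows.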