We extend the pictorial structure model in three respects. First, for models that contain only a single part, we develop three new methods, namely regularized subwindow search, nested window search, and twisted window search, to handle richer priors and more flexible shapes. Second, we introduce the notion of a weak pictorial structure, as opposed to a strong one, to characterize a loose geometric layout in a rotationally invariant way. Third, we develop nested models that encode topological inclusion relations between parts and can therefore represent richer patterns.
We show that all of the extended models can be matched to images efficiently by using dynamic programming and variants of the generalized distance transform, which computes the lower envelope of transformed cones on a dense image grid. This transform turns out to be important for a wide variety of computer vision tasks and often accelerates the computation at hand by an order of magnitude. We demonstrate improvements in quality, in speed, and sometimes in both, on object matching, saliency measurement, online and offline tracking, and object localization and recognition.
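
To make the phrase "lower envelope of cones" concrete, the following minimal sketch computes it in one dimension. It assumes plain, untransformed cones with an L1 (absolute-value) profile and uses the standard two-pass scan; the function names, the `slope` parameter, and the sample costs are illustrative assumptions, and the variants developed in this work operate on transformed cones over a two-dimensional image grid.

```python
def cone_lower_envelope_1d(f, slope=1.0):
    """Return d with d[p] = min over q of f[q] + slope * |p - q|.

    Each grid point q carries a cone with apex height f[q]; d is the
    lower envelope of all those cones. Illustrative sketch only:
    assumes untransformed L1 cones on a 1-D grid and slope >= 0.
    """
    d = list(f)
    n = len(d)
    # Forward pass: let every cone spread to the right.
    for p in range(1, n):
        d[p] = min(d[p], d[p - 1] + slope)
    # Backward pass: let every cone spread to the left.
    for p in range(n - 2, -1, -1):
        d[p] = min(d[p], d[p + 1] + slope)
    return d


def cone_lower_envelope_bruteforce(f, slope=1.0):
    """Quadratic-time reference that evaluates the definition directly."""
    n = len(f)
    return [min(f[q] + slope * abs(p - q) for q in range(n)) for p in range(n)]


if __name__ == "__main__":
    costs = [5, 0, 4, 9, 1]  # hypothetical per-pixel matching costs
    print(cone_lower_envelope_1d(costs))
    print(cone_lower_envelope_bruteforce(costs))
```

The two-pass version touches each grid point twice, whereas the direct evaluation compares every pair of points; this gap between linear and quadratic cost is the kind of saving that makes distance-transform-based matching attractive on dense grids.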