Grouping What is grouping? Why grouping? Pixels property of sensor, not world Reasoning at object level (might) make things easy: objects at consistent depth

objects can be recognized objects move as one "I stand at the window and see a house, trees, sky. Theoretically I might say there were 327 brightnesses and nuances of colour. Do I have "327"? No. I have sky, house, and trees." Max Wertheimer

Regions Boundaries Is grouping well-defined? Depends on purpose A

B C Object parts Background segmentation

Is grouping well-defined? A Perceptual organization forms a tree: Image B

A BG B L-bird R-bird

C grass bush far

beak body beak body

C eye D. Martin, C. Fowlkes, D. Tal, J. Malik. "A Database of Human Segmented Natural Images and its Application to Evaluating Segmentation Algorithms and Measuring Ecological Statistics", ICCV, 2001 head

eye head How do we group things? Gestalt principles Principle of proximity

https://courses.lumenlearning.com/wsu-sandbox/chapter/gestalt-principles-of-perception/ How do we group things? Gestalt principles Principle of similarity https://courses.lumenlearning.com/wsu-sandbox/chapter/gestalt-principles-of-perception/

How do we group things? Gestalt principles Principle of continuity and closure https://courses.lumenlearning.com/wsu-sandbox/chapter/gestalt-principles-of-perception/ How do we group things? Gestalt principles

Principle of common fate Gestalt principles in the context of images Principle of proximity: nearby pixels are part of the same object Principle of similarity: similar pixels are part of the same object

Look for differences in color, intensity, or texture across the boundary Principle of closure and continuity: contours are likely to continue High-level knowledge? Regions

Boundaries Designing a good boundary detector Differences in color, intensity, or texture across the boundary Continuity and closure

High-level knowledge Criteria for a good boundary detector Criteria for a good boundary detector: Good detection: Fire only on real edges, not anywhere else Good localization the edges detected must be as close as possible to the true edges

the detector must return one point only for each true edge point Source: L. Fei-Fei Canny edge detector The classic edge detector Baseline for all later work on grouping

Theoretical model: step-edges corrupted by additive Gaussian noise J. Canny, A Computational Approach To Edge Detection, IEEE Trans. Pattern Analysis and Machine Intelligence, 8:679-714, 1986. 22,000 citations! Source: L. Fei-Fei

Image gradient The gradient of an image: The gradient points in the direction of most rapid increase in intensity The edge strength is given by the gradient magnitude: The gradient direction is given by:

how does this relate to the direction of the edge? Source: Steve Seitz Image gradient Source: L. Lazebnik With a little Gaussian noise

Gradient Source: D. Hoiem Effects of noise Noisy input image

Where is the edge? Source: S. Seitz Effects of noise Noise is high frequency Differentiation accentuates noise

Solution: smooth first f h f*h To find edges, look for peaks in

Source: S. Seitz Associative property of convolution Differentiation is a convolution Convolution is associative: This saves us one operation:

f Source: S. Seitz 2D edge detection filters Gaussian

derivative of Gaussian (x) Derivative of Gaussian filter x-direction y-direction

Example original image Compute Gradients (DoG) X-Derivative of Gaussian

Y-Derivative of Gaussian Image gradient The gradient of an image: The gradient points in the direction of most rapid increase in intensity The edge strength is given by the gradient magnitude:

The gradient direction is given by: how does this relate to the direction of the edge? Source: Steve Seitz Gradient magnitude and orientation Orientation is undefined at pixels with 0 gradient

Magnitude Orientation theta = numpy.arctan2(gy, gx) Non-maximum suppression for each orientation At q, we have a

maximum if the value is larger than those at both p and at r. Interpolate to get these values. Source: D. Forsyth

Before Non-max Suppression After Non-max Suppression Hysteresis thresholding Threshold at low/high levels to get weak/strong edge pixels Do connected components, starting from strong edge pixels

Final Canny Edges Canny edge detector 1. Filter image with x, y derivatives of Gaussian 2. Find magnitude and orientation of gradient 3. Non-maximum suppression: Thin multi-pixel wide ridges down to single pixel width

4. Thresholding and linking (hysteresis): Define two thresholds: low and high Use the high threshold to start edge curves and the low threshold to continue them Source: D. Lowe, L. Fei-Fei

Does Canny always work? The challenges of edge detection Texture Low-contrast boundaries Regions

Boundaries Grouping by clustering G R

B Grouping by clustering Idea: embed pixels into highdimensional space (e.g. 3dimensions) Each pixel is a

point Group together points K-means Assumption: each group is a Gaussian with different means and same standard deviation

Suppose we know all . Which group should a point belong to? The j with highest = The j with smallest K-means Assumption: each group = a Gaussian with different means and same standard deviation

If means are known, then trivial assignment to groups. How? Assign data point to nearest mean! K-means Problem: means are not known What if we know a set of points from each cluster?

belong to cluster k What should be ? K-means Problem: means are not known If assignment of points to clusters is known, then finding means is easy How? Compute the mean of every cluster!

K-means Given means, can assign points to clusters Given assignments, can compute means Idea: iterate! K-means Step-1 : randomly pick k centers

K-means Step 2: Assign each point to nearest center K-means Step 3: re-estimate centers K-means

Step 4: Repeat K-means - another example K-means Input: set of data points, k 1. Randomly pick k points as means

2. For i in [0, maxiters]: 1. Assign each point to nearest center 2. Re-estimate each center as mean of points assigned to it K-means - the math Input: set of data points , k 1. Randomly pick k points as means

2. For iteration in [0, maxiters]: 1. Assign each point to nearest center 2. Re-estimate each center as mean of points assigned to it K-means - the math An objective function that must be minimized:

Every iteration of k-means takes a downward step: Fixes and sets to minimize objective Fixes and sets to minimize objective K-means on image pixels K-means on image pixels

Picture courtesy David Forsyth One of the clusters from kmeans K-means on image pixels What is wrong?

Pixel position Nearby pixels are likely to belong to the same object Far-away pixels are likely to belong to different objects How do we incorporate pixel position?

Instead of representing each pixel as (r,g,b) Represent each pixel as (r,g,b,x,y) K-means on image pixels The issues with k-means Captures pixel similarity

but Doesnt capture continuity Captures proximity only weakly Can merge far away objects together Requires knowledge of k!

Oversegmentation and superpixels We dont know k. What is a safe choice? Idea: Use large k Can potentially break big objects, but will hopefully not merge unrelated objects Later processing can decide which groups to merge

Called superpixels