Image Filtering and Edge Detection
ECE 847: Digital Image Processing
Stan Birchfield, Clemson University
S. Birchfield, Clemson Univ., ECE 847, http://www.ces.clemson.edu/~stb/ece847

Motivation
Two closely related problems:
  blur (to remove noise)
  differentiate (to highlight details)
Underlying math is the same!

Recall: Types of image transformations
  Graylevel transforms: I(x,y) -> f( I(x,y) )
    (arithmetic, logical, thresholding, histogram equalization, ...)
  Geometric transforms: I(x,y) -> I( f(x,y) )
    (flip, flop, rotate, scale, ...)
  Area-based transforms (filtering): I(x,y) -> f( I(x,y), I(x+1,y+1), ... )
    (morphological operators, convolution)
  Global transforms: I(x,y) -> f( I(x,y) for all x,y )
    (Fourier transform, wavelet transform)

Outline
  Convolution
  Gaussian convolution kernels
  Smoothing an image
  Differentiating an image
  Canny edge detection
  Laplacian of Gaussian

Linear time-invariant (LTI) systems
System produces output y(t) from input x(t).
Two properties of linear systems:
  1. homogeneity (or scaling): scaling of input propagates to output
  2. superposition (or additivity): sum of two inputs yields sum of two outputs
Time-invariant (or shift-invariant): output does not depend upon absolute time (shift) of input
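
The two linearity properties can be checked numerically on random inputs. The sketch below is only an illustration: the helper `is_linear`, the four test systems, and the periodic border handling through `np.roll` are assumptions made for this example, not part of the course material.

```python
import numpy as np

def is_linear(system, n=64, trials=20, seed=0):
    """Test homogeneity and superposition of `system` on random inputs."""
    rng = np.random.default_rng(seed)
    a = 2.0  # fixed scale factor for the homogeneity test
    for _ in range(trials):
        x1, x2 = rng.normal(size=n), rng.normal(size=n)
        if not np.allclose(system(a * x1), a * system(x1)):
            return False  # homogeneity violated
        if not np.allclose(system(x1 + x2), system(x1) + system(x2)):
            return False  # superposition violated
    return True

# Hypothetical test systems (signals are treated as periodic by np.roll)
gain = lambda x: 5 * x
three_tap = lambda x: np.roll(x, 1) + 2 * x + np.roll(x, -1)
offset = lambda x: 3 * x + 1      # gain plus a constant offset
cosine = lambda x: np.cos(x)

results = {name: is_linear(fn) for name, fn in
           [("gain", gain), ("three_tap", three_tap),
            ("offset", offset), ("cosine", cosine)]}
```

A passing test on random inputs is evidence, not proof, of linearity; a single failure is enough to show that a system is not linear.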

System examples
  Linear time invariant:
    y(t) = 5x(t)
    y(t) = x(t-1) + 2x(t) + x(t+1)
  Linear time varying:
    y(t) = t x(t)
  Nonlinear:
    y(t) = cos( x(t) )

Question
Is this system linear? y(t) = m x(t) + b
No, not if b ≠ 0, because scaling the input does not scale the output:
  m (a x(t)) + b = a m x(t) + b ≠ a y(t)
Technically, this is called an affine system.
Ironic that a linear equation does not describe a linear system.

LTI systems are described by convolution
  Continuous: $(f \circledast g)(x) = \int_{-\infty}^{\infty} f(x - \tau)\, g(\tau)\, d\tau$
  Discrete: $(f \circledast g)(x) = \sum_{i=-\infty}^{\infty} f(x - i)\, g(i)$
(Notation is usually *, but we want to avoid confusion with multiplication)
Note: Convolution is commutative

Relationship to cross-correlation
  Continuous: $\int_{-\infty}^{\infty} f^{*}(x + \tau)\, g(\tau)\, d\tau$ (complex conjugate, no flip)
  Discrete: $\sum_{i=-\infty}^{\infty} f^{*}(x + i)\, g(i)$
Note: Convolution and cross-correlation are identical when the signal is real and the kernel is symmetric

Convolution with discrete, finite duration signals
  width and half-width of kernel: $w$ and $\tilde{w} = (w-1)/2$ (if w is odd)
  convolution assumes the kernel origin is at its center (if w is odd)
  examples on the slide underline the origin sample of each kernel

Convolution implementation (1D)
In memory, indices must be non-negative
So shift kernel by $\tilde{w}$: $f'(x) = \sum_{i=0}^{w-1} f(x + \tilde{w} - i)\, g(i)$
Algorithm:
  Flip g (not needed in practice)
  For each x:
    Align g so that center is at x
    Sum the products

Kernel flipping
In practice, no need to flip g
  If g is symmetric, then same result: [ 1 2 1 ]
  If g is antisymmetric, then just sign change: [ -1 0 1 ]
For real images and symmetric kernels, convolution = cross-correlation

Convolution implementation (1D) (cont.)
Note: Convolution cannot be done in place
Output must be separate from input, to avoid corrupting computation
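
The 1D algorithm above can be sketched as follows. This is a minimal illustration, not the course's reference code: the function name `convolve1d`, the zero treatment of samples outside the signal, and the cropping of the output to the input length are assumptions chosen for the example.

```python
import numpy as np

def convolve1d(f, g):
    """Convolve signal f with an odd-length kernel g.

    Samples outside the signal are treated as zero, and the output
    has the same length as the input.
    """
    f = np.asarray(f, dtype=float)
    g = np.asarray(g, dtype=float)
    w = len(g)
    if w % 2 != 1:
        raise ValueError("kernel width must be odd")
    half = w // 2                      # half-width of the kernel
    out = np.zeros_like(f)             # separate output buffer: never in place
    for x in range(len(f)):
        total = 0.0
        for i in range(w):
            j = x + half - i           # index shift that implements the flip
            if 0 <= j < len(f):
                total += f[j] * g[i]
        out[x] = total
    return out

averaged = convolve1d([1, 5, 6, 7], [1 / 3, 1 / 3, 1 / 3])
```

Writing into a separate `out` array matters: if results were stored back into `f`, later sums would read values that had already been overwritten.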

1D convolution example
Given a 1D signal, suppose we want to compute the average of each pixel and its 2 neighbors
Solution: slide kernel across signal, elementwise multiplication and sum
This is known as shift-multiply-add

Signal borders
Zero extension: [1 5 6 7] * [1/3 1/3 1/3] is the same as
  [ ... 0 1 5 6 7 0 ... ] * [ ... 0 1 1 1 0 ... ]/3
Result: [ ... 0 0 0 0.33 2 4 6 4.33 2.33 0 0 0 ... ]
If signal has length n and kernel has length w, then result has length n+w-1
But we adopt convention of cropping output to same size as input
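
The effect of the border convention can be seen by padding the signal explicitly before convolving. The sketch below uses NumPy's `np.pad` and `np.convolve`; the wrapper `convolve_same` and its `border` option names are hypothetical, introduced only for this illustration.

```python
import numpy as np

def convolve_same(f, g, border="zero"):
    """Convolve f with odd-length g; output is cropped to len(f)."""
    pad_mode = {"zero": "constant",      # extend with zeros
                "replicate": "edge",     # repeat the border sample
                "reflect": "symmetric"}[border]  # mirror about the border
    half = len(g) // 2
    padded = np.pad(np.asarray(f, dtype=float), half, mode=pad_mode)
    return np.convolve(padded, g, mode="valid")

signal = [1, 5, 6, 7]
box = np.ones(3) / 3

full = np.convolve(signal, box, mode="full")      # length n + w - 1 = 6
zero_ext = convolve_same(signal, box, "zero")     # cropped to length n = 4
replicated = convolve_same(signal, box, "replicate")
```

Only the samples within a half-width of the border differ between the extension modes; interior samples are identical.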

1D convolution example (border extended by replication / reflection)
  image:  [ 8 24 48 32 16 ], extended to [ 8 | 8 24 48 32 16 | 16 ]
  kernel: (1/4) [ 1 2 1 ] (flipped)
  (1/4)(8) + (1/2)(8) + (1/4)(24) = 12
  (1/4)(8) + (1/2)(24) + (1/4)(48) = 26
  (1/4)(24) + (1/2)(48) + (1/4)(32) = 38
  (1/4)(48) + (1/2)(32) + (1/4)(16) = 32
  (1/4)(32) + (1/2)(16) + (1/4)(16) = 20
  result: [ 12 26 38 32 20 ]

Convolution in 2D

  $f'(x,y) = \sum_{i}\sum_{j} f(x + \tilde{w}_x - i,\; y + \tilde{w}_y - j)\, g(i,j)$
  f is the image; g is the convolution kernel (shifted by its half-widths $\tilde{w}_x$ and $\tilde{w}_y$)
  kernel is a little image containing weights
Algorithm:
  Flip g (horizontally and vertically; not important in practice)
  For each (x,y):
    Align g so that center is at (x,y)
    Sum the products

Convolution implementation (2D)
Again, note that convolution cannot be done in place

Convolution as matrix multiplication
Equivalent to multiplying by Toeplitz matrix (ignoring border effects; the slide's example extends by replication)
Applicable to any dimension

Convolution as Fourier multiplication
Convolution in spatial domain is multiplication in frequency domain
Computationally efficient only when kernel is large
Equivalent to circular convolution (signals are treated as periodic)

Gaussian convolution kernels

Two types of convolution kernels
  Smoothing: lowpass filter (Gaussian)
    Smoothing a constant function should not change it, so the kernel sums to 1
    Example: (1/3) * [1 1 1]
  Differentiating: highpass filter (derivative of Gaussian)
    Differentiating a constant function should return 0, so the kernel sums to 0
    Example: (1/2) * [-1 0 1]
  also bandpass filters (Laplacian of Gaussian)

Box filter
Simplest smoothing kernel is the box filter: n samples, each equal to 1/n, and 0 elsewhere (shown for n=1, n=3, n=5)
Odd length avoids undesirable shifting of signal

Gaussian kernel
Gaussian function is bell curve:
  $G(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$
  $\mu$: mean (center); $\sigma$: standard deviation (width); $\sigma^2$: variance
Performs weighted average of neighboring values
Normalization factor ensures PDF: $\int_{-\infty}^{\infty} G(x)\, dx = 1$

Gaussian
Why is Gaussian so important?
  completely described by 1st and 2nd order statistics
  central limit theorem
  localized in space and frequency
  convolution of two Gaussians is Gaussian: $\mu = \mu_1 + \mu_2$ and $\sigma^2 = \sigma_1^2 + \sigma_2^2$
  separable (efficient)
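
The claim that convolving two Gaussians gives a Gaussian whose variance is the sum of the two variances can be checked numerically. This is a rough sketch under stated assumptions: the helper names `sampled_gaussian` and `kernel_variance`, the chosen sigmas, and the generous half-widths (six sigmas) are all picked for illustration.

```python
import numpy as np

def sampled_gaussian(sigma, half_width):
    """Sample a zero-mean Gaussian at integer positions and normalize to sum 1."""
    x = np.arange(-half_width, half_width + 1)
    g = np.exp(-x**2 / (2.0 * sigma**2))
    return g / g.sum()

def kernel_variance(g):
    """Variance of a kernel, using its values as weights on the sample index."""
    i = np.arange(len(g))
    mean = np.sum(i * g) / np.sum(g)
    return np.sum((i - mean) ** 2 * g) / np.sum(g)

g1 = sampled_gaussian(2.0, 12)            # sigma^2 = 4
g2 = sampled_gaussian(3.0, 18)            # sigma^2 = 9
g12 = np.convolve(g1, g2)                 # 61 samples, half-width 30

var_sum = kernel_variance(g1) + kernel_variance(g2)
var_conv = kernel_variance(g12)           # should be close to 4 + 9 = 13
direct = sampled_gaussian(np.sqrt(13.0), 30)
```

The variances of normalized kernels add exactly under convolution; the match to a directly sampled Gaussian (`direct`) is approximate and degrades for very small sigmas.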

Repeated averaging
Repeated convolution with box filter yields convolution with Gaussian (from Central Limit Theorem):
  (1/2) [1 1] * (1/2) [1 1] = (1/4) [1 2 1]
  (1/4) [1 2 1] * (1/4) [1 2 1] = (1/16) [1 4 6 4 1]
These are the odd rows of the binomial (Pascal's) triangle:
  the row with 2k+1 entries approximates Gaussian with $\sigma^2 = k/2$
  k=1: $\sigma^2 = 0.5$, $\sigma = 0.7$
  k=2: $\sigma^2 = 1$, $\sigma = 1$
Note that scale factor is power of two (makes normalization fast because division is just bit shift)

Repeated averaging (cont.)
Trinomial triangle also yields Gaussian approximations:
  kth row approximates Gaussian with $\sigma^2 = 2k/3$
  Example: (1/3) [1 1 1] * (1/3) [1 1 1] = (1/9) [1 2 3 2 1], with $\sigma^2 = 4/3$
In general, n repeated convolutions with a $\sigma^2$ Gaussian approximate a single convolution with an $n\sigma^2$ Gaussian

Computing the variance of a smoothing kernel
Set of values: $\{x_1, \ldots, x_n\}$
Mean is $\mu = \frac{1}{n}\sum_{i=1}^{n} x_i$
Variance is deviation from mean: $\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \mu)^2$

Computing the variance of a smoothing kernel (cont.)
Kernel is a sequence of values: $g(0), g(1), \ldots, g(w-1)$
Is $\frac{1}{w}\sum_i g(i)$ the mean? Is $\frac{1}{w}\sum_i (g(i) - \mu)^2$ the variance?
No, because a sequence is not a set (order matters)
  Mean: $\mu = \frac{\sum_i i\, g(i)}{\sum_i g(i)}$
  Variance: $\sigma^2 = \frac{\sum_i (i - \mu)^2\, g(i)}{\sum_i g(i)}$
Note: These are weighted averages, with weights g(i)

Example
Kernel: g[0] g[1] g[2] = (1/4) [1 2 1]
  Mean: $\mu = (0 \cdot 1 + 1 \cdot 2 + 2 \cdot 1)/4 = 1$
  Variance: $\sigma^2 = ((0-1)^2 \cdot 1 + (1-1)^2 \cdot 2 + (2-1)^2 \cdot 1)/4 = 0.5$

Building a Gaussian kernel
How to build Gaussian kernel with arbitrary $\sigma$?
  Sample continuous Gaussian: $g(i) = \exp\left(-\frac{(i - \tilde{w})^2}{2\sigma^2}\right)$, $i = 0, \ldots, w-1$
  Half-width $\tilde{w} \approx f\sigma$: f=2.5 is reasonable, to capture $\pm 2.5\sigma$
  Continuous normalization $1/\sqrt{2\pi\sigma^2}$ holds only approximately for the discrete version
  But we want to ensure discrete normalization: divide by $\sum_i g(i)$ so that the kernel sums to 1
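
The construction above can be sketched in a few lines. The function name `gaussian_kernel` and the rule used to turn 2.5 sigma into an integer half-width (`ceil(f * sigma - 0.5)`) are assumptions for this example; the slides only state that f = 2.5 is reasonable and that the discrete kernel should sum to 1.

```python
import math
import numpy as np

def gaussian_kernel(sigma, f=2.5):
    """Sample a Gaussian out to about f*sigma and normalize the samples to sum 1."""
    # Assumed rounding rule: each sample covers one pixel centered on an integer
    half = max(1, math.ceil(f * sigma - 0.5))
    x = np.arange(-half, half + 1)
    g = np.exp(-x**2 / (2.0 * sigma**2))    # unnormalized samples, peak value 1
    return g / g.sum()                      # discrete normalization

kernel = gaussian_kernel(1.0)               # 5 taps for sigma = 1
```

Dividing by the discrete sum, rather than by the continuous normalization factor, guarantees that smoothing a constant image leaves it unchanged.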

Building a Gaussian kernel (cont.)
Choosing number of samples: kernel width = 2*(halfwidth)+1

Common Gaussian kernels (the subscript is $\sigma^2$)
  $g_{0.5}$ = (1/4) [1 2 1]
  $g_{1.0}$ = (1/16) [1 4 6 4 1]

Sampling effects
Resulting discrete function will not have the same $\sigma$ as original continuous function
Example: Sample $\sigma^2 = 1$ ($\sigma = 1$):
  (0.4026) * [0.1353 0.6065 1.0000 0.6065 0.1353]
  Resulting discrete kernel has $\sigma^2$ = (0.4026) * 2 * [(0.1353)*($2^2$) + (0.6065)*($1^2$)] = 0.92
Another example: Sample $\sigma = 1.08$ ($\sigma^2 = 1.17$):
  (0.3755) * [0.1800 0.6514 1.0000 0.6514 0.1800] $\approx$ (1/16) [ 1 4 6 4 1 ]
  Resulting discrete kernel has $\sigma^2 \approx 1$

(Some say) three samples is not enough to capture Gaussian
  Spatial domain: capture 98.76% of the area with $\pm 2.5\sigma$ (continuous), so $w \geq 5\sigma$
    (kernel width = 2*(halfwidth)+1)
  Frequency domain: capture 98.76% of the area with $2.5/\sigma \leq \pi$, so $\sigma \geq 0.8$
    (Note: sampling at 1 pixel intervals means the cutoff frequency is $2\pi(0.5) = \pi$)
  Combining these yields $w \geq 4$, so three samples are too few [Trucco and Verri, p. 60]
But in practice three samples is common

What is the best 3x1 Gaussian kernel?
  Spatial domain: capture some percentage of the area with $w = 2f\sigma = 3$
    (kernel width = 2*(halfwidth)+1)
  Frequency domain: capture the same percentage with $2f/\sigma = 2\pi$
  Combining these yields $\sigma = \sqrt{3/(2\pi)} = 0.69$ and $f = 2.17$
  97% of Gaussian is captured (not bad!)

Binomial triangle Gaussians are too wide
Recall that the binomial (Pascal's) triangle is an easy way to construct a Gaussian of width (2k+1) and variance $\sigma^2 = k/2$
Recall that to capture 98.76% of the area, we want $w \approx 5\sigma$
For $\sigma = 1.0$, width is perfect, but for larger $\sigma$ the kernel is too wide
What are the implications for repeated averaging?
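
The width mismatch can be made concrete by building binomial kernels through repeated averaging and comparing their width with five standard deviations. The helper names `binomial_kernel` and `kernel_variance` are made up for this sketch.

```python
import numpy as np

def binomial_kernel(k):
    """Kernel with 2k+1 taps, built by averaging with [1 1]/2 a total of 2k times."""
    g = np.array([1.0])
    for _ in range(2 * k):
        g = np.convolve(g, [0.5, 0.5])
    return g

def kernel_variance(g):
    """Variance of a kernel, using its values as weights on the sample index."""
    i = np.arange(len(g))
    mean = np.sum(i * g) / np.sum(g)
    return np.sum((i - mean) ** 2 * g) / np.sum(g)

# For each k: (kernel width, variance, width suggested by 5 sigma)
table = {}
for k in (1, 2, 8):
    g = binomial_kernel(k)
    var = kernel_variance(g)
    table[k] = (len(g), var, 5.0 * np.sqrt(var))
```

For k = 8 the kernel has 17 taps although 5 sigma is only 10, so the outer taps carry almost no weight.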

Smoothing an image

Separability
Can always construct 2D kernel by convolving 1D kernels (a vertical 1D kernel convolved with a horizontal 1D kernel)
Some 2D kernels can be decomposed into 1D kernels
A 2D kernel is separable iff it has rank 1, that is, all rows (and all columns) are scalar multiples of one another

Separability (cont.)
Separable convolution is less expensive:
  one 2D kernel: O(n^2) operations per pixel
  two 1D kernels: O(2n) operations per pixel
Allowed because convolution is associative: $f \circledast (g_v \circledast g_h) = (f \circledast g_v) \circledast g_h$
Convolve with two 1D kernels instead of one 2D kernel

Separability of Gaussian
2D Gaussian (isotropic): $G(x,y) = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{x^2+y^2}{2\sigma^2}\right) = G(x)\, G(y)$
Convolution with 2D Gaussian is same as convolution with two 1D Gaussians (horizontal and vertical)

Separable convolution
  input -> horizontal -> temporary -> vertical -> output (or vice versa)
Remember: Do not try to perform convolution in place!

Smoothing with a Gaussian
(figure from http://www-static.cc.gatech.edu/classes/AY2007/cs4495_fall/html/materials.html)

Effects of different sigmas
(figure from http://www-static.cc.gatech.edu/classes/AY2007/cs4495_fall/html/materials.html)

Gaussian pyramid
Repeat: smooth, downsample
Shannon's sampling theorem: After smoothing, many pixels are redundant. Therefore, we can discard them (by downsampling) without losing information
(figure from http://www-static.cc.gatech.edu/classes/AY2007/cs4495_fall/html/materials.html)

Other linear filters
Finite impulse response (FIR): $y(n) = \sum_{k=0}^{M} b_k\, x(n-k)$ (This is just convolution)
Infinite impulse response (IIR): $y(n) = \sum_{k=0}^{M} b_k\, x(n-k) + \sum_{k=1}^{N} a_k\, y(n-k)$
Advantages of IIR:
  Recursive computation can be performed in place (unlike convolution)
  Can be faster for large $\sigma$

Nonlinear filters
Median filter:
  Replace pixel with median of surrounding n x n region
  Good for impulse noise (salt-and-pepper noise)
Figure panels: image, impulse noise, mean filtering, median filtering
What is mean filter good for? additive Gaussian noise
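
A small experiment shows why the median handles impulse noise better than the mean. This is a minimal sketch: `median_filter`, `mean_filter`, and the border replication are assumptions for the example, and `sliding_window_view` requires NumPy 1.20 or later.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def _windows(image, n):
    """All n x n neighborhoods of the image, with replicated borders."""
    half = n // 2
    padded = np.pad(np.asarray(image, dtype=float), half, mode="edge")
    return sliding_window_view(padded, (n, n))   # shape (H, W, n, n)

def median_filter(image, n=3):
    return np.median(_windows(image, n), axis=(-2, -1))

def mean_filter(image, n=3):
    return np.mean(_windows(image, n), axis=(-2, -1))

# Constant image corrupted by a single impulse ("salt") pixel
noisy = np.full((5, 5), 10.0)
noisy[2, 2] = 255.0

med = median_filter(noisy)    # impulse removed completely
avg = mean_filter(noisy)      # impulse smeared over its 3 x 3 neighborhood
```

The median discards the outlier because it never reaches the middle of the sorted neighborhood, while the mean spreads its energy over nine pixels.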

Grayscale morphology
Grayscale erosion
  Similar to convolution, but replace sum with min and replace multiplication with subtraction
  Reduces bright details
Grayscale dilation
  Similar to convolution, but replace sum with max and replace multiplication with addition
  Reduces dark details
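
The analogy with convolution can be sketched in 1D. The function names, the replicated borders, and the flat (all-zero) structuring element in the usage lines are assumptions for this illustration.

```python
import numpy as np

def gray_erode(f, b):
    """Erosion: min over the window of f(x + s) - b(s)."""
    b = np.asarray(b, dtype=float)
    half = len(b) // 2
    padded = np.pad(np.asarray(f, dtype=float), half, mode="edge")
    return np.array([np.min(padded[x:x + len(b)] - b) for x in range(len(f))])

def gray_dilate(f, b):
    """Dilation: max over the window of f(x - s) + b(s)."""
    b = np.asarray(b, dtype=float)
    half = len(b) // 2
    padded = np.pad(np.asarray(f, dtype=float), half, mode="edge")
    # Reversing b implements the flip, exactly as in convolution
    return np.array([np.max(padded[x:x + len(b)] + b[::-1]) for x in range(len(f))])

flat = np.zeros(3)                                # flat structuring element
bright_spike = np.array([1, 1, 9, 1, 1])
dark_pit = np.array([5, 5, 0, 5, 5])

eroded = gray_erode(bright_spike, flat)           # bright detail removed
dilated = gray_dilate(dark_pit, flat)             # dark detail removed
```

With a flat structuring element, erosion and dilation reduce to a sliding minimum and a sliding maximum.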

Differentiating an image

Edge detection
What is an edge?
  No precise definition (like many problems in computer vision)
  Basically a place where the intensity changes significantly
Four types: step edge, line edge, roof edge, ramp edge
We will concentrate on step edges (the most common)

The importance of intensity edges
Much information is retained by the edges
(figures from Walther et al., Simple line drawings suffice for functional MRI decoding of natural scene categories, PNAS 2011)

Finite differences
Simplest differentiating kernel is finite differences:
  Forward difference: $f'(x) \approx f(x+1) - f(x)$
  Backward difference: $f'(x) \approx f(x) - f(x-1)$
(Don't forget to flip kernel before convolving)

Central differences
To avoid undesirable shift, use central difference kernel:
  Central difference: $f'(x) \approx \frac{1}{2}\left(f(x+1) - f(x-1)\right)$
Note: Result has the same basic shape as with forward / backward differences
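
The three difference kernels can be compared on a small signal. This sketch relies on `np.convolve`, which flips the kernel internally; the test signal and the zero extension at the borders are arbitrary choices for the example.

```python
import numpy as np

signal = np.array([0, 0, 0, 1, 3, 6, 6, 6], dtype=float)

# Full convolution with [1, -1] gives f(k) - f(k-1) at output index k
diff = np.convolve(signal, [1, -1], mode="full")
forward = diff[1:]      # f(x+1) - f(x), zero extension past the last sample
backward = diff[:-1]    # f(x) - f(x-1), zero extension before the first sample

# Convolution kernel (1/2)[1 0 -1] gives (f(x+1) - f(x-1)) / 2
central = np.convolve(signal, [0.5, 0.0, -0.5], mode="same")
```

The central difference is exactly the average of the forward and backward differences, which is why it is not shifted by half a sample.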

What is central difference doing?
Central difference seems strange b/c it completely ignores the pixel itself: changing f(x) gives the same result!
But it averages the two slopes:
  between x-1 and x (backward diff)
  between x and x+1 (forward diff)

Smoothed differentiation
Central difference kernel is a smoothed version of forward / backward difference:
  (1/2) [1 1] * [1 -1] = (1/2) [1 0 -1]
Smoothing, then differentiating, is the same as convolving with a smoothed differentiation kernel:
  $(f \circledast g_{\mathrm{smooth}}) \circledast g_{\mathrm{diff}} = f \circledast (g_{\mathrm{smooth}} \circledast g_{\mathrm{diff}})$ (b/c convolution is associative)

Derivative of Gaussian
Natural extension is to convolve with the derivative of a Gaussian:
  $G'(x) = -\frac{x}{\sigma^2}\, G(x)$ (one positive lobe and one negative lobe)
Equivalent view: smooth with a Gaussian, then differentiate

Building a Gaussian derivative kernel
How to build Gaussian derivative kernel with arbitrary $\sigma$?
  Sample continuous Gaussian derivative, which is proportional to $-x \exp\left(-\frac{x^2}{2\sigma^2}\right)$
  Normalize so that convolution with ramp yields slope of ramp

Building a Gaussian derivative kernel (cont.)
(the implementation shown on the slide uses C's compound assignment operators, -= and /=)
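
The ramp normalization can be written out as follows. This is a sketch under assumptions: the function name `gaussian_deriv_kernel` and the half-width rule `ceil(f * sigma - 0.5)` are hypothetical, and the kernel is stored in convolution order (it is flipped by `np.convolve`).

```python
import math
import numpy as np

def gaussian_deriv_kernel(sigma, f=2.5):
    """Sampled derivative of Gaussian, scaled so a unit-slope ramp gives 1."""
    half = max(1, math.ceil(f * sigma - 0.5))      # assumed rounding rule
    x = np.arange(-half, half + 1, dtype=float)
    g = -x * np.exp(-x**2 / (2.0 * sigma**2))      # derivative shape, up to scale
    # Convolving g with the ramp f(x) = x gives -sum(x * g), so force that to 1
    g /= -np.sum(x * g)
    return g

small = gaussian_deriv_kernel(0.5)                 # 3 taps
wide = gaussian_deriv_kernel(1.5)                  # 9 taps

ramp = 3.0 * np.arange(20.0)                       # slope 3
response = np.convolve(ramp, wide, mode="valid")   # interior samples only
```

With three taps the normalization leaves no freedom: any sigma produces the central difference kernel (1/2)[1 0 -1].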

Differentiating in 2D
Derivative of function of one variable: $\frac{df}{dx}$
  derivative yields slope of function
Extended to several variables yields gradient: $\nabla f = \left[ \frac{\partial f}{\partial x} \;\; \frac{\partial f}{\partial y} \right]^T$
  gradient yields slope in the direction of maximal slope (points uphill)

2D Gaussian derivative
  $\frac{\partial G(x,y)}{\partial x} = -\frac{x}{\sigma^2}\, G(x,y)$, $\frac{\partial G(x,y)}{\partial y} = -\frac{y}{\sigma^2}\, G(x,y)$

2D Gaussian derivative is separable
  horizontal 1D Gaussian derivative kernel, vertical 1D Gaussian kernel
  smooth in one direction, differentiate in the other

Computing image gradient
Common 2D Gauss derivative kernels: each is a 1D smoothing kernel in one direction combined with (1/2) [1 0 -1] in the other
  Prewitt: smoothing kernel (1/3) [1 1 1]
  Sobel: smoothing kernel (1/4) [1 2 1]
  Scharr: smoothing kernel (1/16) [3 10 3]
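
The separable structure of these operators can be sketched as two 1D passes. The dictionary `SMOOTH`, the helper names, and the zero extension at the borders are assumptions for this example; the kernel weights are the standard Prewitt, Sobel, and Scharr smoothing weights, normalized to sum to 1.

```python
import numpy as np

SMOOTH = {"prewitt": np.array([1, 1, 1]) / 3.0,
          "sobel": np.array([1, 2, 1]) / 4.0,
          "scharr": np.array([3, 10, 3]) / 16.0}
DERIV = np.array([1, 0, -1]) / 2.0          # central difference, convolution order

def convolve_rows(img, k):
    """Convolve every row with the 1D kernel k (horizontal pass)."""
    return np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)

def convolve_cols(img, k):
    """Convolve every column with the 1D kernel k (vertical pass)."""
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, img)

def gradient(img, name="sobel"):
    """Return (gx, gy): differentiate along one axis, smooth along the other."""
    img = np.asarray(img, dtype=float)      # convert to floating point first
    gx = convolve_cols(convolve_rows(img, DERIV), SMOOTH[name])
    gy = convolve_rows(convolve_cols(img, DERIV), SMOOTH[name])
    return gx, gy

# Equivalent 2D kernel for the x-derivative: outer product of the two 1D kernels
sobel_2d = np.outer(SMOOTH["sobel"], DERIV)

yy, xx = np.mgrid[0:8, 0:9]
plane = 2.0 * xx + 3.0 * yy                 # intensity ramp with gradient (2, 3)
gx, gy = gradient(plane, "sobel")
```

Border samples are affected by the zero extension, so only the interior of `gx` and `gy` reflects the true slope of the plane.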

Frei-Chen
Frei-Chen projects 3x3 subimage onto orthogonal basis:
  Basis for edge subspace
  Basis for line subspace

Roberts cross operator
from Roberts (1965), the first computer vision thesis

Statistics of natural images
All natural images have similar statistics: not Gaussian (high kurtosis)
[Huang and Mumford, Statistics of natural images and models, CVPR 1999; Fergus et al., Removing camera shake from a single photograph, SIGGRAPH 2006]

Canny edge detection
Landmark paper (1986)
Key ideas:
  theoretical basis for Gaussian derivative
  non-maximal suppression
  edge following with hysteresis (double thresholding)
  feature synthesis (early version of scale space)
Still popular because
  easy to implement
  small number of intuitive parameters
  computationally efficient
  good results

Canny algorithm
Four steps:
  1. Convolve with derivative of Gaussian to get gx and gy
  2. Non-maximal suppression
  3. Edge-linking with hysteresis
  4. Feature synthesis (optional)

Step 1: Gradient
(convert image to floating point first)
  gradient magnitude: $\sqrt{g_x^2 + g_y^2}$
  gradient direction: $\theta = \operatorname{atan2}(g_y, g_x)$

Step 2: Non-maximal suppression (or +)

for each pixel p, if (p
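
One common way to realize non-maximal suppression, not necessarily the exact procedure on the slide, keeps a pixel only if its gradient magnitude is at least as large as that of its two neighbors along the gradient direction. In the sketch below the function name, the quantization of the direction to four sectors, and the untouched one-pixel border are all assumptions.

```python
import numpy as np

def non_max_suppression(gx, gy):
    """Zero out pixels that are not local maxima along the gradient direction."""
    mag = np.hypot(gx, gy)
    # Directions theta and theta + 180 degrees select the same pair of neighbors
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    sector = np.round(angle / 45.0).astype(int) % 4      # 0, 45, 90, 135 degrees
    # (dy, dx) step along the gradient for each sector; y grows downward
    steps = {0: (0, 1), 1: (1, 1), 2: (1, 0), 3: (1, -1)}
    out = np.zeros_like(mag)
    h, w = mag.shape
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            dy, dx = steps[sector[y, x]]
            if (mag[y, x] >= mag[y + dy, x + dx] and
                    mag[y, x] >= mag[y - dy, x - dx]):
                out[y, x] = mag[y, x]
    return out

# Blurred vertical edge: the gradient points along x and peaks in column 2
gx = np.tile(np.array([0.0, 1.0, 3.0, 1.0, 0.0]), (5, 1))
gy = np.zeros_like(gx)
thin = non_max_suppression(gx, gy)
```

The result is a one-pixel-wide ridge, which is what the subsequent edge-linking step expects as input.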

Step 3: Edge linking with hysteresis
Determine thresholds:
  1. Sort gradient magnitude values (alternate approach: use histogram)
  2. Select th_hi so that p% of pixels are edges (p = 10): th_hi is the sorted value at position p% times the number of pixels in the image, counted from the largest
  3. Select th_lo = th_hi / ratio (ratio = 5)
Apply thresholds:
  1. Set all pixels above th_hi to 1
  2. Perform floodfill from all these pixels, setting all adjacent pixels above th_lo to 1

Edge linking with hysteresis (cont.)
Figure panels: low, high, final
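
The two-threshold procedure can be sketched with a queue-based flood fill. The function names, the use of `np.percentile` in place of an explicit sort, and the 8-connected neighborhood are assumptions for this example.

```python
from collections import deque
import numpy as np

def select_thresholds(mag, p=10.0, ratio=5.0):
    """Pick th_hi so that about p percent of pixels exceed it; th_lo = th_hi / ratio."""
    th_hi = np.percentile(mag, 100.0 - p)
    return th_hi, th_hi / ratio

def hysteresis(mag, th_hi, th_lo):
    """Mark pixels above th_hi, then grow into adjacent pixels above th_lo."""
    edges = mag > th_hi
    queue = deque(zip(*np.nonzero(edges)))        # seeds: strong pixels
    h, w = mag.shape
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w
                        and not edges[ny, nx] and mag[ny, nx] > th_lo):
                    edges[ny, nx] = True
                    queue.append((ny, nx))
    return edges

mag = np.zeros((5, 7))
mag[2, :] = [0, 3, 3, 10, 3, 3, 0]   # one strong pixel with weak neighbors
mag[0, 6] = 3                        # weak pixel not connected to any strong pixel
linked = hysteresis(mag, th_hi=5.0, th_lo=2.0)
```

Weak pixels survive only when a chain of weak pixels connects them to a strong one, which suppresses isolated noise responses while keeping faint parts of real contours.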

More Canny results
(figures: two different implementations and parameter values)

Canny theory
Ideal step edge: a step whose height is the amplitude of signal
Actual edge: ideal step edge plus noise (with its own amplitude of noise)
f(x) is filter we are trying to find
Two criteria:
  Good detection (maximize SNR): (response to true edge) / (RMS response to noise)
  Good localization (minimize RMS distance): 1 / (RMS distance to true edge)
  (messy math)
Maximize product of the two criteria

Localization-Detection Tradeoff
By substitution, a wider filter achieves
  better detection
  worse localization
in the same amounts! Therefore, the product is invariant to scale of the filter

Canny theory (cont.)
In 1D, box filter maximizes the product but causes spurious detections
New constraint: There should be only one response to single edge
Numerical simulation yields a function that looks similar to derivative of a Gaussian
In 2D, convolve with Gaussian partial derivative, then steer to direction of edge normal

Edge template matching
Figure panels: image, Canny edges, chamfer distance, template, search edges, probability map, object found!
At each location, sum the chamfer distances of all edge pixels

Laplacian of Gaussian

Second derivative of Gaussian
  [1 -1] * [1 -1] = [1 -2 1]
This is the only 3-tap 2nd derivative Gaussian kernel (just like [-1 0 1] is the only 3-tap 1st derivative Gaussian kernel)

Second derivative of Gaussian (cont.)
General case: $G''(x) = \frac{x^2 - \sigma^2}{\sigma^4}\, G(x)$

Laplacian of Gaussian
Laplacian operator is the divergence of the gradient: $\nabla^2 f = \nabla \cdot \nabla f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}$
Associative property: $\nabla^2 (f \circledast G) = f \circledast (\nabla^2 G)$
2D LoG: $\nabla^2 G(x,y) = \frac{x^2 + y^2 - 2\sigma^2}{\sigma^4}\, G(x,y)$
  where $G(x,y) = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{x^2+y^2}{2\sigma^2}\right)$ (isotropic zero-mean 2D Gaussian)
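
A discrete LoG kernel can be assembled from 1D pieces, since the 2D LoG is the sum of two separable terms. The sketch below is illustrative: the name `log_kernel`, the half-width of four sigmas, and the normalization (zero response to a constant, response 4 to the paraboloid x^2 + y^2, whose Laplacian is 4) are assumptions chosen for this example.

```python
import numpy as np

def log_kernel(sigma):
    """Sampled Laplacian of Gaussian, built as a sum of two separable kernels."""
    half = int(np.ceil(4 * sigma))
    x = np.arange(-half, half + 1, dtype=float)
    g = np.exp(-x**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)  # G(x)
    g2 = (x**2 - sigma**2) / sigma**4 * g                               # G''(x)
    # Rows index y, columns index x: G(y) G''(x) + G''(y) G(x)
    k = np.outer(g, g2) + np.outer(g2, g)
    k -= k.mean()                           # constant image must give zero
    r2 = x[:, None] ** 2 + x[None, :] ** 2
    k *= 4.0 / np.sum(k * r2)               # paraboloid x^2 + y^2 must give 4
    return k

kernel = log_kernel(1.0)
center = kernel[kernel.shape[0] // 2, kernel.shape[1] // 2]
```

The center weight is negative and the surrounding ring is positive, which is the inverted Mexican hat shape.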

Simplest LoG kernel
Just add the 2nd-partial in x with the 2nd-partial in y:
  [ 0  0  0 ]   [ 0  1  0 ]   [ 0  1  0 ]
  [ 1 -2  1 ] + [ 0 -2  0 ] = [ 1 -4  1 ]
  [ 0  0  0 ]   [ 0  1  0 ]   [ 0  1  0 ]

Creating 3x3 LoG kernels
(derivation shown on slide)

Laplacian of Gaussian (cont.)
  inverted Mexican hat (negative center, positive surround)
  bandpass filter
  center surround
  example discrete kernels shown for these $\sigma^2$ values: 0.1, 0.5, 0.33, 0.25, 0.20, 0.167

Common LoG kernels
(kernels shown on slide)

Marr-Hildreth operator
Marr-Hildreth (1980)
  Zero crossings of 2nd derivative are extrema of 1st derivative
  Center surround, biologically plausible (difference of Gaussian)
  Not separable (but sum of two kernels, each of which is separable)
  Not as good as Canny, b/c smooths across boundary
  Generates closed curves

LoG and sLoG
Figure panels: image, LoG, sign of the LoG (sLoG), zero crossings

More differentiation practice
  Second derivative w.r.t. x: $\frac{\partial^2 G}{\partial x^2} = \frac{x^2 - \sigma^2}{\sigma^4}\, G$
  First derivative w.r.t. $\sigma$: $\frac{\partial G}{\partial \sigma} = \frac{x^2 - \sigma^2}{\sigma^3}\, G = \sigma \frac{\partial^2 G}{\partial x^2}$
  surprising result!

Difference of Gaussians
  $\sigma_2 = 1.6\, \sigma_1$
  DoG $\approx$ LoG
  (What about 2D?)

LoG pyramid
Repeat: smooth, difference, downsample

Scale space
  position x, scale $\sigma$
  H: normalized Hessian matrix

Scale Invariant Feature Transform (SIFT)
  1. scale-space extrema detection
  2. keypoint localization
  3. orientation assignment
  4. keypoint descriptor

SIFT for object recognition
(figure)

Extra slides

Creating 3x1 kernels
Sample Gaussian: [a b a]
Normalize by 2a+b
Values of a and b are determined by $\sigma$
Examples:
  (1/4) * [1 2 1]
  (1/16) * [3 10 3]

Creating 3x1 kernels (cont.)
Sample Gaussian derivative: [a 0 -a]
Convolution with ramp should yield slope of ramp:
  [a 0 -a] .* [m+2 m+1 m+0] = 1, so 2a = 1 and a = 1/2, where m is arbitrary
So divide the sampled kernel by twice its first sample to get
  (1/2) * [1 0 -1]
This is the only 3x1 Gauss deriv kernel!

Creating 3x1 kernels (cont.)
Sample Gaussian 2nd derivative: [a -b a]
Convolution with constant should yield zero:
  2a - b = 0, so b = 2a: [a -2a a]
Convolution with changing ramp should yield change in slope:
  [a -2a a] .* [p n m] = (p-n) - (n-m), so a = 1
Alternatively, convolution with parabola $y = x^2$ should yield 2:
  [a -2a a] .* [1 0 1] = 2, so a = 1
Either way, this yields:
  [1 -2 1]
This is the only 3x1 Gauss 2nd deriv kernel!

Creating 1D kernels
General approach, which works for all 1D kernels:
To create 1D kernel of nth derivative of Gaussian,
  sample continuous function
  normalize by scaled central moment: $\frac{1}{n!} \sum_i (-i)^n\, g(i)$, with i measured from the kernel center
  why? Because the nth derivative of $x^n$ is $n!$, so convolution with $x^n$ should yield $n!$
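
The general recipe can be sketched for n = 0, 1, 2 and checked against the 3x1 kernels derived above. The function name `gaussian_deriv_1d`, the explicit mean subtraction for n > 0 (so that a constant gives zero), and the restriction to n of at most 2 are assumptions for this example.

```python
from math import factorial
import numpy as np

def gaussian_deriv_1d(n, sigma, half):
    """Sampled nth derivative of Gaussian (n = 0, 1, 2), moment-normalized."""
    x = np.arange(-half, half + 1, dtype=float)
    g = np.exp(-x**2 / (2 * sigma**2))
    shape = {0: np.ones_like(x),
             1: -x / sigma**2,
             2: (x**2 - sigma**2) / sigma**4}[n]
    k = shape * g                          # sampled continuous function, up to scale
    if n > 0:
        k -= k.mean()                      # derivative of a constant must be zero
    # Convolution with x^n, evaluated at the origin, is sum((-x)^n * k);
    # scale the kernel so that this equals n!
    k *= factorial(n) / np.sum((-x) ** n * k)
    return k

smooth3 = gaussian_deriv_1d(0, 0.7, 1)
deriv3 = gaussian_deriv_1d(1, 0.7, 1)
second3 = gaussian_deriv_1d(2, 0.7, 1)
```

For three taps the first and second derivative kernels come out the same for any sigma, in line with the uniqueness claims above; only the smoothing kernel depends on sigma.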