No, we are not off the edge-detection topic yet. However, before introducing some fancier algorithms, we have to do some groundwork first. And to keep every step fruitful, we’ll start with Gaussian blur.
The math (as in code)
I don’t want to go into much detail, so here’s the brief version of the Gaussian blur introduction – it blurs an image by adding the weighted grayscale values of the surrounding pixels onto the pixel to be blurred. The center pixel gets the biggest weight, and the surrounding pixels get less and less weight as their distance to the center increases. The table of weights is calculated using the Gaussian function, which is

G(x, y) = 1 / (2πσ²) · e^(−(x² + y²) / (2σ²))

in two dimensions. The standard deviation σ controls how blurred the image will get. x, y are coordinates relative to the center pixel. For example, say you have a 3X3 picture with grayscale values as shown in the following table:
200 | 200 | 100 |
130 | 200 | 120 |
30 | 130 | 30 |
The weight table when σ=1 (or a sampled Gaussian kernel, in fancier terms) is calculated as:
G(-1,-1)=0.0585498 | G(0,-1)=0.0965324 | G(1,-1)=0.0585498 |
G(-1,0)=0.0965324 | G(0,0)=0.159155 | G(1,0)=0.0965324 |
G(-1,1)=0.0585498 | G(0,1)=0.0965324 | G(1,1)=0.0585498 |
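These values are easy to verify against the formula. Here’s a quick Python check of the table (Python purely for illustration – the toolkit itself is C#):

```python
import math

def g2d(x, y, sigma=1.0):
    # 2D Gaussian: G(x,y) = 1/(2*pi*sigma^2) * exp(-(x^2+y^2)/(2*sigma^2))
    return math.exp(-(x * x + y * y) / (2 * sigma * sigma)) / (2 * math.pi * sigma * sigma)

for y in (-1, 0, 1):
    # Each row matches one row of the weight table above.
    print(["%.7f" % g2d(x, y) for x in (-1, 0, 1)])
```

The center value 0.159155 is just 1/(2π), and the symmetric cells come out identical, as expected.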
Notice that the table (kernel) is constant for a given size and σ, so you don’t need to re-calculate it for each pixel. You may also notice that many cells hold the same value in the above table. This is because the Gaussian function is circularly symmetric. Of course I can’t get away without showing the pretty picture of a 2D Gaussian function.
What we need to do next is to apply the G(x,y) values to the grayscale values and add them up as the new grayscale value at (0,0):

newValue = 200*G(-1,-1) + 200*G(0,-1) + 100*G(1,-1) + 130*G(-1,0) + … + 30*G(1,1)
Basically the operation takes the pixel itself (at the center) and its 8 neighbors and mixes them together as the new value of the center. The original center pixel has the most effect on the result because it has the biggest weight; however, the neighbors also get a share in the final result. Repeat the operation for each pixel of the image and you’ll get a blurred image.
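Working through the arithmetic of the example above (again a Python check, not toolkit code):

```python
import math

def g2d(x, y, sigma=1.0):
    # 2D Gaussian weight at offset (x, y) from the center pixel.
    return math.exp(-(x * x + y * y) / (2 * sigma * sigma)) / (2 * math.pi * sigma * sigma)

# The 3x3 grayscale values from the table; rows are y = -1, 0, 1.
pixels = [[200, 200, 100],
          [130, 200, 120],
          [ 30, 130,  30]]

new_value = sum(pixels[y + 1][x + 1] * g2d(x, y)
                for y in (-1, 0, 1) for x in (-1, 0, 1))
print(new_value)  # ≈ 108.9
```

With these raw weights the result is about 108.9 – darker than the neighborhood, because the 3X3 weights only sum to about 0.779. Dividing by that sum gives about 139.7, which is what the normalization discussed below produces.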
Another thing to mention is that the G(x,y) pass can be split into two steps by applying the 1D G(x) first and then applying the 1D G(y) on the result. The 1D Gaussian function looks like this:

G(x) = 1 / (√(2π) σ) · e^(−x² / (2σ²))
Applying G(x) and G(y) separately saves a lot of calculation because a 1D Gaussian kernel is a 1D array, not an n×n matrix. In the code below we’ll use this 2-pass solution.
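The reason this split works is that the 2D Gaussian is separable: G(x,y) = G(x)·G(y), so the 2D kernel is the outer product of two 1D kernels. A quick Python check (illustrative only):

```python
import math

def g1d(x, sigma=1.0):
    # 1D Gaussian: G(x) = 1/(sqrt(2*pi)*sigma) * exp(-x^2/(2*sigma^2))
    return math.exp(-x * x / (2 * sigma * sigma)) / (math.sqrt(2 * math.pi) * sigma)

def g2d(x, y, sigma=1.0):
    return math.exp(-(x * x + y * y) / (2 * sigma * sigma)) / (2 * math.pi * sigma * sigma)

# Every 2D weight factors exactly into the product of two 1D weights.
for x in (-1, 0, 1):
    for y in (-1, 0, 1):
        assert abs(g2d(x, y) - g1d(x) * g1d(y)) < 1e-12
```

Per pixel this turns n² multiplications into 2n – a big win as the kernel grows.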
Also, in the above example we chose a 3X3 matrix; however, that’s not enough. In theory the value of the Gaussian function never falls to zero, so to really consider all surrounding pixels we would need an infinitely large matrix. However, the Gaussian function decreases rapidly – in other words, the G(x,y) values get close to zero as the distance from the center grows. Usually 3σ can be considered a safe cut-off point, so for σ=1, what we need is a 7X7 matrix (3σ on each side plus the center).
Lastly, since the values in the kernel don’t add up to 1, the resulting image would be darkened or brightened. To fix this, the values in the matrix need to be normalized. This can be done by dividing each value by the sum of all values in the matrix. The following sample code implements such normalization.
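To see the effect of normalization concretely, here is a small Python sketch (illustrative only, not the toolkit code) that samples a 1D kernel for σ=1, size 7, and normalizes it:

```python
import math

def sampled_kernel(sigma, size):
    # Sample the 1D Gaussian at integer offsets from the center tap.
    half = size // 2
    return [math.exp(-(i - half) ** 2 / (2 * sigma * sigma)) /
            (math.sqrt(2 * math.pi) * sigma)
            for i in range(size)]

raw = sampled_kernel(1.0, 7)
norm = [v / sum(raw) for v in raw]

print(sum(raw))   # ≈ 0.9997 – slightly under 1, so the image would darken a bit
print(sum(norm))  # 1.0 (up to floating-point rounding)
```

For σ=1 the raw sum is already close to 1, but for small kernels or other σ values the deviation is larger, so normalizing is always the safe choice.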
The code
The Gaussian class
public class Gaussian
{
    public static double[,] Calculate1DSampleKernel(double deviation, int size)
    {
        double[,] ret = new double[size, 1];
        double sum = 0;
        int half = size / 2;
        for (int i = 0; i < size; i++)
        {
            // Sample the 1D Gaussian at integer offsets from the center tap.
            ret[i, 0] = 1 / (Math.Sqrt(2 * Math.PI) * deviation) *
                        Math.Exp(-(i - half) * (i - half) / (2 * deviation * deviation));
            sum += ret[i, 0];
        }
        return ret;
    }

    public static double[,] Calculate1DSampleKernel(double deviation)
    {
        // 3 standard deviations on each side, plus the center tap.
        int size = (int)Math.Ceiling(deviation * 3) * 2 + 1;
        return Calculate1DSampleKernel(deviation, size);
    }

    public static double[,] CalculateNormalized1DSampleKernel(double deviation)
    {
        return NormalizeMatrix(Calculate1DSampleKernel(deviation));
    }

    public static double[,] NormalizeMatrix(double[,] matrix)
    {
        double[,] ret = new double[matrix.GetLength(0), matrix.GetLength(1)];
        double sum = 0;
        for (int i = 0; i < ret.GetLength(0); i++)
        {
            for (int j = 0; j < ret.GetLength(1); j++)
                sum += matrix[i, j];
        }
        if (sum != 0)
        {
            for (int i = 0; i < ret.GetLength(0); i++)
            {
                for (int j = 0; j < ret.GetLength(1); j++)
                    ret[i, j] = matrix[i, j] / sum;
            }
        }
        return ret;
    }
}
The class defines a series of overloaded methods to calculate a sample kernel based on a given standard deviation. For example, to get a kernel for deviation 1, call:
double[,] ret = Gaussian.CalculateNormalized1DSampleKernel(1);
The returned array is a 7X1 array containing the kernel.
Then the code to apply the kernel looks like this (also part of the Gaussian class):
public static double[,] GaussianConvolution(double[,] matrix, double deviation)
{
    double[,] kernel = CalculateNormalized1DSampleKernel(deviation);
    double[,] res1 = new double[matrix.GetLength(0), matrix.GetLength(1)];
    double[,] res2 = new double[matrix.GetLength(0), matrix.GetLength(1)];
    //x-direction
    for (int i = 0; i < matrix.GetLength(0); i++)
    {
        for (int j = 0; j < matrix.GetLength(1); j++)
            res1[i, j] = processPoint(matrix, i, j, kernel, 0);
    }
    //y-direction
    for (int i = 0; i < matrix.GetLength(0); i++)
    {
        for (int j = 0; j < matrix.GetLength(1); j++)
            res2[i, j] = processPoint(res1, i, j, kernel, 1);
    }
    return res2;
}

private static double processPoint(double[,] matrix, int x, int y, double[,] kernel, int direction)
{
    double res = 0;
    int half = kernel.GetLength(0) / 2;
    for (int i = 0; i < kernel.GetLength(0); i++)
    {
        int cox = direction == 0 ? x + i - half : x;
        int coy = direction == 1 ? y + i - half : y;
        // Out-of-bounds neighbors are skipped, i.e. treated as zero.
        if (cox >= 0 && cox < matrix.GetLength(0) && coy >= 0 && coy < matrix.GetLength(1))
        {
            res += matrix[cox, coy] * kernel[i, 0];
        }
    }
    return res;
}
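Since the normalized kernel sums to 1, a constant image should come through the two passes unchanged – except near the borders, where the bounds check in processPoint silently treats out-of-bounds neighbors as zero and darkens the edges. A Python port of the same two-pass logic (my own sketch, not the toolkit code) demonstrates both effects:

```python
import math

def kernel1d(sigma):
    # Normalized 1D kernel: 3 sigma on each side plus the center tap.
    size = int(math.ceil(sigma * 3)) * 2 + 1
    half = size // 2
    raw = [math.exp(-(i - half) ** 2 / (2 * sigma * sigma)) for i in range(size)]
    s = sum(raw)
    return [v / s for v in raw]

def convolve_pass(img, k, vertical):
    # One 1D pass; out-of-bounds taps contribute zero, as in processPoint.
    h, w = len(img), len(img[0])
    half = len(k) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i, kv in enumerate(k):
                yy = y + i - half if vertical else y
                xx = x if vertical else x + i - half
                if 0 <= yy < h and 0 <= xx < w:
                    acc += img[yy][xx] * kv
            out[y][x] = acc
    return out

img = [[100.0] * 15 for _ in range(15)]   # a constant gray image
k = kernel1d(1.0)
blurred = convolve_pass(convolve_pass(img, k, False), k, True)
print(blurred[7][7])  # 100.0 – interior pixels are preserved
print(blurred[0][0])  # ≈ 48.9 – corners are darkened by the zero border
```

The border darkening is usually invisible on photos but worth knowing about; common fixes are clamping or mirroring the edge pixels, or re-normalizing by the sum of the in-bounds weights.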
Finally, the code to process the Bitmap type:
private Color grayscale(Color cr)
{
    // Standard luminance weights: 0.3 R + 0.59 G + 0.11 B.
    int gray = (int)(cr.R * .3 + cr.G * .59 + cr.B * .11);
    return Color.FromArgb(cr.A, gray, gray, gray);
}

public override Bitmap FilterProcessImage(double d, Bitmap image)
{
    Bitmap ret = new Bitmap(image.Width, image.Height);
    double[,] matrix = new double[image.Width, image.Height];
    for (int i = 0; i < image.Width; i++)
    {
        for (int j = 0; j < image.Height; j++)
            matrix[i, j] = grayscale(image.GetPixel(i, j)).R;
    }
    matrix = Gaussian.GaussianConvolution(matrix, d);
    for (int i = 0; i < image.Width; i++)
    {
        for (int j = 0; j < image.Height; j++)
        {
            int val = (int)Math.Min(255, matrix[i, j]);
            ret.SetPixel(i, j, Color.FromArgb(255, val, val, val));
        }
    }
    return ret;
}
The result
The following is a screenshot of my toolkit. On the left is the original image. In the middle is the blurred (smoothed) image – you can’t easily tell on a natural photograph like this, though. On the right is the edge map detected by the Kirsch detector based on the blurred image. You can already see the result is a great improvement over the previous post – the facial hair doesn’t cause false positives; the branch and the arm start to take shape; the eyes and nose are clearly marked; etc. However, this is not the full story yet. Stay tuned for further discussions on edge detection!
And the following is an example of Gaussian blur with a big deviation (8). Note that in the toolkit I implemented the filters on all channels so the original color is preserved – all you need to do is apply the filter to each channel and then combine them back together.
We can also have a little fun by applying the filter to only one channel. The left image shows a Star Wars-like sword with a boring red blade. Applying a Gaussian blur on the red channel lights up the blade. Note that here I cheated a little by multiplying the green channel by 2 to make it brighter in the result. Nothing significant here – this is just for fun ;).
Thanks for this explanation and code - jatinder
I think there's a bug in the kernel creation code.
The resulting left side and right side of the array should be mirrored. But they are not.
For example, if the function is called with deviation 1, then size will be (3 * 1) * 2 + 1 = 7. Half would then be 7 / 2 = 3.5.
The resulting array [0-6] is Array ( [0] => 0.00087347307590098 [1] => 0.017544175717419 [2] => 0.12963489858551 [3] => 0.35238418915912 [4] => 0.35238418915912 [5] => 0.12963489858551 [6] => 0.017544175717419 )
1 and 6, 2 and 5 and 3 and 4 are equal. This is wrong. 0 and 6, 1 and 5 and 2 and 4 should be equal. And 3 should be in the middle.
The cause of this problem is that the 'half' formula is wrong:
int half = size / 2;
Half of 7 is indeed 3.5, but the array keys run from 0 to 6 and 3.5 is not in the middle of 0 and 6.
The formula should be changed to:
int half = (size - 1) / 2;
Also I think half should be a double, not an int. Otherwise the numbers are wrong if size is an even number (for example size = 4: then half would be (4 - 1) / 2 = 1, while it should be 1.5, in the middle of 0 and 3).
Please correct me if I'm wrong.
Oh and thanks for the code, it's all very clear!
Hi, thank you for this tutorial. Really useful.
However, I am looking for a Gaussian blur with a color result rather than grayscale.
Can you tell me what I should change to do this?
Thanks again!