CNN: Understanding edge detection with an example

The convolution operation is one of the fundamental building blocks of a convolutional neural network. Here we’ll use edge detection as an example to see how the convolution operation works. In my last post, we saw how the early layers of a neural network might detect edges, some later layers might detect parts of objects, and even later layers might detect complete objects like people’s faces.

Image Source: Andrew Ng Deep Learning Course

In this post, we’ll see how we can detect edges in an image.

Consider the picture below:

Image Source: Andrew Ng Deep Learning Course

For a computer to figure out what objects are in this picture, the first thing it might do is detect the vertical edges in the image. For example, the first image on the left shows the vertical lines (marked in red) along the buildings and pedestrians, which are picked up in the vertical edge detector’s output. We also want to detect horizontal edges; for example, the very strong horizontal line at the railing is detected as well. Now let’s see in detail how we detect edges in images like the one above.

Let us look at an example. Here is a 6×6 grayscale image. Because it is grayscale, it is just a 6×6×1 matrix rather than 6×6×3, since there are no separate RGB channels. To detect edges, say vertical edges, in this image, we construct a 3×3 matrix, which in the terminology of convolutional neural networks is called a filter. We then take the 6×6 image and convolve it with the 3×3 filter; the convolution operation is denoted by an asterisk (*). The output of this convolution is a 4×4 matrix (6 − 3 + 1 = 4 in each dimension), which we can interpret as a 4×4 image. The way we compute this 4×4 output is shown in the figure below.

Image Source: Andrew Ng Deep Learning Course

Explanation: To compute the first element, the upper-left element of the 4×4 matrix, take the 3×3 filter and place it on top of the upper-left 3×3 region of the original input image (left side). Then take the element-wise product of the two and sum the nine results (see the calculation in red at the top of the diagram). To compute the remaining elements of the 4×4 output, shift the filter one position at a time across and down the image and repeat. These are really just matrices of various dimensions, but it is convenient to interpret the matrix on the left as an image, the one in the middle as a filter, and the one on the right as another image. This filter turns out to be a vertical edge detector.
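The sliding-window computation described above can be sketched in NumPy. This is a minimal sketch, not code from the course: the function name `convolve2d_valid` is hypothetical, the pixel values are illustrative assumptions (the figure is not reproduced here), and the filter is assumed to be the 3×3 vertical edge detector whose columns are 1, 0 and -1.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and, at each
    position, sum the element-wise products of the kernel and the patch."""
    n_h, n_w = image.shape
    f_h, f_w = kernel.shape
    out = np.zeros((n_h - f_h + 1, n_w - f_w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + f_h, j:j + f_w]
            out[i, j] = np.sum(patch * kernel)
    return out

# Illustrative 6x6 grayscale image (assumed values).
image = np.array([
    [3, 0, 1, 2, 7, 4],
    [1, 5, 8, 9, 3, 1],
    [2, 7, 2, 5, 1, 3],
    [0, 1, 3, 1, 7, 8],
    [4, 2, 1, 6, 2, 8],
    [2, 4, 5, 2, 3, 9],
])

# Assumed 3x3 vertical edge filter: bright on the left, dark on the right.
vertical_filter = np.array([
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
])

result = convolve2d_valid(image, vertical_filter)
print(result.shape)   # (4, 4)
print(result[0, 0])   # -5.0  (3 + 1 + 2 - 1 - 8 - 2)
```

Note that the filter is not flipped before sliding, so strictly speaking this is a cross-correlation; in deep learning it is conventionally still called convolution.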

Let’s look at another example to see exactly how vertical edge detection happens. To illustrate this, we are going to use a simplified image. Here is a simple 6×6 image where the left half of the image is 10 and the right half is zero. If you plot this as a picture, the left half, the 10s, gives brighter pixel intensity values and the right half gives darker pixel intensity values. We are using a shade of gray to denote the zeros. In this image there is clearly a very strong vertical edge right down the middle, where it transitions from white to black, or from white to a darker color. We convolve this image with a 3×3 filter, which can be visualized as brighter pixels on the left, zeros in the middle and darker pixels on the right. What we get is the rightmost matrix.

Image Source: Andrew Ng Deep Learning Course

Now, if we plot the rightmost matrix as an image, there is a lighter region right in the middle, which corresponds to the vertical edge detected down the middle of our 6×6 image. The dimensions may seem a little off, in that the detected edge looks really thick, but that is only because we are working with a very small image in this example. If we use, say, a 1000×1000 image rather than a 6×6 image, this does a pretty good job of detecting the vertical edges in the image. In this example, the bright region in the middle is just the output image’s way of saying that there is a strong vertical edge right down the middle of the input image. This is how we detect vertical edges using the convolution operator.
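The simplified example can be checked with a few lines of NumPy. This is a minimal sketch under the assumption that the filter has columns 1, 0 and -1 (brighter on the left, zeros in the middle, darker on the right, as described above); the variable names are illustrative.

```python
import numpy as np

# 6x6 image: left half is 10 (bright), right half is 0 (dark).
image = np.zeros((6, 6))
image[:, :3] = 10

# Assumed 3x3 vertical edge filter.
vertical_filter = np.array([[1, 0, -1]] * 3)

# Slide the filter over every 3x3 patch and sum the element-wise products.
output = np.zeros((4, 4))
for i in range(4):
    for j in range(4):
        output[i, j] = np.sum(image[i:i + 3, j:j + 3] * vertical_filter)

print(output)
# Every row is [0, 30, 30, 0]: the two middle columns light up
# where the filter straddles the bright-to-dark transition.
```

The two columns of 30s are the thick "lighter region" in the middle of the output: on a 6×6 input the edge response covers half of the 4×4 result.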

In the next post, I will try to explain more edge detection examples, including horizontal edge detection.

Happy learning!

Deep learning on large images: challenges and CNN

Applying deep learning to large images is always a challenge, but convolution offers a solution. First, let’s briefly look at where the challenge lies. One of the challenges in computer vision is that the inputs become very big as the image size increases.

OK. Consider a basic image classification problem, e.g. cat detection.

Let’s take a 64×64 input image and try to figure out whether it shows a cat. The input has 64×64×3 values (where 3 is the number of RGB channels), so the input feature vector x has 12,288 dimensions. This is not a very large number, but a 64×64 image is very small, and it already gives us a lot of input features to deal with.

Now take a 1000×1000 image (1 megapixel), which is a decent size. It has 1000×1000×3 (where 3 is the number of RGB channels) = 3 million input features.

If we feed this into a deep network, x will have 3 million dimensions. Suppose the first hidden layer has 1000 hidden units; then the weight matrix of this fully connected layer has dimension (1000, 3M), which means 3 billion parameters. That is very, very large. With that many parameters it is difficult to get enough data to avoid overfitting, and the computational requirement of training 3 billion parameters is not feasible.
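The arithmetic above is easy to reproduce. The following is a small sketch; the helper names `input_features` and `dense_weights` are hypothetical and only count sizes (bias terms are ignored), they do not build a network.

```python
def input_features(height, width, channels=3):
    """Number of values in the flattened input vector x for an RGB image."""
    return height * width * channels

def dense_weights(n_inputs, n_hidden):
    """Number of weights in a fully connected layer (bias terms ignored)."""
    return n_inputs * n_hidden

small = input_features(64, 64)        # 64 x 64 x 3
large = input_features(1000, 1000)    # 1000 x 1000 x 3
weights = dense_weights(large, 1000)  # first hidden layer with 1000 units

print(small)    # 12288
print(large)    # 3000000
print(weights)  # 3000000000
```

For comparison, a single 3×3 filter like the one used for edge detection has only 9 weights, regardless of the image size.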

And this is just for a 1 megapixel image. In computer vision problems you don’t want to be stuck using only tiny images, but with a fully connected network bigger images mean a huge input feature vector and a risk of overfitting. This is where the convolution operation, the basic building block of a Convolutional Neural Network (CNN), comes in.

Image Source: Andrew Ng Deep Learning Course



More on CNN in the next post.

Cheers!

Inspiration & source of understanding: Andrew Ng, Deep Learning Specialization

Deep learning specialization notes

A couple of months back I completed the Deep Learning Specialization taught by AI guru Andrew Ng. During the learning process, I made personal notes from all 5 courses. The notes are based on the lecture videos, the supplementary material provided and my own understanding of the topics. I have used lots of diagrams and code snippets which I made from the course videos and slides. I am fully complying with the Honor Code: no programming assignments or solutions are published on GitHub or any other site.

Please note that in most places I am not using exact mathematical symbols and notation but plain English names instead, just to save some time. Please also note that this is a personal diary made during the course; it is a bit long and in a few places not very well organized, so it does not in any way replace the content and learning process one follows during the course, which includes quizzes, programming assignments, projects etc. This is a great course, so I encourage you to enroll.

What you will learn at the end of the specialization:

Neural Networks and Deep Learning: This course gives the foundations of neural networks and deep learning, and how to build and train them. By the end of this course, we’ll be in a position to build a cat recognizer.  [PDF]

Improving Deep Neural Networks – Hyperparameter Tuning, Regularization and Optimization: In this course, we’ll learn about the practical aspects of neural networks. Once you have built a neural network or deep network, the focus is on how to make it perform well. We’ll cover hyperparameter tuning, regularization and optimization algorithms like RMSProp, Adam etc. This course helps greatly in making a model perform well.  [PDF]

Structuring your Machine Learning Project: In this course, we’ll learn how to structure machine learning projects. The strategy for machine learning projects has changed a lot in the deep learning era. For example, the way you divide data into train/dev/test sets has changed, as has the question of whether the train and test data come from the same distribution. We’ll also learn about end-to-end deep learning. The material in this course is relatively unique.  [PDF]

Convolutional Neural Networks (CNN): CNNs are often applied to images, mainly in computer vision problems. In this course, we’ll learn how to build these models using CNNs.  [PDF]

Natural Language Processing-Building Sequence Models: In this course, we’ll learn about algorithms like Recurrent Neural Networks (RNNs) and LSTM (Long Short-Term Memory), and how to apply them to sequence data in areas such as natural language processing, speech recognition, music generation etc.  [PDF]

 

Happy learning!

Please drop a note if you have any feedback.

 

References:

Deep Learning Specialization:
https://www.deeplearning.ai/

GitHub (source code and diagrams used in notes):
https://github.com/ppant/deeplearning.ai-notes

Deep learning Specialization completion certificate: https://www.coursera.org/account/accomplishments/specialization/WVPVCUMH94YS