CNN: Understanding edge detection with an example

The convolution operation is one of the fundamental building blocks of a convolutional neural network. Here we’ll use edge detection as an example to see how the convolution operation works. In my last post, we saw how the early layers of a neural network might detect edges, later layers might detect parts of objects, and even later layers might detect complete objects like people’s faces.

Image Source: Andrew Ng Deep Learning Course

In this post, we’ll see how we can detect edges in an image.

Consider the picture below:

Image Source: Andrew Ng Deep Learning Course

For a computer to figure out what objects are in this picture, the first thing it might do is detect vertical edges in the image. For example, the first image on the left shows the vertical lines (marked in red) where the buildings and pedestrians are, so those get detected in the vertical edge detector output. We also want to detect horizontal edges; for example, there is a very strong horizontal line where the railing is, and that gets detected too. Now let’s see in detail how we detect edges in images like the one above.

Let us look at an example. Here is a 6×6 grayscale image. Because it is grayscale, it is just a 6×6×1 matrix rather than 6×6×3, since there are no separate RGB channels. To detect edges, say vertical edges, in this image, we construct a 3×3 matrix, which in the terminology of convolutional neural networks is called a filter. We then take the 6×6 image and convolve it with the 3×3 filter; the convolution operation is denoted by an asterisk (*). The output of this convolution is a 4×4 matrix, which we can interpret as a 4×4 image. The way we compute this 4×4 output is shown in the figure below.

Image Source: Andrew Ng Deep Learning Course

Explanation: Looking at the diagram, to compute the first element, the upper-left element of the 4×4 matrix, we take the 3×3 filter and paste it on top of the upper-left 3×3 region of the original input image (left side). We then take the element-wise product and add up the nine resulting numbers (check the calculation at the top of the diagram, in red). Similarly, we calculate the other elements of the 4×4 output by shifting the filter one position at a time across the image. These are really just matrices of various dimensions. But the matrix on the left is convenient to interpret as an image, the one in the middle as a filter, and the one on the right as another image. And this turns out to be a vertical edge detector.
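To make the sliding-window computation concrete, here is a minimal Python sketch of it. The pixel values are made up for illustration and are not necessarily the ones in the figure, and the filter values (1, 0, -1 in every row) are an assumption: they are a commonly used vertical edge filter, but the text above does not list them. The function name convolve2d is also just a name chosen for this sketch.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image; at each position multiply
    element-wise and sum. No padding and no kernel flip, which is the
    CNN sense of 'convolution'."""
    n_rows, n_cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = n_rows - k_rows + 1
    out_cols = n_cols - k_cols + 1
    output = []
    for i in range(out_rows):
        row = []
        for j in range(out_cols):
            total = 0
            for a in range(k_rows):
                for b in range(k_cols):
                    total += image[i + a][j + b] * kernel[a][b]
            row.append(total)
        output.append(row)
    return output


# Illustrative 6x6 grayscale image (made-up pixel values).
image = [
    [3, 0, 1, 2, 7, 4],
    [1, 5, 8, 9, 3, 1],
    [2, 7, 2, 5, 1, 3],
    [0, 1, 3, 1, 7, 8],
    [4, 2, 1, 6, 2, 8],
    [2, 4, 5, 2, 3, 9],
]

# Assumed 3x3 vertical edge filter.
vertical_filter = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]

output = convolve2d(image, vertical_filter)
print(len(output), len(output[0]))  # 4 4
print(output[0][0])  # -5, i.e. (3 + 1 + 2) - (1 + 8 + 2)
```

The output size follows from how many positions the filter can occupy: for an n×n image and an f×f filter it is (n - f + 1), so 6 - 3 + 1 = 4 in each direction.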

Let’s look at another example to see exactly how vertical edge detection happens. To illustrate this, we are going to use a simplified image. Here is a simple 6×6 image where the left half of the image is 10 and the right half is zero. If you plot this as a picture, it might look like this, where the left half, the 10s, gives you brighter pixel intensity values and the right half gives you darker pixel intensity values. We are using a shade of gray to denote the zeros. In this image, there is clearly a very strong vertical edge right down the middle, as it transitions from white to a darker color. We convolve this with a 3×3 filter, which can be visualized as follows: lighter, brighter pixels on the left, zeros in the middle, and darker pixels on the right. What we get is the rightmost matrix.

Image Source: Andrew Ng Deep Learning Course

Now, if we plot the rightmost matrix as an image, it looks like this: there is a lighter region right in the middle, and that corresponds to having detected the vertical edge down the middle of our 6×6 image. If the dimensions seem a little off, in that the detected edge looks really thick, that’s only because we are working with very small images in this example. If we use, say, a 1000×1000 image rather than a 6×6 image, you will find that this does a pretty good job of detecting the vertical edges in the image. In this example, the bright region in the middle is just the output image’s way of saying that there appears to be a strong vertical edge right down the middle of the image. So in this way, we detect vertical edges using the convolution operator.
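The simplified example can be reproduced in a few lines of Python. This is only a sketch: the filter values (1, 0, -1 in each row) are assumed from the description "brighter on the left, zeros in the middle, darker on the right", and convolve2d is a helper name chosen here, written for square inputs only.

```python
def convolve2d(image, kernel):
    """'Valid' convolution in the CNN sense for a square image and kernel:
    slide the kernel, multiply element-wise, and sum."""
    k = len(kernel)
    size = len(image) - k + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(k) for b in range(k))
            for j in range(size)
        ]
        for i in range(size)
    ]


# Left half is 10 (bright), right half is 0 (dark).
image = [[10, 10, 10, 0, 0, 0] for _ in range(6)]

# Assumed vertical edge filter.
vertical_filter = [[1, 0, -1] for _ in range(3)]

edges = convolve2d(image, vertical_filter)
for row in edges:
    print(row)  # every row prints [0, 30, 30, 0]
```

The two middle columns come out as 30 because those are the only filter positions that straddle the boundary between the 10s and the 0s. Where the window covers a uniform region, the +1 and -1 columns cancel and the result is 0.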

In the next post, I will try to explain more edge detection examples, including horizontal edge detection.

Happy learning!

WebDAV: an old horse but still useful!

My first encounter with the WebDAV protocol was back in 2004-05, when I was writing a web part for SharePoint. WebDAV is an old protocol from the ’90s, but it is still very useful in certain scenarios.

WebDAV (RFC 4918) is an extension to HTTP, the protocol that web browsers and web servers use to communicate with each other. The WebDAV protocol enables a web server to behave like a file server too, supporting collaborative authoring of web content. For example, one can edit a Word document directly on the server using the WebDAV protocol. In other words, a web server that supports WebDAV works as a file server at the same time, which is a powerful capability.
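Because WebDAV is plain HTTP with extra methods (such as PROPFIND, MKCOL, COPY, MOVE, LOCK and UNLOCK) and XML request bodies, a client can be sketched with nothing but the Python standard library. The example below builds a PROPFIND request that asks for two properties of a folder. It is a minimal sketch, not a complete client: the function names are made up for this post, the host and path in the usage note are hypothetical, and authentication is left out entirely.

```python
import http.client
import xml.etree.ElementTree as ET

DAV_NS = "DAV:"  # XML namespace defined by RFC 4918


def build_propfind_body(property_names):
    """Build the XML body of a PROPFIND request for the given DAV: properties."""
    ET.register_namespace("D", DAV_NS)
    root = ET.Element(f"{{{DAV_NS}}}propfind")
    prop = ET.SubElement(root, f"{{{DAV_NS}}}prop")
    for name in property_names:
        ET.SubElement(prop, f"{{{DAV_NS}}}{name}")
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def list_collection(host, path, depth="1"):
    """Send PROPFIND to a WebDAV server; return (status, raw XML response).

    Hypothetical helper: no authentication or error handling.
    """
    body = build_propfind_body(["displayname", "getcontentlength"])
    conn = http.client.HTTPSConnection(host)
    try:
        conn.request(
            "PROPFIND",  # WebDAV method; http.client accepts any method name
            path,
            body=body,
            headers={
                "Depth": depth,  # "1" = the folder and its direct children
                "Content-Type": "application/xml; charset=utf-8",
            },
        )
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()


# Only the request body is built here; nothing is sent over the network.
print(build_propfind_body(["displayname", "getcontentlength"]).decode("utf-8"))
```

Against a real server, a call such as list_collection("dav.example.com", "/docs/") (a hypothetical host) would normally come back with status 207 Multi-Status and an XML document describing each entry in the folder.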

In many of its use cases, WebDAV is being replaced by more modern mechanisms like Wikis, cloud solutions, etc. But it is still a reliable workhorse when the right servers and clients are matched, so it’s still encountered in many different applications.

Some of the servers and clients that support WebDAV:

  • Apache HTTP Server
  • Microsoft IIS
  • Box.com
  • WordPress
  • Drupal
  • Microsoft SharePoint
  • Subversion
  • Git
  • Microsoft Office
  • Apple iWork
  • Adobe Photoshop etc.

References:
WebDAV RFC: https://tools.ietf.org/html/rfc4918

Reinforcement Learning Explained in brief for a layperson

As we know, Machine Learning algorithms can broadly be divided into 3 main categories:

  • Supervised Learning
  • Unsupervised Learning
  • Reinforcement Learning (RL)

Let’s understand in layman’s terms what reinforcement learning is. The main thing RL does is learning control: it is neither supervised nor unsupervised learning. Typically these are problems where you are learning to control the behavior of a system.

Example:

How to ride a bicycle.
Remember the days when you were trying to ride a bicycle? It was trial and error. You do get some kind of feedback, so it is not fully unsupervised. So we can say that this is a type of learning where you try to control a system through trial and error and with minimal feedback. RL learns from close interaction with the environment. Close interaction here means that an agent senses the state of the environment and takes an appropriate action. The agent gets feedback from the environment, and we typically assume that the environment is stochastic, meaning that taking the same action does not give the same response from the environment every time.

Apart from the feedback, there is an evaluation measure from the environment that tells you how well you are performing on a particular task. So the goal of every reinforcement learning algorithm is to learn a policy that maximizes some measure of long-term performance.

Just to summarize:

Reinforcement learning algorithm:

  • Learn from close interaction
  • Stochastic environment
  • Noisy delayed scalar evaluation
  • Learn policy – Maximize a measure of long term performance
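
The points summarized above can be seen in a very small Python sketch. It uses a deliberately simplified setting, a two-armed bandit (think of a slot machine with two levers), so there is no delayed reward here. The agent is epsilon-greedy, which is one common strategy and not the only one. The win probabilities, function names and parameters are all made up for illustration.

```python
import random


def pull(arm, rng):
    """Stochastic environment: the same action does not always
    give the same reward."""
    win_probability = [0.2, 0.8][arm]  # made-up values for illustration
    return 1.0 if rng.random() < win_probability else 0.0


def run_agent(steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: mostly pick the best-looking arm,
    sometimes try a random one."""
    rng = random.Random(seed)
    estimates = [0.0, 0.0]  # running average reward per arm
    counts = [0, 0]         # how often each arm was tried
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(2)  # explore (trial and error)
        else:
            arm = max(range(2), key=lambda a: estimates[a])  # exploit
        reward = pull(arm, rng)  # noisy scalar evaluation
        counts[arm] += 1
        # Incremental update of the average reward seen for this arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, counts, total_reward


estimates, counts, total_reward = run_agent()
print("estimated value per arm:", estimates)
print("times each arm was pulled:", counts)
```

Nobody tells the agent which arm is better. It only sees a noisy reward after each action, and over many interactions its estimates lead it to a policy (pull the second arm most of the time) that earns more reward in the long run.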

Some applications:        

  • Game playing – games like backgammon (one of the oldest board games) and Atari
  • Robot navigation
  • Helicopter pilot      
  • VLSI placement 

This was a brief introduction to RL for an easy understanding of the concept. For further study look for a good book or course.

My recommendation:

Happy learning!