Reinforcement Learning Explained in brief for a layperson

As we know, Machine Learning algorithms can broadly be divided into 3 main categories:

  • Supervised Learning
  • Unsupervised Learning
  • Reinforcement Learning (RL)

Let’s understand in layman’s terms what reinforcement learning is. The main thing RL does is learning control: it is neither supervised nor unsupervised learning. Typically, these are problems where you learn to control the behavior of a system.


How to cycle.
Remember the days when you were learning to ride a cycle. It was trial and error. There was some kind of feedback, so the learning was not fully unsupervised. So we can say that this is a type of learning where you try to control a system by trial and error, with minimal feedback. RL learns from close interaction with the environment. Close interaction here means that an agent senses the state of the environment and takes an appropriate action. The agent then gets feedback from the environment, and we typically assume that the environment is stochastic, which means that taking the same action does not always produce the same response from the environment.

Apart from the feedback, there is an evaluation measure from the environment that tells the agent how well it is performing a particular task. So the goal of every reinforcement learning algorithm is to learn a policy that maximizes some measure of long-term performance.
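To make the trial-and-error idea concrete, here is a minimal sketch of an agent learning in a toy stochastic environment. It is only an illustration: the environment (a "multi-armed bandit" with noisy rewards), the epsilon-greedy rule, and all names such as run_bandit and true_means are assumptions made for this example, not a specific method from this post.

```python
import random


def run_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a toy stochastic environment (hypothetical example)."""
    rng = random.Random(seed)
    n_actions = len(true_means)
    estimates = [0.0] * n_actions  # agent's running estimate of each action's reward
    counts = [0] * n_actions       # how often each action has been tried
    total_reward = 0.0

    for _ in range(steps):
        # Trial and error: mostly pick the best-looking action, sometimes explore.
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: estimates[a])

        # Stochastic environment: the same action gives a different noisy reward each time.
        reward = rng.gauss(true_means[action], 1.0)

        # Learn from the scalar evaluation with an incremental average.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]
        total_reward += reward

    return estimates, counts, total_reward


# Three possible actions; the agent does not know these average rewards in advance.
estimates, counts, total_reward = run_bandit([0.2, 0.5, 1.0])
print("times each action was chosen:", counts)
```

The agent is never told which action is best. It only sees a noisy number after each try, and over many tries it ends up choosing the action with the highest average reward most of the time.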

Just to summarize:

Reinforcement learning algorithm:

  • Learn from close interaction
  • Stochastic environment
  • Noisy, delayed scalar evaluation
  • Learn a policy: maximize a measure of long-term performance
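One common way to turn "long-term performance" into a single number is the discounted return, where rewards that arrive later count a little less. The sketch below assumes that choice; the post itself does not say which measure is used, and the name discounted_return and the discount factor gamma are introduced here only for illustration.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of rewards where a reward k steps in the future is weighted by gamma**k."""
    total = 0.0
    # Work backwards so each step adds its reward plus the discounted future.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Delayed evaluation: nothing for two steps, then a reward of 1 at the end.
print(discounted_return([0.0, 0.0, 1.0], gamma=0.5))
```

With gamma close to 1 the agent cares about the far future almost as much as the present; with gamma close to 0 it cares mostly about the immediate reward.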

Some applications:

  • Game playing: games like backgammon (one of the oldest board games) and Atari
  • Robot navigation
  • Helicopter piloting
  • VLSI placement

This was a brief introduction to RL for an easy understanding of the concept. For further study, look for a good book or course.

My recommendation:

Happy learning!

Prognostic Analytics for Predictive Maintenance, a case study

First, let’s try to understand the difference between prognostic analysis and predictive analysis. Predictive analysis tells us that something is going to fail in the future, whereas prognostic analysis tells us that something is going to fail within, say, the next few days, weeks, or months. So there is always a time dimension in prognostic analysis. Prognostic analysis helps in planning ahead before the system actually fails, which saves resources and time. Let’s elaborate further with a case study.
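The time dimension can be illustrated with a very simple sketch: fit a straight line to a sensor reading that drifts upward and extrapolate to the point where it would cross a failure threshold. Everything here is an assumption for illustration: the linear trend, the function name estimate_time_to_failure, the threshold, and the sample readings are made up, and real prognostic models are usually more elaborate.

```python
def estimate_time_to_failure(readings, failure_threshold):
    """Extrapolate a least-squares line through equally spaced readings.

    Returns the estimated number of time steps after the last reading until
    the threshold is crossed, or None if the readings show no upward trend.
    """
    n = len(readings)
    if n < 2:
        return None

    xs = range(n)
    mean_x = (n - 1) / 2
    mean_y = sum(readings) / n

    # Ordinary least-squares slope and intercept.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, readings))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    if slope <= 0:
        return None  # no degradation trend to extrapolate

    crossing_time = (failure_threshold - intercept) / slope
    return max(0.0, crossing_time - (n - 1))


# Hypothetical daily vibration readings that rise by 0.5 per day.
daily_readings = [1.0, 1.5, 2.0, 2.5, 3.0]
print(estimate_time_to_failure(daily_readings, failure_threshold=5.0))
```

A purely predictive answer would be "this part is degrading and will fail". The prognostic answer adds the when: an estimated number of days left, which is what makes planning possible.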

Case Study: Let’s take a case where we want to use prognostic analytics for predictive maintenance in IoT-based systems in large plants, e.g., aviation, oil and gas, or big manufacturers. Running a prognostic model on past data from key sensors can reveal which control system’s performance is degrading. That gives an early sign that the system may go down, so one can take precautions, which can result in big savings. The control systems used in heavy industries, including their sensor infrastructure, generate tons of data continuously, and much of that data is decades old. Companies store the data, but it never gets used. So there is a huge opportunity for heavy manufacturers to use past data to gain good insight into the different parts of the system.

Technically, an ML team can build a pipeline that streams the data coming out of the control system, as it operates, to a public cloud such as AWS, Google Cloud, or Azure, or to a private cloud. ML models can then be run on that data to check for abnormalities, and preventive maintenance can be planned. For example, if you are in a power plant and a crucial part fails, someone from the supplier has to rush, take a plane, and deliver and install the part, which costs a lot of money. If we do prognostic analysis instead, we get an early sign of which parts might fail and can procure them in advance.
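As a rough idea of what "checking for abnormalities" on a stream of sensor readings could look like, here is a minimal sketch that flags a reading when it is far from the recent average. It is an assumption-heavy illustration: the rolling z-score rule, the function name find_anomalies, the window size, and the sample data are all hypothetical, and a production pipeline would read from the plant's data stream rather than a list.

```python
from collections import deque
from statistics import mean, stdev


def find_anomalies(readings, window=20, z_threshold=3.0):
    """Return the indexes of readings that deviate strongly from the recent window."""
    history = deque(maxlen=window)  # keeps only the most recent readings
    anomalies = []

    for index, value in enumerate(readings):
        # Only judge a reading once a full window of past values is available.
        if len(history) == window:
            mu = mean(history)
            sigma = stdev(history)
            if sigma > 0 and abs(value - mu) / sigma > z_threshold:
                anomalies.append(index)
        history.append(value)

    return anomalies


# Hypothetical temperature readings with one sudden spike at position 30.
sensor_readings = [10.0 + 0.1 * (i % 5) for i in range(40)]
sensor_readings[30] = 15.0
print(find_anomalies(sensor_readings))
```

Because the function looks at one reading at a time and keeps only a small window of history, the same logic could run continuously as new data arrives.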

Just to summarise, we can use the system’s old data, or push new data to the cloud, and run preventive/prognostic analysis on it, which can save money, resources, and time.

References/Thanks: Dr. Harpreet Singh. I heard him on a podcast and was highly impressed by his vision for data science and its different use cases.