As we know, Machine Learning algorithms can broadly be divided into 3 main categories:
- Supervised Learning
- Unsupervised Learning
- Reinforcement Learning (RL)
Let’s understand in layman’s terms what reinforcement learning is. The main thing RL does is learning control: this is neither supervised nor unsupervised learning. Typically these are problems where you learn to control the behavior of a system.
Learning to ride a cycle.
Remember the days when you were learning to ride a cycle? It was trial and error. There was some feedback, so the learning was not fully unsupervised. So we can say that this is a type of learning where you try to control a system through trial and error and with minimal feedback. RL learns from close interaction with the environment. Close interaction in this context means that an agent senses the state of the environment and takes an appropriate action. The agent then gets feedback from the environment, and we typically assume that the environment is stochastic, meaning that taking the same action does not always produce the same response from the environment.
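To make the agent and environment interaction concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: `NoisyCorridor`, `slip_prob` and `run_episode` are hypothetical names, and the corridor is just a toy world, not a standard benchmark. The agent senses the state, takes an action, and the environment sometimes responds differently to the same action.

```python
import random


class NoisyCorridor:
    """Hypothetical toy environment: positions 0..4, with the goal at position 4."""

    def __init__(self, slip_prob=0.2, seed=0):
        self.slip_prob = slip_prob        # chance that a move goes the wrong way
        self.rng = random.Random(seed)    # private random source for repeatability
        self.state = 0

    def reset(self):
        """Put the agent back at the start and return the starting state."""
        self.state = 0
        return self.state

    def step(self, action):
        """Apply an action (+1 = right, -1 = left) and return (state, reward, done)."""
        # Stochastic part: with probability slip_prob the move is reversed.
        if self.rng.random() < self.slip_prob:
            action = -action
        # Keep the agent inside the corridor.
        self.state = min(4, max(0, self.state + action))
        done = self.state == 4
        reward = 1.0 if done else 0.0     # scalar feedback, given only at the goal
        return self.state, reward, done


def run_episode(env, policy, max_steps=100):
    """Run one episode: sense the state, act, collect the reward, repeat."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(state)                   # the agent decides
        state, reward, done = env.step(action)   # the environment responds
        total_reward += reward
        if done:
            break
    return total_reward
```

For example, `run_episode(NoisyCorridor(slip_prob=0.2), lambda state: 1)` runs an agent that always tries to move right. Because of the slips, the number of steps it needs can differ from run to run when the seed changes.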
Apart from the feedback, there is an evaluation measure from the environment that tells the agent how well it is performing on a particular task. So the goal of every reinforcement learning algorithm is to learn a policy, that is, a rule for choosing actions, that maximizes some measure of long-term performance.
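One common way to measure long-term performance is the discounted return, where rewards received later count a little less than rewards received now. The text above does not commit to a particular measure, so treat this as one possible choice. The function name `discounted_return` and the discount factor `gamma` are assumptions made for this sketch.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum the rewards of one episode, shrinking each later reward by gamma per step."""
    total = 0.0
    # Work backwards so each step adds its reward to the discounted future total.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# A small reward now versus a bigger reward two steps later, with gamma = 0.5.
greedy_now = discounted_return([1.0, 0.0, 0.0], gamma=0.5)   # 1.0
patient = discounted_return([0.0, 0.0, 8.0], gamma=0.5)      # 0.5 * 0.5 * 8.0 = 2.0
```

Here the patient sequence scores higher even though its reward is delayed, which is the kind of trade-off a policy has to weigh when it maximizes long-term rather than immediate reward.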
Just to summarize:
Reinforcement learning algorithm:
- Learn from close interaction
- Stochastic environment
- Noisy delayed scalar evaluation
- Learn policy – Maximize a measure of long term performance
Applications of RL:
- Game playing – Games like backgammon (one of the oldest board games), Atari
- Robot navigation
- Helicopter pilot
- VLSI placement
This was a brief introduction to RL for an easy understanding of the concept. For further study look for a good book or course.
- List of books on RL
- There is an excellent course on NPTEL by Prof. Balaraman Ravindran, IIT Madras: https://swayam.gov.in/nd1_noc19_cs55/preview