Reinforcement Learning

    Reinforcement learning (RL) is a machine learning technique that enables an agent to learn in an interactive environment by trial and error, using feedback from its own actions and experiences. Both supervised and reinforcement learning learn a mapping between input and output. In supervised learning, however, the feedback given to the agent is the correct set of actions for performing a task, whereas reinforcement learning uses rewards and punishments as signals for positive and negative behavior.

    Reinforcement learning also differs from unsupervised learning in its goal. While the goal in unsupervised learning is to find similarities and differences between data points, the goal in reinforcement learning is to find a suitable action model that maximizes the agent's total cumulative reward. The figure below represents the basic idea and elements involved in a reinforcement learning model.


Here are some important terms used in reinforcement learning:

  • Agent: An assumed entity that performs actions in an environment to gain some reward.
  • Environment (e): A scenario that an agent has to face.
  • Reward (R): An immediate return given to an agent when it performs a specific action or task.
  • State (s): The current situation returned by the environment.
  • Policy (π): The strategy applied by the agent to decide the next action based on the current state.
  • Value (V): The expected long-term return with discounting, as opposed to the short-term reward.
  • Value Function: It specifies the value of a state, that is, the total amount of reward an agent should expect to accumulate starting from that state.
  • Model of the environment: It mimics the behavior of the environment and lets you make inferences about how the environment will behave.
  • Model-based methods: Methods for solving reinforcement learning problems that use a model of the environment.
  • Q value or action value (Q): The Q value is quite similar to the value. The only difference is that it takes the current action as an additional parameter.
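
To see how these terms fit together, here is a minimal sketch in Python. It is not taken from any particular library: the corridor environment, the function names and the parameter values (alpha, gamma, epsilon) are all assumptions chosen for illustration. The learning rule used is tabular Q-learning, one common way to estimate Q values, in which the agent moves left or right along five cells and is rewarded only when it reaches the last one.

```python
import random

# Hypothetical toy environment: a corridor of 5 cells with the goal at the right end.
N_STATES = 5
ACTIONS = [-1, +1]          # step left, step right
GOAL = N_STATES - 1


def step(state, action):
    """Environment: return (next_state, reward, done) for one move."""
    next_state = min(max(state + action, 0), GOAL)   # walls at both ends
    done = next_state == GOAL
    reward = 1.0 if done else 0.0                    # reward only at the goal
    return next_state, reward, done


def greedy_action(q, state):
    """Policy: index of the action with the highest Q value (ties broken at random)."""
    best = max(q[state])
    return random.choice([i for i, v in enumerate(q[state]) if v == best])


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Agent: learn a Q table by trial and error (tabular Q-learning)."""
    random.seed(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]        # q[state][action index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, otherwise follow the current policy.
            if random.random() < epsilon:
                a = random.randrange(len(ACTIONS))
            else:
                a = greedy_action(q, state)
            next_state, reward, done = step(state, ACTIONS[a])
            # Target = immediate reward + discounted value of the next state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


q_table = train()
policy = [ACTIONS[greedy_action(q_table, s)] for s in range(GOAL)]
state_values = [max(q_table[s]) for s in range(GOAL)]
print("policy:", policy)
print("state values:", [round(v, 3) for v in state_values])
```

In this sketch the learned policy should settle on stepping right in every cell, and the value of a cell shrinks by the discount factor gamma for every extra step it is away from the goal. That is the "long-term return with discount" from the list above.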

A good way to understand reinforcement learning is to consider some of the examples and possible applications that have guided its development.

  • A mobile robot decides whether it should enter a new room in search of more trash to collect or start trying to find its way back to its battery recharging station. It makes its decision based on the current charge level of its battery and how quickly and easily it has been able to find the recharger in the past.
  • A master chess player makes a move. The choice is informed both by planning — anticipating possible replies and counter replies — and by immediate, intuitive judgments of the desirability of positions and moves.
  • An adaptive controller adjusts parameters of a petroleum refinery’s operation in real time. The controller optimizes the yield/cost/quality trade-off based on specified marginal costs without sticking strictly to the set points originally suggested by engineers.

    The main challenge in reinforcement learning lies in preparing the simulation environment, which is highly dependent on the task to be performed. When the model has to reach superhuman performance in chess, Go or Atari games, preparing the simulation environment is relatively simple. When it comes to building a model capable of driving an autonomous car, building a realistic simulator is crucial before letting the car ride on the street. The model has to figure out how to brake or avoid a collision in a safe environment, where sacrificing even a thousand cars comes at a minimal cost. Transferring the model out of the training environment and into the real world is where things get tricky. Scaling and tweaking the neural network controlling the agent is another challenge. There is no way to communicate with the network other than through the system of rewards and penalties.
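
    Since rewards and penalties are the only channel to the agent, everything the designer wants has to be packed into a single number. The sketch below shows what that could look like for a driving simulator. It is purely illustrative: the StepOutcome fields and the numeric weights are assumptions, not taken from any real simulator.

```python
from dataclasses import dataclass


@dataclass
class StepOutcome:
    """Hypothetical summary of one simulator time step (all fields are assumptions)."""
    progress_m: float   # distance travelled along the route this step, in metres
    collided: bool      # whether the simulated car hit anything
    hard_brake: bool    # whether the car braked harshly


def reward(outcome: StepOutcome) -> float:
    """Turn a simulator outcome into one scalar reward signal."""
    if outcome.collided:
        return -100.0                 # large penalty: a crash ends up dominating everything else
    r = 0.1 * outcome.progress_m      # small reward for making progress
    if outcome.hard_brake:
        r -= 1.0                      # mild penalty for uncomfortable braking
    return r


print(reward(StepOutcome(progress_m=10.0, collided=False, hard_brake=False)))
print(reward(StepOutcome(progress_m=5.0, collided=True, hard_brake=False)))
```

    Choosing these weights is part of the tweaking mentioned above: if the collision penalty is too small, the agent may learn that crashing is an acceptable price for speed.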

    Reinforcement learning is no doubt a cutting-edge technology that has the potential to transform our world. However, it need not be used in every case. Nevertheless, reinforcement learning seems to be the most likely way to make a machine creative, since seeking new, innovative ways to perform its tasks is in fact creativity.










