Deep Reinforcement Learning & Meta-Learning Series

Sep 10, 2018

Deep Reinforcement Learning is about making the best decisions based on what we see and what we hear. It sounds simple, but making a decision is never easy. This subject is one of the hardest and one of the most rewarding. I try to explain things from an easy-to-understand angle. I don’t want to fill my readers with fancy talk that feels good but teaches nothing. In reality, simplicity helps me see the subject with better clarity. But I don’t want to skip the equations either. They just need to be introduced in the proper manner. Understanding them helps us go deeper.

While many articles still need to be reviewed before publishing, the published ones should give you enough detail to start your journey. I will try to release the remaining articles in Spring 2019. So stay tuned.

Overview

RL — Introduction to Deep Reinforcement Learning

Deep reinforcement learning is about taking the best actions from what we see and hear. Unfortunately, reinforcement…

RL — Deep Reinforcement Learning (Learn effectively like a human)

A human learns much more efficiently than RL. In this article, we will study other methods that may narrow this gap.

RL — Transfer Learning (Learn from the Past)

Humans are explorers and we do it smartly. In reinforcement learning (RL), model-free methods search the solution space…

Value-learning

RL — Value Learning

Value learning is a fundamental concept in reinforcement learning. It is as basic as the fully connected network in…

RL — Value Fitting & Q-Learning

We can learn the value function and the Q-value function iteratively. However, it cannot scale well to large state…

Monte Carlo Tree Search (MCTS) in AlphaGo Zero

AlphaGo Zero uses MCTS to select the next move in a Go game.

Q-learning

RL — DQN Deep Q-network

Can computers play video games like a human? In 2017, a professional team beat a DeepMind AI program in StarCraft 2…

Policy Gradients

RL — Policy Gradients Explained

Policy Gradient Methods (PG) are frequently used algorithms in reinforcement learning (RL). The principle is very…

RL — Policy Gradients Explained (Part 2)

In the first part of the Policy Gradients article, we cover the basics. In the second part, we continue on the Temporal…

RL — Actor-Critic Methods: A3C, GAE, DDPG, Q-prop

RL — Natural Policy Gradient Explained

Policy Gradient (PG) methods are popular in reinforcement learning (RL). PG increases the chance of taking actions that…

RL — Trust Region Policy Optimization (TRPO) Explained

TRPO, one of the most popular Policy Gradient methods (PG), addresses the convergence problem by introducing the…

RL — Trust Region Policy Optimization (TRPO) Part 2

After discussing the basic concepts, we will discuss the details of TRPO in Part 2. (TRPO is one of the most popular…

RL — Appendix: Proof for the article in TRPO & PPO

Difference of discounted rewards

RL — Actor-Critic using Kronecker-Factored Trust Region (ACKTR) Explained

In a previous article, we explain how Natural Policy Gradient allows the Policy Gradient methods to converge better by…

RL — Proximal Policy Optimization (PPO) Explained

A quote from OpenAI on PPO:

RL — The Math behind TRPO & PPO

Trust Region Policy Optimization (TRPO) & Proximal Policy Optimization (PPO) are based on the Minorize-Maximization (MM)…

Model-based RL

RL — LQR & iLQR Linear Quadratic Regulator

Reinforcement learning can be divided into Model-free and Model-based learning. Model-free learning emphasizes heavily…

RL — Model-based Reinforcement Learning

In reinforcement learning (RL), we maximize the rewards for our actions, which depend on the policy and the system…

RL — Guided Policy Search (GPS)

With Guided Policy Search (GPS), a robot learns each skill in the video in 20 minutes. If it is trained by the Policy…

RL — Guided Policy Search (A walkthrough)

In the previous article, we discussed the concept of Guided Policy Search. Now we look into how it is trained.

RL — Model-Based Learning with Raw Videos

Vision is a critical part of intelligence and the decision-making process. Many toy experiments avoid raw image…

Technologies

RL — Imitation learning

Imitation is a key part of human learning. In the high-tech world, if you are not an innovator, you want to be a…

RL — Transfer Learning

We learn from past experiences. We apply learned knowledge to solve new tasks. In Deep Learning, training a deep…

RL — Inverse Reinforcement Learning

It is a major challenge for reinforcement learning (RL) to process sparse and long-delayed rewards. It is difficult to…

RL — Exploration

RL — Prediction

How can we learn better? This is something we struggle with in real life also. Besides meta-learning, humans make…

RL — PLATO Policy Learning using Adaptive Trajectory Optimization

Imitation plays a major role in learning. In RL, it reduces the amount of time in searching for solutions and it is…

Comparison & Tips

RL — Reinforcement Learning Algorithms Overview

We have examined many Reinforcement Learning (RL) algorithms in this series, for instance, Policy Gradient methods for…

RL — Tips on Reinforcement Learning

RL — Reinforcement Learning Algorithms Comparison

Choosing an RL algorithm can be confusing. In this article, we will focus on different decision factors in choosing…

Meta-learning

Meta-Learning (Learn how to Learn)

People learn continuously. We recall relevant skills and adjust them accordingly in handling new tasks. Overall…

Meta-Learning (Bayesian Meta-Learning & Weak Supervision)

In part 2 of our Meta-Learning article, we will discuss Bayesian Meta-Learning, Unsupervised Learning, and Weak…

RL — Meta-Learning

Many deep learning classifiers demonstrate superhuman performance, but humans still learn far more efficiently than deep…

Neural Turing Machines: a fundamental approach to access memory in deep learning

Memory is a crucial part of the brain and the computer. In some areas of deep learning, we extend the capabilities of…

Cheat sheet

RL — Reinforcement Learning Algorithms Quick Overview

This article overviews the major algorithms in reinforcement learning. Each algorithm will be explained briefly in a…

RL — Reinforcement Learning Terms

Reinforcement learning observes the environment and takes actions to maximize the rewards. It deals with exploration…

Applications

AlphaGo Zero — a game changer. (How it works?)

Even though AlphaGo is impressive, it requires bootstrapping the training with human games and knowledge. This changed when…

AlphaGo: How it works technically?

How does reinforcement learning join forces with deep learning to beat the Go master? Since it sounds implausible, the…

Basic

RL — Optimization Algorithms

This article contains the optimization algorithms often mentioned in RL.

RL — Importance Sampling

In RL, Importance Sampling estimates the value functions for a policy π with samples collected previously from an older…

RL — Dual Gradient Descent

We want to optimize an objective under a constraint:

RL — Conjugate Gradient

We use the Conjugate Gradient (CG) method to solve a linear equation or to optimize a quadratic equation. It is more…

Credit and references

Reinforcement learning is a huge topic and I owe a great debt to many professors, researchers, and bloggers. It is impossible to cite all the videos, classes, research papers, and blogs that I have read. In fact, there are other university courses that helped me a lot, but I can no longer recall the institutions.

Here, I want to list a few that had the biggest impact on me.

UCL Reinforcement Learning

UC Berkeley Reinforcement Learning Bootcamp

Reinforcement Learning: An Introduction

Nando de Freitas’ class

But I want to single out the UC Berkeley Reinforcement Learning course, which is currently offered every year. I started watching it in 2015. It is a tough course. The lesson on LQR almost made me give up RL. But with some perseverance, it made the biggest impact on me. I hope it can have the same impact on you too.