RL — Guided Policy Search (A walkthrough)

Partially observable MDP (POMDP)

Modified from source
Modified from source

Feature points

Source
Modified from source
Source
  1. Set target end-effector pose.
  2. Train an exploratory controller.
  3. Learning the embedding for images.
  4. Provide the target goal.
  5. Train the final controller to achieve the final goal.
Modified from source
Modified from source
Modified from source
Modified from source
Source

Recap

Source

Architecture design

Source
Source

Revisited the problems

Modified from source
Source
Modified from source

Credits & references

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store