For DQN, the issue is finding the max below for a continuous space.

Image for post

It is not impossible. We just need some kind of optimization method to solve max Q. But that can be inefficient.

But for RL, I learn never to say never. Algorithms can also change to address what it is originally missing. DDQN is for continuous control. I wrote something on it but have not a chance to review and publish it. So you may have to read the original paper first:

Written by

Deep Learning

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store