"How do you calculate the gradient of the policy of a specific state-action pair?" (spencer kraisler)

Jonathan Hui, Aug 13, 2019

Just as you would when training a deep learning model: let automatic differentiation compute it. See the code in the section "Policy gradient with automatic differentiation".
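
To make the idea concrete, here is a minimal sketch of the gradient of log π(a|s) for one state-action pair. It is not the code from the referenced section. It assumes a linear softmax policy, and the names `theta`, `state`, `grad_log_prob`, and `numeric_grad` are hypothetical. The gradient is written out by hand and checked against finite differences; in a deep learning framework, automatic differentiation would produce the same quantity by backpropagating through log π(a|s).

```python
import math

def softmax(logits):
    # Subtract the max logit for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def logits_for(theta, state):
    # Assumed policy: one weight row per action, logit = theta[a] . state
    return [sum(w * x for w, x in zip(row, state)) for row in theta]

def log_prob(theta, state, action):
    # log pi(action | state) under the softmax policy
    return math.log(softmax(logits_for(theta, state))[action])

def grad_log_prob(theta, state, action):
    # Closed form for a linear softmax policy:
    # d log pi(a|s) / d theta[b][j] = (1[b == a] - pi(b|s)) * state[j]
    probs = softmax(logits_for(theta, state))
    return [
        [((1.0 if b == action else 0.0) - probs[b]) * x for x in state]
        for b in range(len(theta))
    ]

def numeric_grad(theta, state, action, eps=1e-6):
    # Central finite differences, used only to sanity-check the closed form.
    grad = []
    for b, row in enumerate(theta):
        grad_row = []
        for j in range(len(row)):
            plus = [r[:] for r in theta]
            minus = [r[:] for r in theta]
            plus[b][j] += eps
            minus[b][j] -= eps
            diff = log_prob(plus, state, action) - log_prob(minus, state, action)
            grad_row.append(diff / (2 * eps))
        grad.append(grad_row)
    return grad

# Hypothetical toy values: 3 actions, 2 state features.
theta = [[0.1, -0.2], [0.0, 0.3], [-0.4, 0.5]]
state = [1.0, 2.0]
action = 1

analytic = grad_log_prob(theta, state, action)
numeric = numeric_grad(theta, state, action)
```

In a policy gradient update, this per-pair gradient would then be scaled by the return or advantage observed for that state-action pair before being applied to the parameters.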