On-policy results

Hi,

Thanks for the nice paper. We implemented the D2RL approach in our in-house library and were able to replicate your results on some MuJoCo environments using off-policy agents.
Out of curiosity, have you tried it with on-policy agents? If so, do you have any results or insights to share on how it performed?

Best,