Representation Convergence:
Mutual Distillation is Secretly a Form of Regularization

Anonymous Author(s)

bigfish with MDPO (ours)

bigfish with MDPO

bigfish with PPO

bigfish with PPO

coinrun with MDPO (ours)

coinrun with MDPO

coinrun with PPO

coinrun with PPO

fruitbot with MDPO (ours)

fruitbot with MDPO

fruitbot with PPO

fruitbot with PPO

Policy Robustness Testing
(brightness, contrast, saturation, and hue)

kl1 kl1-extra

Policy Robustness Testing
(random CNNs)

kl2 kl2-extra