Neural replicator dynamics D Hennes, D Morrill, S Omidshafiei, R Munos, J Perolat, M Lanctot, ... Proceedings of the 19th International Conference on Autonomous Agents and …, 2020 | 80* | 2020 |
PIPPS: Flexible model-based policy search robust to the curse of chaos P Parmas, CE Rasmussen, J Peters, K Doya Proceedings of the 35th International Conference on Machine Learning 80 …, 2018 | 78 | 2018 |
Total stochastic gradient algorithms and applications in reinforcement learning P Parmas Advances in Neural Information Processing Systems 31, 2018 | 13 | 2018 |
A unified view of likelihood ratio and reparameterization gradients P Parmas, M Sugiyama International Conference on Artificial Intelligence and Statistics, 4078-4086, 2021 | 9 | 2021 |
A unified view of likelihood ratio and reparameterization gradients and an optimal importance sampling scheme P Parmas, M Sugiyama arXiv preprint arXiv:1910.06419, 2019 | 4 | 2019 |
Model-based Reinforcement Learning with Scalable Composite Policy Gradient Estimators P Parmas, T Seno, Y Aoki Proceedings of the 40th International Conference on Machine Learning 202 …, 2023 | 2 | 2023 |
Proppo: a Message Passing Framework for Customizable and Composable Learning Algorithms P Parmas, T Seno Advances in Neural Information Processing Systems, 2022 | 2 | 2022 |
Total stochastic gradient algorithms and applications to model-based reinforcement learning P Parmas, 2020 | 1 | 2020 |
Which resampling methods can tame ill-behaved gradients in chaotic systems? P Parmas, J Peters, K Doya | | |