Paavo Parmas
Verified email at kyoto-u.ac.jp
Title
Cited by
Year
Neural replicator dynamics
D Hennes, D Morrill, S Omidshafiei, R Munos, J Perolat, M Lanctot, ...
Proceedings of the 19th International Conference on Autonomous Agents and …, 2020
Cited by: 80*, Year: 2020
PIPPS: Flexible model-based policy search robust to the curse of chaos
P Parmas, CE Rasmussen, J Peters, K Doya
Proceedings of the 35th International Conference on Machine Learning 80 …, 2018
Cited by: 78, Year: 2018
Total stochastic gradient algorithms and applications in reinforcement learning
P Parmas
Advances in Neural Information Processing Systems 31, 2018
Cited by: 13, Year: 2018
A unified view of likelihood ratio and reparameterization gradients
P Parmas, M Sugiyama
International Conference on Artificial Intelligence and Statistics, 4078-4086, 2021
Cited by: 9, Year: 2021
A unified view of likelihood ratio and reparameterization gradients and an optimal importance sampling scheme
P Parmas, M Sugiyama
arXiv preprint arXiv:1910.06419, 2019
Cited by: 4, Year: 2019
Model-based Reinforcement Learning with Scalable Composite Policy Gradient Estimators
P Parmas, T Seno, Y Aoki
Proceedings of the 40th International Conference on Machine Learning 202 …, 2023
Cited by: 2, Year: 2023
Proppo: a Message Passing Framework for Customizable and Composable Learning Algorithms
P Parmas, T Seno
Advances in Neural Information Processing Systems, 2022
Cited by: 2, Year: 2022
Total stochastic gradient algorithms and applications to model-based reinforcement learning
P Parmas
(No Title), 2020
Cited by: 1, Year: 2020
Which resampling methods can tame ill-behaved gradients in chaotic systems?
P Parmas, J Peters, K Doya
Ratio 15 (20), 25, 0