Heyang Zhao
Verified email at cs.ucla.edu - Homepage
Title
Cited by
Year
Nearly minimax optimal reinforcement learning for linear markov decision processes
J He, H Zhao, D Zhou, Q Gu
International Conference on Machine Learning, 12790-12822, 2023
Cited by: 39; Year: 2023
Variance-dependent regret bounds for linear bandits and reinforcement learning: Adaptivity and computational efficiency
H Zhao, J He, D Zhou, T Zhang, Q Gu
The Thirty Sixth Annual Conference on Learning Theory, 2023
Cited by: 20; Year: 2023
Linear contextual bandits with adversarial corruptions
H Zhao, D Zhou, Q Gu
arXiv preprint arXiv:2110.12615, 2021
Cited by: 18; Year: 2021
Optimal online generalized linear regression with stochastic noise and its application to heteroscedastic bandits
H Zhao, D Zhou, J He, Q Gu
International Conference on Machine Learning, 42259-42279, 2023
Cited by: 11*; Year: 2023
Variance-Aware Regret Bounds for Stochastic Contextual Dueling Bandits
Q Di, T Jin, Y Wu, H Zhao, F Farnoud, Q Gu
arXiv preprint arXiv:2310.00968, 2023
Cited by: 3; Year: 2023
Pessimistic nonlinear least-squares value iteration for offline reinforcement learning
Q Di, H Zhao, J He, Q Gu
arXiv preprint arXiv:2310.01380, 2023
Cited by: 3; Year: 2023
A Nearly Optimal and Low-Switching Algorithm for Reinforcement Learning with General Function Approximation
H Zhao, J He, Q Gu
arXiv preprint arXiv:2311.15238, 2023
Cited by: 1; Year: 2023
Feel-Good Thompson Sampling for Contextual Dueling Bandits
X Li, H Zhao, Q Gu
arXiv preprint arXiv:2404.06013, 2024
Year: 2024