Hanlin Zhu
Verified email at berkeley.edu
Title
Cited by
Year
Guided dialog policy learning: Reward estimation for multi-domain task-oriented dialog
R Takanobu, H Zhu, M Huang
Conference on Empirical Methods in Natural Language Processing, 100-110, 2019
Cited by: 83, Year: 2019
Vector-matrix-vector queries for solving linear algebra, statistics, and graph problems
C Rashtchian, DP Woodruff, H Zhu
Approximation, Randomization, and Combinatorial Optimization. Algorithms and …, 2020
Cited by: 29, Year: 2020
Optimal conservative offline RL with general function approximation via augmented Lagrangian
P Rashidinejad, H Zhu, K Yang, S Russell, J Jiao
arXiv preprint arXiv:2211.00716, 2022
Cited by: 25, Year: 2022
Starling-7B: Improving LLM helpfulness & harmlessness with RLAIF
B Zhu, E Frick, T Wu, H Zhu, J Jiao
November, 2023
Cited by: 10, Year: 2023
Importance weighted actor-critic for optimal conservative offline reinforcement learning
H Zhu, P Rashidinejad, J Jiao
Advances in Neural Information Processing Systems 36, 2024
Cited by: 7, Year: 2024
Provably efficient reinforcement learning via surprise bound
H Zhu, R Wang, J Lee
International Conference on Artificial Intelligence and Statistics, 4006-4032, 2023
Cited by: 5, Year: 2023
Average-case communication complexity of statistical problems
C Rashtchian, D Woodruff, P Ye, H Zhu
Conference on Learning Theory, 3859-3886, 2021
Cited by: 5, Year: 2021
Learning personalized story evaluation
D Wang, K Yang, H Zhu, X Yang, A Cohen, L Li, Y Tian
arXiv preprint arXiv:2310.03304, 2023
Cited by: 3, Year: 2023
On Representation Complexity of Model-based and Model-free Reinforcement Learning
H Zhu, B Huang, S Russell
arXiv preprint arXiv:2310.01706, 2023
Cited by: 3, Year: 2023
Provably efficient offline goal-conditioned reinforcement learning with general function approximation and single-policy concentrability
H Zhu, A Zhang
Advances in Neural Information Processing Systems 36, 2024
Cited by: 2, Year: 2024
Towards optimal statistical watermarking
B Huang, B Zhu, H Zhu, JD Lee, J Jiao, MI Jordan
arXiv preprint arXiv:2312.07930, 2023
Cited by: 2, Year: 2023
End-to-end Story Plot Generator
H Zhu, A Cohen, D Wang, K Yang, X Yang, J Jiao, Y Tian
arXiv preprint arXiv:2310.08796, 2023
Cited by: 1, Year: 2023
Avoiding Catastrophe in Continuous Spaces by Asking for Help
B Plaut, H Zhu, S Russell
arXiv preprint arXiv:2402.08062, 2024
Year: 2024
Efficient Prompt Caching via Embedding Similarity
H Zhu, B Zhu, J Jiao
arXiv preprint arXiv:2402.01173, 2024
Year: 2024