Hanlin Zhu
Title
Cited by
Year
Guided dialog policy learning: Reward estimation for multi-domain task-oriented dialog
R Takanobu, H Zhu, M Huang
Conference on Empirical Methods in Natural Language Processing, 100-110, 2019
Cited by 84 · 2019
Vector-matrix-vector queries for solving linear algebra, statistics, and graph problems
C Rashtchian, DP Woodruff, H Zhu
Approximation, Randomization, and Combinatorial Optimization. Algorithms and …, 2020
Cited by 29 · 2020
Optimal conservative offline RL with general function approximation via augmented Lagrangian
P Rashidinejad, H Zhu, K Yang, S Russell, J Jiao
arXiv preprint arXiv:2211.00716, 2022
Cited by 28 · 2022
Starling-7B: Improving LLM helpfulness & harmlessness with RLAIF
B Zhu, E Frick, T Wu, H Zhu, J Jiao
November, 2023
Cited by 11 · 2023
Importance weighted actor-critic for optimal conservative offline reinforcement learning
H Zhu, P Rashidinejad, J Jiao
Advances in Neural Information Processing Systems 36, 2024
Cited by 7 · 2024
Provably efficient reinforcement learning via surprise bound
H Zhu, R Wang, J Lee
International Conference on Artificial Intelligence and Statistics, 4006-4032, 2023
Cited by 5 · 2023
Average-case communication complexity of statistical problems
C Rashtchian, D Woodruff, P Ye, H Zhu
Conference on Learning Theory, 3859-3886, 2021
Cited by 5 · 2021
Towards optimal statistical watermarking
B Huang, B Zhu, H Zhu, JD Lee, J Jiao, MI Jordan
arXiv preprint arXiv:2312.07930, 2023
Cited by 4 · 2023
Learning personalized story evaluation
D Wang, K Yang, H Zhu, X Yang, A Cohen, L Li, Y Tian
arXiv preprint arXiv:2310.03304, 2023
Cited by 4 · 2023
On Representation Complexity of Model-based and Model-free Reinforcement Learning
H Zhu, B Huang, S Russell
arXiv preprint arXiv:2310.01706, 2023
Cited by 3 · 2023
Provably efficient offline goal-conditioned reinforcement learning with general function approximation and single-policy concentrability
H Zhu, A Zhang
Advances in Neural Information Processing Systems 36, 2024
Cited by 2 · 2024
End-to-end Story Plot Generator
H Zhu, A Cohen, D Wang, K Yang, X Yang, J Jiao, Y Tian
arXiv preprint arXiv:2310.08796, 2023
Cited by 1 · 2023
Avoiding Catastrophe in Continuous Spaces by Asking for Help
B Plaut, H Zhu, S Russell
arXiv preprint arXiv:2402.08062, 2024
2024
Efficient Prompt Caching via Embedding Similarity
H Zhu, B Zhu, J Jiao
arXiv preprint arXiv:2402.01173, 2024
2024