Ziniu Li
Other names: Zi-Niu Li
The Chinese University of Hong Kong, Shenzhen
Verified email at link.cuhk.edu.cn
Title
Cited by
Year
Error bounds of imitating policies and environments
T Xu, Z Li, Y Yu
Advances in Neural Information Processing Systems 33, 15737-15749, 2020
Cited by: 87*, Year: 2020
Error bounds of imitating policies and environments for reinforcement learning
T Xu, Z Li, Y Yu
IEEE Transactions on Pattern Analysis and Machine Intelligence 44 (10), 6968 …, 2021
Cited by: 23, Year: 2021
Self-Guided Evolution Strategies with Historical Estimated Gradients
FY Liu, ZN Li, C Qian
IJCAI, 1474-1480, 2020
Cited by: 16, Year: 2020
Rethinking ValueDice - Does It Really Improve Performance?
Z Li, T Xu, Y Yu, ZQ Luo
ICLR Blog Track, 2022
Cited by: 14, Year: 2022
HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning
Z Li, Y Li, Y Zhang, T Zhang, ZQ Luo
International Conference on Learning Representations, 2022
Cited by: 11, Year: 2022
Understanding adversarial imitation learning in small sample regime: A stage-coupled analysis
T Xu, Z Li, Y Yu, ZQ Luo
arXiv preprint arXiv:2208.01899, 2022
Cited by: 8*, Year: 2022
Policy Optimization in RLHF: The Impact of Out-of-preference Data
Z Li, T Xu, Y Yu
arXiv preprint arXiv:2312.10584, 2023
Cited by: 3, Year: 2023
Remax: A simple, effective, and efficient method for aligning large language models
Z Li, T Xu, Y Zhang, Y Yu, R Sun, ZQ Luo
arXiv preprint arXiv:2310.10505, 2023
Cited by: 3, Year: 2023
Imitation Learning from Imperfection: Theoretical Justifications and Algorithms
Z Li, T Xu, Z Qin, Y Yu, ZQ Luo
Advances in Neural Information Processing Systems 36, 2024
Cited by: 2*, Year: 2024
Provably Efficient Adversarial Imitation Learning with Unknown Transitions
T Xu, Z Li, Y Yu, ZQ Luo
UAI, 2367-2378, 2023
Cited by: 2, Year: 2023
A Note on Target Q-learning For Solving Finite MDPs with A Generative Oracle
Z Li, T Xu, Y Yu
arXiv preprint arXiv:2203.11489, 2022
Cited by: 1, Year: 2022
Efficient Exploration by Novelty-Pursuit
Z Li, XH Chen
Distributed Artificial Intelligence: Second International Conference, DAI …, 2020
Cited by: 1, Year: 2020
Why Transformers Need Adam: A Hessian Perspective
Y Zhang, C Chen, T Ding, Z Li, R Sun, ZQ Luo
arXiv preprint arXiv:2402.16788, 2024
Year: 2024
Deploying Offline Reinforcement Learning with Human Feedback
Z Li, K Xu, L Liu, L Li, D Ye, P Zhao
arXiv preprint arXiv:2303.07046, 2023
Year: 2023