Han Zhong

Cited by

	All	Since 2019
Citations	450	450
h-index	13	13
i10-index	13	13

Citations per year: 2021: 5; 2022: 40; 2023: 146; 2024: 257

Public access (based on funding mandates): 5 articles available, 0 articles not available

Co-authors

Liwei Wang, Professor, Peking University; verified email at cis.pku.edu.cn
Tong Zhang, HKUST; verified email at tongzhang-ml.org
Wei Xiong, Computer Science, University of Illinois Urbana-Champaign; verified email at illinois.edu
Zhaoran Wang, Assistant Professor at Northwestern University; verified email at northwestern.edu
Zhuoran Yang, Yale University; verified email at yale.edu
Chengshuai Shi, Electrical and Computer Engineering, University of Virginia; verified email at virginia.edu
Cong Shen, Associate Professor, University of Virginia; verified email at virginia.edu
Simon Shaolei Du, Assistant Professor, School of Computer Science and Engineering, University of Washington; verified email at cs.washington.edu
Yunchang Yang, Peking University; verified email at pku.edu.cn
Tianhao Wu, University of California, Berkeley; verified email at berkeley.edu
Michael I. Jordan, Professor of Electrical Engineering and Computer Sciences and Professor of Statistics, UC Berkeley; verified email at cs.berkeley.edu
Hanze Dong, Salesforce Research; verified email at salesforce.com
Chenlu Ye, Hong Kong University of Science and Technology; verified email at connect.ust.hk
Shenao Zhang, Northwestern University; verified email at gatech.edu
Xiaoyu Chen, Peking University; verified email at pku.edu.cn
Jose Blanchet, Stanford University; verified email at stanford.edu
Jiyuan Tan, Stanford University; verified email at stanford.edu
Rui Yang, Hong Kong University of Science and Technology; verified email at connect.ust.hk
Lin F. Yang (杨林), Assistant Professor, Department of Electrical and Computer Engineering @ UCLA; verified email at ee.ucla.edu
Jiayi Huang, Peking University; verified email at stu.pku.edu.cn

Han Zhong

Peking University

Verified email at stu.pku.edu.cn

Machine Learning


Title	Cited by	Year
GEC: A Unified Framework for Interactive Decision Making in MDP, POMDP, and Beyond. H Zhong, W Xiong, S Zheng, L Wang, Z Wang, Z Yang, T Zhang. arXiv preprint arXiv:2211.01962, 2022.	50*	2022
Human-in-the-loop: Provably Efficient Preference-based Reinforcement Learning with General Function Approximation. X Chen, H Zhong, Z Yang, Z Wang, L Wang. International Conference on Machine Learning, 3773-3793, 2022.	44	2022
Nearly Minimax Optimal Offline Reinforcement Learning with Linear Function Approximation: Single-Agent MDP and Markov Game. W Xiong, H Zhong, C Shi, C Shen, L Wang, T Zhang. arXiv preprint arXiv:2205.15512, 2022.	42	2022
Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-constraint. W Xiong, H Dong, C Ye, Z Wang, H Zhong, H Ji, N Jiang, T Zhang. ICLR 2024 Workshop on Mathematical and Empirical Understanding of Foundation …, 2023.	40*	2023
Can Reinforcement Learning Find Stackelberg-Nash Equilibria in General-Sum Markov Games with Myopically Rational Followers? H Zhong, Z Yang, Z Wang, MI Jordan. Journal of Machine Learning Research 24 (35), 1-52, 2023.	40*	2023
Pessimistic minimax value iteration: Provably efficient equilibrium learning from offline datasets. H Zhong, W Xiong, J Tan, L Wang, T Zhang, Z Wang, Z Yang. International Conference on Machine Learning, 27117-27142, 2022.	38	2022
Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration. Z Liu, M Lu, W Xiong, H Zhong, H Hu, S Zhang, S Zheng, Z Yang, Z Wang. Thirty-seventh Conference on Neural Information Processing Systems, 2023.	25*	2023
A Self-Play Posterior Sampling Algorithm for Zero-Sum Markov Games. W Xiong, H Zhong, C Shi, C Shen, T Zhang. International Conference on Machine Learning, 24496-24523, 2022.	25	2022
Why robust generalization in deep learning is difficult: Perspective of expressive power. B Li, J Jin, H Zhong, J Hopcroft, L Wang. Advances in Neural Information Processing Systems 35, 4370-4384, 2022.	21	2022
A theoretical analysis of optimistic proximal policy optimization in linear markov decision processes. H Zhong, T Zhang. Advances in Neural Information Processing Systems 36, 2024.	19	2024
Double pessimism is provably efficient for distributionally robust offline reinforcement learning: Generic algorithm and robust partial coverage. J Blanchet, M Lu, T Zhang, H Zhong. Advances in Neural Information Processing Systems 36, 2024.	18	2024
Optimistic Policy Optimization is Provably Efficient in Non-stationary MDPs. H Zhong, Z Yang, Z Wang, C Szepesvári. arXiv preprint arXiv:2110.08984, 2021.	18	2021
Nearly optimal policy optimization with stable at any time guarantee. T Wu, Y Yang, H Zhong, L Wang, S Du, J Jiao. International Conference on Machine Learning, 24243-24265, 2022.	13	2022
DPO Meets PPO: Reinforced Token Optimization for RLHF. H Zhong, G Feng, W Xiong, L Zhao, D He, J Bian, L Wang. arXiv preprint arXiv:2404.18922, 2024.	9	2024
Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment. R Yang, X Pan, F Luo, S Qiu, H Zhong, D Yu, J Chen. arXiv preprint arXiv:2402.10207, 2024.	8	2024
Towards Robust Offline Reinforcement Learning under Diverse Data Corruption. R Yang, H Zhong, J Xu, A Zhang, C Zhang, L Han, T Zhang. arXiv preprint arXiv:2310.12955, 2023.	7	2023
Tackling heavy-tailed rewards in reinforcement learning with function approximation: Minimax optimal and instance-dependent regret bounds. J Huang, H Zhong, L Wang, L Yang. Advances in Neural Information Processing Systems 36, 2024.	6	2024
Provable Sim-to-real Transfer in Continuous Domain with Partial Observations. J Hu, H Zhong, C Jin, L Wang. arXiv preprint arXiv:2210.15598, 2022.	6	2022
Breaking the Moments Condition Barrier: No-Regret Algorithm for Bandits with Super Heavy-Tailed Payoffs. H Zhong, J Huang, L Yang, L Wang. Advances in Neural Information Processing Systems 34, 2021.	6	2021
A Reduction-Based Framework for Conservative Bandits and Reinforcement Learning. Y Yang, T Wu, H Zhong, E Garcelon, M Pirotta, A Lazaric, L Wang, SS Du. International Conference on Learning Representations, 2021/9/29, 2021.	6*	2021
