Qiang He

Cited by

	All	Since 2019
Citations	88	88
h-index	5	5
i10-index	4	4

Citations per year: 2021: 11, 2022: 16, 2023: 33, 2024: 28

Co-authors

Xinwen HOU, Institute of Automation, Chinese Academy of Sciences (verified email at ia.ac.cn)
Chen Gong, University of Virginia (verified email at virginia.edu)
Jieyu Zhang, University of Washington (verified email at cs.washington.edu)
Tianyi Zhou, Assistant Professor of Computer Science, University of Maryland, College Park (verified email at umiacs.umd.edu)
Meng Fang, University of Liverpool (verified email at liverpool.ac.uk)
Setareh Maghsudi, Ruhr-University Bochum (verified email at rub.de)
Huangyuan Chloe Su, Harvard University, AWS AI Labs (verified email at amazon.com)

Ruhr University Bochum

Verified email at ruhr-uni-bochum.de

Reinforcement Learning, Deep Learning, Large Language Models, RLHF, Alignment


Title	Cited by	Year
WD3: Taming the estimation bias in deep reinforcement learning. Q He, X Hou. arXiv preprint arXiv:2006.12622, 2020	34*	2020
MEPG: A minimalist ensemble policy gradient framework for deep reinforcement learning. Q He, C Gong, Y Qu, X Chen, X Hou, Y Liu. ICML'23, DA in RL Workshop, 39th International Conference on Machine …, 2021	15	2021
POPO: Pessimistic offline policy optimization. Q He, X Hou, Y Liu. ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022	13	2022
Frustratingly Easy Regularization on Representation Can Boost Deep Reinforcement Learning. Q He, H Su, J Zhang, X Hou. CVPR'2023, Proceedings of the IEEE/CVF Conference on Computer Vision and …, 2023	11*	2023
Wide-sense stationary policy optimization with Bellman residual on video games. C Gong, Q He, Y Bai, X Hou, G Fan, Y Liu. ICME'2021, 2021 IEEE International Conference on Multimedia and Expo (ICME), 1-6, 2021	9	2021
Eigensubspace of temporal-difference dynamics and how it improves value approximation in reinforcement learning. Q He, T Zhou, M Fang, S Maghsudi. ECML/PKDD'2023, Joint European Conference on Machine Learning and Knowledge …, 2023	2	2023
The f-Divergence Reinforcement Learning Framework. C Gong, Q He, Y Bai*, X Chen, X Hou, Y Liu, G Fan. arXiv preprint arXiv:2109.11867, 2021	2	2021
Task Adaptation from Skills: Information Geometry, Disentanglement, and New Objectives for Unsupervised Reinforcement Learning. Y Yang, T Zhou, Q He, L Han, M Pechenizkiy, M Fang. ICLR'2024 Spotlight; The Twelfth International Conference on Learning …, 2023	1	2023
Centralized Cooperative Exploration Policy for Continuous Control Tasks. C Li, C Gong, Q He, X Hou, Y Liu. AAMAS'2023, The 22nd International Conference on Autonomous Agents and …, 2023	1	2023
Advancing DRL Agents in Commercial Fighting Games: Training, Integration, and Agent-Human Alignment. C Zhang, Q He, Z Yuan, ES Liu, H Wang, J Zhao, Y Wang. The 41st International Conference on Machine Learning (ICML). https://arxiv …, 2024		2024
Keep Various Trajectories: Promoting Exploration of Ensemble Policies in Continuous Control. C Li, C Gong, Q He, X Hou. NeurIPS'2023, Thirty-seventh Conference on Neural Information Processing Systems, 2023		2023
Adaptive Regularization of Representation Rank as an Implicit Constraint of Bellman Equation. Q He, T Zhou, M Fang, S Maghsudi. ICLR'2024; The Twelfth International Conference on Learning Representations, 2023		2023
