Fengshuo Bai
Title
Cited by
Year
Meta-reward-net: Implicitly differentiable reward learning for preference-based reinforcement learning
R Liu, F Bai, Y Du, Y Yang
Advances in Neural Information Processing Systems 35, 22270-22284, 2022
Cited by: 30, Year: 2022
Picor: Multi-task deep reinforcement learning with policy correction
F Bai, H Zhang, T Tao, Z Wu, Y Wang, B Xu
Proceedings of the AAAI Conference on Artificial Intelligence 37 (6), 6728-6736, 2023
Cited by: 4, Year: 2023
Measuring Value Understanding in Language Models through Discriminator-Critique Gap
Z Zhang, F Bai, J Gao, Y Yang
arXiv preprint arXiv:2310.00378, 2023
Cited by: 3, Year: 2023
Zero-shot Preference Learning for Offline RL via Optimal Transport
R Liu, Y Du, F Bai, J Lyu, X Li
arXiv preprint arXiv:2306.03615, 2023
Cited by: 3, Year: 2023
Incentive Compatibility for AI Alignment in Sociotechnical Systems: Positions and Prospects
Z Zhang, F Bai, M Wang, H Ye, C Ma, Y Yang
arXiv preprint arXiv:2402.12907, 2024
2024
β-DQN: Diverse Exploration via Learning a Behavior Function
H Zhang, F Bai, C Xiao, C Gao, M Müller
2023
BATTLE: Towards Behavior-oriented Adversarial Attacks against Deep Reinforcement Learning
F Bai, R Liu, Y Du, Y Wen, Y Yang
2023
Zero-shot Cross-task Preference Alignment for Offline RL via Optimal Transport
R Liu, Y Du, F Bai, J Lyu, X Li
2023