Zhihui Xie
University of Hong Kong, Shanghai Jiao Tong University
Verified email at connect.hku.hk
Title
Cited by
Year
Comparison-based Conversational Recommender System with Relative Bandit Feedback
Z Xie, T Yu, C Zhao, S Li
SIGIR 2021, 1400-1409, 2021
Cited by 39, 2021
Silkie: Preference Distillation for Large Visual Language Models
L Li*, Z Xie*, M Li, S Chen, P Wang, L Chen, Y Yang, B Wang, L Kong
arXiv preprint arXiv:2312.10665, 2023
Cited by 20, 2023
Pretraining in Deep Reinforcement Learning: A Survey
Z Xie, Z Lin, J Li, S Li, D Ye
arXiv preprint arXiv:2211.03959, 2022
Cited by 20, 2022
Knowledge-aware Conversational Preference Elicitation with Bandit Feedback
C Zhao, T Yu, Z Xie, S Li
WWW 2022, 483-492, 2022
Cited by 19, 2022
Dynamics-Aware Adaptation for Reinforcement Learning Based Cross-Domain Interactive Recommendation
J Wu*, Z Xie*, T Yu, H Zhao, R Zhang, S Li
SIGIR 2022, 290-300, 2022
Cited by 15, 2022
Future-conditioned unsupervised pretraining for decision transformer
Z Xie, Z Lin, D Ye, Q Fu, Y Wei, S Li
International Conference on Machine Learning, 38187-38203, 2023
Cited by 14, 2023
Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models
A Ormazabal, C Zheng, CM d'Autume, D Yogatama, D Fu, D Ong, E Chen, ...
arXiv preprint arXiv:2404.12387, 2024
Cited by 6*, 2024
Layered Neighborhood Expansion for Incremental Multiple Graph Matching
Z Chen, Z Xie, J Yan, Y Zheng, X Yang
ECCV 2020, 251-267, 2020
Cited by 5, 2020
Discovering Low-rank Subspaces for Language-agnostic Multilingual Representations
Z Xie, H Zhao, T Yu, S Li
EMNLP 2022, 5617-5633, 2022
Cited by 4, 2022
Sim-to-Real Interactive Recommendation via Off-Dynamics Reinforcement Learning
J Wu, Z Xie, T Yu, Q Li, S Li
2nd Offline Reinforcement Learning Workshop at NeurIPS, 2021
Cited by 3, 2021
Jailbreaking as a Reward Misspecification Problem
Z Xie, J Gao, L Li, Z Li, Q Liu, L Kong
arXiv preprint arXiv:2406.14393, 2024
2024
Calibrating Reasoning in Language Models with Internal Consistency
Z Xie, J Guo, T Yu, S Li
arXiv preprint arXiv:2405.18711, 2024
2024
Toward joint utilization of absolute and relative bandit feedback for conversational recommendation
Y Xia, Z Xie, T Yu, C Zhao, S Li
User Modeling and User-Adapted Interaction, 1-38, 2024
2024