Ziran Yang
Title
Cited by
Year
Red Teaming Game: A Game-Theoretic Framework for Red Teaming Language Models
C Ma, Z Yang, M Gao, H Ci, J Gao, X Pan, Y Yang
arXiv preprint arXiv:2310.00322, 2023
Cited by 6, 2023
Panacea: Pareto Alignment via Preference Adaptation for LLMs
Y Zhong, C Ma, X Zhang, Z Yang, Q Zhang, S Qi, Y Yang
arXiv preprint arXiv:2402.02030, 2024
Cited by 5, 2024
SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset
J Dai, T Chen, X Wang, Z Yang, T Chen, J Ji, Y Yang
arXiv preprint arXiv:2406.14477, 2024
2024