Jiale Cheng
PhD student at Tsinghua University
Verified email at mails.tsinghua.edu.cn
Title
Cited by
Year
On the safety of conversational models: Taxonomy, dataset, and benchmark
H Sun, G Xu, J Deng, J Cheng, C Zheng, H Zhou, N Peng, X Zhu, ...
arXiv preprint arXiv:2110.08466, 2021
Cited by: 53; Year: 2021
Safety assessment of Chinese large language models
H Sun, Z Zhang, J Deng, J Cheng, M Huang
arXiv preprint arXiv:2304.10436, 2023
Cited by: 51; Year: 2023
Towards Safer Generative Language Models: A Survey on Safety Risks, Evaluations, and Improvements
J Deng, J Cheng, H Sun, Z Zhang, M Huang
arXiv preprint arXiv:2302.09270, 2023
Cited by: 26*; Year: 2023
AlignBench: Benchmarking Chinese alignment of large language models
X Liu, X Lei, S Wang, Y Huang, Z Feng, B Wen, J Cheng, P Ke, Y Xu, ...
arXiv preprint arXiv:2311.18743, 2023
Cited by: 11; Year: 2023
CritiqueLLM: Scaling LLM-as-critic for effective and explainable evaluation of large language model generation
P Ke, B Wen, Z Feng, X Liu, X Lei, J Cheng, S Wang, A Zeng, Y Dong, ...
arXiv preprint arXiv:2311.18702, 2023
Cited by: 11; Year: 2023
Black-box prompt optimization: Aligning large language models without model training
J Cheng, X Liu, K Zheng, P Ke, H Wang, Y Dong, J Tang, M Huang
arXiv preprint arXiv:2311.04155, 2023
Cited by: 11; Year: 2023
PAL: Persona-augmented emotional support conversation generation
J Cheng, S Sabour, H Sun, Z Chen, M Huang
arXiv preprint arXiv:2212.09235, 2022
Cited by: 11; Year: 2022
Constructing highly inductive contexts for dialogue safety through controllable reverse generation
Z Zhang, J Cheng, H Sun, J Deng, F Mi, Y Wang, L Shang, M Huang
arXiv preprint arXiv:2212.01810, 2022
Cited by: 6; Year: 2022
InstructSafety: A Unified Framework for Building Multidimensional and Explainable Safety Detector through Instruction Tuning
Z Zhang, J Cheng, H Sun, J Deng, M Huang
Findings of the Association for Computational Linguistics: EMNLP 2023, 10421 …, 2023
Cited by: 2; Year: 2023