Yuqing Wang
Title
Cited by
Year
Why Do Deep Residual Networks Generalize Better than Deep Feedforward Networks?---A Neural Tangent Kernel Perspective
K Huang, Y Wang, M Tao, T Zhao
Advances in neural information processing systems 33, 2698-2709, 2020
Cited by: 97; Year: 2020
Large learning rate tames homogeneity: Convergence and balancing effect
Y Wang, M Chen, T Zhao, M Tao
arXiv preprint arXiv:2110.03677, 2021
Cited by: 37; Year: 2021
Momentum Stiefel Optimizer, with Applications to Suitably-Orthogonal Attention, and Optimal Transport
L Kong, Y Wang, M Tao
arXiv preprint arXiv:2205.14173, 2022
Cited by: 6; Year: 2022
Good regularity creates large learning rate implicit biases: edge of stability, balancing, and catapult
Y Wang, Z Xu, T Zhao, M Tao
arXiv preprint arXiv:2310.17087, 2023
Cited by: 1; Year: 2023
Markov chain Monte Carlo for Gaussian: A linear control perspective
B Yuan, J Fan, Y Wang, M Tao, Y Chen
IEEE Control Systems Letters, 2023
Cited by: 1; Year: 2023