Ziwei Ji
Title
Cited by
Year
Risk and parameter convergence of logistic regression
Z Ji, M Telgarsky
arXiv preprint arXiv:1803.07300, 2018
Cited by 298*, 2018
Gradient descent aligns the layers of deep linear networks
Z Ji, M Telgarsky
arXiv preprint arXiv:1810.02032, 2018
Cited by 219, 2018
Polylogarithmic width suffices for gradient descent to achieve arbitrarily small test error with shallow relu networks
Z Ji, M Telgarsky
arXiv preprint arXiv:1909.12292, 2019
Cited by 178, 2019
Directional convergence and alignment in deep learning
Z Ji, M Telgarsky
Advances in Neural Information Processing Systems 33, 17176-17186, 2020
Cited by 144, 2020
Characterizing the implicit bias via a primal-dual analysis
Z Ji, M Telgarsky
Algorithmic Learning Theory, 772-804, 2021
Cited by 64*, 2021
Gradient descent follows the regularization path for general losses
Z Ji, M Dudík, RE Schapire, M Telgarsky
Conference on Learning Theory, 2109-2136, 2020
Cited by 54, 2020
Neural tangent kernels, transportation mappings, and universal approximation
Z Ji, M Telgarsky, R Xian
arXiv preprint arXiv:1910.06956, 2019
Cited by 47, 2019
Early-stopped neural networks are consistent
Z Ji, J Li, M Telgarsky
Advances in Neural Information Processing Systems 34, 1805-1817, 2021
Cited by 32, 2021
Generalization bounds via distillation
D Hsu, Z Ji, M Telgarsky, L Wang
arXiv preprint arXiv:2104.05641, 2021
Cited by 29, 2021
Fast margin maximization via dual acceleration
Z Ji, N Srebro, M Telgarsky
International Conference on Machine Learning, 4860-4869, 2021
Cited by 28, 2021
Reproducibility in optimization: Theoretical framework and limits
K Ahn, P Jain, Z Ji, S Kale, P Netrapalli, GI Shamir
Advances in Neural Information Processing Systems 35, 18022-18033, 2022
Cited by 13, 2022
Actor-critic is implicitly biased towards high entropy optimal policies
Y Hu, Z Ji, M Telgarsky
arXiv preprint arXiv:2110.11280, 2021
Cited by 13, 2021
Approximation power of random neural networks
B Bailey, Z Ji, M Telgarsky, R Xian
arXiv preprint arXiv:1906.07709, 2019
Cited by 7, 2019
Think before you speak: Training language models with pause tokens
S Goyal, Z Ji, AS Rawat, AK Menon, S Kumar, V Nagarajan
arXiv preprint arXiv:2310.02226, 2023
Cited by 6, 2023
Agnostic learnability of halfspaces via logistic loss
Z Ji, K Ahn, P Awasthi, S Kale, S Karp
International Conference on Machine Learning, 10068-10103, 2022
Cited by 6, 2022
Social welfare and profit maximization from revealed preferences
Z Ji, R Mehta, M Telgarsky
International Conference on Web and Internet Economics, 264-281, 2018
Cited by 6, 2018
Wikidata Vandalism Detection-The Loganberry Vandalism Detector at WSDM Cup 2017
Q Zhu, H Ng, L Liu, Z Ji, B Jiang, J Shen, H Gui
arXiv preprint arXiv:1712.06922, 2017
Cited by 6, 2017
Depth Dependence of μP Learning Rates in ReLU MLPs
S Jelassi, B Hanin, Z Ji, SJ Reddi, S Bhojanapalli, S Kumar
arXiv preprint arXiv:2305.07810, 2023
Cited by 4, 2023
Convex analysis at infinity: An introduction to astral space
M Dudík, RE Schapire, M Telgarsky
arXiv preprint arXiv:2205.03260, 2022
Cited by 2, 2022
The implicit bias of gradient descent: from linear classifiers to deep networks
Z Ji
University of Illinois at Urbana-Champaign, 2022
2022