Adaptive best-of-both-worlds algorithm for heavy-tailed multi-armed bandits J Huang, Y Dai, L Huang International Conference on Machine Learning, 9173-9200, 2022 | 15 | 2022 |
RLx2: Training a sparse deep reinforcement learning model from scratch Y Tan, P Hu, L Pan, J Huang, L Huang arXiv preprint arXiv:2205.15043, 2022 | 8 | 2022 |
Queue scheduling with adversarial bandit learning J Huang, L Golubchik, L Huang arXiv preprint arXiv:2303.01745, 2023 | 6 | 2023 |
Banker online mirror descent: A universal approach for delayed online bandit learning J Huang, Y Dai, L Huang International Conference on Machine Learning, 13814-13844, 2023 | 5 | 2023 |
Robust wireless scheduling under arbitrary channel dynamics and feedback delay J Huang, L Huang 2021 33rd International Teletraffic Congress (ITC-33), 1-8, 2021 | 2 | 2021 |
Banker online mirror descent J Huang, L Huang arXiv preprint arXiv:2106.08943, 2021 | 2 | 2021 |
Scale-Free Adversarial Multi-Armed Bandit with Arbitrary Feedback Delays J Huang, Y Dai, L Huang arXiv preprint arXiv:2110.13400, 2021 | 1 | 2021 |