RoBERTa: A robustly optimized BERT pretraining approach

Y Liu, M Ott, N Goyal, J Du, M Joshi, D Chen… - arXiv preprint arXiv …, 2019 - arxiv.org
Language model pretraining has led to significant performance gains but careful
comparison between different approaches is challenging. Training is computationally …
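
The pretrained checkpoints are publicly available; as a minimal sketch (assuming the Hugging Face `transformers` library and the public `roberta-base` checkpoint), the masked-language-model objective RoBERTa was pretrained with can be probed directly:

```python
from transformers import pipeline

# Fill-mask probes the masked-LM pretraining objective;
# RoBERTa's mask token is "<mask>".
unmasker = pipeline("fill-mask", model="roberta-base")
for pred in unmasker("Language model pretraining has led to significant <mask> gains."):
    print(pred["token_str"], round(pred["score"], 3))
```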

The curious case of neural text degeneration

A Holtzman, J Buys, L Du, M Forbes, Y Choi - arXiv preprint arXiv …, 2019 - arxiv.org
Despite considerable advancements with deep neural language models, the enigma of
neural text degeneration persists when these models are tested as text generators. The …
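
This paper is best known for proposing nucleus (top-p) sampling, which draws the next token from the smallest set of tokens whose cumulative probability exceeds a threshold p. A minimal NumPy sketch (the function name and toy distribution are illustrative, not from the paper):

```python
import numpy as np

def nucleus_sample(probs, p=0.9, rng=None):
    """Sample a token id from the smallest set of tokens whose
    cumulative probability exceeds p (nucleus / top-p sampling)."""
    rng = rng or np.random.default_rng()
    order = np.argsort(probs)[::-1]            # token ids, most probable first
    cum = np.cumsum(probs[order])
    cutoff = int(np.searchsorted(cum, p)) + 1  # smallest prefix with mass >= p
    nucleus = order[:cutoff]
    renorm = probs[nucleus] / probs[nucleus].sum()
    return int(rng.choice(nucleus, p=renorm))

# Toy 5-token vocabulary: with p=0.9 only the first three tokens
# (cumulative mass 0.93) form the nucleus.
vocab_probs = np.array([0.55, 0.25, 0.13, 0.05, 0.02])
print(nucleus_sample(vocab_probs, p=0.9))
```

Truncating the unreliable low-probability tail this way is the paper's remedy for the bland, repetitive text that likelihood-maximizing decoding produces.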

Transformers: State-of-the-art natural language processing

T Wolf, L Debut, V Sanh, J Chaumond… - Proceedings of the …, 2020 - aclanthology.org
Recent progress in natural language processing has been driven by advances in both
model architecture and model pretraining. Transformer architectures have facilitated …
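
The library exposes pretrained models behind a uniform interface; a minimal sketch using its `pipeline` API (with no model specified, this downloads a default sentiment-analysis checkpoint on first use):

```python
from transformers import pipeline

# The pipeline API bundles tokenizer, pretrained model, and
# postprocessing behind a single call.
classifier = pipeline("sentiment-analysis")
print(classifier("Pretrained transformers are easy to reuse."))
# e.g. [{'label': 'POSITIVE', 'score': 0.999...}]
```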

A survey of the usages of deep learning for natural language processing

DW Otter, JR Medina, JK Kalita - IEEE transactions on neural …, 2020 - ieeexplore.ieee.org
Over the last several years, the field of natural language processing has been propelled
forward by an explosion in the use of deep learning models. This article provides a brief …

Deep learning-based electroencephalography analysis: a systematic review

Y Roy, H Banville, I Albuquerque… - Journal of neural …, 2019 - iopscience.iop.org
Context. Electroencephalography (EEG) is a complex signal and can require several years
of training, as well as advanced signal processing and feature extraction methodologies to …

DialoGPT: Large-scale generative pre-training for conversational response generation

Y Zhang, S Sun, M Galley, YC Chen, C Brockett… - arXiv preprint arXiv …, 2019 - arxiv.org
We present a large, tunable neural conversational response generation model, DialoGPT
(dialogue generative pre-trained transformer). Trained on 147M conversation-like …
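
The released checkpoints plug into the standard `transformers` generation API; a minimal single-turn sketch, following the public model card for `microsoft/DialoGPT-medium` (turns are delimited by the EOS token):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")

# Encode one user turn, terminated by EOS as DialoGPT expects.
input_ids = tokenizer.encode("Does money buy happiness?" + tokenizer.eos_token,
                             return_tensors="pt")
reply_ids = model.generate(input_ids, max_length=100,
                           pad_token_id=tokenizer.eos_token_id)
# Decode only the newly generated reply, not the prompt.
print(tokenizer.decode(reply_ids[0][input_ids.shape[-1]:],
                       skip_special_tokens=True))
```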

Multilingual denoising pre-training for neural machine translation

Y Liu, J Gu, N Goyal, X Li, S Edunov… - Transactions of the …, 2020 - direct.mit.edu
This paper demonstrates that multilingual denoising pre-training produces significant
performance gains across a wide variety of machine translation (MT) tasks. We present …
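
The pretrained mBART model is fine-tuned per language pair; a minimal English-to-Romanian sketch (assuming the public `facebook/mbart-large-en-ro` checkpoint, per the `transformers` documentation):

```python
from transformers import MBartForConditionalGeneration, MBartTokenizer

tokenizer = MBartTokenizer.from_pretrained(
    "facebook/mbart-large-en-ro", src_lang="en_XX", tgt_lang="ro_RO")
model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-en-ro")

inputs = tokenizer("UN Chief Says There Is No Plan to Stop War in Syria",
                   return_tensors="pt")
# Force the decoder to start with the Romanian language code,
# as mBART marks the target language with a special token.
generated = model.generate(
    **inputs, decoder_start_token_id=tokenizer.lang_code_to_id["ro_RO"])
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```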

AutoML: A survey of the state-of-the-art

X He, K Zhao, X Chu - Knowledge-Based Systems, 2021 - Elsevier
Deep learning (DL) techniques have obtained remarkable achievements on various tasks,
such as image recognition, object detection, and language modeling. However, building a …
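
One core ingredient the survey covers is hyperparameter optimization; a toy sketch of the random-search loop that AutoML systems automate (`train_and_score` and the search space are hypothetical stand-ins for a real training run):

```python
import random

# Hypothetical stand-in for a real training run: returns a score to maximize.
def train_and_score(lr, hidden_units):
    return -(lr - 0.01) ** 2 - (hidden_units - 128) ** 2 / 1e4

# A toy search space; AutoML frameworks automate exactly this loop,
# with far smarter search strategies than random sampling.
space = {
    "lr": lambda: 10 ** random.uniform(-4, -1),
    "hidden_units": lambda: random.choice([32, 64, 128, 256]),
}

trials = [{name: draw() for name, draw in space.items()} for _ in range(20)]
best = max(trials, key=lambda cfg: train_and_score(**cfg))
print("best config:", best)
```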

Defending against neural fake news

R Zellers, A Holtzman, H Rashkin… - Advances in neural …, 2019 - proceedings.neurips.cc
Recent progress in natural language generation has raised dual-use concerns. While
applications like summarization and translation are positive, the underlying technology also …

CTRL: A conditional transformer language model for controllable generation

NS Keskar, B McCann, LR Varshney, C Xiong… - arXiv preprint arXiv …, 2019 - arxiv.org
Large-scale language models show promising text generation capabilities, but users cannot
easily control particular aspects of the generated text. We release CTRL, a 1.63 billion …
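
CTRL conditions generation on control codes prepended to the prompt; a minimal sketch (assuming the `Salesforce/ctrl` checkpoint in `transformers`; at 1.63 billion parameters the model needs substantial memory):

```python
from transformers import CTRLTokenizer, CTRLLMHeadModel

tokenizer = CTRLTokenizer.from_pretrained("Salesforce/ctrl")
model = CTRLLMHeadModel.from_pretrained("Salesforce/ctrl")

# "Reviews" is one of CTRL's control codes; prepending it steers
# generation toward product-review style text.
input_ids = tokenizer.encode("Reviews My new laptop is", return_tensors="pt")
output = model.generate(input_ids, max_length=50, repetition_penalty=1.2)
print(tokenizer.decode(output[0]))
```

The `repetition_penalty` follows the penalized-sampling scheme the paper recommends for reducing repetitive output.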