RoBERTa: A robustly optimized BERT pretraining approach
Language model pretraining has led to significant performance gains, but careful
comparison between different approaches is challenging. Training is computationally …
The curious case of neural text degeneration
Despite considerable advancements with deep neural language models, the enigma of
neural text degeneration persists when these models are tested as text generators. The …
Transformers: State-of-the-art natural language processing
Recent progress in natural language processing has been driven by advances in both
model architecture and model pretraining. Transformer architectures have facilitated …
A survey of the usages of deep learning for natural language processing
Over the last several years, the field of natural language processing has been propelled
forward by an explosion in the use of deep learning models. This article provides a brief …
Deep learning-based electroencephalography analysis: a systematic review
Context. Electroencephalography (EEG) is a complex signal and can require several years
of training, as well as advanced signal processing and feature extraction methodologies to …
DialoGPT: Large-scale generative pre-training for conversational response generation
We present a large, tunable neural conversational response generation model, DialoGPT
(dialogue generative pre-trained transformer). Trained on 147M conversation-like …
Multilingual denoising pre-training for neural machine translation
This paper demonstrates that multilingual denoising pre-training produces significant
performance gains across a wide variety of machine translation (MT) tasks. We present …
AutoML: A survey of the state-of-the-art
Deep learning (DL) techniques have obtained remarkable achievements on various tasks,
such as image recognition, object detection, and language modeling. However, building a …
Defending against neural fake news
Recent progress in natural language generation has raised dual-use concerns. While
applications like summarization and translation are positive, the underlying technology also …
CTRL: A conditional transformer language model for controllable generation
Large-scale language models show promising text generation capabilities, but users cannot
easily control particular aspects of the generated text. We release CTRL, a 1.63 billion …