Samuel R. Bowman
Anthropic and NYU
Verified email at anthropic.com
Title | Cited by | Year
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
A Wang, A Singh, J Michael, F Hill, O Levy, SR Bowman
Proceedings of ICLR, 2019
9417 | 2019
A large annotated corpus for learning natural language inference
SR Bowman, G Angeli, C Potts, CD Manning
Proceedings of EMNLP, 2015
5614 | 2015
A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference
A Williams, N Nangia, SR Bowman
Proceedings of NAACL-HLT, 2018
5502 | 2018
Generating sentences from a continuous space
SR Bowman, L Vilnis, O Vinyals, AM Dai, R Jozefowicz, S Bengio
Proceedings of CoNLL, 2016
3131 | 2016
SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems
A Wang, Y Pruksachatkun, N Nangia, A Singh, J Michael, F Hill, O Levy, ...
Proceedings of NeurIPS, 2019
2945 | 2019
Constitutional AI: Harmlessness from AI feedback
Y Bai, S Kadavath, S Kundu, A Askell, J Kernion, A Jones, A Chen, ...
arXiv preprint arXiv:2212.08073, 2022
2223 | 2022
Beyond the imitation game: Quantifying and extrapolating the capabilities of language models
A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ...
TMLR, 2023
2011 | 2023
Neural network acceptability judgments
A Warstadt, A Singh, SR Bowman
TACL 7, 625-641, 2019
1703 | 2019
XNLI: Evaluating Cross-lingual Sentence Representations
A Conneau, G Lample, R Rinott, A Williams, SR Bowman, H Schwenk, ...
Proceedings of EMNLP, 2018
1680 | 2018
Annotation artifacts in natural language inference data
S Gururangan, S Swayamdipta, O Levy, R Schwartz, SR Bowman, ...
Proceedings of NAACL, 2018
1393 | 2018
GPQA: A graduate-level google-proof Q&A benchmark
D Rein, BL Hou, AC Stickland, J Petty, RY Pang, J Dirani, J Michael, ...
arXiv preprint arXiv:2311.12022, 2023
1272 | 2023
What do you learn from context? Probing for sentence structure in contextualized word representations
I Tenney, P Xia, B Chen, A Wang, A Poliak, RT McCoy, N Kim, ...
Proceedings of ICLR, 2019
1081 | 2019
CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models
N Nangia, C Vania, R Bhalerao, SR Bowman
Proceedings of EMNLP, 2020
917 | 2020
Red teaming language models to reduce harms: Methods, scaling behaviors, and lessons learned
D Ganguli, L Lovitt, J Kernion, A Askell, Y Bai, S Kadavath, B Mann, ...
arXiv preprint arXiv:2209.07858, 2022
831 | 2022
On Measuring Social Biases in Sentence Encoders
C May, A Wang, S Bordia, SR Bowman, R Rudinger
Proceedings of NAACL-HLT, 2019
813 | 2019
Language models don't always say what they think: Unfaithful explanations in chain-of-thought prompting
M Turpin, J Michael, E Perez, S Bowman
Advances in Neural Information Processing Systems 36, 2023
661 | 2023
BLiMP: A benchmark of linguistic minimal pairs for English
A Warstadt, A Parrish, H Liu, A Mohananey, W Peng, SF Wang, ...
TACL, 2020
628 | 2020
Language models (mostly) know what they know
S Kadavath, T Conerly, A Askell, T Henighan, D Drain, E Perez, ...
arXiv preprint arXiv:2207.05221, 2022
621 | 2022
BBQ: A Hand-Built Bias Benchmark for Question Answering
A Parrish, A Chen, N Nangia, V Padmakumar, J Phang, J Thompson, ...
Findings of ACL, 2022
603 | 2022
Sentence encoders on STILTs: Supplementary training on intermediate labeled-data tasks
J Phang, T Févry, SR Bowman
arXiv preprint arXiv:1811.01088, 2018
528 | 2018