Authors
Quoc V Le, Tomas Mikolov
Publication date
2014/6/21
Journal
ICML
Volume
14
Pages
1188-1196
Description
Abstract Many machine learning algorithms require the input to be represented as a fixed-
length feature vector. When it comes to texts, one of the most common fixed-length features
is bag-of-words. Despite their popularity, bag-of-words features have two major
weaknesses: they lose the ordering of the words and they also ignore semantics of the
words. For example, “powerful,” “strong” and “Paris” are equally distant. In this paper, we
propose Paragraph Vector, an unsupervised algorithm that learns fixed-length feature ...
Total citations
Citations per year (2014, 2015, 2016): 34, 261, 274