Paper Reading Notes: Word Embeddings: A Survey
Takeaways
Definition of word embedding
dense, distributed, fixed-length word vectors, built using word co-occurrence statistics as per the distributional hypothesis.
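A toy illustration of what this definition means in practice (the vocabulary and numbers below are made up, not from the survey): each word maps to a dense real-valued vector of the same fixed length, and vectors of words with similar contexts end up close together, e.g. under cosine similarity.

```python
import numpy as np

# Toy embedding table: every word gets a dense vector of the same
# fixed length (4 dimensions here); all values are made up.
embeddings = {
    "king":  np.array([0.50, 0.68, 0.12, 0.90]),
    "queen": np.array([0.48, 0.71, 0.15, 0.88]),
    "apple": np.array([0.91, 0.02, 0.80, 0.11]),
}

def cosine(u, v):
    """Cosine similarity, the usual way to compare embedding vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(embeddings["king"], embeddings["queen"]))  # high: similar contexts
print(cosine(embeddings["king"], embeddings["apple"]))  # lower: unrelated words
```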
The distributional hypothesis
Words that appear in similar contexts have similar meanings.
Word relatedness in HowNet (知网)
The likelihood that two words co-occur in the same context.
In summary, relatedness defined this way and the distributional hypothesis are essentially the same idea!
Categories of word embeddings
Prediction-based
Based on neural network language models.
Trained to predict words from their context (e.g. the next word); the embeddings are the learned weights.
E.g. NNLM, word2vec.
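A minimal sketch of the prediction-based route using gensim's Word2Vec (assumes gensim >= 4.0; the toy corpus and hyperparameters are placeholders, not values from the survey):

```python
from gensim.models import Word2Vec  # assumes gensim >= 4.0 is installed

# Tiny toy corpus (placeholder); real training needs far more text.
sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
    ["cats", "and", "dogs", "are", "pets"],
]

# sg=1 selects skip-gram: predict surrounding words from the center word.
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1, epochs=50)

print(model.wv["cat"].shape)         # fixed-length dense vector, here (50,)
print(model.wv.most_similar("cat"))  # nearest neighbours in embedding space
```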
Count-based
Based on a word-context matrix.
Built by counting word-context co-occurrences.
E.g. GloVe.
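A minimal sketch of the count-based starting point: building a word-context co-occurrence matrix from a toy corpus with a symmetric window. GloVe then fits word vectors to these counts; only the counting step is shown, and the corpus and window size are placeholders.

```python
from collections import defaultdict

# Toy corpus (placeholder).
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
]

window = 2                  # symmetric context window size (assumed)
cooc = defaultdict(float)   # (word, context word) -> co-occurrence count

for sentence in corpus:
    for i, word in enumerate(sentence):
        lo, hi = max(0, i - window), min(len(sentence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                # GloVe additionally down-weights distant context words (1/distance).
                cooc[(word, sentence[j])] += 1.0

print(cooc[("cat", "sat")])
print(cooc[("the", "on")])
```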