Problem description
I need your help in determining the best approach for classifying industry-specific sentences (i.e. movie reviews) as "positive" or "negative". I've looked at libraries such as OpenNLP before, but they are too low-level: they only give me the basic sentence composition. What I need is a higher-level structure:
- hopefully with wordlists
- hopefully trainable on my own set of data
Thanks!
Recommended answer
What you are looking for is commonly called sentiment analysis. Typically, sentiment analysis cannot handle delicate subtleties such as sarcasm or irony, but it fares pretty well if you throw a large set of data at it.
Sentiment analysis usually needs quite a bit of pre-processing: at the very least tokenization, sentence boundary detection and part-of-speech tagging. Sometimes syntactic parsing can be important as well. Doing it properly is an entire branch of research in computational linguistics, and I wouldn't advise coming up with your own solution unless you take the time to study the field first.
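To make the first two pre-processing steps concrete, here is a minimal sketch in Python using only the standard library. It is an illustration under simplifying assumptions, not how OpenNLP or LingPipe do it: the function names are made up for this example, sentence boundaries are guessed from punctuation alone (so abbreviations like "Dr." will be split wrongly), and part-of-speech tagging is left out because it needs a trained model.

```python
import re

# Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
# Real sentence boundary detectors are trained models; this is a crude stand-in.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")

# A token is a word (optionally with internal apostrophes, e.g. "wasn't")
# or a single punctuation character.
_TOKEN = re.compile(r"[A-Za-z0-9]+(?:'[A-Za-z]+)*|[^\sA-Za-z0-9]")


def split_sentences(text):
    """Split raw text into sentences using the naive punctuation rule."""
    return [s for s in _SENTENCE_END.split(text.strip()) if s]


def tokenize(sentence):
    """Split one sentence into lowercased word and punctuation tokens."""
    return [t.lower() for t in _TOKEN.findall(sentence)]


def preprocess(text):
    """Return a list of sentences, each as a list of tokens."""
    return [tokenize(s) for s in split_sentences(text)]


if __name__ == "__main__":
    review = "The plot was thin. But the acting? Superb!"
    print(preprocess(review))
    # [['the', 'plot', 'was', 'thin', '.'], ['but', 'the', 'acting', '?'], ['superb', '!']]
```

Even this toy version shows why the step is not trivial: every shortcut taken here (punctuation-only boundaries, no tagging) is a place where a proper toolkit does real work.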
OpenNLP has some tools to aid sentiment analysis, but if you want something more serious, you should look into the LingPipe toolkit. It has some built-in sentiment analysis functionality and a nice tutorial. You can also train it on your own set of data, but don't think that it is entirely trivial :-).
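To give a feel for what "training on your own data" means, here is a minimal sketch of a word-based Naive Bayes classifier in Python, standard library only. This is not LingPipe's API or its method; the class name, the tokenizer and the four training reviews are all made up for illustration, and a usable model would need far more labeled data.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesSentiment:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def train(self, labeled_texts):
        """labeled_texts is an iterable of (text, label) pairs."""
        for text, label in labeled_texts:
            self.doc_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def classify(self, text):
        """Return the label with the highest log-probability for the text."""
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # words never seen in training carry no evidence
                count = self.word_counts[label][word] + 1  # add-one smoothing
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


if __name__ == "__main__":
    # Hypothetical toy training set; real use needs many labeled reviews.
    training = [
        ("a gripping, wonderful film with superb acting", "positive"),
        ("wonderful story and a great cast", "positive"),
        ("a dull, boring plot and terrible dialogue", "negative"),
        ("boring, terrible, a complete waste of time", "negative"),
    ]
    clf = NaiveBayesSentiment()
    clf.train(training)
    print(clf.classify("superb cast and a wonderful story"))  # positive
    print(clf.classify("terrible and boring"))                # negative
```

The per-label word counts the model learns are effectively a weighted wordlist derived from your data. Note that a bag-of-words model like this ignores word order, which is one reason sarcasm and irony slip past it.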
Googling for the term will probably also give you some resources to work with. If you have any more specific questions, just ask; I'm watching the nlp tag closely ;-)