I need to check text for spelling and grammar errors, so I started using the LanguageTool API. When I run the starter code they provide, shown below:
JLanguageTool langTool = new JLanguageTool(Language.ENGLISH);
langTool.activateDefaultPatternRules();
// The trailing space after "rice" keeps the two string literals from running together.
List<RuleMatch> matches = langTool.check("Eat I rice " +
    "every day and go school to good as a boy");
for (RuleMatch match : matches) {
    System.out.println("Potential error at line " +
        match.getEndLine() + ", column " +
        match.getColumn() + ": " + match.getMessage());
    System.out.println("Suggested correction: " +
        match.getSuggestedReplacements());
}
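I get no errors at all. Sorry if I am wrong, but is the sentence "Eat I rice every day and go school to good as a boy" grammatically correct? Whether it is or not, is there any way to detect this kind of sentence (meaningless and/or grammatically incorrect) with this tool?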
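Best answer
LanguageTool is rule-based. Evidently, none of its rules catches the sentence "Eat I rice every day and go school to good as a boy".
http://wiki.languagetool.org/tips-and-tricks has information on how to add user-defined rules to LanguageTool.
Here is an example of such a rule: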
<rule>
  <pattern>
    <token>
      <exception regexp="yes">(that|ha[ds]|will|must|could|can|should|would|does|did|may|might|t|let)</exception>
      <exception inflected="yes" regexp="yes">feel|hear|see|watch|prevent|help|stop|be</exception>
      <exception postag="C[CD]|IN|DT|MD|NNP|\." postag_regexp="yes"></exception>
      <exception scope="previous" postag="PRP$"/>
    </token>
    <token postag="NNP" regexp="yes">.{2,}<exception postag="JJ|CC|RP|DT|PRP\$?|NNPS|NNS|IN|RB|WRB|VBN" postag_regexp="yes"></exception></token>
    <marker>
      <token postag="VB|VBP" postag_regexp="yes" regexp="yes">\p{Lower}+<exception postag="VBN|VBD|JJ|IN|MD" postag_regexp="yes"></exception></token>
    </marker>
    <token postag="IN|DT" postag_regexp="yes"></token>
  </pattern>
  <message>The proper name in singular (<match no="2"></match>) must be used with a third-person verb: <suggestion><match no="3" postag="VBZ"></match></suggestion>.</message>
  <short>Grammatical problem</short>
  <example correction="walks" type="incorrect">Ann <marker>walk</marker> to the building.</example>
  <example type="correct">Bill <marker>walks</marker> to the building.</example>
  <example type="correct">Guinness <marker>walked</marker> to the building.</example>
  <example type="correct">Roosevelt and Hoover speak each other's lines.</example>
  <example type="correct">Boys are at higher risk for autism than girls.</example>
  <example type="correct">In reply, he said he was too old for this.</example>
  <example type="correct">I can see Bill looking through the window.</example>
  <example type="correct">Richard J. Hughes made his Morris County debut in his bid for the Democratic gubernatorial elections.</example>
  <example type="correct">... last night got its seven-concert Beethoven cycle at Carnegie Hall off to a good start.</example>
  <example type="correct">... and through knowing Him better to become happier and more effective people.</example>
  <!-- TODO: Fix false-positive: The library and Medical Center are to the north.-->
  <!-- The present Federal program of vocational education began in 1917. -->
</rule>
There is an online rule editor at
http://community.languagetool.org/ruleEditor2/
A simple solution would be
<!-- English rule, 2014-09-19 -->
<rule id="ID" name="EatI">
  <pattern>
    <token>Eat</token>
    <token>i</token>
  </pattern>
  <message>Instead of <match no="2"/> <match no="1"/> it should be <match no="1"/> <match no="2"/></message>
  <url>http://stackoverflow.com/questions/13016469/detecting-meaningless-and-or-grammatically-incorrect-sentence-with-languagetool/25933907#25933907</url>
  <short>wrong order of verb and noun</short>
  <example type='incorrect'><marker>Eat i</marker> rice</example>
  <example type='correct'>I eat rice</example>
</rule>
Of course, this only covers the verb "eat", but I hope you get the picture...
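
To make the limitation concrete, here is a rough plain-Java sketch of what a two-token rule like the one above does: it scans adjacent tokens for a verb directly followed by "I" and suggests swapping them. This is not the LanguageTool API; the class `TokenSwapRule`, its methods, and the small verb list are all made up for illustration, and real rules would normally rely on part-of-speech tags rather than a hard-coded word list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Illustration only: mimics a two-token pattern rule, independent of LanguageTool. */
class TokenSwapRule {

    // Hypothetical verb list; the XML rule in the answer covers only "eat".
    static final List<String> VERBS = Arrays.asList("eat", "drink", "read");

    /** Returns the index of every listed verb that is directly followed by the pronoun "I". */
    static List<Integer> findMatches(String sentence) {
        String[] tokens = sentence.trim().split("\\s+");
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i + 1 < tokens.length; i++) {
            boolean isVerb = VERBS.contains(tokens[i].toLowerCase(Locale.ROOT));
            boolean isPronoun = tokens[i + 1].equalsIgnoreCase("i");
            if (isVerb && isPronoun) {
                hits.add(i);
            }
        }
        return hits;
    }

    /** Swaps verb and pronoun at the first match; returns the input unchanged if nothing matches. */
    static String suggest(String sentence) {
        List<Integer> hits = findMatches(sentence);
        if (hits.isEmpty()) {
            return sentence;
        }
        String[] tokens = sentence.trim().split("\\s+");
        int i = hits.get(0);
        String verb = tokens[i].toLowerCase(Locale.ROOT);
        tokens[i] = "I";
        tokens[i + 1] = verb;
        return String.join(" ", tokens);
    }

    public static void main(String[] args) {
        String input = "Eat I rice every day and go school to good as a boy";
        System.out.println(findMatches(input));
        System.out.println(suggest(input));
    }
}
```

Note that the second half of the sentence ("go school to good as a boy") passes through untouched, because no pattern describes it. A rule-based checker only flags what some rule explicitly matches, which is why each kind of error needs a rule of its own.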