Problem description
Has anyone found a simple but effective way to extract date references from text? I've done a fair amount of searching for temporal extraction tools, but there isn't a lot out there. There are a few white papers, but the topic seems to fall into a subset of the whole semantic web thingy and hasn't been given much attention.
I'm just looking for something that is 80% effective. There is no need to capture things like "the month after Jan 2009", but basic common date entities would be nice.
I'm open to all suggestions, even fancy regex expressions.
Fire away!
(and thanks - Henry)
If the target temporal expressions in your data come in only a limited set of formats, use regular expressions and refine the patterns iteratively as you find cases they miss.
Otherwise, use SUTime from the Stanford NLP toolkit, which might be overkill but will definitely meet your demands.
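For the regex route, here is a rough sketch of what a first iteration might look like in Python. The pattern list, the `extract_dates` helper, and the choice to treat a bare "Jan 2009" as the first of the month are all my own assumptions for illustration, not something from a library; the idea is that you run it over your data, look at what it misses, and add or tighten patterns from there.

```python
import re
from datetime import date

# Month names and common abbreviations; the lookup key is the first three letters.
MONTHS = {
    "jan": 1, "feb": 2, "mar": 3, "apr": 4, "may": 5, "jun": 6,
    "jul": 7, "aug": 8, "sep": 9, "oct": 10, "nov": 11, "dec": 12,
}
MONTH_RE = (
    r"(?P<mon>jan(?:uary)?|feb(?:ruary)?|mar(?:ch)?|apr(?:il)?|may|june?|july?"
    r"|aug(?:ust)?|sep(?:t(?:ember)?)?|oct(?:ober)?|nov(?:ember)?|dec(?:ember)?)\.?"
)

# Ordered from most to least specific; earlier patterns win on overlap.
PATTERNS = [
    # ISO style: 2009-01-15
    re.compile(r"\b(?P<y>\d{4})-(?P<m>\d{1,2})-(?P<d>\d{1,2})\b"),
    # Month name first: January 15, 2009 / Jan. 15 2009
    re.compile(r"\b" + MONTH_RE + r"\s+(?P<d>\d{1,2})(?:st|nd|rd|th)?,?\s+(?P<y>\d{4})\b", re.I),
    # Day first: 15 January 2009
    re.compile(r"\b(?P<d>\d{1,2})(?:st|nd|rd|th)?\s+" + MONTH_RE + r",?\s+(?P<y>\d{4})\b", re.I),
    # Month and year only: Jan 2009
    re.compile(r"\b" + MONTH_RE + r"\s+(?P<y>\d{4})\b", re.I),
]


def extract_dates(text):
    """Return (matched_text, date) pairs in order of appearance."""
    found = []  # (start, end, matched_text, date)
    for pattern in PATTERNS:
        for m in pattern.finditer(text):
            # Skip spans already claimed by a more specific pattern.
            if any(m.start() < end and start < m.end() for start, end, _, _ in found):
                continue
            g = m.groupdict()
            month = int(g["m"]) if g.get("m") else MONTHS[g["mon"][:3].lower()]
            # Assumption: a month-only reference maps to the 1st of that month.
            day = int(g["d"]) if g.get("d") else 1
            try:
                value = date(int(g["y"]), month, day)
            except ValueError:
                continue  # e.g. 2009-02-30 is not a real date
            found.append((m.start(), m.end(), m.group(0), value))
    found.sort()
    return [(matched, value) for _, _, matched, value in found]


if __name__ == "__main__":
    sample = ("The meeting moved from Jan 15, 2009 to 2009-02-03, and the "
              "report is due 4 March 2009. Budget review: Dec 2009.")
    for matched, value in extract_dates(sample):
        print(matched, "->", value.isoformat())
```

This deliberately ignores numeric formats like 01/02/2009, since day/month order is ambiguous there and you'd need to know your data's locale before adding a pattern for it.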