Problem description
I am building a simple parser that takes a query like the following:
'show fizi commits from 1/1/2010 to 11/2/2006'
So far I have:
from pyparsing import CaselessKeyword, Combine, Optional, Word, alphas, nums

class QueryParser(object):
    def parser(self, stmnt):
        keywords = ["select", "from", "to", "show", "commits", "where", "group by", "order by", "and", "or"]
        [select, _from, _to, show, commits, where, groupby, orderby, _and, _or] = [CaselessKeyword(word) for word in keywords]
        user = Word(alphas + ".")
        user2 = Combine(user + "'s")
        startdate = self.getdate()
        enddate = self.getdate()
        # Try the possessive form first; with (user | user2) the plain form always wins.
        bnf = ((show | select) + (user2 | user).setResultsName("user") + commits.setResultsName("stats")
               + Optional(_from + startdate.setResultsName("start") + _to + enddate.setResultsName("end")))
        a = bnf.parseString(stmnt)
        return a

    def getdate(self):
        # Three slash-separated integers, joined back into a single token by Combine.
        integer = Word(nums).setParseAction(lambda t: int(t[0]))
        date = Combine(integer('year') + '/' + integer('month') + '/' + integer('day'))
        #date.setParseAction(self.convertToDatetime)
        return date
I would like the dates to be more generic, meaning the user can provide "20 Jan, 2010" or some other date format. I found a good date-parsing function online that does exactly that: it takes a date as a string and parses it. So all that is left is to feed that function the date strings I get from my parser. How do I go about tokenizing and capturing the two date strings? For now the parser only captures the 'y/m/d' format. Is there a way to just get the entire string regardless of how it is formatted, something like capturing whatever comes right after the keywords from and to? Any help is greatly appreciated.
Recommended answer
A simple approach is to require that the dates be quoted. A rough example is something like this, though you may need to adjust it to fit your current grammar:
from pyparsing import CaselessKeyword, quotedString, removeQuotes
from dateutil.parser import parse as parse_date

# Copy quotedString so the parse action is not attached to the shared module-level expression.
quoted_date = quotedString.copy().setParseAction(removeQuotes)

dp = (
    CaselessKeyword('from') + quoted_date('from') +
    CaselessKeyword('to') + quoted_date('to')
)
res = dp.parseString('from "jan 20" to "apr 5"')
from_date = parse_date(res['from'])
to_date = parse_date(res['to'])
# dateutil fills in the missing year from the current date; the values below are from a run in 2015:
# from_date, to_date == (datetime.datetime(2015, 1, 20, 0, 0), datetime.datetime(2015, 4, 5, 0, 0))
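If quoting the dates is not acceptable, one alternative is to capture the raw text between the keywords instead. The following is only a sketch, not part of the original answer: it assumes the date range is always the last thing in the query, and the names `parse_query`, `free_start` and `free_end` are made up for illustration. It uses pyparsing's `SkipTo` to take everything up to the `to` keyword and `restOfLine` to take whatever follows it, so the grammar never needs to know the date format.

```python
from pyparsing import CaselessKeyword, Optional, SkipTo, Word, alphas, restOfLine

show, select, commits, from_kw, to_kw = [
    CaselessKeyword(w) for w in ("show", "select", "commits", "from", "to")
]

user = Word(alphas + ".")

# Free-form start date: all text up to (not including) the keyword `to`.
free_start = SkipTo(to_kw).setParseAction(lambda t: t[0].strip())
# Free-form end date: whatever remains on the line (assumes nothing follows the range).
free_end = restOfLine.copy().setParseAction(lambda t: t[0].strip())

query = (
    (show | select)
    + user("user")
    + commits("stats")
    + Optional(from_kw + free_start("start") + to_kw + free_end("end"))
)


def parse_query(text):
    """Return (user, start, end); start and end are None when no range is given."""
    res = query.parseString(text)
    return res["user"], res.get("start"), res.get("end")


print(parse_query("show fizi commits from 20 Jan, 2010 to 11/2/2011"))
# ('fizi', '20 Jan, 2010', '11/2/2011')
```

The two captured strings can then be handed to `parse_date` exactly as in the quoted version. Because `to` is a keyword, it only matches as a whole word, so a month such as "October" does not end the start date early.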