I have an Excel spreadsheet. I am trying to capture a line from the sheet that contains a date, then parse the date with datetime.strptime().
Here is the bit of the Excel sheet I'm working with:
and my relevant code:
pattern = re.compile(r'Listing(.+)', re.IGNORECASE)
a = pattern.findall(str(df))
print("a:", a)
new_a = str(a)
datetime_object = datetime.strptime(new_a, '%b %w %Y')
print("date:", datetime_object)
So I capture everything that follows LISTING, which produces:
a: [' JUN 11 2013 Unnamed: 1 \\']
Then I try to extract JUN, 11, and 2013, but it fails with:
ValueError: time data "[' JUN 11 2013 Unnamed: 1 \\\\']" does not match format '%b %w %Y'
I am fairly sure this is a simple fix, but as a beginner I can't see exactly how to fix it. Should I alter my regex to capture less? Or should I fix the arguments that datetime.strptime() is taking in?
The arguments seem to be right when looking at the documentation: https://docs.python.org/3.5/library/datetime.html
Thanks for any help.
You need to modify the regex you're using to get the date from the Excel file.
pattern = re.compile(r'Listing ([A-Z]+ \d{1,2} \d{4})', re.IGNORECASE)
[A-Z]+ means "one or more letters" (uppercase or lowercase, because of re.IGNORECASE), \d{1,2} means "one or two digits", and \d{4} means "four digits".
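As a minimal sketch of what that pattern captures: the sample string below is a hypothetical stand-in for str(df), modeled on the output shown in the question, so the real text may be spaced differently.

```python
import re

# Hypothetical sample standing in for str(df); the real DataFrame text may differ.
sample = "Listing JUN 11 2013 Unnamed: 1"

# Same pattern as above: letters, one or two digits, then four digits.
pattern = re.compile(r'Listing ([A-Z]+ \d{1,2} \d{4})', re.IGNORECASE)

# With one capture group, findall returns a list of the captured strings.
matches = pattern.findall(sample)
print(matches)  # ['JUN 11 2013']
```

If the real text has more than one space after "Listing", the literal space in the pattern would probably need to become \s+.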
Furthermore, the date format you're using is incorrect: %w means the weekday as a number (0 to 6, Sunday to Saturday), while you should use %d, which matches the day of the month.
Also pass the matched string itself (a[0]) rather than str(a). str(a) is the text of the whole list, brackets and quotes included, so it can never match the format.
So it should look like this in the end:
datetime_object = datetime.strptime(a[0], '%b %d %Y')
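Here is a small sketch of the parsing step on its own, assuming the date text has already been extracted as a plain string. The date_text variable is hypothetical, and month-name matching assumes an English locale.

```python
from datetime import datetime

# Hypothetical: the captured date as a plain string, not a list.
date_text = "JUN 11 2013"

# %b = abbreviated month name, %d = day of the month, %Y = four-digit year.
parsed = datetime.strptime(date_text, '%b %d %Y')
print(parsed)  # 2013-06-11 00:00:00

# The string form of a list carries brackets and quotes, so parsing it fails.
try:
    datetime.strptime(str([date_text]), '%b %d %Y')
    list_form_failed = False
except ValueError as exc:
    list_form_failed = True
    print("ValueError:", exc)
```

The second call reproduces the kind of error from the question: the format describes only the date, so any extra characters around it raise ValueError.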