Using parse_date_time in package lubridate to parse dates in dmy and dmY formats

Question

I have a vector of character representation of dates, where formats mostly are dmY (e.g. 27-09-2013), dmy (e.g. 27-09-13), and occasionally some b or B months. Thus, parse_date_time in package lubridate that "allows the user to specify several format-orders to handle heterogeneous date-time character representations" could be a very useful function for me.

However, it seems that parse_date_time has a problem parsing dmy dates when they occur together with dmY dates. When parsing dmy alone, or dmy together with some other formats relevant to me, it works fine. This pattern was also noted in a comment to @Peyton's answer here. A quick fix was suggested, but I wish to ask if it is possible to handle it in lubridate.

Here I show some examples where I try to parse dates in dmy format together with some other formats, specifying orders accordingly.

library(lubridate)
# version: lubridate_1.3.0

# regarding how date format is specified in 'orders':
# examples in ?parse_date_time
# parse_date_time(x, "ymd")
# parse_date_time(x, "%y%m%d")
# parse_date_time(x, "%y %m %d")
# these order strings are equivalent and parse the same way
# "Formatting orders might include arbitrary separators. These are discarded"

# dmy date only
parse_date_time(x = "27-09-13", orders = "d m y")
# [1] "2013-09-27 UTC"
# OK

# dmy & dBY
parse_date_time(c("27-09-13", "27 September 2013"), orders = c("d m y", "d B Y"))
# [1] "2013-09-27 UTC" "2013-09-27 UTC"
# OK

# dmy & dbY
parse_date_time(c("27-09-13", "27 Sep 2013"), orders = c("d m y", "d b Y"))
# [1] "2013-09-27 UTC" "2013-09-27 UTC"
# OK

# dmy & dmY
parse_date_time(c("27-09-13", "27-09-2013"), orders = c("d m y", "d m Y"))
# [1] "0013-09-27 UTC" "2013-09-27 UTC"
# not OK

# does order of the date components matter?
parse_date_time(c("2013-09-27", "13-09-13"), orders = c("Y m d", "y m d"))
# [1] "2013-09-27 UTC" "0013-09-13 UTC"
# no

Is the select_formats argument the answer? I am sorry to say this, but I have a hard time understanding that section of the help file.

Update: I posted the question on the lubridate bug report, and got a rapid reply from @vitoshka: "This is a bug".

Recommended answer

It looks like a bug. I am not sure, so you should contact the maintainer.

Building the package from source and changing one line in this internal function (I replaced which.max with which.min):

.select_formats <-   function(trained){
  n_fmts <- nchar(gsub("[^%]", "", names(trained))) + grepl("%Y", names(trained))*1.5
  names(trained[ which.min(n_fmts) ]) ## replace which.max  by which.min
}

seems to correct the problem. Frankly, I don't know why this works, but I guess it is a kind of ranking.
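To make the "kind of ranking" concrete, here is a minimal sketch in base R that applies the same scoring expression to two candidate formats. The trained vector below is a hypothetical stand-in (its names and counts are assumptions for illustration, not actual lubridate output). It suggests that which.max favours the format containing %Y, which would be consistent with "27-09-13" being read as year 0013, though it does not prove that this is the whole story inside lubridate.

```r
# Hypothetical stand-in for the `trained` vector: names are candidate
# formats, values are made-up match counts (assumption, not lubridate output).
trained <- c("%d-%m-%Y" = 2, "%d-%m-%y" = 2)

# Same scoring expression as in .select_formats:
# count the "%" directives, then add 1.5 if the format contains %Y.
n_fmts <- nchar(gsub("[^%]", "", names(trained))) +
  grepl("%Y", names(trained)) * 1.5
print(n_fmts)

# Original line (which.max) versus the changed line (which.min)
picked_max <- names(trained[which.max(n_fmts)])
picked_min <- names(trained[which.min(n_fmts)])
print(picked_max)  # "%d-%m-%Y"
print(picked_min)  # "%d-%m-%y"
```

Both formats contain three directives, so the 1.5 bonus for %Y is the only thing separating their scores here.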

# after the change to .select_formats:
parse_date_time(c("27-09-13", "27-09-2013"), orders = c("d m y", "d m Y"))
# [1] "2013-09-27 UTC" "2013-09-27 UTC"

parse_date_time(c("2013-09-27", "13-09-13"), orders = c("Y m d", "y m d"))
# [1] "2013-09-27 UTC" "2013-09-13 UTC"
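If rebuilding the package is not an option, one possible workaround is to decide per element whether the year has two or four digits and parse each group with its own format in base R. This is only a sketch under stated assumptions: the helper name parse_dmy_mixed is made up for illustration, it assumes "-" as the separator and numeric months only (no b or B month names), and it is not necessarily the quick fix mentioned in the question.

```r
# Hypothetical helper (not part of lubridate): parse a mix of
# dd-mm-yy and dd-mm-yyyy strings using base R only.
parse_dmy_mixed <- function(x) {
  # TRUE where the string ends in a four-digit year
  four_digit <- grepl("^[0-9]{1,2}-[0-9]{1,2}-[0-9]{4}$", x)

  # Start with all-NA dates, then fill each group with its own format
  out <- rep(as.Date(NA), length(x))
  out[four_digit]  <- as.Date(x[four_digit],  format = "%d-%m-%Y")
  out[!four_digit] <- as.Date(x[!four_digit], format = "%d-%m-%y")
  out
}

res <- parse_dmy_mixed(c("27-09-13", "27-09-2013"))
print(res)
# [1] "2013-09-27" "2013-09-27"
```

Note that %y in base R maps two-digit years 00 to 68 onto 20xx and 69 to 99 onto 19xx, so older dates written with two digits may need separate handling.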
