Problem description
If a date vector has two-digit years, mdy() turns years between 00 and 68 into 21st-century years and years between 69 and 99 into 20th-century years. For example:
library(lubridate)
mdy(c("1/2/54","1/2/68","1/2/69","1/2/99","1/2/04"))
This gives the following output:
Multiple format matches with 5 successes: %m/%d/%y, %m/%d/%Y.
Using date format %m/%d/%y.
[1] "2054-01-02 UTC" "2068-01-02 UTC" "1969-01-02 UTC" "1999-01-02 UTC" "2004-01-02 UTC"
I can fix this after the fact by subtracting 100 years from the incorrect dates to turn 2054 and 2068 into 1954 and 1968. But is there a more elegant and less error-prone method of parsing two-digit years so that they get handled correctly in the parsing process itself?
Update: After @JoshuaUlrich pointed me to strptime, I found this question, which deals with an issue similar to mine, but using base R.
It seems like a nice addition to date handling in R would be some way to set the century cutoff for two-digit years within the date parsing functions themselves.
Recommended answer
Here is a function that allows you to do this:
library(lubridate)
x <- mdy(c("1/2/54","1/2/68","1/2/69","1/2/99","1/2/04"))
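# Reassign the century of each date: two-digit years above the cutoff
# (year %% 100) are placed in the 1900s, all others in the 2000s.
foo <- function(x, year = 1968) {
  m <- year(x) %% 100
  year(x) <- ifelse(m > year %% 100, 1900 + m, 2000 + m)
  x
}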
Try it out:
x
[1] "2054-01-02 UTC" "2068-01-02 UTC" "1969-01-02 UTC" "1999-01-02 UTC"
[5] "2004-01-02 UTC"
foo(x)
[1] "2054-01-02 UTC" "2068-01-02 UTC" "1969-01-02 UTC" "1999-01-02 UTC"
[5] "2004-01-02 UTC"
foo(x, 1950)
[1] "1954-01-02 UTC" "1968-01-02 UTC" "1969-01-02 UTC" "1999-01-02 UTC"
[5] "2004-01-02 UTC"
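The bit of magic here is the modulus operator %%, which returns the remainder of a division. So 1968 %% 100 yields 68, the two-digit year that serves as the cutoff.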
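The same cutoff idea can also be applied while parsing, rather than as a correction afterwards. The following is only a minimal sketch in base R: the helper name parse_mdy_cutoff and its cutoff argument are hypothetical (not part of lubridate or base R), and it assumes every input is a well-formed "m/d/yy" string.

```r
# Hypothetical helper (not part of lubridate or base R): parse "m/d/yy"
# strings and choose the century from a cutoff in a single step.
# Two-digit years greater than `cutoff` become 19yy, all others 20yy.
parse_mdy_cutoff <- function(s, cutoff = 68) {
  # Split each string into month, day and two-digit year.
  parts <- do.call(rbind, strsplit(s, "/", fixed = TRUE))
  mm <- as.integer(parts[, 1])
  dd <- as.integer(parts[, 2])
  yy <- as.integer(parts[, 3])

  # Expand the two-digit year using the cutoff.
  yyyy <- ifelse(yy > cutoff, 1900L + yy, 2000L + yy)

  # Rebuild an unambiguous ISO date string and convert it to Date.
  as.Date(sprintf("%04d-%02d-%02d", yyyy, mm, dd))
}

dates <- c("1/2/54", "1/2/68", "1/2/69", "1/2/99", "1/2/04")
parse_mdy_cutoff(dates)               # cutoff 68, same split as mdy()
parse_mdy_cutoff(dates, cutoff = 50)  # 54 and 68 now land in the 1900s
```

Unlike foo(), this sketch returns Date objects rather than POSIXct values, and it does no validation of malformed input.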