Python numpy: cannot convert datetime64[ns] to datetime64[D] (to use with Numba)

Question

I want to pass a datetime array to a Numba function (which cannot be vectorised and would otherwise be very slow). I understand that Numba supports numpy.datetime64. However, it seems to support datetime64[D] (day precision) but not datetime64[ns] (nanosecond precision). I learnt this the hard way: is it documented?

I tried to convert from datetime64[ns] to datetime64[D], but can't seem to find a way. Any ideas?

I have summarised my problem with the minimal code below. If you run testf(mydates), which is datetime64[D], it works fine. If you run testf(dates_input), which is datetime64[ns], it doesn't. Note that this example simply passes the dates to the Numba function, which doesn't (yet) do anything with them. I try to convert dates_input to datetime64[D], but the conversion doesn't work. In my original code I read from a SQL table into a pandas dataframe, and need a column which changes the day of each date to the 15th.

import numba
import numpy as np
import pandas as pd
import datetime

mydates =np.array(['2010-01-01','2011-01-02']).astype('datetime64[D]')
df=pd.DataFrame()
df["rawdate"]=mydates
df["month_15"] = df["rawdate"].apply(lambda r: datetime.date( r.year, r.month,15 ) )

dates_input = df["month_15"].astype('datetime64[D]')
print(dates_input.dtype)  # Why datetime64[ns] and not datetime64[D]?


@numba.jit(nopython=True)
def testf(dates):
    return 1

print(testf(mydates))

The error I get if I run testf(dates_input) is:

numba.typeinfer.TypingError: Failed at nopython (nopython frontend)
Var 'dates' unified to object: dates := {pyobject}

Recommended answer

Series.astype converts all date-like objects to datetime64[ns]. To convert to datetime64[D], use values to obtain a NumPy array before calling astype:

dates_input = df["month_15"].values.astype('datetime64[D]')
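The same conversion can be checked on a plain NumPy array, without pandas. This is a minimal sketch; the names ns_dates and day_dates are illustrative, and the nanosecond array stands in for what Series.values would return:

```python
import numpy as np

# A nanosecond-precision array, standing in for what Series.values returns.
ns_dates = np.array(['2010-01-15', '2011-01-15'], dtype='datetime64[ns]')

# On a NumPy array, astype honours the requested unit.
day_dates = ns_dates.astype('datetime64[D]')

print(ns_dates.dtype)   # datetime64[ns]
print(day_dates.dtype)  # datetime64[D]
```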


Note that NDFrames (such as Series and DataFrames) can only hold datetime-like objects as objects of dtype datetime64[ns]. (This held for pandas releases before 2.0, which added second, millisecond and microsecond units.) The automatic conversion of all datetime-likes to a common dtype simplifies subsequent date computations, but it makes it impossible to store, say, datetime64[s] objects in a DataFrame column. Pandas core developer Jeff Reback explains,


Also note that even though df['month_15'].astype('datetime64[D]') has dtype datetime64[ns]:

In [29]: df['month_15'].astype('datetime64[D]').dtype
Out[29]: dtype('<M8[ns]')

when you iterate through the items in the Series, you get pandas Timestamps, not datetime64[ns] values.

In [28]: df['month_15'].astype('datetime64[D]').tolist()
Out[28]: [Timestamp('2010-01-15 00:00:00'), Timestamp('2011-01-15 00:00:00')]

Therefore, it is not clear that Numba actually has a problem with datetime64[ns]; it might just have a problem with Timestamps. Sorry, I can't check this because I don't have Numba installed.

However, it might be useful for you to try

testf(df['month_15'].astype('datetime64[D]').values)

since df['month_15'].astype('datetime64[D]').values is truly a NumPy array of dtype datetime64[ns]:

In [31]: df['month_15'].astype('datetime64[D]').values.dtype
Out[31]: dtype('<M8[ns]')
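The difference between the items of a Series and its underlying array can be inspected without Numba. The following is a small sketch with illustrative names (s, items, arr, day_arr); the exact datetime unit pandas chooses may vary between versions, so only the dtype kind is printed for the raw array:

```python
import numpy as np
import pandas as pd

# A datetime Series similar to the question's month_15 column.
s = pd.Series(pd.to_datetime(['2010-01-15', '2011-01-15']))

items = s.tolist()  # Python list of pandas Timestamp objects
arr = s.values      # plain NumPy datetime64 array

print(type(items[0]).__name__)             # Timestamp
print(type(arr).__name__, arr.dtype.kind)  # ndarray M

# The array, unlike the Series, can be cast to day precision.
day_arr = arr.astype('datetime64[D]')
print(day_arr.dtype)  # datetime64[D]
```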

If passing the NumPy array works, then you don't have to convert everything to datetime64[D]; you just have to pass NumPy arrays, not Pandas Series, to testf.
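As an aside, the question's underlying task (moving each date to the 15th of its month) can also be sketched in NumPy alone, which avoids the per-row apply and yields a datetime64[D] array directly. This is not part of the original answer; the names raw, month_start and month_15 are illustrative:

```python
import numpy as np

raw = np.array(['2010-01-01', '2011-01-02'], dtype='datetime64[ns]')

# Truncate to month precision, then cast back to days:
# this gives the first day of each month.
month_start = raw.astype('datetime64[M]').astype('datetime64[D]')

# Adding 14 days to the 1st gives the 15th.
month_15 = month_start + np.timedelta64(14, 'D')

print(month_15.dtype)  # datetime64[D]
print(month_15)        # ['2010-01-15' '2011-01-15']
```

The result is already a day-precision NumPy array, so no further conversion should be needed before handing it to a function such as testf.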

