I query a MongoDB database and get back a DataFrame with a values column, where each value is a dict keyed by 10-minute timestamps:

date        values
2017-04-30  {u'00:00:00': u'13.12', u'00:10:00': .....u'23:50:00': u'12.34'}
2017-05-01  {u'00:00:00': u'15.56', u'00:10:00': .....u'23:50:00': u'15.78'}
2017-05-02  {u'00:00:00': u'11.34', u'00:10:00': .....u'23:50:00': u'13.21'}
2017-05-03  {u'00:00:00': u'17.07', u'00:10:00': .....u'23:50:00': u'14.22'}
2017-05-04  {u'00:00:00': u'18.03', u'00:10:00': .....u'23:50:00': u'18.09'}


I want to convert this into a Series or a list of tuples that is easy to plot:

[(2017-04-30 00:00:00, 13.12)....... (2017-05-04 23:50:00, 18.09)]


My current approach involves some complicated functions and a lot of data wrangling. Is there perhaps a simpler way to do this in pandas?

Best answer

Option 1
A fairly compact comprehension

from pandas import to_datetime as todt
from pandas import to_timedelta as totd

[
    [
        (str(todt(r.date) + totd(k)), v) for k, v in r.values.items()
    ]
    for r in df.itertuples()
]
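
Option 1 returns one list per day, and both the timestamps and the values stay as strings. If a single flat list of (Timestamp, float) pairs is closer to what the question asks for, a minimal sketch could look like the following. The small df here is a made-up stand-in with only three time keys per day, and the float conversion assumes every value is a numeric string.

```python
import pandas as pd

# Hypothetical stand-in for the queried frame: two days, three time slots each.
df = pd.DataFrame({
    'date': pd.to_datetime(['2017-04-30', '2017-05-01']),
    'values': [
        {'00:00:00': '13.12', '00:10:00': '12.90', '23:50:00': '12.34'},
        {'00:00:00': '15.56', '00:10:00': '15.60', '23:50:00': '15.78'},
    ],
})

# Build one flat list: add each time-of-day offset to its date and
# convert the string reading to a float.
pairs = [
    (date + pd.to_timedelta(k), float(v))
    for date, day in zip(df['date'], df['values'])
    for k, v in day.items()
]

# Dict key order is not guaranteed to be chronological, so sort explicitly.
pairs.sort()
```

Iterating with zip over the two columns sidesteps attribute access on the row tuples, which matters here only as a precaution because the column is named values.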


Option 2
A more pandas-like solution

import pandas as pd
from pandas import to_timedelta as totd

# Expand each dict into columns, name the column axis 'time', then stack to long form
d1 = df.set_index('date')['values'] \
       .apply(pd.Series).rename_axis('time', axis=1) \
       .stack().reset_index(name='val')

d1.assign(
    datetime=d1.date + totd(d1.time)
).groupby('date').apply(
    lambda x: list(zip(x.datetime, x.val))
)


date
2017-04-30    [(2017-04-30 00:00:00, 12.84), (2017-04-30 00:...
2017-05-01    [(2017-05-01 00:00:00, 14.32), (2017-05-01 00:...
2017-05-02    [(2017-05-02 00:00:00, 16.86), (2017-05-02 00:...
2017-05-03    [(2017-05-03 00:00:00, 10.82), (2017-05-03 00:...
2017-05-04    [(2017-05-04 00:00:00, 9.93), (2017-05-04 00:1...
dtype: object
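
For plotting, a single Series with a DatetimeIndex may be more convenient than one list of tuples per date. The following is a sketch of that variation rather than part of the original answer. The df is again a made-up stand-in with three time keys per day, and the names wide, long, idx and series are chosen only for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the queried frame: two days, three time slots each.
df = pd.DataFrame({
    'date': pd.to_datetime(['2017-04-30', '2017-05-01']),
    'values': [
        {'00:00:00': '13.12', '00:10:00': '12.90', '23:50:00': '12.34'},
        {'00:00:00': '15.56', '00:10:00': '15.60', '23:50:00': '15.78'},
    ],
})

# One column per time key, one row per date, then stack to long form
# so the index becomes (date, time-of-day string).
wide = pd.DataFrame(df['values'].tolist(), index=df['date'])
long = wide.stack()

# Combine both index levels into full timestamps.
idx = long.index.get_level_values(0) + pd.to_timedelta(long.index.get_level_values(1))

# Convert the string readings to floats and order them chronologically.
series = pd.Series(long.astype(float).to_numpy(), index=idx, name='val').sort_index()
```

With matplotlib installed, series.plot() should then draw the whole range on a time axis, and list(series.items()) gives the flat list of tuples if that form is still needed.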


Setup

import numpy as np
import pandas as pd

# 144 ten-minute slots covering one day; only the time of day is used
tidx = pd.date_range('1970-01-01', periods=6 * 24, freq='10min')
dates = pd.date_range('2017-04-30', periods=5)
df = pd.DataFrame(dict(
        date=dates,
        values=[
            {d.strftime('%H:%M:%S'): str((np.random.rand(1) * 20).round(2)[0]) for d in tidx}
            for _ in range(5)
        ]
    ))

For "python - How to unpack datetime values in a DataFrame?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43746800/
