(In pandas) Why is frequency information lost when stored in HDF5 as a table?



I am storing time series data in HDF5 format with pandas. Because I want to be able to access the data directly on disk, I am using the PyTables table format, with table=True when writing.


It appears that I then lose the frequency information on my TimeSeries objects after writing them to HDF5.


This can be seen by toggling the value of is_table in the script below:

import pandas as pd

is_table = False

times = pd.date_range('2000-1-1', periods=3, freq='H')
series = pd.Series(xrange(3), index=times)

print 'frequency before =', series.index.freq

frame = pd.DataFrame(series)

with pd.get_store('data/simple.h5') as store:
    store.put('data', frame, table=is_table)

with pd.get_store('data/simple.h5') as store:
    x = store['data']

print 'frequency after =', x[0].index.freq

is_table = False:

frequency before = <1 Hour>
frequency after = <1 Hour>

is_table = True:

frequency before = <1 Hour>
frequency after = None


Since PyTables provides a much richer storage mechanism, I would have expected it to preserve this information.


Is there a fundamental reason that PyTables cannot store, or reproduce, this information? Or is this possibly a bug in pandas?



I just confirmed with the pandas developers that this is not implemented in the current release.

See: https://github.com/pydata/pandas/issues/3499#issuecomment-17262905


I will update this answer when it becomes available.
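
Until then, one possible workaround is to re-infer the frequency from the timestamps after reading the data back. The sketch below is my own illustration, not something taken from the linked issue. It assumes the index is regularly spaced, and it simulates the "frequency lost" state by rebuilding the index from raw values instead of reading an actual HDF5 file, so the names lost and restored are hypothetical.

```python
import pandas as pd

# Hourly index as in the question; pd.offsets.Hour() avoids alias-spelling
# differences between pandas versions.
times = pd.date_range('2000-1-1', periods=3, freq=pd.offsets.Hour())

# Stand-in for a frame read back from a table-format store: rebuilding the
# index from raw values drops the freq attribute (an assumption used here
# so the sketch runs without an HDF5 file).
lost = pd.DataFrame({0: range(3)}, index=pd.DatetimeIndex(times.values))
print('frequency after reading =', lost.index.freq)

# Infer the frequency from the timestamps; infer_freq returns None when
# the spacing is irregular, in which case nothing can be restored.
inferred = pd.infer_freq(lost.index)
restored = lost.asfreq(inferred) if inferred is not None else lost
print('frequency restored =', restored.index.freq)
```

Note that pd.infer_freq needs at least three timestamps, and it only recovers a frequency that the data actually follows. If the original frequency must survive exactly, storing it separately alongside the data may be the safer choice.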

