




After a lot of searching, I couldn't find a simple way to extract data from an .h5 file and pass it to a DataFrame, via NumPy or Pandas, so that I can save it in a .txt or .csv file.

import h5py
import numpy as np
import pandas as pd

filename = r'D:\data.h5'  # raw string so the backslash is not treated as an escape
f = h5py.File(filename, 'r')

# List all groups
print("Keys: %s" % f.keys())
a_group_key = list(f.keys())[0]

# Get the data
data = list(f[a_group_key])

Output:

Keys: <KeysViewHDF5 ['dd48']>


When I print data, I see the following results:



I would appreciate it if someone could explain what these are and how I can extract the data completely and save it in a .csv file. There doesn't seem to be a routine way to do that, and it is still somewhat challenging. So far I have only been able to see part of the data via:

import numpy as np
dfm = np.fromfile(r'D:\data.h5', dtype=float)
print(dfm.shape)

#dfm.to_csv('hi.csv', sep=',', header=None, index=None)


I expect to extract the time_stamps and measurements stored in the .h5 file.



h5py exposes HDF5 datasets as NumPy arrays. Your call to get the keys returns the names at the top level of the file (here the single group 'dd48'); calling keys() on that group then gives the dataset names. Once you have those, it is straightforward to read each dataset as a NumPy array and write it out. You need each dataset's dtype to know what the columns contain and format them correctly.
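To see which datasets a file holds and what dtype each one has, a small helper along these lines may be useful. It is only a sketch: the function name list_datasets is made up for illustration, and it assumes the file can be opened read-only with h5py.

```python
import h5py


def list_datasets(path):
    """Return (name, shape, dtype) for every dataset in the HDF5 file at `path`."""
    found = []

    def visit(name, obj):
        # visititems calls this for every group and dataset in the file;
        # keep only the datasets.
        if isinstance(obj, h5py.Dataset):
            found.append((name, obj.shape, obj.dtype))

    with h5py.File(path, "r") as h5f:
        h5f.visititems(visit)
    return found
```

Calling list_datasets(r'D:\data.h5') and printing each tuple should show the full path of every dataset inside the file (for example, names starting with dd48/), together with the shape and dtype you need to pick a CSV format.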


Updated 5/22/2019 to reflect the content of the data.h5 file posted at the link in the comment. The default format in np.savetxt() is '%.18e'. The code below uses very simple (crude) logic to choose the format from each dataset's dtype. General use requires more robust dtype checking and formatting. You will also need to add logic to decode Unicode strings.

import h5py
import numpy as np

filename = r'D:\data.h5'  # raw string so the backslash is not treated as an escape
h5f = h5py.File(filename, 'r')
# get a List of data sets in group 'dd48'
a_dset_keys = list(h5f['dd48'].keys())

# Get the data
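for dset in a_dset_keys:
    ds_data = h5f['dd48'][dset]
    print('dataset=', dset)
    print(ds_data.dtype)
    # Crude format selection based on the dataset's dtype
    if ds_data.dtype == 'float64':
        csvfmt = '%.18e'
    elif ds_data.dtype == 'int64':
        csvfmt = '%.10d'  # integers, zero-padded to at least 10 digits
    else:
        csvfmt = '%s'     # fallback for any other dtype
    np.savetxt('output_' + dset + '.csv', ds_data, fmt=csvfmt, delimiter=',')

h5f.close()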
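As a rough sketch of the more robust dtype check and string decoding mentioned above, the format can be chosen from the dtype's kind code rather than from an exact dtype name, and byte strings can be decoded before writing. The helper names csv_format_for and decode_if_bytes are hypothetical, UTF-8 is an assumed encoding, and the object-array branch assumes that variable-length strings arrive as Python bytes objects, which may not hold for every file.

```python
import numpy as np


def csv_format_for(dtype):
    """Pick an np.savetxt format string from the dtype's kind code."""
    kind = np.dtype(dtype).kind
    if kind == "f":      # any floating-point width
        return "%.18e"
    if kind in "iu":     # signed or unsigned integers of any width
        return "%d"
    return "%s"          # strings and anything else


def decode_if_bytes(arr, encoding="utf-8"):
    """Decode byte strings to str; return other arrays unchanged."""
    arr = np.asarray(arr)
    if arr.dtype.kind == "S":
        # Fixed-length byte strings
        return np.char.decode(arr, encoding)
    if arr.dtype.kind == "O":
        # Object arrays: decode only the elements that are bytes
        flat = [x.decode(encoding) if isinstance(x, bytes) else x
                for x in arr.ravel()]
        out = np.empty(len(flat), dtype=object)
        out[:] = flat
        return out.reshape(arr.shape)
    return arr
```

Inside the loop, one possible use is to read the dataset into memory with ds_data[()], pass the result through decode_if_bytes, and hand csv_format_for(ds_data.dtype) to np.savetxt as fmt.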

