DESCRIPTION
HDF5 is a powerful and feature-rich creature, and getting the most out of it requires powerful tools. The MathWorks provides a "low-level" interface to the HDF5 library that closely corresponds to the C API and exposes much of its richness. This short tutorial will present ways to use the low-level MATLAB interface to build those tools and tackle such topics as subsetting, chunking, and compression.
© 2008 The MathWorks, Inc.
The MATLAB Low-Level HDF5 Interface
John Evans
Introduction: Two Solutions Provided
“High Level” Interface: HDF5READ, HDF5WRITE, HDF5INFO mex-files. Provides “one-stop shopping” for reading and writing HDF5 files. An easy introduction, but not flexible.
“Low Level” Interface: Over 200 functions that correspond to the HDF5 C API. Far more flexible, but the user needs to know the C API.
How We Differ From the C API
C API for H5Aread:
herr_t H5Aread(hid_t attr_id, hid_t mem_type_id, void *buf );
MATLAB API
buf = H5A.read(attr_id,mem_type_id);
No “herr_t” return values.
OUTPUTs on the left, INPUTs on the right.
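The calling convention above can be sketched with a small round trip. This is only an illustration: the scratch file name (built with tempname), the attribute name 'scale', and its value are hypothetical and do not come from the slides. The sketch assumes that a file identifier can serve as the attribute's location (the root group) and that a failed library call surfaces as an ordinary MATLAB error.

```matlab
% Hypothetical scratch file; nothing here comes from the original slides.
fname = [tempname '.h5'];
file_id = H5F.create(fname,'H5F_ACC_TRUNC','H5P_DEFAULT','H5P_DEFAULT');

% Attach a scalar double attribute named 'scale' to the root group.
space_id = H5S.create('H5S_SCALAR');
attr_id = H5A.create(file_id,'scale','H5T_NATIVE_DOUBLE',space_id,'H5P_DEFAULT');
H5A.write(attr_id,'H5ML_DEFAULT',2.5);

% Output on the left, inputs on the right: no buffer argument, no herr_t.
value = H5A.read(attr_id,'H5ML_DEFAULT');

H5A.close(attr_id);
H5S.close(space_id);
H5F.close(file_id);
delete(fname);

% With no status code to inspect, failures are handled with try/catch.
try
    H5F.open('no_such_file.h5','H5F_ACC_RDONLY','H5P_DEFAULT');
catch err
    disp(err.message);
end
```

Where the C API fills a caller-supplied buffer and returns a status, the MATLAB call simply returns the data, so error handling moves into try/catch.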
Help for H5A.read
>> help H5A.read
 H5A.read  HDF5 H5Aread library function

    attr = H5A.read(attr_id, dtype_id) reads the attribute specified by
    attr_id. dtype_id specifies the attribute's memory datatype. The memory
    datatype may be 'H5ML_DEFAULT', which specifies that MATLAB should
    determine the appropriate memory datatype.

    To use this function, you must be familiar with the information about
    the Attribute Interface contained in the User's Guide and Reference
    Manual for HDF5 version 1.6.5. This documentation may be obtained from
    the National Center for Supercomputing Applications (NCSA) at
    <http://hdf.ncsa.uiuc.edu> or The HDF Group at
    <http://www.hdfgroup.org/>.

    For more help on the Attribute Interface functions, type:

       help imagesci/@H5A
What's with the dot in “H5A.read”?
The MATLAB HDF5 functions are divided into “classes” that correspond to the HDF5 interfaces. One such class is “H5A”, which has “read” as a member function.
H5, H5A, H5D, H5E, H5F, H5G, H5I, H5ML(??), H5P, H5R, H5S, H5T, H5Z.
H5ML is a special MATLAB “helper” class.
Example 1: Opening a file
>> file_id = H5F.open('example.h5','H5F_ACC_RDONLY','H5P_DEFAULT');
>> whos
  Name         Size      Bytes  Class      Attributes

  file_id      1x1              H5ML.id

>> H5F.close(file_id)
>> mode = H5ML.get_constant_value('H5F_ACC_RDONLY')
mode =
     0
>> file_id = H5F.open('example.h5',0,'H5P_DEFAULT');
>> H5F.close(file_id);
Example 2: Creating a 2x3 Dataset
% Create the file and a 2x3 dataset of 32-bit integers.
file_id = H5F.create('tst.h5','H5F_ACC_TRUNC','H5P_DEFAULT','H5P_DEFAULT');

dims = [2 3];
space_id = H5S.create_simple(2,dims,[]);
dset_id = H5D.create(file_id,'foo','H5T_STD_I32LE',space_id,'H5P_DEFAULT');

data = [1 3 5; 2 4 6];
data = int32(data);

% Write the data, then close everything.
H5D.write(dset_id,'H5T_NATIVE_INT','H5S_ALL','H5S_ALL','H5P_DEFAULT',data);
H5D.close(dset_id);
H5F.close(file_id);

% Reopen the file and read the dataset back.
file_id = H5F.open('tst.h5','H5F_ACC_RDONLY','H5P_DEFAULT');
dset_id = H5D.open(file_id,'/foo');
outdata = H5D.read(dset_id,'H5ML_DEFAULT','H5S_ALL','H5S_ALL','H5P_DEFAULT');
H5D.close(dset_id);
H5F.close(file_id);
Example 2: Output (h5dump)
DATASET "foo" {
   DATATYPE  H5T_STD_I32LE
   DATASPACE  SIMPLE { ( 2, 3 ) / ( 2, 3 ) }
   DATA {
   (0,0): 1, 2, 3,
   (1,0): 4, 5, 6
   }
}
Example 2: Output (MATLAB)
data =
     1     3     5
     2     4     6
outdata =
     1     4
     2     5
     3     6
MATLAB Uses Column-Major Ordering: So What Should One Do?
Users can transpose their data before writing and after reading. This works, but inflicts a performance penalty.
The other option is to “flip” the dataspace, that is, reverse the order of its dimensions. There is no performance penalty here, but the user must remember that the dimensions will look reversed from the C API.
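The transpose approach can be sketched as a round trip. This is a minimal illustration, assuming a hypothetical scratch file created with tempname; the dataset name 'foo' and the 2x3 matrix are borrowed from the earlier example, and the dataspace is deliberately left unflipped.

```matlab
fname = [tempname '.h5'];   % hypothetical scratch file
data = int32([1 3 5; 2 4 6]);
dims = size(data);          % [2 3], the shape as the C API sees it

file_id = H5F.create(fname,'H5F_ACC_TRUNC','H5P_DEFAULT','H5P_DEFAULT');
space_id = H5S.create_simple(2,dims,[]);     % dataspace NOT flipped
dset_id = H5D.create(file_id,'foo','H5T_STD_I32LE',space_id,'H5P_DEFAULT');

% Transpose before writing, so the file's rows are 1 3 5 and 2 4 6.
H5D.write(dset_id,'H5T_NATIVE_INT','H5S_ALL','H5S_ALL','H5P_DEFAULT',data');

% The read returns a 3x2 array; transpose it to recover the 2x3 matrix.
raw = H5D.read(dset_id,'H5ML_DEFAULT','H5S_ALL','H5S_ALL','H5P_DEFAULT');
outdata = raw';

H5S.close(space_id);
H5D.close(dset_id);
H5F.close(file_id);
delete(fname);
```

Each transpose copies the whole array, which is where the performance penalty comes from on large datasets.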
Example 2: flipping the dataspace
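file_id = H5F.create('tst.h5','H5F_ACC_TRUNC','H5P_DEFAULT','H5P_DEFAULT');

data = [1 3 5; ...
        2 4 6];
data = int32(data);

% Flip the dimensions so the dataspace matches MATLAB's column-major layout.
dims = [2 3];
space_id = H5S.create_simple(2,fliplr(dims),[]);
dset_id = H5D.create(file_id,'foo',...
    'H5T_STD_I32LE',space_id,'H5P_DEFAULT');

H5D.write(dset_id,'H5T_NATIVE_INT', ...
    'H5S_ALL','H5S_ALL','H5P_DEFAULT',data);

H5D.close(dset_id);
H5F.close(file_id);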
Example 2: Output of h5dump
HDF5 "tst.h5" {
GROUP "/" {
   DATASET "foo" {
      DATATYPE  H5T_STD_I32LE
      DATASPACE  SIMPLE { ( 3, 2 ) / ( 3, 2 ) }
      DATA {
      (0,0): 1, 2,
      (1,0): 3, 4,
      (2,0): 5, 6
      }
   }
}
}
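The same flipping rule carries over to subsetting. The sketch below is an illustration, not part of the original slides: it reads a 2x3 block out of a 4x5 dataset with H5S.select_hyperslab. The scratch file, the dataset name 'bar', and the 4x5 test matrix are hypothetical. The zero-based start and the block size are written in MATLAB order and then passed through fliplr, just like the dataspace dimensions.

```matlab
fname = [tempname '.h5'];             % hypothetical scratch file
data = reshape(int32(1:20),4,5);      % hypothetical 4x5 test matrix

file_id = H5F.create(fname,'H5F_ACC_TRUNC','H5P_DEFAULT','H5P_DEFAULT');
space_id = H5S.create_simple(2,fliplr(size(data)),[]);
dset_id = H5D.create(file_id,'bar','H5T_STD_I32LE',space_id,'H5P_DEFAULT');
H5D.write(dset_id,'H5T_NATIVE_INT','H5S_ALL','H5S_ALL','H5P_DEFAULT',data);

% Select rows 2:3 and columns 3:5 (MATLAB indexing).
start = [1 2];                        % zero-based start, MATLAB order
block = [2 3];                        % size of the subset, MATLAB order
file_space_id = H5D.get_space(dset_id);
H5S.select_hyperslab(file_space_id,'H5S_SELECT_SET', ...
    fliplr(start),[],[],fliplr(block));

% The memory dataspace describes the shape of the returned array.
mem_space_id = H5S.create_simple(2,fliplr(block),[]);
subset = H5D.read(dset_id,'H5ML_DEFAULT',mem_space_id,file_space_id,'H5P_DEFAULT');

H5S.close(mem_space_id);
H5S.close(file_space_id);
H5S.close(space_id);
H5D.close(dset_id);
H5F.close(file_id);
delete(fname);
```

Only the selected block is read from the file, so there is no need to load the whole dataset and index into it afterwards.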
Example 3: Compression
file_id = H5F.create('tst.h5','H5F_ACC_TRUNC','H5P_DEFAULT','H5P_DEFAULT');

dims = [2000 3000];
data = int32(zeros(2000,3000));

% Remember to flip the dataspace.
space_id = H5S.create_simple(2,fliplr(dims),[]);

dcpl_id = H5P.create('H5P_DATASET_CREATE');
H5P.set_deflate(dcpl_id,6);

% Flip the chunking as well!
H5P.set_chunk(dcpl_id,fliplr([200 300]));

dset_id = H5D.create(file_id,'foo','H5T_NATIVE_INT',space_id,dcpl_id);

H5D.write(dset_id,'H5T_NATIVE_INT','H5S_ALL','H5S_ALL','H5P_DEFAULT',data);

H5P.close(dcpl_id);
H5D.close(dset_id);
H5F.close(file_id);
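One way to check that the deflate filter took effect is to compare the dataset's stored size with its raw size. The sketch below is a hedged illustration that repeats the compression example against a hypothetical scratch file and then queries H5D.get_storage_size. The file is closed and reopened first, on the assumption that this guarantees all chunks have been flushed to disk.

```matlab
fname = [tempname '.h5'];             % hypothetical scratch file
dims = [2000 3000];
data = zeros(dims,'int32');

file_id = H5F.create(fname,'H5F_ACC_TRUNC','H5P_DEFAULT','H5P_DEFAULT');
space_id = H5S.create_simple(2,fliplr(dims),[]);
dcpl_id = H5P.create('H5P_DATASET_CREATE');
H5P.set_chunk(dcpl_id,fliplr([200 300]));   % compression requires chunking
H5P.set_deflate(dcpl_id,6);
dset_id = H5D.create(file_id,'foo','H5T_NATIVE_INT',space_id,dcpl_id);
H5D.write(dset_id,'H5T_NATIVE_INT','H5S_ALL','H5S_ALL','H5P_DEFAULT',data);
H5P.close(dcpl_id);
H5S.close(space_id);
H5D.close(dset_id);
H5F.close(file_id);

% Reopen the file and compare the stored size with the raw size.
file_id = H5F.open(fname,'H5F_ACC_RDONLY','H5P_DEFAULT');
dset_id = H5D.open(file_id,'/foo');
stored_bytes = H5D.get_storage_size(dset_id);
raw_bytes = numel(data) * 4;          % 4 bytes per int32 element
H5D.close(dset_id);
H5F.close(file_id);
delete(fname);

fprintf('raw: %d bytes, stored: %d bytes\n', raw_bytes, stored_bytes);
```

An all-zero array is the best case for deflate; real data will usually compress far less, and the chunk shape affects both the ratio and the read performance.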
Compounds
HDF5 "tst.h5" {
GROUP "/" {
   DATASET "ArrayOfStructures" {
      DATATYPE  H5T_COMPOUND {
         H5T_STD_I32LE "a";
         H5T_IEEE_F64LE "c";
         H5T_IEEE_F32LE "b";
      }
      DATASPACE  SIMPLE { ( 10 ) / ( 10 ) }
      DATA {
      (0): { 0, 1, 0 },
      (1): { 1, 0.5, 1 },
      ...
Compounds: continued
>> d = h5varget('tst.h5','/ArrayOfStructures')
d =
    a: [10x1 int32]
    c: [10x1 double]
    b: [10x1 single]
>> d.a
ans =
     0
     1
     2
     3
     4
     5
     6
     7
     8
     9
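h5varget comes from the HDF5 Tools package, not from the low-level interface. A roughly equivalent low-level round trip might look like the sketch below. It is an illustration under stated assumptions: the scratch file and the field values are hypothetical (the slides show only the first two records), the compound type is packed by hand from native member sizes, and H5D.read with 'H5ML_DEFAULT' is assumed to return a struct with one array per member.

```matlab
fname = [tempname '.h5'];             % hypothetical scratch file
n = 10;
s.a = int32(0:n-1)';                  % hypothetical values for illustration
s.c = 1 ./ (1:n)';
s.b = single(0:n-1)';

% Build a packed compound type with members a, c, b.
int_t = H5T.copy('H5T_NATIVE_INT');
dbl_t = H5T.copy('H5T_NATIVE_DOUBLE');
flt_t = H5T.copy('H5T_NATIVE_FLOAT');
sz = [H5T.get_size(int_t) H5T.get_size(dbl_t) H5T.get_size(flt_t)];
offset = [0 cumsum(sz(1:2))];
cmpd_t = H5T.create('H5T_COMPOUND',sum(sz));
H5T.insert(cmpd_t,'a',offset(1),int_t);
H5T.insert(cmpd_t,'c',offset(2),dbl_t);
H5T.insert(cmpd_t,'b',offset(3),flt_t);

file_id = H5F.create(fname,'H5F_ACC_TRUNC','H5P_DEFAULT','H5P_DEFAULT');
space_id = H5S.create_simple(1,n,[]);
dset_id = H5D.create(file_id,'ArrayOfStructures',cmpd_t,space_id,'H5P_DEFAULT');

% A struct of equally sized arrays maps onto an array of compound records.
H5D.write(dset_id,cmpd_t,'H5S_ALL','H5S_ALL','H5P_DEFAULT',s);
d = H5D.read(dset_id,'H5ML_DEFAULT','H5S_ALL','H5S_ALL','H5P_DEFAULT');

H5T.close(cmpd_t);
H5T.close(int_t);
H5T.close(dbl_t);
H5T.close(flt_t);
H5S.close(space_id);
H5D.close(dset_id);
H5F.close(file_id);
delete(fname);
```

The result is a structure of arrays, not an array of structures, which matches the shape of the h5varget output above.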
References
MathWorks: www.mathworks.com
HDF5 Tools: google “matlab hdf5tools”