/** @page LBGrpCreate Creating a Group
Navigate back: \ref index "Main" / \ref GettingStarted / \ref LearnBasics
<hr>
\section secLBGrpCreate Creating a group
An HDF5 group is a structure containing zero or more HDF5 objects. The two primary HDF5 objects are groups and datasets. To create a group, the calling program must:
<ol>
<li>Obtain the location identifier where the group is to be created.</li>
<li>Create the group.</li>
<li>Close the group.</li>
</ol>
To create a group, the calling program must call #H5Gcreate.
To close the group, #H5Gclose must be called. The close call is mandatory.
For example:
<em>C</em>
\code
group_id = H5Gcreate(file_id, "/MyGroup", H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
status = H5Gclose (group_id);
\endcode
<em>Fortran</em>
\code
CALL h5gcreate_f (loc_id, name, group_id, error)
CALL h5gclose_f (group_id, error)
\endcode
\section secLBGrpCreateRWEx Programming Example
\subsection secLBGrpCreateRWExDesc Description
See \ref LBExamples for the examples used in the \ref LearnBasics tutorial.
The example shows how to create and close a group. It creates a file called <code style="background-color:whitesmoke;">group.h5</code> in C
(<code style="background-color:whitesmoke;">groupf.h5</code> for FORTRAN), creates a group called MyGroup in the root group, and then closes the group and file.
For details on compiling an HDF5 application:
[ \ref LBCompiling ]
\subsection secLBGrpCreateRWExCont File Contents
Shown below are the contents and the group definition of <code style="background-color:whitesmoke;">group.h5</code> (created by the C program).
(The FORTRAN program creates the HDF5 file <code style="background-color:whitesmoke;">groupf.h5</code> and the resulting DDL shows the filename
<code style="background-color:whitesmoke;">groupf.h5</code> in the first line.)
<table>
<caption>The Contents of group.h5.</caption>
<tr>
<td>
\image html imggrpcreate.gif
</td>
</tr>
</table>
<em>group.h5 in DDL</em>
\code
HDF5 "group.h5" {
GROUP "/" {
   GROUP "MyGroup" {
   }
}
}
\endcode
<hr>
Previous Chapter \ref LBAttrCreate - Next Chapter \ref LBGrpCreateNames
Navigate back: \ref index "Main" / \ref GettingStarted / \ref LearnBasics
@page LBGrpCreateNames Creating Groups using Absolute and Relative Names
Navigate back: \ref index "Main" / \ref GettingStarted / \ref LearnBasics
<hr>
Recall that to create an HDF5 object, we have to specify the location where the object is to be created.
This location is determined by the identifier of an HDF5 object and the name of the object to be created.
The name of the created object can be either an absolute name or a name relative to the specified identifier.
In the previous example, we used the file identifier and the absolute name <code style="background-color:whitesmoke;">/MyGroup</code> to create a group.
In this section, we discuss HDF5 names and show how to use absolute and relative names.
\section secLBGrpCreateNames Names
HDF5 object names are a slash-separated list of components. There are few restrictions on names: component
names may be any length except zero and may contain any character except slash (<code style="background-color:whitesmoke;">/</code>) and the null terminator.
A full name may be composed of any number of component names separated by slashes, with any of the component
names being the special name <code style="background-color:whitesmoke;">.</code> (a dot or period). A name which begins with a slash is an <em>absolute name</em> which
is accessed beginning with the root group of the file; all other names are <em>relative names</em> and the named
object is accessed beginning with the specified group. A special case is the name <code style="background-color:whitesmoke;">/</code> (or equivalent) which
refers to the root group.
Functions which operate on names generally take a location identifier, which can be either a file identifier
or a group identifier, and perform the lookup with respect to that location. Several possibilities are
described in the following table:
<table>
<tr>
<th><strong>Location Type</strong></th>
<th><strong>Object Name</strong></th>
<th><strong>Description</strong></th>
</tr>
<tr>
<th><strong>File identifier</strong></th>
<td>/foo/bar</td>
<td>The object bar in group foo in the root group.</td>
</tr>
<tr>
<th><strong>Group identifier</strong></th>
<td>/foo/bar</td>
<td>The object bar in group foo in the root group of the file containing the specified group.
In other words, the group identifier's only purpose is to specify a file.</td>
</tr>
<tr>
<th><strong>File identifier</strong></th>
<td>/</td>
<td>The root group of the specified file.</td>
</tr>
<tr>
<th><strong>Group identifier</strong></th>
<td>/</td>
<td>The root group of the file containing the specified group.</td>
</tr>
<tr>
<th><strong>Group identifier</strong></th>
<td>foo/bar</td>
<td>The object bar in group foo in the specified group.</td>
</tr>
<tr>
<th><strong>File identifier</strong></th>
<td>.</td>
<td>The root group of the file.</td>
</tr>
<tr>
<th><strong>Group identifier</strong></th>
<td>.</td>
<td>The specified group.</td>
</tr>
<tr>
<th><strong>Other identifier</strong></th>
<td>.</td>
<td>The specified object.</td>
</tr>
</table>
\section secLBGrpCreateNamesEx Programming Example
\subsection secLBGrpCreateNamesExDesc Description
See \ref LBExamples for the examples used in the \ref LearnBasics tutorial.
The example code shows how to create groups using absolute and relative names. It creates three groups: the first two groups are created using
the file identifier and the group absolute names while the third group is created using a group identifier and a name relative to the specified group.
For details on compiling an HDF5 application:
[ \ref LBCompiling ]
\subsection secLBGrpCreateNamesExRem Remarks
#H5Gcreate creates a group at the location specified by a location identifier and a name. The location identifier
can be a file identifier or a group identifier and the name can be relative or absolute.
The first #H5Gcreate/h5gcreate_f creates the group <code style="background-color:whitesmoke;">MyGroup</code> in the root group of the specified file.
The second #H5Gcreate/h5gcreate_f creates the group <code style="background-color:whitesmoke;">Group_A</code> in the group <code style="background-color:whitesmoke;">MyGroup</code> in the root group of the specified
file. Note that the parent group (<code style="background-color:whitesmoke;">MyGroup</code>) already exists.
The third #H5Gcreate/h5gcreate_f creates the group <code style="background-color:whitesmoke;">Group_B</code> in the specified group.
\subsection secLBGrpCreateNamesExCont File Contents
Shown below are the contents and the group definition of <code style="background-color:whitesmoke;">groups.h5</code> (created by the C program).
(The FORTRAN program creates the HDF5 file <code style="background-color:whitesmoke;">groupsf.h5</code> and the resulting DDL shows the filename
<code style="background-color:whitesmoke;">groupsf.h5</code> in the first line.)
<table>
<caption>The Contents of groups.h5.</caption>
<tr>
<td>
\image html imggrps.gif
</td>
</tr>
</table>
<em>groups.h5 in DDL</em>
\code
HDF5 "groups.h5" {
GROUP "/" {
   GROUP "MyGroup" {
      GROUP "Group_A" {
      }
      GROUP "Group_B" {
      }
   }
}
}
\endcode
<hr>
Previous Chapter \ref LBGrpCreate - Next Chapter \ref LBGrpDset
Navigate back: \ref index "Main" / \ref GettingStarted / \ref LearnBasics
@page LBGrpDset Creating Datasets in Groups
Navigate back: \ref index "Main" / \ref GettingStarted / \ref LearnBasics
<hr>
\section secLBGrpDset Datasets in Groups
We have shown how to create groups, datasets, and attributes. In this section, we show how to create
datasets in groups. Recall that #H5Dcreate creates a dataset at the location specified by a location
identifier and a name. Similar to #H5Gcreate, the location identifier can be a file identifier or a
group identifier and the name can be relative or absolute. The location identifier and the name
together determine the location where the dataset is to be created. If the location identifier and
name refer to a group, then the dataset is created in that group.
\section secLBGrpDsetEx Programming Example
\subsection secLBGrpDsetExDesc Description
See \ref LBExamples for the examples used in the \ref LearnBasics tutorial.
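The example shows how to create a dataset in a particular group. It opens the file created in the previous example and creates two datasets: <code style="background-color:whitesmoke;">dset1</code> in the group <code style="background-color:whitesmoke;">MyGroup</code> and <code style="background-color:whitesmoke;">dset2</code> in the group <code style="background-color:whitesmoke;">MyGroup/Group_A</code>.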
For details on compiling an HDF5 application:
[ \ref LBCompiling ]
\subsection secLBGrpDsetExCont File Contents
Shown below are the contents and the definition of <code style="background-color:whitesmoke;">groups.h5</code> (created by the C program).
(The FORTRAN program creates the HDF5 file <code style="background-color:whitesmoke;">groupsf.h5</code> and the resulting DDL shows the filename
<code style="background-color:whitesmoke;">groupsf.h5</code> in the first line.)
<table>
<caption>The contents of the file groups.h5 (groupsf.h5 for FORTRAN)</caption>
<tr>
<td>
\image html imggrpdsets.gif
</td>
</tr>
</table>
<em>groups.h5 in DDL</em>
\code
HDF5 "groups.h5" {
GROUP "/" {
   GROUP "MyGroup" {
      GROUP "Group_A" {
         DATASET "dset2" {
            DATATYPE { H5T_STD_I32BE }
            DATASPACE { SIMPLE ( 2, 10 ) / ( 2, 10 ) }
            DATA {
               1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
               1, 2, 3, 4, 5, 6, 7, 8, 9, 10
            }
         }
      }
      GROUP "Group_B" {
      }
      DATASET "dset1" {
         DATATYPE { H5T_STD_I32BE }
         DATASPACE { SIMPLE ( 3, 3 ) / ( 3, 3 ) }
         DATA {
            1, 2, 3,
            1, 2, 3,
            1, 2, 3
         }
      }
   }
}
}
\endcode
<em>groupsf.h5 in DDL</em>
\code
HDF5 "groupsf.h5" {
GROUP "/" {
   GROUP "MyGroup" {
      GROUP "Group_A" {
         DATASET "dset2" {
            DATATYPE { H5T_STD_I32BE }
            DATASPACE { SIMPLE ( 10, 2 ) / ( 10, 2 ) }
            DATA {
               1, 1,
               2, 2,
               3, 3,
               4, 4,
               5, 5,
               6, 6,
               7, 7,
               8, 8,
               9, 9,
               10, 10
            }
         }
      }
      GROUP "Group_B" {
      }
      DATASET "dset1" {
         DATATYPE { H5T_STD_I32BE }
         DATASPACE { SIMPLE ( 3, 3 ) / ( 3, 3 ) }
         DATA {
            1, 1, 1,
            2, 2, 2,
            3, 3, 3
         }
      }
   }
}
}
\endcode
<hr>
Previous Chapter \ref LBGrpCreateNames - Next Chapter \ref LBDsetSubRW
Navigate back: \ref index "Main" / \ref GettingStarted / \ref LearnBasics
@page LBDsetSubRW Reading From or Writing To a Subset of a Dataset
Navigate back: \ref index "Main" / \ref GettingStarted / \ref LearnBasics
<hr>
\section secLBDsetSubRW Dataset Subsets
There are two ways that you can select a subset in an HDF5 dataset and read or write to it:
<ul><li>
<strong>Hyperslab Selection</strong>: The #H5Sselect_hyperslab call selects a logically contiguous
collection of points in a dataspace, or a regular pattern of points or blocks in a dataspace.
</li><li>
<strong>Element Selection</strong>: The #H5Sselect_elements call selects elements in an array.
</li></ul>
HDF5 allows you to read from or write to a portion or subset of a dataset by:
\li Selecting a Subset of the Dataset's Dataspace,
\li Selecting a Memory Dataspace,
\li Reading From or Writing to a Dataset Subset.
\section secLBDsetSubRWSel Selecting a Subset of the Dataset's Dataspace
First you must obtain the dataspace of a dataset in a file by calling #H5Dget_space.
Then select a subset of that dataspace by calling #H5Sselect_hyperslab. The <em>offset</em>, <em>count</em>, <em>stride</em>
and <em>block</em> parameters of this API define the shape and size of the selection. They must be arrays
with the same number of dimensions as the rank of the dataset's dataspace. These arrays <strong>ALL</strong> work
together to define a selection. A change to one of these arrays can affect the others.
\li \em offset: An array that specifies the offset of the starting element of the specified hyperslab.
\li \em count: An array that determines how many blocks to select from the dataspace in each dimension. If the block
size for a dimension is one then the count is the number of elements along that dimension.
\li \em stride: An array that allows you to sample elements along a dimension. For example, a stride of one (or NULL)
will select every element along a dimension, a stride of two will select every other element, and a stride of three
will select an element after every two elements.
\li \em block: An array that determines the size of the element block selected from a dataspace. If the block size
is one or NULL then the block size is a single element in that dimension.
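To see how the four arrays interact, it can help to enumerate the indices that a selection covers along a single dimension. The function below is a plain C sketch that does not call the HDF5 library, and its name <code style="background-color:whitesmoke;">hyperslab_indices_1d</code> is hypothetical. It assumes a stride at least as large as the block size so that blocks do not overlap.

```c
#include <stddef.h>

// Hypothetical helper: writes the indices selected along one dimension
// into `out` (at most `cap` entries are stored) and returns the total
// number of selected indices.
size_t hyperslab_indices_1d(size_t offset, size_t stride, size_t count,
                            size_t block, size_t *out, size_t cap)
{
    size_t n = 0;

    for (size_t b = 0; b < count; b++) {      // one iteration per block
        for (size_t e = 0; e < block; e++) {  // elements inside the block
            if (n < cap)
                out[n] = offset + b * stride + e;
            n++;
        }
    }
    return n;
}
```

With offset 1, stride 2, count 3 and block 1, the function selects indices 1, 3 and 5. With offset 0, stride 3, count 2 and block 2, it selects 0, 1, 3 and 4.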
\section secLBDsetSubRWMem Selecting a Memory Dataspace
You must select a memory dataspace in addition to a file dataspace before you can read a subset from or write a subset
to a dataset. A memory dataspace can be specified by calling #H5Screate_simple.
The memory dataspace passed to the read or write call must contain the same number of elements as the file dataspace.
The number of elements in a dataspace selection can be determined with the #H5Sget_select_npoints API.
\section secLBDsetSubRWSub Reading From or Writing To a Dataset Subset
To read from or write to a dataset subset, the #H5Dread and #H5Dwrite routines are used. The memory and file dataspace
identifiers from the selections that were made are passed into the read or write call. For example (C):
\code
status = H5Dwrite (.., .., memspace_id, dataspace_id, .., ..);
\endcode
\section secLBDsetSubRWProg Programming Example
\subsection subsecLBDsetSubRWProgDesc Description
See \ref LBExamples for the examples used in the \ref LearnBasics tutorial.
The example creates an 8 x 10 integer dataset in an HDF5 file. It then selects and writes to a 3 x 4 subset
of the dataset created with the dimensions offset by 1 x 2. (If using Fortran, the dimensions will be swapped.
The dataset will be 10 x 8, the subset will be 4 x 3, and the offset will be 2 x 1.)
PLEASE NOTE that the examples and images below were created using C.
The following image shows the dataset that gets written originally, and the subset of data that gets modified
afterwards. Dimension 0 is vertical and Dimension 1 is horizontal as shown below:
<table>
<tr>
<td>
\image html LBDsetSubRWProg.png
</td>
</tr>
</table>
The subset on the right above is created using these values for offset, count, stride, and block:
\code
offset = {1, 2}
count = {3, 4}
stride = {1, 1}
block = {1, 1}
\endcode
\subsection subsecLBDsetSubRWProgExper Experiments with Different Selections
Following are examples of changes that can be made to the example code provided to better understand
how to make selections.
\subsubsection subsubsecLBDsetSubRWProgExperOne Example 1
By default the example code will select and write to a 3 x 4 subset. You can modify the count
parameter in the example code to select a different subset, by changing the value of
DIM0_SUB (C, C++) / dim0_sub (Fortran) near the top. Change its value to 7 to create a 7 x 4 subset:
<table>
<tr>
<td>
\image html imgLBDsetSubRW11.png
</td>
</tr>
</table>
If you were to change the subset to 8 x 4, the selection would be beyond the extent of the dimension:
<table>
<tr>
<td>
\image html imgLBDsetSubRW12.png
</td>
</tr>
</table>
The write will fail with the error: "<strong>file selection+offset not within extent</strong>"
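The reason for the failure can be worked out with simple arithmetic: along each dimension, the last selected index is offset + (count - 1) * stride + (block - 1), and it must be smaller than the dimension size. The plain C sketch below expresses that check. The function name <code style="background-color:whitesmoke;">hyperslab_fits</code> is hypothetical, and the sketch only handles selections with at least one block of at least one element.

```c
// Hypothetical helper: returns 1 when the hyperslab fits inside a
// dimension of size `dim`, and 0 when it does not.
int hyperslab_fits(unsigned long dim, unsigned long offset, unsigned long stride,
                   unsigned long count, unsigned long block)
{
    unsigned long last;

    if (count == 0 || block == 0)
        return 0; // empty selections are outside the scope of this sketch

    // Index of the last element touched by the selection.
    last = offset + (count - 1) * stride + (block - 1);
    return last < dim;
}
```

For dimension 0 of the example (size 8, offset 1, stride 1, block 1), a count of 7 ends at index 7 and fits, whereas a count of 8 ends at index 8 and does not.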
\subsubsection subsubsecLBDsetSubRWProgExperTwo Example 2
In the example code provided, the memory and file dataspaces passed to the H5Dwrite call have the
same size, 3 x 4 (DIM0_SUB x DIM1_SUB). Change the size of the memory dataspace to be 4 x 4 so that
they do not match, and then compile:
\code
dimsm[0] = DIM0_SUB + 1;
dimsm[1] = DIM1_SUB;
memspace_id = H5Screate_simple (RANK, dimsm, NULL);
\endcode
The code will fail with the error: "<strong>src and dest data spaces have different sizes</strong>"
How many elements are in the memory and file dataspaces that were specified above? Add these lines:
\code
hssize_t size;
/* Just before the H5Dwrite call, add the following */
size = H5Sget_select_npoints (memspace_id);
printf ("\nmemspace_id size: %lld\n", (long long)size);
size = H5Sget_select_npoints (dataspace_id);
printf ("dataspace_id size: %lld\n", (long long)size);
\endcode
You should see these lines followed by the error:
\code
memspace_id size: 16
dataspace_id size: 12
\endcode
\subsubsection subsubsecLBDsetSubRWProgExperThree Example 3
This example shows the selection that results from changing the values of the <em>offset</em>, <em>count</em>,
<em>stride</em> and <em>block</em> parameters in the example code.
This will select two blocks. The <em>count</em> array specifies the number of blocks. The <em>block</em> array
specifies the size of a block. The <em>stride</em> must be modified to accommodate the block <em>size</em>.
<table>
<tr>
<td>
\image html imgLBDsetSubRW31.png
</td>
</tr>
</table>
Now try modifying the count as shown below. The write will fail because the selection goes beyond the extent of the dimension:
<table>
<tr>
<td>
\image html imgLBDsetSubRW32.png
</td>
</tr>
</table>
If the offset were 1x1 (instead of 1x2), then the selection can be made:
<table>
<tr>
<td>
\image html imgLBDsetSubRW33.png
</td>
</tr>
</table>
The selections above were tested with the
<a href="https://\AEXURL/howto/subset/h5_subsetbk.c">h5_subsetbk.c</a>
example code. The memory dataspace was defined as one-dimensional.
\subsection subsecLBDsetSubRWProgRem Remarks
\li In addition to #H5Sselect_hyperslab, this example introduces the #H5Dget_space call to obtain the dataspace of a dataset.
\li If using the default values for the stride and block parameters of #H5Sselect_hyperslab, then, for C you can specify NULL
for these parameters, rather than passing in an array for each, and for Fortran 90 you can omit these parameters.
<hr>
Previous Chapter \ref LBGrpDset - Next Chapter \ref LBDatatypes
Navigate back: \ref index "Main" / \ref GettingStarted / \ref LearnBasics
@page LBDatatypes Datatype Basics
Navigate back: \ref index "Main" / \ref GettingStarted / \ref LearnBasics
<hr>
\section secLBDtype What is a Datatype?
A datatype is a collection of datatype properties which provide complete information for data conversion to or from that datatype.
Datatypes in HDF5 can be grouped as follows:
\li <strong>Pre-Defined Datatypes</strong>: These are datatypes that are created by HDF5. They are actually opened
(and closed) by HDF5, and can have a different value from one HDF5 session to the next.
\li <strong>Derived Datatypes</strong>: These are datatypes that are created or derived from the pre-defined datatypes.
Although created from pre-defined types, they represent a category unto themselves. An example of a commonly used derived
datatype is a string of more than one character.
\section secLBDtypePre Pre-defined Datatypes
The properties of pre-defined datatypes are:
\li Pre-defined datatypes are opened and closed by HDF5.
\li A pre-defined datatype is a handle and is NOT PERSISTENT. Its value can be different from one HDF5 session to the next.
\li Pre-defined datatypes are Read-Only.
\li As mentioned, other datatypes can be derived from pre-defined datatypes.
There are two types of pre-defined datatypes, standard (file) and native.
<h4>Standard</h4>
A standard (or file) datatype can be:
<ul>
<li><strong>Atomic</strong>: A datatype which cannot be decomposed into smaller datatype units at the API level.
The atomic datatypes are:
<ul>
<li>integer</li>
<li>float</li>
<li>string (1-character)</li>
<li>date and time</li>
<li>bitfield</li>
<li>reference</li>
<li>opaque</li>
</ul>
</li>
<li><strong>Composite</strong>: An aggregation of one or more datatypes.
Composite datatypes include:
<ul>
<li>array</li>
<li>variable length</li>
<li>enumeration</li>
<li>compound datatypes</li>
</ul>
Array, variable length, and enumeration datatypes are defined in terms of a single atomic datatype,
whereas a compound datatype is a datatype composed of a sequence of datatypes.
</li>
</ul>
<table>
<tr>
<th><strong>Notes</strong></th>
</tr>
<tr>
<td>
\li Standard pre-defined datatypes are the SAME on all platforms.
\li They are the datatypes that you see in an HDF5 file.
\li They are typically used when creating a dataset.
</td>
</tr>
</table>
<h4>Native</h4>
Native pre-defined datatypes are used for memory operations, such as reading and writing. They are
NOT THE SAME on different platforms. They are similar to C type names, and are aliased to the
appropriate HDF5 standard pre-defined datatype for a given platform.
For example, on an Intel-based PC, #H5T_NATIVE_INT is aliased to the standard pre-defined type,
#H5T_STD_I32LE. On a MIPS machine, it is aliased to #H5T_STD_I32BE.
<table>
<tr>
<th><strong>Notes</strong></th>
</tr>
<tr>
<td>
\li Native datatypes are NOT THE SAME on all platforms.
\li Native datatypes simplify memory operations (read/write). The HDF5 library automatically converts as needed.
\li Native datatypes are NOT in an HDF5 File. The standard pre-defined datatype that a native datatype corresponds
to is what you will see in the file.
</td>
</tr>
</table>
<h4>Pre-Defined</h4>
The following table shows the native types and the standard pre-defined datatypes they correspond
to. (Keep in mind that HDF5 can convert between datatypes, so you can specify a buffer of a larger
type for a dataset of a given type. For example, you can read a dataset that has a short datatype
into a long integer buffer.)
<table>
<caption>Some HDF5 pre-defined native datatypes and corresponding standard (file) type</caption>
<tr>
<th><strong>C Type</strong></th>
<th><strong>HDF5 Memory Type</strong></th>
<th><strong>HDF5 File Type*</strong></th>
</tr>
<tr>
<th colspan="3"><strong>Integer</strong></th>
</tr>
<tr>
<td>int</td>
<td>#H5T_NATIVE_INT</td>
<td>#H5T_STD_I32BE or #H5T_STD_I32LE</td>
</tr>
<tr>
<td>short</td>
<td>#H5T_NATIVE_SHORT</td>
<td>#H5T_STD_I16BE or #H5T_STD_I16LE</td>
</tr>
<tr>
<td>long</td>
<td>#H5T_NATIVE_LONG</td>
<td>#H5T_STD_I32BE, #H5T_STD_I32LE,
#H5T_STD_I64BE or #H5T_STD_I64LE</td>
</tr>
<tr>
<td>long long</td>
<td>#H5T_NATIVE_LLONG</td>
<td>#H5T_STD_I64BE or #H5T_STD_I64LE</td>
</tr>
<tr>
<td>unsigned int</td>
<td>#H5T_NATIVE_UINT</td>
<td>#H5T_STD_U32BE or #H5T_STD_U32LE</td>
</tr>
<tr>
<td>unsigned short</td>
<td>#H5T_NATIVE_USHORT</td>
<td>#H5T_STD_U16BE or #H5T_STD_U16LE</td>
</tr>
<tr>
<td>unsigned long</td>
<td>#H5T_NATIVE_ULONG</td>
<td>#H5T_STD_U32BE, #H5T_STD_U32LE,
#H5T_STD_U64BE or #H5T_STD_U64LE</td>
</tr>
<tr>
<td>unsigned long long</td>
<td>#H5T_NATIVE_ULLONG</td>
<td>#H5T_STD_U64BE or #H5T_STD_U64LE</td>
</tr>
<tr>
<th colspan="3"><strong>Float</strong></th>
</tr>
<tr>
<td>_Float16</td>
<td>#H5T_NATIVE_FLOAT16</td>
<td>#H5T_IEEE_F16BE or #H5T_IEEE_F16LE</td>
</tr>
<tr>
<td>float</td>
<td>#H5T_NATIVE_FLOAT</td>
<td>#H5T_IEEE_F32BE or #H5T_IEEE_F32LE</td>
</tr>
<tr>
<td>double</td>
<td>#H5T_NATIVE_DOUBLE</td>
<td>#H5T_IEEE_F64BE or #H5T_IEEE_F64LE</td>
</tr>
</table>
<table>
<caption>Some HDF5 pre-defined native datatypes and corresponding standard (file) type</caption>
<tr>
<th><strong>F90 Type</strong></th>
<th><strong>HDF5 Memory Type</strong></th>
<th><strong>HDF5 File Type*</strong></th>
</tr>
<tr>
<td>integer</td>
<td>H5T_NATIVE_INTEGER</td>
<td>#H5T_STD_I32BE(8,16) or #H5T_STD_I32LE(8,16)</td>
</tr>
<tr>
<td>real</td>
<td>H5T_NATIVE_REAL</td>
<td>#H5T_IEEE_F32BE or #H5T_IEEE_F32LE</td>
</tr>
<tr>
<td>double-precision</td>
<td>#H5T_NATIVE_DOUBLE</td>
<td>#H5T_IEEE_F64BE or #H5T_IEEE_F64LE</td>
</tr>
</table>
<table>
<tr>
<td>* Note that the HDF5 File Types listed are those that are most commonly created.
The file type created depends on the compiler switches and platforms being
used. For example, on the Cray an integer is 64-bit, and using #H5T_NATIVE_INT (C)
or H5T_NATIVE_INTEGER (F90) would result in an #H5T_STD_I64BE file type.</td>
</tr>
</table>
The following code is an example of when you would use standard pre-defined datatypes vs. native types:
\code
#include "hdf5.h"

int main(void) {
    hid_t   file_id, dataset_id, dataspace_id;
    herr_t  status;
    hsize_t dims[2] = {4, 6};
    int     i, j, dset_data[4][6];

    /* Fill the buffer with the values 1 to 24 */
    for (i = 0; i < 4; i++)
        for (j = 0; j < 6; j++)
            dset_data[i][j] = i * 6 + j + 1;

    file_id = H5Fcreate ("dtypes.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    dataspace_id = H5Screate_simple (2, dims, NULL);

    /* Standard (file) datatype when creating the dataset */
    dataset_id = H5Dcreate (file_id, "/dset", H5T_STD_I32BE, dataspace_id,
                            H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    /* Native (memory) datatype when writing */
    status = H5Dwrite (dataset_id, H5T_NATIVE_INT, H5S_ALL, H5S_ALL,
                       H5P_DEFAULT, dset_data);

    status = H5Dclose (dataset_id);
    status = H5Sclose (dataspace_id);
    status = H5Fclose (file_id);
    return 0;
}
\endcode
By using the native types when reading and writing, the code that reads from or writes to a dataset
can be the same for different platforms.
Can native types also be used when creating a dataset? Yes. However, just be aware that the resulting
datatype in the file will be one of the standard pre-defined types and may be different than expected.
What happens if you do not use the correct native datatype for a standard (file) datatype? Your data
may be incorrect or not what you expect.
\section secLBDtypeDer Derived Datatypes
ANY pre-defined datatype can be used to derive user-defined datatypes.
To create a datatype derived from a pre-defined type:
<ol>
<li>Make a copy of the pre-defined datatype:
\code
tid = H5Tcopy (H5T_STD_I32BE);
\endcode
</li>
<li>Change the datatype.</li>
</ol>
There are numerous datatype functions that allow a user to alter a pre-defined datatype. See
\ref subsecLBDtypeSpecStr below for a simple example.
Refer to the \ref H5T in the \ref RM. Example functions are #H5Tset_size and #H5Tset_precision.
\section secLBDtypeSpec Specific Datatypes
On the \ref ExAPI
page under \ref sec_exapi_dtypes
you will find many example programs for creating and reading datasets with different datatypes.
Below is additional information on some of the datatypes. See
the \ref ExAPI
page for examples of these datatypes.
\subsection subsecLBDtypeSpec Array Datatype vs Array Dataspace
#H5T_ARRAY is a datatype, and it should not be confused with the dataspace of a dataset. The dataspace
of a dataset can consist of a regular array of elements. For example, the datatype for a dataset
could be an atomic datatype like integer, and the dataset could be an N-dimensional appendable array,
as specified by the dataspace. See #H5Screate and #H5Screate_simple for details.
Unlimited dimensions and subsetting are not supported when using the #H5T_ARRAY datatype.
The #H5T_ARRAY datatype was primarily created to address the simple case of a compound datatype
when all members of the compound datatype are of the same type and there is no need to subset by
compound datatype members. Creation of such a datatype is more efficient and I/O also requires
less work, because there is no alignment involved.
\subsection subsecLBDtypeSpecArr Array Datatype
The array class of datatypes, #H5T_ARRAY, allows the construction of true, homogeneous,
multi-dimensional arrays. Since these are homogeneous arrays, each element of the array
will be of the same datatype, designated at the time the array is created.
Users may confuse this datatype with a dataset that has a simple atomic
datatype (e.g., integer) and an array-shaped dataspace. See \ref subsecLBDtypeSpec for more information.
Arrays can be nested. Not only is an array datatype used as an element of an HDF5 dataset,
but the elements of an array datatype may be of any datatype, including another array datatype.
Array datatypes <strong>cannot be subdivided for I/O</strong>; the entire array must be transferred from one
dataset to another.
Within certain limitations, outlined in the next paragraph, array datatypes may be N-dimensional
and of any dimension size. <strong>Unlimited dimensions, however, are not supported</strong>. Functionality similar
to unlimited dimension arrays is available through the use of variable-length datatypes.
The maximum number of dimensions, i.e., the maximum rank, of an array datatype is specified by
the HDF5 library constant #H5S_MAX_RANK. The minimum rank is 1 (one). All dimension sizes must
be greater than 0 (zero).
One array datatype may only be converted to another array datatype if the number of dimensions
and the sizes of the dimensions are equal and the datatype of the first array's elements can be
converted to the datatype of the second array's elements.
\subsubsection subsubsecLBDtypeSpecArrAPI Array Datatype APIs
There are three functions that are specific to array datatypes: one, #H5Tarray_create, for creating
an array datatype, and two, #H5Tget_array_ndims and #H5Tget_array_dims
for working with existing array datatypes.
<h4>Creating</h4>
The function #H5Tarray_create creates a new array datatype object. Parameters specify
\li the base datatype of each element of the array,
\li the rank of the array, i.e., the number of dimensions,
\li the size of each dimension, and
\li in the deprecated #H5Tarray_create1 signature only, the dimension permutation of the array, i.e., whether the elements of the array are listed in C or FORTRAN order.
<h4>Working with existing array datatypes</h4>
When working with existing arrays, one must first determine the rank, or number of dimensions, of the array.
The function #H5Tget_array_ndims returns the rank of a specified array datatype.
In many instances, one needs further information. The function #H5Tget_array_dims retrieves the
permutation of the array and the size of each dimension.
\subsection subsecLBDtypeSpecCmpd Compound
\subsubsection subsubsecLBDtypeSpecCmpdProp Properties of compound datatypes
A compound datatype is similar to a struct in C or a common block in Fortran. It is a collection of
one or more atomic types or small arrays of such types. To create and use a compound datatype,
you need to refer to various properties of the compound datatype:
\li It is of class compound.
\li It has a fixed total size, in bytes.
\li It consists of zero or more members (defined in any order) with unique names and which occupy non-overlapping regions within the datum.
\li Each member has its own datatype.
\li Each member is referenced by an index number between zero and N-1, where N is the number of members in the compound datatype.
\li Each member has a name which is unique among its siblings in a compound datatype.
\li Each member has a fixed byte offset, which is the first byte (smallest byte address) of that member in a compound datatype.
\li Each member can be a small array of up to four dimensions.
Properties of members of a compound datatype are defined when the member is added to the compound type and cannot be subsequently modified.
\subsubsection subsubsecLBDtypeSpecCmpdDef Defining compound datatypes
Compound datatypes must be built out of other datatypes. First, one creates an empty compound
datatype and specifies its total size. Then members are added to the compound datatype in any order.
Member names. Each member must have a descriptive name, which is the key used to uniquely identify
the member within the compound datatype. A member name in an HDF5 datatype does not necessarily
have to be the same as the name of the corresponding member in the C struct in memory, although
this is often the case. Nor does one need to define all members of the C struct in the HDF5
compound datatype (or vice versa).
Offsets. Usually a C struct will be defined to hold a data point in memory, and the offsets of the
members in memory will be the offsets of the struct members from the beginning of an instance of the
struct. The library defines the #HOFFSET macro to compute the offset of a member within a struct:
\code
HOFFSET(s,m)
\endcode
This macro computes the offset of member m within a struct type s. Current HDF5 releases define it
in terms of the standard C offsetof macro, so s should be a type name rather than a variable.
Here is an example in which a compound datatype is created to describe complex numbers whose type
is defined by the complex_t struct.
\code
typedef struct {
    double re; /* real part */
    double im; /* imaginary part */
} complex_t;

hid_t complex_id = H5Tcreate(H5T_COMPOUND, sizeof(complex_t));
H5Tinsert(complex_id, "real", HOFFSET(complex_t, re), H5T_NATIVE_DOUBLE);
H5Tinsert(complex_id, "imaginary", HOFFSET(complex_t, im), H5T_NATIVE_DOUBLE);
\endcode
\subsection subsecLBDtypeSpecRef Reference
There are three types of Reference datatypes in HDF5:
\li \ref subsubsecLBDtypeSpecRefStd
\li \ref subsubsecLBDtypeSpecRefObj
\li \ref subsubsecLBDtypeSpecRefDset
\subsubsection subsubsecLBDtypeSpecRefStd Standard Reference
HDF5 references allow users to reference existing HDF5 objects as well as selections within datasets. The
original API, now deprecated, was extended in order to add the ability to reference attributes as well as objects in
external files.
The newer API introduced a single opaque reference type, which not only has the advantage of hiding the internal
representation of references, but it also allows for future extensions to be added more seamlessly. The newer API
introduces a single abstract #H5R_ref_t type as well as attribute references and external references
(i.e., references to objects in an external file).
A file, group, dataset, named datatype, or attribute may be the target of an object reference.
The object reference is created by
#H5Rcreate_object with a location identifier and the name of the target object. Unlike the deprecated
#H5Rcreate, the call takes no reference-type argument. The object does not have to be open to create a reference to it.
An object reference may also refer to a region (selection) of a dataset. The reference is created
with #H5Rcreate_region. The dataspace for the region can be retrieved with a call to #H5Ropen_region.
An object reference may also refer to an attribute. The reference is created
with #H5Rcreate_attr. #H5Ropen_attr can be used to open the attribute by returning an identifier
to the attribute just as if #H5Aopen had been called.
An object reference can be accessed by a call to #H5Ropen_object.
When the reference is to a dataset or dataset region, the #H5Ropen_object call returns an
identifier to the dataset just as if #H5Dopen had been called.
When the reference is to an attribute, the #H5Ropen_object call returns an
identifier to the attribute just as if #H5Aopen had been called.
The reference buffer from the #H5Rcreate_object call must be released by
using #H5Rdestroy to avoid resource leaks and possible HDF5 library shutdown issues. Any identifiers
returned by #H5Ropen_object must also be closed with the appropriate close call.
\subsubsection subsubsecLBDtypeSpecRefObj Reference to objects - Deprecated
In HDF5, objects (i.e., groups, datasets, and named datatypes) are usually accessed by name.
There is another way to access stored objects -- by reference.
An object reference is based on the relative file address of the object header in the file
and is constant for the life of the object. Once a reference to an object is created and
stored in a dataset in the file, it can be used to dereference the object it points to.
References are handy for creating a file index or for grouping related objects by storing
references to them in one dataset.
<h4>Creating and storing references to objects</h4>
The following steps are involved in creating and storing file references to objects:
<ol>
<li>Create the objects or open them if they already exist in the file.</li>
<li>Create a dataset to store the objects' references, by specifying #H5T_STD_REF_OBJ as the datatype.</li>
<li>Create and store references to the objects in a buffer, using #H5Rcreate.</li>
<li>Write a buffer with the references to the dataset, using #H5Dwrite with the #H5T_STD_REF_OBJ datatype.</li>
</ol>
<h4>Reading references and accessing objects using references</h4>
The following steps are involved:
<ol>
<li>Open the dataset with the references and read them. The #H5T_STD_REF_OBJ datatype must be used to describe the memory datatype.</li>
<li>Use the read reference to obtain the identifier of the object the reference points to using #H5Rdereference.</li>
<li>Open the dereferenced object and perform the desired operations.</li>
<li>Close all objects when the task is complete.</li>
</ol>
\subsubsection subsubsecLBDtypeSpecRefDset Reference to a dataset region - Deprecated
A dataset region reference points to a dataset selection in another dataset.
A reference to the dataset selection (region) is constant for the life of the dataset.
<h4>Creating and storing references to dataset regions</h4>
The following steps are involved in creating and storing references to a dataset region:
\li Create a dataset to store the dataset region (selection), by passing in #H5T_STD_REF_DSETREG for the datatype when calling #H5Dcreate.
\li Create selection(s) in existing dataset(s) using #H5Sselect_hyperslab and/or #H5Sselect_elements.
\li Create reference(s) to the selection(s) using #H5Rcreate and store them in a buffer.
\li Write the references to the dataset regions in the file.
\li Close all objects.
<h4>Reading references to dataset regions</h4>
The following steps are involved in reading references to dataset regions and referenced dataset regions (selections).
<ol>
<li>Open and read the dataset containing references to the dataset regions.
The datatype #H5T_STD_REF_DSETREG must be used during read operation.</li>
<li>Use #H5Rdereference to obtain the dataset identifier from the read dataset region reference.
OR
Use #H5Rget_region to obtain the dataspace identifier for the dataset containing the selection from the read dataset region reference.
</li>
<li>With the dataspace identifier, the \ref H5S interface functions, H5Sget_select_*,
can be used to obtain information about the selection.</li>
<li>Close all objects when they are no longer needed.</li>
</ol>
The dataset with the region references is read by #H5Dread with the #H5T_STD_REF_DSETREG datatype specified.
The read reference can be used to obtain the dataset identifier by calling #H5Rdereference, or to
obtain spatial information (dataspace and selection) by calling #H5Rget_region.
The reference to the dataset region has information for both the dataset itself and its selection. In both functions:
\li The first parameter is an identifier of the dataset with the region references.
\li The second parameter specifies the type of reference stored. In this example, a reference to the dataset region is stored.
\li The third parameter is a buffer containing the reference of the specified type.
Several H5Sget_select_* functions can be used to obtain information about selections:
<table>
<caption>H5Sget_select_* functions for querying selections</caption>
<tr>
<th><strong>Function</strong></th>
<th><strong>Description</strong></th>
</tr>
<tr>
<td>#H5Sget_select_npoints</td>
<td>Returns the number of elements in the selection</td>
</tr>
<tr>
<td>#H5Sget_select_hyper_nblocks</td>
<td>Returns the number of blocks in the hyperslab</td>
</tr>
<tr>
<td>#H5Sget_select_hyper_blocklist</td>
<td>Returns the "lower left" and "upper right" coordinates of the blocks in the hyperslab selection</td>
</tr>
<tr>
<td>#H5Sget_select_bounds</td>
<td>Returns the coordinates of the "minimal" block containing a hyperslab selection</td>
</tr>
<tr>
<td>#H5Sget_select_elem_npoints</td>
<td>Returns the number of points in the element selection</td>
</tr>
<tr>
<td>#H5Sget_select_elem_pointlist</td>
<td>Returns the coordinates of points in the element selection</td>
</tr>
</table>
\subsection subsecLBDtypeSpecStr String
A simple example of creating a derived datatype is using the string datatype,
#H5T_C_S1 (#H5T_FORTRAN_S1) to create strings of more than one character. Strings
can be stored as either fixed or variable length, and may have different rules
for padding of unused storage.