/** \page VFLTN HDF5 Virtual File Layer
\section sec_vfl_intro Introduction
The HDF5 file format describes how HDF5 data structures and dataset raw data are mapped
to a linear format address space and the HDF5 library implements that bidirectional mapping
in terms of an API. However, the HDF5 format specifications do not indicate how the format
address space is mapped onto storage and HDF (version 5 and earlier) simply mapped the format
address space directly onto a single file by convention.
Since early versions of HDF5 it has been apparent that users want the ability to map the
format address space onto different types of storage (a single file, multiple files, local
memory, global memory, network distributed global memory, a network protocol, etc.) with
various types of maps. For instance, some users want to be able to handle very large format
address spaces on operating systems that support only 2GB files by partitioning the format
address space into equal-sized parts each served by a separate file. Other users want the
same multi-file storage capability but want to partition the address space according to
purpose (raw data in one file, object headers in another, global heap in a third, etc.)
in order to improve I/O speeds.
In fact, the number of storage variations is probably larger than the number of methods
that the HDF5 team is capable of implementing and supporting. Therefore, a Virtual File
Layer API is being implemented which will allow application teams or departments to design
and implement their own mapping between the HDF5 format address space and storage, with each
mapping being a separate file driver (possibly written in terms of other file drivers). The
HDF5 team will provide a small set of useful file drivers which will also serve as examples
for those who wish to write their own:
<table>
<tr>
<td>#H5FD_SEC2</td><td>This is the default driver which uses POSIX file-system functions
like read and write to perform I/O to a single file. All I/O requests are unbuffered
although the driver does optimize file seeking operations to some extent.
</td>
</tr>
<tr>
<td>#H5FD_STDIO</td><td>This driver uses functions from 'stdio.h' to perform buffered I/O to a single file.
</td>
</tr>
<tr>
<td>#H5FD_CORE</td><td>This driver performs I/O directly to memory and can be
used to create small temporary files that never exist on permanent storage. This
type of storage is generally very fast since the I/O consists only of memory-to-memory copy operations.
</td>
</tr>
<tr>
<td>#H5FD_MPIO</td><td>This is the driver of choice for accessing files in parallel
using MPI and MPI-IO. It is only predefined if the library is compiled with parallel I/O support.
</td>
</tr>
<tr>
<td>#H5FD_FAMILY</td><td>Large format address spaces are partitioned into more
manageable pieces and sent to separate storage locations using an underlying driver
of the user's choice. \ref H5TOOL_RT_UG can be used to change the sizes of the family
members when stored as files or to convert a family of files to a single file or vice versa.
</td>
</tr>
</table>
\section sec_vfl_use Using a File Driver
Most application writers will use a driver defined by the HDF5 library or contributed by another
programming team. This chapter describes how existing drivers are used.
\subsection subsec_vfl_use_hdr Driver Header Files
Each file driver is defined in its own public header file which should be included by any
application which plans to use that driver. The predefined drivers are in header files whose
names begin with 'H5FD' followed by the driver name and '.h'. The 'hdf5.h' header file includes
all the predefined driver header files.
Once the appropriate header file is included a symbol of the form 'H5FD_' followed by the
upper-case driver name will be the driver identification number. (The driver name is by convention
and might not apply to drivers which are not distributed with HDF5.) However, the value may
change if the library is closed (e.g., by calling #H5close) and the symbol is referenced again.
\subsection subsec_vfl_use_create Creating and Opening Files
In order to create or open a file one must define the method by which the storage is
accessed (the access method also indicates how to translate the storage name to a storage server
such as a file, network protocol, or memory). One does so by creating a file access property
list (the term "file access property list" is a misnomer since storage isn't required to be a file),
which is passed to the #H5Fcreate or #H5Fopen function. A default file access property list is created
by calling #H5Pcreate and then the file driver information is inserted by calling a driver initialization
function such as #H5Pset_fapl_family:
\code
hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
hsize_t member_size = 100*1024*1024; // 100MB
H5Pset_fapl_family(fapl, member_size, H5P_DEFAULT);
hid_t file = H5Fcreate("foo%05d.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
H5Pclose(fapl);
\endcode
Each file driver will have its own initialization function whose name is H5Pset_fapl_ followed by
the driver name and which takes a file access property list as the first argument followed by additional
driver-dependent arguments.
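The name 'foo%05d.h5' passed to #H5Fcreate in the family example above is a printf-style template from which the name of each family member is generated. The following is only a sketch of that expansion in plain C: the helper <code>member_name</code> is hypothetical and not part of the HDF5 API, and the check that rejects templates without a conversion is an assumption about what a driver would want to enforce.

```c
#include <stdio.h>
#include <string.h>

// Hypothetical helper, not part of HDF5: expand a printf-style family
// name template such as "foo%05d.h5" into the name of member `index`.
// Returns 0 on success, -1 if the result does not fit in `out` or if the
// template yields the same name for two different members.
static int
member_name(const char *tmpl, int index, char *out, size_t out_size)
{
    char other[256];
    int  n = snprintf(out, out_size, tmpl, index);

    if (n < 0 || (size_t)n >= out_size)
        return -1;

    // A template without a conversion produces identical names for
    // every member, so compare against the name of the next member.
    n = snprintf(other, sizeof other, tmpl, index + 1);
    if (n < 0 || (size_t)n >= sizeof other)
        return -1;

    return strcmp(out, other) == 0 ? -1 : 0;
}
```

With the template 'foo%05d.h5', member 0 would be named 'foo00000.h5' and member 12 'foo00012.h5'; a name such as 'foo.h5' is rejected by this sketch because every member would map to the same file.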
An alternative to using the driver initialization function is to set the driver directly using the
#H5Pset_driver function. (This function is overloaded to operate on data transfer property lists also, as described below.)
Its second argument is the file driver identifier, which may have a different numeric value from run to run
depending on the order in which the file drivers are registered with the library. The third argument encapsulates
the additional arguments of the driver initialization function. This method only works if the file driver
writer has made the driver-specific property list structure a public datatype, which is often not the case.
\code
hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
static H5FD_family_fapl_t fa = {100*1024*1024, H5P_DEFAULT};
H5Pset_driver(fapl, H5FD_FAMILY, &fa);
hid_t file = H5Fcreate("foo%05d.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
H5Pclose(fapl);
\endcode
It is also possible to query the file driver information from a file access property list by
calling #H5Pget_driver to determine the driver and then calling a driver-defined query function
to obtain the driver information:
\code
hid_t driver = H5Pget_driver(fapl);
if (H5FD_SEC2 == driver) {
    // nothing further to get
} else if (H5FD_FAMILY == driver) {
    hid_t   member_fapl;
    hsize_t member_size;
    H5Pget_fapl_family(fapl, &member_size, &member_fapl);
} else if (....) {
    ....
}
\endcode
\subsection subsec_vfl_use_per Performing I/O
The #H5Dread and #H5Dwrite functions transfer data between application memory and the file. They both take
an optional data transfer property list which has some general driver-independent properties and optional
driver-defined properties. An application will typically perform I/O in one of three styles via the
#H5Dread or #H5Dwrite function:
Like file access properties in the previous section, data transfer properties can be set using a driver
initialization function or a general purpose function. For example, to set the MPI-IO driver to use
independent access for I/O operations one would say:
\code
hid_t dxpl = H5Pcreate(H5P_DATA_XFER);
H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_INDEPENDENT);
H5Dread(dataset, type, mspace, fspace, dxpl, buffer);
H5Pclose(dxpl);
\endcode
The alternative is to initialize a driver defined C struct and pass it to the #H5Pset_driver function:
\code
hid_t dxpl = H5Pcreate(H5P_DATA_XFER);
static H5FD_mpio_dxpl_t dx = {H5FD_MPIO_INDEPENDENT};
H5Pset_driver(dxpl, H5FD_MPIO, &dx);
H5Dread(dataset, type, mspace, fspace, dxpl, buffer);
H5Pclose(dxpl);
\endcode
The transfer property list can be queried in a manner similar to the file access property list: the driver
provides a function (or functions) to return various information about the transfer property list:
\code
hid_t driver = H5Pget_driver(dxpl);
if (H5FD_MPIO == driver) {
    H5FD_mpio_xfer_t xfer_mode;
    H5Pget_dxpl_mpio(dxpl, &xfer_mode);
} else {
    ....
}
\endcode
\subsection subsec_vfl_use_inter File Driver Interchangeability
The HDF5 specifications describe two things: the mapping of data onto a linear format address
space and the C API which performs the mapping. However, the mapping of the format address space
onto storage intentionally falls outside the scope of the HDF5 specs. This is a direct result of the
fact that it is not generally possible to store information about how to access storage inside the
storage itself. For instance, given only the file name '/arborea/1225/work/f%03d' the HDF5 library
is unable to tell whether the name refers to a file on the local file system, a family of files on
the local file system, a file on host 'arborea' port 1225, a family of files on a remote system, etc.
Two ways in which the library could figure out where the storage is located are: storage access information
can be provided by the user, or the library can try all known file access methods. This implementation
uses the former method.
In general, if a file was created with one driver then it isn't possible to open it with another driver.
There are of course exceptions: a file created with MPIO could probably be opened with the sec2 driver,
any file created by the sec2 driver could be opened as a family of files with one member, etc. In fact,
sometimes a file must not only be opened with the same driver but also with the same driver properties.
The predefined drivers are written in such a way that specifying the correct driver is sufficient for
opening a file.
\section sec_vfl_imp Implementation of a Driver
A driver is simply a collection of functions and data structures which are registered with the HDF5
library at runtime. The functions fall into these categories:
\li Functions which operate on modes
\li Functions which operate on files
\li Functions which operate on the address space
\li Functions which operate on data
\li Functions for driver initialization
\li Optimization functions
\subsection subsec_vfl_imp_mode Mode Functions
Some drivers need information about file access and data transfers which are very specific to the driver.
The information is usually implemented as a pair of pointers to C structs which are allocated and
initialized as part of an HDF5 property list and passed down to various driver functions. There are two
classes of settings: file access modes that describe how to access the file through the driver, and
data transfer modes which are settings that control I/O operations. Each file opened by a particular
driver may have a different access mode; each dataset I/O request for a particular file may have a
different data transfer mode.
Since each driver has its own particular requirements for various settings, each driver is responsible
for defining the mode structures that it needs. Higher layers of the library treat the structures as
opaque but must be able to copy and free them. Thus, the driver provides either the size of the
structure or a pair of function pointers for each of the mode types.
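The "size or callbacks" rule above can be sketched as follows. This is only an illustration in plain C: the <code>mode_ops_t</code> struct and the <code>mode_copy</code> and <code>mode_free</code> helpers are hypothetical stand-ins, not the library's actual class structure, and the fallback order (callback first, then byte copy) is an assumption.

```c
#include <stdlib.h>
#include <string.h>

// Hypothetical stand-in for the per-mode part of a driver class: either
// `size` is non-zero (plain struct, a byte copy is enough) or the driver
// supplies `copy` and `free_fn` callbacks.
typedef struct mode_ops_t {
    size_t size;
    void *(*copy)(const void *mode);
    int   (*free_fn)(void *mode);
} mode_ops_t;

// Copy an opaque mode struct the way a higher layer might.
static void *
mode_copy(const mode_ops_t *ops, const void *mode)
{
    void *dup;

    if (!mode)
        return NULL;
    if (ops->copy)
        return ops->copy(mode); // deep copy defined by the driver
    if (0 == ops->size)
        return NULL;            // the driver gave no way to copy
    dup = malloc(ops->size);
    if (dup)
        memcpy(dup, mode, ops->size);
    return dup;
}

// Release a mode struct obtained from mode_copy().
static int
mode_free(const mode_ops_t *ops, void *mode)
{
    if (mode && ops->free_fn)
        return ops->free_fn(mode);
    free(mode);
    return 0;
}
```

A driver whose mode struct holds only plain values would fill in <code>size</code> and leave both callbacks NULL; a driver whose struct owns other resources, like the family driver described next, would need the callbacks.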
Example: The family driver needs to know how the format address space is partitioned and the file
access property list to use for the family members.
\code
// Driver-specific file access properties
typedef struct H5FD_family_fapl_t {
    hsize_t memb_size;    // size of each family member
    hid_t   memb_fapl_id; // file access property list for each family member
} H5FD_family_fapl_t;
// Driver-specific data transfer properties
typedef struct H5FD_family_dxpl_t {
    hid_t memb_dxpl_id; // data xfer property list of each member
} H5FD_family_dxpl_t;
\endcode
In order to copy or free one of these structures the member file access or data transfer properties must
also be copied or freed. This is done by providing a copy and close function for each structure:
Example: The file access property list copy and close functions for the family driver:
\code
static void *
H5FD_family_fapl_copy(const void *_old_fa)
{
    const H5FD_family_fapl_t *old_fa = (const H5FD_family_fapl_t*)_old_fa;
    H5FD_family_fapl_t       *new_fa = malloc(sizeof(H5FD_family_fapl_t));
    assert(new_fa);
    memcpy(new_fa, old_fa, sizeof(H5FD_family_fapl_t));
    new_fa->memb_fapl_id = H5Pcopy(old_fa->memb_fapl_id);
    return new_fa;
}
static herr_t
H5FD_family_fapl_free(void *_fa)
{
    H5FD_family_fapl_t *fa = (H5FD_family_fapl_t*)_fa;
    H5Pclose(fa->memb_fapl_id);
    free(fa);
    return 0;
}
\endcode
Generally when a file is created or opened the file access properties for the driver are copied into the
file pointer which is returned and they may be modified from their original value (for instance, the file
family driver modifies the member size property when opening an existing family). In order to support the
#H5Fget_access_plist function the driver must provide a fapl_get callback which creates a copy of the
driver-specific properties based on a particular file.
Example: The file family driver copies the member size and the member file access property list into the return value:
\code
static void *
H5FD_family_fapl_get(H5FD_t *_file)
{
    H5FD_family_t      *file = (H5FD_family_t*)_file;
    // Allocate the struct itself, not just a pointer to it
    H5FD_family_fapl_t *fa = calloc(1, sizeof(H5FD_family_fapl_t));
    if (!fa) return NULL;
    fa->memb_size    = file->memb_size;
    fa->memb_fapl_id = H5Pcopy(file->memb_fapl_id);
    return fa;
}
\endcode
\subsection subsec_vfl_imp_file File Functions
The higher layers of the library expect files to have a name and allow the file to be accessed in various modes.
The driver must be able to create a new file, replace an existing file, or open an existing file. Opening or
creating a file should return a handle, a pointer to a specialization of the H5FD_t struct, which allows read-only
or read-write access and which will be passed to the other driver functions as they are called. (Read-only access is
only appropriate when opening an existing file.)
\code
typedef struct {
    // Public fields
    H5FD_class_t *cls; // class data defined below
    // Private fields -- driver-defined
} H5FD_t;
\endcode
Example: The family driver requires handles to the underlying storage, the size of the members for this
particular file (which might be different than the member size specified in the file access property list
if an existing file family is being opened), the name used to open the file in case additional members
must be created, and the flags to use for creating those additional members. The eoa member caches the
size of the format address space so the family members don't have to be queried in order to find it.
\code
// The description of a file belonging to this driver.
typedef struct H5FD_family_t {
    H5FD_t    pub;          // public stuff, must be first
    hid_t     memb_fapl_id; // file access property list for members
    hsize_t   memb_size;    // maximum size of each member file
    int       nmembs;       // number of family members
    int       amembs;       // number of member slots allocated
    H5FD_t  **memb;         // dynamic array of member pointers
    haddr_t   eoa;          // end of allocated addresses
    char     *name;         // name generator printf format
    unsigned  flags;        // flags for opening additional members
} H5FD_family_t;
\endcode
Example: The sec2 driver needs to keep track of the underlying Unix file descriptor and also the
end of format address space and current Unix file size. It also keeps track of the current file
position and last operation (read, write, or unknown) in order to optimize calls to lseek. The
device and inode fields are defined on Unix in order to uniquely identify the file and will be
discussed below.
\code
typedef struct H5FD_sec2_t {
    H5FD_t  pub;    // public stuff, must be first
    int     fd;     // the unix file
    haddr_t eoa;    // end of allocated region
    haddr_t eof;    // end of file; current file size
    haddr_t pos;    // current file I/O position
    int     op;     // last operation
    dev_t   device; // file device number
    ino_t   inode;  // file i-node number
} H5FD_sec2_t;
\endcode
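The seek optimization mentioned above, where the cached position and last operation decide whether a call to lseek can be skipped, might look roughly like the sketch below. The names <code>pos_cache_t</code>, <code>need_seek</code> and <code>after_io</code> are hypothetical, <code>addr_t</code> and <code>ADDR_UNDEF</code> stand in for haddr_t and #HADDR_UNDEF, and the exact condition is an assumption rather than the driver's verbatim logic.

```c
#include <stdint.h>

typedef uint64_t addr_t;        // stand-in for haddr_t
#define ADDR_UNDEF ((addr_t)-1) // stand-in for HADDR_UNDEF
enum { OP_UNKNOWN = 0, OP_READ = 1, OP_WRITE = 2 };

// The two fields of the file struct that drive the optimization.
typedef struct pos_cache_t {
    addr_t pos; // cached file position, ADDR_UNDEF if unknown
    int    op;  // last operation performed
} pos_cache_t;

// Return 1 if a seek is needed before performing `op` at `addr`.
static int
need_seek(const pos_cache_t *c, addr_t addr, int op)
{
    return c->pos == ADDR_UNDEF || c->pos != addr || c->op != op;
}

// Record that `size` bytes were transferred by `op` starting at `addr`.
static void
after_io(pos_cache_t *c, addr_t addr, addr_t size, int op)
{
    c->pos = addr + size;
    c->op  = op;
}
```

Sequential reads or sequential writes then need no seek at all, while switching between reading and writing, or jumping to another address, forces one.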
\subsection subsec_vfl_imp_open Open Files
All drivers must define a function for opening/creating a file. This function should have a prototype which is:
<table>
<tr>
<td><code>static H5FD_t * open (const char *name, unsigned flags, hid_t fapl, haddr_t maxaddr)</code></td>
<td>The file name <code>name</code> and file access property list <code>fapl</code> are the same as were specified in the #H5Fcreate
or #H5Fopen call. The <code>flags</code> are also the same as in those calls, except that the flag #H5F_ACC_CREAT is
additionally set if the call was to #H5Fcreate; the flags are documented in the 'H5Fpublic.h' file. The <code>maxaddr</code>
argument is the maximum format address that the driver should be prepared to handle (the minimum address is always zero).</td>
</tr>
</table>
Example: The sec2 driver opens a Unix file with the requested name and saves information which
uniquely identifies the file (the Unix device number and inode).
\code
static H5FD_t *
H5FD_sec2_open(const char *name, unsigned flags, hid_t fapl_id, // unused
               haddr_t maxaddr)
{
    unsigned     o_flags;
    int          fd;
    struct stat  sb;
    H5FD_sec2_t *file = NULL;
    // Check arguments
    if (!name || !*name) return NULL;
    if (0 == maxaddr || HADDR_UNDEF == maxaddr) return NULL;
    if (ADDR_OVERFLOW(maxaddr)) return NULL;
    // Build the open flags
    o_flags = (H5F_ACC_RDWR & flags) ? O_RDWR : O_RDONLY;
    if (H5F_ACC_TRUNC & flags) o_flags |= O_TRUNC;
    if (H5F_ACC_CREAT & flags) o_flags |= O_CREAT;
    if (H5F_ACC_EXCL & flags) o_flags |= O_EXCL;
    // Open the file
    if ((fd = open(name, o_flags, 0666)) < 0) return NULL;
    if (fstat(fd, &sb) < 0) {
        close(fd);
        return NULL;
    }
    // Create the new file struct; do not leak the descriptor on failure
    file = calloc(1, sizeof(H5FD_sec2_t));
    if (!file) {
        close(fd);
        return NULL;
    }
    file->fd     = fd;
    file->eof    = sb.st_size;
    file->pos    = HADDR_UNDEF;
    file->op     = OP_UNKNOWN;
    file->device = sb.st_dev;
    file->inode  = sb.st_ino;
    return (H5FD_t*)file;
}
\endcode
\subsection subsec_vfl_imp_close Closing Files
Closing a file simply means that all cached data should be flushed to the next lower layer, the
file should be closed at the next lower layer, and all file-related data structures should be
freed. All information needed by the close function is already present in the file handle.
<table>
<tr>
<td><code>static herr_t close (H5FD_t *file)</code></td>
<td>The file argument is the handle which was returned by the open function, and the close should
free only memory associated with the driver-specific part of the handle (the public parts will
have already been released by HDF5's virtual file layer).</td>
</tr>
</table>
Example: The sec2 driver just closes the underlying Unix file, making sure that the actual
file size is the same as that known to the library by writing a zero to the last file position
if it hasn't been written by some previous operation (which happens in the same code which flushes
the file contents and is shown below).
\code
static herr_t
H5FD_sec2_close(H5FD_t *_file)
{
    H5FD_sec2_t *file = (H5FD_sec2_t*)_file;
    if (H5FD_sec2_flush(_file) < 0) return -1;
    if (close(file->fd) < 0) return -1;
    free(file);
    return 0;
}
\endcode
\subsection subsec_vfl_imp_key File Keys
Occasionally an application will attempt to open a single file more than one time in order
to obtain multiple handles to the file. HDF5 allows the files to share information (for instance,
writing data to one handle will cause the data to be immediately visible on the other handle),
but in order to accomplish this HDF5 must be able to tell when two names refer to the same file.
It does this by associating a driver-defined key with each file opened by a driver and comparing
the key for an open request with the keys for all other files currently open by the same driver.
<table>
<tr>
<td><code>static int cmp (const H5FD_t *f1, const H5FD_t *f2)</code></td>
<td>The driver may provide a function which compares two files f1 and f2 belonging to the same
driver and returns a negative, positive, or zero value a la the strcmp function. (The ordering
is arbitrary as long as it's consistent within a particular file driver.) If this function is
not provided then HDF5 assumes that all calls to the open callback return unique files regardless
of the arguments and it is up to the application to avoid doing this if that assumption is incorrect.</td>
</tr>
</table>
Each time a file is opened the library calls the cmp function to compare that file with all other files
currently open by the same driver and if one of them matches (at most one can match) then the file
which was just opened is closed and the previously opened file is used instead.
Opening a file twice with incompatible flags will result in failure. For instance, opening a file with
the truncate flag is a two step process which first opens the file without truncation so keys can be
compared, and if no matching file is found already open then the file is closed and immediately reopened
with the truncation flag set (if a matching file is already open then the truncating open will fail).
Example: The sec2 driver uses the Unix device and i-node as the key. They were initialized when
the file was opened.
\code
static int
H5FD_sec2_cmp(const H5FD_t *_f1, const H5FD_t *_f2)
{
    const H5FD_sec2_t *f1 = (const H5FD_sec2_t*)_f1;
    const H5FD_sec2_t *f2 = (const H5FD_sec2_t*)_f2;
    if (f1->device < f2->device) return -1;
    if (f1->device > f2->device) return 1;
    if (f1->inode < f2->inode) return -1;
    if (f1->inode > f2->inode) return 1;
    return 0;
}
\endcode
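The lookup described in this section, in which a newly opened file is compared against every file already open by the same driver, can be sketched as below. The <code>file_t</code> type, the <code>key_cmp</code> callback and the <code>find_open</code> helper are hypothetical stand-ins for illustration only; how the library actually stores its list of open files is not specified here.

```c
#include <stddef.h>

// Hypothetical stand-ins: a file handle carrying only a key, and a
// comparison callback with strcmp-like semantics.
typedef struct file_t {
    long device;
    long inode;
} file_t;
typedef int (*cmp_fn)(const file_t *f1, const file_t *f2);

static int
key_cmp(const file_t *f1, const file_t *f2)
{
    if (f1->device != f2->device) return f1->device < f2->device ? -1 : 1;
    if (f1->inode != f2->inode) return f1->inode < f2->inode ? -1 : 1;
    return 0;
}

// Return the already-open file matching `candidate`, or NULL if none
// matches. Without a cmp callback every open is assumed to yield a
// distinct file, so NULL is returned as well.
static file_t *
find_open(file_t **open_files, size_t nopen, const file_t *candidate, cmp_fn cmp)
{
    size_t i;

    if (!cmp)
        return NULL;
    for (i = 0; i < nopen; i++)
        if (0 == cmp(open_files[i], candidate))
            return open_files[i];
    return NULL;
}
```

When <code>find_open</code> returns a match, the caller would close the handle it just opened and continue with the existing one.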
\subsection subsec_vfl_imp_save Saving Modes Across Opens
Some drivers may also need to store certain information in the file superblock in order
to be able to reliably open the file at a later date. This is done by three functions:
one to determine how much space will be necessary to store the information in the superblock,
one to encode the information,
and one to decode the information. These functions are optional, but if any one is defined
then the other two must also be defined.
<table>
<tr>
<th>Function</th>
<th>Description</th>
</tr>
<tr>
<td><code>static hsize_t sb_size (H5FD_t *file)</code></td>
<td>The sb_size function returns the number of bytes necessary to encode
information needed later if the file is reopened.</td>
</tr>
<tr>
<td><code>static herr_t sb_encode (H5FD_t *file, char *name, unsigned char *buf)</code></td>
<td>The sb_encode function encodes information from the file into buffer buf
allocated by the caller. It also writes an 8-character name (plus null terminator) into
the name argument, which should be a unique identification for the driver.</td>
</tr>
<tr>
<td><code>static herr_t sb_decode (H5FD_t *file, const char *name, const unsigned char *buf)</code></td>
<td>The sb_decode function looks at the name, decodes data from the buffer buf, and
updates the file argument with the new information.</td>
</tr>
</table>
The part of this which is somewhat tricky is that the file must be readable before the
superblock information is decoded. File access modes fall outside the scope of the HDF5
file format, but they are placed inside the superblock for convenience. (File access modes
do not describe data, but rather describe how the HDF5 format address space is mapped to
the underlying file(s). Thus, in general the mapping must be known before the file
superblock can be read. However, the user usually knows enough about the mapping for
the superblock to be readable and once the superblock is read the library can fill
in the missing parts of the mapping.)
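As an illustration of how the three callbacks fit together, the sketch below stores a single 64-bit member size. Everything in it is an assumption made for the example: the <code>demo_file_t</code> type, the made-up driver name "DEMOdrvr" and the little-endian byte layout are not taken from any HDF5 driver.

```c
#include <stdint.h>
#include <string.h>

// Hypothetical driver state: only the member size must survive a reopen.
typedef struct demo_file_t {
    uint64_t memb_size;
} demo_file_t;

// Made-up 8-character driver name, not one used by HDF5.
#define DEMO_NAME "DEMOdrvr"

// Number of bytes the driver needs in the superblock.
static size_t
demo_sb_size(const demo_file_t *file)
{
    (void)file;
    return 8; // one 64-bit value
}

// Write the driver name and the encoded state.
static int
demo_sb_encode(const demo_file_t *file, char *name, unsigned char *buf)
{
    int i;

    memcpy(name, DEMO_NAME, 9); // 8 characters plus the terminator
    for (i = 0; i < 8; i++)     // little-endian, byte by byte
        buf[i] = (unsigned char)(file->memb_size >> (8 * i));
    return 0;
}

// Check the driver name, then restore the state from the buffer.
static int
demo_sb_decode(demo_file_t *file, const char *name, const unsigned char *buf)
{
    int      i;
    uint64_t v = 0;

    if (strcmp(name, DEMO_NAME) != 0)
        return -1; // written by some other driver
    for (i = 7; i >= 0; i--)
        v = (v << 8) | buf[i];
    file->memb_size = v;
    return 0;
}
```

Encoding byte by byte keeps the stored value independent of the byte order of the machine that wrote the file.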
\section sec_vfl_address Address Space Functions
HDF5 does not assume that a file is a linear address space of bytes. Instead, the library
will call functions to allocate and free portions of the HDF5 format address space, which
in turn map onto functions in the file driver to allocate and free portions of file address
space. The library tells the file driver how much format address space it wants to allocate
and the driver decides what format address to use and how that format address is mapped
onto the file address space. Usually the format address is chosen so that the file address
can be calculated in constant time for data I/O operations (which are always specified by format addresses).
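A constant-time mapping of this kind can be sketched for a family-style layout with equal-sized members. The <code>location_t</code> type and the <code>locate</code> function are hypothetical and <code>addr_t</code> stands in for haddr_t; the sketch assumes that member files are simply numbered from zero.

```c
#include <stdint.h>

typedef uint64_t addr_t; // stand-in for haddr_t

// Where a format address lives in a family of equal-sized members.
typedef struct location_t {
    uint64_t member; // index of the member file
    addr_t   offset; // byte offset inside that member
} location_t;

// Map a format address onto (member, offset).
// `memb_size` must be non-zero.
static location_t
locate(addr_t addr, uint64_t memb_size)
{
    location_t loc;

    loc.member = addr / memb_size;
    loc.offset = addr % memb_size;
    return loc;
}
```

An I/O request that crosses a member boundary would have to be split into one piece per member, each located the same way.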
\subsection subsec_vfl_address_blk Userblock and Superblock
The HDF5 format allows an optional userblock to appear before the actual HDF5 data in such
a way that if the userblock is removed from the file and everything remaining is
shifted downward in the file address space, then the file is still a valid HDF5 file.
The userblock size can be zero or any power of two greater than or equal to 512 and
the file superblock begins immediately after the userblock.
HDF5 allocates space for the userblock and superblock by calling an allocation function
defined below, which must return a chunk of memory at format address zero on the first call.
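The size constraint on the userblock can be checked with a few lines of C. This sketch assumes the HDF5 rule that a userblock size is either zero or a power of two no smaller than 512; the function name <code>userblock_size_ok</code> is hypothetical.

```c
#include <stdint.h>

// Return 1 if `size` is acceptable as a userblock size: zero, or a power
// of two that is at least 512. Return 0 otherwise.
static int
userblock_size_ok(uint64_t size)
{
    if (0 == size)
        return 1;
    // A power of two has exactly one bit set, so clearing the lowest
    // set bit must leave zero.
    return size >= 512 && (size & (size - 1)) == 0;
}
```

Sizes such as 512 and 2048 pass the check, while 256 (too small) and 1536 (not a power of two) do not.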
\subsection subsec_vfl_address_alloc Allocation of Format Regions
The library makes many types of allocation requests:
<table>
<tr>
<td>#H5FD_MEM_SUPER</td><td>An allocation request for the userblock and/or superblock.</td>
</tr>
<tr>
<td>#H5FD_MEM_BTREE</td><td>An allocation request for a node of a B-tree.
</td>
</tr>
<tr>
<td>#H5FD_MEM_DRAW</td><td>An allocation request for the raw data of a dataset.
</td>
</tr>
<tr>
<td>#H5FD_MEM_GHEAP</td><td>An allocation request for a global heap collection. Global
heaps are used to store certain types of references such as dataset region references.
The set of all global heap collections can become quite large.
</td>
</tr>
<tr>
<td>#H5FD_MEM_LHEAP</td><td>An allocation request for a local heap. Local heaps are used
to store the names which are members of a group. The combined size of all local heaps is
a function of the number of object names in the file.
</td>
</tr>
<tr>
<td>#H5FD_MEM_OHDR</td><td>An allocation request for (part of) an object header. Object
headers are relatively small and include meta information about objects (like the data
space and type of a dataset) and attributes.
</td>
</tr>
</table>
When a chunk of memory is freed the library adds it to a free list and allocation requests
are satisfied from the free list before requesting memory from the file driver. Each type of
allocation request enumerated above has its own free list, but the file driver can specify that
certain object types can share a free list. It does so by providing an array which maps a
request type to a free list. If any value of the map is #H5FD_MEM_DEFAULT (zero) then the object's
own free list is used. The special value #H5FD_MEM_NOLIST indicates that the library should not
attempt to maintain a free list for that particular object type, instead calling the file driver
each time an object of that type is freed.
Mappings predefined in the 'H5FDdevelop.h' file are:
<table>
<tr>
<td>#H5FD_FLMAP_SINGLE</td><td>All memory usage types are mapped to a single free list.
</td>
</tr>
<tr>
<td>#H5FD_FLMAP_DICHOTOMY</td><td>Memory usage is segregated into meta data and raw data
for the purposes of memory management.
</td>
</tr>
<tr>
<td>#H5FD_FLMAP_DEFAULT</td><td>Each memory usage type has its own free list.
</td>
</tr>
</table>
Example: To make a map that manages object headers on one free list and everything else on
another free list one might initialize the map with the following code: (the use of #H5FD_MEM_SUPER is arbitrary)
\code
H5FD_mem_t mt, map[H5FD_MEM_NTYPES];
for (mt = 0; mt < H5FD_MEM_NTYPES; mt++) {
    map[mt] = (H5FD_MEM_OHDR == mt) ? mt : H5FD_MEM_SUPER;
}
\endcode
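How such a map is consulted can be sketched as follows. The <code>mem_t</code> enum mirrors the memory usage types listed above but is only a stand-in defined for this example, and <code>free_list_for</code> is a hypothetical helper; the sketch assumes the "default means own list" and "no list" conventions described in this section.

```c
// Hypothetical stand-ins for the memory usage types and special values.
typedef enum mem_t {
    MEM_NOLIST  = -1, // no free list is kept for this type
    MEM_DEFAULT = 0,  // use the type's own free list
    MEM_SUPER,
    MEM_BTREE,
    MEM_DRAW,
    MEM_GHEAP,
    MEM_LHEAP,
    MEM_OHDR,
    MEM_NTYPES
} mem_t;

// Return the free list that serves `type` under `map`, or MEM_NOLIST if
// the library should call the driver directly for that type.
// `type` must be a real usage type (MEM_SUPER up to MEM_OHDR).
static mem_t
free_list_for(const mem_t map[MEM_NTYPES], mem_t type)
{
    mem_t target = map[type];

    return MEM_DEFAULT == target ? type : target;
}
```

With the map built in the example above, a request of type object header resolves to the object header list and every other type resolves to the superblock list.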
If an allocation request cannot be satisfied from the free list then one of two things happens.
If the driver defines an allocation callback then it is used to allocate space; otherwise new
memory is allocated from the end of the format address space by incrementing the end-of-address marker.
<table>
<tr>
<td><code>static haddr_t alloc (H5FD_t *file, H5FD_mem_t type, hsize_t size)</code></td>
<td>The file argument is the file from which space is to be allocated, type is the type of
memory being requested (from the list above) without being mapped according to the free list
map and size is the number of bytes being requested. The library is allowed to allocate large
chunks of storage and manage them in a layer above the file driver (although the current library
doesn't do that). The allocation function should return a format address for the first byte
allocated. The allocated region extends from that address for size bytes. If the request cannot
be honored then the undefined address value is returned (#HADDR_UNDEF). The first call to this
function for a file which has never had memory allocated must return a format address of zero
or #HADDR_UNDEF since this is how the library allocates space for the userblock and/or superblock.</td>
</tr>
</table>
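A minimal sketch of an allocation callback that follows these rules is shown below. It is not taken from any HDF5 driver: the types `addr_t`, `size64_t` and `demo_file_t`, the constant `ADDR_UNDEF`, and the function `demo_alloc` are hypothetical stand-ins for `haddr_t`, `hsize_t`, the driver's file structure, `HADDR_UNDEF`, and the driver's alloc callback. The sketch simply hands out space from the end of the format address space, so the first allocation for a new file returns address zero.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t addr_t;        /* stand-in for haddr_t */
typedef uint64_t size64_t;      /* stand-in for hsize_t */
#define ADDR_UNDEF ((addr_t)-1) /* stand-in for HADDR_UNDEF */

/* Hypothetical driver file structure. */
typedef struct {
    addr_t eoa;     /* first address after the last one allocated */
    addr_t maxaddr; /* largest EOA value this driver supports */
} demo_file_t;

/* Allocate size bytes at the end of the format address space.
 * Returns the first address of the new region, or ADDR_UNDEF if
 * the request cannot be honored. */
static addr_t
demo_alloc(demo_file_t *file, int type /*unused*/, size64_t size)
{
    (void)type;
    assert(file && file->eoa <= file->maxaddr);

    /* Written this way so the comparison cannot overflow. */
    if (size > file->maxaddr - file->eoa)
        return ADDR_UNDEF;

    addr_t addr = file->eoa;
    file->eoa += size;
    return addr;
}
```

A failed request leaves the EOA marker untouched, so the library could retry with a smaller size.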
\subsection subsec_vfl_address_free Freeing Format Regions
When the library is finished using a certain region of the format address space it will return the
space to the free list according to the type of memory being freed and the free list map described above.
If the free list has been disabled for a particular memory usage type (according to the free list map)
and the driver defines a free callback then it will be invoked. The free callback is also invoked for
all entries on the free list when the file is closed.
<table>
<tr>
<td><code>static herr_t free (H5FD_t *file, H5FD_mem_t type, haddr_t addr, hsize_t size)</code></td>
<td>The file argument is the file for which space is being freed; type is the type of object being
freed (from the list above) without being mapped according to the freelist map; addr is the first
format address to free; and size is the size in bytes of the region being freed. The region being
freed may refer to just part of the region originally allocated and/or may cross allocation boundaries
provided all regions being freed have the same usage type. However, the library will never attempt
to free regions which have already been freed or which have never been allocated.</td>
</tr>
</table>
A driver may choose to not define the free function, in which case format addresses will be leaked.
This isn't normally a huge problem since the library contains a simple free list of its own and freeing
parts of the format address space is not a common occurrence.
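One simple policy for a free callback can be sketched as follows: reclaim a region only when it sits at the very end of the allocated range, and otherwise let the addresses leak. This is an assumption chosen for illustration, not the behavior of any particular HDF5 driver; `addr_t`, `size64_t`, `demo_file_t` and `demo_free` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t addr_t;   /* stand-in for haddr_t */
typedef uint64_t size64_t; /* stand-in for hsize_t */

/* Hypothetical driver file structure. */
typedef struct {
    addr_t eoa; /* first address after the last one allocated */
} demo_file_t;

/* Free the region [addr, addr+size). Returns 0 on success and -1 if
 * the region lies (even partly) beyond the EOA marker. */
static int
demo_free(demo_file_t *file, int type /*unused*/, addr_t addr, size64_t size)
{
    (void)type;
    assert(file);

    /* Reject regions that were never allocated (overflow-safe check). */
    if (size > file->eoa || addr > file->eoa - size)
        return -1;

    /* A region ending exactly at the EOA marker can be given back. */
    if (addr + size == file->eoa)
        file->eoa = addr;

    /* Any other region is simply leaked by this sketch. */
    return 0;
}
```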
\subsection subsec_vfl_address_query Querying the Address Range
Each file driver must have some mechanism for setting and querying the end-of-address, or
EOA, marker. The EOA marker is the first format address after the last format address ever allocated.
If the last part of the allocated address range is freed then the driver may optionally decrease the EOA marker.
<table>
<tr>
<td><code>static haddr_t get_eoa (H5FD_t *file)</code></td>
<td>This function returns the current value of the EOA marker for the specified file.</td>
</tr>
</table>
Example: The sec2 driver just returns the current eoa marker value which is cached in the file structure:
\code
static haddr_t
H5FD_sec2_get_eoa(H5FD_t *_file)
{
    H5FD_sec2_t *file = (H5FD_sec2_t*)_file;

    return file->eoa;
}
\endcode
The eoa marker is initially zero when a file is opened and the library may set it to some other value
shortly after the file is opened (after the superblock is read and the saved eoa marker is determined)
or when allocating additional memory in the absence of an alloc callback (described above).
Example: The sec2 driver simply caches the eoa marker in the file structure and does not extend the
underlying Unix file. When the file is flushed or closed then the Unix file size is extended to match
the eoa marker.
\code
static herr_t
H5FD_sec2_set_eoa(H5FD_t *_file, haddr_t addr)
{
    H5FD_sec2_t *file = (H5FD_sec2_t*)_file;

    file->eoa = addr;
    return 0;
}
\endcode
\section sec_vfl_data Data Functions
These functions operate on data, transferring a region of the format address space between memory and files.
\subsection subsec_vfl_data_cont Contiguous I/O Functions
A driver must specify two functions to transfer data from the library to the file and vice versa.
<table>
<tr>
<td><code>static herr_t read (H5FD_t *file, H5FD_mem_t type, hid_t dxpl, haddr_t addr, hsize_t size, void *buf)</code></td>
<td>The read function reads data from file beginning at address addr and continuing
for size bytes into the buffer buf supplied by the caller.</td>
</tr>
<tr>
<td><code>static herr_t write (H5FD_t *file, H5FD_mem_t type, hid_t dxpl, haddr_t addr, hsize_t size, const void *buf)</code></td>
<td>The write function transfers data
in the opposite direction.</td>
</tr>
</table>
\li Both functions take a data transfer property list dxpl which
indicates the fine points of how the data is to be transferred and which comes directly
from the #H5Dread or #H5Dwrite function.
\li Both functions receive the type of data being transferred,
which may allow a driver to tune its behavior for different kinds of data.
\li Both functions should return
a negative value if they fail to transfer the requested data, or non-negative if they
succeed. The library will never attempt to read from unallocated regions of the format address space.
Example: The sec2 driver just makes system calls. It tries not to call lseek if the current operation
is the same as the previous operation and the file position is correct. It also fills the output buffer
with zeros when reading between the current EOF and EOA markers and restarts system calls which were interrupted.
\code
static herr_t
H5FD_sec2_read(H5FD_t *_file, H5FD_mem_t type/*unused*/, hid_t dxpl_id/*unused*/,
               haddr_t addr, hsize_t size, void *buf/*out*/)
{
    H5FD_sec2_t *file = (H5FD_sec2_t*)_file;
    ssize_t      nbytes;

    assert(file && file->pub.cls);
    assert(buf);

    /* Check for overflow conditions */
    if (REGION_OVERFLOW(addr, size)) return -1;
    if (addr+size>file->eoa) return -1;

    /* Seek to the correct location */
    if ((addr!=file->pos || OP_READ!=file->op) &&
        file_seek(file->fd, (file_offset_t)addr, SEEK_SET)<0) {
        file->pos = HADDR_UNDEF;
        file->op = OP_UNKNOWN;
        return -1;
    }

    /*
     * Read data, being careful of interrupted system calls, partial results,
     * and the end of the file.
     */
    while (size>0) {
        do {
            nbytes = read(file->fd, buf, size);
        } while (-1==nbytes && EINTR==errno);
        if (-1==nbytes) {
            /* error */
            file->pos = HADDR_UNDEF;
            file->op = OP_UNKNOWN;
            return -1;
        }
        if (0==nbytes) {
            /* end of file but not end of format address space */
            memset(buf, 0, size);
            size = 0;
        }
        assert(nbytes>=0);
        assert((hsize_t)nbytes<=size);
        size -= (hsize_t)nbytes;
        addr += (haddr_t)nbytes;
        buf = (char*)buf + nbytes;
    }

    /* Update current position */
    file->pos = addr;
    file->op = OP_READ;
    return 0;
}
\endcode
Example: The sec2 write callback is similar except it updates the file EOF marker when extending the file.
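The EOF bookkeeping that a write callback performs can be sketched as follows. To keep the example free of system calls, the sketch writes into an in-memory buffer instead of a Unix file, so it is not the sec2 implementation; `demo_file_t`, `DEMO_CAPACITY` and `demo_write` are hypothetical names, and the type and dxpl arguments of the real callback are omitted.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_CAPACITY 64 /* size of the in-memory backing store */

/* Hypothetical driver file structure. */
typedef struct {
    unsigned char mem[DEMO_CAPACITY]; /* stands in for the underlying file */
    uint64_t      eoa;                /* end-of-address marker */
    uint64_t      eof;                /* end-of-file marker */
} demo_file_t;

/* Write size bytes from buf at format address addr.
 * Returns 0 on success, -1 on failure. */
static int
demo_write(demo_file_t *file, uint64_t addr, uint64_t size, const void *buf)
{
    assert(file && buf);

    /* Check for overflow and for writes beyond the allocated range. */
    if (addr + size < addr) return -1;
    if (addr + size > file->eoa) return -1;
    if (addr + size > DEMO_CAPACITY) return -1;

    memcpy(file->mem + addr, buf, (size_t)size);

    /* Extend the EOF marker when the write grows the file. */
    if (addr + size > file->eof)
        file->eof = addr + size;
    return 0;
}
```

A write that lands entirely below the current EOF leaves the marker unchanged; only a write that ends past it moves the marker forward.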
\subsection subsec_vfl_data_flush Flushing Cached Data
Some drivers may cache data in memory in order to make larger I/O requests to the
underlying file and thus improve bandwidth. Such drivers should register a cache flushing
function so that the library can ensure that data has been flushed out of the driver in
response to the application calling #H5Fflush.
<table>
<tr>
<td><code>static herr_t flush (H5FD_t *file)</code></td>
<td>Flush all data for file file to storage.</td>
</tr>
</table>
Example: The sec2 driver doesn't cache any data but it also doesn't extend the Unix file as
aggressively as it should. Therefore, when finalizing a file it should write a zero to the last
byte of the allocated region so that when reopening the file later the EOF marker will be at
least as large as the EOA marker saved in the superblock (otherwise HDF5 will refuse to open
the file, claiming that the data appears to be truncated).
\code
static herr_t
H5FD_sec2_flush(H5FD_t *_file)
{
    H5FD_sec2_t *file = (H5FD_sec2_t*)_file;

    if (file->eoa>file->eof) {
        if (-1==file_seek(file->fd, file->eoa-1, SEEK_SET)) return -1;
        if (write(file->fd, "", 1)!=1) return -1;
        file->eof = file->eoa;
        file->pos = file->eoa;
        file->op = OP_WRITE;
    }
    return 0;
}
\endcode
\section sec_vfl_opt Optimization Functions
The library is capable of performing several generic optimizations on I/O, but these types of
optimizations may not be appropriate for a given VFL driver.
Each driver may provide a query function to allow the library to query whether to enable these
optimizations. If a driver lacks a query function, the library will disable all types of
optimizations which can be queried.
<table>
<tr>
<td><code>static herr_t query (const H5FD_t *file, unsigned long *flags)</code></td>
<td>This function is called by the library to query which optimizations to enable for I/O to this driver.</td>
</tr>
</table>
These are the flags which are currently defined:
<table>
<tr>
<td><code>H5FD_FEAT_AGGREGATE_METADATA (0x00000001)</code></td>
<td>Setting the H5FD_FEAT_AGGREGATE_METADATA flag for a VFL driver means that the library will attempt to allocate
a larger block for metadata and then sub-allocate each metadata request from that larger block.</td>
</tr>
<tr>
<td><code>H5FD_FEAT_ACCUMULATE_METADATA (0x00000002)</code></td>
<td>Setting the H5FD_FEAT_ACCUMULATE_METADATA flag for a VFL driver means that the library will attempt to cache
metadata as it is written to the file and build up a larger block of metadata to eventually pass to the
VFL 'write' routine.</td>
</tr>
<tr>
<td><code>H5FD_FEAT_DATA_SIEVE (0x00000004)</code></td>
<td>Setting the H5FD_FEAT_DATA_SIEVE flag for a VFL driver means that the library will attempt to cache raw data
as it is read from/written to a file in a "data sieve" buffer.</td>
</tr>
</table>
See Rajeev Thakur's papers:
http://www.mcs.anl.gov/~thakur/papers/romio-coll.ps.gz
http://www.mcs.anl.gov/~thakur/papers/mpio-high-perf.ps.gz
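A query callback for a driver that accepts all three optimizations might look like the sketch below. The `FEAT_*` macros are local stand-ins that reuse the flag values from the table above, and `demo_query` is a hypothetical name; a real driver would include the HDF5 headers and use the H5FD_FEAT_* constants directly.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins carrying the flag values listed in the table above. */
#define FEAT_AGGREGATE_METADATA  0x00000001UL
#define FEAT_ACCUMULATE_METADATA 0x00000002UL
#define FEAT_DATA_SIEVE          0x00000004UL

/* Report which generic optimizations the library may enable.
 * The file argument is unused in this sketch. Returns 0 (success). */
static int
demo_query(const void *file /*unused*/, unsigned long *flags /*out*/)
{
    (void)file;

    if (flags) {
        *flags = 0;
        *flags |= FEAT_AGGREGATE_METADATA;  /* sub-allocate metadata from larger blocks */
        *flags |= FEAT_ACCUMULATE_METADATA; /* build up metadata before calling write */
        *flags |= FEAT_DATA_SIEVE;          /* cache raw data in a sieve buffer */
    }
    return 0;
}
```

A driver for which one of these optimizations is inappropriate would simply leave the corresponding bit clear.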
\section sec_vfl_reg Registration of a Driver
Before a driver can be used the HDF5 library needs to be told of its existence. This is done by
registering the driver, which results in a driver identification number. Instead of passing many
arguments to the registration function, the driver information is entered into a structure and the
address of the structure is passed to the registration function where it is copied. This allows
the HDF5 API to be extended while providing backward compatibility at the source level.
<table>
<tr>
<td><code>hid_t H5FDregister (H5FD_class_t *cls)</code></td>
<td>The driver described by struct cls is registered with the library and an ID number for the driver is returned.</td>
</tr>
</table>
The H5FD_class_t type is a struct with the following fields:
<table>
<tr>
<td><code>const char *name</code></td>
<td>A pointer to a constant, null-terminated driver name to be used for debugging purposes.</td>
</tr>
<tr>
<td><code>size_t fapl_size</code></td>
<td>The size in bytes of the file access mode structure or zero if the driver supplies a copy function
or doesn't define the structure.</td>
</tr>
<tr>
<td><code>void *(*fapl_copy)(const void *fapl)</code></td>
<td>An optional function which copies a driver-defined file access mode structure. This field takes
precedence over fapl_size when both are defined.</td>
</tr>
<tr>
<td><code>void (*fapl_free)(void *fapl)</code></td>
<td>An optional function to free the driver-defined file access mode structure. If null, then the
library calls the C free function to free the structure.</td>
</tr>
<tr>
<td><code>size_t dxpl_size</code></td>
<td>The size in bytes of the data transfer mode structure or zero if the driver supplies a copy
function or doesn't define the structure.</td>
</tr>
<tr>
<td><code>void *(*dxpl_copy)(const void *dxpl)</code></td>
<td>An optional function which copies a driver-defined data transfer mode structure. This field
takes precedence over dxpl_size when both are defined.</td>
</tr>
<tr>
<td><code>void (*dxpl_free)(void *dxpl)</code></td>
<td>An optional function to free the driver-defined data transfer mode structure. If null, then
the library calls the C free function to free the structure.</td>
</tr>
<tr>
<td><code>H5FD_t *(*open)(const char *name, unsigned flags, hid_t fapl, haddr_t maxaddr)</code></td>
<td>The function which opens or creates a new file.</td>
</tr>
<tr>
<td><code>herr_t (*close)(H5FD_t *file)</code></td>
<td>The function which ends access to a file.</td>
</tr>
<tr>
<td><code>int (*cmp)(const H5FD_t *f1, const H5FD_t *f2)</code></td>
<td>An optional function to determine whether two open files have the same key. If this function
is not present then the library assumes that two files will never be the same.</td>
</tr>
<tr>
<td><code>herr_t (*query)(const H5FD_t *f, unsigned long *flags)</code></td>
<td>An optional function to determine which library optimizations a driver can support.</td>
</tr>
<tr>
<td><code>haddr_t (*alloc)(H5FD_t *file, H5FD_mem_t type, hsize_t size)</code></td>
<td>An optional function to allocate space in the file.</td>
</tr>
<tr>
<td><code>herr_t (*free)(H5FD_t *file, H5FD_mem_t type, haddr_t addr, hsize_t size)</code></td>
<td>An optional function to free space in the file.</td>
</tr>
<tr>
<td><code>haddr_t (*get_eoa)(H5FD_t *file)</code></td>
<td>A function to query how much of the format address space has been allocated.</td>
</tr>
<tr>
<td><code>herr_t (*set_eoa)(H5FD_t *file, haddr_t)</code></td>
<td>A function to set the end of address space.</td>
</tr>
<tr>
<td><code>haddr_t (*get_eof)(H5FD_t *file)</code></td>
<td>A function to return the current end-of-file marker value.</td>
</tr>
<tr>
<td><code>herr_t (*read)(H5FD_t *file, H5FD_mem_t type, hid_t dxpl, haddr_t addr, hsize_t size, void *buffer)</code></td>
<td>A function to read data from a file.</td>
</tr>
<tr>
<td><code>herr_t (*write)(H5FD_t *file, H5FD_mem_t type, hid_t dxpl, haddr_t addr, hsize_t size, const void *buffer)</code></td>
<td>A function to write data to a file.</td>
</tr>
<tr>
<td><code>herr_t (*flush)(H5FD_t *file)</code></td>
<td>A function which flushes cached data to the file.</td>
</tr>
<tr>
<td><code>H5FD_mem_t fl_map[H5FD_MEM_NTYPES]</code></td>
<td>An array which maps a file allocation request type to a free list.</td>
</tr>
</table>
Example: The sec2 driver would be registered as:
\code
static const H5FD_class_t H5FD_sec2_g = {
"sec2", /*name */
MAXADDR, /*maxaddr */
NULL, /*sb_size */
NULL, /*sb_encode */
NULL, /*sb_decode */
0, /*fapl_size */
NULL, /*fapl_get */
NULL, /*fapl_copy */
NULL, /*fapl_free */
0, /*dxpl_size */
NULL, /*dxpl_copy */
NULL, /*dxpl_free */
H5FD_sec2_open, /*open */
H5FD_sec2_close, /*close */
H5FD_sec2_cmp, /*cmp */
H5FD_sec2_query, /*query */
NULL, /*alloc */
NULL, /*free */
H5FD_sec2_get_eoa, /*get_eoa */
H5FD_sec2_set_eoa, /*set_eoa */
H5FD_sec2_get_eof, /*get_eof */
H5FD_sec2_read, /*read */
H5FD_sec2_write, /*write */
H5FD_sec2_flush, /*flush */
H5FD_FLMAP_SINGLE, /*fl_map */
};

hid_t
H5FD_sec2_init(void)
{
    if (!H5FD_SEC2_g) {
        H5FD_SEC2_g = H5FDregister(&H5FD_sec2_g);
    }
    return H5FD_SEC2_g;
}
\endcode
A driver can be removed from the library by unregistering it:
<table>
<tr>
<td><code>herr_t H5FDunregister (hid_t driver)</code></td>
<td>Where driver is the ID number returned when the driver was registered.</td>
</tr>
</table>
Unregistering a driver makes it unusable for creating new file access or data transfer property
lists but doesn't affect any property lists or files that already use that driver.
\subsection subsec_vfl_reg_prog Programming Note for C++ Developers Using C Functions
If a C routine that takes a function pointer as an argument is called from within C++ code,
the C routine should be allowed to return normally.
Examples of this kind of routine include callbacks such as #H5Pset_elink_cb
and #H5Pset_type_conv_cb and functions such as #H5Tconvert and #H5Ewalk2.
Exiting the routine in its normal fashion allows the HDF5 C Library to clean up
its work properly. In other words, if the C++ application jumps out of the routine
back to the C++ “catch” statement, the library is not given the opportunity to close
any temporary data structures that were set up when the routine was called. The C++
application should save some state as the routine is started so that any problem that
occurs might be diagnosed.
\section sec_vfl_query Querying Driver Information
<table>
<tr>
<td><code>void * H5Pget_driver_data (hid_t fapl)<br />void * H5Pget_driver_data (hid_t fxpl)</code></td>
<td>This function is intended to be used by driver functions, not applications. It returns a pointer
directly into the file access property list fapl which is a copy of the driver's file access mode
originally provided to the H5Pset_driver function. If its argument is a data transfer property list
fxpl then it returns a pointer to the driver-specific data transfer information instead.
</td>
</tr>
</table>
\section sec_vfl_misc Miscellaneous
The various private H5F_low_* functions will be replaced by public H5FD* functions so they
can be called from drivers.
All private functions H5F_addr_* which operate on addresses will be renamed as public functions