Releases: HDFGroup/hermes
v1.2.1
What's Changed
- Retest github docker action by @lukemartinlogan in #678
- Allow disabling Python and Adios on the hermes compilation through spack by @JaimeCernuda in #679
- Add libelf to hermes CI and add try-catch around paths by @lukemartinlogan in #685
- Make errors print to HELOG by @lukemartinlogan in #686
- Update spack install for vfd by @lukemartinlogan in #687
Full Changelog: v1.2.0...v1.2.1
v1.2.0
This release includes several important bug fixes. For performance and scalability, a long-standing issue in which nested RPCs caused deadlocks in large-scale workloads has been fixed. Previously, Hermes would wait for an entire task to execute before an RPC completed. This required at least one RPC thread for each node Hermes was running on, which became problematic at scales larger than a few hundred nodes. Now RPCs are used only to transfer tasks and do not wait for their completion, so Hermes can run with a single RPC thread per node, regardless of scale. In addition, we have changed our GitHub Actions to rely on Docker Hub. This improves the performance of the actions dramatically while giving the benefit of a maintained container. Lastly, we have made some changes to the Hermes spack installation. We now rely on thallium with cereal to maintain compatibility with future mochi releases.
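The RPC change described above can be illustrated with a minimal sketch. This is not Hermes code: `run_node`, `rpc_handler`, and `worker` are hypothetical names, and a Python queue stands in for the real task transport. The point is only that the handler returns as soon as the task has been handed off, so a single handler thread is never blocked waiting for a task to finish.

```python
import queue
import threading


def run_node(tasks):
    """Accept tasks on one 'RPC' path and execute them on a separate worker.

    Hypothetical illustration only; none of these names come from Hermes.
    """
    task_queue = queue.Queue()
    results = []

    def worker():
        # Executes tasks independently of the code path that accepted them.
        while True:
            task = task_queue.get()
            if task is None:  # sentinel: no more tasks will arrive
                break
            results.append(task())

    def rpc_handler(task):
        # The "RPC" only transfers the task and returns immediately;
        # it does not wait for the task to run.
        task_queue.put(task)
        return "accepted"

    worker_thread = threading.Thread(target=worker)
    worker_thread.start()
    acks = [rpc_handler(t) for t in tasks]
    task_queue.put(None)
    worker_thread.join()
    return acks, results


acks, results = run_node([lambda: 1 + 1, lambda: "done"])
```

If the handler instead blocked until each task finished, a task that issued a nested request to the same handler could wait on itself, which is the kind of deadlock the release notes describe.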
What's Changed
- Remove docker for now. Update readme. by @lukemartinlogan in #623
- Fix flushing and data op by @lukemartinlogan in #629
- Fix deadlock with data stager and data op by @lukemartinlogan in #630
- Remove print statements by @lukemartinlogan in #631
- Dev by @lukemartinlogan in #632
- Improvements to RAM utilization, concurrency control, and RPCs by @lukemartinlogan in #653
- Fix graceful runtime stop function by @lukemartinlogan in #654
- Dev by @lukemartinlogan in #657
- Fix the deadlock caused in stop that occurred for single-node cases by @lukemartinlogan in #658
- Change spack installation for hermes by @lukemartinlogan in #661
- Change the wiki to be our GRC website by @lukemartinlogan in #662
- Data staging will open and close files immediately by @lukemartinlogan in #664
- Add back original ReorganizeBlob by @lukemartinlogan in #666
- Point to GRC website for installation and building instructions by @lukemartinlogan in #667
- Add path regexing by @lukemartinlogan in #668
- Deployment updates by @lukemartinlogan in #673
- Always run container on workflow dispatch by @lukemartinlogan in #675
- Make spack force cereal by @lukemartinlogan in #676
Full Changelog: v1.1.1...v1.2.0
v1.1.0
Hermes 1.1.0 has been released. This release features a major restructuring of Hermes to improve I/O latency and scalability through asynchronous and partial I/O operations. Hermes now follows a task-based design, where I/O commands are converted into tasks and processed asynchronously. We leverage this asynchronicity to improve the performance of metadata and data updates for latency-sensitive producer workloads by nearly 100x over the previous release, while relying on the ordering properties of queues to guarantee strong consistency without locking. In addition, Hermes now supports partial get and put operations. This enables small I/O requests to be merged into the same blob, dramatically reducing the amount of metadata stored in Hermes for latency-sensitive I/O workloads (e.g., deep learning), and eliminates I/O amplification and lock contention for same-blob updates in the Hermes adapters. We have evaluated Hermes 1.1 underneath a variety of real programs; a summary is located here.
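To make the idea of partial put and get concrete, here is a minimal sketch that assumes a blob is a growable byte buffer. The `Blob` class and its `partial_put` / `partial_get` methods are hypothetical names for illustration and are not the Hermes API.

```python
class Blob:
    """Hypothetical in-memory blob supporting partial updates and reads."""

    def __init__(self):
        self.data = bytearray()

    def partial_put(self, offset, payload):
        # Grow the blob with zero bytes if the write extends past its end.
        end = offset + len(payload)
        if end > len(self.data):
            self.data.extend(b"\x00" * (end - len(self.data)))
        # Overwrite only the requested byte range.
        self.data[offset:end] = payload

    def partial_get(self, offset, size):
        # Read only the requested byte range.
        return bytes(self.data[offset:offset + size])


blob = Blob()
blob.partial_put(0, b"abcd")  # first small write
blob.partial_put(4, b"efgh")  # second small write lands in the same blob
blob.partial_put(2, b"XY")    # in-place update of two bytes
```

Three small requests end up in one blob with one set of metadata, and the two-byte update does not require reading and rewriting the whole blob.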
What's Changed
- Hermes 1.1 by @lukemartinlogan in #616
- Update spack install readme by @lukemartinlogan in #617
- Remove jarvis preamble by @lukemartinlogan in #618
- Dev by @lukemartinlogan in #619
Full Changelog: v1.0.5-beta...v1.1.0
v1.0.5
This release contains minor updates to spack, improvements to the Hermes CMake, and minor updates to portability. This is the last release before the task-based design of Hermes is used.
What's Changed
- Fix readme by @lukemartinlogan in #528
- Fix cmakes by @lukemartinlogan in #530
- Add additional configuration to mdm by @lukemartinlogan in #546
Full Changelog: v1.0.0-beta...v1.0.5-beta
v1.0.0
Hermes 1.0.0 has been released. Hermes is a multi-tiered I/O buffering platform which can be used to accelerate data access for large-scale scientific applications. This represents the first feature-complete release of Hermes.
For applications that produce data, Hermes intelligently makes initial data placement decisions using the Data Placement Engine. Hermes supports various data placement policies, each with different considerations for hardware characteristics and application I/O patterns. For high-bandwidth checkpoint-restart workloads, for example, Hermes can place data in the fastest available tiers. Data can be placed either locally on the node producing data, remotely, or both. After initial data placement, data can be re-shuffled in the hierarchy using the buffer organizer (BORG) and prefetcher. The BORG demotes data based on its observed access frequency and time of last access. For checkpoint-restart workloads, demoting data can make space available in high-performing tiers to accelerate future checkpoints.
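As a rough illustration of demotion driven by access frequency and recency, the sketch below ranks blobs by a made-up score. The formula `access_count / (1 + age)` and the names `BlobStats` and `demotion_order` are assumptions chosen for illustration; the release notes do not specify how the BORG combines the two signals.

```python
from dataclasses import dataclass


@dataclass
class BlobStats:
    name: str
    access_count: int
    last_access: float  # seconds, on the same clock as `now`


def demotion_order(blobs, now):
    """Return blob names ordered from best to worst demotion candidate."""

    def score(blob):
        # Hypothetical score: rarely accessed and long-idle blobs score lowest.
        age = now - blob.last_access
        return blob.access_count / (1.0 + age)

    return [blob.name for blob in sorted(blobs, key=score)]


blobs = [
    BlobStats("ckpt_0", access_count=1, last_access=0.0),
    BlobStats("ckpt_1", access_count=1, last_access=90.0),
    BlobStats("index", access_count=50, last_access=95.0),
]
order = demotion_order(blobs, now=100.0)
```

Here the oldest checkpoint is demoted first, which frees space in a fast tier for the next checkpoint, while the frequently read `index` blob stays where it is.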
For workloads which read data, Hermes can accelerate I/O through prefetching and data staging. Hermes provides a policy-based Prefetcher component that promotes data expected to be accessed in the near future. Prefetchers are policy-based in order to represent diverse application behaviors. We currently provide a prefetcher tailored for deterministic I/O workloads, which are fairly common. Deep learning applications, for example, have randomness seeds which can be used to make I/O behavior completely reproducible. Many scientific analysis codes predictably read a batch of data and then perform analysis. For these cases, Hermes comes equipped with an Apriori Prefetcher which parses and executes a user-defined schema file indicating when and where to prefetch data. In addition to prefetching, Hermes also provides data staging, which can import entire datasets from services external to Hermes (e.g., a PFS) and place them in the hierarchy for analysis.
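The claim that a fixed seed makes I/O behavior reproducible can be sketched as follows. The function `access_order` is a hypothetical stand-in for an application's shuffled sample order; it does not reflect the Apriori Prefetcher's schema format, which is not described here.

```python
import random


def access_order(num_samples, seed):
    """Return the order in which a seeded, shuffled reader visits samples."""
    order = list(range(num_samples))
    # A dedicated generator keeps the shuffle independent of global state.
    random.Random(seed).shuffle(order)
    return order


# The training run and an offline planning step use the same seed,
# so both derive the same access order.
training_run = access_order(8, seed=42)
prefetch_plan = access_order(8, seed=42)
```

Because the order is known before the run starts, it can be written down ahead of time, which is what makes a user-defined prefetching schedule feasible for this class of workload.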
We have evaluated Hermes 1.0.0 underneath various benchmarks and real applications. The Gray-Scott Model, for example, is a reaction-diffusion code that simulates the chemical reaction between two substances diffusing over time. We found that Hermes can improve I/O performance by 3x by intelligently buffering data in faster tiers and asynchronously flushing during checkpoints. A detailed summary of our benchmarks is located here.
We would like to thank the NSF for supporting our research. We invite the community to try Hermes and contribute. We would love to hear about use cases, desired features, and any improvements that we can make.
v0.9.9
This release primarily focuses on changes to CI, portability issues, and bug fixes. We have completely decoupled Hermes from MPI, which provides more portability across HPC and cloud machines. We have also addressed some issues relating to metadata performance in workloads which produce many small objects.
v0.9.8-beta
Hermes 0.9.8 has been released. This release features tagging. Tags enable users to semantically define associations between blobs and provide an intuitive way of locating blobs which are related. Tags can be used internally by Hermes to provide enhanced data placement decisions based on the logical grouping of data. Traits can be attached to tags to transparently perform a set of operations on a group of related blobs. For example, a combination of tags and traits can indicate that a group of blobs is to be compressed and encrypted.
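The relationship between tags and traits can be sketched as below. The `Tag` class, `attach_trait`, and `put` are hypothetical names, and modeling a trait as a plain function applied on every put is an assumption; the actual Hermes interfaces may differ. Compression uses Python's standard `zlib` module as a stand-in.

```python
import zlib


class Tag:
    """Hypothetical tag: a named group of blobs with attached traits."""

    def __init__(self, name):
        self.name = name
        self.blobs = {}
        self.traits = []

    def attach_trait(self, trait):
        # A trait is modeled here as a function from bytes to bytes.
        self.traits.append(trait)

    def put(self, blob_name, data):
        # Every trait is applied transparently to each blob in the group.
        for trait in self.traits:
            data = trait(data)
        self.blobs[blob_name] = data


tag = Tag("simulation_output")
tag.attach_trait(zlib.compress)
tag.put("step_0", b"a" * 1024)
tag.put("step_1", b"b" * 1024)
```

The caller never compresses anything explicitly: attaching the trait once is enough for every blob placed under the tag.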
This release also features enhanced portability. Hermes no longer requires client programs to follow MPI design patterns. In addition, we have addressed a number of issues which have prevented Hermes from being installed on recent OS versions.
v0.9.5-beta
This release features enhanced concurrency control and more attention to usability. Hermes now provides dynamic data structures, so users no longer need to configure strict limits on data structures such as Buckets and VBuckets. This simplifies configuration and reduces accidental segfaults due to misconfiguration. In addition, these data structures reduce the complexity of extending Hermes to support new policies, such as Data Placement Engines and Prefetchers.
Hermes 0.9.0-beta
What's Changed
New Features
- Provide stage-in / stage-out capabilities
Other Changes
- Unified the adapter codes to inherit from the same base class (reduce code duplication)
- Add a new ADAPTER_MODE (WORKFLOW) to enable data to remain in Hermes after program closes
- Added an explicit hermes_finalize script to ensure hermes daemon is stopped
- Make it so that IOR does not require transfer size to be a multiple of the file page size
- Make MinimizeIoTime DPE respect capacity constraints
- Support for parallel I/O to POSIX / STDIO files
Full Changelog: v0.8.0-beta...v0.9.0-beta
Hermes 0.8.0-beta
What's Changed
New Features
- Added new Buffer Organizer component by @ChristopherHogan in #440
- See the details in this webinar.
Other Changes
- Make gflags an optional dependency by @ChristopherHogan in #427
- Fix GLPK bug when one or more constraints are disabled. by @hyoklee in #430
- Fix several CI and build-related issues by @ChristopherHogan in #433
  - Update spack to 0.18.0
  - Update hdf5 to 1.13.1
  - Install hdf5 with spack instead of manually.
  - Add IOR so that IOR+VFD tests are run.
  - Add glog as a dependency of libhermes.
- Update to Catch2 version 3.0.1. by @hyoklee in #437
Full Changelog: v0.7.0-beta...v0.8.0-beta