SimulationHelper is a Python framework for managing simulation environments. It streamlines the setup, execution, and logging of simulation runs, particularly for computational experiments and machine learning model training. The package organizes output directories, handles logs, saves and restores data, and manages the simulation lifecycle.
pip install numpy torch h5py
To use SimulationHelper in your project, import the Simulation class from the module. Begin by defining a simulation context using the Simulation class as a context manager. This automatically sets up logging and output directories and, optionally, handles SIGINT signals for graceful interruption.
from simulation.simulation import Simulation

if __name__ == '__main__':
    sim_name = "example_simulation"
    with Simulation(sim_name=sim_name, output_root='simulation_outputs') as sim:
        # Your simulation code here
        print(f'Simulation output directory: {sim.outdir}')
SimulationHelper simplifies data management by providing methods to save and restore data in various formats. Use save_data to save data and restore_data to load it back into your simulation. PyTorch checkpoints can be written with save_pytorch. For example, to save a model's and optimizer's state in PyTorch:
checkpoint = {'model_state': model.state_dict(), 'optimizer_state': optimizer.state_dict()}
sim.save_pytorch(checkpoint, epoch=epoch)
To restore this data in a subsequent run:
data = sim.restore_data(title="checkpoint_epoch_10", mode="pkl")
model.load_state_dict(data['model_state'])
optimizer.load_state_dict(data['optimizer_state'])
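The title and mode arguments above suggest that saved data is addressed by a name and a file format. As a minimal sketch of that idea using only the standard library, the following round-trips a dictionary through a pickle file. The helpers save_pickle and restore_pickle and the `<title>.pkl` file layout are hypothetical and only illustrate the pattern; they are not SimulationHelper's API, and its actual file naming may differ.

```python
import pickle
import tempfile
from pathlib import Path


def save_pickle(outdir, title, data):
    """Hypothetical helper: write `data` to <outdir>/<title>.pkl."""
    path = Path(outdir) / f"{title}.pkl"
    with open(path, "wb") as f:
        pickle.dump(data, f)
    return path


def restore_pickle(outdir, title):
    """Hypothetical helper: load what save_pickle wrote under `title`."""
    path = Path(outdir) / f"{title}.pkl"
    with open(path, "rb") as f:
        return pickle.load(f)


if __name__ == "__main__":
    # A temporary directory stands in for the simulation output directory.
    with tempfile.TemporaryDirectory() as outdir:
        save_pickle(outdir, "checkpoint_epoch_10", {"epoch": 10, "loss": 0.25})
        print(restore_pickle(outdir, "checkpoint_epoch_10"))
```

Addressing files by title keeps the calling code free of path handling, which is presumably why restore_data takes a title rather than a full path.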
The Simulation class manages all logs and outputs, directing them to the specified output directory. By default, output directories are named using the simulation name, a numbered sequence, and a date stamp. This behavior can be customized via the Simulation constructor.
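To make the naming scheme concrete, here is a minimal sketch of how a directory name could be assembled from a simulation name, a running number, and a date stamp. The helper next_run_dir and the exact pattern `<name>_<NNN>_<YYYY-MM-DD>` are assumptions made for illustration; the format SimulationHelper actually produces may differ.

```python
import re
import tempfile
from datetime import date
from pathlib import Path


def next_run_dir(output_root, sim_name, today=None):
    """Hypothetical helper: return <root>/<sim_name>_<NNN>_<YYYY-MM-DD>.

    The number is one higher than the largest number already used by a
    sibling directory with the same base name.
    """
    today = today or date.today()
    # sim_name may contain '/', e.g. "dataset/model", so split off the parent.
    parent = (Path(output_root) / sim_name).parent
    base = Path(sim_name).name
    pattern = re.compile(rf"^{re.escape(base)}_(\d+)_")

    used = []
    if parent.is_dir():
        for entry in parent.iterdir():
            match = pattern.match(entry.name)
            if match and entry.is_dir():
                used.append(int(match.group(1)))

    number = max(used, default=0) + 1
    return parent / f"{base}_{number:03d}_{today.isoformat()}"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        first = next_run_dir(root, "example_simulation")
        first.mkdir(parents=True)
        second = next_run_dir(root, "example_simulation")
        print(first.name)
        print(second.name)
```

Including a running number keeps repeated runs on the same day from overwriting each other, while the date stamp makes directories easy to sort and identify.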
SimulationHelper can catch SIGINT signals (e.g., from pressing Ctrl+C) to gracefully terminate the simulation. Set catch_sigint=True in the Simulation constructor to enable this feature. Check sim.received_sigint to determine if an interruption signal was received and handle it as needed.
import sys
from os.path import join

import torch
from torch.utils.tensorboard import SummaryWriter

from utils.Trainer import Trainer
from simulation.simulation import Simulation

if __name__ == '__main__':
    cfg = ...  # configuration dict providing 'dataset_name' and 'num_epochs'
    model = ...
    optimizer = ...
    trainer = Trainer(model, optimizer)

    sim_name = f"{cfg['dataset_name']}/{model.name}"
    with Simulation(sim_name=sim_name, output_root='runs', catch_sigint=True) as sim:
        writer = SummaryWriter(join(sim.outdir, 'tensorboard'))
        for epoch in range(cfg['num_epochs']):
            # Stop cleanly if Ctrl+C was pressed since the last check
            if sim.received_sigint:
                print('Interrupted by user, stopping training...')
                break
            trainer.do(...)
            # Save a checkpoint at the end of every epoch
            checkpoint = {'epoch': epoch, 'state_dict': model.state_dict(), 'optimizer': optimizer.state_dict()}
            sim.save_pytorch(checkpoint, epoch=epoch)
        writer.close()
        print(f'\nRun {sim.outdir} finished\n')
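For readers curious how a flag such as received_sigint can work, the following is a minimal sketch of a context manager that installs a SIGINT handler, records the interruption instead of raising KeyboardInterrupt, and restores the previous handler on exit. The class SigintFlag is a hypothetical stand-in built on Python's standard signal module; it is not SimulationHelper's implementation, which may behave differently. Note that signal handlers can only be installed from the main thread.

```python
import signal


class SigintFlag:
    """Hypothetical stand-in for the catch_sigint behaviour described above."""

    def __init__(self):
        self.received_sigint = False
        self._previous = None

    def _handler(self, signum, frame):
        # Record the interruption; the main loop decides when to stop.
        self.received_sigint = True

    def __enter__(self):
        # signal.signal returns the handler that was installed before.
        self._previous = signal.signal(signal.SIGINT, self._handler)
        return self

    def __exit__(self, exc_type, exc, tb):
        # Restore the earlier handler so Ctrl+C behaves normally again.
        signal.signal(signal.SIGINT, self._previous)
        return False


if __name__ == "__main__":
    with SigintFlag() as flag:
        for step in range(5):
            if step == 2:
                signal.raise_signal(signal.SIGINT)  # simulate Ctrl+C (Python 3.8+)
            if flag.received_sigint:
                print(f"Interrupted, stopping at step {step}")
                break
```

Polling a flag at a safe point in the loop, rather than letting the interrupt propagate as an exception, lets the current step finish and any checkpoint be written before the run stops.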