A logging framework built with ML / PyTorch experiments in mind

f-ilic/SimulationHelper


SimulationHelper

SimulationHelper is a Python framework for managing simulation runs. It streamlines the setup, execution, and logging of computational experiments and machine learning training runs: it organizes output directories, handles logs, saves and restores data, and manages the simulation lifecycle.

Requirements

pip install numpy torch h5py

Usage

Basic Usage

To use SimulationHelper in your project, import the Simulation class and use it as a context manager around your simulation code. The context manager automatically sets up logging and the output directory and, optionally, handles SIGINT signals for graceful interruption.

from simulation.simulation import Simulation

if __name__ == '__main__':
    sim_name = "example_simulation"
    with Simulation(sim_name=sim_name, output_root='simulation_outputs') as sim:
        # Your simulation code here
        print(f'Simulation output directory: {sim.outdir}')

Advanced Features

SimulationHelper provides methods to save and restore data in various formats. Use save_data to save data and restore_data to load it back into your simulation. PyTorch objects are saved with save_pytorch. For example, to save a model's and an optimizer's state:

checkpoint = {'model_state': model.state_dict(), 'optimizer_state': optimizer.state_dict()}
sim.save_pytorch(checkpoint, epoch=epoch)

To restore this data in a subsequent run:

data = sim.restore_data(title="checkpoint_epoch_10", mode="pkl")
model.load_state_dict(data['model_state'])
optimizer.load_state_dict(data['optimizer_state'])
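
The restore call above only works if data with that title was saved in an earlier run. As a library-independent illustration of such a save/restore round trip, here is a minimal sketch using only the standard library's pickle module. The helpers save_pickle and restore_pickle and the "<title>.pkl" file layout are hypothetical, chosen for illustration; they are not SimulationHelper's API and may differ from how it stores files.

```python
import pickle
import tempfile
from pathlib import Path


def save_pickle(outdir, title, data):
    """Write `data` to <outdir>/<title>.pkl and return the path (hypothetical helper)."""
    path = Path(outdir) / f"{title}.pkl"
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "wb") as f:
        pickle.dump(data, f)
    return path


def restore_pickle(outdir, title):
    """Load the object previously stored under `title` (hypothetical helper)."""
    with open(Path(outdir) / f"{title}.pkl", "rb") as f:
        return pickle.load(f)


def roundtrip_demo():
    """Save a small checkpoint-like dict to a temporary directory and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        checkpoint = {"epoch": 10, "model_state": {"w": [0.1, 0.2]}}
        save_pickle(tmp, "checkpoint_epoch_10", checkpoint)
        return restore_pickle(tmp, "checkpoint_epoch_10")
```

Keying files by a title, as in this sketch, means a later run only needs the output directory and the title to find its data again.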

Logging and Output Directories

The Simulation class manages all logs and outputs, directing them to the specified output directory. By default, output directories are named using the simulation name, a numbered sequence, and a date stamp. This behavior can be customized via the Simulation constructor.
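
To make the naming scheme concrete, the following sketch builds a run directory from a simulation name, a sequence number, and a date stamp. The function next_run_dir and the exact "<NNN>_<YYYY-MM-DD>" layout are assumptions made for illustration; the directory names SimulationHelper actually produces may look different.

```python
import re
import tempfile
from datetime import date
from pathlib import Path


def next_run_dir(output_root, sim_name, today=None):
    """Create and return <output_root>/<sim_name>/<NNN>_<YYYY-MM-DD> (hypothetical layout)."""
    today = today or date.today()
    base = Path(output_root) / sim_name
    base.mkdir(parents=True, exist_ok=True)
    # Collect the sequence numbers of run directories that already exist.
    used = [int(m.group(1)) for p in base.iterdir()
            if p.is_dir() and (m := re.match(r"(\d+)_", p.name))]
    # The next run gets the highest existing number plus one, starting at 000.
    run_dir = base / f"{max(used, default=-1) + 1:03d}_{today.isoformat()}"
    run_dir.mkdir()
    return run_dir
```

Numbering runs this way keeps repeated runs of the same simulation from overwriting each other, while the date stamp makes it easy to see when a run was started.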

Handling Interruptions

SimulationHelper can catch SIGINT signals (e.g., from pressing Ctrl+C) to gracefully terminate the simulation. Set catch_sigint=True in the Simulation constructor to enable this feature. Check sim.received_sigint to determine if an interruption signal was received and handle it as needed.
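
The general pattern behind such a flag can be sketched with the standard library's signal module. The class SigintFlag below is a hypothetical stand-in written for illustration, not SimulationHelper's implementation: it installs a handler that records the signal instead of raising KeyboardInterrupt, and restores the previous handler on exit. Note that signal handlers can only be installed from the main thread.

```python
import signal


class SigintFlag:
    """Context manager that records SIGINT in a flag (hypothetical sketch)."""

    def __init__(self):
        self.received_sigint = False
        self._previous = None

    def _handler(self, signum, frame):
        # Only remember that the signal arrived; the main loop decides what to do.
        self.received_sigint = True

    def __enter__(self):
        self._previous = signal.signal(signal.SIGINT, self._handler)
        return self

    def __exit__(self, exc_type, exc, tb):
        # Restore whatever handler was active before entering the context.
        if self._previous is not None:
            signal.signal(signal.SIGINT, self._previous)
        return False


def demo():
    """Run a short loop, simulate Ctrl+C at step 2, and return the step where it stopped."""
    with SigintFlag() as flag:
        for step in range(5):
            if step == 2:
                signal.raise_signal(signal.SIGINT)  # stands in for the user pressing Ctrl+C
            if flag.received_sigint:
                return step
    return None
```

Polling a flag once per iteration lets the loop finish its current step and save a checkpoint before stopping, rather than being aborted at an arbitrary point.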

Example with PyTorch / ML code

from os.path import join  # used below to build the TensorBoard log path

import torch
from torch.utils.tensorboard import SummaryWriter
from utils.Trainer import Trainer
from simulation.simulation import Simulation

if __name__ == '__main__':
    cfg = ...  # experiment configuration, a dict with 'dataset_name' and 'num_epochs'
    model = ...
    optimizer = ...
    trainer = Trainer(model, optimizer)

    sim_name = f"{cfg['dataset_name']}/{model.name}"
    with Simulation(sim_name=sim_name, output_root='runs', catch_sigint=True) as sim:
        writer = SummaryWriter(join(sim.outdir, 'tensorboard'))

        for epoch in range(cfg['num_epochs']):
            if sim.received_sigint:
                print('Interrupted by user, stopping training...')
                break
            trainer.do(...)
            checkpoint = {'epoch': epoch, 'state_dict': model.state_dict(), 'optimizer': optimizer.state_dict()}
            sim.save_pytorch(checkpoint, epoch=epoch)

        writer.close()  # flush pending TensorBoard events to disk
        print(f'\nRun {sim.outdir} finished\n')
