Simple pipeline framework in Python
fpierfed/owl_pipe
STPipe

WARNING: STPipe (formerly known as owl_pipe) is being moved to https://svn6.assembla.com/svn/jwst/trunk/stpipe/ (please update your bookmarks).

STPipe is a lightweight pipeline framework. It is implemented as a Python module called stpipe with no dependencies other than a reasonably recent version of Python (see Requirements) and ConfigObj (with the optional validate.py module). It is called either "OWL Pipe" or stpipe; the two names are used interchangeably in this document.

STPipe allows developers to construct Python-based scientific pipelines out of reusable code. The framework provides two containers: a Pipeline object and a Step object. A Pipeline is an ordered collection of Steps, where a Step is some Python code (that performs some part of the computation) together with the configuration data needed to run that code. As a simple example from basic astronomical processing, a Step could perform bias subtraction while a Pipeline could perform basic image detrending.

Pipelines in stpipe are built by creating an ASCII configuration file. That file is where, among other things, the Steps forming the pipeline (and their inputs and outputs) are defined. Each Step has a separate configuration file where its parameters can be specified.

Both Pipeline and Step instances have access to these instance variables:
1. log: a Logger instance from the Python logging module.
2. qualified_name: the fully qualified name, mostly used in logs. For Pipeline instances it is <system>.<name>. For Step instances it is <pipeline qualified_name>.<name>.
3. name: the Pipeline/Step name.

Pipeline instances have these additional instance variables:
1. system: the name of the Pipeline system.
2. log_level: the log threshold.
3. local_logs: boolean determining whether to write logs to disk or to STDOUT.
4. steps: list of Step instances for that Pipeline.
5. clipboard: a dictionary for passing data in memory across Steps (see below).

Step instances have these additional instance variables:
1. pipeline: the Pipeline instance the Step belongs to.
2. Any parameter defined in the parameters section of the Step configuration file (see below).
3. Any input/output containers defined in the main Pipeline configuration file (see below).

Clearly, STPipe stands on the shoulders of giants, among them NOAO NHPPS and LSST pex_harness. As such, it borrows features and ideas from its predecessors quite heavily.

Requirements

STPipe uses the "ConfigObj" module together with its optional validation code. It also requires a reasonably recent version of Python, likely 2.5 or later.

Python can be obtained from http://www.python.org

ConfigObj can be found at http://www.voidspace.org.uk/python/configobj.html but remember to download both configobj.py and validate.py, or the .zip file which contains both. Alternatively, install it from PyPI (for example by using easy_install).

Installation

Installation uses the standard Python distutils module, as detailed on the Python web site: http://docs.python.org/install/index.html#install-index

Basically,

    shell> python setup.py install

should suffice.

Configuration File Format

The Pipeline and Step definition files are written in INI format with some extensions from ConfigObj, e.g. the use of '#' to indicate comments, string interpolation, etc. (see http://www.voidspace.org.uk/python/configobj.html#the-config-file-format).

Pipeline Definition File Structure

Required high level sections:
1. pipeline

Optional high level sections: None

The pipeline section

Required keys:
1. name: the name of the Pipeline. It can be anything and is only used in logs.
2. system: the name of the pipeline system. Again, it can be anything and is only used in logs to group pipelines together.
3. steps: a list of sections, each one defining a Step (see below).

Optional keys:
1. log_level: if present, it must be one of the log levels defined by the Python logging module. It defaults to "DEBUG".
2. local_log_mode: if present and true, a log file is written to disk in the local directory as <system>.<name>.log. If false, log messages are written to STDOUT. It defaults to false.

Sections in the steps list (the section name is the name of the Pipeline Step it refers to)

Required keys:
1. config_file: the path to the Step configuration file (see below).
2. python_class: the fully qualified name of the Python class to instantiate for the Step.

Optional keys:
1. input: a comma-separated list of strings enclosed in double quotes. Each string is itself a comma-separated, two-element list of variable name and corresponding object class name. The object class name can be omitted if type validation is not needed. The Step instance will have these instance variables defined and pre-populated (from the Pipeline clipboard).
2. output: a comma-separated list of strings enclosed in double quotes. Each string is itself a comma-separated, two-element list of variable name and corresponding object class name. The object class name can be omitted if type validation is not needed. The Step instance will have these instance variables defined but not pre-populated. It is the responsibility of the Step code to assign values to them so that STPipe can pass them to subsequent Steps.

Step Configuration File Structure

Required high level sections: None

Optional high level sections:
1. parameters

The parameters section

Required keys: None

Optional keys:
1. Parameter name = parameter value. The Step instance will have access to parameters by name directly (i.e. as instance variables of the appropriate type, accessed using the usual Python self. notation).
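
The clipboard mechanism described above can be illustrated with a small, self-contained sketch. This is not the stpipe API: the names ToyPipeline, ToyStep, BiasSubtraction, parse_io_spec, process, run, bias_level, image and corrected are all hypothetical, and the type check by class name is only one plausible reading of "type validation". The sketch assumes input/output entries of the form "name, ClassName" (class name optional), copies inputs from the clipboard onto the Step before it runs, and copies outputs back afterwards.

```python
import logging


def parse_io_spec(entries):
    """Turn entries like 'image, list' or 'mask' into (name, class_name) pairs.

    class_name is None when the entry omits it (no type validation).
    """
    parsed = []
    for entry in entries:
        parts = [part.strip() for part in entry.split(",")]
        if not parts[0] or len(parts) > 2:
            raise ValueError("bad input/output entry: %r" % (entry,))
        class_name = parts[1] if len(parts) == 2 and parts[1] else None
        parsed.append((parts[0], class_name))
    return parsed


def check_type(name, value, class_name):
    """Hypothetical validation: compare the value's class name, if one is given."""
    if class_name is not None and type(value).__name__ != class_name:
        raise TypeError("%s should be a %s" % (name, class_name))


class ToyStep(object):
    """Hypothetical stand-in for a Step: code plus configuration."""

    def __init__(self, name, inputs=(), outputs=(), parameters=None):
        self.name = name
        self.inputs = parse_io_spec(inputs)
        self.outputs = parse_io_spec(outputs)
        # Parameters become plain instance variables (self.<parameter name>).
        for key, value in (parameters or {}).items():
            setattr(self, key, value)

    def process(self):
        raise NotImplementedError


class BiasSubtraction(ToyStep):
    def process(self):
        # self.image comes from the clipboard, self.bias_level from parameters.
        self.corrected = [pixel - self.bias_level for pixel in self.image]


class ToyPipeline(object):
    """Hypothetical stand-in for a Pipeline: an ordered collection of Steps."""

    def __init__(self, system, name, steps):
        self.system = system
        self.name = name
        self.qualified_name = "%s.%s" % (system, name)
        self.log = logging.getLogger(self.qualified_name)
        self.steps = steps
        self.clipboard = {}

    def run(self):
        for step in self.steps:
            step.pipeline = self
            step.qualified_name = "%s.%s" % (self.qualified_name, step.name)
            # Inputs are pre-populated from the clipboard.
            for var, class_name in step.inputs:
                check_type(var, self.clipboard[var], class_name)
                setattr(step, var, self.clipboard[var])
            # Outputs are defined but left for the Step code to fill in.
            for var, _ in step.outputs:
                setattr(step, var, None)
            self.log.debug("running %s", step.qualified_name)
            step.process()
            # Outputs go back on the clipboard for subsequent Steps.
            for var, class_name in step.outputs:
                check_type(var, getattr(step, var), class_name)
                self.clipboard[var] = getattr(step, var)


bias_step = BiasSubtraction(
    "bias_subtraction",
    inputs=["image, list"],
    outputs=["corrected, list"],
    parameters={"bias_level": 10},
)
pipeline = ToyPipeline("demo", "detrend", [bias_step])
pipeline.clipboard["image"] = [110, 120, 130]
pipeline.run()
print(pipeline.clipboard["corrected"])  # [100, 110, 120]
```

In a real stpipe pipeline the Step list, inputs, outputs and parameters would come from the Pipeline and Step configuration files rather than from constructor arguments; the sketch only shows the data flow between the clipboard and a Step's instance variables.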