STPipe
WARNING
STPipe (formerly known as owl_pipe) is being moved to
https://svn6.assembla.com/svn/jwst/trunk/stpipe/
Please update your bookmarks.
STPipe is a lightweight pipeline framework. It is implemented as a Python
module called stpipe with no dependencies other than a reasonably recent
version of Python (see below for further information) and ConfigObj (with the
optional validate.py module). It is referred to as either "OWL Pipe" or
stpipe; the two names are used interchangeably in this document.
STPipe allows developers to construct Python-based scientific pipelines out of
reusable code. The framework provides two useful containers: a Pipeline object
and a Step object.
A Pipeline is an ordered collection of Steps, where a Step is some Python code
(that performs some part of the computation) and the configuration data needed
to run that code. As a simple example from basic astronomical processing, a
Step could perform bias subtraction while a Pipeline could perform basic image
detrending.
Pipelines in stpipe are built by creating an ASCII configuration file. That
file is where, among other things, the Steps forming that Pipeline (and their
inputs and outputs) are defined.
Each Step has a separate configuration file where its parameters can be
specified.
Both Pipeline and Step instances have access to some instance variables:
1. log: a Logger instance from the Python logging module.
2. qualified_name: fully qualified name, mostly used in logs. For Pipeline
instances it is <system>.<name>. For Step instances it is
<pipeline qualified_name>.<name>.
3. name: the Pipeline/Step name.
Pipeline instances have these additional instance variables:
1. system: the name of the Pipeline system.
2. log_level: the log threshold.
3. local_logs: boolean determining whether to write logs to disk or STDOUT.
4. steps: list of Step instances for that Pipeline.
5. clipboard: a dictionary for passing data in-memory across Steps (see
below).
Step instances have these additional instance variables:
1. pipeline: the Pipeline instance the Step belongs to.
2. Any parameter defined in the parameters dictionary in the Step
configuration file (see below).
3. Any input/output containers defined in the main Pipeline configuration
file (see below).
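The relationship between these instance variables can be sketched in a few
lines of Python. This is only an illustration under stated assumptions: the
classes SketchPipeline and SketchStep are hypothetical stand-ins, not the
actual stpipe classes, and their constructor signatures are invented for the
example.

```python
import logging


class SketchPipeline(object):
    """Hypothetical stand-in for a Pipeline; not the real stpipe class."""

    def __init__(self, system, name, log_level="DEBUG"):
        self.system = system
        self.name = name
        self.log_level = log_level
        # For Pipeline instances the qualified name is <system>.<name>.
        self.qualified_name = "%s.%s" % (system, name)
        self.log = logging.getLogger(self.qualified_name)
        self.log.setLevel(getattr(logging, log_level))
        self.steps = []
        # Dictionary used to pass data in memory across Steps.
        self.clipboard = {}


class SketchStep(object):
    """Hypothetical stand-in for a Step; not the real stpipe class."""

    def __init__(self, pipeline, name):
        self.pipeline = pipeline
        self.name = name
        # For Step instances: <pipeline qualified_name>.<name>.
        self.qualified_name = "%s.%s" % (pipeline.qualified_name, name)
        self.log = logging.getLogger(self.qualified_name)
        pipeline.steps.append(self)


pipeline = SketchPipeline("demo_system", "detrend")
step = SketchStep(pipeline, "bias_subtraction")
print(step.qualified_name)  # prints demo_system.detrend.bias_subtraction
```

Because the logger names follow the qualified names, the Step logger in this
sketch is a child of the Pipeline logger in the Python logging hierarchy.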
Clearly, STPipe stands on the shoulders of giants, among them NOAO NHPPS and
LSST pex_harness. As such, it borrows features and ideas from its predecessors
quite heavily.
Requirements
STPipe uses the ConfigObj module together with its optional validation module
(validate.py). It also requires a reasonably recent version of Python, likely
2.5 or later. Python can be obtained from http://www.python.org
ConfigObj can be found at http://www.voidspace.org.uk/python/configobj.html but
remember to download both configobj.py and validate.py, or the .zip file which
contains both. Alternatively, install it from PyPI (for example by using
easy_install).
Installation
Installation uses the standard Python distutils module, as detailed on the
Python web site: http://docs.python.org/install/index.html#install-index
Basically,
shell> python setup.py install
should suffice.
Configuration File Format
The Pipeline and Step definition files are written in INI format with some
extensions from ConfigObj (e.g. the use of '#' to indicate comments, string
interpolation etc.)
(see http://www.voidspace.org.uk/python/configobj.html#the-config-file-format).
Pipeline Definition File Structure
Required high level sections:
1. pipeline
Optional high level sections:
None
The pipeline section
Required keys:
1. name: the name of the Pipeline, can be anything and is only used in logs.
2. system: the name of the pipeline system. Again it can be anything and
is only used in logs to group pipelines together.
3. steps: a list of sections, each one defining a Step (see below).
Optional keys:
1. log_level: if present, it must be one of the log levels defined by the
Python logging module. It defaults to "DEBUG".
2. local_log_mode: if present and true, a log file is written to disk in the
local directory as <system>.<name>.log. If false, log messages are
written to STDOUT. It defaults to false.
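How the two optional keys could translate into a configured logger is sketched
below using only the Python logging module. The helper make_logger is
hypothetical and not part of stpipe; it assumes the values have already been
read from the pipeline section of the definition file.

```python
import logging
import sys

# Level names accepted by this sketch (the standard logging levels).
VALID_LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")


def make_logger(system, name, log_level="DEBUG", local_log_mode=False):
    """Hypothetical helper for illustration; not part of stpipe."""
    if log_level not in VALID_LEVELS:
        raise ValueError("unknown log level: %r" % (log_level,))
    qualified_name = "%s.%s" % (system, name)
    logger = logging.getLogger(qualified_name)
    logger.setLevel(getattr(logging, log_level))
    if local_log_mode:
        # Log file in the local directory, named <system>.<name>.log.
        handler = logging.FileHandler(qualified_name + ".log")
    else:
        # Default behaviour: log messages go to STDOUT.
        handler = logging.StreamHandler(sys.stdout)
    logger.addHandler(handler)
    return logger


logger = make_logger("demo_system", "detrend", log_level="INFO")
logger.info("pipeline configured")
```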
Sections in the steps list (the section name is the name of the Pipeline Step it
is referring to)
Required keys:
1. config_file: the path to the Step configuration file (see below).
2. python_class: the fully qualified name of the Python class to instantiate
for the Step.
Optional keys:
1. input: a comma-separated list of strings enclosed in double quotes. Each
string is itself a comma-separated, two-element list of variable name and
corresponding object class name. The object class name can be omitted if
type validation is not needed. The Step instance will have these instance
variables defined and pre-populated (from the Pipeline clipboard).
2. output: a comma-separated list of strings enclosed in double quotes.
Each string is itself a comma-separated, two-element list of variable name
and corresponding object class name. The object class name can be omitted
if type validation is not needed. The Step instance will have these
instance variables defined but not pre-populated. It is the
responsibility of the Step code to assign values to them so that STPipe
can pass them to subsequent Steps.
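The input and output syntax described above can be illustrated with a small
parser. This is a sketch, not the stpipe implementation: parse_io_spec is a
hypothetical helper, and it assumes that the configuration reader has already
turned the value into a list of strings, one per double-quoted entry.

```python
def parse_io_spec(entries):
    """Split each "name, class" entry into a (name, class_name) pair.

    class_name is None when the entry omits it, meaning that no type
    validation is requested. Hypothetical helper; not part of stpipe.
    """
    pairs = []
    for entry in entries:
        parts = [part.strip() for part in entry.split(",")]
        if len(parts) > 2 or not parts[0]:
            raise ValueError("bad input/output entry: %r" % (entry,))
        name = parts[0]
        class_name = parts[1] if len(parts) == 2 and parts[1] else None
        pairs.append((name, class_name))
    return pairs


# What a line such as
#   input = "image, numpy.ndarray", "mask"
# is assumed to look like once the definition file has been read.
raw_input = ["image, numpy.ndarray", "mask"]
print(parse_io_spec(raw_input))
```

In this example the first entry requests type validation against
numpy.ndarray, while the second names a variable without a class.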
Step Configuration File Structure
Required high level sections:
None
Optional high level sections:
1. parameters
The parameters section
Required keys:
None
Optional keys:
1. Parameter name = parameter value. The Step instance will have access to
parameters by name directly (i.e. as instance variables of the
appropriate type, accessed using the usual Python self. notation).
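Exposing parameters as instance variables can be sketched as follows. The
class ParamStep, its constructor and the parameter names threshold and
iterations are all hypothetical; the sketch also assumes the values have
already been converted to their appropriate types when the configuration file
was read.

```python
class ParamStep(object):
    """Hypothetical stand-in for a Step; not the real stpipe class."""

    def __init__(self, name, parameters):
        self.name = name
        # Each entry of the parameters section becomes an instance variable.
        for key, value in parameters.items():
            setattr(self, key, value)

    def run(self):
        # Parameters are read with the usual self. notation.
        return self.threshold * self.iterations


# Stand-in for the content of a [parameters] section after type conversion.
parameters = {"threshold": 2.5, "iterations": 3}
step = ParamStep("bias_subtraction", parameters)
print(step.run())  # prints 7.5
```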