The Pipe-o-matic Pipeline Framework

This repository is a work in progress: a system for authoring and running data pipelines. For now, the core functionality is only partly implemented. Move along...

A pipeline is more than just a script. It carefully tracks which steps have been executed and provides facilities for recovering from errors.

Example pipeline: my-pipeline-1.yaml

- file_type: explicit-sequence-1
- executable-versions:
    foo: "1.0"
    bar: "1.0"
- command: mkdir
  dir: sub_dir
- executable: foo
  stdin: input_file  # may be parameterized
  stdout: sub_dir/intermediate_file
- command: md5
  stdin: sub_dir/intermediate_file
  stdout: checksum.md5
- executable: bar
  arguments:
    - sub_dir  # may be parameterized
  stderr: bar.log

Assuming that things are configured correctly, you could run that pipeline like this:

pmatic my-pipeline-1 $BASE_DIR run

In order for that to work, you would need some way of locating those executables. You would do so with a deployments file:

deployments.yaml

file_type: deployments-1
foo:
    "1.0": /usr/local/foo-1.0/bin/foo  # path to the executable
bar:
    "1.0": /usr/local/bar-1.0/bin/bar
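
The lookup that a deployments file enables can be sketched in Python. This is only an illustration under stated assumptions, not pmatic's actual implementation: the dictionaries mirror the two YAML files above as they might look once parsed, and the function names resolve_executable and build_argv are hypothetical.

```python
# Hypothetical sketch: these dictionaries mirror the YAML files shown above.
# resolve_executable and build_argv are illustrative names, not pmatic's API.

DEPLOYMENTS = {
    "file_type": "deployments-1",
    "foo": {"1.0": "/usr/local/foo-1.0/bin/foo"},
    "bar": {"1.0": "/usr/local/bar-1.0/bin/bar"},
}

# The executable-versions entry from my-pipeline-1.yaml.
EXECUTABLE_VERSIONS = {"foo": "1.0", "bar": "1.0"}


def resolve_executable(name, versions, deployments):
    """Return the deployed path for the version of `name` the pipeline pins."""
    if name not in versions:
        raise KeyError(f"pipeline does not pin a version for {name!r}")
    version = versions[name]
    try:
        return deployments[name][version]
    except KeyError:
        raise KeyError(f"no deployment for {name} version {version}") from None


def build_argv(step, versions, deployments):
    """Build the argument vector for one `executable` step of the pipeline."""
    path = resolve_executable(step["executable"], versions, deployments)
    return [path] + list(step.get("arguments", []))


# The final step of the example pipeline.
bar_step = {"executable": "bar", "arguments": ["sub_dir"], "stderr": "bar.log"}
print(build_argv(bar_step, EXECUTABLE_VERSIONS, DEPLOYMENTS))
# prints ['/usr/local/bar-1.0/bin/bar', 'sub_dir']
```

Keeping the version pin in the pipeline file and the path in the deployments file means the same pipeline could run unchanged on machines that install foo and bar in different locations.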
