pipestat

What is this?

Pipestat standardizes the reporting of pipeline results. It provides (1) a standard specification for how pipeline outputs should be stored, and (2) an implementation that writes results in that format from within Python or from the command line.

How does it work?

A pipeline author declares every output the pipeline produces by writing a JSON Schema. As the pipeline runs, it uses pipestat to report those outputs through either the Python API or the command-line interface. The user chooses where results are stored: a YAML-formatted file or a PostgreSQL database. Results are recorded according to the pipestat specification in a standard, pipeline-agnostic way, so downstream software can rely on the specification to build universal tools for analyzing, monitoring, and visualizing results that work with any pipeline or workflow.
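
The flow described above can be pictured with a small, self-contained sketch. It is not pipestat's implementation: the `SCHEMA` dictionary, the `report` function, and the JSON results file are hypothetical stand-ins (JSON is used only because it ships with Python). The sketch shows declared outputs being validated and then accumulated per record.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical, simplified schema: maps each declared result name to a Python type.
# Real pipestat schemas are richer; this only illustrates the idea.
SCHEMA = {"result_name": str, "read_count": int}


def report(results_file: Path, record_id: str, values: dict) -> dict:
    """Validate values against SCHEMA, then store them under record_id."""
    for name, value in values.items():
        if name not in SCHEMA:
            raise KeyError(f"'{name}' is not declared in the schema")
        if not isinstance(value, SCHEMA[name]):
            raise TypeError(f"'{name}' must be of type {SCHEMA[name].__name__}")
    # Load previously reported results, if any, then merge in the new ones.
    stored = json.loads(results_file.read_text()) if results_file.exists() else {}
    stored.setdefault(record_id, {}).update(values)
    results_file.write_text(json.dumps(stored, indent=2))
    return stored


# Two separate reports for the same record end up in one results file.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "results.json"
    report(path, "my_record", {"result_name": "ok"})
    final = report(path, "my_record", {"read_count": 42})
```

Because every result is checked against the declared schema before it is written, any tool that reads the results file can rely on the same names and types regardless of which pipeline produced them.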

Quick start

Install pipestat

pip install pipestat

Set environment variables (optional)

export PIPESTAT_RESULTS_SCHEMA=output_schema.yaml
export PIPESTAT_RECORD_IDENTIFIER=my_record
export PIPESTAT_RESULTS_FILE=results_file.yaml

Note: If you set the environment variables as in the example above, you must also provide an output_schema.yaml file in your current working directory. For example:

title: An example Pipestat output schema
description: A pipeline that uses pipestat to report sample and project level results.
type: object
properties:
  pipeline_name: "default_pipeline_name"
  samples:
    type: object
    properties:
      result_name:
        type: string
        description: "ResultName"
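
The same setup can also be scripted from Python. The sketch below is only one possible arrangement: the helper name `prepare_quickstart` is hypothetical, and it points the variables at full paths inside a scratch directory rather than at the bare file names used above, on the assumption that pipestat accepts any valid path in these variables.

```python
import os
import tempfile
from pathlib import Path

# The example schema shown in this README.
EXAMPLE_SCHEMA = """\
title: An example Pipestat output schema
description: A pipeline that uses pipestat to report sample and project level results.
type: object
properties:
  pipeline_name: "default_pipeline_name"
  samples:
    type: object
    properties:
      result_name:
        type: string
        description: "ResultName"
"""


def prepare_quickstart(workdir: Path) -> dict:
    """Write output_schema.yaml into workdir and set the three variables."""
    schema_path = workdir / "output_schema.yaml"
    schema_path.write_text(EXAMPLE_SCHEMA)
    env = {
        "PIPESTAT_RESULTS_SCHEMA": str(schema_path),
        "PIPESTAT_RECORD_IDENTIFIER": "my_record",
        # Assumption: the results file does not need to exist beforehand.
        "PIPESTAT_RESULTS_FILE": str(workdir / "results_file.yaml"),
    }
    # Affects only this process and any child processes it starts.
    os.environ.update(env)
    return env


workdir = Path(tempfile.mkdtemp())
env = prepare_quickstart(workdir)
for name, value in env.items():
    print(f"{name}={value}")
```

Setting the variables through `os.environ` has the same scope as a shell `export` in the current session: it does not persist after the process exits.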

Pipeline results reporting and retrieval

The examples below assume that the environment variables shown above have been set.

Report a result

From command line:

pipestat report -i result_name -v 1.1

From Python:

import pipestat

psm = pipestat.PipestatManager()
psm.report(values={"result_name": 1.1})

Retrieve a result

From command line:

pipestat retrieve -r my_record

From Python:

import pipestat

psm = pipestat.PipestatManager()
psm.retrieve_one(result_identifier="result_name")

Pipeline status management

Set status

From command line:

pipestat status set running

From Python:

import pipestat

psm = pipestat.PipestatManager()
psm.set_status(status_identifier="running")

Get status

From command line:

pipestat status get

From Python:

import pipestat

psm = pipestat.PipestatManager()
psm.get_status()
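
Putting it together

The individual Python calls above can be combined into one script. The sketch below uses only the calls shown in this README; the wrapper function `run_quickstart` is hypothetical, and the shape of the values returned by `retrieve_one` and `get_status` is not specified here, so they are passed through unchanged. It assumes the environment variables and schema file from the quick start are in place.

```python
def run_quickstart():
    """Report a result, read it back, then set and read the pipeline status."""
    # Imported lazily so this file can be loaded where pipestat is not installed.
    import pipestat

    # Created without arguments, so configuration is expected to come from
    # the PIPESTAT_* environment variables set earlier.
    psm = pipestat.PipestatManager()

    # Report a value, then retrieve it by its result identifier.
    psm.report(values={"result_name": 1.1})
    result = psm.retrieve_one(result_identifier="result_name")

    # Mark the pipeline as running, then read the status back.
    psm.set_status(status_identifier="running")
    status = psm.get_status()

    return result, status
```

Calling `run_quickstart()` in a shell where the three variables are exported should leave both the reported result and the status recorded in the configured results file.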