Commit

Docs tweaks
mistercrunch committed May 20, 2015
1 parent d45ab12 commit 1fb575a
Showing 2 changed files with 5 additions and 7 deletions.
4 changes: 1 addition & 3 deletions TODO.md
@@ -6,8 +6,7 @@ TODO
* Charts: better error handling

#### Command line
* `airflow task_state dag_id task_id YYYY-MM-DD`
-* Backfill, better logging, prompt with details about what it's about to do
+* Backfill, better logging, prompt with details about what tasks are about to run

#### More Operators!
* PIG
@@ -18,7 +17,6 @@ TODO
* Add a run_only_latest flag to BaseOperator, runs only most recent task instance where deps are met
* Pickle all the THINGS!
* Add priority_weight(Int) to BaseOperator, +@property subtree_priority
* BaseExecutor parallelism limit
* Distributed scheduler
* Add decorator to timeout imports on master process [lib](https://github.com/pnpnpn/timeout-decorator)

8 changes: 4 additions & 4 deletions docs/scheduler.rst
@@ -9,18 +9,18 @@ next run.

Note that:

-* It *won't parallelize* multiple instances of the same tasks, it always wait for the previous schedule to be done to move forward
-* It will *not fill in gaps*, it only moves forward in time from the latest task instance on that task
+* It **won't parallelize** multiple instances of the same tasks, it always wait for the previous schedule to be done to move forward
+* It will **not fill in gaps**, it only moves forward in time from the latest task instance on that task
* If a task instance failed and the task is set to ``depends_on_past=True``, it won't move forward from that point until the error state is cleared and runs successfully, or is marked as successful
* If no task history exist for a task, it will attempt to run it on the task's ``start_date``

Understanding this, you should be able to comprehend what is keeping your
tasks from running or moving forward. To allow the scheduler to move forward, you may want to clear the state
of some task instances, or mark them as successful.

-Here are some of the ways you can *unblock tasks*:
+Here are some of the ways you can **unblock tasks**:

-* From the UI, you can *clear* individual task instances from the tasks instance dialog, while defining whether you want to includes the past/future and the upstream/downstream dependencies. Note that a confirmation window comes next and allows you to see the set you are about to clear.
+* From the UI, you can **clear** (as in delete the status of) individual task instances from the tasks instance dialog, while defining whether you want to includes the past/future and the upstream/downstream dependencies. Note that a confirmation window comes next and allows you to see the set you are about to clear.
* The CLI ``airflow clear -h`` has lots of options when it comes to clearing task instances states, including specfying date ranges, targeting task_ids by specifying a regular expression, flags for including upstream and downstream relatives, and targeting task instances in specific states (``failed``, or ``success``)
* Marking task instances as successful can be done through the UI. This is mostly to fix false negatives, or when the fix has been applied oustide of Airflow for instance.
* The ``airflow backfill`` CLI subcommand has a flag to ``--mark_success`` and allows to select subsections of the dag as well as specifying date ranges.
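
The scheduling rules listed in the ``docs/scheduler.rst`` hunk above can be sketched roughly as follows. This is a toy model written for illustration, not Airflow's scheduler code: ``TaskInstance``, ``next_execution_date``, and the state strings are hypothetical names, and a fixed ``interval`` between runs is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class TaskInstance:
    # Hypothetical stand-in record, NOT Airflow's TaskInstance model.
    execution_date: datetime
    state: str  # assumed values: "running", "success", "failed"


def next_execution_date(
    history: List[TaskInstance],
    start_date: datetime,
    interval: timedelta,
    depends_on_past: bool = False,
) -> Optional[datetime]:
    """Return the next date to run, or None if the task is blocked."""
    # No task history: attempt the task's start_date.
    if not history:
        return start_date

    # Only the latest instance matters, so earlier gaps are never filled.
    latest = max(history, key=lambda ti: ti.execution_date)

    # No parallel instances of the same task: wait for the previous one.
    if latest.state == "running":
        return None

    # A failure blocks progress when depends_on_past is set, until the
    # instance is cleared (removed from history) or marked successful.
    if latest.state == "failed" and depends_on_past:
        return None

    return latest.execution_date + interval
```

In this model, clearing a failed instance amounts to dropping it from ``history``, after which the same date is scheduled again; marking it as successful lets the function move on to the following interval.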
