diff options
author | AlanCoding <arominge@redhat.com> | 2017-10-16 15:58:56 +0200 |
---|---|---|
committer | AlanCoding <arominge@redhat.com> | 2017-11-01 15:22:52 +0100 |
commit | 0ae9283fba138ef18cab0c6a439ed031f1ddc55c (patch) | |
tree | 751c16d186e8f1f0532d15b4a180ca80ccbb1986 /docs/retry_by_status.md | |
parent | Merge pull request #528 from AlanCoding/fix_dep_update (diff) | |
download | awx-0ae9283fba138ef18cab0c6a439ed031f1ddc55c.tar.xz awx-0ae9283fba138ef18cab0c6a439ed031f1ddc55c.zip |
Feature: retry on subset of jobs hosts
Diffstat (limited to 'docs/retry_by_status.md')
-rw-r--r-- | docs/retry_by_status.md | 76 |
1 files changed, 76 insertions, 0 deletions
diff --git a/docs/retry_by_status.md b/docs/retry_by_status.md new file mode 100644 index 0000000000..41b695dcc6 --- /dev/null +++ b/docs/retry_by_status.md @@ -0,0 +1,76 @@ +# Relaunch on Hosts with Status + +This feature allows the user to relaunch a job, targeting only a subset +of hosts that had a particular status in the prior job. + +### API Design of Relaunch + +#### Basic Relaunch + +POST to `/api/v2/jobs/N/relaunch/` without any request data should relaunch +the job with the same `limit` value that the original job used, which +may be an empty string. + +#### Relaunch by Status + +Providing request data containing `{"hosts": "<status>"}` should change +the `limit` of the relaunched job to target the hosts matching that status +from the previous job (unless the default option of "all" is used). +The options and meanings of `<status>` include: + + - all: relaunch without changing the job limit + - ok: relaunch against all hosts with >=1 tasks that returned the "ok" status + - changed: relaunch against all hosts with >=1 tasks had a changed status + - failed: relaunch against all hosts with >=1 tasks failed plus all unreachable hosts + - unreachable: relaunch against all hosts with >=1 task when they were unreachable + +These correspond to the playbook summary states from a playbook run, with +the notable exception of "failed" hosts. Ansible does not count an unreachable +event as a failed task, so unreachable hosts can (and often do) have no +associated failed tasks. The "failed" status here will still target both +status types, because Ansible will mark the _host_ as failed and include it +in the retry file if it was unreachable. + +### Relaunch Endpoint + +Doing a GET to the relaunch endpoint should return additional information +regarding the host summary of the last job. Example response: + +```json +{ + "passwords_needed_to_start": [], + "retry_counts": { + "all": 30, + "failed": 18, + "ok": 25, + "changed": 4, + "unreachable": 9 + } +} +``` + +If the user launches, providing a status for which there were 0 hosts, +then the request will be rejected. + +# Acceptance Criteria + +Scenario: user launches a job against host "foobar", and the run fails +against this host. User changes name of host to "foo", and relaunches job +against failed hosts. The `limit` of the relaunched job should reference +"foo" and not "foobar". + +The user should be able to provide passwords on relaunch, while also +running against hosts of a particular status. + +Not providing the "hosts" key in a POST to the relaunch endpoint should +relaunch the same way that relaunching has previously worked. + +If a playbook provisions a host, this feature should behave reasonably +when relaunching against a status that includes these hosts. + +Feature should work even if hosts have tricky characters in their names, +like commas. + +Also need to consider case where a task `meta: clear_host_errors` is present +inside a playbook, and that the retry subset behavior is the same as Ansible +for this case. |