Amazon provider package system test contributing guide (apache#25547)
ferruzzi authored Aug 5, 2022
1 parent fe97729 commit 5480b4c
Showing 2 changed files with 218 additions and 12 deletions.
199 changes: 199 additions & 0 deletions tests/system/providers/amazon/CONTRIBUTING.md
@@ -0,0 +1,199 @@
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

# Goals

This document aims to develop and maintain consistency across the system tests in
the Amazon provider package and sets forth guidelines to use when possible. With
the expectation that there will always be some edge cases, please refer to this
document as a guide when contributing system tests for the Amazon provider package.

Contributions are welcome and are greatly appreciated! Every bit helps, and credit
will always be given. For example, I stole that line from @potiuk!

# Scope

This guide is meant to be used in addition to the [community system test design
guide](/tests/system/README.md) and only applies to system tests within the Amazon
provider package, though other providers are welcome to adopt it as well. In case
of conflicting information or confusion, this document takes precedence within the
Amazon provider package.

# Tenets

* Keep it Self-Contained
* Keep it DAMP
* Use the Helpers

## Keep it Self-Contained

For system tests, we would prefer to see each test be a fully self-contained unit.
This means that any infrastructure which is needed, such as an S3 bucket, must be
created and destroyed in the test's setup and teardown steps. Some exceptions are
outlined below in the `SystemTestContextBuilder()` section. Please document all
prerequisites for a given system test in your code for easy replication.

## Keep it DAMP

Whenever possible, avoid adding new helper methods to reuse configuration or setup
code. This means there will be duplicate code across tests, which violates the DRY
(Don't Repeat Yourself) programming principle, but ensures that any given test can
be modified easily without having to consider what those changes may mean for every
other test.

Similarly, whenever possible, a variable should be defined in the module instead of
being read from an external location. For example, if you create and tear down an
S3 bucket, the bucket name should be defined in the file rather than set through
an environment variable.

## Use the Helpers

Some helper methods for system tests have been provided in the [utils module](aws/utils/__init__.py).

### SystemTestContextBuilder()

`SystemTestContextBuilder()` should be used as the first task in every Amazon system
test. When used without any stub methods, it will set the test's `ENV_ID`. It checks
for an environment variable defining the system test environment ID. If one is found,
it validates that the ID meets system test requirements. Otherwise, it generates an
ID and exports it. For more information, see `ENV_ID` below.

Any variables which need to be fetched from an external source, such as an environment
variable or SSM, should be added with the `.add_variable()` stub method. The method
will check a number of sources to find a value before falling back to the default.
Use this to provide values for things which must be generated before the test can run.
Whenever possible, it is preferred to use default values when creating these resources
and document the settings in the code for easy reproduction. For example, a comment
such as "This test requires the ARN of a default EC2 instance named `system_test_instance`."
is preferred over creating an EC2 instance using custom settings to try to improve
performance or reduce expense.

Some examples of when this should be used are:

1. A resource which needs special or elevated privileges to create. For example, an IAM Role.
2. A resource which is very time-consuming to create and would slow down tests. For example, an RDS DB.
3. A default resource which is shared across many tests. For example, the default VPC.
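
The "check a number of sources, then fall back to the default" behavior can be pictured with the small sketch below. It is an illustration under stated assumptions rather than the helper's real code: `fetch_value`, `from_environment`, and `from_parameter_store` are hypothetical names, the order of the sources is made up, and raising when no default exists is an assumed design choice.

```python
import os


def fetch_value(key, sources, default=None):
    """Hypothetical sketch: return the first value found in `sources`, else `default`."""
    for source in sources:
        value = source(key)
        if value is not None:
            return value
    if default is None:
        # Assumed behavior: fail loudly when nothing was found and no default exists.
        raise ValueError(f"No value found for {key!r} and no default was provided")
    return default


def from_environment(key):
    """Look the key up in the process environment."""
    return os.environ.get(key)


def from_parameter_store(key):
    """Stand-in for a remote lookup such as SSM; it always misses in this sketch."""
    return None
```

A test would then call something like `fetch_value("ROLE_ARN", [from_environment, from_parameter_store])`, so the same test module works whether the value is exported locally or stored remotely.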

### set_env_id() [*DEPRECATED*]

**NOTE:** `set_env_id()` is deprecated as this functionality is now handled by the
`SystemTestContextBuilder()`. Early AIP-47 system tests used this method so the
below description will remain here until those tests can be updated.

`set_env_id()` should no longer be used to set the value for `ENV_ID`. It
checks for an environment variable defining the system test environment ID. If one
is not found, it generates one and exports it. If one is found, it validates that
the ID meets system test requirements. For more information, see `ENV_ID` below.

### fetch_variable() [*DEPRECATED*]

**NOTE:** Calling `fetch_variable()` directly is deprecated. Instead, please use the
`.add_variable()` stub method of `SystemTestContextBuilder()`. Early AIP-47 system
tests used this method so the below description will remain here until those tests can
be updated.

`fetch_variable()` accepts the name of a variable and an optional default value, and
will check a number of sources to find a value for it before falling back to the default.
Use this for any constant whose value is being sourced from outside the module. This is
used to pass values in for things which must be generated before the test can run.

Some examples of when this should be used are:

1. A resource which needs special or elevated privileges to create. For example, an IAM Role.
2. A resource which is very time-consuming to create and would slow down tests. For example, an RDS DB.
3. A default resource which is shared across many tests. For example, the default VPC.

# Conventions

## Location and Naming

All system tests for the Amazon provider package should be in `tests/system/providers/amazon/aws/`.
If there is only one system test for a given service, the module name should be `example_{service}`.
For example, `example_athena.py`. If more than one module is required, the names should be
descriptive. For example, `example_redshift_cluster.py` and `example_redshift_sql.py`.

## Environment ID

`ENV_ID` should be set via the `SystemTestContextBuilder` and not manually. This
value should be used as part of any value which is test-specific such as the name of
an S3 bucket being created. For example, `BUCKET_NAME = f'{ENV_ID}-test-bucket'`.

If you choose to define an `ENV_ID` in your local environment rather than let the helper
generate one, the value must be lower-case alphanumeric with no special characters and
start with a letter. This ensures maximum compatibility across Amazon services since
different services and resources have different naming requirements.
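
As a small illustration of why the restricted character set helps, the sketch below builds two resource names from one `ENV_ID`. The literal ID and the `TABLE_NAME` constant are made up for the example. Because the ID itself contains neither hyphens nor underscores, the separator can be chosen to suit each service: per the README for these tests, S3 bucket names cannot use underscores and Athena table names cannot use hyphens.

```python
# Example value only; in a real test the ID comes from the test context.
ENV_ID = "env1a2b3c"

# A hyphen is safe for S3 bucket names; an underscore is safe for Athena table names.
BUCKET_NAME = f"{ENV_ID}-test-bucket"
TABLE_NAME = f"{ENV_ID}_test_table"

print(BUCKET_NAME)  # env1a2b3c-test-bucket
print(TABLE_NAME)   # env1a2b3c_test_table
```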

## DAG ID

`DAG_ID` should be the first constant in the module and will be used only when declaring
the DAG object. The value should be the same as the module name. For example, the module
`example_athena.py` should have a `DAG_ID` value of `example_athena`. We declare the `DAG_ID`
as a constant as defined in AIP-47, but due to different services using different naming
conventions, the underscore in the `DAG_ID` is not always safe to use. Therefore, for the
sake of consistency, `ENV_ID` will be used when building asset names instead.

## IAM Role ARNs

When required, an IAM Role ARN should be imported using the `add_variable()` stub with
the key `ROLE_ARN`. `add_variable()` allows the test runner to store the value in an
environment variable or in a secret manager, and standardizing the key name makes using
a secret manager easier.

## Task Dependencies

The task flow definition at the end of the DAG should use the `chain()` method rather than
the `>>` notation. It should also include comment lines to denote the Setup, Body, and
Teardown stages of the test if such stages exist. For example:

```
from airflow.models.baseoperator import chain

chain(
    # TEST SETUP
    test_context,
    create_database,
    create_table,
    # TEST BODY
    copy_selected_data,
    # TEST TEARDOWN
    delete_database,
)
```

## Cleanup

If your test uses any cleanup tasks, they should include `trigger_rule=TriggerRule.ALL_DONE`
to ensure they run even if some test steps fail. This in turn requires that the `watcher()`
method is added as outlined in [AIP-47](https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-47+New+design+of+Airflow+System+Tests).
For example:

```
chain(
    # TEST SETUP
    task0,
    # TEST BODY
    task1,
    # TEST TEARDOWN
    task2,  # task2 has trigger rule "all done" defined
)

from tests.system.utils.watcher import watcher

# This test needs watcher in order to properly mark success/failure
# when "tearDown" task with trigger rule is part of the DAG
list(dag.tasks) >> watcher()
```
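
To see why the watcher matters, consider the toy model below. It is a deliberately simplified simulation and not Airflow's scheduling logic. It assumes a linear chain of tasks, that the DAG run's final state is taken from the last (leaf) task, and that the watcher sits downstream of every task and fails as soon as any of them has failed. The function and constant names are hypothetical.

```python
FAILED_STATES = {"failed", "upstream_failed"}


def simulate(tasks, with_watcher):
    """Toy model of a linear DAG run; returns "success" or "failed".

    tasks: non-empty, ordered list of (succeeds, trigger_rule) pairs, where
    trigger_rule is "all_success" or "all_done".
    """
    states = []
    for succeeds, trigger_rule in tasks:
        # A task is blocked when its upstream did not succeed, unless it uses "all_done".
        blocked = bool(states) and states[-1] != "success" and trigger_rule != "all_done"
        if blocked:
            states.append("upstream_failed")
        else:
            states.append("success" if succeeds else "failed")
    leaf = states[-1]
    if with_watcher:
        # The watcher becomes the only leaf: it fails if anything failed, else it is skipped.
        leaf = "failed" if FAILED_STATES.intersection(states) else "skipped"
    return "failed" if leaf in FAILED_STATES else "success"


# Setup succeeds, the body fails, and the "all_done" teardown still runs and succeeds.
tasks = [(True, "all_success"), (False, "all_success"), (True, "all_done")]
print(simulate(tasks, with_watcher=False))  # success: the body failure is masked
print(simulate(tasks, with_watcher=True))   # failed: the watcher surfaces it
```

Without the watcher, the successful teardown is the leaf task, so the run looks green even though the test body failed.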
31 changes: 19 additions & 12 deletions tests/system/providers/amazon/README.md
@@ -43,7 +43,7 @@ tests/system/providers/amazon/aws

## Initial configuration

-Each test requires some environment variables. Check how to set them up on your
+Each test may require some environment variables. Check how to set them up on your
operating system, but on UNIX-based operating systems `export NAME_OF_ENV_VAR=value`
should work. To confirm that it is set up correctly, run `echo $NAME_OF_ENV_VAR`
which will display its value.
@@ -57,21 +57,28 @@ NAME_OF_ENV_VAR=value pytest --system amazon tests/system/providers/amazon/aws/e

### Required environment variables

-- `SYSTEM_TESTS_ENV_ID` - AWS System Tests will generate and export this value if one does not exist.
+- `SYSTEM_TESTS_ENV_ID` - AWS System Tests use `SystemTestContextBuilder` to generate
+and export this value if one does not exist.

-An environment ID is a unique value across different executions of system tests. This
-is needed because the CI environment may run the tests on various versions of Airflow
-in parallel. If this is the case, the value of this variable ensures that resources
-that are created during the tests will not interfere with each other.
+An environment ID is a unique value across multiple/different executions of a system
+test. It is required because the CI environment may run the tests on various versions
+of Airflow in parallel. If this is the case, the value of this variable ensures that
+resources that are created during the test execution do not interfere with each other.

-The value is used as part of the name for resources which have different requirements.
-For example: an S3 bucket name can not use underscores, but an Athena table name can not
-use hyphens. In order to minimize conflicts, this variable should be a randomized value
-using only lowercase letters A-Z and digits 0-9, and start with a letter.
+The value is used as part of the name for resources which have different requirements.
+For example, an S3 bucket name can not use underscores whereas an Athena table name can
+not use hyphens. In order to minimize conflicts, this variable should be a randomized
+value using only lowercase letters A-Z and digits 0-9, and start with a letter.

## Settings for specific tests

Amazon system test files are designed to be as self-contained as possible. They will contain
any sample data and configuration values which are required, and they will create and tear
-down any required infrastructure. Some tests will require an IAM Role ARN, and the requirements
-for those Roles should be documented inside the docstring of the test file.
+down any required infrastructure. Some tests will require an IAM Role ARN or other external
+variables that can not be created by the test itself, and the requirements for those values
+should be documented inside the docstring of the test file.
+
+### Contributing
+
+Please see [CONTRIBUTING.md](CONTRIBUTING.md) for community guidelines and best practices for
+writing AWS System Tests for Apache Airflow.
