[lint] create a workflow consistency linter (pytorch#80200)
To keep jobs consistent across workflows, introduce a linter that
checks that jobs sharing the same `sync-tag` are identical.

`sync-tag` is just a dummy input on the reusable workflow. I chose to
use a dummy input over the following alternatives:
- The job's id isn't great, because we are likely to change a job's id
  (say, when upgrading CUDA or linux versions)
- The job's name doesn't work as we have build/test jobs that share the
  same name

Pull Request resolved: pytorch#80200
Approved by: https://github.com/janeyx99
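
The check described in the commit message can be sketched as follows. This is a minimal illustration rather than the linter added by this commit: it takes already-parsed workflows as plain dicts and compares them through `json.dumps` instead of PyYAML's `dump`, and the names `mismatched_sync_tags`, `make_workflow`, and the `win-build` job id are hypothetical. As in the commit's script, the `if` field is ignored and the job id is part of what is compared.

```python
import json
from collections import defaultdict
from typing import Any, Dict, List


def mismatched_sync_tags(workflows: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return every sync-tag whose jobs are not all identical."""
    tag_to_jobs: Dict[str, List[str]] = defaultdict(list)
    for workflow in workflows.values():
        for job_id, job in workflow.get("jobs", {}).items():
            sync_tag = job.get("with", {}).get("sync-tag")
            if sync_tag is None:
                continue  # untagged jobs are not checked
            # `if` may legitimately differ between workflows, so drop it.
            comparable = {k: v for k, v in job.items() if k != "if"}
            # Serialize with sorted keys so key order does not matter.
            tag_to_jobs[sync_tag].append(
                json.dumps({job_id: comparable}, sort_keys=True)
            )
    return sorted(tag for tag, jobs in tag_to_jobs.items() if len(set(jobs)) > 1)


def make_workflow(cuda_version: str, condition: str = "") -> Dict[str, Any]:
    """Build a tiny hypothetical workflow with one tagged job."""
    job: Dict[str, Any] = {
        "uses": "./.github/workflows/_win-build.yml",
        "with": {"cuda-version": cuda_version, "sync-tag": "win-cuda-build"},
    }
    if condition:
        job["if"] = condition
    return {"jobs": {"win-build": job}}


pull = make_workflow("11.6", "github.event_name == 'pull_request'")
trunk = make_workflow("11.6")
drifted = make_workflow("11.7")

in_sync = mismatched_sync_tags({"pull.yml": pull, "trunk.yml": trunk})
out_of_sync = mismatched_sync_tags({"pull.yml": pull, "trunk.yml": drifted})
print(in_sync, out_of_sync)
```

The first pair differs only in `if`, so it counts as in sync; the second pair differs in `cuda-version`, so its tag is reported.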
suo authored and pytorchmergebot committed Jul 5, 2022
1 parent 769b446 commit 769df74
Showing 17 changed files with 230 additions and 5 deletions.
7 changes: 7 additions & 0 deletions .github/workflows/_android-build-test.yml
@@ -11,6 +11,13 @@ on:
        required: true
        type: string
        description: Name of the base docker image to build with.
      sync-tag:
        required: false
        type: string
        default: ""
        description: |
          If this is set, our linter will use this to make sure that every other
          job with the same `sync-tag` is identical.
env:
  GIT_DEFAULT_BRANCH: ${{ github.event.repository.default_branch }}
7 changes: 7 additions & 0 deletions .github/workflows/_android-full-build-test.yml
@@ -11,6 +11,13 @@ on:
        required: true
        type: string
        description: Name of the base docker image to build with.
      sync-tag:
        required: false
        type: string
        default: ""
        description: |
          If this is set, our linter will use this to make sure that every other
          job with the same `sync-tag` is identical.
    secrets:
      SONATYPE_NEXUS_USERNAME:
7 changes: 7 additions & 0 deletions .github/workflows/_bazel-build-test.yml
@@ -11,6 +11,13 @@ on:
        required: true
        type: string
        description: Name of the base docker image to build with.
      sync-tag:
        required: false
        type: string
        default: ""
        description: |
          If this is set, our linter will use this to make sure that every other
          job with the same `sync-tag` is identical.
env:
  GIT_DEFAULT_BRANCH: ${{ github.event.repository.default_branch }}
7 changes: 7 additions & 0 deletions .github/workflows/_docs.yml
@@ -16,6 +16,13 @@ on:
        type: boolean
        default: false
        description: If set, push the docs to the docs website.
      sync-tag:
        required: false
        type: string
        default: ""
        description: |
          If this is set, our linter will use this to make sure that every other
          job with the same `sync-tag` is identical.
    secrets:
      GH_PYTORCHBOT_TOKEN:
7 changes: 7 additions & 0 deletions .github/workflows/_ios-build-test.yml
@@ -15,6 +15,13 @@ on:
        required: true
        type: string
        description: Which iOS arch to build for.
      sync-tag:
        required: false
        type: string
        default: ""
        description: |
          If this is set, our linter will use this to make sure that every other
          job with the same `sync-tag` is identical.
    secrets:
      IOS_CERT_KEY_2022:
7 changes: 7 additions & 0 deletions .github/workflows/_linux-build.yml
@@ -21,6 +21,13 @@ on:
        type: boolean
        default: false
        description: If set, build in debug mode.
      sync-tag:
        required: false
        type: string
        default: ""
        description: |
          If this is set, our linter will use this to make sure that every other
          job with the same `sync-tag` is identical.
    outputs:
      docker-image:
7 changes: 7 additions & 0 deletions .github/workflows/_linux-test.yml
@@ -15,6 +15,13 @@ on:
        required: true
        type: string
        description: Docker image to run in.
      sync-tag:
        required: false
        type: string
        default: ""
        description: |
          If this is set, our linter will use this to make sure that every other
          job with the same `sync-tag` is identical.
env:
  GIT_DEFAULT_BRANCH: ${{ github.event.repository.default_branch }}
7 changes: 7 additions & 0 deletions .github/workflows/_mac-build.yml
@@ -20,6 +20,13 @@ on:
        type: string
        default: ""
        description: What xcode version to build with.
      sync-tag:
        required: false
        type: string
        default: ""
        description: |
          If this is set, our linter will use this to make sure that every other
          job with the same `sync-tag` is identical.
    secrets:
      MACOS_SCCACHE_S3_ACCESS_KEY_ID:
7 changes: 7 additions & 0 deletions .github/workflows/_mac-test-arm64.yml
@@ -7,6 +7,13 @@ on:
        required: true
        type: string
        description: Top-level label for what's being built/tested.
      sync-tag:
        required: false
        type: string
        default: ""
        description: |
          If this is set, our linter will use this to make sure that every other
          job with the same `sync-tag` is identical.
jobs:
7 changes: 7 additions & 0 deletions .github/workflows/_mac-test.yml
@@ -11,6 +11,13 @@ on:
        required: true
        type: string
        description: JSON description of what test configs to run.
      sync-tag:
        required: false
        type: string
        default: ""
        description: |
          If this is set, our linter will use this to make sure that every other
          job with the same `sync-tag` is identical.
    secrets:
      AWS_OSSCI_METRICS_V2_ACCESS_KEY_ID:
7 changes: 7 additions & 0 deletions .github/workflows/_rocm-test.yml
@@ -19,6 +19,13 @@ on:
        required: true
        type: string
        description: Docker image to run in.
      sync-tag:
        required: false
        type: string
        default: ""
        description: |
          If this is set, our linter will use this to make sure that every other
          job with the same `sync-tag` is identical.
    secrets:
      AWS_OSSCI_METRICS_V2_ACCESS_KEY_ID:
7 changes: 7 additions & 0 deletions .github/workflows/_win-build.yml
@@ -16,6 +16,13 @@ on:
        type: boolean
        default: false
        description: If set, build in debug mode.
      sync-tag:
        required: false
        type: string
        default: ""
        description: |
          If this is set, our linter will use this to make sure that every other
          job with the same `sync-tag` is identical.
env:
  GIT_DEFAULT_BRANCH: ${{ github.event.repository.default_branch }}
7 changes: 7 additions & 0 deletions .github/workflows/_win-test.yml
@@ -15,6 +15,13 @@ on:
        required: true
        type: string
        description: JSON description of what test configs to run.
      sync-tag:
        required: false
        type: string
        default: ""
        description: |
          If this is set, our linter will use this to make sure that every other
          job with the same `sync-tag` is identical.
env:
  GIT_DEFAULT_BRANCH: ${{ github.event.repository.default_branch }}
5 changes: 2 additions & 3 deletions .github/workflows/pull.yml
@@ -250,15 +250,14 @@ jobs:
          { config: "default", shard: 2, num_shards: 2, runner: "windows.4xlarge" },
        ]}
  # please ensure that this and its corresponding job in trunk.yml are in sync
  win-vs2019-cuda11_6-py3-build:
    # don't run build twice on master
    if: github.event_name == 'pull_request'
    name: win-vs2019-cuda11.6-py3
    uses: ./.github/workflows/_win-build.yml
    with:
      build-environment: win-vs2019-cuda11.6-py3
      cuda-version: "11.6"
      sync-tag: win-cuda-build

  linux-xenial-cuda11_3-py3_7-gcc7-bazel-test:
    name: linux-xenial-cuda11.3-py3.7-gcc7-bazel-test
@@ -310,7 +309,6 @@ jobs:
          { config: "deploy", shard: 1, num_shards: 1, runner: "linux.4xlarge.nvidia.gpu" },
        ]}
  # please ensure that this and its corresponding job in trunk.yml are in sync
  linux-bionic-rocm5_1-py3_7-build:
    # don't run build twice on master
    if: github.event_name == 'pull_request'
@@ -319,3 +317,4 @@ jobs:
    with:
      build-environment: linux-bionic-rocm5.1-py3.7
      docker-image-name: pytorch-linux-bionic-rocm5.1-py3.7
      sync-tag: rocm-build
4 changes: 2 additions & 2 deletions .github/workflows/trunk.yml
@@ -197,13 +197,13 @@ jobs:
    with:
      build-environment: macos-10-15-py3-arm64

  # please ensure that this and its corresponding job in pull.yml are in sync
  win-vs2019-cuda11_6-py3-build:
    name: win-vs2019-cuda11.6-py3
    uses: ./.github/workflows/_win-build.yml
    with:
      build-environment: win-vs2019-cuda11.6-py3
      cuda-version: "11.6"
      sync-tag: win-cuda-build

  win-vs2019-cuda11_6-py3-test:
    name: win-vs2019-cuda11.6-py3
@@ -222,14 +222,14 @@ jobs:
          { config: "force_on_cpu", shard: 1, num_shards: 1, runner: "windows.4xlarge" },
        ]}
  # please ensure that this and its corresponding job in pull.yml are in sync
  linux-bionic-rocm5_1-py3_7-build:
    if: false
    name: linux-bionic-rocm5.1-py3.7
    uses: ./.github/workflows/_linux-build.yml
    with:
      build-environment: linux-bionic-rocm5.1-py3.7
      docker-image-name: pytorch-linux-bionic-rocm5.1-py3.7
      sync-tag: rocm-build

  linux-bionic-rocm5_1-py3_7-test:
    name: linux-bionic-rocm5.1-py3.7
20 changes: 20 additions & 0 deletions .lintrunner.toml
@@ -633,3 +633,23 @@ command = [
    '--',
    '@{{PATHSFILE}}'
]

[[linter]]
code = 'WORKFLOWSYNC'
include_patterns = [
    '.github/workflows/pull.yml',
    '.github/workflows/trunk.yml',
    '.github/workflows/periodic.yml',
]
command = [
    'python3',
    'tools/linter/adapters/workflow_consistency_linter.py',
    '--',
    '@{{PATHSFILE}}'
]
init_command = [
    'python3',
    'tools/linter/adapters/pip_init.py',
    '--dry-run={{DRYRUN}}',
    'PyYAML==6.0',
]
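
The `'@{{PATHSFILE}}'` argument in the command above pairs with the `fromfile_prefix_chars="@"` setting in the linter script's argument parser: an argument beginning with `@` is replaced by the contents of the named file, one argument per line. The sketch below shows that mechanism in isolation. It assumes that `{{PATHSFILE}}` is substituted with the path of a file listing the files to lint, which this diff does not itself show; `parse_paths` and the temporary file are hypothetical stand-ins.

```python
import argparse
import os
import tempfile
from typing import List


def parse_paths(argv: List[str]) -> List[str]:
    """Parse positional filenames, expanding any `@file` argument."""
    parser = argparse.ArgumentParser(fromfile_prefix_chars="@")
    parser.add_argument("filenames", nargs="+", help="paths to lint")
    return parser.parse_args(argv).filenames


# Stand-in for the paths file: one path per line.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write(".github/workflows/pull.yml\n.github/workflows/trunk.yml\n")
    paths_file = f.name

try:
    # Mirrors the `'--', '@{{PATHSFILE}}'` tail of the command.
    filenames = parse_paths(["--", f"@{paths_file}"])
finally:
    os.remove(paths_file)

print(filenames)
```

Passing the paths through a file rather than on the command line avoids argument-length limits when many files are linted at once.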
115 changes: 115 additions & 0 deletions tools/linter/adapters/workflow_consistency_linter.py
@@ -0,0 +1,115 @@
"""Checks for consistency of jobs between different GitHub workflows.
Any job with a specific `sync-tag` must match all other jobs with the same `sync-tag`.
"""
import argparse
import itertools
import json
from pathlib import Path
from typing import Iterable, Any, Optional, NamedTuple, Dict
from enum import Enum
from collections import defaultdict

from yaml import load, CSafeLoader, dump


class LintSeverity(str, Enum):
    ERROR = "error"
    WARNING = "warning"
    ADVICE = "advice"
    DISABLED = "disabled"


class LintMessage(NamedTuple):
    path: Optional[str]
    line: Optional[int]
    char: Optional[int]
    code: str
    severity: LintSeverity
    name: str
    original: Optional[str]
    replacement: Optional[str]
    description: Optional[str]


def glob_yamls(path: Path) -> Iterable[Path]:
    return itertools.chain(path.glob("**/*.yml"), path.glob("**/*.yaml"))


def load_yaml(path: Path) -> Any:
    with open(path) as f:
        return load(f, CSafeLoader)


def is_workflow(yaml: Any) -> bool:
    return yaml.get("jobs") is not None


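def print_lint_message(path: Path, job: Dict[str, Any], sync_tag: str) -> None:
    # `job` is a single-entry dict of the form {job_id: job_definition}.
    job_id = list(job.keys())[0]
    with open(path) as f:
        lines = f.readlines()
    # Default to None so the message is still well-formed (and no NameError is
    # raised) if the job id cannot be found in the file.
    line_number: Optional[int] = None
    for i, line in enumerate(lines):
        if f"{job_id}:" in line:
            line_number = i + 1

    lint_message = LintMessage(
        path=str(path),
        line=line_number,
        char=None,
        code="WORKFLOWSYNC",
        severity=LintSeverity.ERROR,
        name="workflow-inconsistency",
        original=None,
        replacement=None,
        description=f"Job doesn't match other jobs with sync-tag: '{sync_tag}'",
    )
    # Report the finding as one JSON object per line on stdout.
    print(json.dumps(lint_message._asdict()), flush=True)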


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="workflow consistency linter.",
        fromfile_prefix_chars="@",
    )
    parser.add_argument(
        "filenames",
        nargs="+",
        help="paths to lint",
    )
    args = parser.parse_args()

    # Go through the provided files, aggregating jobs with the same sync tag
    tag_to_jobs = defaultdict(list)
    for path in args.filenames:
        workflow = load_yaml(Path(path))
        jobs = workflow["jobs"]
        for job_id, job in jobs.items():
            try:
                sync_tag = job["with"]["sync-tag"]
            except KeyError:
                continue

            # remove the "if" field, which we allow to be different between jobs
            # (since you might have different triggering conditions on pull vs.
            # trunk, say.)
            if "if" in job:
                del job["if"]

            tag_to_jobs[sync_tag].append((path, {job_id: job}))

    # For each sync tag, check that all the jobs have the same code.
    for sync_tag, path_and_jobs in tag_to_jobs.items():
        baseline_path, baseline_dict = path_and_jobs.pop()
        baseline_str = dump(baseline_dict)

        printed_baseline = False

        for path, job_dict in path_and_jobs:
            job_str = dump(job_dict)
            if baseline_str != job_str:
                print_lint_message(path, job_dict, sync_tag)

                if not printed_baseline:
                    print_lint_message(baseline_path, baseline_dict, sync_tag)
                    printed_baseline = True
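
For reference, each finding is printed as a single JSON object on one line, with the same field names as `LintMessage`. The sketch below reproduces that shape with a plain dict so it runs without the rest of the script; `format_lint_message` is a hypothetical helper, and the path and line number in the usage are made-up values.

```python
import json
from typing import Any, Dict, Optional


def format_lint_message(path: str, line: Optional[int], sync_tag: str) -> str:
    """Serialize one finding using the LintMessage field names from the script."""
    message: Dict[str, Any] = {
        "path": path,
        "line": line,
        "char": None,
        "code": "WORKFLOWSYNC",
        "severity": "error",
        "name": "workflow-inconsistency",
        "original": None,
        "replacement": None,
        "description": f"Job doesn't match other jobs with sync-tag: '{sync_tag}'",
    }
    return json.dumps(message)


# Hypothetical finding: the path and line number are illustrative only.
output_line = format_lint_message(".github/workflows/pull.yml", 253, "win-cuda-build")
record = json.loads(output_line)
print(record["code"], record["severity"], record["line"])
```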
