Skip to content

Commit

Permalink
Refactor clang_tidy.py (pytorch#61119)
Browse files Browse the repository at this point in the history
Summary:
Pull Request resolved: pytorch#61119

This change spilts the clang-tidy CI job into smaller steps and uses a
refactored version of the clang_tidy.py script.

The new folder structure is as follows:
```
tools/linter/clang_tidy
|_ __main__py
|_ requirements.txt
|_ run.py
|_ setup.sh
```

`__main__.py`

This script will run `tools/linter/clang_tidy/setup.sh` if a `build`
directory doesn't exist, mimicing what used to be done as a separate
step in the CI job.

After that, it will invoke `clang-tidy` with default arguments being
declared in the script itself (as opposed to declaring them in
lint.yml).

The reasoning behind this approach is two-fold:

- Make it easier to run `clang-tidy` locally using this script
- De-duplicate the option passing

`requirements.txt`

Contains a list of additional python dependencies needed by the
`clang-tidy` script.

`setup.sh`

If a build directory doesn't exist, this command will run the necessary
codegen and build commands for running `clang-tidy`

Example usage:
```
python3 tools/linter/clang_tidy --parallel
```
Notice that we don't have to put the `.py` at the end of `clang_tidy`.

Test Plan:
Run the following command:
```
python3 tools/linter/clang_tidy --paths torch/csrc/fx --parallel
```

Reviewed By: walterddr, janeyx99

Differential Revision: D29568582

Pulled By: 1ntEgr8

fbshipit-source-id: cd6d11c5cb8ba9f1344a87c35647a1cd8dd45b04
  • Loading branch information
1ntEgr8 authored and facebook-github-bot committed Jul 6, 2021
1 parent 81e36d0 commit a1ad28d
Show file tree
Hide file tree
Showing 17 changed files with 402 additions and 303 deletions.
71 changes: 11 additions & 60 deletions .github/workflows/lint.yml
Original file line number Diff line number Diff line change
Expand Up @@ -326,75 +326,26 @@ jobs:
mkdir clang-tidy-output
cd clang-tidy-output
echo "$HEAD_SHA" > commit-sha.txt
- name: Generate build files
run: |
cd "${GITHUB_WORKSPACE}"
set -eux
if [ ! -d build ]; then
git submodule update --init --recursive
export USE_NCCL=0
export USE_DEPLOY=1
# We really only need compile_commands.json, so no need to build!
time python3 setup.py --cmake-only build
# Generate ATen files.
time python3 -m tools.codegen.gen \
-s aten/src/ATen \
-d build/aten/src/ATen
# Generate PyTorch files.
time python3 tools/setup_helpers/generate_code.py \
--declarations-path build/aten/src/ATen/Declarations.yaml \
--native-functions-path aten/src/ATen/native/native_functions.yaml \
--nn-path aten/src
fi
- name: Run clang-tidy
- name: Fetch PR diff
env:
HEAD_SHA: ${{ github.event.pull_request.head.sha }}
PR_NUMBER: ${{ github.event.pull_request.number }}
run: |
cd "${GITHUB_WORKSPACE}"
set -eux
wget -O pr.diff "https://patch-diff.githubusercontent.com/raw/pytorch/pytorch/pull/$PR_NUMBER.diff"
# Run Clang-Tidy
# The negative filters below are to exclude files that include onnx_pb.h or
# caffe2_pb.h, otherwise we'd have to build protos as part of this CI job.
# FunctionsManual.cpp is excluded to keep this diff clean. It will be fixed
# in a follow up PR.
# /torch/csrc/generic/*.cpp is excluded because those files aren't actually built.
# deploy/interpreter files are excluded due to using macros and other techniques
# that are not easily converted to accepted c++
python3 tools/linter/clang_tidy.py \
- name: Run clang-tidy
run: |
cd "${GITHUB_WORKSPACE}"
python3 tools/linter/clang_tidy \
--diff-file pr.diff \
--parallel \
--verbose \
--paths torch/csrc/ \
--diff-file pr.diff \
--include-dir /usr/lib/llvm-11/include/openmp \
-g"-torch/csrc/jit/passes/onnx/helper.cpp" \
-g"-torch/csrc/jit/passes/onnx/shape_type_inference.cpp" \
-g"-torch/csrc/jit/serialization/onnx.cpp" \
-g"-torch/csrc/jit/serialization/export.cpp" \
-g"-torch/csrc/jit/serialization/import.cpp" \
-g"-torch/csrc/jit/serialization/import_legacy.cpp" \
-g"-torch/csrc/onnx/init.cpp" \
-g"-torch/csrc/cuda/nccl.*" \
-g"-torch/csrc/cuda/python_nccl.cpp" \
-g"-torch/csrc/autograd/FunctionsManual.cpp" \
-g"-torch/csrc/generic/*.cpp" \
-g"-torch/csrc/jit/codegen/cuda/runtime/*" \
-g"-torch/csrc/deploy/interpreter/interpreter.cpp" \
-g"-torch/csrc/deploy/interpreter/interpreter.h" \
-g"-torch/csrc/deploy/interpreter/interpreter_impl.h" \
-g"-torch/csrc/deploy/interpreter/test_main.cpp" \
"$@" >"${GITHUB_WORKSPACE}"/clang-tidy-output.txt
cat "${GITHUB_WORKSPACE}"/clang-tidy-output.txt
- name: Annotate output
env:
HEAD_SHA: ${{ github.event.pull_request.head.sha }}
run: |
cd "${GITHUB_WORKSPACE}"
tools/linter/translate_annotations.py \
--file=clang-tidy-output.txt \
--regex='^(?P<filename>.*?):(?P<lineNumber>\d+):(?P<columnNumber>\d+): (?P<errorDesc>.*?) \[(?P<errorCode>.*)\]' \
Expand Down
6 changes: 5 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -283,8 +283,9 @@ TAGS
# ccls file
.ccls-cache/

# clang-format storage location used by apply_clang_format.py
# clang tooling storage location
.clang-format-bin
.clang-tidy-bin

# clangd background index
.clangd/
Expand All @@ -305,3 +306,6 @@ bazel-*

# core dump files
core.*

# Generated if you use the pre-commit script for clang-tidy
pr.diff
3 changes: 2 additions & 1 deletion CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -1097,7 +1097,8 @@ have more checks than older versions. In our CI, we run clang-tidy-6.0.
uncommitted changes). Changes are picked up based on a `git diff` with the
given revision:
```bash
python tools/linter/clang_tidy.py -d build -p torch/csrc --diff 'HEAD~1'
git diff HEAD~1 > pr.diff
python tools/linter/clang_tidy --paths torch/csrc --diff-file "pr.diff"
```

Above, it is assumed you are in the PyTorch root folder. `path/to/build` should
Expand Down
3 changes: 2 additions & 1 deletion Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -70,6 +70,7 @@ setup_lint:
--job 'shellcheck' --step 'Install ShellCheck' --no-quiet; \
fi
pip install jinja2
pip install -r tools/linter/clang_tidy/requirements.txt

quick_checks:
# TODO: This is broken when 'git config submodule.recurse' is 'true' since the
Expand Down Expand Up @@ -116,4 +117,4 @@ toc:
lint: flake8 mypy quick_checks cmakelint shellcheck

quicklint: CHANGED_ONLY=--changed-only
quicklint: mypy flake8 mypy quick_checks cmakelint shellcheck
quicklint: mypy flake8 quick_checks cmakelint shellcheck
5 changes: 0 additions & 5 deletions mypy.ini
Original file line number Diff line number Diff line change
Expand Up @@ -37,11 +37,6 @@ files =
test/test_type_hints.py,
test/test_type_info.py,
test/test_utils.py,
tools/linter/clang_format_utils.py,
tools/linter/clang_tidy.py,
tools/generate_torch_version.py,
tools/render_junit.py,
tools/stats

#
# `exclude` is a regex, not a list of paths like `files` (sigh)
Expand Down
2 changes: 1 addition & 1 deletion tools/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,7 @@ Build system pieces:

Developer tools which you might find useful:

* [linter/clang_tidy.py](linter/clang_tidy.py) - Script for running clang-tidy
* [linter/clang_tidy](linter/clang_tidy/__main__.py) - Script for running clang-tidy
on lines of your script which you changed.
* [extract_scripts.py](extract_scripts.py) - Extract scripts from
`.github/workflows/*.yml` into a specified dir, on which linters such as
Expand Down
5 changes: 3 additions & 2 deletions tools/git-pre-commit
Original file line number Diff line number Diff line change
Expand Up @@ -7,9 +7,10 @@ python tools/linter/flake8_hook.py
if [ $(which clang-tidy) ]
then
echo "Running pre-commit clang-tidy"
python tools/linter/clang_tidy.py \
git diff HEAD > pr.diff
python tools/linter/clang_tidy \
--paths torch/csrc \
--diff HEAD \
--diff-file "pr.diff" \
-g"-torch/csrc/jit/passes/onnx/helper.cpp" \
-g"-torch/csrc/jit/passes/onnx/shape_type_inference.cpp" \
-g"-torch/csrc/jit/serialization/onnx.cpp" \
Expand Down
142 changes: 2 additions & 140 deletions tools/linter/clang_format_utils.py
Original file line number Diff line number Diff line change
@@ -1,16 +1,5 @@
import platform
import sys
import stat
import hashlib
import os
import urllib.request
import urllib.error

# String representing the host platform (e.g. Linux, Darwin).
HOST_PLATFORM = platform.system()

# PyTorch directory root, derived from the location of this file.
PYTORCH_ROOT = os.path.dirname(os.path.dirname(os.path.dirname(os.path.realpath(__file__))))
from install.download_bin import download, PYTORCH_ROOT # type: ignore[import]

# This dictionary maps each platform to the S3 object URL for its clang-format binary.
PLATFORM_TO_CF_URL = {
Expand All @@ -24,135 +13,8 @@
"Linux": os.path.join("tools", "clang_format_hash", "linux64", "clang-format-linux64"),
}

# Directory and file paths for the clang-format binary.
CLANG_FORMAT_DIR = os.path.join(PYTORCH_ROOT, ".clang-format-bin")
CLANG_FORMAT_PATH = os.path.join(CLANG_FORMAT_DIR, "clang-format")

def compute_file_sha256(path: str) -> str:
"""Compute the SHA256 hash of a file and return it as a hex string."""
# If the file doesn't exist, return an empty string.
if not os.path.exists(path):
return ""

hash = hashlib.sha256()

# Open the file in binary mode and hash it.
with open(path, "rb") as f:
for b in f:
hash.update(b)

# Return the hash as a hexadecimal string.
return hash.hexdigest()


def report_download_progress(chunk_number: int, chunk_size: int, file_size: int) -> None:
"""
Pretty printer for file download progress.
"""
if file_size != -1:
percent = min(1, (chunk_number * chunk_size) / file_size)
bar = "#" * int(64 * percent)
sys.stdout.write("\r0% |{:<64}| {}%".format(bar, int(percent * 100)))


def download_clang_format(path: str) -> bool:
"""
Downloads a clang-format binary appropriate for the host platform and stores it at the given location.
"""
if HOST_PLATFORM not in PLATFORM_TO_CF_URL:
print("Unsupported platform: {}".format(HOST_PLATFORM))
return False

cf_url = PLATFORM_TO_CF_URL[HOST_PLATFORM]
filename = os.path.join(path, "clang-format")

# Try to download clang-format.
print("Downloading clang-format to {}".format(path))
try:
urllib.request.urlretrieve(
cf_url, filename, reporthook=report_download_progress if sys.stdout.isatty() else None
)
except urllib.error.URLError as e:
print("Error downloading {}: {}".format(filename, str(e)))
return False
finally:
print()

return True


def get_and_check_clang_format(verbose: bool = False) -> bool:
"""
Download a platform-appropriate clang-format binary if one doesn't already exist at the expected location and verify
that it is the right binary by checking its SHA256 hash against the expected hash.
"""
if not os.path.exists(CLANG_FORMAT_DIR):
# If the directory doesn't exist, try to create it.
try:
os.mkdir(CLANG_FORMAT_DIR)
except OSError as e:
print("Unable to create directory for clang-format binary: {}".format(CLANG_FORMAT_DIR))
return False
finally:
if verbose:
print("Created directory {} for clang-format binary".format(CLANG_FORMAT_DIR))

# If the directory didn't exist, neither did the binary, so download it.
ok = download_clang_format(CLANG_FORMAT_DIR)

if not ok:
return False
else:
# If the directory exists but the binary doesn't, download it.
if not os.path.exists(CLANG_FORMAT_PATH):
ok = download_clang_format(CLANG_FORMAT_DIR)

if not ok:
return False
else:
if verbose:
print("Found pre-existing clang-format binary, skipping download")

# Now that the binary is where it should be, hash it.
actual_bin_hash = compute_file_sha256(CLANG_FORMAT_PATH)

# If the host platform is not in PLATFORM_TO_HASH, it is unsupported.
if HOST_PLATFORM not in PLATFORM_TO_HASH:
print("Unsupported platform: {}".format(HOST_PLATFORM))
return False

# This is the path to the file containing the reference hash.
hashpath = os.path.join(PYTORCH_ROOT, PLATFORM_TO_HASH[HOST_PLATFORM])

if not os.path.exists(hashpath):
print("Unable to find reference binary hash")
return False

# Load the reference hash and compare the actual hash to it.
with open(hashpath, "r") as f:
reference_bin_hash = f.readline().strip()

if verbose:
print("Reference Hash: {}".format(repr(reference_bin_hash)))
print("Actual Hash: {}".format(repr(actual_bin_hash)))

if reference_bin_hash != actual_bin_hash:
print("The downloaded binary is not what was expected!")

# Err on the side of caution and try to delete the downloaded binary.
try:
os.unlink(CLANG_FORMAT_PATH)
print("The binary has been deleted just to be safe")
except OSError as e:
print("Failed to delete binary: {}".format(str(e)))
print("Delete this binary as soon as possible and do not execute it!")

return False
else:
# Make sure the binary is executable.
mode = os.stat(CLANG_FORMAT_PATH).st_mode
mode |= stat.S_IXUSR
os.chmod(CLANG_FORMAT_PATH, mode)
print("Using clang-format located at {}".format(CLANG_FORMAT_PATH))

return True
return bool(download("clang-format", CLANG_FORMAT_DIR, PLATFORM_TO_CF_URL, PLATFORM_TO_HASH))
Loading

0 comments on commit a1ad28d

Please sign in to comment.