forked from pantsbuild/pants
Extract Python dependencies in an intrinsic (pantsbuild#18854)
This PR does two things:

- Introduces a new crate to the engine: `dep_inference`. A module inside is dedicated to Python and leverages `tree-sitter` and `tree-sitter-python` to parse dependencies. `tree-sitter` was chosen because it supports Python 2 and 3, supports other languages, and is resilient to syntax errors.
- Leverages the new crate in an intrinsic. The new behavior is forced opt-in/out and will eventually be the "only" way to do the inference.

# Timings

Helper script:

```shell
#!/bin/bash

# Replace some random numbers
find src/python/pants -type f -name "*.py" -not -name "__init__.py" | xargs sed -i s/'Copyright [0123456789][0123456789][0123456789][0123456789]'/"Copyright $RANDOM"/

# Wait for the kernel really quick
sleep 1

# Wait for the inotify notifications to stop
while true; do
    mtime=$(stat -c %Y .pants.d/pants.log)
    now=$(date +%s)
    diff=$((now - mtime))
    if (( diff >= 5 )); then
        break
    fi
    sleep $((5 - diff))
done
```

Timings follow. `./dirty_files.sh` sets up the worst-case scenario (it touches every copyright header). I'm on a 64-core machine, so I run as if we only had 8 cores.

Findings:

- In the worst case (the extraction process is not in the process cache), the Rust parser blows the old path out of the water: about 2.9 s versus 36.3 s.
- In the best case (the process cache is hot), the two are comparable: about 19.3 s versus 20.6 s. Put another way, the time it takes to execute the rule code and look up the process in the process cache is roughly the time it takes just to parse the file again.

## Worst case (completely cold cache)

```
$ hyperfine --prepare ./dirty_files.sh --runs 4 --warmup 1 'pants --rule-threads-core=4 --process-execution-local-parallelism=8 --no-python-infer-use-rust-parser --filter-target-type=python_source dependencies ::' 'pants --rule-threads-core=4 --process-execution-local-parallelism=8 --python-infer-use-rust-parser --filter-target-type=python_source dependencies ::'
Benchmark 1: pants --rule-threads-core=4 --process-execution-local-parallelism=8 --no-python-infer-use-rust-parser --filter-target-type=python_source dependencies ::
  Time (mean ± σ):     36.335 s ±  1.286 s    [User: 0.754 s, System: 0.151 s]
  Range (min … max):   34.698 s … 37.645 s    4 runs

Benchmark 2: pants --rule-threads-core=4 --process-execution-local-parallelism=8 --python-infer-use-rust-parser --filter-target-type=python_source dependencies ::
  Time (mean ± σ):      2.899 s ±  0.096 s    [User: 0.758 s, System: 0.131 s]
  Range (min … max):    2.764 s …  2.990 s    4 runs

Summary
  'pants --rule-threads-core=4 --process-execution-local-parallelism=8 --python-infer-use-rust-parser --filter-target-type=python_source dependencies ::' ran
   12.54 ± 0.61 times faster than 'pants --rule-threads-core=4 --process-execution-local-parallelism=8 --no-python-infer-use-rust-parser --filter-target-type=python_source dependencies ::'
```

## Best case (hot cache, but no daemon)

```
$ hyperfine --runs 4 --warmup 1 'pants --no-pantsd --rule-threads-core=4 --process-execution-local-parallelism=8 --no-python-infer-use-rust-parser --filter-target-type=python_source dependencies ::' 'pants --no-pantsd --rule-threads-core=4 --process-execution-local-parallelism=8 --python-infer-use-rust-parser --filter-target-type=python_source dependencies ::'
Benchmark 1: pants --no-pantsd --rule-threads-core=4 --process-execution-local-parallelism=8 --no-python-infer-use-rust-parser --filter-target-type=python_source dependencies ::
  Time (mean ± σ):     20.589 s ±  0.319 s    [User: 20.303 s, System: 2.002 s]
  Range (min … max):   20.167 s … 20.934 s    4 runs

Benchmark 2: pants --no-pantsd --rule-threads-core=4 --process-execution-local-parallelism=8 --python-infer-use-rust-parser --filter-target-type=python_source dependencies ::
  Time (mean ± σ):     19.273 s ±  0.347 s    [User: 18.881 s, System: 1.669 s]
  Range (min … max):   18.940 s … 19.759 s    4 runs

Summary
  'pants --no-pantsd --rule-threads-core=4 --process-execution-local-parallelism=8 --python-infer-use-rust-parser --filter-target-type=python_source dependencies ::' ran
    1.07 ± 0.03 times faster than 'pants --no-pantsd --rule-threads-core=4 --process-execution-local-parallelism=8 --no-python-infer-use-rust-parser --filter-target-type=python_source dependencies ::'
```
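
As a sanity check on the two hyperfine summaries, the speedup ratios can be recomputed from the reported means and standard deviations. This is a minimal sketch that assumes the two benchmarks in each pair are independent, so relative errors add in quadrature; the helper name `speedup` is made up for illustration and is not part of hyperfine or Pants. Because the inputs are the rounded figures printed above, the cold-cache ratio comes out slightly below the 12.54 that hyperfine reports from unrounded data.

```python
import math


def speedup(slow_mean, slow_sd, fast_mean, fast_sd):
    """Return (ratio, uncertainty) for slow_mean / fast_mean.

    Assumption: the two measurements are independent, so the relative
    errors combine in quadrature (first-order error propagation).
    """
    ratio = slow_mean / fast_mean
    relative_error = math.sqrt(
        (slow_sd / slow_mean) ** 2 + (fast_sd / fast_mean) ** 2
    )
    return ratio, ratio * relative_error


# Means and standard deviations (seconds) copied from the benchmark output.
cold = speedup(36.335, 1.286, 2.899, 0.096)
hot = speedup(20.589, 0.319, 19.273, 0.347)

print(f"cold cache: {cold[0]:.2f} ± {cold[1]:.2f} times faster")
print(f"hot cache:  {hot[0]:.2f} ± {hot[1]:.2f} times faster")
```

The cold-cache gap is far larger than its uncertainty, while the hot-cache ratio sits only a few standard deviations above 1, which matches the "comparable" reading in the findings.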
1 parent 268c9f8, commit 3a20af9

Showing 18 changed files with 1,301 additions and 60 deletions.
    @@ -0,0 +1,18 @@
    # Copyright 2023 Pants project contributors (see CONTRIBUTORS.md).
    # Licensed under the Apache License, Version 2.0 (see LICENSE).

    from __future__ import annotations

    from dataclasses import dataclass

    from pants.util.frozendict import FrozenDict


    @dataclass(frozen=True)
    class NativeParsedPythonDependencies:
        imports: FrozenDict[str, tuple[int, bool]]
        string_candidates: FrozenDict[str, int]

        def __init__(self, imports: dict[str, tuple[int, bool]], string_candidates: dict[str, int]):
            object.__setattr__(self, "imports", FrozenDict(imports))
            object.__setattr__(self, "string_candidates", FrozenDict(string_candidates))
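
To illustrate the shape of the data this class carries, here is a rough, self-contained sketch. It is not the implementation from this commit: the real extraction happens in Rust with `tree-sitter`, whereas this example swaps in the standard-library `ast` module, which raises `SyntaxError` on broken input. `ParsedDeps`, `parse_deps`, and the dotted-string heuristic are hypothetical names and choices made for illustration, `MappingProxyType` stands in for `FrozenDict`, and the second tuple element is always `False` because the diff does not say what that flag means.

```python
from __future__ import annotations

import ast
from dataclasses import dataclass
from types import MappingProxyType
from typing import Mapping


@dataclass(frozen=True)
class ParsedDeps:
    """Hypothetical stand-in for NativeParsedPythonDependencies."""

    imports: Mapping[str, tuple[int, bool]]
    string_candidates: Mapping[str, int]

    def __init__(self, imports, string_candidates):
        # A frozen dataclass rejects normal attribute assignment, so the
        # custom __init__ goes through object.__setattr__, as in the diff.
        object.__setattr__(self, "imports", MappingProxyType(dict(imports)))
        object.__setattr__(
            self, "string_candidates", MappingProxyType(dict(string_candidates))
        )


def parse_deps(source: str) -> ParsedDeps:
    """Collect imports and dotted string literals with their line numbers."""
    imports: dict[str, tuple[int, bool]] = {}
    strings: dict[str, int] = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                imports.setdefault(alias.name, (node.lineno, False))
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            for alias in node.names:
                name = f"{node.module}.{alias.name}"
                imports.setdefault(name, (node.lineno, False))
        elif isinstance(node, ast.Constant) and isinstance(node.value, str):
            # Assumed heuristic: a string that looks like a dotted module path.
            parts = node.value.split(".")
            if len(parts) > 1 and all(part.isidentifier() for part in parts):
                strings.setdefault(node.value, node.lineno)
    return ParsedDeps(imports, strings)


sample = '''
import os
from pants.util.frozendict import FrozenDict

PLUGIN = "my_pkg.plugins.loader"
'''

deps = parse_deps(sample)
print(dict(deps.imports))
print(dict(deps.string_candidates))
```

Defining `__init__` by hand keeps the caller-facing signature in terms of plain `dict` values while the stored fields are read-only mappings; `@dataclass` leaves an `__init__` that the class already defines untouched.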