Optimize analyze_outcomes.py #8560
Conversation
This greatly improves the performance. Signed-off-by: Pengyu Lv <[email protected]>
Signed-off-by: Pengyu Lv <[email protected]>
Signed-off-by: Pengyu Lv <[email protected]>
Thank you for the impressive performance improvement! I also really like the readability improvements. I only have a couple of minor points and one suggestion for further improvement.
I'm on the fence here. On one hand 2.28 does not have @gilles-peskine-arm, do you have an opinion here (as you're involved in #2691)?
I like the performance improvement, but I find the logic hard to follow. This is partly preexisting, but it's getting worse.
tests/scripts/analyze_outcomes.py (outdated)

    setup = ';'.join([platform, config])
    if key not in outcomes:
        outcomes[key] = TestCaseOutcomes()
    (_platform, config, suite, case, result, _cause) = line.split(';')
config is not unique: when we run the same all.sh component on Linux and FreeBSD, the entries only differ in the platform column.
I'm not sure if this requires some change to the behavior of the code, or a comment to explain the limitations that this implies.
If a test case passes on Linux and fails on FreeBSD, it'll end up in both the successes set and the failures set. How does this affect the rest of the script?
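The collision described above can be sketched with a minimal, hypothetical model of the parsing loop (the names and the six-column line format follow this discussion; this is not the script's actual code):

```python
import collections

# Hypothetical model of the outcome-file parsing under discussion.
# Each outcomes.csv line looks like "platform;config;suite;case;result;cause".
ComponentOutcomes = collections.namedtuple('ComponentOutcomes',
                                           ['successes', 'failures'])

def parse_outcomes(lines):
    outcomes = {}
    for line in lines:
        (_platform, config, suite, case, result, _cause) = line.split(';')
        # Keying on config alone drops the platform, so Linux and FreeBSD
        # runs of the same component collide on the same entry.
        if config not in outcomes:
            outcomes[config] = ComponentOutcomes(set(), set())
        name = ';'.join([suite, case])
        if result == 'PASS':
            outcomes[config].successes.add(name)
        else:
            outcomes[config].failures.add(name)
    return outcomes

# A case that passes on Linux but fails on FreeBSD lands in both sets.
rows = [
    'Linux;full;suite_x;case_1;PASS;',
    'FreeBSD;full;suite_x;case_1;FAIL;',
]
outcomes = parse_outcomes(rows)
```

With these two rows, suite_x;case_1 ends up in both outcomes['full'].successes and outcomes['full'].failures, which is the ambiguity being pointed out.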
> config is not unique

Yes, but currently I think we don't have a task that cares about platform. That's why I don't use the platform field at present.

> How does this affect the rest of the script?

analyze_coverage only cares about the presence of test cases; it doesn't care whether a test case is in successes or failures or both.
analyze_driver_vs_reference could be affected if the components were run on multiple platforms, but this is not the case in the current CI.
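The point about analyze_coverage can be illustrated with a hypothetical helper (not the script's actual code): coverage only asks whether a case appears in either set, so a case being in both sets makes no difference.

```python
# Hypothetical sketch: coverage cares about presence, not pass/fail.
def is_covered(name, successes, failures):
    # A case counts as covered if it was executed anywhere,
    # regardless of which set (or both) it ended up in.
    return name in successes or name in failures

successes = {'suite_x;case_1'}
failures = {'suite_x;case_1', 'suite_y;case_2'}  # case_1 is in both sets
```

Here the pass-on-Linux, fail-on-FreeBSD case is covered exactly once, no matter which set it is looked up in.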
I agree that currently we don't have any tasks that care about platform, or for which it would be a problem to have a case that's both in successes and failures.
However, before reading Gilles's comment, I had not realised that config is not unique and that this could result in a test case being in both the successes and failures sets, so I agree that this deserves at least a comment.
Btw, not directly related, but this made me think: what both functions care about is not really the distinction between "success" and "failure", but the distinction between "executed" and "skipped". For analyze_coverage that's pretty clear, and for analyze_driver_vs_reference, even though the docstring currently says "passing", what this is really about is that we don't want tests being skipped in the driver component unless we know it's expected and justified.
Generally speaking, failures will be found and reported at earlier stages in the CI, so I think this script is really about detecting unexpected skips, not about failures.
> what both functions care about is not really the distinction between "success" and "failure", but the distinction between "executed" and "skipped"
I agree. But I consider this optional here.
When I wrote the original script, I recorded the pass/fail information because it was very easy and I didn't know how the script would evolve beyond the initial feature of test case coverage. After a few years, it turns out we don't need the pass/fail information, but it doesn't really harm to keep it (it isn't really increasing the complexity).
I agree it's probably out of scope here - and this PR already went quite beyond its original scope, so we don't want to keep growing it. But I think this would make things simpler: if we only care about the distinction between executed and skipped, then we only need to record one of those sets (because a test is skipped iff it isn't executed), and ComponentOutcomes can go from a namedtuple of two sets to just a set - probably with a new name then.
Let's not do it here as there's more value in getting this PR merged quickly. But if you agree with the general approach, I'll create a follow-up task about it.
Regarding backporting: I don't currently expect that this script will change much in 2.28, other than (hopefully, if we finally manage to finish #2691) switching non-coverage from a warning to an error, and expanding the non-coverage allowlist. So it shouldn't hurt if the data structures used to keep track of outcomes are different in 2.28 and 3.x+.
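The proposed follow-up could look roughly like this hypothetical sketch (it assumes "SKIP" marks skipped cases in the result column; the real refactoring would choose its own names):

```python
# Hypothetical sketch of the follow-up: record a single set of executed
# cases instead of a namedtuple of (successes, failures).
def record_executed(lines):
    executed = set()
    for line in lines:
        (_platform, _config, suite, case, result, _cause) = line.split(';')
        if result != 'SKIP':  # assumption: skipped cases report "SKIP"
            executed.add(';'.join([suite, case]))
    return executed

rows = [
    'Linux;full;suite_x;case_1;PASS;',
    'Linux;full;suite_x;case_2;SKIP;',
]
executed = record_executed(rows)
# A case is skipped iff it is not in `executed`.
```

One set instead of two removes the ambiguity of a case appearing in both, since "skipped" is simply defined as absence from the executed set.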
Signed-off-by: Pengyu Lv <[email protected]>
We don't care about the number of hits of the test cases, so break the iteration when the case hits. Signed-off-by: Pengyu Lv <[email protected]>
Signed-off-by: Pengyu Lv <[email protected]>
Signed-off-by: Pengyu Lv <[email protected]>
Signed-off-by: Pengyu Lv <[email protected]>
c7c35b4 to 18908ec
Signed-off-by: Pengyu Lv <[email protected]>
Signed-off-by: Pengyu Lv <[email protected]>
Signed-off-by: Pengyu Lv <[email protected]>
LGTM apart from a couple of minor points
Also fix a typo in the comments. Signed-off-by: Pengyu Lv <[email protected]>
Looks pretty good to me. Thanks for improving the script that much - not just performance but also making the code clearer and cleaner!
Just one minor thing about the boolean, and one suggestion for slightly more compact code in one place.
Signed-off-by: Pengyu Lv <[email protected]>
LGTM, thanks!
LGTM
Description
Fix: #8423
Improve the performance of analyze_outcomes.py. Tested locally with outcomes.csv: 1m37s vs 0m11s.
PR checklist