
Optimize analyze_outcomes.py #8560

Conversation


@lpy4105 lpy4105 commented Nov 23, 2023

Description

Fix: #8423 (Optimize ref-vs-driver in analyze_outcomes.py)

Improve the performance of analyze_outcomes.py.

Tested locally with outcomes.csv: 1m37s vs 0m11s.

PR checklist

  • changelog not required, test script change.
  • backport not required, see comment.
  • tests not required

@lpy4105 lpy4105 added enhancement, needs-review, needs-reviewer, size-s, component-test, priority-high labels Nov 23, 2023

@mpg mpg left a comment


Thank you for the impressive performance improvement! I also really like the readability improvements. I only have a couple of minor points and one suggestion for further improvement.


mpg commented Nov 27, 2023

backport done, or not required

I'm on the fence here. On one hand, 2.28 does not have driver_vs_reference analysis and only has one task, so all the sharing between tasks and the restructuring to simplify driver_vs_reference is irrelevant there. On the other hand, I think we are going to evolve analyze_coverage in 2.28 (see #2691), so perhaps for that work it would be useful if 2.28 and 3.x+ used the same data structure?

@gilles-peskine-arm do you have an opinion here (as you're involved in #2691)?


@gilles-peskine-arm gilles-peskine-arm left a comment


I like the performance improvement, but I find the logic hard to follow. This is partly preexisting, but it's getting worse.

On this code in tests/scripts/analyze_outcomes.py:

    setup = ';'.join([platform, config])
    if key not in outcomes:
        outcomes[key] = TestCaseOutcomes()
    (_platform, config, suite, case, result, _cause) = line.split(';')
@gilles-peskine-arm gilles-peskine-arm commented:

config is not unique: when we run the same all.sh component on Linux and FreeBSD, the entries only differ in the platform column.

I'm not sure if this requires some change to the behavior of the code, or a comment to explain the limitations that this implies.

If a test case passes on Linux and fails on FreeBSD, it'll end up in both the successes set and the failures set. How does this affect the rest of the script?

@lpy4105 lpy4105 (author) replied:

config is not unique

Yes, but I think currently we don't have a task that cares about platform. That's why I don't use the platform field at present.

How does this affect the rest of the script?

analyze_coverage only cares about the presence of test cases; it doesn't care whether a test case is in successes or failures or both.
analyze_driver_vs_reference could be affected if the components were run on multiple platforms, but that is not the case in the current CI?

@mpg mpg replied:

I agree that currently we don't have any tasks that care about platform, or for which it would be a problem to have a case that's both in successes and failures.

However, before reading Gilles's comment, I had not realised that config is not unique and that this could result in a test case being in both the successes and failures sets, so I agree that this deserves at least a comment.

@mpg mpg commented:

Btw, not directly related, but this made me think: what both functions care about is not really the distinction between "success" and "failure", but the distinction between "executed" and "skipped". For analyze_coverage that's pretty clear; for analyze_driver_vs_reference, even though the docstring currently says "passing", what this is really about is that we don't want tests being skipped in the driver component unless we know it's expected and justified.

Generally speaking, failures will be found and reported at earlier stages in the CI, so I think this script is really about detecting unexpected skips, not about failures.

@gilles-peskine-arm gilles-peskine-arm replied:

what both functions care about is not really the distinction between "success" and "failure", but the distinction between "executed" and "skipped"

I agree. But I consider this optional here.

When I wrote the original script, I recorded the pass/fail information because it was very easy and I didn't know how the script would evolve beyond the initial feature of test case coverage. After a few years, it turns out we don't need the pass/fail information, but it doesn't really hurt to keep it (it isn't really increasing the complexity).

@mpg mpg commented Nov 29, 2023

I agree it's probably out of scope here - and this PR already went quite beyond its original scope, so we don't want to keep growing it. But I think this would make things simpler: if we only care about the distinction between executed and skipped, then we only need to record one of those sets (because a test is skipped iff it isn't executed), and ComponentOutcomes can go from a namedtuple of two sets to just a set - probably with a new name then.

Let's not do it here as there's more value in getting this PR merged quickly. But if you agree with the general approach, I'll create a follow-up task about it.

@gilles-peskine-arm gilles-peskine-arm added needs-work and removed needs-review, needs-reviewer labels Nov 27, 2023
@gilles-peskine-arm gilles-peskine-arm commented:

Regarding backporting: I don't currently expect that this script will change much in 2.28, other than (hopefully, if we finally manage to finish #2691) switching non-coverage from a warning to an error, and expanding the non-coverage allowlist. So it shouldn't hurt if the data structures to keep track of outcomes are different in 2.28 and 3.x+.

We don't care about the number of hits of the test cases,
so break out of the iteration as soon as a case is hit.

Signed-off-by: Pengyu Lv <[email protected]>
@lpy4105 lpy4105 force-pushed the issue/8423/optimize-analyze_outcomes_py branch from c7c35b4 to 18908ec Compare November 28, 2023 05:04
@lpy4105 lpy4105 added needs-review and removed needs-work labels Nov 28, 2023

@gilles-peskine-arm gilles-peskine-arm left a comment


LGTM apart from a couple of minor points

Also fix a typo in the comments.

Signed-off-by: Pengyu Lv <[email protected]>

@mpg mpg left a comment


Looks pretty good to me. Thanks for improving the script so much - not just the performance, but also making the code clearer and cleaner!

Just one minor thing about the boolean, and one suggestion for slightly more compact code in one place.

Signed-off-by: Pengyu Lv <[email protected]>

@mpg mpg left a comment


LGTM, thanks!


@gilles-peskine-arm gilles-peskine-arm left a comment


LGTM

@gilles-peskine-arm gilles-peskine-arm added approved and removed needs-review labels Nov 29, 2023
@gilles-peskine-arm gilles-peskine-arm added this pull request to the merge queue Nov 29, 2023
Merged via the queue into Mbed-TLS:development with commit 18eab98 Nov 29, 2023