Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MB-16176 - Create coverage ADR #350

Merged
11 commits merged into from
Jun 23, 2023
Merged

MB-16176 - Create coverage ADR #350

11 commits merged into from
Jun 23, 2023

Conversation

ghost
Copy link

@ghost ghost commented May 31, 2023

This PR should create an ADR that suggests improvements to the existing code test coverage checks with the intention to avoid PR roadblocks and make the checks more transparent.

@ghost ghost added the Pamplemoose label May 31, 2023
Continue blocking PRs that decrease test coverage.

* `+` *No implementation effort*
* `+` *Provides a way to waintain high test coverage with stricter enforcement*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: waintain => maintain

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this -- we've got to waintain our spelling standards too.

@ghost ghost requested review from a team and rogeruiz May 31, 2023 22:23
Copy link
Contributor

@kctruss kctruss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sold on the Happo approach here. My concern is that rather than alleviate a problem, this approach shifts the responsibility from the engineer doing the work onto the PR review process, which is already a bottleneck.

Personally, I'd rather see us spend time documenting and coaching on when to stop fighting the test results and communicate a cache reset - it should be less "break glass in case of emergency" and more "Hey, here's the situation".

Copy link
Contributor

@rogeruiz rogeruiz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving some comments and suggestions here. I think we should have another option that I mention below. This is in-line with @kctruss feedback about treating resetting the cache as a thing that could happen instead of something that should only rarely happen.

Comment on lines 17 to 20
### Scenarios that result in a test coverage decrease
1. **Removal of code that is _more_ tested than the total coverage.** If the total coverage is 77.0% and a block of code that is tested at 100% is deleted, the average coverage goes down.
2. **Addition of code that is _less_ tested than the total coverage and _should be tested more_.** This is the scenario that we want to address with our code coverage strategy.
3. **Addition of code that is _less_ tested than the total coverage and _doesn't need to be tested more_.** The difference between this scenario and scenario 2 can be subjective, but one can imagine a scenario in which a Go database model is created and doesn't require any tests.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these the only scenarios we'd like to discuss here? I think these are enough, but I'd like to hear if folks can think of another or perhaps expand on these scenarios a bit more with more examples.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thought: I think another example for scenario 3 would be adding logic to existing model_to_payload functions since those aren't unit tested and are only sometimes integration tested via the handler integration tests.


Our approach to Happo changes is currently the following: the Happo PR check is not a required check, but it reports failures and includes an easily-accessible link on the PR to Happo diffs which provide context around visual changes. Something similar could be implemented for code coverage checks with the following steps:
1. Make the code coverage checks optional so that failures do not block a PR
2. Generate a link automatically on the PR (perhaps through a comment) that provides more context behind the failure (including, but not limited to, the go-coverage.html artifact generated in circle-ci)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This link should also point to documentation around the coverage tests and how to increase it. I think it's also possible to include this as a third option that may be the best of keeping the tests and having a GH Action that alerts authors to read the specific documentation about how to interpret the code coverage results for the client and the server and also reset the cache in a way that can be tracked / discussed when it comes up.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thought: +1 to the link to docs and that this could be its own third option, even if we are already implementing the happo option.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense to me -- I'll gauge folk's feelings at the FE/BE check-in and update the ADR accordingly.

@rogeruiz rogeruiz self-assigned this Jun 1, 2023
Copy link
Contributor

@YanZ777 YanZ777 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think another few options would be to

  • maintain current test coverage and create documentation around resetting the cache
  • lower the test coverage number
  • review current test coverage and create tickets that increases the test coverage

I also feel a little uneasy with going down the Happo Approach as I worry about that becoming something that folks ignore.

2. **Addition of code that is _less_ tested than the total coverage and _should be tested more_.** This is the scenario that we want to address with our code coverage strategy.
3. **Addition of code that is _less_ tested than the total coverage and _doesn't need to be tested more_.** The difference between this scenario and scenario 2 can be subjective, but one can imagine a scenario in which a Go database model is created and doesn't require any tests.

Our current strategy of never allowing the total coverage to decrease attempts to address scenario 2, but it also unfairly flags scenarios 1 and 3. We also have the option to manually reset the test coverage floor, but determining whether that is appropriate (i.e. discerning which scenario is applicable to a given PR) can be difficult and time-consuming. Furthermore, any combination of the three scenarios can be applicable to a given PR -- it doesn't have to be just one -- which can make it extra difficult to decide whether a coverage decrease is legitimate.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have any data for how often scenario 1 and 3 happen? In my memory, I can recall it happening twice a year which to me doesn't suggest we really have a problem. We have hundreds of PRs a year that the current process ensures we don't have a coverage decrease. Making a change for such an infrequent event doesn't really make a lot of sense to me, personally.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll also say that I've seen the test coverage problem come up many times, and most of the time the problem is that folks haven't rebased their PR or they need to increase test coverage. IMHO, we should just get rid of the coverage check instead of making it optional, as I believe folks will just ignore it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ahobson Thanks Drew. Is this supposed to read "should just get rid of" or "shouldn't just get rid of"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rogeruiz IMHO, if we think the test coverage problem is major enough to make the checks optional, we should just remove the checks entirely.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been thinking about this, and I think I'm wrong. I don't have any data to support my position that making the check optional will actually decrease test coverage.

This proposed change is one that we could easily undo if we needed to.

How about this: We note the coverage before putting this change in, and then one month later, at BE/FE check in, we report on the coverage numbers and how they compare to before the change. We can this discuss in the check in if the changes are concerning or not and if we want to keep the checks optional. It can become something we bring up monthly at the check in to ensure we aren't drifting the wrong way.

Thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the PR to record the coverage on merge to main to the govdev environment and we now have a dashboard.

image

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ahobson I think this plan makes a lot of sense -- I can mention in the chosen option that we will review it after a month. I should also note that a fairer comparison would be with the coverage if we didn't make the check optional (which we can't know), which would also potentially decrease every time we manually reset it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thought: Good point @michaelvaldes-truss. I wonder if to make that comparison match better we could compare the new change in test coverage against historical data of how coverage has changed previously with the required check in place since we have reset the cache a few times.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also want to address the very valid concerns expressed in comments on this PR -- that making the check optional would make it too easy for coverage to decrease. My assessment is that the current strategy already allows for decreases when folks manually reset the limit, and this can happen for very valid reasons (scenarios 2 and 3 described in the ADR). Analyzing coverage decreases is difficult to do automatically, so the intention of the Happo Approach was to put it entirely in the hands of the PR reviewers and also make it easier to access information relevant to that analysis.

I added these points to the ADR and also introduced a third option, Lenience for Tiny Coverage Decreases, which is a middle ground that attempts to partially address these concerns.

@YanZ777 @kctruss

Copy link
Contributor

@ruizajtruss ruizajtruss Jun 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do want to note that while I support the third option presented - there's a small chance there could be a .2% difference if you pull main to fix a .1% diff in coverage. That's what happened to me this last go round with coverage issues. Little more detail in this thread for anyone curious.

@YanZ777
Copy link
Contributor

YanZ777 commented Jun 10, 2023

I've thought about another option to consider, and that's to see if we can just have PRs run test coverage only on new files.

I think regardless of whichever option gets picked, we should analyze which packages have the lowest coverage below a certain threshold and write up tickets so that tests can be added to increase the coverage.

## Decision Outcome

<!-- * Chosen Option: -->
Chosen Option: *The Happo Approach*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: Will this ADR be reassessed in the future? Is the plan to have the test coverage be optional forever?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like maybe we'll re-assess after a month to see if things are going well: https://github.com/transcom/mymove-docs/pull/350/files#r1221538465

Though if things aren't going how we'd like them to, we'd likely then have to decide how we want to correct things.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks -- I added a note in the ADR to reflect this!

michaelvaldes-truss and others added 2 commits June 12, 2023 12:14
@ghost ghost marked this pull request as ready for review June 12, 2023 22:32
@ghost ghost self-requested a review June 12, 2023 22:32
Copy link
Contributor

@ruizajtruss ruizajtruss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I approve in support of the third option presented

Copy link
Contributor

@mandaem mandaem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I liked the Happo Approach best

Copy link
Contributor

@YanZ777 YanZ777 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the middle ground is reasonable for the time being.
I would like further discussion on what Kyle mentioned in the BE / FE checkin, which is assessment on if this test coverage check is valuable or not in the first place.
If we do decide that it is valuable to have, I would also like us also consider adoption of a long term strategy of creating a Jira epic ticket to increase coverage for code that doesn't have good test coverage.

@ahobson
Copy link
Contributor

ahobson commented Jun 16, 2023

As a datapoint, in the week that we have had test coverage checks disabled

check 2024-06-09 2024-06-16
client 73.7 74.0
server 77.7 77.6

Copy link
Contributor

@mmayo-truss mmayo-truss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate your work on this ADR and the conversation that it has encouraged. I approve this in favor of the "Happo Approach." I think that makes the most sense based on the discussion I have heard.

@ghost ghost merged commit c8eead1 into main Jun 23, 2023
@ghost ghost deleted the MB-16176-Create_coverage_adr branch June 23, 2023 20:47
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

10 participants