MB-16176 - Create coverage ADR #350
Conversation
Continue blocking PRs that decrease test coverage.

* `+` *No implementation effort*
* `+` *Provides a way to waintain high test coverage with stricter enforcement*
typo: waintain
=> maintain
Thanks for catching this -- we've got to waintain our spelling standards too.
I'm not sold on the Happo approach here. My concern is that rather than alleviate a problem, this approach shifts the responsibility from the engineer doing the work onto the PR review process, which is already a bottleneck.
Personally, I'd rather see us spend time documenting and coaching on when to stop fighting the test results and communicate a cache reset - it should be less "break glass in case of emergency" and more "Hey, here's the situation".
Leaving some comments and suggestions here. I think we should have another option that I mention below. This is in line with @kctruss's feedback about treating a cache reset as something that could happen instead of something that should only rarely happen.
### Scenarios that result in a test coverage decrease

1. **Removal of code that is _more_ tested than the total coverage.** If the total coverage is 77.0% and a block of code that is tested at 100% is deleted, the average coverage goes down.
2. **Addition of code that is _less_ tested than the total coverage and _should be tested more_.** This is the scenario that we want to address with our code coverage strategy.
3. **Addition of code that is _less_ tested than the total coverage and _doesn't need to be tested more_.** The difference between this scenario and scenario 2 can be subjective, but one can imagine a scenario in which a Go database model is created and doesn't require any tests.
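To make the arithmetic in scenario 1 concrete, here is a minimal sketch; the statement counts are invented for illustration, and only the 77.0% figure comes from the ADR text:

```go
package main

import "fmt"

func main() {
	// Hypothetical repo: 1000 statements, 770 covered -> 77.0% total coverage.
	covered, total := 770.0, 1000.0
	fmt.Printf("before: %.1f%%\n", covered/total*100) // before: 77.0%

	// Delete a 100-statement block that was itself 100% covered.
	covered, total = covered-100, total-100
	fmt.Printf("after:  %.1f%%\n", covered/total*100) // after:  74.4%
}
```

Even though no under-tested code was added, the average drops from 77.0% to roughly 74.4%, which is exactly the false positive scenario 1 describes.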
Are these the only scenarios we'd like to discuss here? I think these are enough, but I'd like to hear if folks can think of another, or perhaps we can expand on these scenarios with more examples.
thought: I think another example for scenario 3 would be adding logic to existing model_to_payload functions since those aren't unit tested and are only sometimes integration tested via the handler integration tests.
Our approach to Happo changes is currently the following: the Happo PR check is not a required check, but it reports failures and includes an easily-accessible link on the PR to Happo diffs which provide context around visual changes. Something similar could be implemented for code coverage checks with the following steps: | ||
1. Make the code coverage checks optional so that failures do not block a PR | ||
2. Generate a link automatically on the PR (perhaps through a comment) that provides more context behind the failure (including, but not limited to, the go-coverage.html artifact generated in circle-ci) |
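For illustration, step 2 could be a small CI job that posts a PR comment through GitHub's Issues API (PR comments go through that endpoint). This is a sketch only: the env var names and repo wiring are assumptions, not the project's actual configuration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
)

func main() {
	// All env vars here are illustrative; a real CI job would supply its
	// own equivalents (CI built-ins plus a bot token).
	repo := os.Getenv("REPO")             // e.g. "owner/name"
	pr := os.Getenv("PR_NUMBER")          // PR to comment on
	artifact := os.Getenv("COVERAGE_URL") // link to the go-coverage.html artifact

	payload, err := json.Marshal(map[string]string{
		"body": "Coverage check failed. Coverage report: " + artifact,
	})
	if err != nil {
		panic(err)
	}

	url := fmt.Sprintf("https://api.github.com/repos/%s/issues/%s/comments", repo, pr)
	req, err := http.NewRequest("POST", url, bytes.NewReader(payload))
	if err != nil {
		panic(err)
	}
	req.Header.Set("Authorization", "Bearer "+os.Getenv("GITHUB_TOKEN"))
	req.Header.Set("Accept", "application/vnd.github+json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("comment status:", resp.Status)
}
```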
This link should also point to documentation around the coverage tests and how to increase coverage. I think it's also possible to include this as a third option that may be the best of both worlds: keeping the tests, plus having a GH Action that alerts authors to read the specific documentation about how to interpret the code coverage results for the client and the server, and that also lets them reset the cache in a way that can be tracked and discussed when it comes up.
thought: +1 to the link to docs, and this could be its own third option, even if we are already implementing the Happo option.
That makes sense to me -- I'll gauge folks' feelings at the FE/BE check-in and update the ADR accordingly.
I think a few other options would be to
- maintain current test coverage and create documentation around resetting the cache
- lower the test coverage number
- review current test coverage and create tickets that increase the test coverage

I also feel a little uneasy about going down the Happo Approach path, as I worry about it becoming something that folks ignore.
2. **Addition of code that is _less_ tested than the total coverage and _should be tested more_.** This is the scenario that we want to address with our code coverage strategy.
3. **Addition of code that is _less_ tested than the total coverage and _doesn't need to be tested more_.** The difference between this scenario and scenario 2 can be subjective, but one can imagine a scenario in which a Go database model is created and doesn't require any tests.

Our current strategy of never allowing the total coverage to decrease attempts to address scenario 2, but it also unfairly flags scenarios 1 and 3. We also have the option to manually reset the test coverage floor, but determining whether that is appropriate (i.e. discerning which scenario is applicable to a given PR) can be difficult and time-consuming. Furthermore, any combination of the three scenarios can be applicable to a given PR -- it doesn't have to be just one -- which can make it extra difficult to decide whether a coverage decrease is legitimate.
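As a rough illustration of the floor mechanism being discussed, a coverage-floor check can be sketched as below. The file name, argument handling, and comparison are assumptions for the sketch, not the project's actual implementation:

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: floorcheck <current-coverage-percent>")
		os.Exit(2)
	}

	// The floor is the figure that gets "manually reset" when a legitimate
	// decrease (scenario 1 or 3) is accepted.
	floorBytes, err := os.ReadFile("coverage-floor.txt")
	if err != nil {
		panic(err)
	}
	floor, _ := strconv.ParseFloat(strings.TrimSpace(string(floorBytes)), 64)
	current, _ := strconv.ParseFloat(os.Args[1], 64) // e.g. "76.9" from CI

	if current < floor {
		fmt.Printf("coverage %.1f%% is below the floor of %.1f%%\n", current, floor)
		os.Exit(1)
	}
	fmt.Printf("coverage %.1f%% meets the floor of %.1f%%\n", current, floor)
}
```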
Do we have any data on how often scenarios 1 and 3 happen? In my memory, I can recall it happening twice a year, which to me doesn't suggest we really have a problem. We have hundreds of PRs a year for which the current process ensures we don't have a coverage decrease. Making a change for such an infrequent event doesn't really make a lot of sense to me, personally.
I'll also say that I've seen the test coverage problem come up many times, and most of the time the problem is that folks haven't rebased their PR or they need to increase test coverage. IMHO, we should just get rid of the coverage check instead of making it optional, as I believe folks will just ignore it.
@ahobson Thanks Drew. Is this supposed to read "should just get rid of" or "shouldn't just get rid of"?
@rogeruiz IMHO, if we think the test coverage problem is major enough to make the checks optional, we should just remove the checks entirely.
I've been thinking about this, and I think I'm wrong. I don't have any data to support my position that making the check optional will actually decrease test coverage.
This proposed change is one that we could easily undo if we needed to.
How about this: we note the coverage before putting this change in, and then one month later, at the BE/FE check-in, we report on the coverage numbers and how they compare to before the change. We can then discuss in the check-in whether the changes are concerning and whether we want to keep the checks optional. It can become something we bring up monthly at the check-in to ensure we aren't drifting the wrong way.
Thoughts?
@ahobson I think this plan makes a lot of sense -- I can mention in the chosen option that we will review it after a month. I should also note that a fairer comparison would be with the coverage we would have had if we hadn't made the check optional (which we can't know), since that figure would also potentially decrease every time we manually reset it.
thought: Good point @michaelvaldes-truss. I wonder if, to make that comparison better, we could compare the new change in test coverage against historical data on how coverage has changed with the required check in place, since we have reset the cache a few times.
I also want to address the very valid concerns expressed in comments on this PR -- that making the check optional would make it too easy for coverage to decrease. My assessment is that the current strategy already allows for decreases when folks manually reset the limit, and this can happen for very valid reasons (scenarios 2 and 3 described in the ADR). Analyzing coverage decreases is difficult to do automatically, so the intention of the Happo Approach was to put it entirely in the hands of the PR reviewers and also make it easier to access information relevant to that analysis.
I added these points to the ADR and also introduced a third option, Lenience for Tiny Coverage Decreases, which is a middle ground that attempts to partially address these concerns.
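For context, the mechanical core of a "Lenience for Tiny Coverage Decreases" option could be as small as a tolerance on the floor comparison. A sketch, where the function name and the 0.1-point tolerance are illustrative rather than what the ADR specifies:

```go
package main

import "fmt"

// passesWithLenience reports whether a coverage figure should pass, given
// the recorded floor and a small tolerance in percentage points. The real
// threshold would be whatever the ADR settles on.
func passesWithLenience(current, floor, tolerance float64) bool {
	return current >= floor-tolerance
}

func main() {
	fmt.Println(passesWithLenience(76.95, 77.0, 0.1)) // true: within tolerance
	fmt.Println(passesWithLenience(76.5, 77.0, 0.1))  // false: a real decrease
}
```

One design question this leaves open is whether the floor ratchets down when a tolerated decrease merges; if it doesn't, repeated tolerated decreases can compound, which is exactly the concern raised in the next comment.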
I do want to note that while I support the third option presented, there's a small chance there could be a 0.2% difference if you pull main to fix a 0.1% diff in coverage. That's what happened to me this last go-round with coverage issues. There's a little more detail in this thread for anyone curious.
I've thought about another option to consider, and that's to see if we can have PRs run test coverage only on new files. I think regardless of whichever option gets picked, we should analyze which packages have coverage below a certain threshold and write up tickets so that tests can be added to increase the coverage.
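To illustrate the "coverage only on new files" idea, here is a sketch that filters a standard `go test -coverprofile` output down to a set of changed files. The changed-file list is hard-coded for illustration; a real implementation would take it from something like `git diff --name-only`:

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

func main() {
	// Hypothetical set of files touched in the PR.
	changed := map[string]bool{
		"pkg/models/example.go": true,
	}

	f, err := os.Open("coverage.out")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		line := scanner.Text()
		if strings.HasPrefix(line, "mode:") {
			fmt.Println(line) // keep the profile's mode header
			continue
		}
		// Profile lines look like:
		// "github.com/org/repo/pkg/models/example.go:10.2,12.3 2 1"
		file := strings.SplitN(line, ":", 2)[0]
		for changedFile := range changed {
			if strings.HasSuffix(file, changedFile) {
				fmt.Println(line)
				break
			}
		}
	}
	if err := scanner.Err(); err != nil {
		panic(err)
	}
}
```

The filtered profile could then be fed back through `go tool cover -func` to get a per-PR coverage figure that ignores untouched code, sidestepping scenario 1 entirely.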
## Decision Outcome

<!-- * Chosen Option: -->
Chosen Option: *The Happo Approach*
question: Will this ADR be reassessed in the future? Is the plan to have the test coverage be optional forever?
It seems like maybe we'll re-assess after a month to see if things are going well: https://github.com/transcom/mymove-docs/pull/350/files#r1221538465
Though if things aren't going how we'd like them to, we'd likely then have to decide how we want to correct things.
Thanks -- I added a note in the ADR to reflect this!
Co-authored-by: Roger Steve Ruiz <[email protected]>
I approve in support of the third option presented.
I liked the Happo Approach best
I think the middle ground is reasonable for the time being.
I would like further discussion on what Kyle mentioned in the BE/FE check-in, which is an assessment of whether this test coverage check is valuable in the first place.
If we do decide that it is valuable to have, I would also like us to consider adopting a long-term strategy of creating a Jira epic to increase coverage for code that doesn't have good test coverage.
As a datapoint, in the week that we have had test coverage checks disabled
I appreciate your work on this ADR and the conversation that it has encouraged. I approve this in favor of the "Happo Approach." I think that makes the most sense based on the discussion I have heard.
This PR should create an ADR that suggests improvements to the existing code test coverage checks, with the intention of avoiding PR roadblocks and making the checks more transparent.