-
Notifications
You must be signed in to change notification settings - Fork 70
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds custom marshal/unmarshal to Go Statement struct, fixes #363, see sigstore/sigstore-go#326 #403
base: main
Are you sure you want to change the base?
Adds custom marshal/unmarshal to Go Statement struct, fixes #363, see sigstore/sigstore-go#326 #403
Conversation
…e-go#326 Signed-off-by: Andrew Gillis <[email protected]>
Signed-off-by: Andrew Gillis <[email protected]>
This gets a +1 from me! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for sending this @gillisandrew . It seems like the need for these custom functions stems from the use of the soon-to-be deprecated in-toto-golang data structures (hence why #363 was closed).
Are these added functions needed for backward compatibility with pre-SLSA v1 Provenance attestations?
I think #363 was closed erroneously, as the issue here is mostly unrelated to the in-toto-golang data structures. We encountered the same bug in sigstore-go after migrating to the data structures in this repo. The issue is specifically about how the new data structures serialize to JSON. When using the new data structures, when you pass them to |
@marcelamelara as cody noted, the structs in this repo are the issue, the field names emitted in the json struct tags are incorrect. There's some background and alternative solutions in golang/protobuf#52 |
ugh, CI complains about golang 1.21, but I can confirm that wfm locally... |
we should update the run-go-tests.yml to use a newer version of the toolchain. |
That can be updated to always use what's in |
@codysoyland and @gillisandrew thanks for the clarifications. I'm fine with the tests you've added, thank you! But my preference would be to find an alternative way to emit structs with the right json tags from the start, rather than adding custom marshalling/unmarshalling functions because these will be needed in more places than statement.go. |
Does this seem like a reasonable alternative? https://github.com/favadi/protoc-go-inject-tag |
This has unfortunately been an issue with protoc-gen-go for years (see golang/protobuf#52). Looking at other workarounds... maybe this package could help? |
Thanks for the suggestion @codysoyland . My only concern with this package is their "we don't accept issues with bug reports or PRs with bug fixes" clause at the bottom of their README. I worry this package will be quite brittle compared to one that has a bigger community around it. |
Good callout, the one you posted looks promising! |
I'd be a bit more comfortable with a plugin like this one, or the tool I shared earlier. |
@marcelamelara given that encoding/json isn't necessarily compatible with protocol buffer messages1, updating the field names in json tags isn't enough. In fact, suppressing the json tags entirely could help by making the bug less subtle. Protojson should be used, be it by the end users or included in a custom marshal/unmarshal method. Given the limited number of structs in the library, I feel manually implementing the latter isn't especially onerous and avoids requiring additional tooling and dependencies. Footnotes |
@SantiagoTorres should I include the CI fix in the PR? @marcelamelara let me know if I should remove the implementation so the updated tests can be merged, perhaps moving the discussion to an issue. |
Concerns about whether or not re-tagging the json field name in the Go bindings fully resolves the issues you are currently encountering aside, we (as the upstream) repo can certainly take on one additional tool/dependency if it means that "it will just work" for downstream. (I get that it's never quite so simple ;) ) At the same time, I don't see a problem with prescribing the use of protojson for our Go bindings, and we can offer custom marshalling functions for v1 structs as a quick workaround. So really, I'd like to ultimately see both changes (what you've sent here + the corresponding functions in resource_descriptor.go) and the proto-gen-go changes (possibly in a different PR). And yes, if you have a fix for the CI, please feel free to include that in this PR. @TomHennen @pxp928 please chime in. |
@marcelamelara Yes, ideally downstream users won't have to worry about this too much. Having delved into ProtoJSON a bit further it seems the unmarshal may be too permissive here given that both proto field names and lowerCamelCase (or json_name) names are accepted. Overly permissive given that by the time the statement is unmarshaled the original field name isn't accessible for validation meaning invalid statements (i.e. ones using |
Signed-off-by: Andrew Gillis <[email protected]>
Signed-off-by: Andrew Gillis <[email protected]>
8f2ad55
to
a64ec07
Compare
@gillisandrew Interesting. So what you're saying is that protojson unmarshalling only behaves how we need it to if we have marshalled the struct with protojson as well. Or is there more to it? I'm thinking we may want to document a warning about this at the very least. |
@marcelamelara Sorry I could have explained that a little more clearly. The issue with using protojson to unmarshal (json -> struct) before validation is that it will accept either the JSON name (lowerCamelCase of field name unless json_name is specified) or field name (matching proto field name), but not both That behavior is specified in the standard:
This is an issue because the in-toto spec doesn't accept both (yet the generated libs do). Suppose a user parses an validates a struct with the following: func parseStatementJSON(jsonText []byte) (*v1.Statement, error) {
s := &v1.Statement{}
// Step 1: Unmarshal
err := protojson.Unmarshal(jsonText, s)
if err != nil {
return nil, fmt.Errorf("failed to unmarshal statement: %w", err)
}
// Step 2: Validate
err = s.Validate()
if err != nil {
return nil, fmt.Errorf("statement validation failed: %w", err)
}
// Step 3: Profit???
return s, nil
} They would (rightfully) expect that the statement satisfies the statement layer spec, but since protojson is used invalid statements that use proto field names pass validation e.g. {
"type": "https://in-toto.io/Statement/v1",
"subject": [
{
"name": "sub name",
"digest": {
"sha256": "2ce2e480e3c3f7ca0af83418d3ebaeedacee135dbac94bd946d7d84edabcdb64"
}
}
],
"predicate_type": "https://example.com/unknownPred2",
"predicate": {
"foo": {
"bar": "baz"
}
}
} Because Further, since protojson doesn't allow both the JSON name and the field name, it means any proto field name is essentially "reserved". Perhaps a good solution, killing two birds with one stone, is to provide a method to validate and unmarshal the json. Reflection can identify field name that don't match the JSON name and return an error when one is encountered. Or, going full circle, updating the json struct tags to match the spec and using encoding/json to handle the unmarshal (ensuring the cases it can't handle aren't included in protos). |
@gillisandrew I appreciate the more detailed explanation.
I will summarize what I think you're suggesting here:
For 1), I think it's not unreasonable to add this functionality to the existing |
This PR adds custom marshaler and unmarshaler to the statement struct to help users avoid incorrectly serializing in-toto statements. This fixes #363 and stems from a discussion in sigstore/sigstore-go#326