-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
proto2schema 1/n: Schema generation #4727
Conversation
commit 0339c06 Author: Alex Rosengarten <[email protected]> Date: Fri Feb 21 14:06:11 2020 -0800 Add Kotlinpoet to project
commit 1e8df40 Author: Alex Rosengarten <[email protected]> Date: Fri Feb 21 14:44:59 2020 -0800 updated to internal versions commit 0339c06 Author: Alex Rosengarten <[email protected]> Date: Fri Feb 21 14:06:11 2020 -0800 Add Kotlinpoet to project
java/arcs/core/tools/Proto2Schema.kt
Outdated
echo("$protoFile --> $outFile") | ||
|
||
val bytes = protoFile.readBytes() | ||
val manifest = Manifest.parseFrom(bytes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'll need to sync.
We should also sync about what it is you need from the serialization. Let's maybe all sync on Monday. Current proto is set up for a single recipe - I think it would be best if we all talked together with @csilvestrini.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that idea a lot. Let's all get on the same page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll do an initial merge now.
class Proto2SchemaTest { | ||
|
||
@Test | ||
fun schemaGeneration_singleProperty() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't a test for proto2schema take a proto and produce a schema instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, totally. Here, I'm testing an internal method, since I was waining on a few things (e.g. the schema protos) I am going to have a discussion with Ray today to discuss how the output Kt file should be structured.
return name[0].toLowerCase() + name.substring(1) | ||
} | ||
|
||
fun generateSchemas(schemas: List<Schema>): Iterable<PropertySpec> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know why you generate schemas from the objects that are already schemas. It would seem more productive to generate them from the proto. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The schema protos didn't exist at the time that I wrote this PR (I will sync now). I wanted to get started on learning how to do generation in KtPoet asap.
I'll revise this to use the protos that you landed.
|
||
return [DefaultInfo(files = depset([out]))] | ||
|
||
proto2schema = rule( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why proto2schema? What is the ultimate output of this rule? As far as I understand ProdEx it should be Plans, not schemas. I may however have some wrong ideas. @csilvestrini ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a different story to recipe2plan. I'll chat you the ticket.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mmandlis can speak to this: the most immediate need with codegeneration is creating schemas (and Type
s) from manifests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if consolidating the schema and plan generation into a single rule would be a good idea? Are we likely to want to do them separately?
|
||
package(default_visibility = ["//visibility:public"]) | ||
|
||
kt_jvm_library( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If arcs/core/tools
is going to consist of command-line JVM-based tooling, it might make sense to create a README.md in this directory to explain that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea.
import com.squareup.kotlinpoet.PropertySpec | ||
import java.io.File | ||
|
||
class Proto2Schema : CliktCommand( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
KDoc would be great here, including documentation on how the. tool is intended to be used.
help = """Generates Schemas and Types from a protobuf-serialized manifest. | ||
|
||
This script reads schemas from a serialized manifest and generates Kotlin `Schema` and `Type` | ||
classes.""", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use .trimIndent()
unless you intend the second paragraph to be indented by 4 spaces when output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, for readability:
help = """
Generates Schemas and Types from a protobuf-serialized manifest.
This script reads schemas from a serialized manifest and generates Kotlin `Schema` and
`Type` implementations.
""".trimIndent(),
val outdir by option(help = "output directory; defaults to '.'").file(fileOkay = false) | ||
val outfile by option(help = "output filename; if omitted") | ||
val packageName by option(help = "scope to specified package; default: 'arcs'").default("arcs") | ||
val protos by argument(help = "paths to protobuf-serialized manifests") | ||
.file(exists = true).multiple() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can these be private
?
override fun run() { | ||
protos.forEach { protoFile -> | ||
val outFile = outputFile(protoFile) | ||
echo("$protoFile --> $outFile") | ||
|
||
val bytes = protoFile.readBytes() | ||
val envelope = RecipeEnvelopeProto.parseFrom(bytes) | ||
|
||
outFile.writeBytes(protoFile.readBytes()) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could make this a function-expression:
override fun run() = protos.forEach { protoFile ->
val outFile = outputFile(protoFile)
echo("$protoFile --> $outFile")
val bytes = protoFile.readBytes()
val envelope = RecipeEnvelopeProto.parseFrom(bytes)
outFile.writeBytes(protoFile.readBytes())
}
} | ||
|
||
/** Produces a File object per user specification, or with default values. */ | ||
fun outputFile(inputFile: File): File { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be private?
val schemaClass = ClassName("arcs.core.data", "Schema") | ||
val schemaNameClass = ClassName("arcs.core.data", "SchemaName") | ||
val schemaFieldsClass = ClassName("arcs.core.data", "SchemaFields") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this tools package can depend on //java/arcs/core/data
, prefer using Kotlin reflection (will need a new maven library import in the WORKSPACE
/third_party for this) and KotlinPoet's .asTypeName
:
val schemaClass = Schema::class.createType().asTypeName()
This will allow us to more easily and safely refactor the classes you're using here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you could probably make that shorter using an extension function like Schema::class.toTypeName()
val schemaClass = ClassName("arcs.core.data", "Schema") | ||
val schemaNameClass = ClassName("arcs.core.data", "SchemaName") | ||
val schemaFieldsClass = ClassName("arcs.core.data", "SchemaFields") | ||
return schemas.map { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This lambda's body is complex enough to warrant its own dedicated function.
return schemas.map(::generateSchemaSpec)
// ...
private fun generateSchemaSpec(schema: Schema): PropertySpec {
// ...
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would let you test generateSchemaSpec
in isolation.
.addStatement("%T(", schemaClass) | ||
.indent() | ||
.addStatement("listOf(") | ||
.indent() | ||
.apply { | ||
it.names.forEachIndexed { index, name -> | ||
if (index > 0) addStatement(",%T(%S)", schemaNameClass, name.name) | ||
else addStatement("%T(%S)", schemaNameClass, name.name) | ||
} | ||
} | ||
.unindent() | ||
.addStatement("),") | ||
.addStatement("%T(", schemaFieldsClass) | ||
.indent() | ||
.addStatement("singletons = mapOf(") | ||
.indent() | ||
.apply { | ||
val entries = it.fields.singletons.entries | ||
entries.forEachIndexed { index, entry -> | ||
when (entry.value.tag) { | ||
FieldType.Tag.EntityRef -> add( | ||
"%S to %T(%S)", | ||
entry.key, | ||
FieldType.EntityRef::class, | ||
(entry.value as FieldType.EntityRef).schemaHash | ||
) | ||
FieldType.Tag.Primitive -> add( | ||
"%S to %T.%L", | ||
entry.key, | ||
FieldType::class, | ||
(entry.value as FieldType.Primitive).primitiveType | ||
) | ||
} | ||
if (index != entries.size - 1) add(",") | ||
add("\n") | ||
} | ||
} | ||
.unindent() | ||
.addStatement("),") | ||
.addStatement("collections = mapOf(") | ||
.indent() | ||
.apply { | ||
val entries = it.fields.collections.entries | ||
entries.forEachIndexed { index, entry -> | ||
when (entry.value.tag) { | ||
FieldType.Tag.EntityRef -> add( | ||
"%S to %T(%S)", | ||
entry.key, | ||
FieldType.EntityRef::class, | ||
(entry.value as FieldType.EntityRef).schemaHash | ||
) | ||
FieldType.Tag.Primitive -> add( | ||
"%S to %T.%L", | ||
entry.key, | ||
FieldType::class, | ||
(entry.value as FieldType.Primitive).primitiveType | ||
) | ||
} | ||
if (index != entries.size - 1) add(",") | ||
add("\n") | ||
} | ||
} | ||
.unindent() | ||
.addStatement(")") | ||
.unindent() | ||
.addStatement("),") | ||
.addStatement("%S", it.hash) | ||
.addStatement(")") | ||
.build()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is way too much logic to exist in one monolith. Please break this up into smaller, more-teestable and more-readable functions.
assertThat(schemaProperty.toString()).isEqualTo(""" | ||
|val sliceSchema: arcs.core.data.Schema = arcs.core.data.Schema( | ||
| listOf( | ||
| arcs.core.data.SchemaName("Slice") | ||
| ), | ||
| arcs.core.data.SchemaFields( | ||
| singletons = mapOf( | ||
| "num" to arcs.core.data.FieldType.Number, | ||
| "flg" to arcs.core.data.FieldType.Boolean, | ||
| "txt" to arcs.core.data.FieldType.Text | ||
| ), | ||
| collections = mapOf( | ||
| ) | ||
| ), | ||
| "f4907f97574693c81b5d62eb009d1f0f209000b8" | ||
| ) | ||
| | ||
|""".trimMargin()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be very brittle and hard to adjust as schema-generation evolves over time. Instead - prefer testing the actual contents of the generated specs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll wait for all of Jason's code style suggestions to be addressed before reviewing the Kotlin files properly
|
||
proto2schema( | ||
name = "example_generation", | ||
srcs = [":example"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could consider renaming example
to something like example_pb
so that it's easier to tell what it generates.
|
||
package(default_visibility = ["//visibility:public"]) | ||
|
||
kt_jvm_library( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea.
/** Produces a File object per user specification, or with default values. */ | ||
fun outputFile(inputFile: File): File { | ||
val outputName = outfile ?: inputFile.nameWithoutExtension + ".kt" | ||
val outputPath = outdir ?: System.getProperty("user.dir") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional, but you could consider not providing default values here. It might be better to opt for slightly more cumbersome but definitely more correct command line args. We don't expect people will be running this tool themselves anyway, they'll use our bazel macros, which will do the right thing.
|
||
args.add_all("--outfile", [output_name]) | ||
args.add_all("--outdir", [out.dirname]) | ||
args.add_all("--package-name", [ctx.attr.package]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these three can use args.add
instead of args.add_all
|
||
return [DefaultInfo(files = depset([out]))] | ||
|
||
proto2schema = rule( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if consolidating the schema and plan generation into a single rule would be a good idea? Are we likely to want to do them separately?
After discussions last week, the strategy for codegeneration has changed. As a result, I'm going to close this PR. The feedback here will be useful to future PRs, however. |
This is one of multiple PRs that will generate schemas from manifests via protobufs. So far, this creates a copy-and-past BUILD rule using a Kotlin CLI. Later, this will change to actually generate Kotlin jars.
This PR also introduces the first use of Kotlin Poet in the project. Schema generation via this library is exercised in a test. Once @piotrswigon 's efforts to serialize manifests (read: schemas) into protos are accomplished, we'll be able to adapt this CL to generate schemas from manifests end-to-end.
I anticipate that the names for the build macros don't quite fit our project, I'm happy to take suggestions (a nod to you, @csilvestrini).