proto2schema 1/n: Schema generation #4727

alxmrs · 2020-02-21T23:57:05Z

This is one of multiple PRs that will generate schemas from manifests via protobufs. So far, this creates a copy-and-paste BUILD rule using a Kotlin CLI. Later, this will change to actually generate Kotlin jars.

This PR also introduces the first use of KotlinPoet in the project. Schema generation via this library is exercised in a test. Once @piotrswigon's efforts to serialize manifests (read: schemas) into protos are complete, we'll be able to adapt this CL to generate schemas from manifests end-to-end.

I anticipate that the names for the build macros don't quite fit our project, so I'm happy to take suggestions (a nod to you, @csilvestrini).

commit 0339c06 Author: Alex Rosengarten <[email protected]> Date: Fri Feb 21 14:06:11 2020 -0800 Add Kotlinpoet to project

commit 1e8df40 Author: Alex Rosengarten <[email protected]> Date: Fri Feb 21 14:44:59 2020 -0800 updated to internal versions commit 0339c06 Author: Alex Rosengarten <[email protected]> Date: Fri Feb 21 14:06:11 2020 -0800 Add Kotlinpoet to project

piotrswigon · 2020-02-22T00:10:19Z

java/arcs/core/tools/Proto2Schema.kt

+            echo("$protoFile --> $outFile")
+
+            val bytes = protoFile.readBytes()
+            val manifest = Manifest.parseFrom(bytes)


You'll need to sync.

We should also sync about what it is you need from the serialization. Let's maybe all sync on Monday. The current proto is set up for a single recipe; I think it would be best if we all talked together with @csilvestrini.

I like that idea a lot. Let's all get on the same page.

I'll do an initial merge now.

piotrswigon · 2020-02-22T00:18:06Z

javatests/arcs/core/tools/Proto2SchemaTest.kt

+class Proto2SchemaTest {
+
+    @Test
+    fun schemaGeneration_singleProperty() {


Shouldn't a test for proto2schema take a proto and produce a schema instead?

Yes, totally. Here, I'm testing an internal method, since I was waiting on a few things (e.g. the schema protos). I am going to talk with Ray today about how the output Kt file should be structured.

piotrswigon · 2020-02-22T00:21:19Z

java/arcs/core/tools/Proto2Schema.kt

+        return name[0].toLowerCase() + name.substring(1)
+    }
+
+    fun generateSchemas(schemas: List<Schema>): Iterable<PropertySpec> {


I don't know why you generate schemas from the objects that are already schemas. It would seem more productive to generate them from the proto. WDYT?

The schema protos didn't exist at the time that I wrote this PR (I will sync now). I wanted to get started on learning how to do generation in KtPoet asap.
I'll revise this to use the protos that you landed.

piotrswigon · 2020-02-22T00:22:34Z

third_party/java/arcs/build_defs/internal/schemas.bzl

+
+    return [DefaultInfo(files = depset([out]))]
+
+proto2schema = rule(


Why proto2schema? What is the ultimate output of this rule? As far as I understand ProdEx, it should be Plans, not schemas. I may, however, have some wrong ideas. @csilvestrini?

This is a different story to recipe2plan. I'll chat you the ticket.

@mmandlis can speak to this: the most immediate need with code generation is creating schemas (and Types) from manifests.

jasonwyatt · 2020-02-22T22:32:35Z

java/arcs/core/tools/BUILD

+
+package(default_visibility = ["//visibility:public"])
+
+kt_jvm_library(


If arcs/core/tools is going to consist of command-line JVM-based tooling, it might make sense to create a README.md in this directory to explain that.

jasonwyatt · 2020-02-22T22:33:09Z

java/arcs/core/tools/Proto2Schema.kt

+import com.squareup.kotlinpoet.PropertySpec
+import java.io.File
+
+class Proto2Schema : CliktCommand(


KDoc would be great here, including documentation on how the tool is intended to be used.

jasonwyatt · 2020-02-22T22:34:06Z

java/arcs/core/tools/Proto2Schema.kt

+    help = """Generates Schemas and Types from a protobuf-serialized manifest.
+
+    This script reads schemas from a serialized manifest and generates Kotlin `Schema` and `Type` 
+    classes.""",


Use .trimIndent() unless you intend the second paragraph to be indented by 4 spaces when output.

Also, for readability:

    help = """
        Generates Schemas and Types from a protobuf-serialized manifest.

        This script reads schemas from a serialized manifest and generates Kotlin `Schema` and `Type`
        implementations.
    """.trimIndent(),
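To make the difference concrete, here is a minimal standalone sketch using only the Kotlin standard library. The help strings are shortened, hypothetical stand-ins for the ones in this PR, and the sketch does not model how the CLI library lays out help text afterwards; it only shows what `trimIndent()` does to the raw string.

```kotlin
// Hypothetical help text written without trimIndent(), as in the original diff.
val rawHelp = """Generates Schemas and Types.

    This script reads schemas."""

// The same text written on its own lines and passed through trimIndent().
val trimmedHelp = """
    Generates Schemas and Types.

    This script reads schemas.
""".trimIndent()

fun main() {
    // Without trimIndent(), the second paragraph keeps its four leading spaces.
    println(rawHelp.lines().last().startsWith("    "))
    // trimIndent() strips the common indent and the blank first and last lines.
    println(trimmedHelp.lines().first())
    println(trimmedHelp.lines().last())
}
```

Both paragraphs of `trimmedHelp` start at column zero, so the source can be indented for readability without that indentation leaking into the string.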

jasonwyatt · 2020-02-22T22:34:31Z

java/arcs/core/tools/Proto2Schema.kt

+    val outdir by option(help = "output directory; defaults to '.'").file(fileOkay = false)
+    val outfile by option(help = "output filename; if omitted")
+    val packageName by option(help = "scope to specified package; default: 'arcs'").default("arcs")
+    val protos by argument(help = "paths to protobuf-serialized manifests")
+        .file(exists = true).multiple()


Can these be private?

jasonwyatt · 2020-02-22T22:38:12Z

java/arcs/core/tools/Proto2Schema.kt

+    override fun run() {
+        protos.forEach { protoFile ->
+            val outFile = outputFile(protoFile)
+            echo("$protoFile --> $outFile")
+
+            val bytes = protoFile.readBytes()
+            val envelope = RecipeEnvelopeProto.parseFrom(bytes)
+
+            outFile.writeBytes(protoFile.readBytes())
+        }
+    }


Could make this an expression-bodied function:

    override fun run() = protos.forEach { protoFile ->
        val outFile = outputFile(protoFile)
        echo("$protoFile --> $outFile")

        val bytes = protoFile.readBytes()
        val envelope = RecipeEnvelopeProto.parseFrom(bytes)

        outFile.writeBytes(protoFile.readBytes())
    }

jasonwyatt · 2020-02-22T22:42:17Z

java/arcs/core/tools/Proto2Schema.kt

+    }
+
+    /** Produces a File object per user specification, or with default values. */
+    fun outputFile(inputFile: File): File {


Can this be private?

jasonwyatt · 2020-02-22T22:47:24Z

java/arcs/core/tools/Proto2Schema.kt

+        val schemaClass = ClassName("arcs.core.data", "Schema")
+        val schemaNameClass = ClassName("arcs.core.data", "SchemaName")
+        val schemaFieldsClass = ClassName("arcs.core.data", "SchemaFields")


If this tools package can depend on //java/arcs/core/data, prefer using Kotlin reflection (will need a new maven library import in the WORKSPACE/third_party for this) and KotlinPoet's .asTypeName:

val schemaClass = Schema::class.createType().asTypeName()

This will allow us to more easily and safely refactor the classes you're using here.

I think you could probably make that shorter using an extension function like Schema::class.toTypeName()

jasonwyatt · 2020-02-22T22:50:18Z

java/arcs/core/tools/Proto2Schema.kt

+        val schemaClass = ClassName("arcs.core.data", "Schema")
+        val schemaNameClass = ClassName("arcs.core.data", "SchemaName")
+        val schemaFieldsClass = ClassName("arcs.core.data", "SchemaFields")
+        return schemas.map {


This lambda's body is complex enough to warrant its own dedicated function.

    return schemas.map(::generateSchemaSpec)

    // ...

    private fun generateSchemaSpec(schema: Schema): PropertySpec {
        // ...
    }

This would let you test generateSchemaSpec in isolation.
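A minimal standalone sketch of that shape, for illustration only. It assumes a hypothetical, cut-down `Schema` data class and renders plain strings instead of KotlinPoet `PropertySpec` objects, so it is not the code from this PR; the point is just that the per-schema function can be called and asserted on directly.

```kotlin
// Hypothetical stand-in for arcs.core.data.Schema, reduced to what the sketch needs.
data class Schema(val names: List<String>, val hash: String)

/** Renders one schema as a property declaration (plain text, not a PropertySpec). */
fun generateSchemaSpec(schema: Schema): String {
    // Derive a property name such as "sliceSchema" from the first schema name.
    val propertyName = schema.names.first().replaceFirstChar { it.lowercase() } + "Schema"
    val names = schema.names.joinToString(", ") { "\"$it\"" }
    return "val $propertyName = Schema(listOf($names), \"${schema.hash}\")"
}

/** The mapping step shrinks to a function reference once the body is extracted. */
fun generateSchemas(schemas: List<Schema>): List<String> = schemas.map(::generateSchemaSpec)

fun main() {
    generateSchemas(listOf(Schema(listOf("Slice"), "abc"))).forEach(::println)
}
```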

jasonwyatt · 2020-02-22T22:52:34Z

java/arcs/core/tools/Proto2Schema.kt

+                    .addStatement("%T(", schemaClass)
+                    .indent()
+                    .addStatement("listOf(")
+                    .indent()
+                    .apply {
+                        it.names.forEachIndexed { index, name ->
+                            if (index > 0) addStatement(",%T(%S)", schemaNameClass, name.name)
+                            else addStatement("%T(%S)", schemaNameClass, name.name)
+                        }
+                    }
+                    .unindent()
+                    .addStatement("),")
+                    .addStatement("%T(", schemaFieldsClass)
+                    .indent()
+                    .addStatement("singletons = mapOf(")
+                    .indent()
+                    .apply {
+                        val entries = it.fields.singletons.entries
+                        entries.forEachIndexed { index, entry ->
+                            when (entry.value.tag) {
+                                FieldType.Tag.EntityRef -> add(
+                                    "%S to %T(%S)",
+                                    entry.key,
+                                    FieldType.EntityRef::class,
+                                    (entry.value as FieldType.EntityRef).schemaHash
+                                )
+                                FieldType.Tag.Primitive -> add(
+                                    "%S to %T.%L",
+                                    entry.key,
+                                    FieldType::class,
+                                    (entry.value as FieldType.Primitive).primitiveType
+                                )
+                            }
+                            if (index != entries.size - 1) add(",")
+                            add("\n")
+                        }
+                    }
+                    .unindent()
+                    .addStatement("),")
+                    .addStatement("collections = mapOf(")
+                    .indent()
+                    .apply {
+                        val entries = it.fields.collections.entries
+                        entries.forEachIndexed { index, entry ->
+                            when (entry.value.tag) {
+                                FieldType.Tag.EntityRef -> add(
+                                    "%S to %T(%S)",
+                                    entry.key,
+                                    FieldType.EntityRef::class,
+                                    (entry.value as FieldType.EntityRef).schemaHash
+                                )
+                                FieldType.Tag.Primitive -> add(
+                                    "%S to %T.%L",
+                                    entry.key,
+                                    FieldType::class,
+                                    (entry.value as FieldType.Primitive).primitiveType
+                                )
+                            }
+                            if (index != entries.size - 1) add(",")
+                            add("\n")
+                        }
+                    }
+                    .unindent()
+                    .addStatement(")")
+                    .unindent()
+                    .addStatement("),")
+                    .addStatement("%S", it.hash)
+                    .addStatement(")")
+                    .build())


This is way too much logic to exist in one monolith. Please break this up into smaller, more testable and more readable functions.
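One possible decomposition, sketched with the standard library only: the `singletons` and `collections` blocks in the diff are identical apart from their label, so a single helper could serve both. The `FieldType` model below is a simplified, hypothetical stand-in for the real class, and the helpers build strings rather than KotlinPoet code blocks.

```kotlin
// Hypothetical, simplified model of the field types referenced in the diff.
sealed class FieldType {
    data class Primitive(val primitiveType: String) : FieldType()
    data class EntityRef(val schemaHash: String) : FieldType()
}

/** Renders a single `"key" to value` map entry. */
fun renderEntry(key: String, type: FieldType): String = when (type) {
    is FieldType.Primitive -> "\"$key\" to FieldType.${type.primitiveType}"
    is FieldType.EntityRef -> "\"$key\" to FieldType.EntityRef(\"${type.schemaHash}\")"
}

/** Renders a labelled `mapOf(...)` argument; shared by singletons and collections. */
fun renderFieldMap(label: String, fields: Map<String, FieldType>): String {
    // joinToString places the separator between entries, so no index bookkeeping is needed.
    val body = fields.entries.joinToString(",\n") { (key, type) -> "    " + renderEntry(key, type) }
    return if (body.isEmpty()) "$label = mapOf()" else "$label = mapOf(\n$body\n)"
}

fun main() {
    val singletons = mapOf(
        "num" to FieldType.Primitive("Number"),
        "ref" to FieldType.EntityRef("abc")
    )
    println(renderFieldMap("singletons", singletons))
    println(renderFieldMap("collections", emptyMap()))
}
```

Each helper takes plain inputs and returns a value, so the entry rendering and the map rendering can be unit-tested separately from the rest of the generator.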

jasonwyatt · 2020-02-22T22:54:09Z

javatests/arcs/core/tools/Proto2SchemaTest.kt

+        assertThat(schemaProperty.toString()).isEqualTo("""
+            |val sliceSchema: arcs.core.data.Schema = arcs.core.data.Schema(
+            |    listOf(
+            |        arcs.core.data.SchemaName("Slice")
+            |    ),
+            |    arcs.core.data.SchemaFields(
+            |        singletons = mapOf(
+            |            "num" to arcs.core.data.FieldType.Number,
+            |            "flg" to arcs.core.data.FieldType.Boolean,
+            |            "txt" to arcs.core.data.FieldType.Text
+            |        ),
+            |        collections = mapOf(
+            |        )
+            |    ),
+            |    "f4907f97574693c81b5d62eb009d1f0f209000b8"
+            |    )
+            |
+            |""".trimMargin())


This will be very brittle and hard to adjust as schema generation evolves over time. Instead, prefer testing the actual contents of the generated specs.

csilvestrini

I'll wait for all of Jason's code style suggestions to be addressed before reviewing the Kotlin files properly.

csilvestrini · 2020-02-25T02:14:37Z

java/arcs/core/data/testdata/BUILD

+
+proto2schema(
+    name = "example_generation",
+    srcs = [":example"],


You could consider renaming example to something like example_pb so that it's easier to tell what it generates.

csilvestrini · 2020-02-25T02:14:49Z

java/arcs/core/tools/BUILD

+
+package(default_visibility = ["//visibility:public"])
+
+kt_jvm_library(


csilvestrini · 2020-02-25T02:18:03Z

java/arcs/core/tools/Proto2Schema.kt

+    /** Produces a File object per user specification, or with default values. */
+    fun outputFile(inputFile: File): File {
+        val outputName = outfile ?: inputFile.nameWithoutExtension + ".kt"
+        val outputPath = outdir ?: System.getProperty("user.dir")


Optional, but you could consider not providing default values here. It might be better to opt for slightly more cumbersome but definitely more correct command-line args. We don't expect people will be running this tool themselves anyway; they'll use our Bazel macros, which will do the right thing.

csilvestrini · 2020-02-25T02:20:52Z

third_party/java/arcs/build_defs/internal/schemas.bzl

+
+    args.add_all("--outfile", [output_name])
+    args.add_all("--outdir", [out.dirname])
+    args.add_all("--package-name", [ctx.attr.package])


I think these three can use args.add instead of args.add_all, since each flag takes a single value.

csilvestrini · 2020-02-25T02:21:53Z

third_party/java/arcs/build_defs/internal/schemas.bzl

+
+    return [DefaultInfo(files = depset([out]))]
+
+proto2schema = rule(


I wonder if consolidating the schema and plan generation into a single rule would be a good idea? Are we likely to want to do them separately?

alxmrs · 2020-03-02T21:16:08Z

After discussions last week, the strategy for code generation has changed. As a result, I'm going to close this PR.

The feedback here will be useful to future PRs, however.

alxmrs added 11 commits February 21, 2020 14:10

Squashed commit of the following:

15c899f

commit 0339c06 Author: Alex Rosengarten <[email protected]> Date: Fri Feb 21 14:06:11 2020 -0800 Add Kotlinpoet to project

Compiling proto2schema java binary

088370c

Quick CLI tool

974ed73

Created rule and example for plant2schema

0d72e06

Squashed commit of the following:

c62f51a

commit 1e8df40 Author: Alex Rosengarten <[email protected]> Date: Fri Feb 21 14:44:59 2020 -0800 updated to internal versions commit 0339c06 Author: Alex Rosengarten <[email protected]> Date: Fri Feb 21 14:06:11 2020 -0800 Add Kotlinpoet to project

parse protos

5b4dc82

first iteration: schema tests

2e4c344

Got single test passing

7c498ae

cleaned up test

2fdd00b

Merge branch 'master' of github.com:PolymerLabs/arcs into proto2schema

449b7ec

lint fixes

953b9af

alxmrs requested review from jasonwyatt, cromwellian, csilvestrini and piotrswigon February 21, 2020 23:57

googlebot added the cla: yes label Feb 21, 2020

piotrswigon reviewed Feb 22, 2020

fixed given merge

19cfa00

jasonwyatt requested changes Feb 22, 2020

csilvestrini reviewed Feb 25, 2020

alxmrs closed this Mar 2, 2020
