Merge pull request twitter#1267 from rubanm/rubanm/jdbc_macros_merged
Adds jdbc macros from internal
ianoc committed Jul 2, 2015
2 parents c394a1c + 9c93939 commit 500ab80
Showing 28 changed files with 1,747 additions and 66 deletions.
8 changes: 8 additions & 0 deletions .travis.yml
@@ -54,6 +54,14 @@ matrix:
env: BUILD="base" TEST_TARGET="scalding-serialization scalding-serialization-macros"
script: "scripts/run_test.sh"

- scala: 2.10.5
env: BUILD="base" TEST_TARGET="scalding-db scalding-db-macros"
script: "scripts/run_test.sh"

- scala: 2.11.7
env: BUILD="base" TEST_TARGET="scalding-db scalding-db-macros"
script: "scripts/run_test.sh"

- scala: 2.10.5
env: BUILD="test tutorials and matrix tutorials and repl" TEST_TARGET="scalding-repl"
script:
15 changes: 15 additions & 0 deletions project/Build.scala
@@ -214,6 +214,8 @@ object ScaldingBuild extends Build {
scaldingJdbc,
scaldingHadoopTest,
scaldingMacros,
scaldingDb,
scaldingDbMacros,
maple,
executionTutorial,
scaldingSerialization,
@@ -522,4 +524,17 @@ object ScaldingBuild extends Build {
)
}
).dependsOn(scaldingCore)

lazy val scaldingDb = module("db").dependsOn(scaldingCore)

lazy val scaldingDbMacros = module("db-macros").settings(
libraryDependencies <++= (scalaVersion) { scalaVersion => Seq(
"org.scala-lang" % "scala-library" % scalaVersion,
"org.scala-lang" % "scala-reflect" % scalaVersion,
"com.twitter" %% "bijection-macros" % bijectionVersion
) ++ (if(isScala210x(scalaVersion)) Seq("org.scalamacros" %% "quasiquotes" % "2.0.1") else Seq())
},
addCompilerPlugin("org.scalamacros" % "paradise" % "2.0.1" cross CrossVersion.full)
).dependsOn(scaldingDb, scaldingMacros)

}
82 changes: 82 additions & 0 deletions scalding-db-macros/README.md
@@ -0,0 +1,82 @@
## Scalding JDBC Macros

Provides macros for interoperating between Scala case classes and relational database (SQL) column definitions.

For a case class T, the macro-generated `ColumnDefinitionProvider[T]` provides:
1. `ColumnDefinition`s for the corresponding DB table columns
2. `ResultSetExtractor[T]` for extracting records from `java.sql.ResultSet` into objects of type `T`

Also provided are `TupleConverter`, `TupleSetter` and `cascading.tuple.Fields` for use with Cascading.
`DBTypeDescriptor[T]` is the top-level class that contains all of the above.

### Illustration

(in the REPL)

Necessary imports:

    scalding> import com.twitter.scalding.db._
    scalding> import com.twitter.scalding.db.macros._

Case class representing your DB schema:

    scalding> case class ExampleDBRecord(
         |   card_id: Long,
         |   tweet_id: Long,
         |   created_at: Option[java.util.Date],
         |   deleted: Boolean = false)
    defined class ExampleDBRecord

Get the macro-generated converters:

    scalding> val dbTypeInfo = implicitly[DBTypeDescriptor[ExampleDBRecord]]
    dbTypeInfo: com.twitter.scalding.db.DBTypeDescriptor[ExampleDBRecord] = $anon$6@7b07168

    scalding> val columnDefn = dbTypeInfo.columnDefn
    columnDefn: com.twitter.scalding.db.ColumnDefinitionProvider[ExampleDBRecord] = $anon$6$$anon$2@53328a4f

Macro-generated SQL column definitions:

    scalding> columnDefn.columns
    res0: Iterable[com.twitter.scalding.db.ColumnDefinition] =
    List(
      ColumnDefinition(BIGINT,ColumnName(card_id),NotNullable,None,None),
      ColumnDefinition(BIGINT,ColumnName(tweet_id),NotNullable,None,None),
      ColumnDefinition(DATETIME,ColumnName(created_at),Nullable,None,None),
      ColumnDefinition(BOOLEAN,ColumnName(deleted),NotNullable,None,Some(false))
    )

Macro-generated Cascading fields:

    scalding> dbTypeInfo.fields
    res1: cascading.tuple.Fields = 'card_id', 'tweet_id', 'created_at', 'deleted' | long, long, Date, boolean


### Supported Mappings

Scala type | SQL type
------------- | -------------
`Int` | `INTEGER`
`Long` | `BIGINT`
`Short` | `SMALLINT`
`Double` | `DOUBLE`
`@varchar @size(20) String` | `VARCHAR(20)`
`@text String` | `TEXT`
`java.util.Date` | `DATETIME`
`@date java.util.Date` | `DATE`
`Boolean` | `BOOLEAN`
| <sub><sup>(`BOOLEAN` is used if creating a new table at write time, but `BOOL` and `TINYINT` are also supported for reading existing columns)</sup></sub>

* Annotations are used for String types to clearly distinguish between TEXT and VARCHAR column types
* Scala `Option`s can be used to denote columns that are `NULLABLE` in the DB
* `java.lang.*` types are not supported. For example, `Integer` (`java.lang.Integer`) does not work
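
The mapping rules above can be sketched as a small runtime lookup. This is only an illustration: the real macros inspect case class fields at compile time, and every name below (`MappingSketch`, `ScalaType`, `Column`, `sqlColumn`) is hypothetical and not part of scalding-db.

```scala
// Hypothetical, self-contained model of the mapping table.
// None of these names come from scalding-db.
object MappingSketch {
  sealed trait ScalaType
  case object IntT extends ScalaType
  case object LongT extends ScalaType
  case object ShortT extends ScalaType
  case object DoubleT extends ScalaType
  case object BooleanT extends ScalaType
  final case class VarcharT(size: Int) extends ScalaType // @varchar @size(n) String
  case object TextT extends ScalaType                    // @text String
  case object DateTimeT extends ScalaType                // java.util.Date
  case object DateT extends ScalaType                    // @date java.util.Date
  final case class OptionT(inner: ScalaType) extends ScalaType

  final case class Column(name: String, sqlType: String, nullable: Boolean)

  // Option marks the column as nullable; every other type is NOT NULL.
  def sqlColumn(name: String, tpe: ScalaType): Column = tpe match {
    case OptionT(inner) => sqlColumn(name, inner).copy(nullable = true)
    case other          => Column(name, sqlTypeName(other), nullable = false)
  }

  // SQL type names taken from the "Supported Mappings" table.
  private def sqlTypeName(tpe: ScalaType): String = tpe match {
    case IntT           => "INTEGER"
    case LongT          => "BIGINT"
    case ShortT         => "SMALLINT"
    case DoubleT        => "DOUBLE"
    case BooleanT       => "BOOLEAN"
    case VarcharT(size) => s"VARCHAR($size)"
    case TextT          => "TEXT"
    case DateTimeT      => "DATETIME"
    case DateT          => "DATE"
    case OptionT(inner) => sqlTypeName(inner)
  }
}
```

For instance, `MappingSketch.sqlColumn("created_at", MappingSketch.OptionT(MappingSketch.DateTimeT))` describes a nullable `DATETIME` column, which matches the `created_at` entry in the illustration above.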

### Nested case classes

Nested case classes can be used as a workaround for the 22-field limit. They can also be used to group the table columns logically. Nested case classes are flattened in left-to-right order. For example:
```scala
case class Person(id: Long, name: String, location: Location)
case class Location(geo: GeoCode, doorNum: Int, street: String, city: String)
case class GeoCode(lat: Long, lng: Long)
```
is flattened to a table schema with columns `id`, `name`, `lat`, `lng`, `doorNum`, `street`, `city`.
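
The flattening order can be sketched as a depth-first, left-to-right walk over the fields. This is a hand-written model, not the macro implementation; `FlattenSketch`, `Field`, `Leaf`, `Nested` and `flatten` are hypothetical names used only for illustration.

```scala
// Hypothetical, self-contained model of left-to-right flattening.
// None of these names come from scalding-db.
object FlattenSketch {
  // A field is either a plain column or a nested case class with its own fields.
  sealed trait Field
  final case class Leaf(name: String) extends Field
  final case class Nested(name: String, fields: List[Field]) extends Field

  // Depth-first walk in declaration order. The name of the nested field
  // itself (e.g. `location`) does not appear in the resulting columns.
  def flatten(fields: List[Field]): List[String] = fields.flatMap {
    case Leaf(name)       => List(name)
    case Nested(_, inner) => flatten(inner)
  }

  // The Person / Location / GeoCode example from the README.
  val geoCode: Field  = Nested("geo", List(Leaf("lat"), Leaf("lng")))
  val location: Field = Nested("location", List(geoCode, Leaf("doorNum"), Leaf("street"), Leaf("city")))
  val person: List[Field] = List(Leaf("id"), Leaf("name"), location)
}
```

Calling `FlattenSketch.flatten(FlattenSketch.person)` yields the column names in the same order as the schema listed above.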
@@ -0,0 +1,35 @@
package com.twitter.scalding.db.macros

import scala.language.experimental.macros
import com.twitter.scalding.db.macros.impl._
import com.twitter.scalding.db.{ ColumnDefinitionProvider, DBTypeDescriptor }

// This is the sealed base trait for the Scala runtime annotations used by the JDBC macros.
// The macros read these annotations on fields to make up for the extra type
// information that JDBC wants but that is not present in the JVM types.
sealed trait ScaldingDBAnnotation

// This is the size in characters for a char field.
// For integers it's really for display purposes.
@scala.annotation.meta.getter
class size(val size: Int) extends annotation.StaticAnnotation with ScaldingDBAnnotation

// JDBC TEXT type, this forces the String field in question to be a text type
@scala.annotation.meta.getter
class text() extends annotation.StaticAnnotation with ScaldingDBAnnotation

// JDBC VARCHAR type, this forces the String field in question to be a varchar type
@scala.annotation.meta.getter
class varchar() extends annotation.StaticAnnotation with ScaldingDBAnnotation

// JDBC DATE type, this toggles a java.util.Date field to be a JDBC DATE.
// Without it, the field defaults to DATETIME to preserve the full resolution of java.util.Date
@scala.annotation.meta.getter
class date() extends annotation.StaticAnnotation with ScaldingDBAnnotation

// This is the entry point for calling the JDBC macros explicitly.
// Most often, however, the implicits in the package will be used instead.
object DBMacro {
  def toColumnDefinitionProvider[T]: ColumnDefinitionProvider[T] = macro ColumnDefinitionProviderImpl[T]
  def toDBTypeDescriptor[T]: DBTypeDescriptor[T] = macro DBTypeDescriptorImpl[T]
}