# Spark template using Scala

A Spark application template for running on the cloud.

## Quick Tips

- Import the Spark libraries in `build.sbt` (a fuller `build.sbt` sketch follows this list):

  ```scala
  libraryDependencies ++= Seq(
    "org.apache.spark" % "spark-core_2.11" % "2.3.0",
    "org.apache.spark" % "spark-sql_2.11" % "2.3.0"
  )
  ```

- According to its documentation, Spark 2.3.0 only works with Scala versions below 2.12 and Java versions below 9.
- Compile and package the Scala program with sbt: `sbt compile`, then `sbt package`.
- Upload datasets to Cloud Storage (example).
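
For reference, a minimal `build.sbt` consistent with the tips above might look like the sketch below. The project name, version, and Scala patch release (2.11.12) are assumptions for illustration, not values taken from this template.

```scala
// Minimal build.sbt sketch; project name, version, and Scala patch release are assumptions
name := "spark-template"

version := "0.1.0"

// Spark 2.3.0 artifacts are published for Scala 2.11, so pin a 2.11.x release
scalaVersion := "2.11.12"

libraryDependencies ++= Seq(
  "org.apache.spark" % "spark-core_2.11" % "2.3.0",
  "org.apache.spark" % "spark-sql_2.11" % "2.3.0"
)
```

With this configuration, `sbt package` produces a jar under `target/scala-2.11/` that can be submitted to the cluster.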

## Example Use Case

1. Run the Spark application on Google Cloud Dataproc. A tutorial can be found here.
2. Save the output as Parquet files to Google Cloud Storage (a minimal job sketch covering steps 1 and 2 follows this list).
3. Import the data into Google BigQuery and process it further.
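
A minimal Scala sketch of steps 1 and 2 is shown below. The object name `SparkTemplateJob`, the bucket name, the paths, and the CSV input format are assumptions for illustration and are not part of this template.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical job illustrating the use case: read a dataset from
// Cloud Storage and write the result back as Parquet files.
// Bucket and path names are placeholders.
object SparkTemplateJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("spark-template")
      .getOrCreate()

    // Dataproc clusters can read gs:// paths directly through the GCS connector
    val input = spark.read
      .option("header", "true")
      .csv("gs://your-bucket/datasets/input.csv")

    // Transformations would go here; this sketch simply writes the data back out
    input.write
      .mode("overwrite")
      .parquet("gs://your-bucket/output/parquet/")

    spark.stop()
  }
}
```

The packaged jar can be submitted with `gcloud dataproc jobs submit spark` (see the linked tutorial), and the resulting Parquet files can then be loaded into BigQuery as in step 3.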
