Updating title of Spark for Preprocessing to mention data
ryguyrg committed May 6, 2015
1 parent 264a43c commit fac9326
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion integration/apache-spark/apache-spark.adoc
@@ -53,7 +53,7 @@ The infrastructure is set up using Docker containers, there are dedicated contai
* [Presentation: Combining Neo4j and Apache Spark using Docker]

[[preprocessing]]
-== Spark Preprocessing
+== Spark for Data Preprocessing

One example of pre-processing raw data (Chicago Crime dataset) into a format that's well suited for import into Neo4j, was demonstrated by http://twitter.com/markhneedham[Mark Needham].
He combined a number of functions into a Spark-job that takes the existing data, cleans and aggregates it and outputs fragments which are recombined later to larger files.