-
pyspark-example-project Public
Forked from AlexIoannides/pyspark-example-project
Example project implementing best practices for PySpark ETL jobs and applications.
Python Updated Feb 19, 2019
-
hudi Public
Forked from apache/hudi
Spark Library for Hadoop Upserts And Incrementals
Java Apache License 2.0 Updated Dec 13, 2018
-
benchmarking Public
System benchmarks over JVM with JMH - SIMD (superscalar processing), Branch prediction, False sharing.
-
canvas-fingerprinting Public
POC of Canvas fingerprinting
-
merkle-tree Public
Merkle tree in functional style
-
pysparkling Public
Forked from svenkreiss/pysparkling
A pure Python implementation of Apache Spark's RDD and DStream interfaces.
Python MIT License Updated Jun 19, 2018
-
spark-graphx Public
Spark GraphX - Pregel, PageRank and Dijkstra on a social graph
-
kafka-scala-api Public
Samples for using Kafka with Spark Streaming, Akka Actors, and Akka Streams
-
spark-algebird Public
Spark with probabilistic algorithms - Bloom filter, HLL, QTree and Count-min sketch
-
fast-data-dev Public
Forked from lensesio/fast-data-dev
Kafka Docker for development. Kafka, Zookeeper, Schema Registry, Kafka-Connect, Landoop Tools, 20+ connectors
Shell Apache License 2.0 Updated May 12, 2017
-
spark-deployer Public
Forked from KKBOX/spark-deployer
Deploy Spark cluster in an easy way.
Scala Apache License 2.0 Updated Sep 13, 2016
-
cassandra-talks-scala Public
Forked from timcharper/cassandra-talks-scala
Simple integration library which enriches a Cassandra session so you can easily return Scala Futures and Akka Streams
Scala Updated Mar 20, 2016