Stars
Models for grocery shopping behavior (Wan et al., CIKM'18; Wan et al., WWW'17)
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
A curated list of awesome open source workflow engines.
Apache Druid: a high-performance real-time analytics database.