- Wuhan
Stars
Apache Spark - A unified analytics engine for large-scale data processing
A platform to build and run apps that are elastic, agile, and resilient. SDK, libraries, and hosted environments.
CMAK is a tool for managing Apache Kafka clusters
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
Lightweight, modular, and extensible library for functional programming.
Breeze is/was a numerical processing library for Scala.
REST job server for Apache Spark
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
GeoMesa is a suite of tools for working with big geo-spatial data in a distributed fashion.
GeoTrellis is a geographic data processing engine for high performance applications.
Livy is an open source REST interface for interacting with Apache Spark from anywhere
Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere.
Essential Spark extensions and helper methods ✨😲
A Spark plugin for reading and writing Excel files
Akka Http directives implementing the CORS specifications defined by W3C
Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
Semi-automatic incremental construction and debugging of regular expressions for grok to parse logfiles for logstash (http://logstash.net/). Deployed at http://grokconstructor.appspot.com/.
The Raster Foundry web application.
Template projects for GeoSpark, GeoSpark-SQL, GeoSpark-Viz
Support for operating on images via Apache Spark
GeoTrellis PointCloud library to work with any pointcloud data on Spark
Spark DataFrames for earth observation data
CGCL-codes / MURS
Forked from zx247549135/spark
MURS is a memory scheduler for in-memory computing, which tries to mitigate the memory pressure for multiple data processing tasks sharing the executor.