Starred repositories
Free and Open Source, Distributed, RESTful Search Engine
The Java implementation of Apache Dubbo. An RPC and microservice framework.
ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.
The official home of the Presto distributed SQL query engine for big data
Apache Pulsar - distributed pub-sub messaging system
Apache Druid: a high performance real-time analytics database.
MIT Deep Learning Book in PDF format (complete and parts) by Ian Goodfellow, Yoshua Bengio and Aaron Courville
Apache Doris is an easy-to-use, high performance and unified analytics database.
OpenRefine is a free, open source power tool for working with messy data and improving it
Change data capture for a variety of databases. Please log issues at https://issues.redhat.com/browse/DBZ.
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
🔎 Open source distributed and RESTful search engine.
The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance …
Apache Beam is a unified programming model for Batch and Streaming data processing.
Alluxio, data orchestration for analytics and machine learning in the cloud
A Flexible and Powerful Parameter Server for large-scale machine learning
A cross-language remote procedure call (RPC) framework for rapid development of high-performance distributed services.
Apache Pinot - A realtime distributed OLAP datastore
Upserts, deletes, and incremental processing on big data.