An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs

Scala 7,726 1,743 Updated Jan 4, 2025

keras-team / keras

Deep Learning for humans

Python 62,324 19,478 Updated Jan 5, 2025

jiegzhan / build-elasticsearch-index

Build an Elasticsearch index with Python APIs on AWS EC2. Search the Elasticsearch index with appropriate queries.

Python 4 Updated Feb 7, 2017

spoddutur / cloud-based-sql-engine-using-spark

Cloud-based SQL engine using SPARK where data is accessible as JDBC/ODBC data source via Spark ThriftServer.

Java 31 14 Updated Jul 12, 2017

yaooqinn / spark-ranger

Already merged into apache/incubator-kyuubi. ACL Management for Apache Spark SQL with Apache Ranger.

Scala 54 56 Updated Nov 11, 2021

apache / hudi

Upserts, Deletes And Incremental Processing on Big Data.

Java 5,546 2,443 Updated Jan 6, 2025

apache / kudu

Mirror of Apache Kudu

C++ 1,855 652 Updated Dec 31, 2024

ClickHouse / ClickHouse

ClickHouse® is a real-time analytics DBMS

C++ 38,315 7,000 Updated Jan 6, 2025

Netflix / genie

Distributed Big Data Orchestration Service

Java 1,725 367 Updated Dec 10, 2024

elastic / beats

🐠 Beats - Lightweight shippers for Elasticsearch & Logstash

Go 12,240 4,935 Updated Jan 6, 2025

elastic / logstash

Logstash - transport and process your logs, events, or other data

Java 14,309 3,510 Updated Jan 3, 2025

fluent / fluentd

Fluentd: Unified Logging Layer (project under CNCF)

Ruby 12,991 1,353 Updated Jan 6, 2025

apache / superset

Apache Superset is a Data Visualization and Data Exploration Platform

TypeScript 63,668 14,171 Updated Jan 5, 2025

apache / druid

Apache Druid: a high performance real-time analytics database.

Java 13,579 3,713 Updated Jan 6, 2025

Alluxio / alluxio

Alluxio, data orchestration for analytics and machine learning in the cloud

Java 6,895 2,940 Updated Nov 27, 2024

trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

Java 10,657 3,056 Updated Jan 6, 2025

jiegzhan / kaggle

Build machine learning and deep learning models on Kaggle.

Jupyter Notebook 3 Updated Mar 9, 2018

Zhang Jie 张杰 jiegzhan

Stars

rasbt / LLMs-from-scratch

apache / iceberg

durgeshsamariya / awesome-github-profile-readme-templates

jiegzhan / jiegzhan

apache / flink-kubernetes-operator

apache / pinot

cb372 / scalacache

streaming-with-flink / examples-scala

robinhood / faust

apache / bahir-flink

amundsen-io / amundsen

t9md / atom-vim-mode-plus

joshdick / onedark.vim

delta-io / delta