jiegzhan
  • Disney Streaming
  • San Francisco Bay Area, CA

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 37,321 4,785 Updated Jan 5, 2025

Apache Iceberg

Java 6,736 2,296 Updated Jan 6, 2025

This repository contains the best profile READMEs for your reference.

HTML 4,495 7,171 Updated Aug 24, 2024

Enjoy the journey.

1 Updated Oct 8, 2024

Apache Flink Kubernetes Operator

Java 824 423 Updated Dec 20, 2024

Apache Pinot - A realtime distributed OLAP datastore

Java 5,582 1,311 Updated Jan 6, 2025

Simple caching in Scala

Scala 770 120 Updated Aug 12, 2024

Stream Processing with Apache Flink - Scala Examples

Scala 402 205 Updated Nov 20, 2023

Python Stream Processing

Python 6,755 535 Updated Jul 27, 2024

Mirror of Apache Bahir Flink

Java 789 428 Updated Oct 30, 2023

Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data.

Python 4,466 960 Updated Dec 19, 2024

vim-mode improved

CoffeeScript 1,399 110 Updated Oct 7, 2021

A dark Vim/Neovim color scheme inspired by Atom's One Dark syntax theme.

Vim Script 3,918 527 Updated Jul 16, 2024

An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs

Scala 7,726 1,743 Updated Jan 4, 2025

Deep Learning for humans

Python 62,324 19,478 Updated Jan 5, 2025

Build an Elasticsearch index with Python APIs on AWS EC2. Search the Elasticsearch index with appropriate queries.

Python 4 Updated Feb 7, 2017

Cloud-based SQL engine using SPARK where data is accessible as JDBC/ODBC data source via Spark ThriftServer.

Java 31 14 Updated Jul 12, 2017

Already merged into apache/incubator-kyuubi. ACL Management for Apache Spark SQL with Apache Ranger.

Scala 54 56 Updated Nov 11, 2021

Upserts, Deletes And Incremental Processing on Big Data.

Java 5,546 2,443 Updated Jan 6, 2025

Mirror of Apache Kudu

C++ 1,855 652 Updated Dec 31, 2024

ClickHouse® is a real-time analytics DBMS

C++ 38,315 7,000 Updated Jan 6, 2025

Distributed Big Data Orchestration Service

Java 1,725 367 Updated Dec 10, 2024

🐠 Beats - Lightweight shippers for Elasticsearch & Logstash

Go 12,240 4,935 Updated Jan 6, 2025

Logstash - transport and process your logs, events, or other data

Java 14,309 3,510 Updated Jan 3, 2025

Fluentd: Unified Logging Layer (project under CNCF)

Ruby 12,991 1,353 Updated Jan 6, 2025

Apache Superset is a Data Visualization and Data Exploration Platform

TypeScript 63,668 14,171 Updated Jan 5, 2025

Apache Druid: a high performance real-time analytics database.

Java 13,579 3,713 Updated Jan 6, 2025

Alluxio, data orchestration for analytics and machine learning in the cloud

Java 6,895 2,940 Updated Nov 27, 2024

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

Java 10,657 3,056 Updated Jan 6, 2025

Build machine learning and deep learning models on Kaggle.

Jupyter Notebook 3 Updated Mar 9, 2018