csunny

Starred repositories

No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents

Python 3,059 191 Updated Dec 23, 2024

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.

Java 2,534 990 Updated Dec 23, 2024

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

Python 10,117 1,646 Updated Dec 23, 2024

DSPy: The framework for programming—not prompting—language models

Python 20,419 1,544 Updated Dec 22, 2024

MinIO is a high-performance, S3-compatible object store, open sourced under the GNU AGPLv3 license.

Go 49,023 5,579 Updated Dec 21, 2024

Apache DataFusion SQL Query Engine

Rust 6,499 1,244 Updated Dec 23, 2024

Awesome-RAG: Collect typical RAG papers and systems.

124 9 Updated Dec 11, 2024

SQL parser and planner used by MindsDB

Python 58 22 Updated Nov 14, 2024

vsag is a vector indexing library used for similarity search.

C++ 189 19 Updated Dec 23, 2024

This is a repo with links to everything you'd ever want to learn about data engineering

Jupyter Notebook 23,670 4,103 Updated Dec 23, 2024

Let your Claude be able to think

TypeScript 10,279 1,172 Updated Dec 3, 2024

ETL, Analytics, Versioning for Unstructured Data

Python 2,127 94 Updated Dec 24, 2024

Lyric: A Rust-powered secure runtime for AI-Agent.

Rust 16 1 Updated Dec 20, 2024
Python 8 1 Updated Dec 22, 2024

Secure open source cloud runtime for AI apps & AI agents

HTML 7,157 473 Updated Dec 23, 2024

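Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing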

Python 34,165 10,258 Updated Dec 18, 2024

A Notebook Web Client with Flexible Customization and Easy Integration.

TypeScript 364 13 Updated Dec 19, 2024

World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.

Java 1,160 373 Updated Dec 23, 2024

☘️ A visualization grammar based on G2 for streamlit.

Python 21 1 Updated Jan 10, 2024

SuperSonic is the next-generation AI+BI platform that unifies Chat BI (powered by LLM) and Headless BI (powered by semantic layer) paradigms.

Java 2,529 436 Updated Dec 23, 2024

agentUniverse is an LLM multi-agent framework that allows developers to easily build multi-agent applications.

Python 988 122 Updated Dec 20, 2024

Making data higher-quality, juicier, and more digestible for foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

Python 3,181 189 Updated Dec 23, 2024

Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!

C++ 9,795 594 Updated Dec 23, 2024

Apache Iceberg

Java 6,683 2,290 Updated Dec 23, 2024

LLM101n: Let's build a Storyteller

30,613 1,673 Updated Aug 1, 2024

Cloud Native DataOps & AIOps Platform

Java 1,826 408 Updated Apr 11, 2024

The Memory layer for your AI apps

Python 23,488 2,169 Updated Dec 23, 2024

A library that provides an embeddable, persistent key-value store for fast storage optimized for AWS

C++ 765 121 Updated Oct 16, 2024

An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs

Scala 7,698 1,735 Updated Dec 21, 2024