Stars
A portable accelerated data query and LLM-inference engine, written in Rust, for data-grounded AI apps and agents.
Fluid, elastic data abstraction and acceleration for BigData/AI applications in the cloud. (Project under CNCF)
An open-source ML pipeline development platform
Epsilla is a high-performance Vector Database Management System
🧙 Build, run, and manage data pipelines for integrating and transforming data.
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
C++/Wolfram Language package for exploring set and graph rewriting systems
The open source Firebase alternative. Supabase gives you a dedicated Postgres database to build your web, mobile, and AI applications.
LlamaIndex is the leading framework for building LLM-powered agents over your data.
《Machine Learning Systems: Design and Implementation》- Chinese Version
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Developer-friendly, serverless vector database for AI applications. Easily add long-term memory to your LLM apps!
Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch…
[ICCV 2023] VAD: Vectorized Scene Representation for Efficient Autonomous Driving
OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team co…
A curated, but incomplete, list of data-centric AI resources.
High-Performance Serverless event and data processing platform
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activelo…
DuckDB is an analytical in-process SQL database management system
Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, Du…
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
Evidently is an open-source ML and LLM observability framework. Evaluate, test, and monitor any AI-powered system or data pipeline. From tabular data to Gen AI. 100+ metrics.
Federated learning platform for edge computing, based on KubeEdge
WebRTC for the Curious: Go beyond the APIs
🪄 Master Modern C++(11/14/17/20) Templates: TMP, SFINAE, Concepts, CRTP, Variadic Magic, and Compile-Time Sorcery
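Flink learning blog. http://www.54tianzhisheng.cn/ Covers Flink basics, concepts, principles, hands-on practice, performance tuning, and source code analysis. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, Table API & SQL, and more, plus case studies of large-scale production Flink projects (PVUV, log storage, tens of billions of records in real time…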
Curated List of Self-Driving Cars and Autonomous Vehicles Resources
An awesome list of self-driving cars