Stars
Fluss is a streaming storage built for real-time analytics.
World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
Know your data better! Datavines is a next-gen data observability platform that supports metadata management and data quality.
Run any open-source LLM, such as Llama or Mistral, as an OpenAI-compatible API endpoint in the cloud.
CKAN is an open-source DMS (data management system) for powering data hubs and data portals. CKAN makes it easy to publish, share and use data. It powers catalog.data.gov, open.canada.ca/data, data…
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team co…
Collect, aggregate, and visualize a data ecosystem's metadata
Apache Linkis builds a computation middleware layer to facilitate connection, governance and orchestration between the upper applications and the underlying data engines.
Cloud Native DataOps & AIOps Platform
🦜🔗 Build context-aware reasoning applications
Drag & drop UI to build your customized LLM flow
An orchestration platform for the development, production, and observation of data assets.
ModelScope-Agent: An agent framework connecting models in ModelScope with the world
Chat with your database (SQL, CSV, pandas, polars, mongodb, noSQL, etc). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.
BobbySun / autolabel
Forked from refuel-ai/autolabel
Label, clean and enrich text datasets with LLMs.
ChatGLM-6B: An Open Bilingual Dialogue Language Model
Focused on big data learning and interview preparation; start your journey to big data mastery. Flink/Spark/Hadoop/Hbase/Hive...
We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs and parameter-efficient methods (e.g., lora, p-tuning) together for easy use. We welcome open-source enthusiasts…
🚀 RocketQA, dense retrieval for information retrieval and question answering, including both Chinese and English state-of-the-art models.
JuiceFS is a distributed POSIX file system built on top of Redis and S3.
AI Flow is an open source framework that bridges big data and artificial intelligence.
An Industrial Grade Federated Learning Framework
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
🔥🔥🔥 AI-driven database tool and SQL client, the hottest GUI client, supporting MySQL, Oracle, PostgreSQL, DB2, SQL Server, SQLite, H2, ClickHouse, and more.
A Flexible, Fast, Federated (3F) SQL Analysis Middleware for Multiple Data Sources
A unified framework for privacy-preserving data analysis and machine learning