
Starred repositories
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Clone a voice in 5 seconds to generate arbitrary speech in real-time
A programming framework for agentic AI 🤖 (PyPI: autogen-agentchat; Discord: https://aka.ms/autogen-discord; Office Hour: https://aka.ms/autogen-officehour)
TensorFlow code and pre-trained models for BERT
A toolkit for developing and comparing reinforcement learning algorithms.
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
SoftVC VITS Singing Voice Conversion
Write scalable load tests in plain Python 🚗💨
Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.
Zulip server and web application. Open-source team chat that helps teams stay productive and focused.
DSPy: The framework for programming—not prompting—language models
Magenta: Music and Art Generation with Machine Intelligence
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, etc. It also comes with Hadoop support built in.
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
Pyodide is a Python distribution for the browser and Node.js based on WebAssembly
🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP
An open-source NLP research library, built on PyTorch.
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
Run any open-source LLM, such as DeepSeek or Llama, as an OpenAI-compatible API endpoint in the cloud.