Stars
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
The open-source visual AI programming environment and TypeScript library
🦛 CHONK your texts with Chonkie ✨ - The no-nonsense RAG chunking library
A user-friendly, multi-platform GUI for managing and running CrewAI agents and tasks. Supports Conda and virtual environments, no coding needed.
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…
A Kubernetes deployable instance of GroundX for document parsing, storage, and search.
A set of interpersonal relationship labels for the CALLHOME English corpus
A collection of notebooks/recipes showcasing use cases of open-source models with Together AI.
The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.
A curated list of awesome data labeling tools
Repo for "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture"
A modular graph-based Retrieval-Augmented Generation (RAG) system
An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
BookNLP, a natural language processing pipeline for books
Introducing Alplex, an AI-powered virtual law office designed to assist you with legal issues based on Swiss laws
Python NLTK module for interfacing with Apache OpenNLP
⚡FlashRAG: A Python Toolkit for Efficient RAG Research (WWW2025 Resource)
The most no-nonsense, locally or API-hosted AI code completion plugin for Visual Studio Code - like GitHub Copilot but 100% free.
SGLang is a fast serving framework for large language models and vision language models.
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Benchmarking long-form factuality in large language models. Original code for our paper "Long-form factuality in large language models".
LLM training code for Databricks foundation models
Retrieval Augmented Generation Generalized Evaluation Dataset
A minimal, LaTeX-style Hugo theme for personal blogging