Stars
A modular graph-based Retrieval-Augmented Generation (RAG) system
A fast, lightweight and easy-to-use Python library for splitting text into semantically meaningful chunks.
Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way.
The Fastest State-of-the-Art Static Embeddings in the World
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Uses AI to locally transcribe speech from media files, generate subtitle files, translate the generated subtitles, insert them into the mp4 container, and burn them directly into the video
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
A GUI tool for offline transcription of speech recordings, including speaker diarization, utilizing state-of-the-art machine learning models.
A Python package for benchmarking interpretability techniques on Transformers.
Converts standard Korean datasets into a format usable for instruction tuning.
AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation
RAG (Retrieval-Augmented Generation) Chatbot Examples Using PyMuPDF
Finetune Llama 3.3, Mistral, Phi-4, Qwen 2.5 & Gemma LLMs 2-5x faster with 70% less memory
Enhanced ChatGPT Clone: Features Agents, Anthropic, AWS, OpenAI, Assistants API, Azure, Groq, o1, GPT-4o, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message search, Code…
CLIcK: A Benchmark Dataset of Cultural and Linguistic Intelligence in Korean
A framework for benchmarking a model's instruction-following ability
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
⚡ Dynamically generated stats for your GitHub READMEs
🐋 An unofficial implementation of Self-Alignment with Instruction Backtranslation.
Large-scale LLM inference engine
The Universe of Evaluation. All about the evaluation for LLMs.
Finetune mistral-7b-instruct for sentence embeddings
Infinity is a high-throughput, low-latency serving engine for text embeddings, reranking models, CLIP, CLAP and ColPali
Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in PyTorch