Starred repositories
Convert ebooks to audiobooks with chapters and metadata using dynamic AI models and voice cloning. Supports 1,107+ languages!
Finetune Llama 3.3, Mistral, Phi-4, Qwen 2.5 & Gemma LLMs 2-5x faster with 70% less memory
Datasets and Evaluation Scripts for CompHRDoc
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
Fast and memory-efficient exact attention
ReFT: Representation Finetuning for Language Models
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
This is the official repository of the revised datasets FUNSD-r and CORD-r, introduced in EMNLP 2023 paper Reading Order Matters: Information Extraction from Visually-rich Documents by Token Path P…
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 2, and other large language models.
The most comprehensive Stable Diffusion tutorial series on the web, from beginner to advanced, three months in the making
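"Hello Algo": a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart. The Simplified and Traditional Chinese editions are updated in sync; the English version is in progress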
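Python, Java, and C++ solution code for 《剑指 Offer》 (Coding Interviews), and the companion code repository for the LeetBook 《图解算法数据结构》 (Illustrated Algorithms and Data Structures)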
[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning
A batched, offline-inference-oriented version of segment-anything
Chinese LLaMA-2 & Alpaca-2 LLMs (phase two of the project) with 64K long-context models
A high-throughput and memory-efficient inference and serving engine for LLMs
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production