Stars
🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/mEkkMXFG
A generative speech model for daily dialogue.
Convert PDF to markdown + JSON quickly with high accuracy
OCR, layout analysis, reading order, table recognition in 90+ languages
💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
justfont collaborates with calligrapher Daphne to release Elffont (精靈文), a unique typeface blending Bopomofo phonetic symbols with a mystical "Elvish" style.
This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. RAG systems combine information retrieval with generative models to provide accurate and cont…
Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
🐙 Guides, papers, lectures, notebooks and resources for prompt engineering
BioNeMo NIMs example notebooks: for optimized inference at scale
A deep learning model for small molecule drug discovery and cheminformatics based on SMILES
Codebase for reproducing the experiments of the semantic uncertainty paper (paragraph-length experiments).
🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using RAG 🔄.
Easily train a good VC model with voice data <= 10 mins!
Perplexica is an AI-powered search engine. It is an open-source alternative to Perplexity AI.
Bark Voice Cloning and Voice Cloning for Chinese Speech
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Finetune Llama 3.3, DeepSeek-R1 & Reasoning LLMs 2x faster with 70% less memory! 🦥
Just 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)
Stanford NLP Python library for Representation Finetuning (ReFT)
Enforce the output format (JSON Schema, Regex, etc.) of a language model
Chat-Haruhi-Suzumiya (Chat凉宫春日): an open-source role-playing chatbot by Cheng Li, Ziang Leng, and others.
A curated list of 🌌 Azure OpenAI, 🦙 Large Language Models (incl. RAG, Agent), and references with memos.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Official repository of NEFTune: Noisy Embeddings Improve Instruction Finetuning
TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs.io/en/master/