Stars
Use the reMarkable2 as an interface to vision-LLMs (ChatGPT, Claude, Gemini). Ghost in the machine!
An open source deep research clone. An AI agent that reasons over large amounts of web data extracted with Firecrawl.
EXPERIMENTAL: A library for language models to respond with GUI.
Qwen2.5-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
Make websites accessible for AI agents
Scira (Formerly MiniPerplx) is a minimalistic AI-powered search engine that helps you find information on the internet. Powered by Vercel AI SDK! Search with models like Grok 2.0.
Minimalistic 4D-parallelism distributed training framework for educational purposes
Automagically reverse-engineer REST APIs by capturing traffic
A paper list of recent work on token compression for ViT and VLM
An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT)
A toolkit for describing model features and intervening on those features to steer behavior.
OpenAI's Realtime API minus the enterprise bloat
A system for agentic LLM-powered data processing and ETL
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
Best practices & guides on how to write distributed PyTorch training code
Accelerating the development of large multimodal models (LMMs) with one-click evaluation module - lmms-eval.
Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard and designing lighteval!
Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.