Stars
Let your Claude be able to think
The AI-native database built for LLM applications, providing incredibly fast hybrid search of dense vectors, sparse vectors, tensors (multi-vector), and full text
Implementation of "DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation"
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-sim…
Background removal with Transformers.js
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
Codebase for Automated Creation of Digital Cousins for Robust Policy Learning
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.
21 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
Real-time faster-whisper with Gradio
Multimodal LLM Application with PyMuPDF4LLM
XR-Objects is an open-source prototype that anchors contextual interactions onto analog objects to not only convey information but also to initiate digital actions, such as querying LLMs for detail…
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬
Desktop app for prototyping and debugging LangGraph applications locally.
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Anthropic's educational courses
DSPy: The framework for programming—not prompting—language models
The Cradle framework is a first attempt at General Computer Control (GCC). Cradle helps agents ace any computer task by enabling strong reasoning abilities, self-improvement, and skill curatio…
Official implementation of MS-Diffusion: Multi-subject Zero-shot Image Personalization with Layout Guidance
LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control
Digital Avatar Conversational System - Linly-Talker. 😄✨ Linly-Talker is an intelligent AI system that combines large language models (LLMs) with visual models to create a novel human-AI interaction…
[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
One minute of voice data can also be used to train a good TTS model! (few-shot voice cloning)