Stars
A modern starter kit for 3D browser games powered by r3f and threejs
Create Epic Math and Physics Animations From Text.
AI wearables. Put it on, speak, transcribe, automatically
Open source Loom alternative. Beautiful, shareable screen recordings.
Command and Conquer: Generals - Zero Hour
OCR & Document Extraction using vision models
🔥 Open Source Browser API for AI Agents & Apps. Steel Browser is a batteries-included browser instance that lets you automate the web without worrying about infrastructure.
Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with—or even surpassing—top TTS …
Turn any webpage into structured data using LLMs
A simple screen parsing tool towards a pure vision-based GUI agent
An AI cursor for desktop using Gemini 2.0 Flash (Experimental)
🪄 Create rich visualizations with AI
Using the moondream VLM with optical flow for promptable object tracking
Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL
🕷️ An undetectable, powerful, flexible, high-performance Python library that makes Web Scraping easy again!
Keep searching, reading webpages, and reasoning until it finds the answer (or exceeds the token budget)
Browser-controlling AI agent that applies to relevant jobs on the internet autonomously. Join chat @ https://discord.gg/umgnyQU2K8
Virtual whiteboard for sketching hand-drawn like diagrams
Create architecture diagrams from code automatically using large language models (LLMs).
A react-based starter app for using the Multimodal Live API over websockets with Gemini