Stars
A generative world for general-purpose robotics & embodied AI learning.
A React-based starter app for using the Multimodal Live API over WebSockets with Gemini
[AAAI2025] Predicting the Original Appearance of Damaged Historical Documents
The official implementation of the paper "BrushEdit: All-In-One Image Inpainting and Editing"
English pronunciation correction teacher built with Gemini
Examples and guides for using the Gemini API
Luna: AI face swap / AI portraits / AI ID photos / AI executive headshots / AI photo studio / similar to Miaoya Camera (妙鸭相机)
🔥 Open Source Browser API for AI Agents & Apps. Steel Browser is a batteries-included browser instance that lets you automate the web without worrying about infrastructure.
Conversational RPA SDK for Chatbot Makers. Join our Discord: https://discord.gg/7q8NBZbQzt
Query and Summarize your chat messages.
AI agent for building React Native apps
Convert any PDF into a podcast episode!
An Open Source implementation of Notebook LM with more flexibility and features
Repository for ShowUI: One Vision-Language-Action Model for GUI Visual Agent
MinIO is a high-performance, S3 compatible object store, open sourced under GNU AGPLv3 license.
Simple, unified interface to multiple Generative AI providers
Out-of-the-box (OOTB) GUI Agent for Windows and macOS
TEN Agent is a conversational AI powered by the TEN, integrating Gemini 2.0 Live, OpenAI Realtime, RTC, and more. It delivers real-time capabilities to see, hear, and speak, while being fully compa…
LLM-powered multiagent persona simulation for imagination enhancement and business insights.
Let your Claude be able to think
Autonomous coding agent right in your IDE, capable of creating/editing files, executing commands, using the browser, and more with your permission every step of the way.