Stars
Gen-AI Chat for Teams - Think ChatGPT if it had access to your team's unique knowledge.
Sky-T1: Train your own O1 preview model within $450
HunyuanVideo: A Systematic Framework For Large Video Generation Model
PyTorch3D is FAIR's library of reusable components for deep learning with 3D data
The Construction Site Snag Detector is a powerful tool built using Python, Gradio, and the Groq platform. It leverages AI to automatically detect defects, unfinished work, and quality issues in con…
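This tool is a Python script that automatically generates Google Slides from data in a Google Sheets spreadsheet. It creates a slide from each row of the spreadsheet and places the title, subtitle, and body text in the appropriate format.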
Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation
Diffusion model derived evolutionary algorithm
Python library that provides a unified interface for interacting with multiple Large Language Models (LLMs) from different providers.
A simple example implementation of the VoiceRAG pattern to power interactive voice generative AI experiences using RAG with Azure AI Search and Azure OpenAI's gpt-4o-realtime-preview model.
React app for inspecting, building and debugging with the Realtime API
Official implementation of EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars
A collection of projects designed to help developers quickly get started with building deployable applications using the Anthropic API
Large-scale inference for RWKV v6 and v7. Capable of inference by combining multiple states (Pseudo MoE). Easy to deploy on Docker. Supports true multi-batch generation and dynamic state switching. CUDA …
Speech To Speech: an effort for an open-sourced and modular GPT4-o
CosineAI / experiments
Forked from swe-bench/experiments
Open-sourced predictions, execution logs, trajectories, and results from model inference + evaluation runs on the SWE-bench task.
[ICLR 2025] LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone