Stars
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
Identity-Preserving Text-to-Video Generation by Frequency Decomposition
The official repository for paper "Tora: Trajectory-oriented Diffusion Transformer for Video Generation"
Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows
The first open autoregressive foundational video AI model.
"LightRAG: Simple and Fast Retrieval-Augmented Generation"
Unified KV Cache Compression Methods for Auto-Regressive Models
This is the official reproduction of FancyVideo.
Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.
⭐ Dynamically generate stats SVG from your GitHub, LeetCode, Steam, and more in #Cyberpunk style :)
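Digital Base (数字底座) is a unified, secure management support platform for the digital transformation of large government bodies and enterprises, built on identity authentication, organizational structure, positions and duties, application systems, resource roles, data catalogs, and security controls. It is based on a three-administrator management model, supports microservices, multi-tenancy, containerization, and domestic Chinese technology stacks, lets users quickly build their own business applications with a code generator, and can link to many mature, easy-to-use applications from its internal ecosystem.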
Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation
A Language and Multimodal Agents Framework for Smart Device and More
Dynamic Protein Data Bank
EOS is a dual-core operating system designed specifically for embodied intelligence, suitable for robots, drones, satellites or other scenarios requiring real-time and general capabilities.
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
Nexa SDK is a comprehensive toolkit for supporting GGML and ONNX models. It supports text generation, image generation, vision-language models (VLM), Audio Language Model, auto-speech-recognition (…
Accelerate your Stable Diffusion inference with the library's universal, cross-platform C/C++ framework design, powered by ONNXRuntime.
Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.
Cocos simplifies game creation and distribution with Cocos Creator, a free, open-source, cross-platform game engine. Empowering millions of developers to create high-performance, engaging 2D/3D gam…
Next-Generation Interactive Intelligent Programming Assistant
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
[NeurIPS 2024 Datasets and Benchmarks Track] Closed-Loop E2E-AD Benchmark Enhanced by World Model RL Expert
C-SVM: A Compiler for a C-Like Language Based on a Stack Virtual Machine. This project has moved from https://sourceforge.net/projects/msct/. Aims to help individuals learn about…
PyTorch Implementation of AudioLCM (ACM-MM'24): efficient and high-quality text-to-audio generation with a latent consistency model.