Stars
CSP-J/S/X, NOIP, NOI, IOI: a collection of past Olympiad in Informatics exam problems | QQ discussion group 529507453
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
Convert any PDF into a podcast episode!
A generative speech model for daily dialogue.
Prompt Visualization | Art Gallery
A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF into Markdown and JSON formats.
A simple screen-parsing tool for a purely vision-based GUI agent
Kolors ComfyUI Native Sampler Implementation
Diffusers wrapper to run Kwai-Kolors model
Real-time face swap and one-click video deepfake with only a single image
State-of-the-art 2D and 3D Face Analysis Project
A GPT-4 AI Tutor Prompt for customizable personalized learning experiences.
[AAAI 2025] Follow-Your-Click: This repo is the official implementation of "Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts"
Large Action Model framework to develop AI Web Agents
Unofficial PyTorch implementation of Self-critical Sequence Training for Image Captioning, and others.
Show, Attend, and Tell | a PyTorch Tutorial to Image Captioning
A curated list of image captioning and related area resources. :-)
We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs, and parameter-efficient methods (e.g., LoRA, P-tuning) for easy use. We welcome open-source enthusiasts…
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
Web Scraping with GPT-4 Vision API and Puppeteer
AI-Driven Children’s Storytelling Web App using Next.js, OpenAI, Stability.ai, and ElevenLabs