Stars
📦A portable package for running Hunyuan3D-2 on Windows. | 混元 3D 2.0 整合包
ComfyUI-Manager is an extension designed to enhance the usability of ComfyUI. It offers management functions to install, remove, disable, and enable various custom nodes of ComfyUI. Furthermore, th…
The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
An extensive node suite that enables ComfyUI to process 3D inputs (Mesh & UV Texture, etc) using cutting edge algorithms (3DGS, NeRF, etc.)
(IJCV 2024) Code of "AniClipart: Clipart Animation with Text-to-Video Priors"
[WIP] Layer Diffusion for WebUI (via Forge)
Few-shot Keyword Spotting in Any Language and Multilingual Spoken Word Corpus
This repository contains code for applying Data2Vec to pretrain Keyword Transformer model as described in "Improving Label-Deficient Keyword Spotting Through Self-Supervised Pretraining".
AAAI 2024 Papers: Explore a comprehensive collection of innovative research papers presented at one of the premier artificial intelligence conferences. Seamlessly integrate code implementations for…
INTERSPEECH 2023-2024 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023-24 conference. Explore the latest advances in speech and language processin…
ICCV 2023 Papers: Discover cutting-edge research from ICCV 2023, the leading computer vision conference. Stay updated on the latest in computer vision and deep learning, with code included. ⭐ suppo…
CVPR 2023-2024 Papers: Dive into advanced research presented at the leading computer vision conference. Keep up to date with the latest developments in computer vision and deep learning. Code inclu…
ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal proce…
[ICASSP2023] Source code, model links and open test sets for paper SeACo-Paraformer.
OneShot Learning-based hotword detection.
OCR software, free and offline. 开源、免费的离线OCR软件。支持截屏/批量导入图片,PDF文档识别,排除水印/页眉页脚,扫描/生成二维码。内置多国语言库。
🚀「Douyin_TikTok_Download_API」是一个开箱即用的高性能异步抖音、快手、TikTok、Bilibili数据爬取工具,支持API调用,在线批量解析及下载。
TikTok 发布/喜欢/合辑/直播/视频/图集/音乐;抖音发布/喜欢/收藏/收藏夹/视频/图集/实况/直播/音乐/合集/评论/账号/搜索/热榜数据采集工具
The official code repo of "HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection"
Keyword spotting on Arm Cortex-M Microcontrollers
Wav2Keyword is keyword spotting(KWS) based on Wav2Vec 2.0. This model shows state-of-the-art in Speech commands dataset V1 and V2.
This repository is a curated list of awesome Speech Keyword Spotting (Wake-Up Word Detection).
A lightweight, simple-to-use, RNN wake word listener
An open-source audio wake word (or phrase) detection framework with a focus on performance and simplicity.