Research on Automatic Speech Recognition for dysarthric speech

Jupyter Notebook 9 2 Updated Oct 9, 2024

VoiceBank-2023 is a speech corpus specially designed for constructing personalized Mandarin text-to-speech (TTS) systems.

39 1 Updated Aug 30, 2023

Qwen2 vLLM API & Gradio front-end

Python 6 Updated Oct 12, 2024

ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'

Python 341 79 Updated Oct 23, 2023

My Machine Learning Web Service

Python 614 166 Updated Jul 22, 2023

[ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding

Python 2,088 419 Updated Jul 11, 2024

OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark

Python 4,432 1,267 Updated Aug 14, 2024

Instant voice cloning by MIT and MyShell. Audio foundation model.

Python 30,744 3,078 Updated Jan 7, 2025

A library to inspect and extract intermediate layers of PyTorch models.

Python 471 16 Updated May 12, 2022

Chinese speech pretrained models

Shell 1,068 90 Updated Aug 23, 2024

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 8,428 645 Updated Feb 3, 2025

A pipeline to read lips and generate speech for the read content, i.e., lip-to-speech synthesis.

Python 79 19 Updated Nov 25, 2021

AI-based Audio Watermarking Tool

Python 246 32 Updated Jan 7, 2024

A self-supervised learning framework for audio-visual speech

Python 869 138 Updated Dec 7, 2023

Auto-AVSR: Lip-Reading Sentences Project

Python 298 48 Updated Jan 8, 2025

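AIdea is an all-in-one app that supports GPT as well as Chinese large language models such as Tongyi Qianwen and Wenxin Yiyan, along with Stable Diffusion text-to-image and image-to-image, SDXL 1.0, super-resolution, and image colorization.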

Dart 6,614 990 Updated Feb 5, 2025

ICASSP'22 Training Strategies for Improved Lip-Reading; ICASSP'21 Towards Practical Lipreading with Distilled and Efficient Models; ICASSP'20 Lipreading using Temporal Convolutional Networks

Python 406 103 Updated May 18, 2023

Open STT

Python 789 81 Updated Mar 11, 2022

📺 A crawler for video information across the whole of Bilibili

Python 636 187 Updated Feb 17, 2019
Python 1,408 181 Updated Feb 11, 2024

Speech-related labs, companies, resources, internships, and more; recommendations and self-nominations are welcome

542 68 Updated Nov 13, 2024

Tech Enthusiast Weekly, published every Friday

51,941 3,096 Updated Feb 7, 2025

A curated collection of high-quality, interesting open-source projects on GitHub.

15,590 1,742 Updated Feb 5, 2025
5 Updated Feb 7, 2025

Simple samples for TensorRT programming

Python 1,569 345 Updated Dec 18, 2024

Automatic Depression Detection: a GRU/BiLSTM-based Model and An Emotional Audio-Textual Corpus

Python 148 36 Updated Jul 10, 2023

A one-of-a-kind resume builder that keeps your privacy in mind. Completely secure, customizable, portable, open-source and free forever. Try it out today!

TypeScript 29,371 2,950 Updated Feb 7, 2025

Papers from the computer science community to read and discuss.

Shell 90,614 5,836 Updated Nov 8, 2024

Kaldi model converter to ONNX

Python 237 57 Updated Jan 27, 2023