francklinson

ChenghaoZhou

China.Hangzhou

Stars

modelscope / ClearerVoice-Studio

An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

Python 2,138 151 Updated Jan 27, 2025

ShiDongyuan / Multichannel_FxLMS_python_code

Python 18 3 Updated Feb 3, 2023

njchiang / tikhonov

code for L2 regularization of arbitrary Tikhonov matrices

Python 14 3 Updated Mar 16, 2018

francesclluis / sound-field-neural-network

A deep-learning-based method for sound field reconstruction

Python 65 12 Updated Jun 26, 2023

lsg1213 / PEAQ_python

Python version of PEAQ (Perceptual Evaluation of Audio Quality)

Python 14 1 Updated Jul 13, 2022

Ashvala / AQUA-Tk

AQUA-Tk = Audio QUality Assessment-Toolkit. (In development)

Python 97 6 Updated Nov 3, 2024

stephencwelch / Perceptual-Coding-In-Python

MATLAB 156 40 Updated May 14, 2015

NetEase / Polyphonic-TrOMR

TrOMR: Transformer-based Polyphonic Optical Music Recognition

Python 51 12 Updated Jan 21, 2023

YuejieGao / TG-CRITIC

TG-CRITIC: A Timbre-Guided Model for Reference-Independent Singing Evaluation

13 2 Updated May 26, 2023

spotify / pedalboard

🎛 🔊 A Python library for audio.

C++ 5,352 275 Updated Nov 26, 2024

facebookresearch / encodec

State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.

Python 3,579 313 Updated Jan 4, 2024

LAION-AI / CLAP

Contrastive Language-Audio Pretraining

Python 1,514 149 Updated Nov 21, 2024

Stability-AI / stable-audio-metrics

Metrics for evaluating music and audio generative models – with a focus on long-form, full-band, and stereo generations.

Python 185 20 Updated Nov 18, 2024

microsoft / fadtk

A simple library for Fréchet Audio Distance (FAD) calculation

Python 173 23 Updated Jan 8, 2025

gudgud96 / frechet-audio-distance

A lightweight library for Frechet Audio Distance calculation.

Python 251 24 Updated Sep 4, 2024

MahmoudAshraf97 / whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

Jupyter Notebook 4,070 362 Updated Dec 18, 2024

NTIA / alignnet

Train no-reference speech quality estimators with multiple datasets via learned, per-dataset alignments.

Python 18 Updated Oct 8, 2024

nyrahealth / CrisperWhisper

Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection

Python 549 26 Updated Dec 19, 2024

m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python 13,608 1,479 Updated Jan 27, 2025

modelscope / 3D-Speaker

A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization

Python 1,557 131 Updated Jan 17, 2025

Audio-WestlakeU / NBSS

The official repo of NBC & SpatialNet for multichannel speech separation, denoising, and dereverberation

Python 249 30 Updated Jan 1, 2025

ictnlp / LLaMA-Omni

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python 2,780 187 Updated Nov 14, 2024

tensorfly-gpu / aichess

Build your own Chinese chess (Xiangqi) AI using the AlphaZero algorithm

Python 229 55 Updated Sep 1, 2022

fishaudio / fish-speech

SOTA Open Source TTS

Python 18,740 1,416 Updated Jan 26, 2025

abdullahtarek / football_analysis

This repository contains a comprehensive computer vision/machine learning football project that uses YOLO for object detection, Kmeans for pixel segmentation, optical flow for motion tracking, and …

Jupyter Notebook 604 212 Updated Apr 23, 2024

dotpcap / sharppcap

Official repository - Fully managed, cross platform (Windows, Mac, Linux) .NET library for capturing packets

C# 1,394 274 Updated Jan 20, 2025

CheshireCC / faster-whisper-GUI

faster_whisper GUI with PySide6

Python 1,967 117 Updated Dec 8, 2024

microsoft / PLC-Challenge

This repo contains required files for the INTERSPEECH 2022 Audio Deep Packet Loss Concealment (PLC) Challenge.

Python 81 11 Updated Oct 31, 2024

netease-youdao / EmotiVoice

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

Python 7,616 650 Updated Aug 13, 2024

modelscope / KAN-TTS

KAN-TTS is a speech-synthesis training framework. Try the demos posted at https://modelscope.cn/models?page=1&tasks=text-to-speech

Python 499 85 Updated Dec 28, 2023
