francklinson (Hangzhou, China)

An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

Python 2,138 151 Updated Jan 27, 2025

Code for L2 regularization of arbitrary Tikhonov matrices

Python 14 3 Updated Mar 16, 2018

A deep-learning-based method for sound field reconstruction

Python 65 12 Updated Jun 26, 2023

Python version of PEAQ (Perceptual Evaluation of Audio Quality)

Python 14 1 Updated Jul 13, 2022

AQUA-Tk = Audio QUality Assessment-Toolkit. (In development)

Python 97 6 Updated Nov 3, 2024

TrOMR: Transformer-based Polyphonic Optical Music Recognition

Python 51 12 Updated Jan 21, 2023

TG-Critic: A Timbre-Guided Model for Reference-Independent Singing Evaluation

13 2 Updated May 26, 2023

🎛 🔊 A Python library for audio.

C++ 5,352 275 Updated Nov 26, 2024

State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.

Python 3,579 313 Updated Jan 4, 2024

Contrastive Language-Audio Pretraining

Python 1,514 149 Updated Nov 21, 2024

Metrics for evaluating music and audio generative models – with a focus on long-form, full-band, and stereo generations.

Python 185 20 Updated Nov 18, 2024

A simple library for Fréchet Audio Distance (FAD) calculation

Python 173 23 Updated Jan 8, 2025

A lightweight library for Fréchet Audio Distance calculation.

Python 251 24 Updated Sep 4, 2024

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

Jupyter Notebook 4,070 362 Updated Dec 18, 2024

Train no-reference speech quality estimators with multiple datasets via learned, per-dataset alignments.

Python 18 Updated Oct 8, 2024

Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection

Python 549 26 Updated Dec 19, 2024

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python 13,608 1,479 Updated Jan 27, 2025

A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization

Python 1,557 131 Updated Jan 17, 2025

The official repo of NBC & SpatialNet for multichannel speech separation, denoising, and dereverberation

Python 249 30 Updated Jan 1, 2025

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python 2,780 187 Updated Nov 14, 2024

Build your own Chinese chess (xiangqi) AI using the AlphaZero algorithm

Python 229 55 Updated Sep 1, 2022

SOTA Open Source TTS

Python 18,740 1,416 Updated Jan 26, 2025

This repository contains a comprehensive computer vision/machine learning football project that uses YOLO for object detection, Kmeans for pixel segmentation, optical flow for motion tracking, and …

Jupyter Notebook 604 212 Updated Apr 23, 2024

Official repository - Fully managed, cross platform (Windows, Mac, Linux) .NET library for capturing packets

C# 1,394 274 Updated Jan 20, 2025

faster_whisper GUI with PySide6

Python 1,967 117 Updated Dec 8, 2024

This repo contains required files for the INTERSPEECH 2022 Audio Deep Packet Loss Concealment (PLC) Challenge.

Python 81 11 Updated Oct 31, 2024

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

Python 7,616 650 Updated Aug 13, 2024

KAN-TTS is a speech-synthesis training framework. Please try the demos we have posted at https://modelscope.cn/models?page=1&tasks=text-to-speech

Python 499 85 Updated Dec 28, 2023