jingxuan9862

Stars

64 results for source starred repositories

TEN-framework / TEN-Agent

TEN Agent is a conversational AI powered by the TEN, integrating Gemini 2.0 Live, OpenAI Realtime, RTC, and more. It delivers real-time capabilities to see, hear, and speak, while being fully compa…

Python 3,895 412 Updated Jan 14, 2025

kyutai-labs / moshi

Python 7,153 558 Updated Jan 14, 2025

chentuochao / Target-Conversation-Extraction

This is the code and dataset repo for Interspeech 2024 paper "Target conversation extraction: Source separation using turn-taking dynamics"

Python 42 4 Updated Oct 4, 2024

WooooDyy / LLM-Agent-Paper-List

The paper list of the 86-page paper "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al.

7,086 422 Updated Jul 28, 2024

vb000 / LookOnceToHear

A novel human-interaction method for real-time speech extraction on headphones.

Python 555 61 Updated Jun 5, 2024

ewan-xu / pyaec

Simple and efficient Python implementation of a series of adaptive filters, including time-domain adaptive filters (LMS, NLMS, RLS, AP, Kalman), nonlinear adaptive filters (Volterra filter, functional link a…

Python 338 98 Updated Nov 29, 2021

magenta / mt3

MT3: Multi-Task Multitrack Music Transcription

Python 1,468 195 Updated Dec 11, 2024

jzi040941 / PercepNet

Unofficial implementation of PercepNet: A Perceptually-Motivated Approach for Low-Complexity, Real-Time Enhancement of Fullband Speech

C++ 335 94 Updated Jan 22, 2023

google / visqol

Perceptual Quality Estimator for speech and audio

C++ 723 128 Updated Aug 2, 2024

webdataset / webdataset

A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.

Python 2,424 194 Updated Dec 11, 2024

microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 20,577 2,584 Updated Jan 7, 2025

PaddlePaddle / PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translatio…

Python 11,386 1,871 Updated Jan 16, 2025

FederatedAI / FATE

An Industrial Grade Federated Learning Framework

Python 5,782 1,559 Updated Nov 19, 2024

bytedance / byteps

A high performance and generic framework for distributed DNN training

Python 3,655 493 Updated Oct 3, 2023

qiuqiangkong / panns_transfer_to_gtzan

Python 103 40 Updated Jul 12, 2020

jameslyons / python_speech_features

This library provides common speech features for ASR including MFCCs and filterbank energies.

Python 2,382 617 Updated Oct 20, 2021

cvondrick / soundnet

SoundNet: Learning Sound Representations from Unlabeled Video. NIPS 2016

Lua 460 93 Updated Oct 7, 2017

qiuqiangkong / audioset_tagging_cnn

Python 1,392 259 Updated Jul 25, 2024

deezer / spleeter

Deezer source separation library including pretrained models.

Python 26,206 2,874 Updated Oct 29, 2024

NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 12,873 2,626 Updated Jan 16, 2025

abisee / pointer-generator

Code for the ACL 2017 paper "Get To The Point: Summarization with Pointer-Generator Networks"

Python 2,180 808 Updated Jun 16, 2022

wenet-e2e / WeTextProcessing.deprecated

C++ 61 5 Updated Jan 31, 2023

speechio / chinese_text_normalization

Chinese text normalization for speech processing

Python 645 146 Updated Mar 18, 2023

BUTSpeechFIT / speakerbeam

Jupyter Notebook 109 18 Updated Oct 25, 2021

magenta / ddsp

DDSP: Differentiable Digital Signal Processing

Python 2,950 344 Updated Sep 23, 2024

nanahou / Awesome-Speech-Enhancement

A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.

MATLAB 735 150 Updated Dec 1, 2020

asteroid-team / asteroid

The PyTorch-based audio source separation toolkit for researchers

Python 2,310 427 Updated Jan 11, 2025

nay0648 / unified2021

A Unified Speech Enhancement Front-End for Online Dereverberation, Acoustic Echo Cancellation, and Source Separation

MATLAB 113 56 Updated Jun 18, 2022

etzinis / sudo_rm_rf

Code for SuDoRm-Rf networks for efficient audio source separation. SuDoRm-Rf stands for SUccessive DOwnsampling and Resampling of Multi-Resolution Features which enables a more efficient way of sep…

Jupyter Notebook 313 34 Updated Jul 6, 2023

clovaai / voxceleb_trainer

In defence of metric learning for speaker recognition

Python 1,077 276 Updated Mar 26, 2024
