jingxuan9862

Stars

TEN-framework / TEN-Agent

TEN Agent is a conversational AI powered by the TEN, integrating Gemini 2.0 Live, OpenAI Realtime, RTC, and more. It delivers real-time capabilities to see, hear, and speak, while being fully compa…

Python 3,780 372 Updated Dec 28, 2024

kyutai-labs / moshi

Python 7,057 550 Updated Dec 20, 2024

chentuochao / Target-Conversation-Extraction

This is the code and dataset repo for Interspeech 2024 paper "Target conversation extraction: Source separation using turn-taking dynamics"

Python 40 4 Updated Oct 4, 2024

WooooDyy / LLM-Agent-Paper-List

The paper list of the 86-page paper "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al.

7,021 414 Updated Jul 28, 2024

vb000 / LookOnceToHear

A novel human-interaction method for real-time speech extraction on headphones.

Python 557 61 Updated Jun 5, 2024

ewan-xu / pyaec

Simple and efficient Python implementation of a series of adaptive filters, including time domain adaptive filters (LMS, NLMS, RLS, AP, Kalman), nonlinear adaptive filters (Volterra filter, functional link a…

Python 334 98 Updated Nov 29, 2021

magenta / mt3

MT3: Multi-Task Multitrack Music Transcription

Python 1,457 195 Updated Dec 11, 2024

jzi040941 / PercepNet

Unofficial implementation of PercepNet: A Perceptually-Motivated Approach for Low-Complexity, Real-Time Enhancement of Fullband Speech

C++ 335 93 Updated Jan 22, 2023

google / visqol

Perceptual Quality Estimator for speech and audio

C++ 719 127 Updated Aug 2, 2024

webdataset / webdataset

A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.

Python 2,397 192 Updated Dec 11, 2024

microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 20,480 2,575 Updated Dec 15, 2024

jingxuan9862 / PaddleSpeech

Forked from PaddlePaddle/PaddleSpeech

An Easy-to-use Speech Toolkit including SOTA ASR pipeline, influential TTS with text frontend and End-to-End Speech Simultaneous Translation.

Python 1 Updated Dec 22, 2021

PaddlePaddle / PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translatio…

Python 11,319 1,866 Updated Dec 27, 2024

FederatedAI / FATE

An Industrial Grade Federated Learning Framework

Python 5,766 1,558 Updated Nov 19, 2024

bytedance / byteps

A high performance and generic framework for distributed DNN training

Python 3,641 491 Updated Oct 3, 2023

qiuqiangkong / panns_transfer_to_gtzan

Python 103 40 Updated Jul 12, 2020

jameslyons / python_speech_features

This library provides common speech features for ASR including MFCCs and filterbank energies.

Python 2,381 617 Updated Oct 20, 2021

cvondrick / soundnet

SoundNet: Learning Sound Representations from Unlabeled Video. NIPS 2016

Lua 460 93 Updated Oct 7, 2017

qiuqiangkong / audioset_tagging_cnn

Python 1,379 258 Updated Jul 25, 2024

deezer / spleeter

Deezer source separation library including pretrained models.

Python 26,107 2,866 Updated Oct 29, 2024

NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 12,553 2,576 Updated Dec 28, 2024

abisee / pointer-generator

Code for the ACL 2017 paper "Get To The Point: Summarization with Pointer-Generator Networks"

Python 2,180 811 Updated Jun 16, 2022

google / sparrowhawk

Shell 207 58 Updated Jun 16, 2018

wenet-e2e / WeTextProcessing.deprecated

C++ 61 5 Updated Jan 31, 2023

speechio / chinese_text_normalization

Chinese text normalization for speech processing

Python 639 147 Updated Mar 18, 2023

BUTSpeechFIT / speakerbeam

Jupyter Notebook 105 18 Updated Oct 25, 2021

magenta / ddsp

DDSP: Differentiable Digital Signal Processing

Python 2,932 345 Updated Sep 23, 2024

nanahou / Awesome-Speech-Enhancement

A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.

MATLAB 729 151 Updated Dec 1, 2020

asteroid-team / asteroid

The PyTorch-based audio source separation toolkit for researchers

Python 2,298 424 Updated Jul 19, 2024

nay0648 / unified2021

A Unified Speech Enhancement Front-End for Online Dereverberation, Acoustic Echo Cancellation, and Source Separation

MATLAB 113 56 Updated Jun 18, 2022
