Ruoyu Wang rywang99

Ph.D. @SPRATeam-USTC. Long-term intern @iflytek.

5 followers · 26 following

University of Science and Technology of China
http://home.ustc.edu.cn/~wangruoyu/

Stars

facebookresearch / fairseq2

FAIR Sequence Modeling Toolkit 2

Python 863 102 Updated Mar 6, 2025

ictnlp / LLaMA-Omni

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python 2,841 194 Updated Nov 14, 2024

THUDM / GLM-4-Voice

GLM-4-Voice | End-to-end Chinese-English speech dialogue model

Python 2,733 222 Updated Dec 5, 2024

mapequation / infomap

Multi-level network clustering based on the Map Equation

C++ 447 89 Updated Jan 15, 2025

nii-yamagishilab / SpeechSPC-mini

Speech Security and Privacy Compendium - Mini

Python 9 Updated Jun 18, 2024

mulab-mir / song-describer-dataset

The Song Describer dataset is an evaluation dataset made of ~1.1k captions for 706 permissively licensed music recordings.

Jupyter Notebook 148 5 Updated Dec 22, 2023

gaomingqi / Awesome-Video-Object-Segmentation

🔖 Curated list of video object segmentation (VOS) papers, datasets, and projects.

273 9 Updated Mar 4, 2025

kyegomez / AudioFlamingo

Implementation of the model "AudioFlamingo" from the paper: "Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities"

Python 40 1 Updated Jan 27, 2025

Voice-Privacy-Challenge / Voice-Privacy-Challenge-2024

Baseline Recipe for VoicePrivacy Challenge 2024: anonymization systems and evaluation software

Python 51 6 Updated Jan 30, 2025

microsoft / NOTSOFAR1-Challenge

NOTSOFAR-1 Challenge: Distant Diarization and ASR

Python 50 12 Updated Feb 12, 2025

TBC-TJU / MetaBCI

MetaBCI: China’s first open-source platform for non-invasive brain computer interface. The project of MetaBCI is led by Prof. Minpeng Xu from Tianjin University, China.

Python 393 164 Updated Dec 28, 2024

erdewit / HiFiScan

Optimize the audio quality of your loudspeakers

Python 997 30 Updated Nov 29, 2023

Srijith-rkr / KAUST-Whisper-Adapter

INTERSPEECH 23 - Refunction Whisper to recognize new tasks with adapters!

Python 36 2 Updated Sep 11, 2023

chimechallenge / chime-utils

Scripts for data generation, scoring and data manifest preparation for CHiME-8 DASR task.

Python 21 3 Updated Feb 25, 2025

radarFudan / Awesome-state-space-models

Collection of papers on state-space models

581 20 Updated Mar 2, 2025

state-spaces / s4

Structured state space sequence models

Jupyter Notebook 2,571 313 Updated Jul 17, 2024

state-spaces / mamba

Mamba SSM architecture

Python 14,175 1,236 Updated Jan 18, 2025

open-mmlab / Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 8,644 671 Updated Mar 3, 2025