Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Jupyter Notebook 7,943 598 Updated Nov 30, 2024

THUDM / GLM-4-Voice

GLM-4-Voice | End-to-end Chinese-English spoken dialogue model

Python 2,488 198 Updated Dec 5, 2024

lucidrains / minGRU-pytorch

Implementation of the proposed minGRU in PyTorch

Python 264 20 Updated Dec 18, 2024

google-research / google-research

Google Research

Jupyter Notebook 34,538 7,957 Updated Dec 13, 2024

yueliu1999 / Awesome-Deep-Graph-Clustering

Awesome Deep Graph Clustering is a collection of SOTA, novel deep graph clustering methods (papers, codes, and datasets).

Python 860 147 Updated Oct 24, 2024

microsoft / CLAP

Learning audio concepts from natural language supervision

Python 505 38 Updated Sep 18, 2024

kyutai-labs / moshi

Python 7,008 548 Updated Dec 20, 2024

Audio-AGI / WavJourney

WavJourney: Compositional Audio Creation with LLMs

Python 525 44 Updated Sep 28, 2023

hijkzzz / Awesome-LLM-Strawberry

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

5,738 314 Updated Dec 21, 2024

microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 20,445 2,572 Updated Dec 15, 2024

feizc / FluxMusic

Text-to-Music Generation with Rectified Flow Transformers

Python 1,634 125 Updated Dec 10, 2024

jasminsternkopf / mel_cepstral_distance

Computes the Mel-Cepstral Distance of two WAV files based on the paper "Mel-Cepstral Distance Measure for Objective Speech Quality Assessment" by Robert F. Kubichek.

Python 50 10 Updated Dec 11, 2024

microsoft / Pengi

An Audio Language model for Audio Tasks

Python 296 16 Updated Apr 19, 2024

FunAudioLLM / FunAudioLLM-APP

Python 302 56 Updated Jul 22, 2024

lyuchenyang / Macaw-LLM

Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration

Python 1,571 128 Updated Jun 17, 2024

shansongliu / MuMu-LLaMA

This is the official repository for M2UGen

Jupyter Notebook 455 38 Updated Dec 18, 2024

XiaohuaiLe-AALab Le-Xiaohuai-speech

Stars

Metacreation-Lab / GigaMIDI-Dataset

bytedance / uss

x-cls / superclass

chentuochao / Target-Conversation-Extraction

facebookresearch / seamless_communication

google-deepmind / alphafold3

CrossmodalGroup / LAPS

bytedance / dplm

Pathoschild / StardewMods

tango4j / Auto-Tuning-Spectral-Clustering

JusperLee / SonicSim

halsay / ASR-TTS-paper-daily

JishengBai / AudioSetCaps

Plachtaa / seed-vc

open-mmlab / Amphion