ICHRick

0 followers · 2 following

Lists (4)

Stars

VOICEVOX / voicevox_core

The core of VOICEVOX, a free, medium-quality text-to-speech software

Rust 915 122 Updated Mar 4, 2025

m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python 14,268 1,543 Updated Mar 4, 2025

RookieJunChen / FullSubNet-plus

The official PyTorch implementation of "FullSubNet+: Channel Attention FullSubNet with Complex Spectrograms for Speech Enhancement".

Python 253 57 Updated Apr 23, 2024

TuneNN / TuneNN

A transformer-based network model for pitch detection

Python 164 6 Updated Dec 19, 2023

haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 21,704 2,384 Updated Aug 12, 2024

state-spaces / mamba

Mamba SSM architecture

Python 14,139 1,231 Updated Jan 18, 2025

confident-ai / deepeval

The LLM Evaluation Framework

Python 5,344 449 Updated Mar 3, 2025

pyannote / pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Jupyter Notebook 6,974 841 Updated Mar 1, 2025

Stability-AI / generative-models

Generative Models by Stability AI

Python 25,434 2,821 Updated Sep 4, 2024

yangdongchao / UniAudio

The Open Source Code of UniAudio

Python 546 32 Updated Jul 22, 2024

Rongjiehuang / GenerSpeech

PyTorch Implementation of GenerSpeech (NeurIPS'22): a text-to-speech model towards zero-shot style transfer of OOD custom voice.

Python 320 44 Updated Feb 9, 2024

asteroid-team / torch-audiomentations

Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.

Python 1,004 92 Updated Jan 15, 2025

ckiplab / ckip-transformers

CKIP Transformers

Python 716 75 Updated Apr 21, 2023

ckiplab / han-transformers

8 1 Updated Jan 19, 2023

lucidrains / musiclm-pytorch

Implementation of MusicLM, Google's new SOTA model for music generation using attention networks, in Pytorch

Python 3,231 263 Updated Sep 6, 2023

espeak-ng / espeak-ng

eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.

C 4,761 971 Updated Mar 3, 2025

speechbrain / speechbrain

A PyTorch-based Speech Toolkit

Python 9,452 1,445 Updated Feb 27, 2025

lucidrains / audiolm-pytorch

Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch

Python 2,507 275 Updated Jan 12, 2025

facebookresearch / encodec

State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.

Python 3,614 319 Updated Jan 4, 2024

OlaWod / FreeVC

FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion

Python 654 114 Updated Jan 19, 2025

coqui-ai / TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python 38,163 4,777 Updated Aug 16, 2024

suno-ai / bark

🔊 Text-Prompted Generative Audio Model

Jupyter Notebook 37,125 4,378 Updated Aug 19, 2024
