Starred repositories


VoiceLDM: Text-to-Speech with Environmental Context

Python 165 8 Updated Aug 9, 2024

Making large AI models cheaper, faster and more accessible

Python 38,966 4,348 Updated Dec 25, 2024

BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs

Python 506 35 Updated Jul 21, 2023

Stable Diffusion and Flux in pure C/C++

C++ 3,632 312 Updated Nov 30, 2024

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python 36,352 4,468 Updated Aug 16, 2024

An official implementation of "UnitSpeech: Speaker-adaptive Speech Synthesis with Untranscribed Data"

Jupyter Notebook 133 13 Updated Aug 17, 2023

SoftVC VITS Singing Voice Conversion

Python 26,193 4,873 Updated Nov 11, 2023

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Python 30,725 6,433 Updated Oct 18, 2024

Voice activity detection (VAD) papers and code (from 198*~) and their classification.

89 13 Updated Feb 6, 2024

Barkify: an unofficial training implementation of Bark TTS by suno-ai

Python 126 21 Updated May 31, 2023

FluentTTS: Text-dependent Fine-grained Style Control for Multi-style TTS

Python 21 4 Updated Nov 15, 2022

PyTorch code implementation of EfficientSpeech - to be presented at ICASSP2023.

Jupyter Notebook 157 29 Updated Mar 18, 2024

🦜🔗 Build context-aware reasoning applications

Jupyter Notebook 96,892 15,756 Updated Dec 24, 2024

🔊 Text-Prompted Generative Audio Model

Jupyter Notebook 36,498 4,293 Updated Aug 19, 2024

The Mojo Programming Language

Mojo 23,468 2,596 Updated Dec 26, 2024

ICASSP 2023 Accepted

Python 190 14 Updated May 6, 2024

🔊 Text-prompted Generative Audio Model - With the ability to clone voices

Jupyter Notebook 3,195 429 Updated Jun 12, 2024

A family of diffusion models for text-to-audio generation.

Python 1,110 93 Updated Jul 3, 2024

Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in PyTorch

Python 1,295 103 Updated Sep 24, 2023

The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Jupyter Notebook 48,268 5,706 Updated Sep 18, 2024

An unofficial PyTorch implementation of Mix-Phoneme-Bert

Python 39 7 Updated Jul 10, 2023

Keep track of big models in audio domain, including speech, singing, music etc.

463 28 Updated Sep 26, 2024

PITS: Variational Pitch Inference for End-to-end Pitch-controllable TTS without External Pitch Predictor

Python 277 34 Updated Jul 16, 2023

A fully working PyTorch implementation of NaturalSpeech (Tan et al., 2022)

Python 472 68 Updated Feb 7, 2024

AudioLDM: Generate speech, sound effects, music and beyond, with text.

Python 2,495 226 Updated Dec 9, 2024

Objective metrics used in several text-to-speech (TTS) papers.

Python 46 9 Updated Apr 22, 2022

phoneme tokenizer and grapheme-to-phoneme model for 8k languages

Python 150 15 Updated Jun 9, 2023

Open source voice labeling application

Kotlin 153 22 Updated Nov 6, 2024

OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.

Python 37,128 3,248 Updated Aug 17, 2024

An unofficial PyTorch implementation of the audio LM VALL-E

Python 2,974 419 Updated May 10, 2023