View adiyoss's full-sized avatar
Python 3 1 Updated Oct 4, 2024

🔊 Text-Prompted Generative Audio Model

Jupyter Notebook 37,037 4,363 Updated Aug 19, 2024

An open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.

Python 3,167 275 Updated Nov 5, 2024

WavJourney: Compositional Audio Creation with LLMs

Python 530 43 Updated Sep 28, 2023
Jupyter Notebook 4 Updated Aug 24, 2024

Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.

Python 7,537 604 Updated Feb 22, 2025

The official code for the SALMon🍣 benchmark (ICASSP 2025)

Python 43 Updated Feb 16, 2025

SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling

Python 1,027 74 Updated Jan 2, 2025

All-In-One Music Structure Analyzer

Python 502 69 Updated May 9, 2024

This repo contains the official PyTorch implementation of vLMIG: Improving Visual Commonsense in Language Models via Multiple Image Generation

Python 16 Updated Jul 1, 2024

The official implementation of "A Language Modeling Approach to Diacritic-Free Hebrew TTS"

Python 83 8 Updated Jul 21, 2024

Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".

Python 413 38 Updated Apr 24, 2024

A curated list of awesome discrete diffusion model resources.

232 9 Updated Feb 5, 2025

Official repository for NAST: Noise Aware Speech Tokenization for Speech Language Models (Interspeech 2024) https://arxiv.org/abs/2406.11037

Python 44 2 Updated Jul 2, 2024

CodeBERT

Python 2,378 474 Updated Jul 9, 2023

🌸 A command-line fuzzy finder

Go 68,207 2,462 Updated Feb 24, 2025

A Zsh theme

Shell 48,019 2,247 Updated Jan 29, 2025

This repo is a fork, containing the official PyTorch implementation of: Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation

Python 1 Updated Sep 28, 2023

This repo contains the official PyTorch implementation of: Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation

Python 113 13 Updated Feb 13, 2025

A Python toolbox for performing gradient-free optimization

Python 4,004 358 Updated Feb 24, 2025

A sequence-to-sequence voice conversion toolkit.

Python 93 12 Updated Jul 5, 2024

This repo is a fork from the official PyTorch implementation of "AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image Generation" (Interspeech 2023)

Python 5 Updated Jun 25, 2023

A spoken version of the textual story cloze benchmark

14 1 Updated Aug 6, 2023

This repository contains the official PyTorch implementation of the paper: "Learning Discrete Structured VAE using NES".

Python 4 4 Updated May 3, 2022

This repo contains the official PyTorch implementation of AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image Generation

Python 80 4 Updated Jun 18, 2024

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…

Jupyter Notebook 21,543 2,248 Updated Jan 15, 2025

Official repository for "Speaking Style Conversion With Discrete Self-Supervised Units" (EMNLP 2023). https://arxiv.org/abs/2212.09730

Python 128 9 Updated Dec 8, 2023

This repo contains the official PyTorch implementation of "Audio Super Resolution in the Spectral Domain" (ICASSP 2023)

Python 217 29 Updated Jul 14, 2024

This repo contains the official PyTorch implementation of "Analyzing Discrete Self Supervised Speech Representation For Spoken Language Modeling" (ICASSP 2023)

Python 17 1 Updated Jan 3, 2023