- Montreal
- kundan2510.github.io
Stars
Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.
Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate
Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration
Open-source framework for voice and multimodal conversational AI
Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in PyTorch
min(DALL·E) is a fast, minimal port of DALL·E Mini to PyTorch
Project to play board games like Great Western Trail and Dominant Species online. Backend code for Quarkus, AWS Lambda, DynamoDB. Front end code: https://github.com/tomwetjens/boardgamefiesta-app
Scripts powering https://infiloop.io/personalstockticker
GAN-based Mel-Spectrogram Inversion Network for Text-to-Speech Synthesis
Implementation of "Generating Sequences With Recurrent Neural Networks" https://arxiv.org/abs/1308.0850
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Using a ConvNet to distinguish images of cats from images of dogs. :)
pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by PyTorch, while feature extraction, label computation, and decoding a…
Code for replication of the paper "The relativistic discriminator: a key element missing from standard GAN"
Send voicified messages on Slack using your vocal avatar!
Minimalist Attention-based RNN for NMT (tested on Multi30k)
A domain specific language to express machine learning workloads.
Code for our paper in ACL 2017
Decoupled Neural Interfaces using Synthetic Gradients for PyTorch
MagPhase Vocoder: Speech analysis/synthesis system for TTS and related applications.
A repository of state-of-the-art Deep Learning modules implemented in TensorFlow