Stars
Awesome speech/audio LLMs, representation learning, and codec models
SALMONN: Speech Audio Language Music Open Neural Network
Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.
Library for Jacobian descent with PyTorch. It enables optimization of neural networks with multiple losses (e.g. multi-task learning).
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
EVAL (Elastic Versatile Agent with Langchain) will execute all your requests, just like an eval method!
Secure open source cloud runtime for AI apps & AI agents
This repository contains code to run (i.e., train, perform inference with, and evaluate) a diarization method called EEND-vector-clustering.
UniSpeech - Large Scale Self-Supervised Learning for Speech
Aqueduct is no longer being maintained. Aqueduct allows you to run LLM and ML workloads on any cloud infrastructure.
Accessible large language models via k-bit quantization for PyTorch.
Robust Speech Recognition via Large-Scale Weak Supervision
Stable Diffusion web UI
Nick's Docker-based version of Stable Diffusion
Invoke is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The …
Flax is a neural network library for JAX that is designed for flexibility.
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Python Sorted Container Types: Sorted List, Sorted Dict, and Sorted Set
A PyTorch implementation of End-to-End Neural Diarization
Python re-implementation of the (constrained) spectral clustering algorithms used in Google's speaker diarization papers.
Various speech datasets made available to the public
A data augmentations library for audio, image, text, and video.