View rwesterman's full-sized avatar

Awesome speech/audio LLMs, representation learning, and codec models

793 47 Updated Dec 21, 2024

SALMONN: Speech Audio Language Music Open Neural Network

Python 1,095 85 Updated Dec 12, 2024

Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.

624 35 Updated Aug 3, 2024

A fast multimodal LLM for real-time voice

Python 1,647 116 Updated Dec 12, 2024

Open source inference code for Rev's model

Python 348 24 Updated Dec 19, 2024

Library for Jacobian descent with PyTorch. It enables optimization of neural networks with multiple losses (e.g. multi-task learning).

Python 168 1 Updated Dec 22, 2024
10 Updated Jun 17, 2024

AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.

Python 169,463 44,634 Updated Dec 23, 2024

EVAL (Elastic Versatile Agent with Langchain) will execute all your requests, just like an eval method!

Python 869 81 Updated May 30, 2023

Secure open source cloud runtime for AI apps & AI agents

HTML 7,160 473 Updated Dec 23, 2024

This repository contains code to run (i.e., train, perform inference with, and evaluate) a diarization method called EEND-vector-clustering.

Python 73 17 Updated Oct 18, 2022

UniSpeech - Large Scale Self-Supervised Learning for Speech

Python 440 74 Updated Apr 5, 2024

Aqueduct is no longer being maintained. Aqueduct allows you to run LLM and ML workloads on any cloud infrastructure.

Go 521 18 Updated Jun 7, 2023

Accessible large language models via k-bit quantization for PyTorch.

Python 6,444 639 Updated Dec 23, 2024

Robust Speech Recognition via Large-Scale Weak Supervision

Python 73,265 8,745 Updated Dec 1, 2024

Stable Diffusion web UI

Python 144,802 27,197 Updated Dec 23, 2024

Nick's Docker-based version of Stable Diffusion

Jupyter Notebook 55 5 Updated Dec 26, 2022

Multimodal grounded language dataset

10 Updated Dec 14, 2021

Invoke is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The …

TypeScript 23,979 2,460 Updated Dec 23, 2024

Flax is a neural network library for JAX that is designed for flexibility.

Jupyter Notebook 6,210 652 Updated Dec 23, 2024

Diarization scoring tools.

Python 227 43 Updated Mar 28, 2023

End-to-End Neural Diarization

Python 382 59 Updated Aug 30, 2021

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 12,508 2,575 Updated Dec 24, 2024

Python Sorted Container Types: Sorted List, Sorted Dict, and Sorted Set

Python 3,594 204 Updated Mar 8, 2024

A PyTorch implementation of End-to-End Neural Diarization

Python 98 16 Updated Jun 19, 2023

Python re-implementation of the (constrained) spectral clustering algorithms used in Google's speaker diarization papers.

Python 518 73 Updated Sep 25, 2024

Convert words to numbers

Python 20 10 Updated Apr 13, 2022

Various speech datasets made available to the public

Jupyter Notebook 107 13 Updated Dec 13, 2024

A data augmentations library for audio, image, text, and video.

Python 4,979 302 Updated Nov 21, 2024