Got Your Back (GYB) is a command line tool for backing up your Gmail messages to your computer using Gmail's API over HTTPS.

Python 2,688 215 Updated Oct 31, 2024

SOTA Open Source TTS

Python 17,856 1,338 Updated Dec 29, 2024

Get up and running with Llama 3.3, Mistral, Gemma 2, and other large language models.

Go 104,888 8,382 Updated Dec 29, 2024

The simplest & most comprehensible tutorial on speaker identification with NVIDIA's `Nemo`.

Python 3 5 Updated Aug 5, 2021

Python package for combining diarization system outputs.

Python 80 13 Updated Oct 12, 2023

Hyperaudio Lite - a Super-lightweight Interactive Transcript Player

HTML 130 40 Updated Nov 19, 2024

ez audio transcription tool with flexible processing and post-processing options

Python 137 13 Updated Feb 1, 2024

💬 ASR FastAPI server using faster-whisper and Multi-Scale Auto-Tuning Spectral Clustering for diarization.

Python 200 29 Updated Oct 30, 2024

Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection

Python 494 22 Updated Dec 19, 2024

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 12,559 2,576 Updated Dec 30, 2024

Unofficial implementation of NVIDIA P-Flow TTS paper

Python 219 33 Updated Dec 24, 2024

Official repository for LTX-Video

Python 2,248 164 Updated Dec 20, 2024

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.

HTML 9,593 808 Updated Dec 27, 2024

speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names

Python 169 15 Updated Oct 9, 2024

Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts

Python 290 22 Updated Nov 12, 2024

Code for the paper Hybrid Spectrogram and Waveform Source Separation

Python 1,091 105 Updated Jul 15, 2024

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

Python 986 75 Updated Aug 24, 2024

"LightRAG: Simple and Fast Retrieval-Augmented Generation"

Python 12,376 1,696 Updated Dec 29, 2024

OpenRecall is a fully open-source, privacy-first alternative to proprietary solutions like Microsoft's Windows Recall. With OpenRecall, you can easily access your digital history, enhancing your me…

Python 1,985 102 Updated Oct 1, 2024

build ai agents that have the full context, open source, runs locally, developer friendly. 24/7 screen, mic, keyboard recording and control

TypeScript 11,310 721 Updated Dec 29, 2024

A python package to analyze and compare voices with deep learning

Python 2,819 432 Updated Oct 12, 2023

A PyTorch-based Speech Toolkit

Python 9,112 1,413 Updated Dec 28, 2024

turnkey self-hosted offline transcription and diarization service with llm summary

Python 767 45 Updated Sep 25, 2024

ASR + diarization model server with speculative decoding

Python 51 9 Updated May 22, 2024

🔊 Text-Prompted Generative Audio Model

Jupyter Notebook 36,535 4,298 Updated Aug 19, 2024

An easy way to extract information from documents

Python 1,721 129 Updated May 3, 2023

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

Python 12,962 907 Updated Oct 22, 2024

MARS5 speech model (TTS) from CAMB.AI

Jupyter Notebook 2,575 213 Updated Aug 1, 2024