- Munich, Germany
Stars
A list of publicly available room impulse response datasets and scripts to download them.
Baselines for IS25 Source Tracing Special Session
Scripts for computing the Intelligibility and CLVP scores for evaluating TTS models
audioLIME: Listenable Explanations Using Source Separation
Thorsten-Voice: A free-to-use, offline, high-quality German TTS voice that should be available for every project without any licensing hassle.
FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion
Advances in audio anti-spoofing and deepfake detection using graph neural networks and self-supervised learning
GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.
A multi-voice TTS system trained with an emphasis on quality
This repository includes the code to reproduce our paper "End-to-end anti-spoofing with RawNet2" (https://arxiv.org/abs/2011.01108) published in ICASSP '21.
A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.
A high-level toolbox for using complex valued neural networks in PyTorch
A tool to create separate exercise/solution files from a single .ipynb input notebook.
StyleGAN2-ADA - Official PyTorch implementation
A PyTorch model for the Stanford Cars dataset: https://ai.stanford.edu/~jkrause/cars/car_dataset.html
Learning computer vision by striving to maximise accuracy on the Stanford Cars dataset
PyTorch speech emotion recognition for the RAVDESS dataset with a CNN.
The neural network model is capable of detecting five different male/female emotions from speech audio. (Deep Learning, NLP, Python)
An official reimplementation of the method described in the INTERSPEECH 2021 paper - Speech Resynthesis from Discrete Disentangled Self-Supervised Representations.
Automatic 2D-to-3D Video Conversion with CNNs
PyTorch implementation of VQ-VAE by Aäron van den Oord et al.