40 stars written in Jupyter Notebook

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

Jupyter Notebook 46,175 4,907 Updated Jan 22, 2025

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 39,675 5,266 Updated Feb 11, 2025

🔊 Text-Prompted Generative Audio Model

Jupyter Notebook 36,908 4,357 Updated Aug 19, 2024

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…

Jupyter Notebook 21,466 2,231 Updated Jan 15, 2025

A multi-voice TTS system trained with an emphasis on quality

Jupyter Notebook 13,671 1,893 Updated Nov 19, 2024

Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors

Jupyter Notebook 13,577 4,264 Updated Aug 19, 2024

FaceChain is a deep-learning toolchain for generating your Digital-Twin.

Jupyter Notebook 9,273 868 Updated Dec 10, 2024

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Jupyter Notebook 6,838 825 Updated Feb 12, 2025

AirLLM 70B inference with single 4GB GPU

Jupyter Notebook 5,662 449 Updated Nov 24, 2024

An Open Source text-to-speech system built by inverting Whisper.

Jupyter Notebook 4,111 230 Updated Dec 12, 2024

Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

Jupyter Notebook 3,877 324 Updated Jan 13, 2025

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch

Jupyter Notebook 1,591 343 Updated Apr 22, 2024

Official Pytorch Implementation for "MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation" presenting "MultiDiffusion" (ICML 2023)

Jupyter Notebook 1,017 60 Updated Sep 21, 2023

[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching

Jupyter Notebook 863 110 Updated Feb 4, 2025

VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design

Jupyter Notebook 532 54 Updated Sep 11, 2023

Phi2-Chinese-0.2B: Train your own small Chinese Phi2 chat model from scratch, with support for connecting to langchain to load a local knowledge base for retrieval-augmented generation (RAG).

Jupyter Notebook 525 57 Updated Jul 11, 2024

Code for SuDoRm-Rf networks for efficient audio source separation. SuDoRm-Rf stands for SUccessive DOwnsampling and Resampling of Multi-Resolution Features which enables a more efficient way of sep…

Jupyter Notebook 314 34 Updated Jul 6, 2023

EMNLP 23 - Integrating Whisper Encoder to LLaMA Decoder for Generative ASR Error Correction

Jupyter Notebook 246 16 Updated May 19, 2024

In this repository, you will learn how the code works in VITS (Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech) in Jupyter Notebooks, including normalizing da…

Jupyter Notebook 152 21 Updated Jun 5, 2023
Jupyter Notebook 139 15 Updated Jan 7, 2024

Intel Neuromorphic DNS Challenge

Jupyter Notebook 135 29 Updated Dec 3, 2024

This repo includes the official implementations of "Fine-tune the pretrained ATST model for sound event detection".

Jupyter Notebook 119 13 Updated Oct 15, 2024

A DDSP-based neural voice synthesiser.

Jupyter Notebook 112 8 Updated Nov 14, 2024
Jupyter Notebook 110 18 Updated Oct 25, 2021

Implementation for "Music Enhancement via Image Translation and Vocoding"

Jupyter Notebook 54 5 Updated Apr 28, 2022

(R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec.

Jupyter Notebook 47 5 Updated Sep 4, 2023
Jupyter Notebook 46 9 Updated Aug 16, 2023

A Pytorch-based implementation of the compression and decompression module in "Ultra Dual-Path Compression For Joint Echo Cancellation And Noise Suppression".

Jupyter Notebook 45 5 Updated Feb 20, 2024
Jupyter Notebook 32 3 Updated Sep 14, 2022

Generative Fixed-Filter Active Noise Control with CNN-Kalman Filtering

Jupyter Notebook 24 8 Updated Sep 15, 2024