
Witness the aha moment of VLM with less than $3.

Python 1,770 122 Updated Feb 8, 2025

Explorations into whether a transformer with RL can direct a genetic algorithm to converge faster

51 1 Updated Feb 2, 2025

YuE: Open Full-song Music Generation Foundation Model, something similar to Suno.ai but open

Python 3,209 321 Updated Feb 8, 2025

DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

Python 1,315 225 Updated Jan 16, 2024

This is a replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data

Python 2,236 168 Updated Feb 7, 2025

Fully open reproduction of DeepSeek-R1

Python 17,565 1,451 Updated Feb 7, 2025

FireRedASR is a family of open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outs…

Python 107 4 Updated Feb 5, 2025

Target speaker extraction with SepFormer

Python 4 Updated Apr 20, 2024
Cuda 3 1 Updated Apr 29, 2024

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Python 7,975 828 Updated Feb 5, 2025

faster inference

18 1 Updated Jan 20, 2025

Transformer with Local Modeling by Convolution for Speech Separation and Enhancement

Python 39 5 Updated Aug 1, 2024

This is the official repository of "Scalable Neural Vocoder from Range-Null Space Decomposition", which has been submitted to TPAMI.

8 1 Updated Jan 10, 2025

This is the official repository of "Scalable Neural Vocoder from Range-Null Space Decomposition", which has been submitted to TPAMI.

1 Updated Jan 10, 2025

This is the repository of the manuscript "Residual Fusion Probabilistic Knowledge Distillation for Speech Enhancement".

JavaScript 4 1 Updated Apr 17, 2024
Python 2,074 142 Updated Jan 16, 2025

Sky-T1: Train your own O1 preview model within $450

Python 2,429 262 Updated Feb 7, 2025

Code for Audio-Visual Target Speaker Extraction with Selective Auditory Attention (TASLP)

8 Updated Jan 21, 2025

Taming Stable Diffusion for Lip Sync!

Python 2,301 324 Updated Jan 19, 2025
Python 38 1 Updated Jan 9, 2025

My implementation of PercepNet

Jupyter Notebook 7 2 Updated Apr 15, 2024
Python 5 Updated Feb 1, 2025

MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting

Python 3,389 433 Updated Nov 27, 2024

A PyTorch Implementation of Finite Scalar Quantization

Python 107 4 Updated Nov 29, 2023

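Building large language models that understand social etiquette and worldly wisdom | Covers tutorials on prompt engineering, RAG, agents, and LLM fine-tuning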

Python 1,115 81 Updated Jan 18, 2025
1 Updated Dec 18, 2024