wendongj

wendong wendongj

work with attention

4 followers · 141 following

Stars

Deep-Agent / R1-V

Witness the aha moment of VLM with less than $3.

Python 1,770 122 Updated Feb 8, 2025

lucidrains / transformer-directed-evolution

Explorations into whether a transformer with RL can direct a genetic algorithm to converge faster

51 1 Updated Feb 2, 2025

multimodal-art-projection / YuE

YuE: Open Full-song Music Generation Foundation Model, something similar to Suno.ai but open

Python 3,209 321 Updated Feb 8, 2025

deepseek-ai / DeepSeek-MoE

DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

Python 1,315 225 Updated Jan 16, 2024

hkust-nlp / simpleRL-reason

This is a replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data

Python 2,236 168 Updated Feb 7, 2025

huggingface / open-r1

Fully open reproduction of DeepSeek-R1

Python 17,565 1,451 Updated Feb 7, 2025

FireRedTeam / FireRedASR

FireRedASR is a family of open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outs…

Python 107 4 Updated Feb 5, 2025

Mikezz1 / sepformer-tse

target speaker extraction with sepformer

Python 4 Updated Apr 20, 2024

isHuangZiling / SEF-PNet

5 Updated Sep 3, 2024

fairsky0201 / gpuRIR

Cuda 3 1 Updated Apr 29, 2024

modelscope / FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Python 7,975 828 Updated Feb 5, 2025

deepseek-ai / DeepSeek-R1

68,018 8,722 Updated Feb 8, 2025

Mddct / cosyvoice2-flow-optimized

faster inference

18 1 Updated Jan 20, 2025

merlresearch / tf-locoformer

Transformer with Local Modeling by Convolution for Speech Separation and Enhancement

Python 39 5 Updated Aug 1, 2024

Andong-Li-speech / RNDVoC

This is the official repository of ``Scalable Neural Vocoder from Range-Null Space Decomposition'', which is submitted to TPAMI.

8 1 Updated Jan 10, 2025

Taltt / RNDVoC

Forked from Andong-Li-speech/RNDVoC

This is the official repository of ``Scalable Neural Vocoder from Range-Null Space Decomposition'', which is submitted to TPAMI.

1 Updated Jan 10, 2025

JMCheng-SEU / UCLFWPKD-for-SE

This is the repository of the manuscript "Residual Fusion Probabilistic Knowledge Distillation for Speech Enhancement".

JavaScript 4 1 Updated Apr 17, 2024

MiniMax-AI / MiniMax-01

Python 2,074 142 Updated Jan 16, 2025

NovaSky-AI / SkyThought

Sky-T1: Train your own O1 preview model within $450

Python 2,429 262 Updated Feb 7, 2025

vkothapally / Subband-Beamformer

HTML 32 4 Updated Nov 29, 2022

TaoRuijie / SEANet

Code for Audio-Visual Target Speaker Extraction with Selective Auditory Attention (TASLP)

8 Updated Jan 21, 2025

bytedance / LatentSync

Taming Stable Diffusion for Lip Sync!

Python 2,301 324 Updated Jan 19, 2025

seongho608 / RingFormer

Python 38 1 Updated Jan 9, 2025

AestheTech163 / PercepNet

My implementation of percepnet

Jupyter Notebook 7 2 Updated Apr 15, 2024

tan90xx / distillw2n

Python 5 Updated Feb 1, 2025

TMElyralab / MuseTalk

MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting

Python 3,389 433 Updated Nov 27, 2024

duchenzhuang / FSQ-pytorch

A PyTorch Implementation of Finite Scalar Quantization

Python 107 4 Updated Nov 29, 2023

SocialAI-tianji / Tianji

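Building large language models that understand social etiquette and interpersonal nuance | Covers tutorials on prompt engineering, RAG, agents, and LLM fine-tuning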

Python 1,115 81 Updated Jan 18, 2025

deepseek-ai / DeepSeek-V3

Python 79,845 12,647 Updated Feb 8, 2025

ZBang / USEF-TP

1 Updated Dec 18, 2024