PeiwenSun2000

PeiwenSun PeiwenSun2000

6 followers · 0 following

Stars

315386775 / DeepLearing-Interview-Awesome-2024

A collection of AIGC-interview/CV-interview/LLMs-interview questions and answers, also including new ideas, new questions, new resources, and new projects from work and research

1,968 187 Updated Jan 13, 2025

alibaba / Tora

The official repository for paper "Tora: Trajectory-oriented Diffusion Transformer for Video Generation"

Python 1,058 46 Updated Jan 6, 2025

magic-research / Sa2VA

🔥 Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

Python 802 52 Updated Jan 22, 2025

Stability-AI / stable-audio-tools

Generative models for conditional audio generation

Python 2,864 281 Updated Jan 10, 2025

av-savchenko / face-emotion-recognition

Efficient face emotion recognition in photos and videos

Jupyter Notebook 728 131 Updated Dec 18, 2024

PeiwenSun2000 / Both-Ears-Wide-Open

The official repo for Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation

19 Updated Jan 24, 2025

bytedance / Make-An-Audio-2

a text-conditional diffusion probabilistic model capable of generating high fidelity audio.

Python 143 18 Updated May 29, 2024

zhenye234 / FlashSpeech

ACM MM 2024 FlashSpeech: Efficient Zero-Shot Speech Synthesis

Python 120 7 Updated Sep 20, 2024

jiahaoli57 / Call-for-Reviewers

This project aims to collect the latest "call for reviewers" links from various top CS/ML/AI conferences/journals

727 21 Updated Jan 31, 2025

gpt-omni / mini-omni

open-source multimodal large language model that can hear, talk while thinking. Featuring real-time end-to-end speech input and streaming audio output conversational capabilities.

Python 3,107 270 Updated Nov 5, 2024

zhenye234 / xcodec

AAAI 2025: Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model

Python 146 9 Updated Jan 9, 2025

wanghao9610 / OV-DINO

Official implementation of OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion

Python 288 17 Updated Jan 17, 2025

microsoft / GLIP

Grounded Language-Image Pre-training

Python 2,308 197 Updated Jan 24, 2024

microsoft / RegionCLIP

[CVPR 2022] Official code for "RegionCLIP: Region-based Language-Image Pretraining"

Python 732 52 Updated Mar 20, 2024

Qiangest / DeepEar

DeepEar: Sound Localization with Binaural Microphones

Python 8 4 Updated Mar 10, 2024

bingo-todd / WaveLoc

End-to-End binaural sound localization

Python 14 2 Updated Feb 27, 2020

see2sound / see2sound

Official code for SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound

Python 121 9 Updated Nov 9, 2024

BingYang-20 / DP-RTF-Learning

A python implementation of “Learning Deep Direct-Path Relative Transfer Function for Binaural Sound Source Localization” [TASLP 2021]

Python 23 Updated Feb 11, 2023

Audio-WestlakeU / FN-SSL

The Official PyTorch Implementation of FN-SSL & IPDnet for Sound Source Localization [INTERSPEECH2023 & TASLP2024]

Python 102 11 Updated Dec 9, 2024

otaha178 / Emotion-recognition

Real time emotion recognition

Python 1,113 367 Updated Aug 30, 2024

GeWu-Lab / Ref-AVS

The official repo for "Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes", ECCV 2024

Python 31 1 Updated Dec 4, 2024

We-Math / We-Math

Code and data of We-Math

Python 125 9 Updated Jan 9, 2025

HarborYuan / ovsam

[ECCV 2024] The official code of paper "Open-Vocabulary SAM".

Python 924 31 Updated Jul 31, 2024

ayameyao / ResearchToolCode

Python 6 Updated Oct 25, 2023

Harmonai-org / oobleck

open soundstream-ish VAE codecs for downstream neural audio synthesis

Python 116 10 Updated Jun 12, 2023

haoheliu / AudioLDM2

Text-to-Audio/Music Generation

Python 2,364 183 Updated Sep 29, 2024

karpathy / nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 38,954 6,331 Updated Dec 9, 2024

karpathy / minGPT

A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training

Python 21,214 2,743 Updated Aug 15, 2024

facebookresearch / audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…

Jupyter Notebook 21,402 2,221 Updated Jan 15, 2025

facebookresearch / encodec

State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.

Python 3,580 313 Updated Jan 4, 2024
