Stars
ASLP-lab / LLaSE-G1
Forked from Kevin-naticl/LLaSE-G1
LLaSE-G1: Incentivizing Generalization Capability for LLaMA-based Speech Enhancement
SlamKit is an open-source toolkit for efficient training of SpeechLMs. It was used for "Slamming: Training a Speech Language Model on One GPU in a Day".
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
OSUM: Open Speech Understanding Model, open-sourced by ASLP@NPU.
Unified automatic quality assessment for speech, music, and sound.
Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outstanding singing lyrics rec…
🧑🚀 A summary of the world's best LLM resources (data processing, model training, model deployment, o1 models, small language models, vision-language models).
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)
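As a quick sketch of the architecture this repo implements, below is one self-contained Conformer block (macaron-style half-step feed-forwards around self-attention and a depthwise convolution module). Dimensions, kernel size, and module layout are illustrative choices, not this repository's exact code.

```python
import torch
import torch.nn as nn

class ConvModule(nn.Module):
    """Pointwise conv + GLU, depthwise conv, BatchNorm, SiLU, pointwise conv."""
    def __init__(self, dim, kernel_size=31):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.pw1 = nn.Conv1d(dim, 2 * dim, 1)               # expand for GLU
        self.glu = nn.GLU(dim=1)
        self.dw = nn.Conv1d(dim, dim, kernel_size,
                            padding=kernel_size // 2, groups=dim)
        self.bn = nn.BatchNorm1d(dim)
        self.act = nn.SiLU()
        self.pw2 = nn.Conv1d(dim, dim, 1)

    def forward(self, x):                                   # x: (B, T, D)
        y = self.norm(x).transpose(1, 2)                    # -> (B, D, T)
        y = self.pw2(self.act(self.bn(self.dw(self.glu(self.pw1(y))))))
        return x + y.transpose(1, 2)                        # residual

class ConformerBlock(nn.Module):
    def __init__(self, dim=144, heads=4, ff_mult=4):
        super().__init__()
        ff = lambda: nn.Sequential(nn.LayerNorm(dim),
                                   nn.Linear(dim, ff_mult * dim), nn.SiLU(),
                                   nn.Linear(ff_mult * dim, dim))
        self.ff1, self.ff2 = ff(), ff()
        self.attn_norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.conv = ConvModule(dim)
        self.out_norm = nn.LayerNorm(dim)

    def forward(self, x):                                   # x: (B, T, D)
        x = x + 0.5 * self.ff1(x)                           # half-step FFN
        a = self.attn_norm(x)
        x = x + self.attn(a, a, a, need_weights=False)[0]   # self-attention
        x = self.conv(x)                                    # conv module (own residual)
        x = x + 0.5 * self.ff2(x)                           # half-step FFN
        return self.out_norm(x)

x = torch.randn(2, 100, 144)                                # (batch, frames, dim)
print(ConformerBlock()(x).shape)                            # torch.Size([2, 100, 144])
```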
An unofficial implementation of the Personal VAD speaker-conditioned voice activity detection method. Bachelor's thesis project.
Awesome speech/audio LLMs, representation learning, and codec models
A high-throughput and memory-efficient inference and serving engine for LLMs
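For orientation, a minimal offline-inference sketch against vLLM's Python API; the model name is just a placeholder, and any Hugging Face causal LM it supports would work the same way.

```python
from vllm import LLM, SamplingParams

# Load a (placeholder) model and configure sampling behavior.
llm = LLM(model="facebook/opt-125m")
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

# Batched generation: one RequestOutput per input prompt.
outputs = llm.generate(["Speech enhancement is"], params)
for out in outputs:
    print(out.outputs[0].text)
```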
Official repository for Mamba-based Segmentation Model for Speaker Diarization
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
Real-time face swap and one-click video deepfake with only a single image
✨✨VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
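For context, a small sketch of attaching a LoRA adapter to a Hugging Face model with PEFT; the base checkpoint and `target_modules` below are placeholder choices that depend on the base model's layer names.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")  # placeholder model
config = LoraConfig(
    r=8,                                  # rank of the low-rank update
    lora_alpha=16,                        # scaling applied as alpha / r
    target_modules=["q_proj", "v_proj"],  # layer names vary per architecture
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()        # only the adapter weights train
```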
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
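To make the idea concrete, here is the low-rank update itself in plain PyTorch rather than loralib's API: the pretrained weight W stays frozen and only a rank-r product, scaled by alpha/r, is learned on top of it.

```python
import torch

d_out, d_in, r, alpha = 64, 64, 4, 8
W = torch.randn(d_out, d_in)            # frozen pretrained weight
A = torch.randn(r, d_in) * 0.01         # trainable down-projection, small init
B = torch.zeros(d_out, r)               # trainable up-projection, zero init,
A.requires_grad_(); B.requires_grad_()  # so the update starts as a no-op

x = torch.randn(3, d_in)
y = x @ (W + (alpha / r) * (B @ A)).T   # forward through the adapted weight
print(y.shape)                          # torch.Size([3, 64])
```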
MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversation
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.