R Bioinformatics Cookbook, published by Packt

HTML 110 71 Updated Jan 30, 2023

Welcome to the Llama Cookbook! This is your go-to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We also show you how to solve end-to-end problems using Llama mode…

Jupyter Notebook 16,237 2,334 Updated Feb 19, 2025
Jupyter Notebook 36 8 Updated Feb 19, 2025

A Survey of Spoken Dialogue Models (60 pages)

263 16 Updated Nov 28, 2024

Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.

Python 1,625 181 Updated Jan 16, 2025
TypeScript 33 4 Updated Aug 17, 2024

Paper, Code and Resources for Speech Language Model and End2End Speech Dialogue System.

156 13 Updated Nov 10, 2024

GLM-4-Voice | End-to-end Chinese-English speech dialogue model

Python 2,669 216 Updated Dec 5, 2024

Code for paper "Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models"

Python 239 3 Updated May 24, 2024

An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

Python 2,247 163 Updated Feb 14, 2025

A generative speech model for daily dialogue.

Python 34,506 3,724 Updated Feb 18, 2025

Whisper realtime streaming for long speech-to-text transcription and translation

Python 2,471 303 Updated Jan 7, 2025

Python bindings for symphonia/opus - read various audio formats from Python and write opus files

Rust 30 4 Updated Dec 22, 2024

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Python 1 Updated Jul 8, 2024

Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.

Python 7,511 600 Updated Feb 19, 2025

[NeurIPS 2024] SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words

Python 48 1 Updated Jun 25, 2024

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Python 10,807 1,056 Updated Feb 16, 2025

🚀 Power Your World with AI - Explore, Extend, Empower.

JavaScript 7,141 530 Updated Feb 10, 2025

The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.

Python 1,597 115 Updated Jul 5, 2024
HTML 8 Updated Dec 11, 2023

iOS CallKit blocking of NPA-NXX number prefix spam

Swift 76 23 Updated Dec 1, 2018

Automated task tool for Bilibili, supporting multiple deployment methods such as Docker, Qinglong, and k8s. Gentle enough for sensitive skin.

C# 6,840 1,820 Updated Feb 16, 2025

🔮 ChatGPT Desktop Application (Mac, Windows and Linux)

Rust 53,627 6,060 Updated Aug 29, 2024

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 13,151 2,683 Updated Feb 19, 2025

Open-source, accurate and easy-to-use video speech recognition & clipping tool, LLM based AI clipping integrated.

Python 4,166 467 Updated Aug 22, 2024

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Python 8,244 858 Updated Feb 18, 2025

🔊 Text-Prompted Generative Audio Model

Jupyter Notebook 36,984 4,357 Updated Aug 19, 2024

Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.

Python 999 91 Updated Jan 15, 2025

A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.

Python 1,951 194 Updated Feb 18, 2025

[NeurIPS 2022] Towards Robust Blind Face Restoration with Codebook Lookup Transformer

Python 16,491 3,449 Updated Oct 9, 2024