Benchmarking physical understanding in generative video models

Python 38 Updated Jan 17, 2025

Official repo for paper "MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions"

Python 395 10 Updated Sep 2, 2024

🔥🔥 Kokoro in Rust. https://huggingface.co/hexgrad/Kokoro-82M Insanely fast, real-time TTS with the highest quality you have ever had.

Rust 183 13 Updated Jan 18, 2025
Python 1,634 105 Updated Jan 16, 2025

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python 5,287 458 Updated Aug 10, 2024

Code for NeurIPS 2024 paper - The GAN is dead; long live the GAN! A Modern Baseline GAN - by Huang et al.

Python 553 19 Updated Jan 15, 2025

Whisper with Medusa heads

Python 818 50 Updated Dec 30, 2024

Align Anything: Training All-modality Model with Feedback

Python 511 95 Updated Jan 18, 2025

Music repair method to convert lossy MP3 compressed music to lossless music.

Python 180 16 Updated Jan 7, 2025

Unofficial implementation of Titans, SOTA memory for transformers, in PyTorch

Python 535 29 Updated Jan 18, 2025

VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation

Python 99 1 Updated Jan 17, 2025

🎛 🔊 A Python library for audio.

C++ 5,329 272 Updated Nov 26, 2024

Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM*

Python 15 1 Updated Jan 13, 2025

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

Python 2,016 516 Updated Jul 27, 2024

✨✨VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Python 1,938 136 Updated Jan 17, 2025

📚[WIP] FFPA: Yet another Faster Flash Prefill Attention with O(1)⚡️GPU SRAM complexity for headdim > 256, 1.8x~3x↑🎉faster vs SDPA EA.

Cuda 51 1 Updated Jan 18, 2025

Scalable RL solution for advanced reasoning of language models

Python 897 55 Updated Jan 17, 2025

[Preprint] "Understanding Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing" by Peihao Wang, Ruisi Cai, Yuehao Wang, Jiajun Zhu, Pragya Srivastava, Zhangyang Wang, Pa…

8 Updated Jan 3, 2025

Official repository of the paper "MuQ: Self-Supervised Music Representation Learning with Mel Residual Vector Quantization".

Python 123 5 Updated Jan 9, 2025

Applied AI experiments and examples for PyTorch

Python 211 21 Updated Jan 17, 2025

Triton implementation of bi-directional (non-causal) linear attention

Python 35 1 Updated Jan 13, 2025

Medical o1, Towards medical complex reasoning with LLMs

Python 673 67 Updated Jan 5, 2025

The official repository for the paper "Optimal Flow Matching: Learning Straight Trajectories in Just One Step" (NeurIPS 2024)

Jupyter Notebook 53 1 Updated Dec 19, 2024

An AI Hedge Fund Team

Python 6,529 1,219 Updated Jan 18, 2025

Tarsier -- a family of large-scale video-language models designed to generate high-quality video descriptions, with good general video understanding capability.

Python 187 12 Updated Jan 15, 2025

DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought

194 9 Updated Dec 31, 2024

Grapheme-to-Phoneme for Mixed Chinese (Mandarin or Cantonese) and English.

Python 83 12 Updated Jan 16, 2025

Torchaudio Forced Aligner for Mixed Chinese (Mandarin or Cantonese) and English.

Python 9 1 Updated Dec 25, 2024

High-performance components for building trading platforms, such as an ultra-fast matching engine and an order book processor

C++ 849 257 Updated Mar 9, 2024