xoicy

4 followers · 1 following

Achievements

Starred repositories

fishaudio / fish-speech

SOTA Open Source TTS

Python 18,275 1,367 Updated Jan 12, 2025

hkchengrex / MMAudio

[arXiv 2024] Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis

Python 927 100 Updated Jan 9, 2025

PixArt-alpha / PixArt-sigma

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

Python 1,731 85 Updated Oct 31, 2024

microsoft / onnxruntime-genai

Generative AI extensions for onnxruntime

C++ 574 145 Updated Jan 11, 2025

facebookresearch / demucs

Code for the paper Hybrid Spectrogram and Waveform Source Separation

Python 8,553 1,100 Updated Apr 24, 2024

huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.

Python 27,068 5,552 Updated Jan 12, 2025

Lightricks / LTX-Video

Official repository for LTX-Video

Python 2,494 204 Updated Jan 3, 2025

Breta01 / handwriting-ocr

OCR software for recognition of handwritten text

Jupyter Notebook 779 241 Updated Dec 23, 2022

VikParuchuri / surya

OCR, layout analysis, reading order, table recognition in 90+ languages

Python 15,390 994 Updated Jan 11, 2025

deepseek-ai / DeepSeek-VL2

DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

Python 766 69 Updated Dec 30, 2024

atfortes / Awesome-LLM-Reasoning

Reasoning in Large Language Models: Papers and Resources, including Chain-of-Thought and OpenAI o1 🍓

2,243 127 Updated Dec 17, 2024

maitrix-org / llm-reasoners

A library for advanced large language model reasoning

Python 1,651 143 Updated Jan 10, 2025

tberg12 / ocular

Ocular is a state-of-the-art historical OCR system.

Java 257 48 Updated Jun 7, 2024

ocropus-archive / DUP-ocropy

Python-based tools for document analysis and OCR

Jupyter Notebook 3,429 592 Updated May 22, 2021

THUDM / CogVideo

text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Python 10,242 956 Updated Jan 12, 2025

NVlabs / Sana

SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer

Python 1,937 100 Updated Jan 12, 2025

intel / intel-extension-for-transformers

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡

Python 2,150 210 Updated Oct 8, 2024

intel / neural-speed

An innovative library for efficient LLM inference via low-bit quantization

C++ 352 38 Updated Aug 30, 2024

microsoft / BitNet

Official inference framework for 1-bit LLMs

C++ 12,584 881 Updated Dec 20, 2024

microsoft / T-MAC

Low-bit LLM inference on CPU with lookup table

C++ 641 48 Updated Jan 9, 2025

OpenNMT / CTranslate2

Fast inference engine for Transformer models

C++ 3,517 310 Updated Dec 18, 2024

facebookresearch / audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…

Python 21,288 2,205 Updated Nov 11, 2024

adefossez / demucs

Forked from facebookresearch/demucs

Code for the paper Hybrid Spectrogram and Waveform Source Separation

Python 1,134 109 Updated Jul 15, 2024

mit-han-lab / efficientvit

Efficient vision foundation models for high-resolution generation and perception.

Python 2,540 204 Updated Dec 24, 2024

ggerganov / whisper.cpp

Port of OpenAI's Whisper model in C/C++

C++ 36,841 3,786 Updated Jan 9, 2025

efeslab / Nanoflow

A throughput-oriented high-performance serving framework for LLMs

Cuda 692 29 Updated Sep 21, 2024

jiaaro / pydub

Manipulate audio with a simple and easy high level interface

Python 9,093 1,060 Updated Jul 25, 2024

deanmalmgren / textract

extract text from any document. no muss. no fuss.

HTML 3,954 612 Updated Dec 2, 2024

getomni-ai / zerox

PDF to Markdown with vision models

Python 7,891 475 Updated Dec 18, 2024

SWivid / F5-TTS

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

Python 8,854 1,164 Updated Jan 9, 2025

Starred topics

neural-machine-translation