dmitrymailk
Starred repositories

282 results for source starred repositories

TTS with kokoro and onnx runtime

Python 1,387 119 Updated Feb 3, 2025

LLMPerf is a library for validating and benchmarking LLMs

Python 714 121 Updated Dec 9, 2024

A pipeline parallel training script for diffusion models.

Python 477 43 Updated Jan 26, 2025

The official implementation of paper "BrushEdit: All-In-One Image Inpainting and Editing"

Python 497 25 Updated Dec 26, 2024

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python 13,636 1,484 Updated Jan 27, 2025

Official repository for our work on micro-budget training of large-scale diffusion models.

Python 1,211 47 Updated Jan 12, 2025

Live2Diff: A Pipeline that processes Live video streams by a uni-directional video Diffusion model.

Python 180 15 Updated Jul 22, 2024

STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution

Python 824 44 Updated Jan 22, 2025

NitroFusion: High-Fidelity Single-Step Diffusion through Dynamic Adversarial Training

266 19 Updated Jan 6, 2025

[WIP] The all in one inference optimization solution for ComfyUI, universal, flexible, and fast.

Python 682 22 Updated Feb 2, 2025

Cosmos is a world model development platform that consists of world foundation models, tokenizers and video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…

Python 7,350 456 Updated Jan 28, 2025

UI components and hooks for building video/audio players on the web. Robust, customizable, and accessible. Modern alternative to JW Player and Video.js.

TypeScript 2,582 142 Updated Feb 2, 2025

Repo of the YT Channel

Python 27 4 Updated Feb 14, 2024

Segment Anything 2, 100% in the browser (with WebGPU!)

TypeScript 99 5 Updated Dec 18, 2024

LLM inference in C/C++

C++ 72,758 10,485 Updated Feb 2, 2025

A lightweight, object-oriented finite state machine implementation in Python with many extensions

Python 5,903 533 Updated Aug 23, 2024

A frontend for transitions state machines

Python 69 7 Updated Sep 1, 2024

Structured Text Generation

Python 10,548 556 Updated Jan 31, 2025

A Python-based voice assistant integrating speech-to-text (STT), text-to-speech (TTS), and powerful AI capabilities using either a local LLM via llama.cpp or OpenAI API. Includes clipboard integrat…

Python 9 1 Updated Dec 2, 2024

The Hugging Face Course on Transformers for Audio

MDX 363 105 Updated Jan 23, 2025

Python 5 Updated Jan 22, 2025

A set of ComfyUI nodes providing additional control for the LTX Video model

Python 433 19 Updated Dec 21, 2024

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 39,707 4,455 Updated Jan 18, 2025

StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation

Python 9,904 726 Updated Dec 4, 2024

Android application for running Windows applications with Wine and Box86/Box64

C 10,712 566 Updated Jan 6, 2025

Python 2 1 Updated Dec 19, 2024

Various custom nodes for ComfyUI

Python 814 88 Updated Feb 1, 2025

Python hook for ReShade processing

C++ 36 4 Updated Mar 25, 2023

Official codes of VEnhancer: Generative Space-Time Enhancement for Video Generation

Python 492 27 Updated Sep 16, 2024

Python 68 2 Updated Nov 2, 2024