
A cross-platform library to access USB devices

C 5,462 1,942 Updated Nov 14, 2024

Autonomous coding agent right in your IDE, capable of creating/editing files, executing commands, using the browser, and more with your permission every step of the way.

TypeScript 31,105 2,986 Updated Feb 24, 2025

Python tool for converting files and office documents to Markdown.

HTML 38,860 1,787 Updated Feb 21, 2025

A feature-rich command-line audio/video downloader

Python 101,521 7,954 Updated Feb 23, 2025

👾 Fast and simple video download library and CLI tool written in Go

Go 28,725 3,079 Updated Oct 12, 2024

MiniCPM on Android platform.

Python 625 50 Updated Apr 11, 2024

🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.

Python 27,656 5,681 Updated Feb 24, 2025

The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.

Python 68,328 7,348 Updated Feb 24, 2025

[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild

Python 6,885 1,021 Updated Aug 5, 2024

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

Python 9,845 1,330 Updated Feb 22, 2025

real time face swap and one-click video deepfake with only a single image

Python 44,156 6,469 Updated Feb 19, 2025

AI app store powered by 24/7 desktop history. open source | 100% local | dev friendly | 24/7 screen, mic recording

TypeScript 12,402 869 Updated Feb 24, 2025

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python 2,818 191 Updated Nov 14, 2024

Empowering RAG with a memory-based data interface for all-purpose applications!

Python 1,640 113 Updated Nov 28, 2024

An open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.

Python 3,168 275 Updated Nov 5, 2024

SoulChat: a large language model for mental-health dialogue in Chinese

Python 557 60 Updated Jun 15, 2024

✨✨VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Python 2,105 159 Updated Feb 13, 2025

MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone

Python 18,636 1,331 Updated Feb 21, 2025

Microsoft's GraphRAG + AutoGen + Ollama + Chainlit = Fully Local & Free Multi-Agent RAG Superbot

Python 631 125 Updated Jul 20, 2024

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 14,201 1,453 Updated Dec 25, 2024

EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning

Python 3,591 399 Updated Dec 10, 2024

This is the official implementation of "Flash-VStream: Memory-Based Real-Time Understanding for Long Video Streams"

Python 164 15 Updated Dec 24, 2024

Cross-platform video extraction tool: supports streaming media download, video download, m3u8 download, and Bilibili video download, with Windows and Mac desktop clients.

TypeScript 5,967 551 Updated Feb 17, 2025

Self-hosted AI coding assistant

Rust 30,018 1,376 Updated Feb 24, 2025

GraphRAG using Local LLMs - Features robust API and multiple apps for Indexing/Prompt Tuning/Query/Chat/Visualizing/Etc. This is meant to be the ultimate GraphRAG/KG local LLM app.

Python 1,993 234 Updated Nov 9, 2024

SEED-Story: Multimodal Long Story Generation with Large Language Model

Python 792 60 Updated Oct 11, 2024

A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON.

Python 26,422 2,024 Updated Feb 24, 2025

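Updated February 2025: live-stream sources for 五星体育 (Five Star Sports), Migu, the five major football leagues, and F1; IPTV and APTV live TV sources; IPTV streaming software; M3U live sources for China, Taiwan/Hong Kong/Macau, and overseas IPTV; TV viewing tools; the latest working IPTV sources (iptv4/iptv6); TVBox interfaces; bonus program sources; IPTV checking tools; and alternatives to the 电视家 app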

849 85 Updated Feb 18, 2025

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

Python 1,082 72 Updated Jan 23, 2025

MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, implementing incremental pretraining (PT), supervised fine-tuning (SFT), RLHF, DPO, ORPO, and GRPO.

Python 3,610 531 Updated Feb 20, 2025