Make websites accessible for AI agents

Python 36,647 3,795 Updated Mar 3, 2025

🐫 CAMEL: Finding the Scaling Law of Agents. The first and the best multi-agent framework. https://www.camel-ai.org

Python 7,164 820 Updated Mar 8, 2025

🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation

Python 3,035 288 Updated Mar 8, 2025

No fortress, purely open ground. OpenManus is Coming.

Python 12,656 1,845 Updated Mar 8, 2025

Medical o1, Towards medical complex reasoning with LLMs

Python 946 96 Updated Jan 20, 2025

HuatuoGPT, Towards Taming Language Models To Be a Doctor. (An Open Medical GPT)

Python 1,167 152 Updated Dec 16, 2024

Medical NLP Competition, dataset, large models, paper

2,237 418 Updated Dec 6, 2024

Medical Multimodal LLMs

Python 251 20 Updated Jan 9, 2025

Official repository of "Visual-RFT: Visual Reinforcement Fine-Tuning"

Python 898 32 Updated Mar 6, 2025

[IEEE TIP] TOPIC: A Parallel Association Paradigm for Multi-Object Tracking under Complex Motions and Diverse Scenes

Python 387 44 Updated Feb 27, 2025

Official Repository of paper OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference

Python 116 Updated Mar 2, 2025

✨ Light and Fast AI Assistant. Support: Web | iOS | MacOS | Android | Linux | Windows

TypeScript 81,723 61,202 Updated Mar 3, 2025

A real-time CPU VLM in 500M. Surpasses Moondream2 and SmolVLM. Train from scratch with ease.

Python 137 16 Updated Mar 2, 2025

🚀 [Large Models] Train a 26M-parameter multimodal VLM from scratch in just 1 hour!

Python 1,518 168 Updated Feb 23, 2025

🚀🚀 [Large Models] Train a 26M-parameter GPT entirely from scratch in just 2 hours!

Python 14,711 1,634 Updated Feb 23, 2025

Build multimodal language agents for fast prototyping and production

Python 2,169 227 Updated Mar 4, 2025

Solve Visual Understanding with Reinforced VLMs

Python 3,911 241 Updated Mar 7, 2025

[KDD2025] Improving Synthetic Image Detection Towards Generalization: An Image Transformation Perspective

Python 44 5 Updated Feb 23, 2025

Build Multimodal AI Agents with memory, knowledge and tools. Simple, fast and model-agnostic.

Python 20,233 2,691 Updated Mar 7, 2025

Minimal-cost training of a 0.5B R1-Zero

Python 607 79 Updated Feb 26, 2025

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Python 7,092 623 Updated Feb 10, 2025

Scaling Multi-modal Instruction Fine-tuning with Tens of Thousands Vision Task Types

Python 13 Updated Feb 22, 2025

Ola: Pushing the Frontiers of Omni-Modal Language Model

Python 293 11 Updated Feb 28, 2025

Fully open reproduction of DeepSeek-R1

Python 22,335 2,002 Updated Mar 7, 2025

InspireMusic: A Unified Framework for Music, Song, Audio Generation.

Python 943 83 Updated Mar 7, 2025

Witness the aha moment of VLM with less than $3.

Python 3,083 240 Updated Mar 1, 2025

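Free, open-source, offline OCR software. Supports screenshots and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and scanning/generating QR codes. Includes built-in multilingual libraries.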

Python 30,382 3,045 Updated Mar 7, 2025

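😎 Rich ecosystem, 🧩 extensible, 🦄 multimodal - an LLM-native instant messaging bot platform | Works with QQ / WeChat (WeCom, personal WeChat) / Feishu / DingTalk / Discord / Telegram and other messaging platforms | Supports ChatGPT, DeepSeek, Dify, Claude, Gemini, xAI Grok, Ollama, LM Studio, Alibaba Cloud Bailian, Volcengine Ark, SiliconFlow, Qwen, Moonshot…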

Python 9,147 649 Updated Mar 7, 2025

Frontier Multimodal Foundation Models for Image and Video Understanding

Jupyter Notebook 598 38 Updated Mar 7, 2025