xuhzyy

2 followers · 0 following

Achievements

Stars

browser-use / browser-use

Make websites accessible for AI agents

Python 36,647 3,795 Updated Mar 3, 2025

camel-ai / camel

🐫 CAMEL: Finding the Scaling Law of Agents. The first and the best multi-agent framework. https://www.camel-ai.org

Python 7,164 820 Updated Mar 8, 2025

camel-ai / owl

🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation

Python 3,035 288 Updated Mar 8, 2025

mannaandpoem / OpenManus

No fortress, purely open ground. OpenManus is Coming.

Python 12,656 1,845 Updated Mar 8, 2025

baichuan-inc / Baichuan-M1-14B

147 4 Updated Feb 20, 2025

FreedomIntelligence / HuatuoGPT-o1

Medical o1, Towards medical complex reasoning with LLMs

Python 946 96 Updated Jan 20, 2025

FreedomIntelligence / HuatuoGPT

HuatuoGPT, Towards Taming Language Models To Be a Doctor. (An Open Medical GPT)

Python 1,167 152 Updated Dec 16, 2024

FreedomIntelligence / Medical_NLP

Medical NLP Competition, dataset, large models, paper

2,237 418 Updated Dec 6, 2024

FreedomIntelligence / HuatuoGPT-Vision

Medical Multimodal LLMs

Python 251 20 Updated Jan 9, 2025

Liuziyu77 / Visual-RFT

Official repository of "Visual-RFT: Visual Reinforcement Fine-Tuning"

Python 898 32 Updated Mar 6, 2025

holmescao / TOPICTrack

[IEEE TIP] TOPIC: A Parallel Association Paradigm for Multi-Object Tracking under Complex Motions and Diverse Scenes

Python 387 44 Updated Feb 27, 2025

PhoenixZ810 / OmniAlign-V

Official Repository of paper OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference

Python 116 Updated Mar 2, 2025

ChatGPTNextWeb / NextChat

✨ Light and Fast AI Assistant. Support: Web | iOS | MacOS | Android | Linux | Windows

TypeScript 81,723 61,202 Updated Mar 3, 2025

lucasjinreal / Namo-R1

A CPU Realtime VLM in 500M. Surpassed Moondream2 and SmolVLM. Training from scratch with ease.

Python 137 16 Updated Mar 2, 2025

jingyaogong / minimind-v

🚀 [Large Model] Train a 26M-parameter vision multimodal VLM from scratch in just 1 hour!

Python 1,518 168 Updated Feb 23, 2025

jingyaogong / minimind

🚀🚀 [Large Model] Train a small 26M-parameter GPT completely from scratch in just 2 hours!

Python 14,711 1,634 Updated Feb 23, 2025

om-ai-lab / OmAgent

Build multimodal language agents for fast prototype and production

Python 2,169 227 Updated Mar 4, 2025

om-ai-lab / VLM-R1

Solve Visual Understanding with Reinforced VLMs

Python 3,911 241 Updated Mar 7, 2025

Ouxiang-Li / SAFE

[KDD2025] Improving Synthetic Image Detection Towards Generalization: An Image Transformation Perspective

Python 44 5 Updated Feb 23, 2025

agno-agi / agno

Build Multimodal AI Agents with memory, knowledge and tools. Simple, fast and model-agnostic.

Python 20,233 2,691 Updated Mar 7, 2025

dhcode-cpp / X-R1

Minimal-cost training of a 0.5B R1-Zero model

Python 607 79 Updated Feb 26, 2025

Ucas-HaoranWei / GOT-OCR2.0

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Python 7,092 623 Updated Feb 10, 2025

Kwai-YuanQi / TaskGalaxy

Scaling Multi-modal Instruction Fine-tuning with Tens of Thousands Vision Task Types

Python 13 Updated Feb 22, 2025

Ola-Omni / Ola

Ola: Pushing the Frontiers of Omni-Modal Language Model

Python 293 11 Updated Feb 28, 2025

huggingface / open-r1

Fully open reproduction of DeepSeek-R1

Python 22,335 2,002 Updated Mar 7, 2025

FunAudioLLM / InspireMusic

InspireMusic: A Unified Framework for Music, Song, Audio Generation.

Python 943 83 Updated Mar 7, 2025

Deep-Agent / R1-V

Witness the aha moment of VLM with less than $3.

Python 3,083 240 Updated Mar 1, 2025

hiroi-sora / Umi-OCR

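OCR software, free, open-source, and offline. Supports screenshots and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and scanning and generating QR codes. Includes built-in multilingual libraries.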

Python 30,382 3,045 Updated Mar 7, 2025

RockChinQ / LangBot

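😎 Rich ecosystem, 🧩 extensible, 🦄 multimodal: an LLM-native instant messaging bot platform | Works with QQ / WeChat (WeCom, personal WeChat) / Feishu / DingTalk / Discord / Telegram and other messaging platforms | Supports ChatGPT, DeepSeek, Dify, Claude, Gemini, xAI Grok, Ollama, LM Studio, Alibaba Cloud Bailian, Volcano Ark, SiliconFlow, Qwen, Moonshot…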

Python 9,147 649 Updated Mar 7, 2025

DAMO-NLP-SG / VideoLLaMA3

Frontier Multimodal Foundation Models for Image and Video Understanding

Jupyter Notebook 598 38 Updated Mar 7, 2025