zhtsh

🎯 Focusing

Stay hungry, Stay foolish.

14 followers · 5 following

meetsocial
Shanghai, China

Stars

Qihoo360 / Light-R1

Python 98 3 Updated Mar 6, 2025

Liuziyu77 / Visual-RFT

Official repository of "Visual-RFT: Visual Reinforcement Fine-Tuning"

Python 835 30 Updated Mar 6, 2025

deepseek-ai / 3FS

A high-performance distributed file system designed to address the challenges of AI training and inference workloads.

C++ 7,535 648 Updated Mar 6, 2025

deepseek-ai / profile-data

Analyze computation-communication overlap in V3/R1.

886 113 Updated Mar 3, 2025

deepseek-ai / EPLB

Expert Parallelism Load Balancer

Python 1,018 146 Updated Feb 27, 2025

deepseek-ai / DualPipe

A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.

Python 2,504 239 Updated Mar 5, 2025

deepseek-ai / DeepGEMM

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 4,798 464 Updated Mar 5, 2025

Wan-Video / Wan2.1

Wan: Open and Advanced Large-Scale Video Generative Models

Python 7,298 717 Updated Mar 6, 2025

deepseek-ai / DeepEP

DeepEP: an efficient expert-parallel communication library

Cuda 7,025 600 Updated Mar 6, 2025

deepseek-ai / FlashMLA

FlashMLA: Efficient MLA decoding kernels

C++ 11,161 774 Updated Mar 1, 2025

microsoft / OmniParser

A simple screen parsing tool toward a pure-vision-based GUI agent

Jupyter Notebook 19,305 1,546 Updated Feb 23, 2025

browser-use / browser-use

Make websites accessible for AI agents

Python 35,733 3,703 Updated Mar 3, 2025

open-webui / open-webui

User-friendly AI Interface (Supports Ollama, OpenAI API, ...)

JavaScript 81,373 9,771 Updated Mar 6, 2025

ericciarla / trendFinder

Stay on top of trending topics on social media and the web with AI

TypeScript 2,702 290 Updated Feb 17, 2025

unslothai / unsloth

Finetune Llama 3.3, DeepSeek-R1 & Reasoning LLMs 2x faster with 70% less memory! 🦥

Python 33,664 2,355 Updated Mar 6, 2025

hkust-nlp / simpleRL-reason

This is a replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data

Python 3,076 228 Updated Feb 19, 2025

simplescaling / s1

s1: Simple test-time scaling

Python 5,858 669 Updated Mar 6, 2025

ollama / ollama

Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 2, and other large language models.

Go 131,423 10,790 Updated Mar 6, 2025

deepseek-ai / Janus

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 16,601 2,175 Updated Feb 1, 2025

huggingface / open-r1

Fully open reproduction of DeepSeek-R1

Python 22,248 1,996 Updated Mar 6, 2025

bytedance / tarsier

Tarsier: a family of large-scale video-language models designed to generate high-quality video descriptions, with good general video understanding capability.

Python 308 17 Updated Feb 17, 2025

DS4SD / docling

Get your documents ready for gen AI

Python 23,410 1,361 Updated Mar 6, 2025

jgm / pandoc

Universal markup converter

Haskell 36,184 3,453 Updated Mar 6, 2025

opendatalab / MinerU

A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDFs to Markdown and JSON formats.

Python 27,501 2,119 Updated Mar 6, 2025

microsoft / markitdown

Python tool for converting files and office documents to Markdown.

Python 39,579 1,840 Updated Mar 6, 2025

deepseek-ai / DeepSeek-R1

85,224 10,992 Updated Feb 24, 2025

QwenLM / Qwen2.5-Coder

Qwen2.5-Coder is the code version of Qwen2.5, the large language model series developed by Qwen team, Alibaba Cloud.

Python 4,596 367 Updated Mar 3, 2025

QwenLM / Qwen2.5-VL

Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 8,416 591 Updated Mar 4, 2025

QwenLM / Qwen2.5

Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.

Shell 15,909 1,108 Updated Feb 28, 2025

deepseek-ai / DeepSeek-Coder-V2

DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

5,423 808 Updated Sep 24, 2024