zhtsh
🎯 Focusing
  • meetsocial
  • shanghai, china

Python 98 3 Updated Mar 6, 2025

Official repository of "Visual-RFT: Visual Reinforcement Fine-Tuning"

Python 835 30 Updated Mar 6, 2025

A high-performance distributed file system designed to address the challenges of AI training and inference workloads.

C++ 7,535 648 Updated Mar 6, 2025

Analyze computation-communication overlap in V3/R1.

886 113 Updated Mar 3, 2025

Expert Parallelism Load Balancer

Python 1,018 146 Updated Feb 27, 2025

A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.

Python 2,504 239 Updated Mar 5, 2025

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 4,798 464 Updated Mar 5, 2025

Wan: Open and Advanced Large-Scale Video Generative Models

Python 7,298 717 Updated Mar 6, 2025

DeepEP: an efficient expert-parallel communication library

Cuda 7,025 600 Updated Mar 6, 2025

FlashMLA: Efficient MLA decoding kernels

C++ 11,161 774 Updated Mar 1, 2025

A simple screen-parsing tool for pure vision-based GUI agents

Jupyter Notebook 19,305 1,546 Updated Feb 23, 2025

Make websites accessible for AI agents

Python 35,733 3,703 Updated Mar 3, 2025

User-friendly AI Interface (Supports Ollama, OpenAI API, ...)

JavaScript 81,373 9,771 Updated Mar 6, 2025

Stay on top of trending topics on social media and the web with AI

TypeScript 2,702 290 Updated Feb 17, 2025

Finetune Llama 3.3, DeepSeek-R1 & Reasoning LLMs 2x faster with 70% less memory! 🦥

Python 33,664 2,355 Updated Mar 6, 2025

This is a replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data

Python 3,076 228 Updated Feb 19, 2025

s1: Simple test-time scaling

Python 5,858 669 Updated Mar 6, 2025

Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 2, and other large language models.

Go 131,423 10,790 Updated Mar 6, 2025

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 16,601 2,175 Updated Feb 1, 2025

Fully open reproduction of DeepSeek-R1

Python 22,248 1,996 Updated Mar 6, 2025

Tarsier: a family of large-scale video-language models designed to generate high-quality video descriptions, with good general video understanding capability.

Python 308 17 Updated Feb 17, 2025

Get your documents ready for gen AI

Python 23,410 1,361 Updated Mar 6, 2025

Universal markup converter

Haskell 36,184 3,453 Updated Mar 6, 2025

A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON formats.

Python 27,501 2,119 Updated Mar 6, 2025

Python tool for converting files and office documents to Markdown.

Python 39,579 1,840 Updated Mar 6, 2025

Qwen2.5-Coder is the code version of Qwen2.5, the large language model series developed by Qwen team, Alibaba Cloud.

Python 4,596 367 Updated Mar 3, 2025

Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 8,416 591 Updated Mar 4, 2025

Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.

Shell 15,909 1,108 Updated Feb 28, 2025

DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

5,423 808 Updated Sep 24, 2024