Python 139 12 Updated Dec 17, 2024

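A comprehensive DeepSeek resource collection 🔥: DeepSeek usage, prompt guides, application development guides, and a curated resource list. Use DeepSeek better and boost your productivity 10x! 🚀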

90 15 Updated Feb 21, 2025

[COLING 2025]A curated paper list about LLMs for chemistry

32 1 Updated Feb 8, 2025

Conformalized Credal Set Predictors (NeurIPS 2024)

Jupyter Notebook 5 Updated Nov 7, 2024

This is a replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data

Python 3,061 225 Updated Feb 19, 2025

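AIInfra (AI infrastructure) refers to the AI system stack, from low-level hardware such as chips up to the upper-level software stack, that supports training and inference of large AI models.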

Python 1,602 205 Updated Mar 3, 2025

Collecting awesome papers of RAG for AIGC. We propose a taxonomy of RAG foundations, enhancements, and applications in paper "Retrieval-Augmented Generation for AI-Generated Content: A Survey".

1,479 106 Updated Aug 20, 2024

Sky-T1: Train your own O1 preview model within $450

Python 3,060 311 Updated Mar 2, 2025

Finetune Llama 3.3, DeepSeek-R1 & Reasoning LLMs 2x faster with 70% less memory! 🦥

Python 33,486 2,311 Updated Mar 5, 2025

DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought

209 9 Updated Dec 31, 2024

Official PyTorch implementation of paper "CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up".

Python 200 6 Updated Feb 12, 2025

A curated list of awesome LLM Inference-Time Self-Improvement (ITSI, pronounced "itsy") papers from our recent survey: A Survey on Large Language Model Inference-Time Self-Improvement.

70 2 Updated Dec 24, 2024

[CVPR 2025] TinyFusion: Diffusion Transformers Learned Shallow

Python 82 1 Updated Dec 4, 2024

[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…

Jupyter Notebook 6,810 442 Updated Jan 12, 2025

A Survey of Attributions for Large Language Models

196 9 Updated Aug 24, 2024

The first Object-Oriented Programming (OOP) Evaluation Benchmark for LLMs

Python 24 2 Updated Jan 15, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 40,327 6,044 Updated Mar 5, 2025

Official Implementation of ICLR 2024 paper: "Reasoning on Graphs: Faithful and Interpretable Large Language Model Reasoning"

Python 379 47 Updated Mar 5, 2025

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.

Python 7,171 551 Updated Feb 26, 2025

Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 8,372 584 Updated Mar 4, 2025

List of papers on Self-Correction of LLMs.

71 2 Updated Dec 28, 2024

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,539 365 Updated Feb 26, 2025

A bibliography and survey of the papers surrounding o1

TeX 1,176 50 Updated Nov 16, 2024

Large Reasoning Models

Python 800 45 Updated Dec 3, 2024

[NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs.

Python 99 4 Updated Oct 18, 2024

A new dataset of difficult graduate-level applied mathematics problems; evaluations demonstrate that leading LLMs currently exhibit low accuracy in solving these problems.

Jupyter Notebook 13 2 Updated Feb 14, 2025

Repository for the code and dataset for the paper: "Have LLMs Advanced enough? Towards Harder Problem Solving Benchmarks For Large Language Models"

39 3 Updated Dec 18, 2023

Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.

Shell 15,819 1,099 Updated Feb 28, 2025