Stars
veRL: Volcano Engine Reinforcement Learning for LLM
Scalable RL solution for advanced reasoning of language models
A high performance general purpose code execution engine.
Sandboxed code execution for AI agents, locally or on the cloud.
A multi-language code evaluation tool.
Agent framework and applications built upon Qwen>=2.0, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
B-STAR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym
Qwen2.5-Coder is the code version of Qwen2.5, the large language model series developed by Qwen team, Alibaba Cloud.
[ICLR 2024] SWE-bench: Can Language Models Resolve Real-world Github Issues?
800,000 step-level correctness labels on LLM solutions to MATH problems
👨‍💻 An awesome, curated list of the best code LLMs for research.
Recipes to train reward models for RLHF.
A series of math-specific large language models of our Qwen2 series.
Recipes to scale inference-time compute of open models
Code and datasets for "What’s “up” with vision-language models? Investigating their struggle with spatial reasoning".
Reasoning in Large Language Models: Papers and Resources, including Chain-of-Thought and OpenAI o1 🍓
A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON formats.
Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.
A curated list of awesome data labeling tools
Speeds up and simplifies the image labeling/annotation process, with multiple supported formats.
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities (ICML 2024)
Tips for Writing a Research Paper using LaTeX