Stars
A curated collection of resources for LLM mathematical reasoning, most of which are screened by @tongyx361 to ensure high quality and accompanied by elaborately written, concise descriptions to help readers g…
Source code for Self-Evaluation Guided MCTS for online DPO.
Efficient Triton Kernels for LLM Training
State-of-the-art bilingual open-source math reasoning LLMs.
Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
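For a sense of the interface, a minimal sketch of one OpenAI-format call through LiteLLM (assumes `litellm` is installed and the relevant provider key is set in the environment; model name and prompt are placeholders):

```python
# Minimal sketch: one OpenAI-format call routed through LiteLLM.
from litellm import completion

response = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)
```

Switching providers is largely a matter of changing the `model` string.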
High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
depyf is a tool to help you understand and adapt to the PyTorch compiler, torch.compile.
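A hedged sketch of typical usage, assuming `depyf.prepare_debug` is the dump entry point (check the repo's README for the current API; the directory name is arbitrary):

```python
# Sketch: dump torch.compile's generated and decompiled source for inspection.
import torch
import depyf

@torch.compile
def toy(x):
    return x * 2 + 1

with depyf.prepare_debug("./depyf_debug"):
    toy(torch.randn(4))  # readable source for the compiled code lands in ./depyf_debug
```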
QLoRA: Efficient Finetuning of Quantized LLMs
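The core recipe (4-bit NF4 quantization plus LoRA adapters) is reproducible with stock Hugging Face tooling; a minimal sketch, with the model name as a placeholder:

```python
# Sketch: QLoRA-style 4-bit finetuning setup via transformers + peft + bitsandbytes.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # NormalFloat4 quantization from the paper
    bnb_4bit_use_double_quant=True,        # double quantization
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", quantization_config=bnb_config
)
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"))
model.print_trainable_parameters()  # only the LoRA adapters are trainable
```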
Deep learning for dummies. All the practical details and useful utilities that go into working with real models.
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
This repository combines the CPO and SimPO methods for better reference-free preference learning.
A framework for few-shot evaluation of language models.
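A minimal sketch of the harness's Python entry point (the `lm_eval` CLI is the more common route; the model and task here are placeholders):

```python
# Sketch: evaluate a small HF model on one task via lm-evaluation-harness.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=gpt2",
    tasks=["hellaswag"],
)
print(results["results"]["hellaswag"])
```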
Get up and running with Llama 3.3, Mistral, Gemma 2, and other large language models.
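Once a model is pulled, the local Ollama server exposes a REST API (port 11434 by default); a minimal sketch of a non-streaming call, with the model tag as a placeholder:

```python
# Sketch: query a locally running Ollama server.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3.3", "prompt": "Why is the sky blue?", "stream": False},
)
print(resp.json()["response"])
```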
A collection of AWESOME things about mixture-of-experts
Code for STaR: Bootstrapping Reasoning With Reasoning (NeurIPS 2022)
Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
📰 Must-read papers and blogs on Speculative Decoding ⚡️
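For orientation, a toy greedy draft-then-verify step (my own simplified sketch, not code from any listed paper; `draft` and `target` are stand-ins for callables returning next-token logits, batch size 1 assumed):

```python
import torch

def speculative_step(target, draft, ctx, k=4):
    # Cheap phase: the small draft model proposes k tokens autoregressively.
    proposal = ctx.clone()
    for _ in range(k):
        logits = draft(proposal)[:, -1]
        proposal = torch.cat([proposal, logits.argmax(-1, keepdim=True)], dim=1)
    # Expensive phase: one target forward pass scores all drafted positions at once.
    start = ctx.shape[1]
    tgt = target(proposal)[:, start - 1:-1].argmax(-1)  # target's greedy picks
    drafted = proposal[:, start:]
    # Keep the longest prefix where draft and target agree (greedy acceptance).
    n = int((tgt == drafted).long().cumprod(-1).sum())
    # Accept n draft tokens, plus the target's token at the first mismatch (if any).
    return torch.cat([ctx, drafted[:, :n], tgt[:, n:n + 1]], dim=1)
```

When the draft is usually right, each target pass yields several tokens instead of one, which is where the speedup comes from.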
A smart router that switches between GPT-3.5 and GPT-4 based on the difficulty of the query. Aims to reduce cost while keeping performance ≈ GPT-3¾.
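The routing idea itself fits in a few lines; a hypothetical sketch (the `difficulty` scorer is invented here, and the repo's actual heuristic will differ):

```python
# Hypothetical sketch of cost-aware routing; the difficulty score would come
# from a learned or heuristic classifier, which is the interesting part.
from openai import OpenAI

client = OpenAI()

def route(prompt: str, difficulty: float) -> str:
    model = "gpt-4" if difficulty > 0.5 else "gpt-3.5-turbo"
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```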