
Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL

Python 890 53 Updated Mar 4, 2025

Code and Data for Tau-Bench

Python 325 47 Updated Jan 22, 2025

Official Repo for Open-Reasoner-Zero

Python 1,558 73 Updated Mar 5, 2025

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

6,712 197 Updated Mar 4, 2025

Fully open data curation for reasoning models

Python 1,483 126 Updated Feb 23, 2025
Python 490 58 Updated Jan 2, 2025

Fully open reproduction of DeepSeek-R1

Python 22,483 2,016 Updated Mar 10, 2025

A replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data

Python 3,107 229 Updated Feb 19, 2025

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,556 364 Updated Mar 6, 2025

✨✨The Curse of Multi-Modalities (CMM): Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio

Python 41 2 Updated Oct 17, 2024

Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"

Python 162 5 Updated Mar 6, 2025

Long Context Extension and Generalization in LLMs

Python 50 1 Updated Sep 21, 2024

USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference

Python 444 32 Updated Feb 19, 2025

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Python 4,883 518 Updated Mar 7, 2025

Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.

Shell 16,062 1,118 Updated Feb 28, 2025

Suri: Multi-constraint instruction following for long-form text generation (EMNLP’24)

Python 22 Updated Nov 10, 2024

[NeurIPS'24 Spotlight, ICLR'25] To speed up long-context LLM inference, computes attention approximately using dynamic sparsity, which reduces inference latency by up to 10x for pre-filling on an …

Python 931 46 Updated Feb 25, 2025

Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore".

Python 192 17 Updated Mar 8, 2025

The official implementation of "Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks"

Python 53 2 Updated Apr 22, 2024
Python 39 1 Updated Aug 10, 2024

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

Python 1,101 72 Updated Jan 23, 2025

An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)

Python 4,353 330 Updated Mar 9, 2025

Official repo for "Make Your LLM Fully Utilize the Context"

Python 252 20 Updated May 15, 2024

The official Meta Llama 3 GitHub site

Python 28,478 3,308 Updated Jan 26, 2025

[ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning

Python 644 63 Updated Jun 1, 2024

[EMNLP'23, ACL'24] To speed up LLM inference and enhance LLMs' perception of key information, compresses the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.

Python 4,927 278 Updated Jan 26, 2025

📰 Must-read papers and blogs on LLM based Long Context Modeling 🔥

1,308 44 Updated Mar 10, 2025

Reading list on instruction tuning, a trend that started with Natural-Instruction (ACL 2022), FLAN (ICLR 2022), and T0 (ICLR 2022).

765 24 Updated Jul 20, 2023

Code for "Mixed Cross Entropy Loss for Neural Machine Translation"

Python 20 1 Updated Jul 23, 2021