View authurlord's full-sized avatar

Python 883 104 Updated Jan 23, 2025

Inference Code for Paper "Harder Tasks Need More Experts: Dynamic Routing in MoE Models"

Python 32 2 Updated Jul 30, 2024

Training Sparse Autoencoders on Language Models

Jupyter Notebook 614 141 Updated Feb 12, 2025
Python 50 14 Updated Feb 7, 2025

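Welcome to LLM-Dojo, an open-source learning space for large language models. It uses concise, readable code to build a model training framework (supporting mainstream models such as Qwen, Llama, and GLM), an RLHF framework (DPO/CPO/KTO/PPO), and other features. 👩‍🎓👨‍🎓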

Python 496 46 Updated Jan 13, 2025

awesome SAE papers

18 1 Updated Jan 17, 2025

Bringing BERT into modernity via both architecture changes and scaling

Python 1,168 75 Updated Feb 12, 2025

LoRAMoE: Revolutionizing Mixture of Experts for Maintaining World Knowledge in Language Model Alignment

Python 279 23 Updated Apr 29, 2024

An Efficient LLM Fine-Tuning Factory Optimized for MoE PEFT

Python 65 8 Updated Jan 22, 2025
Python 122 7 Updated Jul 22, 2024

Blazingly fast LLM inference.

Rust 4,974 344 Updated Feb 12, 2025

[NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other models

Python 336 22 Updated Sep 6, 2024

[NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other models

Python 1 Updated Mar 18, 2024

X-LoRA: Mixture of LoRA Experts

Python 207 11 Updated Aug 4, 2024

Adapt an LLM model to a Mixture-of-Experts model using Parameter Efficient finetuning (LoRA), injecting the LoRAs in the FFN.

Python 25 Updated Oct 13, 2024

Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks

Python 140 18 Updated Sep 20, 2024

RapidIn: Scalable Influence Estimation for Large Language Models (LLMs). Implementation of the paper "Token-wise Influential Training Data Retrieval for Large Language Models" (accepted at ACL 2024).

Python 12 1 Updated Oct 23, 2024

This repository contains various advanced techniques for Retrieval-Augmented Generation (RAG) systems.

Jupyter Notebook 1,661 169 Updated Jan 31, 2025

Tools for working with Gauss-Newton Hessian in PyTorch

Python 2 1 Updated Sep 9, 2024

A collection of large question answering datasets

356 36 Updated Jul 1, 2024

An Open Large Reasoning Model for Real-World Solutions

Python 1,431 75 Updated Nov 28, 2024

Data for "Datamodels: Predicting Predictions with Training Data"

Python 95 3 Updated May 25, 2023

Influence Functions with (Eigenvalue-corrected) Kronecker-Factored Approximate Curvature

Python 131 12 Updated Jul 31, 2024

An accessibility tool to assist in FFXIV gameplay and compensate for human imperfections.

C# 268 63 Updated Feb 5, 2025

[ICML 2024] Selecting High-Quality Data for Training Language Models

Python 157 12 Updated Jun 20, 2024

AI Logging for Interpretability and Explainability🔬

Python 102 8 Updated Jun 7, 2024

Official repository for MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models [NeurIPS 2024]

Python 58 6 Updated Nov 14, 2024

Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]

Python 533 29 Updated Dec 9, 2024

DSIR large-scale data selection framework for language model training

Python 242 19 Updated Apr 7, 2024