Stars
Awesome-LLM: a curated list of Large Language Model resources
[NeurIPS'24 Spotlight] Observational Scaling Laws
Awesome Lists for Tenure-Track Assistant Professors and PhD students (a survival guide for assistant professors and PhD students)
A quick guide to trending instruction fine-tuning datasets
[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
Research without Re-search: Maximal Update Parametrization Yields Accurate Loss Prediction across Scales
Transformer training code for sequential tasks
Standalone TFRecord reader/writer with PyTorch data loaders
An implementation of GPT-2 training with TPU support
Open Academic Research on Improving LLaMA to SOTA LLM
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)
Example models using DeepSpeed
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
PaddlePaddle large-model development kit, providing end-to-end development toolchains for large language models, cross-modal large models, biocomputing large models, and other domains.
PaddleSlim is an open-source library for deep model compression and architecture search.
🎁[ChatGPT4MT] Towards Making the Most of ChatGPT for Machine Translation
🎁[ChatGPT4MTevaluation] Error Analysis Prompt for MT Evaluation in ChatGPT
🎁[ChatGPT4NLU] A Comparative Study on ChatGPT and Fine-tuned BERT
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable), so it combines the best of the RNN and the transformer: great performance, fast inference,…
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
A procedural Blender pipeline for photorealistic training image generation