This repository collects papers for "A Survey on Knowledge Distillation of Large Language Models". We break down KD into Knowledge Elicitation and Distillation Algorithms, and explore the Skill & V…
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
Chinese named entity recognition using GlobalPointer, based on PyTorch.
Transformer: PyTorch Implementation of "Attention Is All You Need"
TensorFlow solution for the NER task, using a BiLSTM-CRF model with Google BERT fine-tuning and private server services.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Official implementations for various pre-training models of ERNIE-family, covering topics of Language Understanding & Generation, Multimodal Understanding & Generation, and beyond.
An example, built on SSD, for generating the txt files used by the mAP plotting code. (Its purpose is to generate the txt files.)
Pre-Training with Whole Word Masking for Chinese BERT (Chinese BERT-wwm model series)
A PyTorch implementation of "Attention Is All You Need"
🦜🔗 Build context-aware reasoning applications
Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sourced the model, training data, evaluation data, evaluation me…
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
Example models using DeepSpeed
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
🩹Editing large language models within 10 seconds⚡
Fast and memory-efficient exact attention
Reference implementation for DPO (Direct Preference Optimization)
Code for the paper "Evaluating Large Language Models Trained on Code"
Aligning pretrained language models with instruction data generated by the models themselves.
A large-scale 7B pretrained language model developed by BaiChuan-Inc.
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
Code for the paper "Fine-Tuning Language Models from Human Preferences"