Stars
MTEB: Massive Text Embedding Benchmark
[ECCV 2024] official code for "Long-CLIP: Unlocking the Long-Text Capability of CLIP"
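MNBVC (Massive Never-ending BT Vast Chinese corpus): a very large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) data. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.
An introductory LLM tutorial for developers: the Chinese version of Andrew Ng's large language model course series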
MindSpore online courses: Step into LLM
QLoRA: Efficient Finetuning of Quantized LLMs
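Wenda (闻达): an LLM invocation platform. It targets efficient content generation for specific environments, while taking into account the limited computing resources of individuals and small-to-medium enterprises, as well as knowledge security and privacy concerns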
TechGPT: Technology-Oriented Generative Pretrained Transformer
Survey: A collection of AWESOME papers and resources on the large language model (LLM) related recommender system topics.
A list of awesome papers and resources on recommender systems with large language models (LLMs).
TensorFlow code and pre-trained models for BERT
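A collection of open-source Chinese large language models, focusing on smaller-scale models that can be privately deployed and are cheaper to train, including base models, vertical-domain fine-tuning and applications, datasets, and tutorials.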
The official GitHub page for the survey paper "A Survey of Large Language Models".
A curated selection of 10K+ projects, including machine learning, deep learning, NLP, GNN, recommender systems, biomedicine, machine vision, front-end and back-end development, etc.…
A PyTorch implementation of ICLR 2021 paper: Learnable Embedding Sizes for Recommender Systems
An open-source framework for self-supervised recommender systems.
Must-read papers on prompt-based tuning for pre-trained language models.
This repo curates ChatGPT prompts to help you use ChatGPT better.
Awesome-LLM: a curated list of Large Language Models
CIKM2023 Best Demo Paper Award. HugNLP is a unified and comprehensive NLP library based on HuggingFace Transformer. Please hugging for NLP now!😊
Source code for Twitter's Recommendation Algorithm
An open-source tool-augmented conversational language model from Fudan University
Code and documentation to train Stanford's Alpaca models, and generate the data.