Stars
Hackable and optimized Transformers building blocks, supporting a composable construction.
MNBVC (Massive Never-ending BT Vast Chinese corpus): an ultra-large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. MNBVC covers not only mainstream culture but also niche subcultures and even "Martian script" text. It includes Chinese plain text of every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki articles, classical poetry, lyrics, product descriptions, jokes, embarrassing stories, chat logs, and more.
A playbook for systematically maximizing the performance of deep learning models.
A practical interactive interface for LLMs such as GPT/GLM, specially optimized for paper reading/polishing/writing. Modular design with customizable shortcut buttons & function plugins; supports analysis & self-translation of Python, C++ and other projects; PDF/LaTeX paper translation & summarization; parallel queries to multiple LLMs; and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, m…
Making large AI models cheaper, faster and more accessible
Stable Diffusion web UI
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data.
A Large-scale Chinese Short-Text Conversation Dataset and Chinese pre-training dialog models
Chinese NER(Named Entity Recognition) using BERT(Softmax, CRF, Span)
This repository is the official implementation of our paper MVP: Multi-task Supervised Pre-training for Natural Language Generation.
👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting a wide range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search…
A curated list of awesome papers on dataset distillation and related applications.
Code for the paper "How to Fine-Tune BERT for Text Classification?"
PyTorch-based Chinese semantic similarity matching models (ABCNN, Albert, Bert, BIMPM, DecomposableAttention, DistilBert, ESIM, RE2, Roberta, SiaGRU, XlNet)
A curated list of resources for Learning with Noisy Labels
An NLP toolset with a focus on explainable inference
3,000,000+ examples of semantic understanding and matching data, usable for unsupervised contrastive learning, semi-supervised learning, etc., to build the best-performing Chinese pre-trained models.
A recap of top solutions from all NLP competitions. Focused exclusively on NLP competitions; continuously updated!
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Model explainability that works seamlessly with 🤗 transformers. Explain your transformers model in just 2 lines of code.
A PyTorch implementation of SmoothGrad [https://arxiv.org/pdf/1706.03825.pdf] and Integrated Gradients [https://arxiv.org/pdf/1703.01365.pdf] for NLP models.
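As a rough illustration of what the Integrated Gradients method computes (not the linked repo's actual code): attributions are the input-minus-baseline difference scaled by the average gradient along the straight-line path from baseline to input. A minimal NumPy sketch, with the toy function and all names being illustrative assumptions:

```python
import numpy as np

def integrated_gradients(grad_f, x, baseline, steps=100):
    """Approximate Integrated Gradients via the midpoint Riemann sum:
    IG_i = (x_i - baseline_i) * mean over the path of dF/dx_i."""
    alphas = (np.arange(steps) + 0.5) / steps  # midpoints of [0, 1]
    total = np.zeros_like(x, dtype=float)
    for a in alphas:
        # Gradient evaluated at an interpolated point on the path.
        total += grad_f(baseline + a * (x - baseline))
    return (x - baseline) * total / steps

# Toy check with f(x) = sum(x**2), whose gradient is 2x. By the
# completeness property, attributions sum to f(x) - f(baseline).
f = lambda x: np.sum(x ** 2)
grad_f = lambda x: 2 * x
x = np.array([1.0, 2.0])
baseline = np.zeros_like(x)
attr = integrated_gradients(grad_f, x, baseline)
```

For real models the hand-written `grad_f` would be replaced by autograd (e.g. PyTorch's `backward()`), but the path integral itself is exactly this loop.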