Stars
A reading list on the safety, security, and privacy of large models (including Awesome LLM Security, Safety, etc.).
A simple and beautiful Vue chat component: backend-agnostic, fully customisable, and extendable.
Robust Speech Recognition via Large-Scale Weak Supervision
Code and documentation to train Stanford's Alpaca models, and generate the data.
Trankit is a Light-Weight Transformer-based Python Toolkit for Multilingual Natural Language Processing
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Repository that accompanies "An Evaluation Dataset for Intent Classification and Out-of-Scope Prediction" (EMNLP 2019)
A simple tool to update bib entries with their official information (e.g., DBLP or the ACL anthology).
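A Chinese character decomposition library that breaks characters into their radicals and components, for use as glyph features of Chinese characters in machine learning.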
💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
Official implementation of the papers "GECToR – Grammatical Error Correction: Tag, Not Rewrite" (BEA-20) and "Text Simplification by Tagging" (BEA-21)
Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.
Large Scale Chinese Corpus for NLP
A very simple framework for state-of-the-art Natural Language Processing (NLP)
An Emacs configuration bundle with batteries included
Unsupervised text tokenizer for Neural Network-based text generation.
SymSpell: 1 million times faster spelling correction & fuzzy search through the Symmetric Delete spelling correction algorithm
Modern spell checking library - accurate, fast, multi-language
Team Kakao&Brain's Grammatical Error Correction System for the ACL 2019 BEA Shared Task
TensorFlow code and pre-trained models for BERT
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.