Data and software for building the ACL Anthology.
EMNLP 2023 Papers: Explore cutting-edge research from EMNLP 2023, the premier conference for advancing empirical methods in natural language processing. Stay updated on the latest in machine learni…
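A chatbot built on large language models that supports integration with WeChat Official Accounts, WeCom (Enterprise WeChat) apps, Feishu, DingTalk, and more. It can use GPT3.5/GPT-4o/GPT-o1/Claude/ERNIE Bot/iFlytek Spark/Tongyi Qianwen/Gemini/GLM-4/Kimi/LinkAI, handles text, voice, and images, can access the operating system and the internet, and supports building customized enterprise customer-service bots on top of your own knowledge base.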
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…
Ancient Chinese Corpus with Word Sense Annotation
A program for recognizing metaphors in Chinese sentences, which could help with grading articles.
GuwenBERT: A Pre-trained Language Model for Classical Chinese (Literary Chinese)
This is a 25,000-word UD treebank of Old English. The text has been retrieved from Martín Arista, Javier (ed.), et al. 2023. ParCorOEv3. The treebank is a revised version o…
This is an SQL file of the Oxford English Dictionary. It includes more than 41,000 words. Just import the SQL file.
Scrape article metadata from major media outlets' websites, including NYT, WaPo, WSJ. Built on top of the Newspaper Python Library.
The Washington Post Scraper is an application that allows the user to scrape articles from the Washington Post website and save a reference to them.
A news crawler for BBC News, Reuters and New York Times.
Data journalism research project: women through the lens of the New York Times from 1950 to the present day.
This is a code example repo for the NLP course offered by the Institute of Chinese Information Processing of BNU.
A TensorFlow solution for the NER task using a BiLSTM-CRF model with Google BERT fine-tuning, plus private server services.
100+ Pre-trained Chinese Word Vectors
Natural Language Processing for the next decade. Tokenization, Part-of-Speech Tagging, Named Entity Recognition, Syntactic & Semantic Dependency Parsing, Document Classification
GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)
Jiayan (甲言), the first NLP toolkit designed for Classical Chinese (also known as Ancient or Literary Chinese), supports lexicon construction, tokenization, POS tagging, sentence segmentation, and punctuation.
A Chinese sentiment analysis library (Chinese Sentiment) that performs emotion analysis and positive/negative sentiment analysis on text, and supports counting the number of different emotional words in the text.