Stars
AI-based tool for removing hard-coded subtitles and text-like watermarks from images and videos. Generates the cleaned image/video file at the original resolution. Runs locally; no third-party API required.
A Corpus for Chinese Literary Grace Evaluation (literary grace level, figure-of-speech, sentence category)
WebGLM: An Efficient Web-enhanced Question Answering System (KDD 2023)
BELLE: Be Everyone's Large Language model Engine (open-source Chinese conversational large language model)
silverriver / ChatGLM-6B-Slim
Forked from THUDM/ChatGLM-6B
ChatGLM-6B-Slim: ChatGLM-6B with the 20K image tokens removed; exactly the same performance, with lower GPU memory usage.
ChatRWKV is like ChatGPT but powered by RWKV (100% RNN) language model, and open source.
Code release for "TempLM: Distilling Language Models into Template-Based Generators"
Repository containing all source code and documentation of iLFQA: intelligent Longform Question Answering
pix2tex: Using a ViT to convert images of equations into LaTeX code.
Title and keywords are used to generate text.
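Natural language processing (NLP) tutorial, covering: word embeddings, lexical analysis, pre-trained language models, text classification, semantic text matching, information extraction, translation, and dialogue.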
Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.
A flask-based Docker image providing a deep-learning long-form question answering service
Sentiment Analysis, Text Classification, Text Augmentation, Text Adversarial defense, etc.
Code for ACL'20 paper "Multi-Granularity Interaction Network for Extractive and Abstractive Multi-Document Summarization".
The official code for PRIMERA: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization
Large-scale multi-document summarization dataset and code