Stars
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Fess is a very powerful and easily deployable enterprise search server.
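Focuses on several core problems in natural language processing: lexical analysis, syntactic analysis, semantic analysis, language detection, information extraction, text clustering, and text classification. Provides a complete general-purpose design and reference implementation for developers in related fields. Covers a variety of natural language processing algorithms and adapts to multiple natural language processing frameworks. Compatible with Lucene/Solr/ElasticSearch plugins.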
THULAC analysis plugin for Elasticsearch
BosonNLP Analysis for ElasticSearch
Elasticsearch with T5/BERT/other models provided by Hugging Face Transformers.
Pre-Training with Whole Word Masking for Chinese BERT (Chinese BERT-wwm model series)
The pkuseg toolkit for multi-domain Chinese word segmentation
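Chinese and English sensitive words, language detection, Chinese and international mobile/landline number location and carrier lookup, gender inference from names, mobile number extraction, ID card number extraction, email extraction, Chinese and Japanese name databases, Chinese abbreviation database, character decomposition dictionary, word sentiment values, stop words, reactionary word list, violence and terrorism word list, Traditional/Simplified Chinese conversion, English approximation of Chinese pronunciation, Wang Feng lyrics generator, occupation name lexicon, synonym lexicon, antonym lexicon, negation word lexicon, car brand lexicon, car parts lexicon, continuous English text segmentation, various Chinese word vectors, company name collection, classical poetry corpus, IT lexicon, finance lexicon, idiom lexicon, place name lexicon, …
100+ pre-trained Chinese Word Vectors
Chinese personal name corpus. Name generator. Chinese full names, surnames, given names, forms of address, Japanese names, transliterated names, English names. Useful for Chinese word segmentation and person-name entity recognition.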
An image similarity search engine built on top of Lire. Images can be filtered with a keyword query (Chinese is supported) and are then ranked by visual similarity. This engine provides an easy …
ChainSQL: the collaboration of blockchain and database
A small Python-GTK application that helps the user merge or split PDF documents and rotate, crop, and rearrange their pages using an interactive and intuitive graphical interface.
PDFsam, a desktop application to split, merge, mix, rotate PDF files and extract pages
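Company name corpus. Organization name corpus. Company short names, abbreviations, brand terms, enterprise names. Useful for Chinese word segmentation and organization-name entity recognition.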
An open source engine for license management on the Java Virtual Machine.
A reverse image search engine powered by Elasticsearch and TensorFlow
🎇 Quickly search over billions of images
Convert Word documents to simple and clean HTML
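Ansj word segmentation. A true Java implementation of ICT. Both segmentation quality and speed exceed those of the open-source version of ICT. Chinese word segmentation, person-name recognition, part-of-speech tagging, user-defined dictionaries.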
Addon to provide a set of common content store implementations and easy-to-use configuration (no Spring config)