Text-Related
Popular repositories
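- WeChat800 Public Forked from BonnieCabrera/WeChat800
The foreigner who previously exposed a plaintext database of WeChat chat records analyzed 3.87 billion WeChat messages and compiled the keywords that trigger monitoring. The analysis also reports that 98% of WeChat conversations carry GPS information.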
- sensitive-stop-words Public Forked from fwwdn/sensitive-stop-words
A lexicon of sensitive words and stop words commonly used on the internet.
- sensitivewd-filter Public Forked from andyzty/sensitivewd-filter
Sensitive-word filtering and ad-word filtering; includes a sensitive-word lexicon and a stop-word lexicon.
Java
- reference-materials-for-review Public
reference-materials-for-review-2019
C++
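- Chinese-Names-Corpus Public Forked from wainshine/Chinese-Names-Corpus
Chinese personal-name corpus. Chinese full names, surnames, given names, forms of address, Japanese names, transliterated names, and English names. Useful for Chinese word segmentation and person-name entity recognition.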
Repositories
- nsfw_data_scraper Public Forked from alex000kim/nsfw_data_scraper
Collection of scripts to aggregate image data for the purposes of training an NSFW Image Classifier
- funNLP Public Forked from fighting41love/funNLP
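Chinese and English sensitive words, language detection, location and carrier lookup for Chinese and foreign mobile/phone numbers, gender inference from names, mobile number extraction, ID card number extraction, email extraction, Chinese and Japanese name databases, Chinese abbreviation database, character decomposition dictionary, word sentiment values, stop words, reactionary word list, violence and terrorism word list, Traditional/Simplified Chinese conversion, English approximating Chinese pronunciation, Wang Feng lyrics generator, job title lexicon, synonym lexicon, antonym lexicon, negation word lexicon, car brand lexicon, car parts lexicon, segmentation of run-together English text, various Chinese word vectors, comprehensive company name list, classical poetry corpus, IT lexicon, finance lexicon, idiom lexicon, place name lexicon, historical figure lexicon, poetry lexicon, medical lexicon, food and drink lexicon, legal lexicon, automotive lexicon, animal lexicon, Chinese chat corpus, Chinese rumor data, Baidu Chinese question-answering dataset, collection of sentence similarity matching algorithms, BERT resources, text generation and summarization tools, cocoNLP information extraction tool, regex matching for Chinese domestic phone numbers, Tsinghua University XLORE: Chinese-English cross-lingual encyclopedic knowledge graph, Tsinghua University artificial intelligence technology…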
- Company-Names-Corpus Public Forked from wainshine/Company-Names-Corpus
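Company-name corpus. Organization-name corpus. Company short names, abbreviations, brand terms, and enterprise names. Useful for Chinese word segmentation and organization-name entity recognition.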
- WeChat800 Public Forked from BonnieCabrera/WeChat800
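The foreigner who previously exposed a plaintext database of WeChat chat records analyzed 3.87 billion WeChat messages and compiled the keywords that trigger monitoring. The analysis also reports that 98% of WeChat conversations carry GPS information.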
- Chinese-Names-Corpus Public Forked from wainshine/Chinese-Names-Corpus
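Chinese personal-name corpus. Chinese full names, surnames, given names, forms of address, Japanese names, transliterated names, and English names. Useful for Chinese word segmentation and person-name entity recognition.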
- The-Big-Username-Blacklist Public Forked from marteinn/The-Big-Username-Blocklist
This is an opinionated blacklist of words that you might not like to see used as usernames in your service.
- Chinese-Word-Vectors Public Forked from Embedding/Chinese-Word-Vectors
100+ Chinese Word Vectors (over a hundred pre-trained Chinese word vectors)