JackLin24

4 followers · 3 following

Stars

29 stars written in Python

hankcs / HanLP

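Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyphrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing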

Python 34,326 10,320 Updated Jan 15, 2025

fxsjy / jieba

Jieba Chinese word segmentation

Python 33,610 6,727 Updated Aug 21, 2024

jhao104 / proxy_pool

Python ProxyPool for web spider

Python 21,890 5,225 Updated Sep 10, 2024

piskvorky / gensim

Topic Modelling for Humans

Python 15,796 4,387 Updated Dec 18, 2024

Embedding / Chinese-Word-Vectors

100+ Chinese Word Vectors: over a hundred pre-trained Chinese word vectors

Python 11,913 2,325 Updated Oct 30, 2023

dataabc / weiboSpider

Sina Weibo crawler: scrape Sina Weibo data with Python

Python 8,584 1,996 Updated Jan 13, 2025

chyroc / WechatSogou

WeChat official account crawler interface based on Sogou WeChat search

Python 5,968 1,713 Updated Nov 15, 2023

SpiderClub / haipproxy

💖 Highly available distributed IP proxy pool, powered by Scrapy and Redis

Python 5,453 912 Updated Dec 26, 2022

wepe / MachineLearning

Basic Machine Learning and Deep Learning

Python 5,288 3,177 Updated Jun 15, 2024

HIT-SCIR / ltp

Language Technology Platform

Python 5,017 1,047 Updated Jan 1, 2025

dataabc / weibo-crawler

Sina Weibo crawler: scrape Sina Weibo data with Python and download Weibo images and videos

Python 3,593 784 Updated Jan 4, 2025

LiuXingMing / SinaSpider

Sina Weibo crawler (Scrapy, Redis)

Python 3,269 1,518 Updated Sep 5, 2018

dataabc / weibo-search

Retrieve Weibo search results; the search can be either a Weibo keyword search or a Weibo topic search

Python 1,824 387 Updated Jan 4, 2025

Python3Spiders / WeiboSuperSpider

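Weibo crawler and companion toolbox covering collection of Weibo users, topics, and comments in one place. Includes image download, sentiment analysis, geolocation, relationship networks, spammer bot detection, and more. Docs: https://buyixiao.github.io/blog/weibo-super-spider.html Companion visualization site: https://buyixiao.github.io/blog/one-stop-weibo-visualizatio…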

Python 1,626 332 Updated Apr 23, 2023

benitoro / stockholm

A crawler for stock data (Shanghai and Shenzhen markets) and a testing framework for stock-picking strategies

Python 1,398 623 Updated Aug 14, 2020

zhezhaoa / ngram2vec

Four word embedding models implemented in Python, supporting arbitrary context features

Python 847 174 Updated Aug 22, 2019

stay-leave / weibo-public-opinion-analysis

A public opinion analysis project based on Weibo data, including a Weibo crawler, LDA topic analysis, and sentiment analysis.

Python 727 110 Updated Dec 8, 2024

lanbing510 / LianJiaSpider

Lianjia crawler

Python 679 455 Updated Apr 6, 2016

yanzhou / CnkiSpider

CNKI (China National Knowledge Infrastructure) crawler

Python 547 301 Updated Aug 28, 2015

andyzsf / TuShare

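TuShare is a tool that handles financial data such as stocks and futures from data collection through cleaning and processing to data storage. It meets the data acquisition needs of quantitative financial analysts and people learning data analysis. Its features are broad data coverage, simple API calls, and fast responses.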

Python 445 142 Updated Feb 29, 2016

RitterHou / music-163

Scrape the comment counts of all songs on NetEase Cloud Music

Python 351 230 Updated Feb 16, 2017

CarltonHere / auto-cpdaily

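今日校园 (Cpdaily) Automation is a Python-based crawler project that mainly automates recurring form tasks on Cpdaily such as check-in, information collection, and dormitory checks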

Python 318 68 Updated Aug 23, 2022

pakoo / tbcrawler

Taobao and Tmall product crawler

Python 239 205 Updated Oct 9, 2013

Qutan / Spider

Social data crawler

Python 214 130 Updated Oct 11, 2016

dataabc / weibo-follow

Scrape the Weibo posts of the accounts in a following list

Python 185 52 Updated May 21, 2024

simapple / spider

test

Python 163 132 Updated Feb 4, 2023

XWang20 / WeiboCrawler

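Cookie-free Weibo crawler that can continuously scrape the profile information, posts, and post comments and reposts of one or more Sina Weibo users.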

Python 155 26 Updated Apr 8, 2022

dataabc / weiboPR

Use Python to assess the influence of Weibo users

Python 52 17 Updated Mar 27, 2016

waykom / weibo_top

Scrape Weibo trending searches

Python 7 2 Updated Jun 18, 2023