Update README.md
R2h1 authored Jul 23, 2019
1 parent 05e6a35 commit cb9b720
Showing 1 changed file with 9 additions and 1 deletion.
10 changes: 9 additions & 1 deletion README.md
@@ -1,2 +1,10 @@
# mallspider
JD full-catalog product spider
A distributed crawler built with scrapy, redis, and mongodb: mongodb is the underlying data store, and redis provides the distribution.
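
One common way to wire scrapy to redis is the scrapy-redis package. The README does not say which approach this project takes, so the fragment below is only a sketch under that assumption. The Redis URL is a placeholder, and `MONGO_URI` and `MONGO_DATABASE` are hypothetical setting names that a custom MongoDB pipeline would read.

```python
# settings.py (sketch; assumes scrapy-redis, which the README does not name)

# Share the request queue and the duplicate filter across workers through Redis.
SCHEDULER = "scrapy_redis.scheduler.Scheduler"
DUPEFILTER_CLASS = "scrapy_redis.dupefilter.RFPDupeFilter"

# Keep the Redis queue between runs so a crawl can be paused and resumed.
SCHEDULER_PERSIST = True

# Placeholder connection string; point it at the shared Redis instance.
REDIS_URL = "redis://localhost:6379/0"

# Hypothetical names for a custom MongoDB item pipeline.
MONGO_URI = "mongodb://localhost:27017"
MONGO_DATABASE = "mallspider"
```

With settings like these, every worker process pulls requests from the same Redis queue, which is what makes the crawl distributed.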

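Targets the https://www.jd.com/2019 site, covering the category information on its home page (the name and URL of each category level) and the product detail information (product name, price, number of reviews, shop, promotions, options, and image URLs).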
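
The fields listed above could be modeled roughly as follows. This is an illustrative sketch, not the project's actual item definitions: the class names, field names, and types are all assumptions.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class Category:
    """One category entry from the home page (hypothetical shape)."""
    name: str
    url: str
    level: int  # 1 for top-level, 2 for the next level down, and so on


@dataclass
class Product:
    """Product detail fields named in the README (hypothetical names)."""
    name: str
    price: str            # kept as text to avoid float rounding
    comment_count: int
    shop: str
    promotions: List[str] = field(default_factory=list)
    options: List[str] = field(default_factory=list)
    image_urls: List[str] = field(default_factory=list)


# asdict() yields a plain dict, a convenient shape for a MongoDB document.
example = asdict(Product(name="demo", price="9.90", comment_count=0, shop="demo shop"))
```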

Strategies to keep the crawler from being banned:
Implement a random User-Agent downloader middleware
Implement a proxy IP middleware
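
The two middlewares above might look something like the sketch below. It is not the project's code: the User-Agent strings and proxy addresses are placeholder values, and the class names are hypothetical. The sketch relies only on two scrapy conventions: a downloader middleware exposes `process_request(request, spider)`, and the built-in proxy support reads the proxy address from `request.meta["proxy"]`.

```python
import random

# Placeholder pools; a real deployment would load these from settings
# or from a proxy provider.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0",
]
PROXIES = ["http://127.0.0.1:8888", "http://127.0.0.1:8889"]


class RandomUserAgentMiddleware:
    """Set a randomly chosen User-Agent header on every outgoing request."""

    def process_request(self, request, spider):
        request.headers["User-Agent"] = random.choice(USER_AGENTS)
        return None  # None tells scrapy to keep processing the request


class RandomProxyMiddleware:
    """Route every outgoing request through a randomly chosen proxy."""

    def process_request(self, request, spider):
        request.meta["proxy"] = random.choice(PROXIES)
        return None
```

To take effect, both classes would need to be registered in the `DOWNLOADER_MIDDLEWARES` setting under whatever module path the project uses.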

