Merge branch 'master' of github.com:yaochenkun/EnterpriseInformationSpider
yaochenkun committed Jan 26, 2017
2 parents 5a9e687 + 627a630 commit d1e27e7
Showing 1 changed file with 18 additions and 1 deletion.
19 changes: 18 additions & 1 deletion README.md
@@ -1,2 +1,19 @@
# EnterpriseInformationSpider
A spider program that could collect all the enterprises' information in China from 'Qichacha' website.
A spider program that collects the basic registration information of all enterprises in China from the '[Qichacha](http://www.qichacha.com/)' (企查查) website.

## Screenshot
![](http://yaochenkun.cn/wordpress/wp-content/uploads/2017/01/33123.png)
![](http://yaochenkun.cn/wordpress/wp-content/uploads/2017/01/12.png)

## Dependencies
Before running the 'qichacha_spider.py' script, make sure the following libraries are installed on your system:

1. __requests__: an HTTP library for Python, built on urllib3, that provides a simple API for sending HTTP requests and handling URLs.
2. __BeautifulSoup__: a Python library for parsing HTML pages and navigating the resulting document tree, for example with CSS selectors.
3. __xlrd, xlwt, xlutils.copy__: libraries for reading and writing Excel files (xlrd reads workbooks, xlwt writes them, and xlutils.copy turns a workbook read with xlrd into one that xlwt can modify).

A good way to install these libraries is to install 'pip' first and then install each library with pip commands.
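
To make the installation step concrete, here is a minimal sketch that checks which of the dependencies listed above are missing and prints the pip command that would install them. The helper names `missing_packages` and `pip_command` are hypothetical and not part of this repository. The PyPI package names are assumptions too: in particular, the README does not say which BeautifulSoup release the script imports, so `bs4` / `beautifulsoup4` is a guess.

```python
import importlib.util
import sys

# Assumed mapping from import name to PyPI package name. The README does not
# state which BeautifulSoup release the script uses, so "bs4" is a guess.
REQUIRED = {
    "requests": "requests",
    "bs4": "beautifulsoup4",
    "xlrd": "xlrd",
    "xlwt": "xlwt",
    "xlutils": "xlutils",
}


def missing_packages(required=REQUIRED):
    """Return the PyPI names of packages whose import name cannot be found."""
    return [
        package
        for module, package in required.items()
        if importlib.util.find_spec(module) is None
    ]


def pip_command(packages):
    """Build the pip command line that would install the given packages."""
    return [sys.executable, "-m", "pip", "install", *packages]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing:", ", ".join(missing))
        print("Install with:", " ".join(pip_command(missing)))
    else:
        print("All dependencies are installed.")
```

The sketch only prints the command instead of running it, so nothing is installed without your say-so. Calling pip through `sys.executable -m pip` targets the same interpreter that will run the spider.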

## For More
For more detailed information about this spider program, see the user guide '企查爬虫软件使用指南.pdf' in this repository.
