This is a simple project that consists of the following files:
- article_collector.py: the main script; it crawls WeChat official account articles and saves them as Microsoft Word documents
- add_hyperlinks.py: adds hyperlinks to the Word documents
- gzh.txt: the list of WeChat official accounts to crawl
- 比心.JPG: a just-for-fun picture; you can ignore it

Dependencies:
- Python 3
- WechatSogou
- python-docx

Note: for a step-by-step walkthrough, follow the WeChat official account Alfred数据室 (Alfred_Lab) and read the article 《50行代码教你打造一个公众号文章采集器》 ("Build a WeChat Official Account Article Collector in 50 Lines of Code").
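
To give a rough idea of how the pieces fit together, here is a minimal sketch of reading the account list and writing article links into a Word file. It is not the code from article_collector.py or add_hyperlinks.py. It assumes gzh.txt holds one account name per line, and that each article is a dict with `title` and `content_url` keys; the function names `load_accounts`, `add_hyperlink` and `save_articles` are hypothetical. Fetching the articles through WechatSogou is left out on purpose.

```python
from pathlib import Path


def load_accounts(path):
    """Return the account names from a gzh.txt-style file (assumed: one per line)."""
    names = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        name = line.strip()
        if name and name not in names:  # skip blank lines and duplicates
            names.append(name)
    return names


def add_hyperlink(paragraph, text, url):
    """Append a clickable external link to a python-docx paragraph."""
    # Imported lazily so load_accounts() also works without python-docx installed.
    from docx.opc.constants import RELATIONSHIP_TYPE
    from docx.oxml.ns import qn
    from docx.oxml.shared import OxmlElement

    # Register the URL as an external relationship of the document part.
    r_id = paragraph.part.relate_to(url, RELATIONSHIP_TYPE.HYPERLINK, is_external=True)

    # Build <w:hyperlink r:id="..."><w:r><w:t>text</w:t></w:r></w:hyperlink>.
    hyperlink = OxmlElement("w:hyperlink")
    hyperlink.set(qn("r:id"), r_id)
    run = OxmlElement("w:r")
    text_element = OxmlElement("w:t")
    text_element.text = text
    run.append(text_element)
    hyperlink.append(run)
    paragraph._p.append(hyperlink)  # the link is clickable but not styled blue


def save_articles(account, articles, out_dir="."):
    """Write one Word file per account, with one linked title per article."""
    from docx import Document

    doc = Document()
    doc.add_heading(account, level=1)
    for article in articles:  # assumed shape: {"title": ..., "content_url": ...}
        paragraph = doc.add_paragraph()
        add_hyperlink(paragraph, article["title"], article["content_url"])
    out_path = Path(out_dir) / f"{account}.docx"
    doc.save(str(out_path))
    return out_path
```

With python-docx installed, calling `save_articles(name, articles)` for each name returned by `load_accounts("gzh.txt")` would produce one .docx file per account in the current directory.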