Commit

[update] doc
jhao104 committed Aug 14, 2019
1 parent a418306 commit c3dd617
Showing 2 changed files with 21 additions and 28 deletions.
42 changes: 14 additions & 28 deletions README.md
@@ -47,7 +47,7 @@ pip install -r requirements.txt
# Configure the DB
DATABASES = {
    "default": {
        "TYPE": "SSDB",        # set this to SSDB whether you use SSDB or redis
        "TYPE": "SSDB",        # SSDB and redis are currently supported
        "HOST": "127.0.0.1",   # db host
        "PORT": 8888,          # db port; SSDB typically uses 8888, redis defaults to 6379
        "NAME": "proxy",       # default setting
@@ -60,8 +60,8 @@ DATABASES = {
# Configure ProxyGetter

PROXY_GETTER = [
    "freeProxyFirst",  # names of the enabled proxy-fetching functions; add more in ProxyGetter/getFreeProxy.py
    "freeProxySecond",
    "freeProxy01",  # names of the enabled proxy-fetching functions; add more in ProxyGetter/getFreeProxy.py
    "freeProxy02",
    ....
]

@@ -80,32 +80,18 @@ SERVER_API = {
* Start:

```shell
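# If your dependencies are installed and the environment is ready, you can run main.py directly from Run
# In the Run directory:
>>>python main.py
# If your dependencies are installed and the environment is ready, you can start the program from the CLI through proxyPool.py
# The program has two parts: schedule (the scheduler) and webserver (the API service)

# If it started successfully you should see 4 main.py processes
# Start the scheduler first
>>>python proxyPool.py schedule

# You can also run them separately:
# start ProxyApi.py under Api, then ProxyRefreshSchedule.py and ProxyValidSchedule.py under Schedule.
```
# Then start the web API service
>>>python proxyPool.py webserver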

* Production environment: Docker/docker-compose

```shell
# Workdir proxy_pool
docker build -t proxy_pool .
pip install docker-compose
docker-compose -f docker-compose.yml up -d
```

* Development environment: Docker

```shell
# Workdir proxy_pool
docker build -t proxy_pool .
docker run -it --rm -v $(pwd):/usr/src/app -p 5010:5010 proxy_pool
```

### Usage

@@ -131,7 +117,7 @@ docker run -it --rm -v $(pwd):/usr/src/app -p 5010:5010 proxy_pool
import requests

def get_proxy():
    return requests.get("http://127.0.0.1:5010/get/").content
    return requests.get("http://127.0.0.1:5010/get/").json()

def delete_proxy(proxy):
    requests.get("http://127.0.0.1:5010/delete/?proxy={}".format(proxy))
@@ -141,7 +127,7 @@ def delete_proxy(proxy):
def getHtml():
    # ....
    retry_count = 5
    proxy = get_proxy()
    proxy = get_proxy().get("proxy")
    while retry_count > 0:
        try:
            html = requests.get('https://www.example.com', proxies={"http": "http://{}".format(proxy)})
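
The hunk above cuts off inside the `try` block, so the rest of the retry loop is not visible here. As a minimal sketch of how such a loop might be completed (not necessarily what the README does), the helper below retries with a single pooled proxy and reports it as dead once every attempt has failed. The name `fetch_with_retry` and its three callable parameters are hypothetical, introduced only so the logic can run without a live pool.

```python
def fetch_with_retry(fetch, pick_proxy, drop_proxy, retries=5):
    """Call fetch(proxy) up to `retries` times with one pooled proxy.

    Hypothetical helper, not part of proxy_pool. Returns fetch's result on
    the first success. If every attempt raises, the proxy is handed to
    drop_proxy and None is returned.
    """
    proxy = pick_proxy()
    if not proxy:
        # The pool is empty or the response carried no "proxy" field.
        return None
    for _ in range(retries):
        try:
            return fetch(proxy)
        except Exception:
            continue  # keep the same proxy and try again
    drop_proxy(proxy)
    return None
```

With the functions from the diff, the wiring would presumably be `pick_proxy=lambda: get_proxy().get("proxy")`, `drop_proxy=delete_proxy`, and `fetch=lambda p: requests.get(url, proxies={"http": "http://{}".format(p)})`. Passing the callables in keeps the retry policy separate from the HTTP client, which also makes it easy to exercise with stubs.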
@@ -186,15 +172,15 @@ class GetFreeProxy(object):

```shell
PROXY_GETTER = [
    "freeProxyFirst",
    "freeProxySecond",
    "freeProxy01",
    "freeProxy02",
    ....
    "freeProxyCustom"  # make sure this name matches the method you added
]
```


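  `ProxyRefreshSchedule` fetches proxies at regular intervals; on its next run it automatically picks up and calls the method you defined.
  `ProxySchedule` fetches proxies at regular intervals; on its next run it automatically picks up and calls the method you defined.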
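
To illustrate how names listed in `PROXY_GETTER` could be mapped to methods, here is a small sketch that assumes the scheduler looks each name up on `GetFreeProxy` with `getattr` and that every getter is a static generator yielding `host:port` strings. Neither assumption is confirmed by this diff. The class body, the addresses, and the `collect` function are placeholders, not the project's code.

```python
class GetFreeProxy(object):
    """Stand-in for the class in ProxyGetter/getFreeProxy.py (bodies are made up)."""

    @staticmethod
    def freeProxy01():
        yield "10.0.0.1:8080"  # placeholder address, not a real proxy

    @staticmethod
    def freeProxyCustom():
        # A custom getter: yield one "host:port" string per proxy found.
        for proxy in ["10.0.0.2:3128", "10.0.0.3:80"]:
            yield proxy


# The last entry has no matching method, to show why the names must agree.
PROXY_GETTER = ["freeProxy01", "freeProxyCustom", "freeProxyMissing"]


def collect(getter_names, source=GetFreeProxy):
    """Call each configured getter by name and gather the proxies it yields."""
    found = []
    for name in getter_names:
        getter = getattr(source, name, None)
        if getter is None:
            continue  # configured name without a method of the same name
        found.extend(p.strip() for p in getter())
    return found
```

Under these assumptions, `collect(PROXY_GETTER)` gathers the three placeholder addresses and silently skips `freeProxyMissing`, which is the failure mode the comment in the config warns about.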

### Proxy collection

7 changes: 7 additions & 0 deletions doc/release_notes.md
@@ -1,5 +1,12 @@
## Release Notes

* 2.0.0 (201908)

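1. WebApi now starts through an integrated Gunicorn; Windows is not supported yet;
2. Optimized the proxy scheduler;
3. Extended the Proxy attributes;
4. Added a CLI tool that makes starting proxyPool easier

* 1.14 (2019.07)

1. Fixed the `ProxyValidSchedule` hang bug, which was caused by a blocked Queue;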