better negative page index support
duckingod committed Jul 20, 2017
1 parent e415229 commit 5de5c19
Showing 2 changed files with 4 additions and 6 deletions.
8 changes: 3 additions & 5 deletions PttWebCrawler/crawler.py
@@ -41,11 +41,9 @@ def __init__(self, cmdline=None):
         board = args.b
         PTT_URL = 'https://www.ptt.cc'
         if args.i:
-            start = args.i[0]
-            if args.i[1] == -1:
-                end = self.getLastPage(board)
-            else:
-                end = args.i[1]
+            last = self.getLastPage(board)
+            indexs = [i if i>=0 else last+i+1 for i in args.i]
+            start, end = sorted(indexs)
             index = start
             filename = board + '-' + str(start) + '-' + str(end) + '.json'
             self.store(filename, u'{"articles": [', 'w')
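The index handling added in this hunk can be tried in isolation. The sketch below lifts the two new lines into a standalone function; the name `resolve_page_range` and the sample value `last=500` are made up for illustration and do not appear in the commit.

```python
def resolve_page_range(indices, last):
    """Map possibly negative page indices to an ascending (start, end) pair.

    Hypothetical helper, not part of the commit: it mirrors the list
    comprehension and sorted() call that the diff adds to __init__.
    `last` stands in for the value returned by self.getLastPage(board).
    """
    # Non-negative indices pass through unchanged; a negative index i counts
    # back from the last page, so -1 maps to `last` and -2 to `last - 1`.
    resolved = [i if i >= 0 else last + i + 1 for i in indices]
    # Sorting lets the caller give the two indices in either order.
    start, end = sorted(resolved)
    return start, end


if __name__ == "__main__":
    print(resolve_page_range([100, -1], last=500))  # (100, 500)
    print(resolve_page_range([-3, -1], last=500))   # (498, 500)
    print(resolve_page_range([-1, -3], last=500))   # (498, 500)
```

As in the commit, nothing here guards against an index below `-(last + 1)`, which would still resolve to a negative page number.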
2 changes: 1 addition & 1 deletion README.md
@@ -38,7 +38,7 @@
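 }

 ### Usage
-python crawler.py -b board_name -i start_index end_index (set to -1 to compute the last page automatically)
+python crawler.py -b board_name -i start_index end_index (a negative value counts back from the last page)
 python crawler.py -b board_name -a article_ID

 ### Examples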
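The updated usage line relies on negative numbers being accepted as values for `-i`. The crawler's argument parser is not shown in this diff, so the sketch below assumes an `argparse` setup with `type=int, nargs=2`; `build_parser` and the board name `Gossiping` are illustrative only. With such a parser, `argparse` treats `-3` and `-1` as values rather than flags, because no registered option looks like a negative number.

```python
import argparse


def build_parser():
    # Hypothetical parser: the real crawler's argument definitions are not
    # part of this diff, so this only approximates the documented CLI.
    parser = argparse.ArgumentParser(description="Sketch of the crawler CLI")
    parser.add_argument("-b", metavar="BOARD_NAME", required=True,
                        help="board name")
    parser.add_argument("-i", metavar=("START_INDEX", "END_INDEX"),
                        type=int, nargs=2,
                        help="start and end page index; negative values "
                             "count back from the last page")
    return parser


if __name__ == "__main__":
    # Equivalent to: python crawler.py -b Gossiping -i -3 -1
    args = build_parser().parse_args(["-b", "Gossiping", "-i", "-3", "-1"])
    print(args.b, args.i)
```

Under these assumptions, `args.i` holds the two integers as given on the command line, ready to be resolved against the last page number.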