Hi,
I came across your WA Reader and already find it very useful. I ran into Issue #70 while uploading a chat with about 4000 lines. After investigating which line it actually breaks on, I found that someone in that chat had posted a joke written as binary text.
For example, if you create a chat file like this:
...
13/10/2017, 12:12 a.m. - The Joker: Let's not BLOW this out of proportion.
13/10/2017, 12:12 a.m. - Gambol: You think you can steal from us and just walk away?
00001100110000111100010101111111000001111000
13/10/2017, 12:12 a.m. - The Joker: Yeah.
...
(Yes, there is a line break inside the message, so the new line starts with plain text or, as here, only digits.)
The dateutil.parser parser tries to find a date in that line and fails with the error: OverflowError: Python int too large to convert to C long
Sample code to reproduce:
    from dateutil.parser import parse

    dt = parse('Sun, 05/12/1999, 12:30PM')
    print(dt.date())

    dt = parse('00001100110000111100010101111111000001111000')
    print(dt.date())
I added a line to utils.py to catch this error as well. In Issue #70 you already mentioned some weird characters. Could the same thing also happen with certain Unicode characters?
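To illustrate the idea, here is a minimal sketch of that kind of guard. It is not the actual code in utils.py: the helper name `parse_date_or_none` and the choice to pass the parser in as an argument are assumptions made only so the example stays self-contained.

```python
from datetime import datetime
from typing import Callable, Optional


def parse_date_or_none(
    text: str, parse: Callable[[str], datetime]
) -> Optional[datetime]:
    """Hypothetical helper: return the parsed datetime, or None on failure.

    `parse` is any date parser, e.g. dateutil.parser.parse.
    """
    try:
        return parse(text)
    except (ValueError, OverflowError):
        # ValueError: the parser found no usable date in the text.
        # OverflowError: a long run of digits (such as the binary joke
        # line) produced an integer too large for the parser to handle.
        return None
```

With dateutil installed, this could be called as `parse_date_or_none(line, dateutil.parser.parse)`. A line that returns None would then be treated as a continuation of the previous message instead of the start of a new one.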
regards