reuters

Crawler and scraper for news archive on reuters.com.

It walks the pages that link to articles, either for the news archive as a whole or for a given section, extracts each article's headline, text and release timestamp, and stores them in a MongoDB collection.

The spider is built with the Scrapy framework.
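
As a rough illustration of the storage step, the sketch below shows how a Scrapy item pipeline might write the extracted fields to MongoDB. It is not the repository's actual code: the class name `MongoArticlePipeline`, the helper `to_document`, the field names (`url`, `headline`, `text`, `released`), the connection URI, the database and collection names, and the ISO 8601 timestamp format are all assumptions made for the example.

```python
from datetime import datetime, timezone


def to_document(item):
    """Normalise a scraped item into the dict that gets stored.

    Field names are hypothetical; the real spider may use others.
    """
    released = item["released"]
    if isinstance(released, str):
        # Assumption: timestamps arrive as ISO 8601 strings,
        # e.g. "2016-03-01T12:30:00+00:00".
        released = datetime.fromisoformat(released)
    if released.tzinfo is None:
        # Assumption: naive timestamps are treated as UTC.
        released = released.replace(tzinfo=timezone.utc)
    return {
        "url": item["url"],
        "headline": item["headline"].strip(),
        "text": item["text"].strip(),
        "released": released,
    }


class MongoArticlePipeline:
    """Scrapy item pipeline that upserts one document per article URL."""

    def __init__(self, uri="mongodb://localhost:27017",
                 db="reuters", collection="articles"):
        # All three defaults are placeholders, not values from the project.
        self.uri = uri
        self.db = db
        self.collection_name = collection
        self.client = None
        self.collection = None

    def open_spider(self, spider):
        # Imported here so the module loads even without pymongo installed.
        import pymongo

        self.client = pymongo.MongoClient(self.uri)
        self.collection = self.client[self.db][self.collection_name]

    def close_spider(self, spider):
        if self.client is not None:
            self.client.close()

    def process_item(self, item, spider):
        doc = to_document(item)
        # Upsert keyed on the URL so re-crawling does not create duplicates.
        self.collection.update_one(
            {"url": doc["url"]}, {"$set": doc}, upsert=True
        )
        return item
```

A pipeline like this would be enabled through the `ITEM_PIPELINES` setting in the project's `settings.py`. Keying the upsert on the article URL is a design choice made for this sketch, so that crawling the same archive page twice does not store an article twice.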

Both the code and the documentation are still under development.

monkeyclass/reuters

Folders and files

Latest commit

History

Repository files navigation

reuters

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages