This repository is primarily for web scraping with Python, using the BeautifulSoup library. To learn the basics, I recommend: http://www.analyticsvidhya.com/blog/2015/10/beginner-guide-web-scraping-beautiful-soup-python/

In this repo, I've written Python code for extracting the article from a given webpage. You pass the URL to the function, and it returns the main article content, with unwanted material such as advertisements and links removed.
As you might know, each website wraps its article content in its own HTML structure, so the extraction logic has to be adapted per site. I'll try to cover as many websites as possible.
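To illustrate the idea of per-site handling, here is a minimal sketch, not the actual code in this repo. It assumes each supported site can be described by one CSS selector for the element that wraps the article. The names `SITE_SELECTORS`, `UNWANTED_TAGS`, `extract_article_from_html`, and `extract_article`, as well as the example domains and selectors, are hypothetical placeholders.

```python
from urllib.parse import urlparse
from urllib.request import Request, urlopen

from bs4 import BeautifulSoup

# Hypothetical mapping: domain -> CSS selector of the element enclosing the article.
# These entries are placeholders, not selectors taken from this repo.
SITE_SELECTORS = {
    "example.com": "div.article-body",
    "news.example.org": "article",
}

# Paragraphs nested inside these tags are treated as non-article content.
UNWANTED_TAGS = ["nav", "aside", "footer", "header", "form"]


def extract_article_from_html(html, selector=None):
    """Return the article text from an HTML string, one paragraph per block."""
    soup = BeautifulSoup(html, "html.parser")

    # Use the site-specific container if a selector is given and it matches.
    container = soup.select_one(selector) if selector else None
    if container is None:
        # Generic fallback: an <article> tag, then <body>, then the whole document.
        container = soup.find("article") or soup.body or soup

    paragraphs = []
    for p in container.find_all("p"):
        # Skip paragraphs that sit inside navigation, sidebars, and similar blocks.
        if p.find_parent(UNWANTED_TAGS) is not None:
            continue
        # Collapse runs of whitespace so inline tags do not leave odd spacing.
        text = " ".join(p.get_text().split())
        if text:
            paragraphs.append(text)
    return "\n\n".join(paragraphs)


def extract_article(url):
    """Download a page and extract its article, using a per-site selector if known."""
    host = urlparse(url).netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    # Some sites reject requests that lack a browser-like User-Agent.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=10) as response:
        html = response.read()
    return extract_article_from_html(html, SITE_SELECTORS.get(host))
```

Under these assumptions, supporting a new website means adding one entry to `SITE_SELECTORS`; sites without an entry fall back to the generic paragraph scan, which is usually noisier.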
I've also added a script to parse Hinglish (mixed Hindi-English) data, e.g. from the SantaBanta website.