web_scraping CLI

Uses Requests and BeautifulSoup to scrape a website.
The site is commonly used for scraping practice and is easy to work with:
it serves plain HTML pages, with no JavaScript rendering, login sessions, etc.
I found it in: best-websites-to-practice-your-web-scraping-skills

Outline of what I did in this project:

1. Created a virtual environment.

2. Installed the required packages (see requirements.txt).

3. Used Requests and BeautifulSoup for scraping.

4. Fetched the HTML page with UTF-8 encoding (one character was otherwise saved incorrectly), then extracted the desired data, namely the name and price of every book on the main page.

5. Used the dataclass decorator to define a ProductData class, which is later used to hold the scraped data cleanly (in line with common best practice).

6. Used Pandas to save the extracted data and to display it with the .head() method (see displaying_csv.py).

7. Used the Click library to build the CLI.
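
Steps 4 and 5 can be sketched roughly as follows. This is a minimal illustration and not the code in scraper.py: the sample bytes, the `|` separator, and the field types of ProductData are assumptions, and the pound sign only stands in for whichever character was garbled in the real page. It uses the standard library only, so it runs without network access.

```python
from dataclasses import dataclass, asdict


@dataclass
class ProductData:
    """One scraped book: its title and its price as shown on the page."""
    name: str   # assumed field type
    price: str  # kept as text so the currency symbol is preserved


# Hypothetical raw bytes, standing in for the body of an HTTP response.
raw = "A Sample Book|£12.50".encode("utf-8")

# Decoding with the wrong codec garbles the non-ASCII character...
garbled = raw.decode("latin-1")
# ...while decoding as UTF-8 recovers the original text.
clean = raw.decode("utf-8")

name, price = clean.split("|")
product = ProductData(name=name, price=price)

# A list of plain dicts is a shape that pandas.DataFrame accepts directly.
rows = [asdict(product)]

print(garbled)  # A Sample Book|Â£12.50
print(rows)     # [{'name': 'A Sample Book', 'price': '£12.50'}]
```

With Requests, the equivalent fix is usually to set `response.encoding = "utf-8"` before reading `response.text`, or to decode `response.content` explicitly as shown above.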

To run this code:

Run the following in a terminal:
python scraper.py <url_link> <the-csv-file-name.csv>
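
For readers new to Click, a command taking those two positional arguments might look like the sketch below. The function and argument names are hypothetical and the body is a placeholder; the actual scraper.py may be organized differently.

```python
import click


@click.command()
@click.argument("url")
@click.argument("csv_name")
def main(url: str, csv_name: str) -> None:
    """Scrape URL and save the results to CSV_NAME."""
    # Placeholder: the real script would fetch and parse `url` here,
    # then write the extracted rows to `csv_name`.
    click.echo(f"Would scrape {url} and save to {csv_name}")
```

Calling `main()` under the usual `if __name__ == "__main__":` guard makes the file runnable as a script, and Click then generates a `--help` page from the docstring.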

Inspiration:

Special thanks to John Watson Rooney, whose video inspired this beginner project. It gave me hands-on practice with some fundamentals of web scraping and building a CLI.
