forked from my8100/scrapydweb

Scrapyd cluster management, Scrapy log analysis & visualization, Basic auth, Auto eggifying, Email notice and Mobile UI. 🎞️ GIF DEMO 👉


xucn/scrapydweb

 
 


🔤 English | 🀄 简体中文

ScrapydWeb: A full-featured web UI for Scrapyd cluster management, with Scrapy log analysis & visualization supported.



Scrapyd ❌ ScrapydWeb ❌ LogParser

📖 Recommended Reading

🔗 How to efficiently manage your distributed web scraping projects

⭐ Features

  • 💠 Scrapyd Cluster Management

    • 💯 All Scrapyd JSON API Supported
    • ☑️ Group, filter and select any number of nodes
    • 🖱️ Execute command on multinodes with just a few clicks
  • 🔍 Scrapy Log Analysis

    • 📊 Stats collection
    • 📈 Progress visualization
    • 📑 Logs categorization
  • 🔋 Enhancements

    • 📦 Auto eggify your projects
    • 🕵️‍♂️ Integrated with 🔗 LogParser
    • 📧 Email notice
    • 📱 Mobile UI
    • 🔐 Basic auth for web UI
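
To make the stats collection and progress visualization features more concrete, here is a minimal sketch of the kind of extraction involved. It is not ScrapydWeb's or LogParser's actual code: the names `LOGSTATS_RE` and `collect_logstats` are hypothetical, the sample log lines are made up for illustration, and the only thing taken from Scrapy itself is the message format written by its LogStats extension.

```python
import re

# Matches the periodic progress message written by Scrapy's LogStats extension.
# LOGSTATS_RE and collect_logstats are hypothetical names used for this sketch.
LOGSTATS_RE = re.compile(
    r"Crawled (?P<pages>\d+) pages \(at (?P<pages_per_min>\d+) pages/min\), "
    r"scraped (?P<items>\d+) items \(at (?P<items_per_min>\d+) items/min\)"
)


def collect_logstats(log_text):
    """Return one dict of integer counters per LogStats message in log_text."""
    return [
        {key: int(value) for key, value in match.groupdict().items()}
        for match in LOGSTATS_RE.finditer(log_text)
    ]


# Made-up sample log, used only to exercise the sketch.
sample_log = """\
2019-01-01 00:00:01 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2019-01-01 00:01:01 [scrapy.extensions.logstats] INFO: Crawled 30 pages (at 30 pages/min), scraped 25 items (at 25 items/min)
"""

progress = collect_logstats(sample_log)
print(progress[-1]["pages"], progress[-1]["items"])
```

Each dict in `progress` is one data point, so plotting the `pages` and `items` values in order would give a simple crawl progress curve.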

👀 Preview

💻 Getting Started


⚠️ Prerequisites

Make sure that 🔗 Scrapyd has been installed and started on all of your hosts.

‼️ Note that for remote access, you must manually set 'bind_address = 0.0.0.0' in 🔗 the configuration file of Scrapyd, then restart Scrapyd so that it is reachable from other hosts.
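
Before adding a host to ScrapydWeb, it can help to confirm that its Scrapyd instance is reachable. The following is a minimal sketch, assuming Scrapyd listens on its default port 6800 and exposes its standard daemonstatus.json endpoint; the helper names `daemonstatus_url`, `is_ok` and `check_node` are hypothetical and not part of ScrapydWeb.

```python
import json
from urllib.request import urlopen


def daemonstatus_url(host, port=6800):
    """Build the URL of Scrapyd's daemonstatus.json endpoint (6800 is Scrapyd's default port)."""
    return f"http://{host}:{port}/daemonstatus.json"


def is_ok(payload):
    """Return True if a decoded daemonstatus.json payload reports status 'ok'."""
    return payload.get("status") == "ok"


def check_node(host, port=6800, timeout=5):
    """Return True if the Scrapyd node answers with status 'ok', False on any failure."""
    try:
        with urlopen(daemonstatus_url(host, port), timeout=timeout) as resp:
            return is_ok(json.loads(resp.read().decode("utf-8")))
    except (OSError, ValueError):
        # OSError covers refused connections and timeouts;
        # ValueError covers a response body that is not valid JSON.
        return False


# Example call (needs a running Scrapyd instance): check_node("127.0.0.1")
```

If `check_node` returns False for a remote host but True when run on that host against 127.0.0.1, the bind_address setting described above is the first thing to check.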

⬇️ Install

  • Use pip:
pip install scrapydweb
  • Use git:
git clone https://github.com/my8100/scrapydweb.git
cd scrapydweb
python setup.py install

▶️ Start

  1. Start ScrapydWeb with the command scrapydweb. (A config file for customizing settings is generated on the first startup.)
  2. Visit http://127.0.0.1:5000 (It's recommended to use Google Chrome for a better experience.)

🌐 Browser Support

The latest versions of Google Chrome, Firefox, and Safari.

✔️ Running the tests

$ git clone https://github.com/my8100/scrapydweb.git
$ cd scrapydweb

# To create isolated Python environments
$ pip install virtualenv
$ virtualenv venv/scrapydweb
# Or specify your Python interpreter: $ virtualenv -p /usr/local/bin/python3.7 venv/scrapydweb
$ source venv/scrapydweb/bin/activate

# Install dependent libraries
(scrapydweb) $ python setup.py install
(scrapydweb) $ pip install pytest
(scrapydweb) $ pip install coverage

# Make sure Scrapyd has been installed and started, then update the custom_settings item in tests/conftest.py
(scrapydweb) $ vi tests/conftest.py
(scrapydweb) $ curl http://127.0.0.1:6800

(scrapydweb) $ coverage run --source=scrapydweb -m pytest tests/test_a_factory.py -s -vv
(scrapydweb) $ coverage run --source=scrapydweb -m pytest tests -s -vv
(scrapydweb) $ coverage report
# To create an HTML report, check out htmlcov/index.html
(scrapydweb) $ coverage html

🏗️ Built With


📋 Changelog

Detailed changes for each release are documented in 🔗 HISTORY.md.

👨‍💻 Author


my8100

👥 Contributors


Kaisla

©️ License

This project is licensed under the GNU General Public License v3.0 - see the 🔗 LICENSE file for details.
