Strip tags from HTML, optionally from areas identified by CSS selectors
Install this tool using pip:
pip install strip-tags
Pipe content into this tool to strip tags from it:
cat input.html | strip-tags > output.txt
Or pass a filename:
strip-tags -i input.html > output.txt
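To make "stripping tags" concrete, here is a minimal sketch using only the Python standard library. It is not the code strip-tags itself runs, and the names `TextExtractor` and `strip_tags_sketch` are hypothetical, introduced only for this illustration. It keeps the text nodes and discards every tag:

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes and ignores tags (illustration only)."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for each run of text between tags.
        self.chunks.append(data)


def strip_tags_sketch(html):
    """Return the text content of `html` with all tags removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


sample = "<div class='content'><h1>Title</h1><p>Hello <b>world</b></p></div>"
print(strip_tags_sketch(sample))  # prints "TitleHello world"
```

Note that this sketch adds no whitespace between block elements, so the output of the real tool may be formatted differently.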
To run against just specific areas identified by CSS selectors:
strip-tags '.content' -i input.html > output.txt
This can be called with multiple selectors:
cat input.html | strip-tags '.content' '.sidebar' > output.txt
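The idea of restricting output to selected areas can be sketched in the same way. The example below is a simplified standard-library illustration, not the tool's implementation: it only understands plain class names (so `content` stands in for the selector `.content`), it assumes well-formed HTML, and `ClassTextExtractor` and `select_text` are hypothetical names.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


class ClassTextExtractor(HTMLParser):
    """Keeps text only from elements carrying one of the given classes."""

    def __init__(self, class_names):
        super().__init__()
        self.class_names = set(class_names)
        self.depth = 0  # nesting depth inside a matched element; 0 = outside
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        # Enter a matched element, or go one level deeper inside one.
        if self.depth or self.class_names.intersection(classes):
            self.depth += 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def select_text(html, class_names):
    """Return text found inside elements matching any of `class_names`."""
    parser = ClassTextExtractor(class_names)
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


sample = (
    '<div class="content"><p>Main <b>text</b></p></div>'
    '<div class="ad">Buy now</div>'
    '<div class="sidebar">Links</div>'
)
print(select_text(sample, ["content", "sidebar"]))  # prints "Main textLinks"
```

Text from the `ad` element is dropped because it matches neither class, which mirrors the effect of passing `'.content' '.sidebar'` on the command line.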
You can also run the tool as a Python module, for example to show its help text:
python -m strip_tags --help
To contribute to this tool, first check out the code. Then create a new virtual environment:
cd strip-tags
python -m venv venv
source venv/bin/activate
Now install the dependencies and test dependencies:
pip install -e '.[test]'
To run the tests:
pytest