strip-tags by simonw

strip-tags

Strip tags from HTML, optionally from areas identified by CSS selectors

See llm, ttok and strip-tags—CLI tools for working with ChatGPT and other LLMs for more on this project.

Installation

Install this tool using pip:

pip install strip-tags

Usage

Pipe content into this tool to strip tags from it:

cat input.html | strip-tags > output.txt

Or pass a filename:

strip-tags -i input.html > output.txt
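
To make concrete what "stripping tags" means, here is a minimal sketch in Python that uses only the standard library's html.parser. It illustrates the idea and is not how strip-tags itself is implemented. The names TextOnly and strip_all_tags are hypothetical, chosen for this example.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect text nodes and discard every tag (hypothetical helper)."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into plain characters.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        # Called for the text between tags; the tags themselves are ignored.
        self.parts.append(data)


def strip_all_tags(html: str) -> str:
    """Return the text content of an HTML string with all tags removed."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


if __name__ == "__main__":
    print(strip_all_tags("<p>Hello <b>world</b></p>"))  # prints "Hello world"
```

This sketch also keeps the text inside script and style elements and does not understand CSS selectors, so it covers only the simplest case described above.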

To run against just specific areas identified by CSS selectors:

strip-tags '.content' -i input.html > output.txt

This can be called with multiple selectors:

cat input.html | strip-tags '.content' '.sidebar' > output.txt

To return just the first element on the page that matches one of the selectors, use --first:

cat input.html | strip-tags .content --first > output.txt

To remove content contained by specific selectors - e.g. the