Accurately separates a URL’s subdomain, domain, and public suffix, using the Public Suffix List (PSL).
git clone https://github.com/rix4uni/tldinfo.git
cd tldinfo
python3 setup.py install
Quick setup in an isolated Python environment using pipx:
pipx install --force git+https://github.com/rix4uni/tldinfo.git
usage: tldinfo [-h] [-e EXTRACT] [-r] [-f] [-j] [-s] [-v]
Accurately separates a URL’s subdomain, domain, and public suffix, using the Public Suffix List (PSL).
options:
  -h, --help            show this help message and exit
  -e EXTRACT, --extract EXTRACT
                        Comma-separated list of parts to extract (subdomain, domain, suffix)
  -r, --registered_domain
                        Get the registered domain
  -f, --fqdn            Get the fqdn
  -j, --json            Output result in JSON format
  -s, --silent          Run without printing the banner
  -v, --version         Show current version of tldinfo
Single Domains:
▶ echo "http://forums.news.cnn.com/" | tldinfo --silent --extract subdomain
forums.news
▶ echo "http://forums.news.cnn.com/" | tldinfo --silent --extract domain
cnn
▶ echo "http://forums.news.cnn.com/" | tldinfo --silent --extract suffix
com
▶ echo "http://forums.news.cnn.com/" | tldinfo --silent --extract subdomain,domain,suffix
forums.news.cnn.com
▶ echo "http://forums.news.cnn.com/" | tldinfo --silent --extract subdomain,domain,suffix --json
{"input": "http://forums.news.cnn.com/", "subdomain": "forums.news", "domain": "cnn", "suffix": "com"}
▶ echo "http://forums.news.cnn.com/" | tldinfo --silent --registered_domain
cnn.com
▶ echo "http://forums.news.cnn.com/" | tldinfo --silent --fqdn
forums.news.cnn.com
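The `--json` output can be consumed from a script. The following is a minimal sketch, assuming one JSON object per input line with the `subdomain`, `domain`, and `suffix` keys shown in the example above; the `summarize` helper is hypothetical and not part of tldinfo.

```python
import json

# One line of `tldinfo --json` output, copied from the example above.
line = '{"input": "http://forums.news.cnn.com/", "subdomain": "forums.news", "domain": "cnn", "suffix": "com"}'


def summarize(json_line):
    """Rebuild the registered domain and FQDN from one JSON record.

    Hypothetical helper: assumes all three parts were extracted.
    """
    record = json.loads(json_line)
    # Join only the non-empty parts so a missing subdomain leaves no stray dot.
    registered = ".".join(
        part for part in (record["domain"], record["suffix"]) if part
    )
    fqdn = ".".join(
        part
        for part in (record["subdomain"], record["domain"], record["suffix"])
        if part
    )
    return registered, fqdn


print(summarize(line))  # ('cnn.com', 'forums.news.cnn.com')
```

The two values match what `--registered_domain` and `--fqdn` print for the same input.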
Multiple Domains:
▶ cat targets.txt
forums.news.cnn.com
forums.bbc.co.uk
www.worldbank.org.kg
▶ cat targets.txt | tldinfo --silent --extract subdomain
forums.news
forums
www
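The three targets above split differently because their public suffixes have different lengths (`com`, `co.uk`, `org.kg`). The idea can be sketched as a longest-suffix match. This is an illustration only, not tldinfo's implementation: `SAMPLE_SUFFIXES` is a tiny hand-picked stand-in for the Public Suffix List, and `split_host` is a hypothetical helper. The real list has thousands of entries plus wildcard and exception rules that this sketch ignores.

```python
from urllib.parse import urlsplit

# Assumption: a tiny stand-in for the Public Suffix List, just enough
# to cover the three example targets.
SAMPLE_SUFFIXES = {"com", "uk", "co.uk", "kg", "org.kg"}


def split_host(target):
    """Return (subdomain, domain, suffix) for a URL or bare hostname."""
    # urlsplit only fills in the hostname when a "//" separator is present.
    host = urlsplit(target if "//" in target else "//" + target).hostname or ""
    labels = host.split(".")
    # Try the longest candidate suffix first, so "co.uk" wins over "uk".
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in SAMPLE_SUFFIXES:
            domain = labels[i - 1] if i > 0 else ""
            subdomain = ".".join(labels[: i - 1]) if i > 1 else ""
            return subdomain, domain, candidate
    # No known suffix: treat the whole host as the domain (sketch behaviour).
    return "", host, ""


for target in ("forums.news.cnn.com", "forums.bbc.co.uk", "www.worldbank.org.kg"):
    print(split_host(target))
# ('forums.news', 'cnn', 'com')
# ('forums', 'bbc', 'co.uk')
# ('www', 'worldbank', 'org.kg')
```

Checking the longest candidate first is what keeps `bbc` from being reported as a subdomain of a `co.uk` "domain".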