Commit

Add more bots (fnando#337)
paolodona authored and fnando committed Mar 17, 2018
1 parent ba58281 commit a7006a6
Showing 4 changed files with 16 additions and 3 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -4,6 +4,7 @@

 - Add GarlikCrawler, ImplisenseBot and WikiDo bots.
 - Add Mastodon URL expander bot.
+- Add eZ Publish Link Validator, GermCrawler, Pu_iN Crawler, ZoomBot, and ZoominfoBot bots.

 ## v2.5.3

6 changes: 6 additions & 0 deletions bots.yml
@@ -66,6 +66,7 @@ everyonesocialbot: "EveryoneSocial"
 evrinid: "Evri bot"
 exabot: "Exalead's bot"
 exaleadcloudview: "ExaleadCloudView"
+ez publish: "eZ Publish Link Validator"
 facebookexternalhit: "Facebook Bot"
 facebot: "Facebook Bot"
 feedburner: "RSS bot"
@@ -75,6 +76,7 @@ flipboardproxy: "FlipboardProxy"
 friendfeedbot: "FriendFeed"
 garlik: "GarlikCrawler"
 genieo: "Genieo Web filter bot"
+germcrawler: 'GermCrawler'
 getprismatic.com: "getprismatic.com"
 gigabot: "Gigabot spider"
 gimme60bot: "Gimme60 (gimme60.com)"
@@ -175,12 +177,14 @@ plukkie: "botje.com/plukkie.htm"
 privacyawarebot: "PrivacyAwareBot"
 proximic: "Proximic Spider"
 psbot-page: "Picsearch"
+pu_in: 'Pu_iN Crawler'
 publiclibraryarchive.org: "publiclibraryarchive.org"
 pycurl: "Python http library"
 python-httplib2: "Python-httplib2"
 python-requests: "Python http library"
 python-urllib: "Python http library"
 queryseeker: "QuerySeekerSpider"
+quick-crawler: "Quick-Crawler"
 quicklook: "QuickLook"
 re-animator: "Domain Re-Animator Bot"
 readability: "Readability"
@@ -268,4 +272,6 @@ yourls: "YOURLS"
 zelist.ro: "feed parser"
 zibb: "ZIBB spider"
 zitebot: "Zite"
+zoombot: 'ZoomBot'
+zoominfobot: 'ZoominfoBot'
 zyborg: "Zyborg"
4 changes: 2 additions & 2 deletions search_engines.yml
@@ -1,6 +1,6 @@
+ask jeeves: "Ask Jeeves"
 baidu: "Chinese search engine"
 bingbot: "Microsoft bing bot"
+duckduckbot: "Duck Duck Go"
 googlebot: "Google spider"
 slurp: "Yahoo spider"
-duckduckbot: "Duck Duck Go"
-ask jeeves: "Ask Jeeves"
8 changes: 7 additions & 1 deletion test/ua_bots.yml
@@ -19,18 +19,20 @@ DAUMOA: Mozilla/5.0 (compatible; MSIE or Firefox mutant; not on Windows server;)
 DOMAINAREANIMATOR: 'Domain Re-Animator Bot (http://domainreanimator.com) - [email protected]'
 DOT_BOT: 'Mozilla/5.0 (compatible; DotBot/1.1; http://www.opensiteexplorer.org/dotbot, [email protected])'
 DUCKDUCKGO: 'DuckDuckBot/1.0; (+http://duckduckgo.com/duckduckbot.html)'
+EZPUBLISH: 'eZ Publish Link Validator'
 FACEBOOK_BOT: 'facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)'
 GARLIK: 'GarlikCrawler/1.2 (http://garlik.com/, [email protected])'
+GERMCRAWLER: 'GermCrawler'
 GOOGLE_BOT: 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'
 GOOGLE_PAGE_SPEED_INSIGHTS: 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.4 (KHTML, like Gecko; Google Page Speed Insights) Chrome/22.0.1229 Safari/537.4'
 GOOGLE_SITE_VERIFICATION: Mozilla/5.0 (compatible; Google-Site-Verification/1.0)
 GOOGLE_STACKDRIVER_UPTIME_CHECKS: 'GoogleStackdriverMonitoring-UptimeChecks'
 GOOGLE_STRUCTURED_DATA_TESTING_TOOL2: 'Mozilla/5.0 (compatible; Google-Structured-Data-Testing-Tool +http://developers.google.com/structured-data/testing-tool/)'
 GOOGLE_STRUCTURED_DATA_TESTING_TOOL: 'Mozilla/5.0 (compatible; X11; Linux x86_64; Google-StructuredDataTestingTool; +http://www.google.com/webmasters/tools/richsnippets)'
 GRAPESHOT: 'Mozilla/5.0 (compatible; GrapeshotCrawler/2.0; +http://www.grapeshot.co.uk/crawler.php)'
+IMPLISENSEBOT: 'ImplisenseBot 1.0'
 JOBSEEKER: 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/538.1 (KHTML, like Gecko) JobBot/5.0 (compatible; +http://www.jobseeker.com.au/bot.html) Safari/538.1'
 LINKDEXBOT: 'Mozilla/5.0 (compatible; linkdexbot/2.0; +http://www.linkdex.com/bots/)'
-IMPLISENSEBOT: 'ImplisenseBot 1.0'
 LOAD_TIME_BOT: 'Mozilla/5.0 (compatible; LoadTimeBot/0.9; +http://www.loadtime.net/bot.html)'
 LTX71: 'ltx71 - (http://ltx71.com/)'
 MAIL_RU: 'Mozilla/5.0 (compatible; Linux x86_64; Mail.RU_Bot/2.0; +http://go.mail.ru/help/robots)'
@@ -46,7 +48,9 @@ NEWRELICPINGER: NewRelicPinger/1.0 (12345)
 PAESSLER: Mozilla/5.0 (compatible; PRTG Network Monitor (www.paessler.com); Windows)
 PRIVACYAWAREBOT: 'Mozilla/5.0 (compatible; PrivacyAwareBot/1.1; +http://www.privacyaware.org)'
 PROXIMIC: 'Mozilla/5.0 (compatible; proximic; +http://www.proximic.com/info/spider.php)'
+PUINCRAWLER: 'Pu_iN Crawler (+http://semanticjuice.com/)'
 QUERYSEEKER: 'QuerySeekerSpider ( http://queryseeker.com/bot.html )'
+QUICKCRAWLER: "Quick-Crawler (+https://www.scrapinghub.com/)"
 SCRAPY: 'Scrapy/0.18.4 (+http://scrapy.org)'
 SEMANTICBOT: 'Mozilla/5.0 (compatible; Semanticbot/1.0; +http://sempi.tech/bot.html)'
 SEO_AUDIT: 'Mozilla/5.0 (compatible; seo-audit-check-bot/1.0)'
@@ -70,3 +74,5 @@ YAHOO_SLURP: 'Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/
 YANDEX_DIRECT: 'Mozilla/5.0 (compatible; YandexDirect/3.0; +http://yandex.com/bots)'
 YANDEX_METRIKA: 'Mozilla/5.0 (compatible; YandexMetrika/3.0; +http://yandex.com/bots)'
 YANGA: 'Yanga WorldSearch Bot v1.1/beta (http://www.yanga.co.uk/)'
+ZOOMBOT: 'ZoomBot (Linkbot 1.0 http://suite.seozoom.it/bot.html)'
+ZOOMINFOBOT: 'ZoominfoBot (zoominfobot at zoominfo dot com)'
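
Review note: the new `bots.yml` keys are all lowercase while the `test/ua_bots.yml` fixtures are mixed case, which suggests the keys are matched as case-insensitive substrings of the User-Agent. The commit does not show the matching code, so the sketch below is only an assumption about how the entries are consumed. `match_bot`, `BOTS`, and `UA_FIXTURES` are hypothetical names, and the data is copied from the hunks above.

```python
# Hypothetical sketch: case-insensitive substring matching of bots.yml keys
# against User-Agent strings. The real library's lookup may differ.

# Keys and descriptions added to bots.yml in this commit.
BOTS = {
    "ez publish": "eZ Publish Link Validator",
    "germcrawler": "GermCrawler",
    "pu_in": "Pu_iN Crawler",
    "quick-crawler": "Quick-Crawler",
    "zoombot": "ZoomBot",
    "zoominfobot": "ZoominfoBot",
}

# Fixtures added to test/ua_bots.yml in this commit.
UA_FIXTURES = {
    "EZPUBLISH": "eZ Publish Link Validator",
    "GERMCRAWLER": "GermCrawler",
    "PUINCRAWLER": "Pu_iN Crawler (+http://semanticjuice.com/)",
    "QUICKCRAWLER": "Quick-Crawler (+https://www.scrapinghub.com/)",
    "ZOOMBOT": "ZoomBot (Linkbot 1.0 http://suite.seozoom.it/bot.html)",
    "ZOOMINFOBOT": "ZoominfoBot (zoominfobot at zoominfo dot com)",
}


def match_bot(user_agent, bots=BOTS):
    """Return the description of the first key found in the lowercased UA, else None."""
    ua = user_agent.lower()
    for key, description in bots.items():
        if key in ua:
            return description
    return None


if __name__ == "__main__":
    # Every new fixture should be recognised by one of the new keys.
    for name, ua in UA_FIXTURES.items():
        print(name, "->", match_bot(ua))
```

Under this assumption, `zoombot` and `zoominfobot` do not shadow each other, because neither key is a substring of the other bot's User-Agent.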
