test: Additional robot testcases
nielsbasjes committed Jan 2, 2025
1 parent 988738d commit 3f65ff8
Showing 2 changed files with 120 additions and 5 deletions.
8 changes: 5 additions & 3 deletions CHANGELOG.md
@@ -20,9 +20,11 @@ NEXT RELEASE
- Handle edgecases:
- 'OpenBSD != Linux amd64'
- 'Linux x86_64:108.0'
- - Robots: AmazonBot, Bravebot, FediIndex
- - AI related robots: OpenAI/ChatGPT, Claudebot (Anthropic), PerplexityBot
- - Codeberg.org is a code hosting site
+ - Robots (Generic, Fediverse and AI Related):
+ - AmazonBot, Bravebot, PetalBot
+ - FediIndex, vmcrawl, Nonsensebot, Caveman-hunter, ...
+ - OpenAI/ChatGPT, Claudebot (Anthropic), PerplexityBot
+ - Codeberg.org is a code hosting site (not a brand for a bot)
- Updated the ISO 639-3 language code table

v7.29.0
117 changes: 115 additions & 2 deletions analyzer/src/main/resources/UserAgents/Robots.yaml
@@ -7515,6 +7515,7 @@ config:
AgentNameVersionMajor : 'Bytespider ??'
AgentInformationEmail : '[email protected]'


- test:
input:
user_agent_string: 'Mozilla/5.0 (Linux; Android 5.0) AppleWebKit/537.36 (KHTML, like Gecko) Mobile Safari/537.36 (compatible; Bytespider; [email protected])'
@@ -7543,6 +7544,34 @@ config:
AgentInformationEmail : '[email protected]'


- test:
input: # Wrong website ??
User-Agent : 'Mozilla/5.0 (Linux; Android 5.0) AppleWebKit/537.36 (KHTML, like Gecko) Mobile Safari/537.36 (compatible; Bytespider; https://zhanzhang.toutiao.com/)'
expected:
DeviceClass : 'Robot'
DeviceName : 'Toutiao Bytespider'
DeviceBrand : 'Toutiao'
OperatingSystemClass : 'Cloud'
OperatingSystemName : 'Cloud'
OperatingSystemVersion : '??'
OperatingSystemVersionMajor : '??'
OperatingSystemNameVersion : 'Cloud ??'
OperatingSystemNameVersionMajor : 'Cloud ??'
LayoutEngineClass : 'Robot'
LayoutEngineName : 'Bytespider'
LayoutEngineVersion : '??'
LayoutEngineVersionMajor : '??'
LayoutEngineNameVersion : 'Bytespider ??'
LayoutEngineNameVersionMajor : 'Bytespider ??'
AgentClass : 'Robot'
AgentName : 'Bytespider'
AgentVersion : '??'
AgentVersionMajor : '??'
AgentNameVersion : 'Bytespider ??'
AgentNameVersionMajor : 'Bytespider ??'
AgentInformationUrl : 'https://zhanzhang.toutiao.com/'


# A screenshot creation bot.
- test:
input:
@@ -10057,6 +10086,34 @@ config:
AgentInformationUrl : 'http://aspiegel.com/petalbot'


- test:
input:
user_agent_string: 'Mozilla/5.0 (compatible;PetalBot;+https://webmaster.petalsearch.com/site/petalbot)'
expected:
DeviceClass : 'Robot'
DeviceName : 'Petalsearch PetalBot'
DeviceBrand : 'Petalsearch'
OperatingSystemClass : 'Cloud'
OperatingSystemName : 'Cloud'
OperatingSystemVersion : '??'
OperatingSystemVersionMajor : '??'
OperatingSystemNameVersion : 'Cloud ??'
OperatingSystemNameVersionMajor : 'Cloud ??'
LayoutEngineClass : 'Robot'
LayoutEngineName : 'PetalBot'
LayoutEngineVersion : '??'
LayoutEngineVersionMajor : '??'
LayoutEngineNameVersion : 'PetalBot ??'
LayoutEngineNameVersionMajor : 'PetalBot ??'
AgentClass : 'Robot'
AgentName : 'PetalBot'
AgentVersion : '??'
AgentVersionMajor : '??'
AgentNameVersion : 'PetalBot ??'
AgentNameVersionMajor : 'PetalBot ??'
AgentInformationUrl : 'https://webmaster.petalsearch.com/site/petalbot'


- test:
input:
user_agent_string: 'Mozilla/5.0 eCairn-Grabber/1.0 (+http://ecairn.com/grabber)'
@@ -13318,7 +13375,7 @@ config:


- test:
input:
input: # A Mastodon robot
User-Agent : 'CDSCbot/2024.06 https://wiki.communitydata.science/CommunityData:Fediverse_research'
expected:
DeviceClass : 'Robot'
@@ -13346,7 +13403,7 @@ config:


- test:
input:
input: # A Mastodon robot
User-Agent : 'CDSCbot/2024.08.08 https://wiki.communitydata.science/CommunityData:Fediverse_research'
expected:
DeviceClass : 'Robot'
@@ -13373,6 +13430,62 @@ config:
AgentInformationUrl : 'https://wiki.communitydata.science/CommunityData:Fediverse_research'


- test:
input: # A Mastodon robot
User-Agent : 'caveman-hunter/0.0.0 (+https://fedi.buzz/)'
expected:
DeviceClass : 'Robot'
DeviceName : 'Fedi Robot'
DeviceBrand : 'Fedi'
OperatingSystemClass : 'Cloud'
OperatingSystemName : 'Cloud'
OperatingSystemVersion : '??'
OperatingSystemVersionMajor : '??'
OperatingSystemNameVersion : 'Cloud ??'
OperatingSystemNameVersionMajor : 'Cloud ??'
LayoutEngineClass : 'Unknown'
LayoutEngineName : 'Unknown'
LayoutEngineVersion : '??'
LayoutEngineVersionMajor : '??'
LayoutEngineNameVersion : 'Unknown ??'
LayoutEngineNameVersionMajor : 'Unknown ??'
AgentClass : 'Special'
AgentName : 'Caveman-Hunter'
AgentVersion : '0.0.0'
AgentVersionMajor : '0'
AgentNameVersion : 'Caveman-Hunter 0.0.0'
AgentNameVersionMajor : 'Caveman-Hunter 0'
AgentInformationUrl : 'https://fedi.buzz/'


- test:
input: # A Mastodon robot
User-Agent : 'vmcrawl/0.2 (https://docs.vmst.io/projects/vmcrawl)'
expected:
DeviceClass : 'Robot'
DeviceName : 'Vmst Vmcrawl'
DeviceBrand : 'Vmst'
OperatingSystemClass : 'Cloud'
OperatingSystemName : 'Cloud'
OperatingSystemVersion : '??'
OperatingSystemVersionMajor : '??'
OperatingSystemNameVersion : 'Cloud ??'
OperatingSystemNameVersionMajor : 'Cloud ??'
LayoutEngineClass : 'Robot'
LayoutEngineName : 'vmcrawl'
LayoutEngineVersion : '0.2'
LayoutEngineVersionMajor : '0'
LayoutEngineNameVersion : 'vmcrawl 0.2'
LayoutEngineNameVersionMajor : 'vmcrawl 0'
AgentClass : 'Robot'
AgentName : 'Vmcrawl'
AgentVersion : '0.2'
AgentVersionMajor : '0'
AgentNameVersion : 'Vmcrawl 0.2'
AgentNameVersionMajor : 'Vmcrawl 0'
AgentInformationUrl : 'https://docs.vmst.io/projects/vmcrawl'


- test:
input:
User-Agent : 'Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Amazonbot/0.1; +https://developer.amazon.com/support/amazonbot) Chrome/119.0.6045.214 Safari/537.36'
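The expected values in the test cases above follow a visible pattern: each `...VersionMajor` field is the part of the version before the first dot, and each `...NameVersion` / `...NameVersionMajor` field is the name joined to the version (or major version) with a single space. The Python sketch below reproduces only that pattern, as an aid when writing or reviewing such test cases by hand. The function name `derived_fields` is hypothetical, and this is an illustration of the pattern, not how the analyzer itself computes these fields.

```python
def derived_fields(prefix: str, name: str, version: str) -> dict:
    """Build the five name/version fields for one prefix (e.g. 'Agent').

    Hypothetical helper: it only mirrors the pattern seen in the
    expected values of the test cases, under the assumption that the
    major version is everything before the first dot.
    """
    # For an unknown version ('??') there is no dot, so the "major"
    # part is simply '??' again, matching the test cases.
    major = version.split(".")[0]
    return {
        f"{prefix}Name": name,
        f"{prefix}Version": version,
        f"{prefix}VersionMajor": major,
        f"{prefix}NameVersion": f"{name} {version}",
        f"{prefix}NameVersionMajor": f"{name} {major}",
    }


if __name__ == "__main__":
    # Values taken from the vmcrawl and Bytespider test cases.
    for key, value in derived_fields("Agent", "Vmcrawl", "0.2").items():
        print(f"{key:<24}: '{value}'")
    for key, value in derived_fields("LayoutEngine", "Bytespider", "??").items():
        print(f"{key:<30}: '{value}'")
```

A helper like this can catch copy-paste slips, such as a `NameVersionMajor` value that still carries the full version string.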
