Trandoshan dark web crawler

This repository is a complete rewrite of the Trandoshan dark web crawler. Everything has been written inside a single Git repository to ease maintenance.

Why a rewrite?

The first version of Trandoshan (available here) works well but is not really professional: the code has become messy and is hard to manage because it is split across multiple repositories.

I have therefore decided to create and maintain the project in this single repository, where the code for all processes will be available (as a Go module).

How to build the crawler

Since the Docker images are not available yet, you must run the following script to build the crawler:

./scripts/build.sh

How to start the crawler

Execute ./scripts/start.sh and wait for all containers to start. You can start the crawler in detached mode by passing --detach to start.sh.
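When starting in detached mode, it can help to wait until the API answers before going further. The helper below is a minimal sketch, not part of the repository: the function name wait_for_api is hypothetical, it assumes curl is installed, and it polls the API address given in the crawling section of this README (localhost:15005). Any HTTP response, including an error status, is treated as "up".

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Trandoshan).
# Polls a URL once per second until it answers or the attempts run out.
wait_for_api() {
    url="$1"
    attempts="${2:-30}"   # number of tries, defaults to 30
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # curl exits with 0 as soon as the server returns any HTTP response.
        if curl --silent --output /dev/null --max-time 2 "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}
```

For example: ./scripts/start.sh --detach && wait_for_api http://localhost:15005 60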

Note

Ensure you have at least 3GB of memory, as the Elasticsearch stack alone requires 2GB.
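The memory requirement can be checked before starting the containers. This is a sketch for Linux hosts only, since it reads /proc/meminfo; the function name has_enough_memory is hypothetical. The kernel usually reports slightly less than the installed RAM, so treat the threshold as approximate and lower it if needed.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of the repository).
# $1: required memory in MB, $2: meminfo file (defaults to /proc/meminfo).
has_enough_memory() {
    required_mb="$1"
    meminfo="${2:-/proc/meminfo}"
    # MemTotal is reported in kB.
    total_kb=$(awk '/^MemTotal:/ {print $2}' "$meminfo")
    [ -n "$total_kb" ] && [ $((total_kb / 1024)) -ge "$required_mb" ]
}

if has_enough_memory 3072; then
    echo "memory check passed"
else
    echo "less than 3GB of memory detected; Elasticsearch may fail to start" >&2
fi
```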

How to start the crawling process

Since the API is exposed on localhost:15005, you can use it to start the crawling process:

feeder --api-uri http://localhost:15005 --url https://www.facebookcorewwwi.onion

This forces the API to publish the given URL to the crawling queue.
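To seed the crawler with several URLs, the feeder command shown above can be called in a loop. The sketch below is an illustration only: the feed_urls function, the FEEDER override variable, and the seeds file are assumptions and are not provided by the repository. It only uses the --api-uri and --url flags shown above.

```shell
#!/bin/sh
# Hypothetical helper: publish every URL listed in a file (one per line).
# $1: API address, $2: file containing the URLs.
# FEEDER may be set to the path of the feeder binary; defaults to "feeder".
feed_urls() {
    api_uri="$1"
    url_file="$2"
    while IFS= read -r url; do
        # Skip blank lines and comment lines starting with '#'.
        case "$url" in
            ''|'#'*) continue ;;
        esac
        # Redirect stdin so the command cannot consume the URL list.
        "${FEEDER:-feeder}" --api-uri "$api_uri" --url "$url" < /dev/null
    done < "$url_file"
}
```

For example, with a file seeds.txt containing one .onion URL per line: feed_urls http://localhost:15005 seeds.txt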

How to view results

At the moment there is no Trandoshan dashboard. You can use the Kibana dashboard available at http://localhost:15004.

You will need to create an index pattern named 'resources', and when it asks for the time field, choose 'time'.
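The index pattern can also be created without the UI. The sketch below assumes the bundled Kibana is a 7.x release that exposes the saved objects HTTP API; this README does not state the Kibana version, so check it before relying on this. The function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the repository).

# Builds the JSON body for an index pattern with the given title and time field.
index_pattern_payload() {
    printf '{"attributes":{"title":"%s","timeFieldName":"%s"}}' "$1" "$2"
}

# $1: Kibana base URL, $2: index pattern name, $3: time field.
# Assumes the Kibana 7.x saved objects API; the pattern name is reused as its id.
create_index_pattern() {
    curl --silent --show-error --fail \
        -X POST "$1/api/saved_objects/index-pattern/$2" \
        -H 'kbn-xsrf: true' \
        -H 'Content-Type: application/json' \
        -d "$(index_pattern_payload "$2" "$3")"
}
```

For example: create_index_pattern http://localhost:15004 resources time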

