An open-source template built for internal use by Cloudflare's SEO experts to parse websites and send notifications when new content is found. Using D1, Queues, and Workers, this template shows you how to connect multiple Cloudflare products together to build a fully-featured application.
Clone the prospector template from the cloudflare/templates repository:
$ npm init cloudflare my-project prospector
# or
$ yarn create cloudflare my-project prospector
# or
$ pnpm create cloudflare my-project prospector
Install Wrangler if not already installed.
$ npm install wrangler -g
Login to your account using Wrangler.
$ wrangler login
Create a new D1 database and Queues instance.
$ wrangler d1 create $DATABASE_NAME
$ wrangler queues create $QUEUE_NAME
Update wrangler.toml with the appropriate bindings. See the configuration section below for more information.
Run the bin/migrate script to create the tables in the database.
$ bin/migrate
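Under the hood, applying a D1 schema is a single Wrangler command; a minimal sketch of the equivalent invocation, assuming the schema lives in a schema.sql file at the project root (the file name is an assumption, check the repository for the actual script):

$ wrangler d1 execute $DATABASE_NAME --file=./schema.sql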
Deploy the application to your account.
$ npm run deploy
Visit the Workers URL to access the user interface, where you can add notifiers and URLs. You will receive an email whenever a new keyword match is found.
Prospector is configured with a combination of environment variables and secrets. The following configuration options are available (some are required):
AUTH_TOKEN - An optional token used for authentication when scraping websites. If not provided, authentication is disabled. If provided, it is passed to the website as a bearer token in the Authorization header.
SITEMAP_URL - The URL of the sitemap to use for scraping. This is required.
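For illustration, here is a minimal TypeScript sketch of how a Worker might consume these two values when fetching the sitemap. The Env shape and the fetchSitemap helper are assumptions for this example, not the template's actual code:

// Illustrative sketch only: the Env shape and helper name are assumptions.
export interface Env {
  SITEMAP_URL: string; // required
  AUTH_TOKEN?: string; // optional secret
}

async function fetchSitemap(env: Env): Promise<string> {
  const headers: Record<string, string> = {};
  // Attach the Authorization header only when a token is configured.
  if (env.AUTH_TOKEN) {
    headers["Authorization"] = `Bearer ${env.AUTH_TOKEN}`;
  }
  const response = await fetch(env.SITEMAP_URL, { headers });
  if (!response.ok) {
    throw new Error(`Failed to fetch sitemap: ${response.status}`);
  }
  return response.text();
}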
Additionally, you must set up a D1 database and a Queues instance. Both are configured in the wrangler.toml file:
[[d1_databases]]
binding = "DB"
database_name = "{{database_name}}"
database_id = "{{database_id}}"
preview_database_id = "{{database_preview_id}}"
[[queues.producers]]
queue = "{{queue_name}}"
binding = "QUEUE"
[[queues.consumers]]
queue = "{{queue_name}}"
max_batch_size = 10
max_batch_timeout = 30
max_retries = 10
dead_letter_queue = "{{dlq_queue_name}}"
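With these bindings in place, the Worker can reach the database through env.DB and the queue through env.QUEUE. A minimal TypeScript sketch of a producer and a consumer, assuming a hypothetical urls table and message shape (both are illustrative, not the template's actual schema):

// Illustrative sketch only: the urls table and message shape are assumptions.
interface Env {
  DB: D1Database;
  QUEUE: Queue;
}

// Producer: read stored URLs from D1 and enqueue each one for scraping.
async function enqueueUrls(env: Env): Promise<void> {
  const { results } = await env.DB.prepare("SELECT url FROM urls").all<{ url: string }>();
  for (const row of results) {
    await env.QUEUE.send({ url: row.url });
  }
}

// Consumer: invoked with batches of up to max_batch_size (10) messages.
export default {
  async queue(batch: MessageBatch<{ url: string }>, env: Env): Promise<void> {
    for (const message of batch.messages) {
      // Scrape message.body.url, look for keyword matches, notify on hits.
      message.ack();
    }
  },
};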
Finally, you must enable a cron trigger to run the scraper. This is configured in the wrangler.toml file:
[triggers]
crons = ["0 0 * * *"]
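The expression above fires once a day at midnight UTC. When the trigger fires, the Workers runtime invokes the scheduled handler, which is the natural place to start a scraping run. A minimal TypeScript sketch, reusing the hypothetical enqueueUrls helper and Env interface from the previous example:

// Illustrative sketch only: reuses the hypothetical enqueueUrls helper above.
export default {
  // Invoked by the cron trigger defined in wrangler.toml.
  async scheduled(controller: ScheduledController, env: Env, ctx: ExecutionContext): Promise<void> {
    // Start a scraping run; waitUntil keeps the Worker alive until it completes.
    ctx.waitUntil(enqueueUrls(env));
  },
};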