A scraper to pull images from NDBC's network of buoys that have cam feeds

MrKing2023/buoy-cam-scraper

 
 


Buoy Cam Scraper

A scraper to pull images from the network of NDBC (National Data Buoy Center) buoys that have camera feeds: lonely sensors that constantly monitor our seas, air and atmosphere. The cameras are updated hourly, and the script re-scrapes the feeds on a fixed interval from the time it starts running.

Installation

Make sure you have Node.js installed. Download the files, open a terminal in the buoy-cam-scraper folder, and run:

npm install

Usage

Open up a terminal in buoy-cam-scraper/ and run:

npm run start

This starts capturing new images from the buoy cameras and saves them to scraped-images/. The script re-scrapes every 30 minutes. Each file is named in the following format:

scraped-images/[buoy ID]/[UTC ms when photo was taken by buoy]-[UTC ms when photo was downloaded]-[buoy ID].jpg
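The naming scheme above could be built and read back roughly as follows. This is a minimal sketch, not the project's own code: the function names `buildImagePath` and `parseImageName` are hypothetical, and the buoy ID and timestamps in the usage note are made-up example values.

```javascript
const path = require('node:path');

// Build the output path for one image, following the format:
// scraped-images/[buoy ID]/[taken ms]-[downloaded ms]-[buoy ID].jpg
// (hypothetical helper; the real script may assemble the path differently)
function buildImagePath(buoyId, takenMs, downloadedMs, rootDir = 'scraped-images') {
  const fileName = `${takenMs}-${downloadedMs}-${buoyId}.jpg`;
  // path.posix keeps forward slashes regardless of the host OS
  return path.posix.join(rootDir, String(buoyId), fileName);
}

// Recover the three fields from a file name, or return null if it does not match.
function parseImageName(fileName) {
  const match = /^(\d+)-(\d+)-(.+)\.jpg$/.exec(fileName);
  if (!match) return null;
  return {
    takenMs: Number(match[1]),
    downloadedMs: Number(match[2]),
    buoyId: match[3],
  };
}
```

For example, `buildImagePath('41001', 1678889400000, 1678890000000)` yields `scraped-images/41001/1678889400000-1678890000000-41001.jpg`, and `parseImageName` applied to that file name returns the same three values.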

The UTC time at which the buoy took the photo is read from the image via OCR using tesseract.js. Images that contain no data (i.e. all-white images) and images that have already been downloaded are skipped.
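The checks described above might look something like the sketch below. It starts from text that OCR has already produced and from pixels that have already been decoded, so it needs no external packages. Several things here are assumptions rather than facts about the project: the timestamp layout (`MM/DD/YYYY HHMM UTC`), the whiteness threshold, and the helper names `parseTakenMs`, `isBlankImage`, and `alreadyDownloaded`.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// ASSUMPTION: the OCR text contains a stamp such as "03/15/2023 1410 UTC".
// The actual caption layout on the buoy images may differ.
const STAMP_RE = /(\d{2})\/(\d{2})\/(\d{4})\s+(\d{2})(\d{2})\s*UTC/;

// Convert recognized text into UTC milliseconds, or null if no plausible stamp is found.
function parseTakenMs(ocrText) {
  const match = STAMP_RE.exec(ocrText);
  if (!match) return null;
  const [, month, day, year, hour, minute] = match.map(Number);
  // Reject values that OCR misreads could produce (e.g. month 13).
  if (month < 1 || month > 12 || day < 1 || day > 31 || hour > 23 || minute > 59) {
    return null;
  }
  return Date.UTC(year, month - 1, day, hour, minute);
}

// Treat an image as "no data" when every channel byte is close to white.
// `pixels` is a Uint8Array of decoded RGB or RGBA bytes; decoding the JPEG
// itself is out of scope here. The threshold of 250 is an arbitrary choice.
function isBlankImage(pixels, threshold = 250) {
  return pixels.length > 0 && pixels.every((value) => value >= threshold);
}

// An image counts as already downloaded if the buoy's folder holds a file
// whose name starts with the same "taken" timestamp.
function alreadyDownloaded(buoyDir, takenMs) {
  if (!fs.existsSync(buoyDir)) return false;
  return fs.readdirSync(buoyDir).some((name) => name.startsWith(`${takenMs}-`));
}
```

Keying the duplicate check on the "taken" timestamp rather than the download time means a re-scrape that finds the same frame again can be skipped without comparing image contents.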

Performance

The code is written to run on a Raspberry Pi. On a Raspberry Pi 3, scraping images from all the buoy cams takes about 2 minutes, most of which is spent on OCR.
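Because a full pass takes a couple of minutes on slow hardware, a repeating scrape loop benefits from a guard against overlapping passes. The sketch below shows one way to do that with `setInterval`. The name `startScraping` and the `scrapeOnce` callback are hypothetical, and the 30-minute default simply mirrors the interval mentioned under Usage.

```javascript
// Run `scrapeOnce` immediately, then again every `intervalMs` milliseconds.
// Returns a function that stops the timer.
// (hypothetical helper; not taken from the project's source)
function startScraping(scrapeOnce, intervalMs = 30 * 60 * 1000) {
  let running = false;

  const run = async () => {
    if (running) return; // the previous pass has not finished yet, skip this tick
    running = true;
    try {
      await scrapeOnce();
    } catch (err) {
      // Log and carry on so one failed pass does not stop the loop.
      console.error('scrape failed:', err);
    } finally {
      running = false;
    }
  };

  run(); // first pass starts right away
  const timer = setInterval(run, intervalMs);
  return () => clearInterval(timer);
}
```

Calling `const stop = startScraping(myScrapeFunction)` starts the loop, and `stop()` ends it.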

Info

See NDBC's documentation for information about its web APIs.

Data

The data folder contains buoycam-info.json, a file of scraped metadata about the buoy cams: name, GPS location, image width, and image height.
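As an illustration only, one entry of that metadata could be checked like this. The key names (`name`, `lat`, `lng`, `width`, `height`) and the sample values are assumptions made for this sketch; the real keys in buoycam-info.json may be different.

```javascript
// HYPOTHETICAL entry shape, with made-up values.
const exampleEntry = {
  name: 'Example buoy',
  lat: 34.7,
  lng: -72.3,
  width: 2880,
  height: 300,
};

// Check that an entry carries the four kinds of metadata listed above.
function isValidCamInfo(entry) {
  return (
    typeof entry === 'object' && entry !== null &&
    typeof entry.name === 'string' && entry.name.length > 0 &&
    Number.isFinite(entry.lat) && Math.abs(entry.lat) <= 90 &&
    Number.isFinite(entry.lng) && Math.abs(entry.lng) <= 180 &&
    Number.isInteger(entry.width) && entry.width > 0 &&
    Number.isInteger(entry.height) && entry.height > 0
  );
}
```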

