This repo can be used to quickly generate YOLOv8 models for biodiversity monitoring, relying on Ultralytics and a GBIF dataset. All code is tested on Windows 10 and Python 3.11, without a GPU. A GPU would accelerate the steps below, and Ultralytics should automatically select an available GPU if there is one.
We have released a new version, which is also available as a package that can be installed directly: https://github.com/Tvenver/Bplusplus/tree/package
To create your own custom CV model:
- Enter the scientific names of your target species in the names.csv file, in the data folder.
- Download the GBIF repository of your choosing, or download a prepared dataset linking to 16M images of many insect species: https://doi.org/10.15468/dl.dk9czq
- Update the paths in collect_images.py on line 36 and line 54 so that they point to the unzipped GBIF download.
- In collect_images.py, consider activating the sampling function to reduce the number of images downloaded per species. With many insect species, the download otherwise takes considerably longer.
- Run collect_images.py. This reads the names, iterates through them, and attempts to download the matching images from the GBIF data repository.
- As an example, for about 8 insect species, yielding 4000 images in total, the entire operation might take around 20 minutes, depending on your internet speed and hardware.
- Run train_validate.py. This shuffles the images into a training set and a validation set, and Ultralytics takes care of the training.
- You can tweak various training parameters if you want to. See the Ultralytics YOLOv8 documentation for more information.
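
The sampling step mentioned in the list above amounts to capping the number of images kept per species before downloading. The following is a minimal sketch of that idea, not the code in collect_images.py: the function name `sample_per_species`, the `(species, url)` record layout, and the `max_per_species` parameter are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def sample_per_species(records, max_per_species, seed=0):
    """Keep at most max_per_species image URLs for each species.

    records is an iterable of (species_name, image_url) pairs. This layout
    is hypothetical and not taken from collect_images.py.
    """
    # Group the URLs by species name.
    by_species = defaultdict(list)
    for species, url in records:
        by_species[species].append(url)

    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    sampled = {}
    for species, urls in by_species.items():
        if len(urls) > max_per_species:
            # Draw a random subset without replacement.
            urls = rng.sample(urls, max_per_species)
        sampled[species] = urls
    return sampled
```

Species that have fewer records than the cap are kept in full, so rare species are not reduced any further.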
You have created a YOLOv8 model for image classification.
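
For readers who want to see what the shuffle-and-train step involves, here is a minimal sketch under stated assumptions. It is not the contents of train_validate.py. The helper names `split_dataset` and `train_classifier`, the 80/20 split, the `yolov8n-cls.pt` starting weights, and the epoch and image-size values are illustrative choices. It assumes one sub-folder of images per species and produces the `train/<class>` and `val/<class>` layout that Ultralytics expects for classification.

```python
import random
import shutil
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def split_dataset(source_dir, output_dir, val_fraction=0.2, seed=0):
    """Copy source_dir/<species>/* into output_dir/train and output_dir/val.

    output_dir should live outside source_dir so it is not mistaken
    for a species folder.
    """
    source_dir, output_dir = Path(source_dir), Path(output_dir)
    rng = random.Random(seed)  # fixed seed gives a reproducible split
    counts = {}
    for class_dir in sorted(p for p in source_dir.iterdir() if p.is_dir()):
        images = sorted(
            f for f in class_dir.iterdir() if f.suffix.lower() in IMAGE_SUFFIXES
        )
        rng.shuffle(images)
        n_val = int(len(images) * val_fraction)
        for i, image in enumerate(images):
            # The first n_val shuffled images go to validation, the rest to training.
            split = "val" if i < n_val else "train"
            target = output_dir / split / class_dir.name
            target.mkdir(parents=True, exist_ok=True)
            shutil.copy2(image, target / image.name)
        counts[class_dir.name] = {"train": len(images) - n_val, "val": n_val}
    return counts


def train_classifier(dataset_dir, epochs=10, imgsz=224):
    """Fine-tune a YOLOv8 classification checkpoint on dataset_dir."""
    # Imported lazily so the split helper works without Ultralytics installed.
    from ultralytics import YOLO

    model = YOLO("yolov8n-cls.pt")  # assumed starting weights (smallest cls model)
    return model.train(data=str(dataset_dir), epochs=epochs, imgsz=imgsz)
```

A typical call would be `split_dataset("raw_images", "dataset")` followed by `train_classifier("dataset")`, where both folder names are placeholders.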
This repo also provides a pretrained YOLOv8 classification model, covering 2584 species, under B++ CV Model. The included species are listed in a separate file.
To use the pretrained model:
- Download the pretrained model from the Google Drive link listed in the folder B++ CV Model
- Take the run_model.py script, specify the path to the downloaded .pt file, and run the model.
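
As a rough illustration of running a downloaded classification checkpoint with Ultralytics, consider the sketch below. It is not the contents of run_model.py. The function names `classify_image` and `format_predictions` are hypothetical, and the paths in the usage note are placeholders to replace with your own.

```python
def classify_image(weights_path, image_path):
    """Return the five most likely (species, confidence) pairs for one image."""
    # Imported lazily; requires the ultralytics package to be installed.
    from ultralytics import YOLO

    model = YOLO(weights_path)
    result = model(image_path)[0]  # one Results object per input image
    probs = result.probs  # class probabilities of a classification model
    return [
        (result.names[index], float(conf))
        for index, conf in zip(probs.top5, probs.top5conf)
    ]


def format_predictions(predictions):
    """Render (species, confidence) pairs as one 'name: percent' line each."""
    return "\n".join(f"{name}: {conf:.1%}" for name, conf in predictions)
```

For example, `print(format_predictions(classify_image("path/to/model.pt", "path/to/photo.jpg")))` would print one species per line with its confidence.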
All information in this GitHub repository is available under the MIT license, as long as credit is given to the authors.
Venverloo, T., Duarte, F., B++: Towards Real-Time Monitoring of Insect Species. MIT Senseable City Laboratory, AMS Institute.