This project uses convolutional neural nets to classify images of wildflowers found in Colorado's Front Range.
Accurate identification of wildflowers is relevant to both recreation and environmental management. Several mobile apps currently identify flowers from images; the best of these (e.g., https://identify.plantnet-project.org/) is connected with an annual international competition for advancing techniques for plant classification from images. However, none of the existing plant identification apps are particularly accurate at identifying flowers in North America.
Example: a Pl@ntNet attempt to identify Delphinium nuttallianum, a plant commonly seen blooming in Colorado in early spring:
It seems reasonable that a model trained primarily on images of flora prevalent in the Front Range of Colorado would be more likely to correctly identify images of local wildflowers than global apps trained on flora located primarily in other regions of the world. The primary aim of this project is to develop a model for classification of wildflowers native to the Front Range in Colorado. A secondary aim is to develop a model that, in future, could take advantage of metadata provided by users of a mobile app while photographing wildflowers in order to provide more accurate classifications.
Initially, I planned to collect images via web scraping. However, my preliminary efforts suggested that web scraping would be very time-intensive, as most websites with images of wildflowers have only a few images of each species. Additionally, when considering ways to improve upon existing flower identification apps, it seemed to me that photographs tagged with date/time and GPS location could be useful. In the long term, historical GPS and date/time information could improve prediction of flower species, since each species is more common in particular areas/elevations and at particular times of the year. More immediately, GPS information permits clustering of photos by location, which will allow me to group images within observations (i.e., one plant = one observation), a strategy employed in the 2015 LifeCLEF challenge (for a summary, see http://ceur-ws.org/Vol-1391/157-CR.pdf). For all these reasons, I chose to collect photographs of local wildflowers using my iPhone and a point-and-shoot camera. I also gathered mobile phone photos from friends and family.
Basic hand-written CNN using Keras with a Theano backend, trained on photos taken with my iPhone 6s (a minimal architecture sketch follows the data summary below).
Data: For this model, I was pickier with images than in later attempts (see below); I only included images that were in focus and I removed images that were very similar.
- 651 images representing 11 wildflower species/classes
- Images resized to 120 x 90
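To make the setup concrete, here is a minimal sketch of a small hand-written CNN of the kind described above, assuming Keras 2 syntax and channels-last data format; the layer sizes and optimizer are illustrative assumptions, not the original architecture.

```python
# A minimal sketch, not the original model: layer widths are assumptions.
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

model = Sequential([
    # Two convolution/pooling stages to learn low-level flower features;
    # input is a 120 x 90 RGB image (height 90, width 120).
    Conv2D(32, (3, 3), activation='relu', input_shape=(90, 120, 3)),
    MaxPooling2D(pool_size=(2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    # Classification head for the 11 wildflower classes
    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.5),
    Dense(11, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```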
Results: Accuracy was .88. Misclassified images were most often confused with Penstemon virens (suggesting that I needed more photos of Penstemon virens) or were images containing a lot of foliage. The latter seemed to be due to the relative infrequency of zoomed-out, foliage-heavy images in the data set generally. To address this, I considered adding more zoomed-out images, using higher-resolution images, or cropping the images. The foliage-related misclassification issue is demonstrated by the images below:
Next Steps: A brief review of the literature on image classification for flowers led me to publications from recent successful teams in the annual PlantCLEF competition (http://www.imageclef.org/lifeclef/2016/plant). I was particularly interested in the possibility of using a deep residual network, based on work from Šulc and colleagues (http://cmp.felk.cvut.cz/~mishkdmy/papers/CMP-CLEF-2016.pdf).
The current standard for plant identification is fine-tuning very deep networks pre-trained on large datasets of images (e.g., ImageNet (http://www.image-net.org/)). One of the newer advances in deep networks is He and colleagues' residual neural network, ResNet (https://arxiv.org/abs/1512.03385). Deep networks have been of great interest to computer vision researchers because networks with more layers can learn richer feature hierarchies than those with fewer layers, which is very useful for differentiating visually complex objects like flowers. However, plain deep networks suffer from a degradation problem as layers are added: accuracy saturates and then degrades, and the deeper models actually underfit on the training data. Residual networks address this by training each block to learn a residual function F(x) = H(x) − x relative to its input, rather than the full mapping H(x) directly; shortcut connections pass the identity mapping past the convolutional layers, so each block only needs to learn a correction to the identity. This makes very deep networks much easier to optimize.
Image from He et al., 2015 paper: https://arxiv.org/abs/1512.03385
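To illustrate the residual idea described above, here is a minimal sketch of an identity residual block, assuming Keras 2; the filter counts and input shape are illustrative assumptions, not values from the paper.

```python
# A minimal sketch of a residual block: the shortcut carries the identity
# past the conv layers, so the layers learn only the residual F(x).
from keras.layers import Input, Conv2D, BatchNormalization, Activation, add
from keras.models import Model

def identity_residual_block(x, filters=64):
    shortcut = x  # identity mapping carried past the conv layers
    y = Conv2D(filters, (3, 3), padding='same')(x)
    y = BatchNormalization()(y)
    y = Activation('relu')(y)
    y = Conv2D(filters, (3, 3), padding='same')(y)
    y = BatchNormalization()(y)
    y = add([y, shortcut])        # H(x) = F(x) + x
    return Activation('relu')(y)

# Hypothetical input: a 56 x 56 feature map with 64 channels, so the
# identity shortcut matches the block's output shape.
inputs = Input(shape=(56, 56, 64))
outputs = identity_residual_block(inputs)
block = Model(inputs, outputs)
```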
Fine-tuning of a pre-trained ResNet50 (Keras build from https://github.com/fchollet/keras/blob/master/keras/applications/resnet50.py). ResNet50 was trained on millions of images of objects, so it already detects basic visual features (e.g., edges, colors). By adding fully connected layers specific to the wildflower data, we essentially fine-tune ResNet50 to apply its understanding of basic objects to the features that distinguish our flower species/classes (a minimal Keras sketch follows the list below).
- Base Model = ResNet50 trained on the ImageNet dataset
- Fully connected layers are specific to this project:
- Flatten
- Dense (activation = relu)
- Dense (matches shape of 13 flower classes, activation=softmax)
- Compiling model:
- Optimizer = SGD
- Loss = categorical crossentropy
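Here is a minimal sketch of the fine-tuning setup described in the list above, assuming Keras 2 and its ResNet50 application; the Dense layer width, learning rate, and frozen base are my assumptions, not documented choices.

```python
# A minimal sketch of the fine-tuning setup; hyperparameters are assumed.
from keras.applications.resnet50 import ResNet50
from keras.layers import Flatten, Dense
from keras.models import Model
from keras.optimizers import SGD

NUM_CLASSES = 13  # wildflower species in the expanded data set

# Load ResNet50 with ImageNet weights, dropping its classification head.
base_model = ResNet50(weights='imagenet', include_top=False,
                      input_shape=(224, 224, 3))

# Freeze the convolutional base so only the new head is trained (assumed).
for layer in base_model.layers:
    layer.trainable = False

# New fully connected head specific to the wildflower classes.
x = Flatten()(base_model.output)
x = Dense(256, activation='relu')(x)  # width of 256 is an assumption
predictions = Dense(NUM_CLASSES, activation='softmax')(x)

model = Model(inputs=base_model.input, outputs=predictions)
model.compile(optimizer=SGD(lr=0.001, momentum=0.9),  # settings assumed
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```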
Image Preprocessing: This time, I wanted to try using image generation to reduce overfitting (see below). To do this, I first needed to resize the images (to 256 x 256) and center-crop them (to 224 x 224).
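A minimal sketch of this resize-then-center-crop step, assuming Pillow; the file paths are hypothetical.

```python
from PIL import Image

def resize_and_center_crop(path, resize_to=256, crop_to=224):
    # Resize to 256 x 256, then crop the central 224 x 224 region.
    img = Image.open(path).resize((resize_to, resize_to))
    offset = (resize_to - crop_to) // 2  # 16 px trimmed from each side
    return img.crop((offset, offset, offset + crop_to, offset + crop_to))

# Hypothetical usage:
cropped = resize_and_center_crop('photos/delphinium/IMG_0421.jpg')
cropped.save('processed/delphinium/IMG_0421.jpg')
```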
Image Generation: To decrease the chance of overfitting, the image generator in Keras provided newly augmented images for each epoch; thus, the model never saw the same image twice. Random augmentations included horizontal flips, rotations (up to 30 degrees), and horizontal and vertical shifts.
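A minimal sketch of that augmentation step, assuming Keras's ImageDataGenerator; the shift fractions and batch size are assumptions, since the text specifies only flips, rotations up to 30 degrees, and shifts.

```python
from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    horizontal_flip=True,    # random left-right flips
    rotation_range=30,       # rotations up to 30 degrees
    width_shift_range=0.1,   # horizontal shifts (fraction of width; assumed)
    height_shift_range=0.1)  # vertical shifts (fraction of height; assumed)

# Each epoch draws freshly augmented batches, so the model never sees the
# exact same image twice, e.g.:
# model.fit_generator(datagen.flow(X_train, y_train, batch_size=32),
#                     steps_per_epoch=len(X_train) // 32, epochs=20)
```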
- Data: Using all available data (including out-of-focus photos):
- 1,526 images / 13 species of flowers
- Training the Model
- Set aside 20% of the data (n = 306) for the validation data set
- Trained the model with a train/test split (80% train) of the remaining images (n = 1,220) on an NVIDIA GPU using an Amazon AWS EC2 instance (a sketch of the splitting scheme follows this list).
- Accuracy with random guessing, given the class imbalance, would be .09.
- Model accuracy on validation data: .97
- 97% accuracy is pretty good!
- Only misclassified 4/306 flowers
- The misclassified images look like they were challenging cases (i.e., side views (rare), blurred images, unusual bloom appearance for a given class).
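A minimal sketch of the splitting scheme described in the list above, assuming scikit-learn; the placeholder arrays, array shapes, and random seed are assumptions standing in for the real preprocessed images and one-hot labels.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholders standing in for the 1,526 preprocessed images and labels.
X = np.zeros((1526, 224, 224, 3), dtype='float32')
y = np.eye(13)[np.random.randint(0, 13, size=1526)]

# Hold out 20% of all images as the validation set (n = 306).
X_rest, X_val, y_rest, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Split the remaining 1,220 images 80/20 into train and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X_rest, y_rest, test_size=0.2, random_state=42)
```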
Next Steps:
- Add more classes/more images
- Include images from cameras other than my iPhone 6s
- Bagging of multiple deep networks to improve accuracy with more classes
- Object recognition: automated cropping
- Cluster images by geotags
Goëau, H., Bonnet, P., & Joly, A. (2015). LifeCLEF Plant Identification Task 2015. (http://ceur-ws.org/Vol-1391/157-CR.pdf)
He, K., Zhang, X., Ren, S., & Sun, J. (2015). Deep residual learning for image recognition. arXiv preprint arXiv:1512.03385. (https://arxiv.org/abs/1512.03385)
Šulc, M., Mishkin, D., & Matas, J. (2016). Very deep residual networks with MaxOut for plant identification in the wild. (http://cmp.felk.cvut.cz/~mishkdmy/papers/CMP-CLEF-2016.pdf)
© Jennifer Waller 2017. All rights reserved.