
Introduction

This document primarily lists resources for performing deep learning (DL) on satellite imagery. To a lesser extent, classical machine learning (ML, e.g. random forests, stochastic gradient descent) is also discussed, as are classical image processing techniques.

Datasets

Various datasets listed here

Sentinel

One of the best known open data sets. See wikipedia.
Sentinel-hub provides access to a range of Sentinel data and may be the best overall source of imagery + data.
raw data - requester pays
Paid access via sentinel-hub and python-api.
GBDX also has Sentinel imagery.
Example loading sentinel data in a notebook

Kaggle

Kaggle hosts several large satellite image datasets (> 1 GB). A list of general image datasets is here. A list of land-use datasets is here.

Kaggle - Deepsat - classification challenge

Each sample image is a size-normalized 28x28 pixel patch with 4 bands: red, green, blue and near infrared. The training and test labels are one-hot encoded 1x6 vectors. Data is in .mat Matlab format. JPEG?

Sat4: 500,000 image patches covering four broad land cover classes: barren land, trees, grassland and a class that consists of all land cover other than those three. Example notebook
Sat6: 405,000 image patches, each 28x28 pixels, covering 6 land cover classes: barren land, trees, grassland, roads, buildings and water bodies.
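The patch and label layout described above can be prepared for training with a few lines of NumPy. This is a minimal sketch that assumes the arrays have already been read from the .mat files and arranged as (n_samples, 28, 28, 4) patches and (n_samples, n_classes) one-hot labels; the function names are illustrative and the on-disk axis order may differ, so transpose first if needed.

```python
import numpy as np


def decode_one_hot(labels):
    """Convert (n_samples, n_classes) one-hot rows to integer class indices."""
    labels = np.asarray(labels)
    if labels.ndim != 2:
        raise ValueError("expected a 2D array of one-hot rows")
    return labels.argmax(axis=1)


def to_float_patches(patches):
    """Scale 8-bit patches of shape (n, 28, 28, 4) to float32 values in [0, 1]."""
    patches = np.asarray(patches)
    # Assumption: samples on the first axis, bands (R, G, B, NIR) on the last.
    if patches.shape[1:] != (28, 28, 4):
        raise ValueError("expected patches shaped (n, 28, 28, 4)")
    return patches.astype(np.float32) / 255.0
```

Integer class indices are what most loss functions for single-label classification expect, which is why the one-hot rows are collapsed with argmax.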

Kaggle - Amazon from space - classification challenge

https://www.kaggle.com/c/planet-understanding-the-amazon-from-space/data
3-5 meter resolution GeoTIFF images
12 classes including - cloudy, primary + waterway etc

Kaggle - DSTL - segmentation challenge

https://www.kaggle.com/c/dstl-satellite-imagery-feature-detection
45 satellite images covering 1km x 1km in both 3-band and 16-band formats.
10 Labelled classes include - Buildings, Road, Trees, Crops, Waterway, Vehicles

Kaggle - Airbus Ship Detection Challenge

https://www.kaggle.com/c/airbus-ship-detection/overview
I believe there was a problem with this dataset, which led to many complaints that the competition was ruined.

Kaggle - Draper - place images in order of time

https://www.kaggle.com/c/draper-satellite-image-chronology/data
Images are grouped into sets of five that share the same setId. Each image in a set was taken on a different day (but not necessarily at the same time each day). The images in each set cover approximately the same area but are not exactly aligned.

Kaggle - other

Satellite + loan data -> https://www.kaggle.com/reubencpereira/spatial-data-repo

Alternative datasets

There are a variety of datasets suitable for land classification problems.

UC Merced

http://weegee.vision.ucmerced.edu/datasets/landuse.html
This is a 21 class land use image dataset meant for research purposes.
There are 100 RGB TIFF images for each class
Each image measures 256x256 pixels with a pixel resolution of 1 foot

AWS datasets

Landsat -> free viewer at remotepixel and libra
Optical, radar, segmented etc. https://aws.amazon.com/earth/
SpaceNet

Quilt

Several people have uploaded datasets to Quilt

Google Earth Engine

https://developers.google.com/earth-engine/
Various imagery and climate datasets, including Landsat & Sentinel imagery
Python API, but all compute happens on Google's servers

Weather Datasets

UK Met Office -> https://www.metoffice.gov.uk/datapoint
NASA (make request and emailed when ready) -> https://search.earthdata.nasa.gov
NOAA (requires BigQuery) -> https://www.kaggle.com/noaa/goes16/home
Time series weather data for several US cities -> https://www.kaggle.com/selfishgene/historical-hourly-weather-data

Online computing resources

Generally a GPU is required for DL. Google's Colab is free but offers limited compute time (12 hours) and is somewhat non-persistent.

Kaggle

Free to use
GPU Kernels (may run for 1 hour which limits usefulness?)
Tensorflow, pytorch & fast.ai available
Advantage that many datasets are already available
Read

Clouderizer

https://clouderizer.com/
Clouderizer is a cloud computing management service that takes care of installing the required packages on a cloud computing instance (like Amazon AWS or Google Colab). Clouderizer is free for 200 hours per month (Robbie plan) and does not require a credit card to sign up.
Run projects locally, on cloud or both.
SSH terminal, Jupyter Notebooks and Tensorboard are securely accessible from Clouderizer Web Console.

AWS

GPU available
https://aws.amazon.com/ec2/?ft=n

Microsoft Azure

GPU available (link?)
Focus on CNTK?
https://azure.microsoft.com/en-us/free/?b=16.45
https://docs.microsoft.com/en-us/azure/machine-learning/preview/scenario-aerial-image-classification

Google

ML engine - sklearn, tensorflow, keras
Colaboratory (notebooks with a free GPU backend, for 12 hours at a time)
Tensorflow available
pytorch can be installed, useful articles

Floydhub

https://www.floydhub.com/
Cloud GPUs
Jupyter Notebooks
Tensorboard
Version Control for DL
Deploy Models as REST APIs
Public Datasets

Paperspace

https://www.paperspace.com/
1-Click Jupyter Notebooks
GPU on demand
Python API

Crestle

https://www.crestle.com/
Cloud GPU & persistent file store
Fast.ai lessons pre-installed

Salamander

https://salamander.ai/

Interesting DL projects

RoboSat

https://github.com/mapbox/robosat
Generic ecosystem for feature extraction from aerial and satellite imagery.

RoboSat.Pink

A fork of RoboSat
https://github.com/datapink/robosat.pink

DeepOSM

https://github.com/trailbehind/DeepOSM
Train a deep learning net with OpenStreetMap features and satellite imagery.

DeepNetsForEO - segmentation

https://github.com/nshaud/DeepNetsForEO
Uses SegNet for deep learning on remote sensing images.

Skynet-data

https://github.com/developmentseed/skynet-data
Data pipeline for machine learning with OpenStreetMap

Production

Custom REST API

Tensorflow Serving

https://www.tensorflow.org/serving/
Official version is python 2 but python 3 build here
Another approach is to use Docker

TensorFlow Serving makes it easy to deploy new algorithms and experiments, while keeping the same server architecture and APIs. Multiple models, or indeed multiple versions of the same model, can be served simultaneously. TensorFlow Serving comes with a scheduler that groups individual inference requests into batches for joint execution on a GPU.
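As an illustration of how a client might talk to a served model, the sketch below builds a request for the TensorFlow Serving REST predict endpoint. The model name `landcover`, the host and the port are assumptions for the example (8501 is the REST port commonly exposed by the official Docker image); adjust them to your deployment.

```python
import json
import urllib.request


def build_predict_request(model_name, instances, host="localhost", port=8501, version=None):
    """Return (url, body) for a TensorFlow Serving REST predict call."""
    path = f"/v1/models/{model_name}"
    if version is not None:
        # Pin a specific model version instead of the latest one.
        path += f"/versions/{version}"
    url = f"http://{host}:{port}{path}:predict"
    body = json.dumps({"instances": instances})
    return url, body


def send_predict_request(url, body, timeout=10):
    """POST the JSON body and return the 'predictions' field of the response."""
    request = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["predictions"]
```

Because the URL only names the model (and optionally a version), the same client code keeps working when a new version of the model is deployed behind the server.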

Floydhub

Allows exposing a model via a REST API

modeldepot

https://modeldepot.io
ML models hosted

Image formats & catalogues

We certainly want to consider cloud optimised GeoTiffs https://www.cogeo.org/
https://terria.io/ for pretty catalogues
Remote pixel
Sentinel-hub eo-browser
Large datasets may come in HDF5 format, can view with -> https://www.hdfgroup.org/downloads/hdfview/
Climate data is often in netcdf format, which can be opened using xarray

STAC - SpatioTemporal Asset Catalog

Specification describing the layout of a catalogue consisting of static files. The aim is that the catalogue is crawlable, so it can be indexed by a search engine and make imagery discoverable without requiring yet another API interface.
An initiative of https://www.radiant.earth/ in particular https://github.com/cholmes
Spec at https://github.com/radiantearth/stac-spec
Browser at https://github.com/radiantearth/stac-browser
Talk at https://docs.google.com/presentation/d/1O6W0lMeXyUtPLl-k30WPJIyH1ecqrcWk29Np3bi6rl0/edit#slide=id.p
Example catalogue at https://landsat-stac.s3.amazonaws.com/catalog.json
Chat https://gitter.im/SpatioTemporal-Asset-Catalog/Lobby
Several useful repos on https://github.com/sat-utils
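Because a STAC catalogue is just static JSON files joined by links, crawling it needs very little code. The sketch below follows `child` links to nested catalogues and `item` links to items, resolving relative hrefs against the current file. The `fetch` callable is an assumption of this example: any function that maps an absolute href to a parsed JSON dict will do, for instance one built on `urllib.request` and `json.load`.

```python
from urllib.parse import urljoin


def crawl_stac(href, fetch, _seen=None):
    """Yield (href, item_dict) for every Item reachable from a catalogue.

    `fetch` maps an absolute href to the parsed JSON document at that address.
    """
    seen = set() if _seen is None else _seen
    if href in seen:
        return
    seen.add(href)
    document = fetch(href)
    for link in document.get("links", []):
        # hrefs may be relative to the file that contains them.
        target = urljoin(href, link["href"])
        relation = link.get("rel")
        if relation == "child":
            yield from crawl_stac(target, fetch, seen)
        elif relation == "item" and target not in seen:
            seen.add(target)
            yield target, fetch(target)
```

Links with other relations (`self`, `root`, `parent`) are ignored, and the `seen` set guards against visiting the same file twice.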

State of the art

What are companies doing?

The overall trend is towards using an AWS S3 backend for image storage. There are a variety of tools for exploring data on S3 and having teams collaborate on it, e.g. T4.
Just speculating, but a serverless pipeline appears to be where companies are headed for routine compute tasks, whilst providing a Jupyter notebook approach for custom analysis.
Cloud optimised geotiffs to become the standard?
DigitalGlobe have a cloud hosted Jupyter notebook platform called GBDX. Cloud hosting means they can guarantee the infrastructure supports their algorithms, and they appear to be close/closer to deploying DL. Tutorial notebooks here.
Planet have a Jupyter notebook platform which can be deployed locally and requires an API key (14 days free). They have a python wrapper (2.7?!) to their rest API. They are mostly focussed on classical & fast algorithms?

Interesting projects

Pangeo - resources for parallel processing using Dask and Xarray http://pangeo.io/index.html
Open Data Cube - serve up cubes of data https://www.opendatacube.org/
Process Satellite data using AWS Lambda functions
OpenDroneMap - generate maps, point clouds, 3D models and DEMs from drone, balloon or kite images.
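To give a flavour of the serverless approach with AWS Lambda mentioned above, here is a minimal sketch of a handler triggered by an S3 upload notification. It only extracts which objects arrived; the actual image processing is left as a placeholder, and the bucket and key names used when trying it out are hypothetical.

```python
from urllib.parse import unquote_plus


def s3_objects(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 notifications.
        pairs.append((s3["bucket"]["name"], unquote_plus(s3["object"]["key"])))
    return pairs


def handler(event, context):
    """Lambda entry point: report which scenes would be processed."""
    scenes = s3_objects(event)
    # Real processing (e.g. reading a GeoTIFF and computing an index) would go here.
    return {"processed": [f"s3://{bucket}/{key}" for bucket, key in scenes]}
```

Keeping the event parsing in its own function makes the handler easy to exercise locally with a hand-written event dictionary.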

Techniques

This section explores the different techniques (DL, ML & classical) people are applying to common problems in satellite imagery analysis. Classification problems are the most simply addressed via DL, object detection is harder, and cloud detection harder still (niche interest).

Land classification

A very common problem: assign a land classification to each pixel based on its pixel value. This can be addressed via a simple sklearn clustering algorithm or via deep learning.
Land use is related to classification, but we are trying to detect a scene, e.g. housing, forestry. I have tried CNN -> See my notebooks
Land Use Classification using Convolutional Neural Network in Keras
Sea-Land segmentation using DL
Pixel level segmentation on Azure
Deep Learning-Based Classification of Hyperspectral Data
A U-net based on Tensorflow for object detection (or segmentation) of satellite images - DSTL dataset but python 2.7
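The simple clustering route to per-pixel land classification can be sketched as below. To keep the example dependency-light it implements a small k-means directly in NumPy as a stand-in for `sklearn.cluster.KMeans`; the deterministic seeding by brightness is a choice made for this example, not part of any referenced project.

```python
import numpy as np


def kmeans_pixels(image, n_clusters, n_iter=20):
    """Cluster the pixels of an (H, W, bands) image and return an (H, W) label map."""
    height, width, bands = image.shape
    pixels = image.reshape(-1, bands).astype(np.float64)
    # Deterministic seeding: pick pixels evenly spaced along the brightness ranking.
    order = np.argsort(pixels.sum(axis=1))
    picks = np.linspace(0, len(pixels) - 1, n_clusters).astype(int)
    centres = pixels[order[picks]]
    for _ in range(n_iter):
        # Squared distance from every pixel to every centre: (n_pixels, n_clusters).
        dist = ((pixels[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for k in range(n_clusters):
            members = pixels[labels == k]
            if len(members):
                centres[k] = members.mean(axis=0)
    return labels.reshape(height, width)
```

The cluster numbers carry no meaning by themselves; mapping each cluster to a land cover class still requires inspecting the result or a handful of labelled pixels.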

Change detection

Monitor water levels, coast lines, size of urban areas, wildfire damage. Note, clouds change often too..!
Using PCA (python 2, requires updating) -> https://appliedmachinelearning.blog/2017/11/25/unsupervised-changed-detection-in-multi-temporal-satellite-images-using-pca-k-means-python-code/
Using CNN -> https://github.com/vbhavank/Unstructured-change-detection-using-CNN
Siamese neural network to detect changes in aerial images
https://www.spaceknow.com/
LANDSAT Time Series Analysis for Multi-temporal Land Cover Classification using Random Forest
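The PCA plus k-means idea can be illustrated with a simplified, block-level sketch. It is not the code from the linked article: it assumes two co-registered greyscale images, tiles the absolute difference image into non-overlapping blocks, projects the blocks onto their leading principal components, and splits them into two clusters. The block size and number of components are arbitrary choices for the example.

```python
import numpy as np


def detect_change(before, after, block=4, n_components=3, n_iter=10):
    """Flag changed tiles between two co-registered greyscale images.

    Returns a boolean array with one entry per (block x block) tile.
    """
    diff = np.abs(np.asarray(after, dtype=np.float64) - np.asarray(before, dtype=np.float64))
    rows = diff.shape[0] - diff.shape[0] % block  # crop so tiles fit exactly
    cols = diff.shape[1] - diff.shape[1] % block
    grid = (rows // block, cols // block)
    tiles = (diff[:rows, :cols]
             .reshape(grid[0], block, grid[1], block)
             .swapaxes(1, 2)
             .reshape(-1, block * block))
    energy = tiles.mean(axis=1)
    if energy.max() == energy.min():
        return np.zeros(grid, dtype=bool)  # nothing to separate
    # PCA: project centred tile vectors onto the leading right singular vectors.
    centred = tiles - tiles.mean(axis=0)
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    features = centred @ vt[:n_components].T
    # Two-cluster k-means seeded with the least and most different tiles.
    centres = features[[energy.argmin(), energy.argmax()]]
    for _ in range(n_iter):
        dist = ((features[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for k in range(2):
            if (labels == k).any():
                centres[k] = features[labels == k].mean(axis=0)
    # The cluster with the larger mean difference is reported as "changed".
    means = [energy[labels == k].mean() if (labels == k).any() else -np.inf
             for k in range(2)]
    return (labels == int(np.argmax(means))).reshape(grid)
```

As noted in the list above, clouds will show up as change too, so in practice a cloud mask would need to be applied before differencing.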

Image registration

Wikipedia article on registration -> register for change detection or image stitching
Traditional approach -> define control points, employ RANSAC algorithm
Phase correlation is used to estimate the translation between two images with sub-pixel accuracy. It allows accurate registration of low resolution imagery onto high resolution imagery, or registration of a sub-image on a full image. Unlike many spatial-domain algorithms, the phase correlation method is resilient to noise, occlusions, and other defects. Applied to Landsat images here.
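A minimal sketch of phase correlation with NumPy's FFT is shown below. It assumes two single-band images of the same size related by a pure translation, and it recovers only the integer-pixel shift; sub-pixel refinement (for example by upsampling around the peak, as scikit-image's `phase_cross_correlation` offers) is left out.

```python
import numpy as np


def phase_correlation_shift(reference, moved):
    """Estimate the integer (row, col) shift that maps `reference` onto `moved`."""
    f_ref = np.fft.fft2(reference)
    f_mov = np.fft.fft2(moved)
    cross_power = f_mov * np.conj(f_ref)
    # Normalise the magnitude so that only the phase difference remains.
    cross_power /= np.maximum(np.abs(cross_power), 1e-12)
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    # Peaks past the midpoint correspond to negative shifts (FFT wrap-around).
    return tuple(
        int(p) - size if p > size // 2 else int(p)
        for p, size in zip(peak, correlation.shape)
    )
```

For example, if `moved = np.roll(reference, (3, -5), axis=(0, 1))`, the function should return the shift `(3, -5)`.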

Object detection

A typical task is detecting boats on the ocean. This should be simpler than land-based challenges owing to the blank background in the images, but it is still challenging and no convincingly robust solutions are available.
Intro articles here and here.
DigitalGlobe article - they use a combination of classical techniques (masks, erodes) to reduce the search space (identifying water via NDWI which requires SWIR), then apply a binary DL classifier to candidate regions of interest. They deploy the final algo as a task on their GBDX platform. They propose that in the future an R-CNN may be suitable for the whole process.
Planet use the non-DL Felzenszwalb algorithm to detect ships
Segmentation of buildings on kaggle
Identifying Buildings in Satellite Images with Machine Learning and Quilt -> NDVI & edge detection via gaussian blur as features, fed to TPOT for training with labels from OpenStreetMap, modelled as a two class problem, “Buildings” and “Nature”.
Deep learning for satellite imagery via image segmentation
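The search-space reduction step described above (masking water before running a classifier) can be sketched with a normalised difference index. Which infrared band to pair with green (NIR or SWIR) depends on the NDWI variant being used, and the threshold of 0 is only a common starting point; both are assumptions of this example rather than details from the DigitalGlobe article.

```python
import numpy as np


def normalized_difference(band_a, band_b):
    """Compute (a - b) / (a + b) per pixel, returning 0 where the sum is 0."""
    a = np.asarray(band_a, dtype=np.float64)
    b = np.asarray(band_b, dtype=np.float64)
    total = a + b
    out = np.zeros_like(total)
    np.divide(a - b, total, out=out, where=total != 0)
    return out


def water_mask(green, infrared, threshold=0.0):
    """Boolean mask of likely water pixels from a green/infrared water index."""
    return normalized_difference(green, infrared) > threshold
```

Candidate ship locations would then be searched for only inside the masked region, which is much cheaper than running a classifier over the whole scene.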

Cloud detection

A subset of the object detection problem, but surprisingly challenging
According to this article on sentinelhub, there are three popular classical algorithms that apply thresholds in multiple bands in order to identify clouds. The same article proposes using semantic segmentation combined with a CNN as a cloud classifier (excellent review paper here), but states that this requires too much compute.
This article compares a number of ML algorithms, random forests, stochastic gradient descent, support vector machines, Bayesian method.
DL..

Super resolution

Pansharpening

Does not require DL, classical algos suffice, see this notebook
https://github.com/mapbox/rio-pansharpen
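One classical pansharpening algorithm is the Brovey transform, which rescales each colour band by the ratio of the panchromatic band to the mean colour intensity. The sketch below is an unweighted version for illustration only; it assumes the RGB image has already been resampled to the panchromatic pixel grid, and it is not taken from the linked notebook or library.

```python
import numpy as np


def brovey_pansharpen(rgb, pan, eps=1e-12):
    """Sharpen an (H, W, 3) RGB image with an (H, W) panchromatic band."""
    rgb = np.asarray(rgb, dtype=np.float64)
    pan = np.asarray(pan, dtype=np.float64)
    if rgb.shape[:2] != pan.shape:
        raise ValueError("rgb must already be resampled to the pan grid")
    intensity = rgb.mean(axis=2)
    # Ratio of the sharp panchromatic signal to the blurry colour intensity.
    ratio = pan / np.maximum(intensity, eps)
    return rgb * ratio[..., None]
```

After the transform, the mean of the three output bands equals the panchromatic band at every pixel, while the ratios between the colour bands are preserved.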

Stereo imaging for terrain mapping & DEMs

Map terrain from stereo images to produce a digital elevation model (DEM) -> high resolution & paired images required, typically 0.3 m, e.g. Worldview or GeoEye.
Process of creating a DEM here and here.
https://github.com/MISS3D/s2p -> produces elevation models from images taken by high resolution optical satellites -> demo code on https://gfacciol.github.io/IS18/
Intro to depth from stereo
Automatic 3D Reconstruction from Multi-Date Satellite Images
Semi-global matching with neural networks
Predict the fate of glaciers
monodepth - Unsupervised single image depth prediction with CNNs
Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches
Terrain and hydrological analysis based on LiDAR-derived digital elevation models (DEM) - Python package
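The core relation behind depth from stereo can be written in a couple of lines. This sketch assumes an idealised, rectified pinhole camera pair, where depth is focal length times baseline divided by disparity; satellite pipelines use more elaborate sensor models, so treat it purely as an illustration of the principle, with hypothetical parameter names.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in metres for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px
```

Because depth is inversely proportional to disparity, small disparity errors on distant terrain translate into large height errors, which is one reason high resolution paired imagery is required.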

License

liushuaicare/satellite-image-deep-learning
