allocator: Optimally Allocate Geographically Distributed Tasks


How can we efficiently collect data from geographically distributed locations? If data collection is being crowd-sourced, we may want to exploit the fact that the workers themselves are geographically distributed. One simple heuristic is to order the locations by distance for each worker (with some task registration backend). If instead you have hired workers (or rented drones) whom you can send to different locations, you must split the tasks across workers (or drones) and plan the 'shortest' route for each, à la the Traveling Salesman Problem (TSP). This is a problem that companies like FedEx solve all the time. Since computing the global optimum is computationally infeasible for problems of any real size, one heuristic solution is to split the locations into clusters of points that are close to each other (ideally, we want the clusters to be 'balanced'), and then to estimate a TSP solution for each cluster.
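The cluster-then-route heuristic described above can be sketched in a few lines. This is an illustration of the idea, not the package's own code: a plain k-means split followed by a greedy nearest-neighbour tour per cluster (the package offers stronger routing options, described below). All names and the random test points are hypothetical.

```python
# Illustrative sketch of the cluster-then-route heuristic:
# 1) partition points into k balanced-ish clusters (plain k-means),
# 2) order each cluster with a greedy nearest-neighbour tour.
import math
import random

def kmeans(points, k, iters=20, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda i: math.dist(p, centers[i]))
            groups[i].append(p)
        # Recompute each centroid; keep the old one if a cluster empties.
        centers = [
            (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g))
            if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return [g for g in groups if g]

def nearest_neighbour_tour(points):
    # Greedy tour: repeatedly hop to the closest unvisited point.
    tour, rest = [points[0]], list(points[1:])
    while rest:
        nxt = min(rest, key=lambda p: math.dist(tour[-1], p))
        tour.append(nxt)
        rest.remove(nxt)
    return tour

random.seed(1)
pts = [(random.random(), random.random()) for _ in range(40)]
routes = [nearest_neighbour_tour(g) for g in kmeans(pts, 4)]
```

Each entry of `routes` is one worker's itinerary; together they cover every location exactly once.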

The package provides a simple way to implement these solutions. Broadly, it provides three kinds of functions:

  1. Sort by Distance: Produces an ordered list of workers for each point or an ordered list of points
    for each worker.
  2. Cluster the Points: Clusters the points into n_worker groups.
  3. Shortest Path: Order points within a cluster (or any small number of points) into a path or itinerary.
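To make the first function family concrete, here is a toy sketch of "sort by distance" using hypothetical planar coordinates and worker names; the package itself works on real latitude/longitude data and the distance options listed below.

```python
# Toy "sort by distance": each worker gets the full task list,
# ordered by distance from that worker's location.
import math

workers = {"w1": (0.0, 0.0), "w2": (10.0, 10.0)}   # hypothetical positions
tasks = [(1.0, 1.0), (9.0, 9.0), (5.0, 4.0)]       # hypothetical task points

ranked = {name: sorted(tasks, key=lambda t: math.dist(loc, t))
          for name, loc in workers.items()}
# ranked["w1"] begins with the task nearest (0, 0); ranked["w2"]
# begins with the task nearest (10, 10).
```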

The package also provides access to four different kinds of distance functions for calculating the distance matrices that underlie these functions:

  1. Euclidean Distance: use option -d euclidean (similar to the Haversine distance within the same UTM zone).
  2. Haversine Distance: use option -d haversine.
  3. OSRM Distance: use option -d osrm. Neither Haversine nor Euclidean distance takes account of the actual road network or traffic. To use actual travel time, use the Open Source Routing Machine API. A maximum of 100 points can be passed per request when using the public server. However, you can set up your own private OSRM server and use --max-table-size to specify the maximum number of points.
  4. Google Distance Matrix API: use option -d google. This option is available only in sort_by_distance and cluster_kahip because the Google Distance Matrix API has strict usage limits. Please look at the limitations here.
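All four options feed the same structure: a square distance matrix over the input points. A minimal sketch of that idea, with the metric pluggable in the spirit of the -d flag (this is illustrative, not the package's internal API; the sample coordinates are hypothetical):

```python
# A distance matrix with a pluggable metric, mirroring the idea behind
# the -d euclidean / -d haversine options.
import math

def euclidean(a, b):
    return math.dist(a, b)

def haversine_km(a, b):
    # Great-circle distance between two (lat, lon) pairs in degrees.
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2)
         * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))   # Earth radius ~6371 km

def distance_matrix(points, metric):
    return [[metric(p, q) for q in points] for p in points]

pts = [(13.36, 100.98), (13.40, 101.00), (13.30, 100.95)]
M = distance_matrix(pts, haversine_km)
```

The road-network options (OSRM, Google) replace the metric with an API call but produce the same kind of matrix, which is why they slot into the same functions.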

Related Package

To sample locations randomly on the streets, check out geo_sampling.

Application

Missing Women on the streets of Delhi. See women count

Install

pip install allocator

Functions

  1. Sort By Distance

  2. Cluster

    Cluster data collection locations using k-means (clustering) or KaHIP (graph partitioning). To check which of the algorithms produces more cohesive, balanced clusters, see Compare K-means to KaHIP.

    1. k-means

      Examples:

      python -m allocator.cluster_kmeans -n 10 allocator/examples/chonburi-roads-1k.csv --plot
      
    2. KaHIP allocator

  3. Shortest Path

    These functions can be used to find the estimated shortest path through all the locations in a cluster. We expose four different ways of getting the 'shortest' path: a) via MST (Christofides algorithm), b) via Google OR-Tools, c) via the Google Maps Directions API, and d) via the OSRM Trip API.

    1. Approximate TSP using MST
    2. Google OR Tools TSP solver Shortest path
    3. Google Maps Directions API Shortest path
    4. OSRM Trip API Shortest path
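    To give a feel for the first option, here is a compact sketch of the simplest MST-based TSP approximation: build a minimum spanning tree (Prim's algorithm below), then visit its vertices in preorder. For metric distances this tour is at most twice the optimum. This is an illustration of the general technique, not the package's exact implementation (Christofides adds a matching step on top of the MST for a better bound).

```python
# Approximate TSP via MST: Prim's algorithm, then a preorder walk.
import math

def mst_preorder_tour(points):
    n = len(points)
    in_tree = [False] * n
    parent = [0] * n
    best = [math.inf] * n          # cheapest edge connecting i to the tree
    best[0] = 0.0
    children = [[] for _ in range(n)]
    for _ in range(n):             # Prim's algorithm, O(n^2)
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best[i])
        in_tree[u] = True
        if u != 0:
            children[parent[u]].append(u)
        for v in range(n):
            d = math.dist(points[u], points[v])
            if not in_tree[v] and d < best[v]:
                best[v], parent[v] = d, u
    tour, stack = [], [0]          # iterative preorder walk of the MST
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return [points[i] for i in tour]

pts = [(0, 0), (0, 1), (1, 1), (1, 0)]   # hypothetical cluster
tour = mst_preorder_tour(pts)            # visits every point exactly once
```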

Documentation

Documentation available at: https://allocator.readthedocs.io/en/latest/

Authors

Suriyan Laohaprapanon and Gaurav Sood

Contributor Code of Conduct

The project welcomes contributions from everyone! In fact, it depends on them. To maintain this welcoming atmosphere, and to collaborate in a fun and productive way, we expect contributors to the project to abide by the Contributor Code of Conduct.

License

The package is released under the MIT License.
