Sphinx: HTTP Rate Limiting

Sphinx is a rate limiting HTTP proxy, implemented in Go, using leaky buckets.

The name for this project ("Sphinx") comes from the ancient Greek word sphingein, which means "to squeeze" or "to strangle." In Greek myth, the Sphinx would stand by the road and stop travelers to ask them a riddle. If they could not answer, she would strangle them. She was often thought of as a guardian and flanked the entrances to temples.

Why?

Rate limiting an API is often required to ensure that clients do not abuse the available resources and that the API stays reliably available when multiple clients are requesting data concurrently. Buckets can be created based on various parameters of an incoming request (e.g. the Authorization header or the IP address) to configure how requests are grouped for limiting.

Rate limiting functionality is already available in some proxies (e.g. Nginx, HAProxy). However, they often use in-memory stores, which makes rate limiting unpredictable when running multiple proxies (e.g. for load balancing). Configuration for these limits also gets complex because it is mixed in with other concerns such as routing and request/response rewriting.

Sphinx is not...

  • Sphinx is not focused on preventing Denial of Service (DoS) attacks or requests from malicious clients. The goal is to expose rate limiting information to clients and enforce balanced use by API clients.

  • Sphinx is not a request forwarding service. Sphinx only allows for very simple forwarding to a single host per instance of the rate limiter. Any advanced routing or request handling should be handled by a real proxy (e.g. Nginx, HAProxy).

  • Sphinx is not an HTTPS terminator. This keeps the burden of configuring SSL certificates and security outside of Sphinx. Ideally, there is real load balancing and HTTPS termination before a request hits Sphinx.

Rate limit headers and errors

Sphinx will update HTTP response headers for requests that match limits to include details about the rate limit status. Headers are canonicalized, but clients should assume header names are case insensitive.

  • X-RateLimit-Reset: Unix timestamp when the rate limit counter will be reset.
  • X-RateLimit-Limit: The total number of requests allowed in a time period.
  • X-RateLimit-Remaining: Number of requests that can be made until the reset time.
  • X-RateLimit-Bucket: Name of the rate-limit bucket this request belongs to in the configuration.

Limit names can be configured via a configuration file.

Request:

HOST example.com
GET /resource/123
AUTHORIZATION Basic ABCD

Response headers:

Status: 200 OK
X-RateLimit-Limit: 200
X-RateLimit-Remaining: 199
X-RateLimit-Reset: 1394506274
X-RateLimit-Bucket: authorized-users

In case the client hits a rate limit, an empty response with a 429 Too Many Requests status code will be returned.

Request:

HOST example.com
GET /resource/123
AUTHORIZATION Basic ABC

Response headers:

Status: 429 Too Many Requests
X-RateLimit-Limit: 200
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1394506274
X-RateLimit-Bucket: authorized-users
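
A client can use these headers to pace itself. The following is a minimal sketch of one way to do that in Go. The names RateLimitStatus, parseRateLimit and waitBeforeRetry are hypothetical and not part of Sphinx, and the code assumes that the Limit, Remaining and Reset headers are present and well formed.

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"time"
)

// RateLimitStatus holds the values reported in the X-RateLimit-* headers.
// Hypothetical type for illustration; not part of Sphinx.
type RateLimitStatus struct {
	Bucket    string
	Limit     int
	Remaining int
	Reset     time.Time
}

// parseRateLimit reads the rate limit headers from a response.
// http.Header.Get canonicalizes the name, so lookups are case insensitive.
func parseRateLimit(h http.Header) (RateLimitStatus, error) {
	limit, err := strconv.Atoi(h.Get("X-RateLimit-Limit"))
	if err != nil {
		return RateLimitStatus{}, fmt.Errorf("bad X-RateLimit-Limit: %v", err)
	}
	remaining, err := strconv.Atoi(h.Get("X-RateLimit-Remaining"))
	if err != nil {
		return RateLimitStatus{}, fmt.Errorf("bad X-RateLimit-Remaining: %v", err)
	}
	reset, err := strconv.ParseInt(h.Get("X-RateLimit-Reset"), 10, 64)
	if err != nil {
		return RateLimitStatus{}, fmt.Errorf("bad X-RateLimit-Reset: %v", err)
	}
	return RateLimitStatus{
		Bucket:    h.Get("X-RateLimit-Bucket"),
		Limit:     limit,
		Remaining: remaining,
		Reset:     time.Unix(reset, 0),
	}, nil
}

// waitBeforeRetry returns how long to wait before retrying a request.
// It is zero unless the response was a 429 and the reset time is in the future.
func waitBeforeRetry(statusCode int, s RateLimitStatus, now time.Time) time.Duration {
	if statusCode != http.StatusTooManyRequests {
		return 0
	}
	if d := s.Reset.Sub(now); d > 0 {
		return d
	}
	return 0
}

func main() {
	// Headers from the 429 example above.
	h := http.Header{}
	h.Set("X-RateLimit-Limit", "200")
	h.Set("X-RateLimit-Remaining", "0")
	h.Set("X-RateLimit-Reset", "1394506274")
	h.Set("X-RateLimit-Bucket", "authorized-users")

	status, err := parseRateLimit(h)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Pretend the current time is ten seconds before the reset.
	now := time.Unix(1394506264, 0)
	fmt.Println(status.Bucket, status.Remaining, waitBeforeRetry(429, status, now))
}
```

Passing the current time as a parameter keeps the retry calculation deterministic; a real client would pass time.Now().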

Rate limiting in Sphinx is managed by setting up limits in a YAML configuration file. Details about the configuration format can be found in the annotated example.

It is important to understand the concept of buckets and limits to effectively configure a rate limiter.

Limit: A limit defines a rate limiting policy that Sphinx enforces by counting requests in named buckets.

Bucket: A bucket is simply a named value. Each request that matches a limit increments the value of one bucket.

Below is an example of a limit and three requests that increment two bucket values.

Test Limit

  • Match if the request path begins with /limited
  • Bucket names are defined as name-{ip-address}
  • Allow TWO requests per minute

Setting this limit using the config would look like:

proxy:
  handler: http             # can be {http,httplogger}
  host: http://httpbin.org  # URI for the http(s) backend we are proxying to
  listen: :6634             # bind to host:port. default: height of the Great Sphinx of Giza

storage:
  type: memory    # can be {redis,memory}

limits:
  test-limit:
    interval: 60  # in seconds
    max: 2        # number of requests allowed in interval
    keys:
      ip: ""      # ip keys require no configuration
    matches:
      paths:
        match_any:
          - "/limited*"

Request One

path: /limited/resource/1
Headers:
  Host: example.com
  Authorization: Basic User:Password
IP: 10.0.0.1

State:
  test-limit-10.0.0.1: 1

Request Two

path: /limited/resource/2
Headers:
  Host: example.com
  Authorization: Basic Admin:Secure
IP: 10.0.0.2

State:
  test-limit-10.0.0.1: 1
  test-limit-10.0.0.2: 1

Request Three

path: /limited/resource/3
Headers:
  Host: example.com
  Authorization: Basic Admin:Secure
IP: 10.0.0.1

State:
  test-limit-10.0.0.1: 2
  test-limit-10.0.0.2: 1
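
The walkthrough above can be reproduced with a small counter keyed by bucket name. This is only a sketch: the limiter type and its add method are hypothetical names, path matching and interval expiry are left out, and the fourth request is an addition to show a rejection. Whether Sphinx itself counts rejected requests is not stated here, so the sketch simply leaves the count unchanged.

```go
package main

import "fmt"

// limiter is a hypothetical in-memory counter, not Sphinx's own type.
type limiter struct {
	name   string         // limit name, e.g. "test-limit"
	max    int            // requests allowed per bucket in one interval
	counts map[string]int // bucket name -> requests counted so far
}

func newLimiter(name string, max int) *limiter {
	return &limiter{name: name, max: max, counts: make(map[string]int)}
}

// add counts one request from ip. It returns the bucket name and
// whether the request is still within the limit.
func (l *limiter) add(ip string) (bucket string, allowed bool) {
	bucket = l.name + "-" + ip // name-{ip-address}
	if l.counts[bucket] >= l.max {
		return bucket, false
	}
	l.counts[bucket]++
	return bucket, true
}

func main() {
	l := newLimiter("test-limit", 2)
	// Requests one to three from the walkthrough, plus a fourth that exceeds the limit.
	for _, ip := range []string{"10.0.0.1", "10.0.0.2", "10.0.0.1", "10.0.0.1"} {
		bucket, allowed := l.add(ip)
		fmt.Printf("%s: count=%d allowed=%t\n", bucket, l.counts[bucket], allowed)
	}
}
```

The Authorization header differs between requests one and three, but both land in the same bucket because only the ip key is configured for this limit.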

The following snippet explains how to define limits in Sphinx:

limit-name:
  interval: 15
  max: 200
  keys:
    headers:
      names:
        - "Authorization"
  matches:
    paths:
      match_any:
        - "/special/resources/.*"

limit-name: Identifies the limit. This name is added to the X-RateLimit-Bucket header.

interval: A limit may create many buckets. This key sets the expiry time, in seconds, for all buckets created for this limit.

max: Maximum number of requests that will be allowed for a bucket in one interval.
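
To make the interaction of interval and max concrete, here is a sketch of one plausible reading: a bucket is created on the first matching request, expires interval seconds later, and allows at most max requests until then. The types bucket and limit and the method hit are hypothetical, times are plain Unix seconds, and this is not a description of how Sphinx or its storage backends are implemented.

```go
package main

import "fmt"

// bucket is a hypothetical counter with an expiry time.
type bucket struct {
	count int
	reset int64 // Unix time (seconds) at which the bucket expires
}

// limit mirrors the interval and max keys of a limit definition.
type limit struct {
	interval int64 // seconds
	max      int
	buckets  map[string]*bucket
}

// hit records a request for the named bucket at Unix time now.
// It returns the values a client would see as X-RateLimit-Remaining
// and X-RateLimit-Reset, and whether the request is allowed.
func (l *limit) hit(name string, now int64) (remaining int, reset int64, allowed bool) {
	b, ok := l.buckets[name]
	if !ok || now >= b.reset {
		// No bucket yet, or the old one expired: start a new interval.
		b = &bucket{reset: now + l.interval}
		l.buckets[name] = b
	}
	if b.count >= l.max {
		return 0, b.reset, false
	}
	b.count++
	return l.max - b.count, b.reset, true
}

func main() {
	l := &limit{interval: 15, max: 2, buckets: map[string]*bucket{}}
	start := int64(1394506259)
	// Four requests at 0s, 5s, 10s and 15s after the first one.
	for _, offset := range []int64{0, 5, 10, 15} {
		remaining, reset, allowed := l.hit("limit-name", start+offset)
		fmt.Printf("t+%ds remaining=%d reset=%d allowed=%t\n", offset, remaining, reset, allowed)
	}
}
```

With max set to 2, the third request falls inside the first interval and is rejected, while the fourth arrives exactly at the expiry time and starts a fresh bucket.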

keys: This section defines the dynamic bucket name generated for each request. Currently supported key types are headers and ip. All keys defined are concatenated to create the full bucket name.

headers: Use concatenated header values from requests in the bucket name.

headers:
  encrypt: "SALT_TO_ENCRYPT_VALUE"  # optional
  names:
    - HEADER_NAME_1
    - HEADER_NAME_2

ip: Use the IP address of the incoming request in the bucket name.

matches: This section defines which requests this limit should be applied to. The request MUST match all of the matchers defined in this block. Currently supported matchers are headers and paths.

headers: This matcher currently supports the match_any key, which returns true if any of the list items evaluates to true. For example:

headers:
  match_any:
    - name: "HEADER_NAME"
      match: "REGEX_FOR_MATCHING_HEADER_VALUE"
    - name: "OTHER_HEADER_NAME"  # no match key means just check for existence

paths: This matcher also supports the match_any key.

paths:
  match_any:
    - "/limited/resource/*"
    - "/objects/limited/.*"
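
The matching rules can be sketched in Go as follows. This is an illustration under assumptions, not Sphinx's code: the names headerMatch, matchAnyPath, matchAnyHeader and applies are hypothetical, and patterns are treated as unanchored Go regular expressions, which may differ from how Sphinx evaluates them.

```go
package main

import (
	"fmt"
	"net/http"
	"regexp"
)

// headerMatch mirrors one entry of headers.match_any.
// An empty Match means "only check that the header exists".
type headerMatch struct {
	Name  string
	Match string
}

// matchAnyPath reports whether path matches at least one pattern.
func matchAnyPath(patterns []string, path string) (bool, error) {
	for _, p := range patterns {
		ok, err := regexp.MatchString(p, path)
		if err != nil {
			return false, err
		}
		if ok {
			return true, nil
		}
	}
	return false, nil
}

// matchAnyHeader reports whether at least one matcher is satisfied by h.
func matchAnyHeader(matchers []headerMatch, h http.Header) (bool, error) {
	for _, m := range matchers {
		values, present := h[http.CanonicalHeaderKey(m.Name)]
		if !present {
			continue
		}
		if m.Match == "" {
			return true, nil
		}
		for _, v := range values {
			ok, err := regexp.MatchString(m.Match, v)
			if err != nil {
				return false, err
			}
			if ok {
				return true, nil
			}
		}
	}
	return false, nil
}

// applies reports whether a limit applies to a request: every configured
// matcher must pass, and each matcher passes if any of its items match.
func applies(paths []string, headers []headerMatch, path string, h http.Header) (bool, error) {
	if len(paths) > 0 {
		if ok, err := matchAnyPath(paths, path); err != nil || !ok {
			return false, err
		}
	}
	if len(headers) > 0 {
		if ok, err := matchAnyHeader(headers, h); err != nil || !ok {
			return false, err
		}
	}
	return true, nil
}

func main() {
	paths := []string{"/limited/resource/*", "/objects/limited/.*"}
	headers := []headerMatch{{Name: "Authorization", Match: "^Basic "}}

	h := http.Header{}
	h.Set("Authorization", "Basic ABCD")

	ok, err := applies(paths, headers, "/limited/resource/1", h)
	fmt.Println(ok, err)
}
```

Because the patterns are unanchored in this sketch, "/limited/resource/*" would also match a path such as /v2/limited/resource; prefix a pattern with ^ if only paths starting with it should match.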

Documentation

  • LeakyBucket: LeakyBucket documentation
  • Sphinx: Sphinx documentation

Tests

Sphinx is built and tested against Go 1.2. Ensure this is the version of Go you're running with go version. Make sure your GOPATH is set, e.g. export GOPATH=~/go. Clone the repository to a location outside your GOPATH, and symlink it to $GOPATH/src/github.com/Clever/sphinx. If you have gvm installed, you can make this symlink by running the following from the root of the repository: gvm linkthis github.com/Clever/sphinx.

If you have done all of the above, then you should be able to run

make test

If you'd like to see a code coverage report, install the cover tool (go get code.google.com/p/go.tools/cmd/cover), make sure $GOPATH/bin is in your PATH, and run:

COVERAGE=1 make

Credits

  • Sphinx logo by EricP from The Noun Project
  • Drone inspiration for building a deb