Use a Session to enable connection pooling
Requests (via urllib3) automatically reuses connections when calls are made through a Session. A Session otherwise exposes the same methods as the top-level Requests API, which makes this a drop-in replacement: http://docs.python-requests.org/en/latest/user/advanced/

In my tests, this change cut the time for 830 HEAD requests to Foursquare's internal artifact cache from 22s to 11s.

Testing Done:
Using 830 cache keys copied from a real build, this change cut the total time to issue one HEAD request per key from 22s to 11s.
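
The effect described above can be illustrated with a minimal sketch. It does not reproduce the 22s to 11s measurement; it only counts how many TCP connections a local stand-in server sees when HEAD requests share one Session versus when each request gets a fresh Session (which is what the module-level `requests.head` does internally). The names `CountingHandler` and `count_connections` and the local server are assumptions made for illustration; they are not part of pants.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

import requests


class CountingHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for an artifact cache: answers HEAD, counts connections."""

    protocol_version = "HTTP/1.1"  # allow keep-alive so a client can reuse a connection
    connections = 0
    lock = threading.Lock()

    def setup(self):
        # setup() runs once per accepted TCP connection, not once per request.
        super().setup()
        with CountingHandler.lock:
            CountingHandler.connections += 1

    def do_HEAD(self):
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def new_session():
    session = requests.Session()
    session.trust_env = False  # ignore proxy settings from the environment
    return session


def count_connections(keys, reuse_session):
    """Send one HEAD per key and return the number of connections the server saw."""
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = "http://127.0.0.1:%d" % server.server_address[1]
    shared = new_session()
    try:
        for key in keys:
            session = shared if reuse_session else new_session()
            try:
                session.head("%s/%s" % (base, key), timeout=5)
            finally:
                if not reuse_session:
                    session.close()  # mirrors the one-shot session behind requests.head()
    finally:
        shared.close()
        server.shutdown()
        server.server_close()
    return CountingHandler.connections


if __name__ == "__main__":
    keys = ["key-%d" % i for i in range(20)]
    print("shared session:", count_connections(keys, reuse_session=True))
    print("session per request:", count_connections(keys, reuse_session=False))
```

With a shared session the server should see a single connection for all keys, while the per-request variant opens one connection per key. Against a remote cache, each of those extra connections costs a new TCP setup (plus a TLS handshake over HTTPS), which is a plausible source of the saved time.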

https://travis-ci.org/pantsbuild/pants/builds/38515947

Reviewed at https://rbcommons.com/s/twitter/r/981/
dt authored and Patrick Lawson committed Oct 20, 2014
1 parent 63c0a02 commit cc8931e
Showing 1 changed file with 8 additions and 4 deletions.
12 changes: 8 additions & 4 deletions src/python/pants/cache/restful_artifact_cache.py
@@ -40,6 +40,10 @@ def __init__(self, log, artifact_root, url_base, compress=True):
     self._path_prefix = parsed_url.path.rstrip('/')
     self.compress = compress
 
+    # To enable connection reuse, all requests must be created from same session.
+    # TODO: Re-evaluate session's life-cycle if/when a longer-lived pants process exists.
+    self._session = requests.Session()
+
     # Reduce the somewhat verbose logging of requests.
     # TODO do this in a central place
     logging.getLogger('requests').setLevel(logging.WARNING)
@@ -109,13 +113,13 @@ def _request(self, method, path, body=None):
     try:
       response = None
       if 'PUT' == method:
-        response = requests.put(url, data=body, timeout=self._timeout_secs)
+        response = self._session.put(url, data=body, timeout=self._timeout_secs)
       elif 'GET' == method:
-        response = requests.get(url, timeout=self._timeout_secs, stream=True)
+        response = self._session.get(url, timeout=self._timeout_secs, stream=True)
       elif 'HEAD' == method:
-        response = requests.head(url, timeout=self._timeout_secs)
+        response = self._session.head(url, timeout=self._timeout_secs)
       elif 'DELETE' == method:
-        response = requests.delete(url, timeout=self._timeout_secs)
+        response = self._session.delete(url, timeout=self._timeout_secs)
       else:
         raise ValueError('Unknown request method %s' % method)
