Commit

Introduce access to point in time API (quandl#172)
* quandl: introduce point in time api methods

Update quandl-python to retrieve point in time data via API.  Similar to
datatables, point in time allows callers to retrieve versioned records for
datatables for a given date or date interval.

Signed-off-by: Jamie Couture <[email protected]>

* Add documentation for point in time

Update developer documentation to provide examples on how to use point
in time.

Signed-off-by: Jamie Couture <[email protected]>

* quandl-python: 3.6.0

Signed-off-by: Jamie Couture <[email protected]>

Co-authored-by: Ruan Carlos <[email protected]>
couture-ql and Ruan Carlos authored Jan 11, 2021
1 parent c3efa8d commit 2c45a24
Showing 9 changed files with 328 additions and 18 deletions.
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,7 @@
### 3.6.0 - 2020-12-15

* Add access to Point in Time API.

### 3.5.3 - 2020-10-14

* Support passing a pandas Series to `get_table`
35 changes: 26 additions & 9 deletions FOR_ANALYSTS.md
@@ -1,10 +1,10 @@
# Quick Method Guide - Quandl-Python

This quick guide offers convenient ways to retrieve individual datasets or datatables with the Python package without the need for complex commands.
This quick guide offers convenient ways to retrieve individual datasets or datatables with the Python package without the need for complex commands.

## Retrieving Data

Retrieving data can be achieved easily using the two methods `quandl.get` for datasets and `quandl.get_table` for datatables. In both cases we strongly recommend that you set your api key via:
Retrieving data can be achieved easily using these methods: `quandl.get` for datasets, `quandl.get_table` for datatables, and `quandl.get_point_in_time` for point in time data. In all cases we strongly recommend that you set your API key via:

```python
import quandl
@@ -42,7 +42,7 @@ This revised query will find all data points annually for the dataset `NSE/OIL`
The following additional parameters can be specified for a dataset call:

| Option | Explanation | Example | Description |
|---|---|---|---|
|--------|-------------|---------|-------------|
| api_key | Your access key | `api_key='tEsTkEy123456789'` | Used to identify who you are and provide more access. Only required if not set via `quandl.ApiConfig.api_key=` |
| \<filter / transformation parameter\> | A parameter which filters or transforms the resulting data | `start_date='2010-01-01'` | For a full list see our [api docs](https://www.quandl.com/docs/api#data) |

@@ -57,7 +57,7 @@ import quandl
quandl.bulkdownload('EOD')
```

After the download is finished, the `quandl.bulkdownload` will return the filename of the downloaded zip file.
After the download is finished, the `quandl.bulkdownload` will return the filename of the downloaded zip file.

To download database data from the previous day, use the download_type option:

@@ -101,7 +101,7 @@ See www.quandl.com/docs/api for more information.
Returning Dataframe for ['WIKI.AAPL.11', 'WIKI.MSFT.11']
WIKI.AAPL - Close WIKI.MSFT - Close
Date
Date
1997-08-20 6.16 17.57
1997-08-21 6.00 17.23
1997-08-22 5.91 17.16
@@ -135,14 +135,14 @@ data = quandl.get_table('ZACKS/FC', paginate=True, ticker=['AAPL', 'MSFT'], per_

In this query we are asking for more pages of data, `ticker` values of either `AAPL` or `MSFT` and a `per_end_date` that is greater than or equal to `2015-01-01`. We are also filtering the returned columns on `ticker`, `per_end_date` and `comp_name` rather than all available columns. The output format is `pandas`.
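
The `paginate=True` option asks the client to keep requesting pages until none remain. As a rough illustration of cursor-based paging in general, here is a minimal sketch; `fetch_page`, `fetch_all`, and the fake `PAGES` data are hypothetical stand-ins for this example and are not part of the quandl package:

```python
def fetch_all(fetch_page, page_limit=100):
    """Collect rows from a cursor-paginated source.

    fetch_page is a hypothetical callable that takes a cursor id (None for
    the first page) and returns a tuple (rows, next_cursor_id).
    """
    rows = []
    cursor_id = None
    for _ in range(page_limit):
        page_rows, cursor_id = fetch_page(cursor_id)
        rows.extend(page_rows)
        if cursor_id is None:  # no further pages to request
            return rows
    # The limit was reached while a next cursor was still available.
    raise RuntimeError("page limit exceeded before the last page was reached")


# Fake three-page data source, used only for this demonstration.
PAGES = {None: ([1, 2], "a"), "a": ([3, 4], "b"), "b": ([5], None)}


def fake_fetch_page(cursor_id):
    return PAGES[cursor_id]


all_rows = fetch_all(fake_fetch_page)
print(all_rows)  # [1, 2, 3, 4, 5]
```

The loop stops as soon as a page reports no next cursor, and the page limit guards against unbounded requests.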

Download table data as a zip file. You can download all the table data in a data table in a single call. The following will download the entire F1 table data as a zip file to your current working directory:
Download table data as a zip file. You can download all the table data in a data table in a single call. The following will download the entire F1 table data as a zip file to your current working directory:

```python
import quandl
data = quandl.export_table('MER/F1')
```

You can also specify where to download the zip file:
You can also specify where to download the zip file:

```python
import quandl
@@ -151,7 +151,7 @@ data = quandl.export_table('MER/F1', filename='/my/path/db.zip')

Note that if you are downloading the whole table, it will take longer to generate the zip file.

You can also specify what data you want to download with filters and parameters.(`cursor_id` and `paginate` are not supported for exporting table zip file):
You can also specify what data you want to download with filters and parameters.(`cursor_id` and `paginate` are not supported for exporting table zip file):

```python
import quandl
@@ -181,6 +181,23 @@ For more information on how to use and manipulate the resulting data see the [pa
* *(recommended)* Refine your filter parameters to retrieve a smaller results set
* Use the [Detailed](./FOR_DEVELOPERS.md) method to iterate through more of the data.

### Point in Time

PointInTime works similarly to datatables but filters the data based on dates. For example, a simple way to retrieve datatable information for a specific date would be:

```python
import quandl
data = quandl.get_point_in_time('DATABASE/CODE', interval='asofdate', date='2020-01-01')
```

#### Available options

| Interval | Explanation | Required params | Example |
|----------|-------------|-----------------|---------|
| asofdate | Returns data as of a specific date | date | `quandl.get_point_in_time('DATABASE/CODE', interval='asofdate', date='2020-01-01')` |
| from | Returns data from `start` up to but excluding `end`; [start, end) | start_date, end_date | `quandl.get_point_in_time('DATABASE/CODE', interval='from', start_date='2020-01-01', end_date='2020-02-01')` |
| between | Returns data inclusively between dates; [start, end] | start_date, end_date | `quandl.get_point_in_time('DATABASE/CODE', interval='between', start_date='2020-01-01', end_date='2020-01-31')` |
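
The only difference between `from` and `between` is whether the end date is included. The sketch below uses plain Python dates to illustrate that endpoint rule; it does not call the API, and the helper names `in_from_interval` and `in_between_interval` are made up for this example:

```python
from datetime import date, timedelta


def in_from_interval(d, start, end):
    """Half-open interval [start, end): includes start, excludes end."""
    return start <= d < end


def in_between_interval(d, start, end):
    """Closed interval [start, end]: includes both endpoints."""
    return start <= d <= end


start, end = date(2020, 1, 1), date(2020, 2, 1)
print(in_from_interval(end, start, end))     # False: end is excluded
print(in_between_interval(end, start, end))  # True: end is included

# 'from' 2020-01-01 to 2020-02-01 covers the same days as
# 'between' 2020-01-01 and 2020-01-31, matching the table's examples.
days = [date(2020, 1, 1) + timedelta(days=n) for n in range(40)]
from_days = [d for d in days if in_from_interval(d, start, end)]
between_days = [d for d in days
                if in_between_interval(d, start, date(2020, 1, 31))]
print(from_days == between_days)  # True
```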

## More usages

For even more advanced usage please see our [Detailed Method Guide] (./FOR_DEVELOPERS.md).
For even more advanced usage please see our [Detailed Method Guide](./FOR_DEVELOPERS.md).
26 changes: 19 additions & 7 deletions FOR_DEVELOPERS.md
@@ -1,12 +1,12 @@
# Detailed Method Guide - Quandl/Python

In addition to the Quick methods for retrieving data, some additional commands may be used for more querying specificity. These include:

* Retrieving metadata without data
* Customizing how data is returned more granularly
* Allowing easier iteration of data

In each of the following sections it is assumed your Quandl API key has been set via:
In each of the following sections it is assumed your Quandl API key has been set via:

```python
import quandl
@@ -72,14 +72,14 @@ while True:
break
```

Download table data as a zip file. You can download all the table data in a data table in a single call. The following will download the entire F1 table data as a zip file to your current working directory:
Download table data as a zip file. You can download all the table data in a data table in a single call. The following will download the entire F1 table data as a zip file to your current working directory:

```python
import quandl
data = quandl.export_table('MER/F1')
```

You can also specify where to download the zip file:
You can also specify where to download the zip file:

```python
import quandl
@@ -88,7 +88,7 @@ data = quandl.export_table('MER/F1', filename='/my/path/db.zip')

Note that if you are downloading the whole table, it will take longer to generate the zip file.

You can also specify what data you want to download with filters and parameters.(`cursor_id` and `paginate` are not supported for exporting table zip file):
You can also specify what data you want to download with filters and parameters.(`cursor_id` and `paginate` are not supported for exporting table zip file):

```python
import quandl
@@ -99,7 +99,6 @@ After the download is finished, the filename of the downloaded zip file will be

Sometimes it takes a while to generate the zip file; you'll get a message while the file is being generated. Once the file is generated, the download of the zip file will start.


### Download Entire Database (Bulk Download)

To get the url for downloading all dataset data in a database:
@@ -181,6 +180,19 @@ All options beyond specifying the dataset `WIKI/AAPL` are optional.

See the `pandas` and `NumPy` documentation for a wealth of options on data manipulation.

### Point in Time

Point in time data can be retrieved in much the same way as a datatable. For example:

```python
data = quandl.PointInTime('DATATABLE/CODE', pit={'interval': 'asofdate', 'date': '2020-01-01'}).data().to_list()
data = quandl.PointInTime('DATATABLE/CODE', pit={'interval': 'asofdate', 'date': '2020-01-01'}).data().to_pandas()
# or
data = quandl.PointInTime('DATATABLE/CODE', pit={'interval': 'from', 'start_date': '2020-01-01', 'end_date': '2020-01-15'}).data()
```

For more options please check [this table](FOR_ANALYSTS.md#point-in-time)
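
The `pit` dictionary determines the request path. The sketch below mirrors the `pit_url` logic added in `quandl/model/point_in_time.py` in this commit as a standalone function; `build_pit_path` is a hypothetical name, and substituting the datatable code for the `:id` placeholder is an assumption about how the library resolves the path:

```python
from datetime import date


def build_pit_path(code, pit):
    """Return the relative point-in-time path for a datatable code.

    Mirrors the interval handling in PointInTime.pit_url; the ':id'
    substitution by `code` is assumed, not taken from the source.
    """
    interval = pit['interval']
    if interval == 'asofdate':
        # Fall back to today's date when no date is given.
        day = pit.get('date', date.today())
        suffix = "%s/%s" % (interval, day)
    elif interval == 'between':
        suffix = "%s/%s/%s" % (interval, pit['start_date'], pit['end_date'])
    else:
        # 'from' places a literal 'to' between the two dates.
        suffix = "%s/%s/to/%s" % (interval, pit['start_date'], pit['end_date'])
    return "pit/%s/%s" % (code, suffix)


print(build_pit_path('DATATABLE/CODE',
                     {'interval': 'asofdate', 'date': '2020-01-01'}))
# pit/DATATABLE/CODE/asofdate/2020-01-01
print(build_pit_path('DATATABLE/CODE',
                     {'interval': 'from', 'start_date': '2020-01-01',
                      'end_date': '2020-01-15'}))
# pit/DATATABLE/CODE/from/2020-01-01/to/2020-01-15
```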

## Retrieving metadata

### Dataset
@@ -232,7 +244,7 @@ quandl.Database('WIKI').datasets()

### Datatable

Much like databases and datasets you can retrieve datatable metadata via its Quandl code:
Much like databases and datasets you can retrieve datatable metadata via its Quandl code:

```python
dt = quandl.Datatable('ZACKS/FC')
4 changes: 3 additions & 1 deletion quandl/__init__.py
@@ -7,9 +7,11 @@
from .model.database import Database
from .model.dataset import Dataset
from .model.datatable import Datatable
from .model.point_in_time import PointInTime
from .model.data import Data
from .model.merged_dataset import MergedDataset
from .get import get
from .bulkdownload import bulkdownload
from .export_table import export_table
from .get_table import get_table
from .get_table import get_table
from .get_point_in_time import get_point_in_time
64 changes: 64 additions & 0 deletions quandl/get_point_in_time.py
@@ -0,0 +1,64 @@
from quandl.model.point_in_time import PointInTime
from quandl.errors.quandl_error import LimitExceededError
from .api_config import ApiConfig
from .message import Message
from quandl.errors.quandl_error import InvalidRequestError
import warnings
import copy


def get_point_in_time(datatable_code, **options):
    validate_pit_options(options)
    pit_options = {}

    # Remove the PIT params/keys from the options to not send it as a query params
    for k in ['interval', 'date', 'start_date', 'end_date']:
        if k in options.keys():
            pit_options[k] = options.pop(k)

    if 'paginate' in options.keys():
        paginate = options.pop('paginate')
    else:
        paginate = None

    data = None
    page_count = 0
    while True:
        next_options = copy.deepcopy(options)
        next_data = PointInTime(datatable_code, pit=pit_options).data(params=next_options)

        if data is None:
            data = next_data
        else:
            data.extend(next_data)

        if page_count >= ApiConfig.page_limit:
            raise LimitExceededError(
                Message.WARN_DATA_LIMIT_EXCEEDED % (datatable_code,
                                                    ApiConfig.api_key
                                                    )
            )

        next_cursor_id = next_data.meta['next_cursor_id']

        if next_cursor_id is None:
            break
        elif paginate is not True and next_cursor_id is not None:
            warnings.warn(Message.WARN_PAGE_LIMIT_EXCEEDED, UserWarning)
            break

        page_count = page_count + 1
        options['qopts.cursor_id'] = next_cursor_id
    return data.to_pandas()


def validate_pit_options(options):
    if 'interval' not in options.keys():
        raise InvalidRequestError('option `interval` is required')

    if options['interval'] not in ['asofdate', 'from', 'between']:
        raise InvalidRequestError('option `interval` is invalid')

    if options['interval'] in ['from', 'between']:
        if 'start_date' not in options.keys() or 'end_date' not in options.keys():
            raise InvalidRequestError('options `start_date` and `end_date` are required')
37 changes: 37 additions & 0 deletions quandl/model/point_in_time.py
@@ -0,0 +1,37 @@
from quandl.operations.get import GetOperation
from quandl.operations.list import ListOperation
from .data import Data
from .model_base import ModelBase
from datetime import date

import logging
log = logging.getLogger(__name__)


class PointInTime(GetOperation, ListOperation, ModelBase):
    def data(self, **options):
        if not options:
            options = {'params': {}}
        return Data.page(self, **options)

    def default_path(self):
        return "%s/:id/%s" % (self.lookup_key(), self.pit_url(),)

    def lookup_key(self):
        return 'pit'

    def pit_url(self):
        interval = self.options['pit']['interval']
        if interval in ['asofdate']:
            if 'date' not in self.options['pit'].keys():
                date_replace = date.today()
            else:
                date_replace = self.options['pit']['date']
            return "%s/%s" % (interval, date_replace, )
        else:
            start_date = self.options['pit']['start_date']
            end_date = self.options['pit']['end_date']
            if interval == 'between':
                return "%s/%s/%s" % (interval, start_date, end_date, )
            else:
                return "%s/%s/to/%s" % (interval, start_date, end_date, )
2 changes: 1 addition & 1 deletion quandl/version.py
@@ -1 +1 @@
VERSION = '3.5.3'
VERSION = '3.6.0'