LEAN Data Formats

Introduction

From the beginning, LEAN has strived to use an open, human-readable data format, independent of any specific database or file format. From this core philosophy, we built LEAN to read its financial data from flat files on disk. Data is compressed in zip format, and all individual files are CSV or JSON.

When there is no activity for a security, the price is omitted from the file. Only new ticks and price changes are recorded.
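As a rough illustration of the "zip containing CSV" layout described above, the sketch below reads one CSV entry out of a zip archive using only the Python standard library. The helper name read_zipped_csv, the entry name example.csv and the two-column rows are hypothetical and chosen only for the example; they are not LEAN's actual file contents.

```python
import csv
import io
import zipfile


def read_zipped_csv(zip_bytes, entry_name):
    """Return the rows of one CSV entry inside a zip archive as lists of strings."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        with archive.open(entry_name) as raw:
            # Wrap the binary stream so csv.reader receives text lines.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.reader(text))


# Build a tiny archive in memory so the example is self-contained.
# The entry name and row contents are made up for illustration.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("example.csv", "1000,10.0\n2000,10.5\n")

rows = read_zipped_csv(buffer.getvalue(), "example.csv")
print(rows)
```

Because the files are plain CSV inside a standard zip, the same data can also be inspected with any unzip tool and a text editor.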

File Data Format

Although we strive to make all data formats identical, this is often not possible. Below are links to dedicated documentation on the file format of the data in each asset type:

Equity | Forex | Options | Futures | Crypto

Folder Structure

Data files are separated and nested in a few predictable layers:

  • Tick, Second and Minute Financial Data: /data/securityType/marketName/resolution/ticker/date_tradeType.zip

  • Hour, Daily Financial Data: /data/securityType/marketName/resolution/ticker.zip

The marketName value is used to separate different tradable assets with the same ticker. For example, EURUSD is traded at multiple brokerages, each with slightly different prices.
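The two path patterns above can be sketched as a small helper. This is a minimal sketch, not LEAN's own path logic: the function name data_path, the lowercase example values (equity, usa, spy, oanda) and the YYYYMMDD date format are assumptions made for illustration.

```python
from datetime import date
from pathlib import PurePosixPath

HIGH_RESOLUTIONS = {"tick", "second", "minute"}
LOW_RESOLUTIONS = {"hour", "daily"}


def data_path(security_type, market, resolution, ticker,
              day=None, trade_type=None, root="/data"):
    """Build a zip path following the two folder patterns (hypothetical helper)."""
    base = PurePosixPath(root) / security_type / market / resolution
    if resolution in HIGH_RESOLUTIONS:
        # High resolution data: one zip per ticker, per date, per trade type.
        if day is None or trade_type is None:
            raise ValueError("tick, second and minute paths need a date and a trade type")
        # Assumption: the date is written as YYYYMMDD.
        return str(base / ticker / f"{day:%Y%m%d}_{trade_type}.zip")
    if resolution in LOW_RESOLUTIONS:
        # Low resolution data: a single zip per ticker.
        return str(base / f"{ticker}.zip")
    raise ValueError(f"unknown resolution: {resolution}")


minute_path = data_path("equity", "usa", "minute", "spy", date(2013, 10, 7), "trade")
daily_path = data_path("forex", "oanda", "daily", "eurusd")
print(minute_path)
print(daily_path)
```

Keeping the market name as its own path segment means the same ticker from two brokerages resolves to two different files.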

Core Data Types

LEAN has a few core data types which are represented in all the asset classes we support. Below are links to their implementation in LEAN.

  • TradeBar - TradeBar represents trade ticks of assets consolidated for a period. TradeBar file format is slightly different for high resolution (second, minute) and low resolution (daily, hour).

  • QuoteBar - QuoteBar represents top of book quote data consolidated over a period of time (bid and ask bar).

  • Tick - Tick data represents an individual record of trades ("trade ticks") or quote updates ("quote ticks") for an asset. Tick data is instantaneous - it does not have a period.
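To make the relationship between trade ticks and a TradeBar concrete, here is a minimal sketch of consolidating ticks over one period into an open/high/low/close/volume bar. The Tick and TradeBar classes below are simplified stand-ins written for this example, with assumed field names; they are not the LEAN implementations.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Tick:
    # Simplified stand-in: an instantaneous trade with no period.
    time: datetime
    price: float
    quantity: float


@dataclass
class TradeBar:
    # Simplified stand-in: trades consolidated over [time, time + period).
    time: datetime
    period: timedelta
    open: float
    high: float
    low: float
    close: float
    volume: float


def consolidate(ticks, start, period):
    """Consolidate the trade ticks inside one period into a bar, or None if there are none."""
    window = sorted(
        (t for t in ticks if start <= t.time < start + period),
        key=lambda t: t.time,
    )
    if not window:
        return None  # no activity in this period, so no bar is produced
    prices = [t.price for t in window]
    return TradeBar(
        time=start,
        period=period,
        open=prices[0],
        high=max(prices),
        low=min(prices),
        close=prices[-1],
        volume=sum(t.quantity for t in window),
    )


start = datetime(2013, 10, 7, 9, 30)
ticks = [
    Tick(datetime(2013, 10, 7, 9, 30, 5), 10.0, 100),
    Tick(datetime(2013, 10, 7, 9, 30, 20), 10.4, 50),
    Tick(datetime(2013, 10, 7, 9, 30, 45), 9.9, 25),
    Tick(datetime(2013, 10, 7, 9, 31, 2), 10.1, 10),  # belongs to the next minute
]
bar = consolidate(ticks, start, timedelta(minutes=1))
print(bar)
```

A QuoteBar could be built the same way by consolidating bid and ask prices separately instead of trade prices.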

Data Readers

All data is parsed from disk via Reader() methods. The Reader takes a single line of the file and converts it to the appropriate type. For example, the TradeBar.Reader() method is a factory which returns TradeBar objects. Readers are also used when implementing custom data.
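The "one line in, one object out" factory idea can be sketched as follows. This is an illustration only: the reader classmethod, its parameters and the column layout (milliseconds since midnight, open, high, low, close, volume) are assumptions for the example and do not mirror the real TradeBar.Reader() signature or any specific LEAN file format.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class TradeBar:
    # Simplified stand-in for illustration, not the LEAN class.
    time: datetime
    period: timedelta
    open: float
    high: float
    low: float
    close: float
    volume: float

    @classmethod
    def reader(cls, line, day, period=timedelta(minutes=1)):
        """Factory: convert one CSV line into a TradeBar.

        Assumed (hypothetical) columns:
        milliseconds since midnight, open, high, low, close, volume.
        """
        fields = line.strip().split(",")
        if len(fields) != 6:
            raise ValueError(f"expected 6 columns, got {len(fields)}")
        millis, open_, high, low, close, volume = fields
        return cls(
            time=day + timedelta(milliseconds=int(millis)),
            period=period,
            open=float(open_),
            high=float(high),
            low=float(low),
            close=float(close),
            volume=float(volume),
        )


parsed = TradeBar.reader("34200000,10.0,10.4,9.9,10.2,175", datetime(2013, 10, 7))
print(parsed)
```

A custom data type would follow the same pattern: define the fields it needs and provide its own reader that knows how to parse a line of its source file.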

Other Data Formats

In theory, LEAN can accept data in any format (database, API or flat file). In practice, however, the reader implementations we currently have are written for a flat file system.