
PGPortfolio: Policy Gradient Portfolio, the source code of "A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem"(https://arxiv.org/pdf/1706.10059.pdf).

LongRunner800/PGPortfolio

 
 


This is the implementation of our paper, A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem (arXiv:1706.10059), together with a toolkit for portfolio management research.

  • The policy optimization method described in the paper is designed specifically for the portfolio management problem.
    • Unlike general-purpose reinforcement learning algorithms, it has efficiency and robustness similar to supervised learning.
    • This is because we formulate the problem as an immediate-reward optimization problem regularized by transaction cost, which does not require Monte Carlo or bootstrapped estimation of the gradients.
  • The network topology, training method, and input data can be configured in a separate JSON file. The training process is recorded and can be visualized with TensorBoard. Result summaries and parallel training are supported for better hyper-parameter optimization.
  • Financial-model-based portfolio management algorithms are also included in this library for comparison; their implementation is based on Li and Hoi's toolkit OLPS.
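To make the "immediate reward regularized by transaction cost" idea concrete, here is a minimal NumPy sketch of an average log-return objective. It is an illustration under stated assumptions, not the code used in this repository: the function name `average_log_return`, the array layout, the default commission rate, and the turnover-proportional cost approximation are all hypothetical simplifications.

```python
import numpy as np


def average_log_return(price_relatives, weights, commission=0.0025):
    """Mean per-period log return of a portfolio, net of a simplified cost.

    price_relatives: (T, m) array; row t holds price_t / price_{t-1} per asset.
    weights:         (T, m) array; row t holds the weights held during period t
                     (each row sums to 1).
    commission:      illustrative cost rate (an assumption, not a library default).
    """
    y = np.asarray(price_relatives, dtype=float)
    w = np.asarray(weights, dtype=float)

    # Gross growth of the portfolio value in each period: y_t . w_t
    gross = np.sum(y * w, axis=1)

    # Weights after prices have moved but before rebalancing.
    drifted = (y * w) / gross[:, None]

    # Simplified assumption: the cost of rebalancing is proportional to the
    # turnover between the drifted weights and the next target weights.
    turnover = np.sum(np.abs(w[1:] - drifted[:-1]), axis=1)
    shrink = np.ones(len(y))
    shrink[1:] = 1.0 - commission * turnover

    # Every term depends only on observed prices and chosen weights, so the
    # objective can be differentiated directly, with no sampled or
    # bootstrapped return estimates.
    return float(np.mean(np.log(shrink * gross)))
```

Because the objective is a plain function of the weights and the price data, it can be maximized with ordinary gradient ascent in an autodiff framework, which is the sense in which the training resembles supervised learning.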

Differences from the article version

Note that this library is a part of our main project, and it is several versions ahead of the article.

  • In this version, some technical bugs are fixed, and improvements in hyper-parameter tuning and engineering are made.
    • The most important error in the arXiv v2 article is that the stated test time span is about 30% shorter than the one used in the actual experiment. As a result, the volume-observation interval (used for asset selection) overlapped with the backtest data in the paper.
  • With the new hyper-parameters, users can train the models in less time (under 30 minutes).
  • All updates will be incorporated into future versions of the paper.
  • The original version history and internal discussions, including some in-code comments, have been removed from this open-source edition. These contain our unimplemented ideas, some of which will very likely become the foundations of our future publications.
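The overlap between the volume-observation interval and the backtest data can be guarded against with a simple interval check before running an experiment. The sketch below is only an illustration: the helper name `windows_overlap` is hypothetical and not part of this library, and the dates in the usage example are made up.

```python
from datetime import datetime


def windows_overlap(selection_start, selection_end, backtest_start, backtest_end):
    """Return True if two half-open intervals [start, end) share any time.

    Two intervals overlap exactly when each one starts before the other ends.
    """
    return selection_start < backtest_end and backtest_start < selection_end


# Hypothetical dates, for illustration only.
selection = (datetime(2017, 1, 1), datetime(2017, 2, 1))
backtest = (datetime(2017, 1, 20), datetime(2017, 3, 1))

if windows_overlap(*selection, *backtest):
    print("Asset selection uses data from inside the backtest window.")
```

Ending the asset-selection window at or before the start of the backtest keeps information from the test period out of the asset choice.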

Platform Support

Python 3.5+ on Windows and Python 2.7+/3.5+ on Linux are supported.

Dependencies

Install the dependencies via pip install -r requirements.txt

  • tensorflow (>= 1.0.0)
  • tflearn
  • pandas
  • ...

User Guide

Please check out the User Guide.

Acknowledgement

This project would not have been completed without code from the following open source projects:

Community Contribution

We welcome contributions from the community, including but not limited to:

  • Bug fixing
  • Interfacing to other markets such as stock, futures, options
  • Adding broker API (under marketdata)
  • More backtest strategies (under tdagent)

Risk Disclaimer (for Live-trading)

There is always a risk of loss in trading. All trading strategies are used at your own risk.

  • The project has been open source for many years (since 2017). Market efficiency may have increased considerably since then, and there is no guarantee that the exact same algorithm still works.
  • Although we tried to make our assumptions close to real market conditions, all results come from backtests on static historical data. Slippage and market impact can always be a problem if you deploy it as a live trading bot.
