Data Science for Finance is a survey course on the machine learning models and mathematical optimization methods relevant to financial engineering and financial data. We spend roughly 2/3 of the course on machine learning and the remaining 1/3 on portfolio optimization. Specifically, we will cover convexity and the fundamentals of mathematical optimization, including convex programming with the open source software CVX; unsupervised methods for clustering and dimension reduction, as well as kernels; supervised methods including regression, classification, and their kernel variants; a brief overview of deep learning; portfolio optimization with transaction costs and induced sparsity; and robust optimization. Unlike a traditional machine learning course, however, we will give special attention to financial applications (such as index tracking and sector segmentation) and to the challenges of working with financial data (such as leptokurtic or asymmetric market returns and non-stationary environments).
The course will consist of twice-weekly lectures, a final examination that constitutes 40% of the grade, and four homework assignments, each constituting 15% of the grade, that involve a written component and a programming component to be completed in Python. Accordingly, we will use Jupyter Notebook for assignment submissions, as notebooks support LaTeX syntax for inline mathematical expressions (via MathJax) and programming in Python, including inline display of figures.
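As a quick check of the weighting above (four homework assignments at 15% each plus a final at 40%), the following minimal sketch computes a course grade. It assumes every score is on a 0-100 scale and that no curve or dropped assignment applies, neither of which the syllabus specifies; the function name `course_grade` is purely illustrative.

```python
HOMEWORK_WEIGHT = 15  # percent of the grade per homework assignment
FINAL_WEIGHT = 40     # percent of the grade for the final examination
NUM_HOMEWORKS = 4


def course_grade(homework_scores, final_score):
    """Weighted course grade on a 0-100 scale (assumed scale, no curve)."""
    if len(homework_scores) != NUM_HOMEWORKS:
        raise ValueError("expected exactly four homework scores")
    weighted_total = HOMEWORK_WEIGHT * sum(homework_scores) + FINAL_WEIGHT * final_score
    return weighted_total / 100


# The weights account for the whole grade: 4 * 15 + 40 = 100.
assert NUM_HOMEWORKS * HOMEWORK_WEIGHT + FINAL_WEIGHT == 100

print(course_grade([90, 80, 100, 70], 85))  # prints 85.0
```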
Required:
- Knowledge of linear algebra at the level of Math 54 (i.e. vectors, matrices, eigenvalues/eigenvectors, orthogonality)
- Basic knowledge of multivariate statistics (e.g. covariance matrices, multivariate Gaussian distribution) and multivariate calculus (e.g. gradient vector, Hessian matrix, and Taylor approximation)
- Practical knowledge of finance and economics: measures of risk, random walks, utility maximization, etc.
- Basic programming experience with a mathematical or statistical language such as Matlab, R, or Python
- Some experience working with time series data (autocorrelation, stationarity, ARIMA, etc.)
Optional, but helpful:
- Experience applying computational linear algebra at the level of EE127 or Math 221
- Familiarity with numerical analysis at the level of Math 128A
- Familiarity with Probability and Mathematical Statistics at the level of Stat 135 or Rice's textbook
- Familiarity with convex programming at the level of Boyd and Vandenberghe's textbook
The course runs from 5 June 2017 to 26 July 2017 with the final examination scheduled for 2 August 2017 from 1-4PM. Professor El Ghaoui will lecture 1-3PM on Mondays and Wednesdays in F320 with the exception of 12 June and 14 June, which will be held in C135 and Andersen, respectively. Mustafa will lead discussion Monday 3-4PM in F320, with the exception of 12 June, which will be held in the Innovation Lab. A detailed lecture outline follows directly:
Week | Lecture No. | Date | Lecture Title | HW given | HW due |
---|---|---|---|---|---|
1 | 01 | 06/05/17 | Optimization models and convexity | ||
1 | 02 | 06/07/17 | k-means and clusterpath | 1 | |
2 | 03 | 06/12/17 | Covariance estimation and PCA | ||
2 | 04 | 06/14/17 | Generalized low rank models and Matrix Completion | ||
3 | 05 | 06/19/17 | LS/LAD regression and penalization | ||
3 | 06 | 06/21/17 | SVM and classification | 2 | 1 |
4 | 07 | 06/26/17 | Kernel Methods | ||
4 | 08 | 06/28/17 | Feature engineering and encoding | ||
5 | 09 | 07/03/17 | Deep learning and recent developments | ||
5 | 10 | 07/05/17 | Time Series Decomposition and Harmonic Regression | 3 | 2 |
6 | 11 | 07/10/17 | Portfolio optimization: Markowitz, Sharpe, and beyond | ||
6 | 12 | 07/12/17 | Constraints and Sparsity on the Simplex | ||
7 | 13 | 07/17/17 | Robust Optimization I | ||
7 | 14 | 07/19/17 | Robust Optimization II | 4 | 3 |
8 | 15 | 07/24/17 | (Optional) Review I | ||
8 | 16 | 07/26/17 | (Optional) Review II | ||
9 | — | 08/02/17 | Final | | 4 |
While a textbook is not required for this course, we provide the following collection of references to supplement the lectures.
- Optimization Models. Calafiore and El Ghaoui (2014).
- Elements of Statistical Learning. Hastie, Tibshirani, and Friedman (2009).
- Optimization Methods in Finance. Cornuejols and Tütüncü (2007).
- Convex Optimization. Boyd and Vandenberghe (2004).
- Deep Learning. Goodfellow, Bengio, and Courville (2016).
Homework is to be submitted electronically through bCourses and is due exactly two weeks from the day it is assigned. You may submit homework as late as 11:59PM on the day it is due. If you submit homework past the deadline, 5% will be deducted from your score for each hour, or part of an hour, that the submission is late. For example, if you submit your homework between 12:00AM and 12:59AM, you will lose 5%; if you submit between 1:00AM and 1:59AM, you will lose 10%; and so forth. Students are encouraged to collaborate in groups of no more than four to complete the homework; however, each student must submit their own unique solution and state their collaborators.
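The deadline and late-penalty rules above can be sketched in a few lines of Python. This is only an illustration of the stated policy, not an official grading tool: the function names are hypothetical, and the cap of the deduction at 100% is an assumption the policy does not spell out.

```python
from datetime import date, datetime, time, timedelta

PENALTY_PER_HOUR = 5  # percent deducted per started hour, from the policy above
MAX_PENALTY = 100     # assumption: the deduction cannot exceed the full score


def due_date(assigned: date) -> date:
    """Homework is due exactly two weeks after the day it is assigned."""
    return assigned + timedelta(weeks=2)


def late_penalty_percent(assigned: date, submitted: datetime) -> int:
    """Percent deducted for a submission made at `submitted`."""
    # Submissions are on time through 11:59PM on the due date, so the first
    # penalized instant is midnight at the start of the following day.
    first_late = datetime.combine(due_date(assigned) + timedelta(days=1), time(0, 0))
    if submitted < first_late:
        return 0
    # Every started hour after midnight costs another 5%.
    hours_started = int((submitted - first_late).total_seconds() // 3600) + 1
    return min(MAX_PENALTY, PENALTY_PER_HOUR * hours_started)


hw1_assigned = date(2017, 6, 7)  # Homework 1, per the lecture outline
print(due_date(hw1_assigned))                                           # 2017-06-21
print(late_penalty_percent(hw1_assigned, datetime(2017, 6, 22, 0, 30)))  # 5
print(late_penalty_percent(hw1_assigned, datetime(2017, 6, 22, 1, 30)))  # 10
```

The two sample submissions reproduce the 5% and 10% cases given in the policy.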
There will be one in-class exam at the end of the course on 2 August 2017 from 1-4PM. The exam will be open note, but no calculators, cell phones, electronic dictionaries, laptops, tablets, or other electronic devices are allowed. If you are caught violating these policies or collaborating with other students during the exam period, your exam will be destroyed and you will receive no credit. If you are a student registered with the Disabled Student Program (DSP) and you require special arrangements during exams, you must provide the appropriate documentation and make arrangements at least 10 days prior to the exam, detailing the special arrangements you require.
MAKEUP EXAMINATIONS WILL NOT BE OFFERED UNDER ANY CIRCUMSTANCES.
Grades for homework and exams will be changed only if there is a clear error on the part of the grader, such as adding up marks incorrectly. Grading errors must be brought to Mustafa's attention immediately after the homework or exam is returned.
This term we will be using Piazza for class discussion. Find our class page here. You may also reach us via email or office hours:
Title | Name | Email Address | Office Hour |
---|---|---|---|
Professor | Laurent El Ghaoui | [email protected] | Wednesday 3-4PM in F320 |
Assistant | Mustafa S Eisa | [email protected] | Friday 4-5PM in S276 |
MFE Program | — | [email protected] | — |