34_corr_matrix
For this assignment, you will write the first step toward your final evaluative assignment. Ultimately, you will write a program that, given a universe of assets, creates the optimal portfolios by calculating the efficient frontier, as described in your portfolio theory module. For this first part, you will calculate the correlation matrix for a universe of assets, given historical price data.

At a high level, your program should:
  (1) Read historical price data from a file.
  (2) For each asset, calculate the rate of return for each time step.
  (3) For each asset, calculate the average return and standard deviation.
  (4) Calculate the covariance matrix for all the assets.
  (5) Calculate the correlation matrix for all the assets.

Here are the detailed specifications.

You should provide a Makefile that compiles your code into an executable named "correl_matrix". Your program should take exactly one command line argument, the name of the file to read.

An example of this format is given in small.csv and year.csv. The first line of the data file gives the time increment (which you can ignore), followed by comma-separated asset names. All subsequent lines should have a time label (which you can ignore), followed by comma-separated prices, of which there should be the same number as there are assets. Note in year.csv that some of the prices are null. Your program should handle null or non-numeric data in some fields by just repeating the previous valid price for that asset. (Of course, if there is no valid data in a column, that is an error.)

For each change in time step (it does not matter if the data is daily, monthly, or something else), you should compute the rate of return. Once you have the rates of return, you can compute the average return and standard deviation for that asset.
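The null-handling and rate-of-return steps above can be sketched as follows. This is only an illustration, not a required design: the names parsePrice, fillPrices, and computeReturns are hypothetical, and the code assumes the simple return r_t = (p_t - p_(t-1)) / p_(t-1), since the exact return formula comes from the portfolio theory module rather than from this text. It also assumes that invalid fields appearing before the first valid price are filled with that first valid price, which the specification does not address.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: try to read one CSV field as a price.
// Returns false for "null", empty, or otherwise non-numeric text.
bool parsePrice(const std::string & field, double & price) {
  const char * start = field.c_str();
  char * end = nullptr;
  double value = std::strtod(start, &end);
  if (end == start) {
    return false;  // nothing numeric at the front of the field
  }
  while (*end == ' ' || *end == '\t' || *end == '\r') {
    ++end;  // tolerate trailing whitespace
  }
  if (*end != '\0' || !std::isfinite(value)) {
    return false;  // trailing junk, or text such as "nan" / "inf"
  }
  price = value;
  return true;
}

// Convert one asset's column of raw fields into prices, repeating the
// previous valid price wherever a field is invalid.
// Assumption: leading invalid fields take the first valid price.
std::vector<double> fillPrices(const std::vector<std::string> & column) {
  std::vector<double> prices(column.size());
  std::size_t firstValid = column.size();
  double last = 0.0;
  for (std::size_t i = 0; i < column.size(); ++i) {
    double value = 0.0;
    if (parsePrice(column[i], value)) {
      last = value;
      if (firstValid == column.size()) {
        firstValid = i;
      }
    }
    prices[i] = last;
  }
  if (firstValid == column.size()) {
    throw std::invalid_argument("column contains no valid price");
  }
  for (std::size_t i = 0; i < firstValid; ++i) {
    prices[i] = prices[firstValid];
  }
  return prices;
}

// Simple rate of return between consecutive time steps (assumed formula).
std::vector<double> computeReturns(const std::vector<double> & prices) {
  std::vector<double> returns;
  for (std::size_t t = 1; t < prices.size(); ++t) {
    assert(prices[t - 1] != 0.0);  // a zero price would divide by zero
    returns.push_back((prices[t] - prices[t - 1]) / prices[t - 1]);
  }
  return returns;
}
```

With this sketch, a column such as 100, null, 110 becomes the prices 100, 100, 110, so a repeated price simply yields a return of zero for that step. A column of m prices produces m - 1 returns.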
As shown in the portfolio theory modules, standard deviation is calculated as follows:

                /   1                          \
    sigma = sqrt| ----- Sum_t (r_t - r_avg)^2   |
                \ n - 1                        /

Next, you will calculate the covariance matrix, where each element is the covariance of the two assets at that row and column.

           1     2    ...    n
         -                     -
     1  | s_11  s_12  ...  s_1n |
     2  | s_12  s_22            |
     :  |  :          `.        |
     n  | s_1n          `  s_nn |
         -                     -

Recall that covariance for two assets a and b is given by:

            1
    s_ab = --- Sum_t (ra_t - ra_avg)(rb_t - rb_avg)
            n

Finally, you will use each single asset's standard deviation to calculate the correlation matrix. Recall that correlation is given by:

             s_ab
    p_ab = -------
           s_a s_b

Where the matrix looks like:

           1     2    ...    n
         -                     -
     1  |  1    p_12  ...  p_1n |
     2  | p_12   1              |
     :  |  :          `.        |
     n  | p_1n          `   1   |
         -                     -

Recall that -1 <= p_ab <= 1, where positive correlation means assets change in the same direction, and negative correlation means assets change in opposite directions. Therefore, the correlation of an asset with itself should be exactly 1. Note, however, that these formulas will not give you exactly p_aa = 1 (the standard deviation divides by n - 1 while the covariance divides by n) but will approach 1 for large time series of data. For this project, let p_aa be exactly 1 instead of doing the correlation computation.

Your program should print the result to stdout as follows.

    [list of assets, newline delimited]
    [correlation matrix]

Examples are given in small.out and year.out. Note that the matrix must be formatted with open and close square brackets and comma-delimited values, such that each floating point number is printed in a field of width 7 with four digits after the decimal point. See ios_base::width, setprecision, and fixed in the C++ library.

For full credit, your program must run cleanly under valgrind. Of course, you should test your program on many more inputs than those provided.

You will also be graded on code quality. This means your code should make good use of abstraction, have good variable, function, and class names, be well commented and formatted, and have at least one class definition.
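The formulas above can be sketched as free functions. This is an illustrative outline only: the names mean, stdDev, covariance, correlationMatrix, and formatValue are hypothetical, and the input is assumed to be one vector of returns per asset, all of equal length. The formatting helper only shows the width-7, four-decimal field; the exact bracket and line layout should be taken from small.out and year.out rather than from this sketch.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

typedef std::vector<std::vector<double> > Matrix;

// Average return of one asset.
double mean(const std::vector<double> & r) {
  assert(!r.empty());
  double sum = 0.0;
  for (double x : r) {
    sum += x;
  }
  return sum / r.size();
}

// Standard deviation using the 1 / (n - 1) factor from the text.
double stdDev(const std::vector<double> & r) {
  assert(r.size() > 1);
  double avg = mean(r);
  double sumSq = 0.0;
  for (double x : r) {
    sumSq += (x - avg) * (x - avg);
  }
  return std::sqrt(sumSq / (r.size() - 1));
}

// Covariance using the 1 / n factor from the text.
double covariance(const std::vector<double> & a, const std::vector<double> & b) {
  assert(!a.empty() && a.size() == b.size());
  double avgA = mean(a);
  double avgB = mean(b);
  double sum = 0.0;
  for (std::size_t t = 0; t < a.size(); ++t) {
    sum += (a[t] - avgA) * (b[t] - avgB);
  }
  return sum / a.size();
}

// Correlation matrix; the diagonal is set to exactly 1 rather than computed.
Matrix correlationMatrix(const std::vector<std::vector<double> > & returns) {
  std::size_t n = returns.size();
  Matrix corr(n, std::vector<double>(n, 1.0));
  for (std::size_t i = 0; i < n; ++i) {
    for (std::size_t j = 0; j < n; ++j) {
      if (i != j) {
        corr[i][j] = covariance(returns[i], returns[j]) /
                     (stdDev(returns[i]) * stdDev(returns[j]));
      }
    }
  }
  return corr;
}

// One matrix entry: fixed notation, four decimals, field width 7.
std::string formatValue(double value) {
  std::ostringstream out;
  out << std::fixed << std::setprecision(4) << std::setw(7) << value;
  return out.str();
}
```

As a small check of the remark about p_aa: for the three returns 0.1, 0.2, 0.3 paired with 0.2, 0.4, 0.6 (perfectly correlated series), these formulas give (n - 1) / n = 2/3 rather than 1, which is why the diagonal is assigned directly. Note that std::setw applies only to the next value written, so it has to be set again for every number.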
While you are free to implement this in any way that is reasonable, I recommend making an Asset class, keeping in mind that an asset has a name, time series rate of return, average return, and standard deviation. Making the covariance calculation a member function of Asset could also be nice. Rather than write a Matrix class, I chose to typedef a vector of vectors of doubles. Another good abstraction would be to separate your source code into multiple files. One idea is to have files: main.cpp, parse.cpp, asset.hpp, and asset.cpp.
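One possible shape for the recommended Asset class is sketched below. Every name here (the member names, getName, getAverage, getStdDev, covariance, correlation) is a hypothetical choice, not something prescribed by the assignment, and the sketch assumes the rates of return have already been computed before the object is constructed. In a multi-file layout, the class definition would go in asset.hpp and the member function bodies in asset.cpp.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

class Asset {
 public:
  // Computes the average return and standard deviation once, up front.
  Asset(const std::string & assetName, const std::vector<double> & rates)
      : name(assetName), returns(rates), average(0.0), stdDev(0.0) {
    assert(returns.size() > 1);
    for (double r : returns) {
      average += r;
    }
    average /= returns.size();
    double sumSq = 0.0;
    for (double r : returns) {
      sumSq += (r - average) * (r - average);
    }
    stdDev = std::sqrt(sumSq / (returns.size() - 1));  // 1 / (n - 1) factor
  }

  const std::string & getName() const { return name; }
  double getAverage() const { return average; }
  double getStdDev() const { return stdDev; }

  // Covariance with another asset, using the 1 / n factor.
  double covariance(const Asset & other) const {
    assert(returns.size() == other.returns.size());
    double sum = 0.0;
    for (std::size_t t = 0; t < returns.size(); ++t) {
      sum += (returns[t] - average) * (other.returns[t] - other.average);
    }
    return sum / returns.size();
  }

  // Correlation with another asset; an asset with itself is exactly 1.
  double correlation(const Asset & other) const {
    if (this == &other) {
      return 1.0;
    }
    return covariance(other) / (stdDev * other.stdDev);
  }

 private:
  std::string name;
  std::vector<double> returns;
  double average;
  double stdDev;
};
```

Storing the average and standard deviation as members means each is computed once per asset instead of once per matrix entry. Because the class owns only standard containers, it needs no hand-written destructor, which helps keep the program clean under valgrind.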