Statistical Arbitrage
Ying Chen, Leonardo Bachega, Yandong Guo, Xing Liu
February 2010

Outline
• Overview of the project
• Improvements in the last week
  • Speed up the data access
  • Improve the PCA algorithm
  • Use adjusted prices in the PNL calculation
  • Take trading volume into account
• Future work
Framework
• Data pre-processing (Python scripts): raw historical data from WRDS → adjusted stock price series + indices
• Back-testing simulations (MATLAB scripts):
  • Market model (252-day returns): PCA eigenportfolios, or ETFs for industry sectors
  • Residual process model (60-day returns): residuals as increments of an AR process
  • Compute s-scores from the residual process model and current stock prices
  • Signal trade orders
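The "residual process model" and "compute s-scores" steps can be sketched as follows. This is a minimal illustration, not the project's MATLAB code: it assumes the AR process is a discrete AR(1) fitted by least squares to the cumulative sum of the regression residuals, and that the s-score is the distance of the current level from the fitted equilibrium in units of the equilibrium standard deviation. The function name `s_score` and the synthetic data are hypothetical.

```python
import numpy as np

def s_score(residuals):
    """Return the s-score for one stock from a window of daily residuals.

    Assumption: the residuals are increments of a process X that follows
    X[k+1] = a + b * X[k] + noise, with 0 < b < 1 (mean reversion).
    """
    x = np.cumsum(residuals)                 # auxiliary process X_k
    b, a = np.polyfit(x[:-1], x[1:], 1)      # slope b, intercept a
    if not 0.0 < b < 1.0:
        return np.nan                        # no usable mean reversion in this window
    noise = x[1:] - (a + b * x[:-1])         # residuals of the AR(1) fit
    m = a / (1.0 - b)                        # equilibrium level
    sigma_eq = np.sqrt(np.var(noise) / (1.0 - b**2))  # equilibrium std dev
    return (x[-1] - m) / sigma_eq

# Synthetic 60-day window drawn from a mean-reverting process (b = 0.5).
rng = np.random.default_rng(0)
x_true = np.zeros(61)
for k in range(60):
    x_true[k + 1] = 0.5 * x_true[k] + rng.normal(0.0, 0.01)
residuals = np.diff(x_true)                  # 60 daily increments

score = s_score(residuals)
print(score)
```

A trading rule would then compare the score with entry and exit thresholds. The thresholds are not given on the slide, so none are assumed here.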
Code speedup: data access
• Tradeoff
  • Always read from disk: very slow
  • Everything in memory: not robust, can be slow
• Approach: cache parts of the dataset in memory
• Before vs. after: total speedup > 16 times [figure labels: Fast code, Same]
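The idea of caching parts of the dataset can be illustrated with a bounded in-memory cache. This is a sketch under assumptions: the slide does not say how the cache was implemented, `read_prices_from_disk` is a hypothetical stand-in for the slow disk read, and the cache size of 500 is arbitrary.

```python
from functools import lru_cache

import numpy as np

DISK_READS = 0  # counts simulated disk accesses, for illustration only

def read_prices_from_disk(ticker):
    """Hypothetical stand-in for a slow per-ticker read from disk."""
    global DISK_READS
    DISK_READS += 1
    rng = np.random.default_rng(sum(map(ord, ticker)))
    return 100.0 + np.cumsum(rng.normal(0.0, 1.0, size=252))

@lru_cache(maxsize=500)  # keep at most 500 tickers in memory, evict least recently used
def get_prices(ticker):
    return read_prices_from_disk(ticker)

# Repeated requests for the same tickers hit the cache after the first pass.
for _ in range(3):
    for ticker in ("AAA", "BBB"):
        get_prices(ticker)

print(DISK_READS)  # 2: each ticker is read from disk once
```

A bounded cache sits between the two extremes on the slide: memory use is capped by `maxsize`, while frequently used series avoid repeated disk reads. The cached arrays are shared objects, so callers should not modify them in place.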
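PCA amelioration (1/4)
• Suppose X is an n×p matrix of n samples and p features, with each column standardized (zero mean, unit variance).
• Original algorithm:
  • Calculate the eigen-decomposition of the correlation matrix: $C = \frac{1}{n-1} X^T X = Q \Lambda Q^T$
  • The columns of the matrix Q are the eigenvectors of the correlation matrix, and $\Lambda$ is the diagonal matrix of eigenvalues.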
PCA amelioration (2/4)
• Suppose X is an n×p matrix of n samples and p features, with each column standardized.
• Substituted algorithm:
  • Use the singular value decomposition (SVD) to get the eigenvectors: $X = U D V^T$
  • The columns of V are then the eigenvectors of the correlation matrix.
  • This reduces the computational cost by around 80%.
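PCA amelioration (3/4)
• Proof: with the SVD $X = U D V^T$,
  $X^T X = V D^T U^T U D V^T = V (D^T D) V^T$
• Since U and V are orthogonal ($U^T U = I$, $V^T V = I$), the columns of V are the eigenvectors of the correlation matrix $\frac{1}{n-1} X^T X$.
• The matrix of eigenvalues equals the diagonal matrix $\frac{1}{n-1} D^T D$.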
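The equivalence between the two routes can be checked numerically. The sketch below uses random data with illustrative sizes (not the 1590-stock universe) and assumes the columns are standardized so that $X^T X / (n-1)$ is the correlation matrix. It checks only that the results agree and does not measure the speedup.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 252, 20                                    # illustrative sizes only
returns = rng.normal(size=(n, p))

# Standardize each column so that X.T @ X / (n - 1) is the correlation matrix.
X = (returns - returns.mean(axis=0)) / returns.std(axis=0, ddof=1)

# Route 1: eigen-decomposition of the p x p correlation matrix.
C = X.T @ X / (n - 1)
eigvals, Q = np.linalg.eigh(C)                    # eigenvalues in ascending order
eigvals, Q = eigvals[::-1], Q[:, ::-1]            # reorder to descending

# Route 2: thin SVD of X; the correlation matrix is never formed.
U, d, Vt = np.linalg.svd(X, full_matrices=False)  # singular values descending
svd_eigvals = d**2 / (n - 1)

same_values = np.allclose(eigvals, svd_eigvals)
# Eigenvectors are only defined up to sign, so compare absolute values.
same_vectors = np.allclose(np.abs(Q), np.abs(Vt.T), atol=1e-6)
print(same_values, same_vectors)
```

The comparison of absolute values relies on the eigenvalues being distinct, which holds for generic random data like this.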
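PCA amelioration (4/4)
• Notice: if p is an eigenvector of the correlation matrix C, then $-p$ is also an eigenvector, with the same eigenvalue.
• Since if $C p = \lambda p$
• Then $C(-p) = -C p = \lambda(-p)$
• The effect of the "negative" sign can be removed in the estimation step.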
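One way to read the last point is that a sign flip in an eigenvector is absorbed by the regression coefficient, so the residuals used downstream do not change. The sketch below illustrates that interpretation, which is an assumption rather than something the slide spells out. The helper `ols_residuals` is hypothetical, and the factor is built directly from standardized returns for simplicity.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 252, 10                                   # illustrative sizes only
X = rng.normal(size=(n, p))
X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

_, _, Vt = np.linalg.svd(X, full_matrices=False)
v = Vt[0]                                        # first eigenvector, sign arbitrary

def ols_residuals(stock, factor):
    """Regress stock returns on one factor plus an intercept.

    Returns the residuals and the factor loading.
    """
    A = np.column_stack([np.ones_like(factor), factor])
    beta, *_ = np.linalg.lstsq(A, stock, rcond=None)
    return stock - A @ beta, beta[1]

stock = X[:, 0]
res_pos, beta_pos = ols_residuals(stock, X @ v)      # factor built from v
res_neg, beta_neg = ols_residuals(stock, X @ (-v))   # factor built from -v

print(np.allclose(res_pos, res_neg))     # residuals agree
print(np.isclose(beta_pos, -beta_neg))   # only the loading changes sign
```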
Experiment result (Fig. 1): top 50 eigenvalues of the correlation matrix of market returns computed on May 1, 2007, estimated using a 1-year window and a universe of 1590 stocks
Experiment result 2: values of the first eigenvector
Experiment result 2: values of the second eigenvector
Experiment result 2: values of the third eigenvector
Preliminary PNL experiment [figure; date labels: Dec-13-1994, Feb-12-1998]
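Taking trading volume into account
• Problem
  • Mean-reversion strategies are sensitive to the trading volume immediately before the signal is triggered.
• Modified returns
  • Each return is rescaled by $\langle \delta V \rangle / \delta V_t$, where $\delta V_t$ is the volume traded in period t and $\langle \delta V \rangle$ is the average daily trading volume over a given trading window.
• Experiments
  • PCA/ETF with actual prices vs. using trading volume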
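A minimal sketch of volume-modified returns follows. It assumes the usual trading-time form, in which each daily return is multiplied by the ratio of the average daily volume to that day's volume. The 3-day window in the example, the choice to exclude the current day from the average, and the function name are illustrative assumptions not stated on the slide.

```python
import numpy as np

def volume_modified_returns(returns, volumes, window=10):
    """Rescale daily returns by (trailing average volume) / (that day's volume).

    Days without a full trailing window are returned as NaN.
    """
    returns = np.asarray(returns, dtype=float)
    volumes = np.asarray(volumes, dtype=float)
    modified = np.full_like(returns, np.nan)
    for t in range(window, len(returns)):
        avg_volume = volumes[t - window:t].mean()   # trailing window, excludes day t
        modified[t] = returns[t] * avg_volume / volumes[t]
    return modified

r = np.array([0.01, -0.02, 0.015, 0.03, -0.01])
v = np.array([100.0, 100.0, 100.0, 200.0, 50.0])
out = volume_modified_returns(r, v, window=3)
print(out)
```

In this example the 3% move on double the usual volume is damped to 1.5%, while the 1% drop on light volume is amplified, so returns on unusually heavy volume count for less.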
Top 50 eigenvalues of the correlation matrix (trading time): top 50 eigenvalues of the correlation matrix of market returns computed on May 1, 2007, estimated using a 1-year window and a universe of 1590 stocks
Top 50 eigenvalues of the correlation matrix (calendar time): top 50 eigenvalues of the correlation matrix of market returns computed on May 1, 2007, estimated using a 1-year window and a universe of 1590 stocks
Value of the first eigenvector
Future work
• Experiment on ETFs
  • Associate each stock with one ETF
  • Compare ETF with PCA
• Take into account
  • Transaction fees, interest, dividends
• Calculate PCA using trading-time modified returns