A. Predicting Price Direction (34 marks)
Unpredictability of short-term asset returns is a subject of asset pricing research: efficient markets produce near-Normal daily returns with low correlation to past values. That limits the application of autoregression on lagged returns. However, progress is possible, and in this assignment you will make direction predictions using any 2 of the 4 types of classifier, of your choice.
Predict the sign of the next daily move; you are welcome to modify the task to predict over longer periods, e.g. a 5-day move. Certain classifiers (SVM, ANN) are better suited to such longer-term predictions.
The assignment limits the task to binomial prediction of asset price movement: positive or negative return, labelled {-1, 1}. For some classifiers, particularly neural networks, or if using bagging/boosting, relabel as {0, 1}.
Start with lagged log-returns r_{t-1}, r_{t-2}, ... as your features. Use ADDITIONAL simple variations around the price P_t from Table 1. More complex indicators (e.g. RSI, Stochastic K, MACD, CCI, Acc/Distrib) are beyond the scope of the assignment.
Study Design:
Choose 2 equities with a basis for comparison (e.g. same industry), or 2 broad market indexes, or 2 Fama-French factors. FF factors are good candidate series for trying prediction of monthly returns.
For classifiers other than ANNs, use 7 or fewer lagged values. You can also introduce the past 5D return or the 5D Momentum P_t - P_{t-5} as a feature.
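The feature set described above can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the function name `make_features`, the column names and the synthetic price series are all assumptions, and features are shifted so that each row only uses information available before the move being predicted.

```python
import numpy as np
import pandas as pd

def make_features(prices: pd.Series, n_lags: int = 5, horizon: int = 1) -> pd.DataFrame:
    """Lagged log-returns, 5D momentum and a {-1, 1} direction label (hypothetical helper)."""
    r = np.log(prices).diff()                          # r_t = ln(P_t / P_{t-1})
    out = pd.DataFrame(index=prices.index)
    for k in range(1, n_lags + 1):
        out[f"ret_lag{k}"] = r.shift(k)                # r_{t-k}, known before day t
    # 5D momentum as of the previous close: P_{t-1} - P_{t-6}
    out["mom_5d"] = (prices - prices.shift(5)).shift(1)
    # Move to predict: log-return from the close of t-1 to the close of t+horizon-1
    out["fwd"] = np.log(prices.shift(-(horizon - 1)) / prices.shift(1))
    out = out.dropna().copy()
    # Zero moves are labelled -1 here, which is an arbitrary choice
    out["y"] = np.where(out["fwd"] > 0, 1, -1)
    return out.drop(columns="fwd")

# Synthetic random-walk prices stand in for real data (an assumption)
rng = np.random.default_rng(0)
prices = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, 300))))
features = make_features(prices)
print(features.head())
```

Setting `horizon=5` switches the label to the sign of a 5-day move; relabelling to {0, 1} is then `(features["y"] + 1) // 2`.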
Classifier A.1 Logistic Classifier and Bayesian Classifier
(a) Make sure to implement penalised versions of logistic regression and discuss the impact on coefficients. Apply and discuss the difference between the L1 and L2 cost functions and the impact each has on the regression coefficients.
(b) Demonstrate the use of sklearn.model_selection for reshuffled samples and k-fold cross-validation.
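One way to compare the two penalties is sketched below, under stated assumptions: the data are synthetic (only the first of six columns carries signal), and the regularisation strength `C=0.05` is an arbitrary illustrative value. L1 tends to drive uninformative coefficients exactly to zero, while L2 only shrinks them.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, ShuffleSplit, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption): 6 "lagged return" columns,
# of which only the first carries any signal about direction.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))
y = (X[:, 0] + 0.5 * rng.normal(size=500) > 0).astype(int)

coefs, cv_scores = {}, {}
for penalty in ("l1", "l2"):
    model = make_pipeline(
        StandardScaler(),
        LogisticRegression(penalty=penalty, C=0.05, solver="liblinear"),
    )
    # k-fold CV on contiguous blocks, then reshuffled random splits
    kfold = cross_val_score(model, X, y, cv=KFold(n_splits=5))
    shuffled = cross_val_score(
        model, X, y, cv=ShuffleSplit(n_splits=5, test_size=0.2, random_state=0)
    )
    cv_scores[penalty] = (kfold.mean(), shuffled.mean())
    # Refit on all data and keep the coefficients of the final pipeline step
    coefs[penalty] = model.fit(X, y)[-1].coef_.ravel()

for penalty in coefs:
    print(penalty, np.round(coefs[penalty], 3), cv_scores[penalty])
```

With real return series, reshuffled splits mix past and future observations, so it is worth comparing their scores against contiguous folds when discussing the results.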
Classifier A.2 Support Vector Machines
(a) Consider soft vs. hard margin: present both in mathematical notation and consider the impact on your 2D relationships.
(b) Specifically consider the Momentum feature vs Return_{t-1} and provide a 2D visualisation (up/down points in different colours). While support vectors are difficult to present, use the SVC.support_vectors_ attribute from sklearn.svm and prepare interpretable visualisations.
(c) No need to vary the type of kernel.
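The effect of margin softness on the support vectors can be sketched as below. The two synthetic columns stand in for the Momentum and Return_{t-1} features, and the two values of `C` are arbitrary choices for illustration. In sklearn's `SVC`, a small `C` penalises margin violations lightly (soft margin), while a very large `C` approaches the hard-margin case.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic 2D features (an assumption): columns play the role of
# [momentum, return_{t-1}]; labels are a noisy linear function of both.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 2))
y = np.where(X[:, 0] + X[:, 1] + rng.normal(scale=0.8, size=300) > 0, 1, -1)
Xs = StandardScaler().fit_transform(X)

# Soft-margin problem: minimise 0.5*||w||^2 + C * sum(xi_i)
# subject to y_i (w.x_i + b) >= 1 - xi_i, xi_i >= 0.
n_sv = {}
for C in (0.01, 100.0):
    clf = SVC(kernel="linear", C=C).fit(Xs, y)
    n_sv[C] = len(clf.support_vectors_)      # points on or inside the margin
    print(f"C={C}: {n_sv[C]} support vectors, train accuracy {clf.score(Xs, y):.3f}")
```

For the visualisation, a scatter of `Xs` coloured by `y` with `clf.support_vectors_` overlaid as hollow markers usually makes the margin region easy to read.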
Classifier A.3 Decision Tree Regressor or Boosted Random Forest
(a) Visualise the decision tree regressor (note the limitations of graphviz) and discuss whether the splits are sensible choices. Each split gives the percentage sorted into one class (up) versus the other (down).
(b) Report hyperparameters: minimum number to split, minimum number in a leaf, and maximum depth.
(c) A decision tree classifier builds a very elaborate tree that achieves a perfect in-sample fit (likely better suited to non-time-series data); it is critical to test the prediction on a holdout sample.
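The contrast between an unconstrained and a constrained tree can be sketched as follows. The data, feature names and hyperparameter values are illustrative assumptions; `DecisionTreeClassifier` is used here because the target is a direction label, and `export_text` is a plain-text alternative to graphviz.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in features (an assumption), weakly related to the label
rng = np.random.default_rng(2)
X = rng.normal(scale=0.01, size=(600, 3))
y = (0.3 * X[:, 0] + rng.normal(scale=0.01, size=600) > 0).astype(int)
names = ["ret_lag1", "ret_lag2", "mom_5d"]   # hypothetical column names

# Chronological split: the last 150 rows form the holdout sample
X_train, X_test = X[:450], X[450:]
y_train, y_test = y[:450], y[450:]

deep = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
pruned = DecisionTreeClassifier(
    max_depth=3, min_samples_split=40, min_samples_leaf=20, random_state=0
).fit(X_train, y_train)

for label, tree in [("unconstrained", deep), ("pruned", pruned)]:
    print(
        f"{label}: depth={tree.get_depth()}, "
        f"in-sample={tree.score(X_train, y_train):.3f}, "
        f"holdout={tree.score(X_test, y_test):.3f}"
    )

# Text rendering of the pruned tree's splits
print(export_text(pruned, feature_names=names))
```

The unconstrained tree memorises the training rows, so the gap between its in-sample and holdout scores is the quantity to report and discuss.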
Classifier A.4 Artificial Neural Network
If one believes the data carries autoregressive structure, a recurrent neural network model can be a successful alternative to time series regression.
(a) Attempt to use an LSTM classifier with the features given in Table 1. LSTM can come out as one of the best-predicting models when fed financial ratios, volatility estimators or advanced technical indicators, but those features are beyond the scope.
(b) Dealing with an arbitrary sequence length is the major characteristic of LSTM. Attempt prediction of the 5D or 10D return for an equity, or the 1W or 1M return for an FF factor; for robust estimation use 5-7 years of data for an equity.
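Recurrent layers typically expect a 3D input of shape (samples, timesteps, features), so the flat feature table has to be cut into overlapping windows first. The sketch below shows only that reshaping step with NumPy; the helper name `make_sequences` and the window length are assumptions, and it presumes `labels[t]` is already aligned as the direction of the move that follows row `t`.

```python
import numpy as np

def make_sequences(features: np.ndarray, labels: np.ndarray, window: int):
    """Cut a (T, n_features) array into overlapping windows (hypothetical helper).

    Each sample holds `window` consecutive rows; its label is the one
    attached to the last row of the window.
    """
    features = np.asarray(features)
    labels = np.asarray(labels)
    ends = range(window - 1, len(features))            # index of each window's last row
    X = np.stack([features[e - window + 1 : e + 1] for e in ends])
    y = labels[window - 1 :]
    return X, y

# Toy example: 10 time steps, 2 features
toy_features = np.arange(20, dtype=float).reshape(10, 2)
toy_labels = np.arange(10)
X_seq, y_seq = make_sequences(toy_features, toy_labels, window=4)
print(X_seq.shape, y_seq.shape)
```

The resulting `X_seq` can then be passed to an LSTM layer in the framework of your choice; scaling should be fitted on the training portion only before windowing.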
B. Prediction Quality and Bias (each chosen classifier)
Task B.1 Investigate the prediction quality using the confusion matrix, precision/recall statistics and the area under the ROC curve; these are possible for all classifiers if the prediction is binomial. In particular, check the quality of predicting the down movements (negative sign of return).
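The statistics named in Task B.1 can be computed as sketched below. The labels and probabilities are made-up toy values (0 = down, 1 = up), chosen only so the numbers are easy to check by hand; precision and recall for the down class are obtained by setting `pos_label=0`.

```python
import numpy as np
from sklearn.metrics import (
    classification_report,
    confusion_matrix,
    precision_score,
    recall_score,
    roc_auc_score,
)

# Toy values (an assumption): true directions and predicted P(up)
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 1, 0])
p_up = np.array([0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.55, 0.65, 0.45])
y_pred = (p_up >= 0.5).astype(int)

# Rows are true classes, columns are predicted classes, ordered [down, up]
cm = confusion_matrix(y_true, y_pred, labels=[0, 1])
down_precision = precision_score(y_true, y_pred, pos_label=0)
down_recall = recall_score(y_true, y_pred, pos_label=0)
auc = roc_auc_score(y_true, p_up)

print(cm)
print(f"down precision={down_precision:.2f}, down recall={down_recall:.2f}, AUC={auc:.2f}")
print(classification_report(y_true, y_pred, target_names=["down", "up"]))
```

Here 3 of the 5 true down moves are caught (recall 0.60) and 3 of the 4 predicted down moves are correct (precision 0.75).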
Task B.2 Improve your use of each classifier by changing features or hyperparameters, for example with sklearn.model_selection.GridSearchCV. Alternatively, introduce bagging/boosting and discuss the impact on prediction quality. A new boosted model deals with the mistakes of the previous models; a common choice is AdaBoost with decision trees as weak learners. In particular, describe the steps taken to reduce misclassified negative returns. Present a comparison BEFORE and AFTER your improvements.
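A hyperparameter search with `GridSearchCV` might look like the sketch below. Several choices here are assumptions rather than requirements: the synthetic data, the grid values, the use of `TimeSeriesSplit` so that validation folds always follow their training folds, and `balanced_accuracy` as a score that weights the down class as heavily as the up class.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, mildly imbalanced stand-in data (an assumption)
rng = np.random.default_rng(3)
X = rng.normal(size=(400, 5))
y = (X[:, 0] + rng.normal(size=400) > 0.3).astype(int)

pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
param_grid = {
    "logisticregression__C": [0.01, 0.1, 1.0, 10.0],
    # "balanced" reweights classes inversely to their frequency
    "logisticregression__class_weight": [None, "balanced"],
}

search = GridSearchCV(
    pipe,
    param_grid,
    cv=TimeSeriesSplit(n_splits=5),      # each fold validates on later data only
    scoring="balanced_accuracy",
)
search.fit(X, y)

print("best params:", search.best_params_)
print("best CV balanced accuracy:", round(search.best_score_, 3))
```

For the BEFORE/AFTER comparison, score the default model and `search.best_estimator_` on the same holdout sample that was excluded from the search.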
Task B.3 Develop a scheme that utilises transition probabilities (the predict_proba method). Provide separate scatter plots for the probabilities of up and down moves, using colour codes for correctly/incorrectly realised predictions. Devise a P&L that relies on fractional betting and the edge p - (1 - p) = 2p - 1, where the probability of the move p is above a threshold (75%, 90%). Discuss over-relying on transition probabilities for poorly predicted negative returns.
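One possible reading of the betting scheme is sketched below; it is an assumption, not the only valid design. The stake is set equal to the edge 2p - 1 for the predicted direction, and no bet is placed when p is below the threshold. The helper name `edge_pnl` and the toy inputs are hypothetical.

```python
import numpy as np

def edge_pnl(p_up, returns, threshold=0.75):
    """Per-period P&L from edge-sized bets (hypothetical helper).

    p_up    : predicted probability of an up move, e.g. predict_proba(X)[:, 1]
    returns : realised returns over the predicted period
    """
    p_up = np.asarray(p_up, dtype=float)
    returns = np.asarray(returns, dtype=float)
    p = np.maximum(p_up, 1.0 - p_up)                  # probability of the predicted move
    direction = np.where(p_up >= 0.5, 1.0, -1.0)      # long for up, short for down
    edge = 2.0 * p - 1.0                              # p - (1 - p)
    stake = np.where(p >= threshold, edge, 0.0)       # bet only above the threshold
    return direction * stake * returns

# Toy inputs (an assumption)
toy_p_up = [0.9, 0.6, 0.1, 0.8]
toy_returns = [0.01, 0.02, -0.03, -0.01]
pnl = edge_pnl(toy_p_up, toy_returns)
print(pnl, pnl.sum())
```

In the toy case the second period is skipped (p = 0.6 is below the threshold), the third is a correct short, and the fourth is a confident but wrong long, which is exactly the situation to examine when probabilities for down moves are poorly calibrated.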
Work on these tasks can be appended to each classifier use case.
Instructions
Work on ALL tasks in the format required. Recite the mathematical underpinnings for each chosen classifier. Code must be submitted and must produce the computational output. Full mathematical workings are required for Interest Rates Modeling questions.
Format and Coding: Submit ONE .pdf report file and ONE .zip file with data and code, with each file name starting with your LASTNAME. It is advantageous to merge all your workings into one PDF file.
Implementation is best done in Python using sklearn. For those starting with Python, price direction prediction MODIFIED.ipynb is provided as a template to start the work.
It is acceptable to implement classification in R/Matlab, but tutor support might be limited. Matlab use should not devolve into exploration with the Classification Learner App only.
It is possible to have a limited implementation in Excel (e.g. logistic regression); however, that risks falling below the passing mark of 60% because other kinds of classifiers and the prediction-quality analysis would be missing.
Report Content and Analytical Quality:
If printing out a Python Notebook as your report, please ensure it comes across as an analytical report with (a) headers to separate sections, (b) clarity about which sections address Questions A.1-A.4 and B.1-B.3, and (c) no large tables of output (show the head/tail or a selected sample).
It is not expected that you will have particularly good accuracy in predicting short-term returns from past returns; prediction analysis and a clear explanation of improvement steps are what matter.
Within each kind of classifier, focus on explaining the underpinnings and tuning of the classifier. Your implementation should include tuning of the parameters that are specific to each classifier, e.g. regularisation strength for Logistic, margin softness for SVM. It is good practice to save some data as a holdout sample (not used in estimation) on which to test your fitted models.