Assessed Learning Outcomes
This second assessment tests your ability to: build a predictive model based on probability distributions; analyse a dataset to gain an understanding of the data from within Python; build a regression model and evaluate it; and communicate your findings on your predictive model.
How to submit
For this assignment, you need to submit the following:
1. A short report (in .pdf) on your findings in exploring the given datasets, a description of your models and their evaluations, as well as any decisions or actions that may be taken following your analyses.
2. The Python source code written in order to complete the tasks set in the paper. It is recommended to submit two Python code files, say Task1 sol.py and Task2 sol.py, one for each of the two problems you have solved.
3. A signed coursework cover
1 Predicting the Winner
Consider the Premier League dataset, which records the results of the Premier League matches thus far during the current season 2017/18. The data include Full Time Result, Full Time Home/Away Team Goals, Half Time Home/Away Team Goals and other types of variables. Please have a look at the given notes along with the dataset in order to understand the abbreviations used in the dataset.
Objective: Using the given dataset, we would like to build a model that can predict the winning team of the next Premier League match between Manchester United and Manchester City by using simulation and a historical dataset.
Manchester City vs Manchester United
Analyse the given dataset in order to show the following:
1. Check for missing values in the dataset, drop columns that may be irrelevant to the problem of predicting the winning team and provide a descriptive analysis of the dataset.
2. Extract all home and away matches played by both teams as well as the number of goals scored or conceded. You may present the results in the form of data frames.
3. To get a better picture of how Manchester United and Manchester City stack up against each other, juxtapose the teams' offensive and defensive performance data. For that purpose, plot the frequency of goals for Manchester City's away offense against Manchester United's home defense, and for Manchester City's home offense against Manchester United's away defense.
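The three analysis steps above could be sketched along the following lines. This is only an illustration under stated assumptions: the column names (HomeTeam, AwayTeam, FTHG, FTAG, FTR) and the team labels are assumed and should be checked against the notes supplied with the dataset, the rows are invented toy values rather than real results, and the helper team_goals is hypothetical.

```python
import pandas as pd

# Toy rows invented for illustration only; NOT real match results.
# Column names and team labels are assumptions; check the dataset notes.
matches = pd.DataFrame({
    "HomeTeam": ["Man City", "Man United", "Arsenal", "Man City", "Chelsea", "Man United"],
    "AwayTeam": ["Arsenal", "Chelsea", "Man United", "Chelsea", "Man City", "Arsenal"],
    "FTHG": [3, 1, 1, 2, 0, 2],
    "FTAG": [1, 0, 3, 2, 1, None],
    "FTR": ["H", "H", "A", "D", "A", None],
})

# Step 1: count missing values per column, then drop rows without a full score.
missing_per_column = matches.isna().sum()
clean = matches.dropna(subset=["FTHG", "FTAG"]).copy()


def team_goals(df, team):
    """Return (home, away) data frames of goals scored and conceded by one team."""
    home = df.loc[df["HomeTeam"] == team, ["FTHG", "FTAG"]]
    home = home.rename(columns={"FTHG": "scored", "FTAG": "conceded"})
    away = df.loc[df["AwayTeam"] == team, ["FTAG", "FTHG"]]
    away = away.rename(columns={"FTAG": "scored", "FTHG": "conceded"})
    return home.reset_index(drop=True), away.reset_index(drop=True)


# Step 2: home and away records for both teams.
city_home, city_away = team_goals(clean, "Man City")
united_home, united_away = team_goals(clean, "Man United")

# Step 3: relative frequency of each goal count, ready to be shown as bar charts.
city_away_offense = city_away["scored"].value_counts(normalize=True).sort_index()
united_home_defense = united_home["conceded"].value_counts(normalize=True).sort_index()
```

Each frequency series can then be drawn with its plot(kind="bar") method (which requires matplotlib), placing offense and defense side by side for comparison.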
Simulation
To predict the winning team, we can create fantasy games with the objective of estimating the probability that one team will beat another.
1. Use empirical distributions of goals scored by the two teams to predict the winning team by simulating a large enough number of fantasy games between both teams. Handle possible draws appropriately.
2. A balanced simulation should consider both the offensive and defensive performance of each team. Perform a balanced simulation of the match between both teams in order to predict the winning team. Handle possible draws appropriately.
3. Use and justify theoretical probability distributions to simulate Manchester City vs Manchester United games as paired random drawings. Then, execute a balanced simulation of offensive and defensive performance with your probability models for goals scored. Handle possible draws appropriately.
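A minimal sketch of the three simulation styles above is given below, assuming Manchester United play at home. The goal counts are invented placeholders, not values from the dataset. Averaging an attack draw with the opposing defence draw is one possible balancing convention, and the Poisson distribution is one candidate theoretical model for goal counts that would still need to be justified against the data. Draws are handled here by reporting them as a third outcome.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Placeholder goal counts invented for illustration; NOT real data.
city_away_scored = np.array([2, 1, 3, 0, 2, 4, 1, 2])
city_away_conceded = np.array([0, 1, 1, 0, 2, 1, 0, 1])
united_home_scored = np.array([1, 2, 0, 3, 1, 1, 2, 0])
united_home_conceded = np.array([0, 0, 1, 2, 0, 1, 1, 0])


def outcome_probabilities(home_goals, away_goals):
    """Estimate home-win, draw and away-win probabilities from paired scores."""
    return {
        "home_win": float(np.mean(home_goals > away_goals)),
        "draw": float(np.mean(home_goals == away_goals)),
        "away_win": float(np.mean(home_goals < away_goals)),
    }


n_games = 100_000

# 1. Empirical simulation: resample each team's observed goals scored.
empirical = outcome_probabilities(
    rng.choice(united_home_scored, size=n_games),
    rng.choice(city_away_scored, size=n_games),
)

# 2. Balanced simulation: average a team's attack draw with the
#    opponent's defence draw (an assumed convention, not the only one).
united_bal = (rng.choice(united_home_scored, size=n_games)
              + rng.choice(city_away_conceded, size=n_games)) / 2
city_bal = (rng.choice(city_away_scored, size=n_games)
            + rng.choice(united_home_conceded, size=n_games)) / 2
balanced = outcome_probabilities(united_bal, city_bal)

# 3. Theoretical model: Poisson goals with a balanced rate per team.
united_rate = (united_home_scored.mean() + city_away_conceded.mean()) / 2
city_rate = (city_away_scored.mean() + united_home_conceded.mean()) / 2
poisson = outcome_probabilities(
    rng.poisson(united_rate, n_games),
    rng.poisson(city_rate, n_games),
)
```

An alternative way to handle draws is to discard drawn games and renormalise the two win probabilities; which choice is appropriate depends on whether a draw is a result you want to predict.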
2 Predicting a Stock price
Consider the AAPL stock dataset, which records the daily AAPL stock prices from the 1980s to date. The data include Open (the price when the market opens), High (the highest price of the day), Low (the lowest price of the day), Volume (the amount of stock traded), Close (the price when the market closes), and Adjusted Close (the stock price adjusted to account for any stock splits that could have occurred).
Objective: Using the given dataset, we would like to build a model that can predict the price of the stock for the next five days.
1. Create the time series of the given stock prices. You should consider the adjusted close prices. Comment on the graph obtained, spotting trends or possible sharp price changes.
2. Construct a predictive model of stock prices with any predictors you feel are relevant. You may introduce additional attributes into the dataset, e.g. moving averages (see https://en.wikipedia.org/wiki/Moving_average), Bollinger bands (see https://en.wikipedia.org/wiki/Bollinger_Bands), etc. Justify why your model is appropriate to use.
3. Write down the mathematical equation of your fitted model and evaluate your model. Make sure to withhold a subset of the data for testing. You should aim for a model with high accuracy.
4. Include in your report a discussion of whether you could make any money with your predictive model.
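As a rough illustration of steps 2 and 3, the sketch below derives a moving average and Bollinger bands, fits a linear regression for the price five trading days ahead, and evaluates it on a withheld test set. Everything specific here is an assumption: the prices are a synthetic random walk rather than AAPL data, the 20-day window and band width of 2 standard deviations are conventional defaults, the predictors are an arbitrary choice, and pandas' rolling standard deviation is the sample version.

```python
import numpy as np
import pandas as pd

# Synthetic random-walk prices generated for illustration; NOT real AAPL data.
rng = np.random.default_rng(seed=1)
dates = pd.bdate_range("2017-01-02", periods=300)
adj_close = pd.Series(100 + np.cumsum(rng.normal(0, 1, size=300)),
                      index=dates, name="adj_close")

# Additional attributes: simple moving average and Bollinger bands.
window = 20
features = pd.DataFrame({"adj_close": adj_close})
features["sma"] = adj_close.rolling(window).mean()
rolling_std = adj_close.rolling(window).std()
features["upper_band"] = features["sma"] + 2 * rolling_std
features["lower_band"] = features["sma"] - 2 * rolling_std

# Target: the adjusted close five trading days ahead.
horizon = 5
features["target"] = adj_close.shift(-horizon)
data = features.dropna()

# Chronological split: the most recent 20% of rows are withheld for testing.
split = int(len(data) * 0.8)
train, test = data.iloc[:split], data.iloc[split:]

# Ordinary least squares: target = b0 + b1 * adj_close + b2 * sma.
predictors = ["adj_close", "sma"]
X_train = np.column_stack([np.ones(len(train)), train[predictors].to_numpy()])
X_test = np.column_stack([np.ones(len(test)), test[predictors].to_numpy()])
coef, *_ = np.linalg.lstsq(X_train, train["target"].to_numpy(), rcond=None)

# Evaluate on the withheld data with the root mean squared error.
pred = X_test @ coef
rmse = float(np.sqrt(np.mean((pred - test["target"].to_numpy()) ** 2)))
```

The entries of coef give the intercept and slopes of the fitted equation. The split is chronological rather than random so that the test set contains only dates later than the training data, which mirrors how the model would be used for forecasting.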
Mark Scheme
The following areas are assessed:
1. Man. City vs. Man. U. model + justification + evaluation [30 marks]
2. AAPL stock price prediction model + justification + evaluation [30 marks]
3. Quality of coding [20 marks]
4. Writing a report (up to 5 pages including graphs) interpreting the results [20 marks]
Indicative weights on the assessed learning outcomes are given above. The following is a guide for the marking:
First (70 to 100 marks): A complete coverage of data science techniques exploring the dataset; both predictive models are detailed and well justified, along with the evaluation of the regression model and perhaps an attempt to evaluate how good your model is at finding the winning team; and a well-written and well-structured report on the results obtained from the datasets and any decisions that may be recommended.
Second Upper (60 to 69 marks): A good coverage of data science techniques exploring the dataset; both predictive models are justified, with an appreciable accuracy for the regression model; and a well-structured narrative on the results obtained from the datasets and any decisions that may be recommended.
Second Lower (50 to 59 marks): Some techniques used for model building and evaluation are overlooked; at least one partially justified predictive model with an appreciable accuracy is given; and a good narrative of the findings about the dataset with few deficiencies.
Third (40 to 49 marks): Essential data science techniques are covered; at least one predictive model is given with some justification; and a written report describing some of the work done.
Fail (up to 39 marks): The submission does not satisfy the pass criteria, but will still receive some marks in most cases.
Non-submission: A mark of 0 will be awarded.