COMP529/336: COURSEWORK ASSIGNMENT #2 (STREAM ANALYTICS)
Dr. Bakhtiar Amen
Coursework Date: 25/11/2019
Due Date: 06/01/2020
Introduction
This assessed coursework assignment is worth 20% of your overall grade for the COMP529/COMP336 module. Failure in this assignment can be compensated through higher marks in other assessments on the module. The assignment aims to test your understanding of streaming analytics, with a focus on your ability to use Storm to solve Big Data analytics problems. More specifically, it aims to partially assess the following learning outcome for COMP529/336: understanding of the middleware that can be used to enable algorithms to scale up to analysis of large data streams in real time.
ASSESSMENT
The report will be assessed according to the following criteria:
Criterion | Percentage
Clarity of presentation (including succinctness) of main report | 20%
Quality of Java code (including assessment of how easy it is to understand) | 40%
Quality of analysis performed | 40%
SUBMISSION
Please submit your coursework online using the COMP529/COMP336 page on VITAL by 3pm on Monday 6th January. Standard lateness penalties will apply to any work handed in after this time. The report and Java program must be written by yourself, using your own words (see the University guidance on academic integrity for additional information).
TASK
The UK general election is planned to be held on 12th December 2019. For this election, five parties are running their own campaigns to win most of the parliamentary seats. For every parliamentary constituency of the United Kingdom, one Member of Parliament (MP) will be elected to join the House of Commons from one of: Labour, Conservative, Liberal Democrats, Green or Brexit Party.
To predict the outcome of the UK's 2019 general election, you have been asked to monitor Twitter through one of the hashtags (e.g., #generalelection or #GE2019). This will allow you to identify the key supporters and to predict which party is most likely to win a majority in the election on 12th December. The code for a spout that extracts a streaming feed from Twitter is here: https://github.com/davidkiss/storm-twitter-word-count
Your task is therefore as follows:
Set up a Storm cluster;
Write a Java program for a Storm topology job that includes a:
Spout that produces a stream of tweets;
Bolt that identifies tweets that contain keywords related to each party (e.g., green party, conservative, labour, brexit);
Bolt that collects information about the likely outcome of the general election. Use the Storm topology to predict who will win the UK general election on 12th December.
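As an illustration only, the keyword-matching and counting steps described above might be sketched in plain Java as follows. This is not a required or model solution. The class name `PartyMentionSketch`, its method names and the keyword lists are all hypothetical, and the Storm wiring (spout, bolts, groupings) is deliberately left out so that the logic can be run and tested without a cluster. In a real topology, one bolt could call something like `partiesMentioned` for each incoming tweet and a second bolt could keep the running tally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper holding the logic that the two bolts could delegate to.
final class PartyMentionSketch {

    // Assumed keyword lists (not part of the brief); a real analysis would tune these.
    static final Map<String, List<String>> KEYWORDS = new LinkedHashMap<>();
    static {
        KEYWORDS.put("Labour", Arrays.asList("labour"));
        KEYWORDS.put("Conservative", Arrays.asList("conservative"));
        KEYWORDS.put("Liberal Democrats", Arrays.asList("liberal democrat", "lib dem"));
        KEYWORDS.put("Green", Arrays.asList("green party"));
        KEYWORDS.put("Brexit", Arrays.asList("brexit"));
    }

    // Running tally of tweets mentioning each party.
    private final Map<String, Long> counts = new LinkedHashMap<>();

    // Keyword step: every party with at least one keyword in the tweet text.
    // Matching is a naive case-insensitive substring test.
    static List<String> partiesMentioned(String tweetText) {
        List<String> matched = new ArrayList<>();
        if (tweetText == null) {
            return matched;
        }
        String lower = tweetText.toLowerCase(Locale.ROOT);
        for (Map.Entry<String, List<String>> entry : KEYWORDS.entrySet()) {
            for (String keyword : entry.getValue()) {
                if (lower.contains(keyword)) {
                    matched.add(entry.getKey());
                    break; // count each party at most once per tweet
                }
            }
        }
        return matched;
    }

    // Counting step: add one mention for each party found in the tweet.
    void record(String tweetText) {
        for (String party : partiesMentioned(tweetText)) {
            counts.merge(party, 1L, Long::sum);
        }
    }

    Map<String, Long> counts() {
        return counts;
    }

    // Party with the most mentions so far, or null if nothing has matched.
    // On a tie, the party that was recorded first is returned.
    String leader() {
        String best = null;
        long bestCount = 0;
        for (Map.Entry<String, Long> entry : counts.entrySet()) {
            if (entry.getValue() > bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        PartyMentionSketch tally = new PartyMentionSketch();
        tally.record("Voting Labour on Thursday #GE2019");
        tally.record("Conservative and Labour leaders debate tonight");
        tally.record("Great weather today");
        System.out.println(tally.counts() + " leader=" + tally.leader());
    }
}
```

Note that a raw mention count is only a proxy for support: a tweet criticising a party still counts as a mention here, and substring matching will also pick up unrelated words that happen to contain a keyword.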
Your output report
The output from this coursework is a brief report. It is suggested that the report has sections describing:
Middleware configuration: How you configured the Storm middleware (including a description of your Storm cluster and your rationale for this choice).
Data Analytic Design: How you designed the Storm topology (including your rationale for your design).
Results: The results obtained (excluding any discussion).
Discussion of Results.
Conclusions and Recommendations (including discussion of how you would perform the task if it were to be undertaken at much larger scale).
Format of your report
The report must be no longer than two A4 pages, excluding any appendices. Use 12-point, justified text, and submit in PDF or DOCX format only.
Make sure to save your file under your surname + module code (e.g., Abcd_COMP336).
You should include a listing of the Java program for your Storm topology in an appendix (no longer than 2 pages).