[SOLVED] R html database graph statistic Overview


Overview
Coursework
Univariate Statistics and Methodology using R 2019/2020
Read this whole document before you do anything else. Deadline: 20th of January 2020
For the course assignment, you will be expected to retrieve, clean, and analyse a data set. In this document we provide the primary research questions to be answered, information on the structure and format of the final report, information on code that should be submitted, and a brief overview of the marking criteria. You can find an R script template on LEARN.
It can be tempting to over-complicate assessments like this, particularly if you have a long time to complete them. The labs have been designed to prepare you for this assignment: to explore data, to conduct appropriate analyses for given data types, and to make decisions that you can justify. Bear in mind that completing this assessment does not require any knowledge that wasn't covered in lectures, labs, and readings.
What you need to submit
For your assessment you need to submit two documents: your report and your R code. More instructions on how to submit are below. Here, we provide more detail on what to submit.
Report
You need to produce a report answering the assignment questions below. Your report should include appropriate analyses to provide answers to these questions while describing the process and utilising graphics where necessary to illustrate your points.
Your report should clearly identify the decisions you made in analysing the data, as well as summarising what can be concluded from your analysis.
Figures and tables should be numbered and captioned, and referred to in the text; important statistical outcomes should be summarised in the text.
Reporting should follow APA 6th Edition guidelines for the presentation of tables, figures, and statistical results (see final lecture for more information). Alternative style is acceptable so long as it is clear and consistent.
Your report should be a maximum of 4 sides of A4 (including tables and figures), in a standard font, size 12, with normal 1 inch margins.

Code
Your report must be accompanied by an R script (a text file with the extension .R, the default file type when saving a script from RStudio) which can be used to exactly reproduce the results set out in your submitted report. It should include all steps taken in data cleaning and all analyses. Every answer to the assignment tasks/questions given below must be obtainable from your code.
Please try to write clear and informative comments within the file. You may find it useful to have a comment heading for the code which answers each question.
Any code copied and pasted or otherwise adapted from internet examples should be cited appropriately in the comments. An appropriate citation should include the URL where the code was found, the name of the website or blog, and the original author's name. In the absence of a proper name, you can cite the contributor's nickname or alias.
Important
1. A code template is available on LEARN.
2. At the top of your script you should place any use of library() to load any packages you need, the reading in of data, any self-written functions, and details of collaboration (see below).
3. Collaboration: You can work on the R script in small groups (no more than 4 students) if preferred. Please include a comment line (line starting with #) which includes the exam numbers (not the names) of those you worked with. For example:
# Produced in collaboration with students B045329 and B018429
Within the script point out (again using comments) which blocks of code are shared. Please ensure that your acknowledgements match those of others in your group (if you say you produced the script in collaboration with B045329, we expect B045329 to acknowledge you).
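The layout described in points 2 and 3 might look like the following sketch. It is not the official LEARN template: the section headings are only a suggestion, the helper `standardise()` is hypothetical, the exam numbers are the example ones used above, and the data URL is the one given in the Assignment section of this document.

```r
# ---- Collaboration ---------------------------------------------------------
# Produced in collaboration with students B045329 and B018429

# ---- Packages --------------------------------------------------------------
# Every library() call goes here. This sketch needs only base R, so the two
# calls below just stand in for whatever packages an analysis really uses.
library(stats)
library(graphics)

# ---- Data ------------------------------------------------------------------
data_url <- "https://edin.ac/2QCZNVu"  # URL given in the Assignment section
# load(url(data_url))                  # uncomment to read the data in

# ---- Self-written functions ------------------------------------------------
# Hypothetical helper: convert a numeric vector to z-scores.
standardise <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

# ---- Question 1.1 (block shared with B045329) ------------------------------
# Analysis code for each question follows under its own comment heading.
```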
Important: While the code can be worked on in small groups, the written report must be produced entirely independently. It is not OK to include sections in the written report that are written collaboratively.
Submission and Marking
Submitting your work
All coursework must be submitted before 12:00 (noon) on 20th of January 2020 via Turnitin. You can access it by clicking on the Assessment details and submission tab of the course page on LEARN. There are two sections there, one for each of the two files you are required to submit. You will be asked to provide your name and submission title. The submission title must be your exam number (and nothing more). Your name will not appear anywhere in the documents accessed by the markers. To ensure that the marking is entirely anonymous, please do not include your name or student number anywhere in either of the submitted files.
Remember, the files you are required to submit are:
Report, as described above. The filename must be your exam number with whatever extension is provided by your chosen word processor (e.g., B045329.docx). The file you create should have your exam number on each page (e.g., in the header or footer).
R script which runs all of the data cleaning and the final analyses reported. The filename must be your exam number with the .R extension (e.g., B045329.R).
Please ensure that you name your documents exactly as above. File names such as R Script for B04329.R or B044329 Report final.docx slow down document matching and marking and will result in loss of marks.
Please check LEARN for detailed instructions on the submission process prior to submitting.
Marking Criteria
The code is worth 20% of the coursework marks, and the report is worth 80% of the coursework marks.
Work will automatically fail (max mark of 30%) unless both components are submitted. You will be assessed on the following:
1. Appropriate cleaning of the data set and key variables of interest, making appropriate and justified decisions on the steps you take.
2. Selection of appropriate statistical tests and variables to answer the primary research question and the justifications provided for your selections.
3. Interpretation of the results of the selected analyses.
4. R code that runs without errors all the way through, and is clear and appropriately commented. For handy tips on writing good code, see http://adv-r.had.co.nz/Style.html (no need to stick religiously to the guidelines but following them does make code nice and tidy).
5. Last but not least: Clarity of writing and formatting. The report should conform to the APA 6th Edition style guidelines for formatting text, tables, and figures, reporting results of statistical analyses, writing style, etc. However, alternative style is acceptable provided it is comparably clear and consistent. For a useful resource, see https://owl.purdue.edu/owl/research_and_citation/apa_style/apa_style_introduction.html.
Pre-submission Checklist
1. Do you have two separate files to submit, and are the filenames correct?
2. Is your exam number present?
3. Have you removed any mention of your name?
4. Is your code reproducible? If you clear your environment (use the little broomstick icon in the RStudio environment pane), and run your script from start to finish, does it successfully reproduce your analyses?
5. Does your code include the exam numbers of all of your collaborators?
6. Is your report within the page limit?
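The reproducibility check in item 4 can also be done from the console by running the script in a brand-new R session, where nothing from the current workspace can fill in for a missing step. The sketch below is one way to do that, not a required procedure: `runs_from_scratch()` is a hypothetical helper, and it assumes the `Rscript` executable that ships with R is in its usual place under `R.home("bin")`.

```r
# Run a script in a fresh R session and report whether it finished cleanly.
runs_from_scratch <- function(script_path) {
  stopifnot(file.exists(script_path))
  rscript <- file.path(R.home("bin"), "Rscript")
  # system2() returns the exit status of the child process; 0 means no error.
  status <- system2(rscript, shQuote(script_path))
  status == 0
}

# Demonstration with a throwaway script written to a temporary file.
demo_script <- tempfile(fileext = ".R")
writeLines(c("x <- c(1, 2, 3)", "print(mean(x))"), demo_script)
runs_from_scratch(demo_script)
```

To check a real submission, the temporary file would be swapped for the script itself, for example `runs_from_scratch("B045329.R")` with the working directory set to the folder that holds it.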

Assignment
Motoring offences dataset
Two datasets can be loaded from the following url:
load(url("https://edin.ac/2QCZNVu"))
The data provided contain information about the nature and circumstances of motorists stopped and breathalysed by the Police.
Data is collected every time a driver is stopped by the Police and breathalysed. Records indicate the speed at which the driver is travelling when they are stopped, and the blood alcohol content of the driver when measured via breathalyser. Information is also captured on the age and prior motoring offences of the driver, and whether the incident occurred at day or night. Police officers may have had reasons for stopping drivers other than presuming them to be intoxicated (for instance, someone who is stopped for speeding may subsequently be breathalysed if they are deemed to be acting unusually).
Each time a police officer stops a motorist, an incident ID is created. A separate database used primarily for administrative purposes includes records of which officer (recorded as initials) attends which incidents.
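Because officer details are held in a separate table keyed by incident ID, the two tables have to be joined before any officer-level question can be asked. The sketch below shows the join on two small made-up data frames. The object names `stops` and `attendance` and every value in them are hypothetical; the objects loaded from the URL above may be named and structured differently.

```r
# Toy stand-ins for the two datasets (all values invented for illustration).
stops <- data.frame(
  incident_id = c(101, 102, 103, 104),
  age         = c(23, 41, 35, 58),
  outcome     = c("fine", "warning", "fine", "warning")
)
attendance <- data.frame(
  incident_id = c(101, 102, 103, 105),
  officer     = c("AB", "CD", "AB", "EF")
)

# Duplicate IDs in either table would multiply rows in the join, so check first
# (anyDuplicated() returns 0 when there are none).
anyDuplicated(stops$incident_id)
anyDuplicated(attendance$incident_id)

# Left join: keep every stop and add the officer where a match exists.
stops_officers <- merge(stops, attendance, by = "incident_id", all.x = TRUE)
print(stops_officers)

# Stops with no matching officer record appear as NA and are worth a look.
sum(is.na(stops_officers$officer))
```

Using `all.x = TRUE` keeps unmatched stops visible rather than silently dropping them, which makes any mismatch between the two tables a decision to justify rather than a side effect.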
Data dictionary
Variable        Description
age             Age of driver (in years)
nighttime       Whether or not the incident occurred at night
prior_offence   Offence code for any prior motoring offences
speed           Speed when stopped by police (mph)
bac             Blood Alcohol Content (%) as measured by breathalyser
outcome         Outcome of stop (fine, warning)
incident_id     ID of incident
officer         Officer attending (initials)
Offence Codes
Offence code    Description
N               No prior offence
DR50            In charge of veh. while unfit through drink
DR80            In charge of veh. while unfit through drugs
CD..            Careless Driving offences
PL..            Driving without L Plates
SP..            Speeding offences
TS..            Traffic direction and signs offences
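The dots in codes such as CD.. appear to stand for further characters, so one way to work with the codes is to group them by their leading letters. A minimal sketch on a made-up vector of codes follows; specific values such as "CD10" or "SP30" are illustrative only and are not taken from the dataset, and the family labels are a paraphrase of the table above.

```r
# Invented example codes, one per family listed in the table.
prior_offence <- c("N", "DR50", "SP30", "CD10", "DR80", "TS10", "PL10")

# Keep only the leading capital letters of each code.
prefix <- sub("^([A-Z]+).*$", "\\1", prior_offence)

# Lookup from prefix to a readable family label.
family_labels <- c(
  N  = "No prior offence",
  DR = "Drink or drug related",
  CD = "Careless driving",
  PL = "Driving without L plates",
  SP = "Speeding",
  TS = "Traffic direction and signs"
)
offence_family <- unname(family_labels[prefix])

data.frame(prior_offence, prefix, offence_family)

# Any code whose prefix is not in the lookup comes back as NA, which flags
# unexpected or mistyped codes during cleaning.
sum(is.na(offence_family))
```

Note that DR50 and DR80 share a prefix but differ in meaning (drink versus drugs), so a question about drink-related offences specifically would need the full code rather than the prefix.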

1 The relationship between blood alcohol content and age
Once you are content that the data are appropriately cleaned, run the following model:
m1 <- lm(bac ~ age, data = drinkdriving)
1.1 Check model diagnostics, and when you are happy, concisely report and interpret the results of your model. [8 marks]
1.2 What is the predicted blood alcohol content for a 50 year old driver who gets stopped by the Police? [4 marks]
1.3 Produce and interpret a diagnostic plot of the model that shows whether or not the model residuals are normally distributed. [4 marks]
2 Driving speeds, night vs. day
2.1 Does either time of day or speed of driving predict the blood alcohol content over and above their age? Fit appropriate model(s) to test this question. [8 marks]
2.2 Run model diagnostics and, if needed, re-fit the model(s). How much variation in blood alcohol content is accounted for by drivers' ages and speeds, and time of day of incidents? [4 marks]
2.3 Is there evidence to suggest that people drive faster at night than during the day? [4 marks]
3 Fines vs. Warnings
3.1 Construct a model to investigate what contributes to the likelihood of a driver receiving a fine as opposed to a warning. Report the results of your model. What has the biggest effect on the likelihood of receiving a fine, a 1 SD increase in driving speed or a 1 SD increase in blood alcohol content? [12 marks]
3.2 Are people with prior convictions for drink driving offences more likely to get a penalty fine (as opposed to a warning) than those who have non-drink-related offences? [4 marks]
3.3 Does whether or not a driver has a prior motoring offence (of any kind) influence the likelihood of receiving a fine? [4 marks]
4 Plotting predicted probabilities
Create a visualisation of the predicted probabilities of receiving a fine for different ages of drivers who have no prior offences and are stopped during the daytime driving 30mph with 0 blood alcohol content. To do this, you will need to make predictions from your model for a set of different values for age, while holding the other variables constant at the values above (see below). You needn't worry about confidence intervals. Then you will need to plot the predictions for each age value. [12 marks]
age   nighttime   speed   bac   prior_offence   model_prediction
17    day         30      0     None            ?
18    day         30      0     None            ?
19    day         30      0     None            ?
20    day         30      0     None            ?
21    day         30      0     None            ?
22    day         30      0     None            ?
…     …           …       …     …               …
5 Corrupt cops?
Investigate the hypothesis that one of the police officers is biased in how they give out fines and warnings with respect to a driver's age. Does the data suggest that this might be the case? If so, which police officer(s) is biased, and how? [16 marks]
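The prediction step described for question 4 follows a general pattern: build a data frame with one row per age and a single fixed value for every other predictor, then pass it to `predict()`. The sketch below shows only those mechanics, on simulated data with arbitrary coefficients. The object names (`sim`, `m_sim`, `grid`), the age range 17 to 80, and the choice of predictors are all assumptions made for illustration; the model is not a suggested answer, and a prior-offence predictor is left out to keep the example short.

```r
set.seed(1)
n <- 500

# Simulated stand-in data (not the coursework dataset).
sim <- data.frame(
  age       = sample(17:80, n, replace = TRUE),
  nighttime = factor(sample(c("day", "night"), n, replace = TRUE)),
  speed     = round(rnorm(n, mean = 40, sd = 10)),
  bac       = pmax(0, rnorm(n, mean = 0.04, sd = 0.03))
)
# Arbitrary coefficients, chosen only so that the simulated outcome varies.
eta <- -3 + 0.05 * sim$speed + 20 * sim$bac - 0.01 * sim$age
sim$fine <- rbinom(n, size = 1, prob = plogis(eta))

# Logistic regression for a binary outcome (1 = fine, 0 = warning).
m_sim <- glm(fine ~ age + nighttime + speed + bac,
             data = sim, family = binomial)

# One row per age; every other predictor is held at a single value.
grid <- data.frame(
  age       = 17:80,
  nighttime = factor("day", levels = levels(sim$nighttime)),
  speed     = 30,
  bac       = 0
)

# type = "response" returns probabilities rather than log-odds.
grid$model_prediction <- predict(m_sim, newdata = grid, type = "response")

plot(grid$age, grid$model_prediction, type = "l", ylim = c(0, 1),
     xlab = "Age (years)", ylab = "Predicted probability of a fine")
```

The column names in `grid` must match the predictor names used when fitting the model, and factor columns need the same levels as the original data, which is why `nighttime` is built from `levels(sim$nighttime)`.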
