[SOLVED] R html SQL graph statistic Branch: master Find file Copy path course_materials / Exercises / final_project / final_project.md

tbrambor Add final project instructions and sample proposals 2bb7ccd on 21 Oct
title: QMSS G5072 Final Project
author: Thomas Brambor
date: 2019-10-06
always_allow_html: true
output: html_document (keep_md: true)
Final Project
Goals of the Project
The goal of the final project for the course is to make use of some of the technical abilities you acquired throughout the course. These may include the following:
Data Acquisition

Import data from different file formats.
Use APIs to obtain data.
Write an API client for an API and/or functions associated with API interaction.
Handle, parse, and transform JSON and XML.
Web scrape data from public websites.
Use SQL queries to obtain data.
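As a rough illustration of the API-related items above, the following base R sketch builds a request URL from an endpoint and a set of query parameters. The base URL, the endpoint name, the parameters, and the helper name build_request_url() are all hypothetical placeholders, not part of any real API or of the course materials.

```r
# Hypothetical base URL; replace it with the API you are wrapping.
base_url <- "https://api.example.com/v1"

# Build a request URL from an endpoint and a named list of query parameters.
# (build_request_url is an illustrative helper, not an existing function.)
build_request_url <- function(endpoint, query = list()) {
  url <- paste0(base_url, "/", endpoint)
  if (length(query) == 0) {
    return(url)
  }
  # Percent-encode each value so spaces and symbols are safe inside a URL.
  encoded <- vapply(
    query,
    function(value) utils::URLencode(as.character(value), reserved = TRUE),
    character(1)
  )
  paste0(url, "?", paste0(names(query), "=", encoded, collapse = "&"))
}

request_url <- build_request_url("stations", list(city = "New York", limit = 10))
print(request_url)
```

In a real client, a URL like this would typically be handed to a function such as httr::GET() or jsonlite::fromJSON(), with the response then parsed into a data frame. Keeping URL construction in its own function makes it easy to test without touching the network.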
Data Cleaning, Transformation, and Organization
Use data wrangling (including the tidyverse) to transform your data into a dataset or R object ready for analysis.
Use loops and other iterative processes.
Use functions and functional programming to automate repetitive or difficult tasks.
Handle and process strings and use regular expressions.
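The cleaning skills listed above can be sketched in a few lines of base R. This is only one possible approach: the two example posts are invented, and clean_text() and extract_hashtags() are hypothetical helper names chosen for illustration.

```r
# Invented example posts standing in for messy text data.
posts <- c(
  "  Loving #rstats!! http://example.com/a ",
  "SQL > spreadsheets... #Data #rstats"
)

# Normalise strings: drop URLs, lower-case, and collapse repeated whitespace.
clean_text <- function(x) {
  x <- gsub("https?://[^[:space:]]+", "", x)
  x <- tolower(x)
  x <- gsub("[[:space:]]+", " ", x)
  trimws(x)
}

# Extract every hashtag from a single string using a regular expression.
extract_hashtags <- function(x) {
  regmatches(x, gregexpr("#[[:alnum:]_]+", x))[[1]]
}

cleaned <- clean_text(posts)

# Apply the extraction function to each post instead of writing a loop.
hashtags <- lapply(cleaned, extract_hashtags)
print(cleaned)
print(hashtags)
```

Wrapping each step in a small function keeps the pipeline readable, and lapply() applies the same function to every element, which is the functional counterpart of a for loop.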
Documentation and Presentation
Provide and document useful functions and/or data as part of an R package.
Make your package and associated material available on GitHub.
Use R Markdown to generate documentation, vignettes, and write-ups.
Of course, you are welcome to go beyond the tools offered in the course as well. However, the focus of the project (and grading) should be on the tools conveyed in this course.
Options

There are two options to choose from for the final project. Please choose one option and make sure to indicate your choice on the proposal and final project submission.
Option A: Data Project
The focus is on the acquisition, cleaning, transformation, organization, and presentation of data.
A1. Substantial Data Collection (General)
The first type of data project would be the collection of a substantial amount of data from at least two sources and combining them into an overall data set. Here the focus would be on the technical aspects of the collection (API, web scraping, SQL), data wrangling, cleaning, organization, presentation etc. of data.
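Combining two sources usually comes down to a join on a shared key. The sketch below uses base R's merge() on two small, invented data frames; the column names and values are placeholders and do not come from any real source.

```r
# Invented data standing in for two sources (e.g. an API and a scraped table).
population <- data.frame(
  state = c("NY", "CA", "TX"),
  population_millions = c(19.5, 39.5, 29.0),
  stringsAsFactors = FALSE
)
stations <- data.frame(
  state = c("CA", "NY", "WA"),
  n_stations = c(120L, 85L, 40L),
  stringsAsFactors = FALSE
)

# Full outer join: keep every state from either source; gaps become NA.
combined <- merge(population, stations, by = "state", all = TRUE)

# Flag rows that have values from both sources so gaps are easy to inspect.
combined$complete <- complete.cases(combined)
print(combined)
```

Using all = TRUE keeps unmatched rows visible, which helps when documenting how much of each source could actually be linked. The tidyverse equivalent would be dplyr::full_join().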
A2. Substantial Data Collection (Text)
For students dealing with potentially messy data (e.g. social media text data), I am prepared to allow the use of a single data source. The additional effort of cleaning and transforming the data then replaces the effort of obtaining and merging multiple data sources.
Presentation
For either data project, the evaluation will not be mainly based on the amount of data collected but rather on the technical difficulties involved and the breadth of skills displayed.
Some amount of summary statistics to provide an overview of the data should be included. Graphical visualizations are welcome but since the course does not teach visualization techniques, nothing beyond simple graphics is required.
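For a sense of scale, an overview of this kind can be quite short. The sketch below uses the built-in mtcars dataset purely as a stand-in for a collected dataset.

```r
# mtcars ships with R and is used here only as a stand-in dataset.
overall <- summary(mtcars$mpg)

# Mean miles per gallon for each number of cylinders.
by_cyl <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)

print(overall)
print(by_cyl)
```

A simple base graphic such as hist(mtcars$mpg) would be enough to complement these tables.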

Output
The output includes the data and the code used to obtain the data. Both should be wrapped in an R package (see below). To reiterate, the package needs to include the code with which you collected the data rather than just the final dataset and analysis code. Make the code documentation clear enough that someone else could replicate your effort and re-collect the data. Consider including a brief write-up on the collection as well to ease possible replication efforts.
Option B: Functional / API Project
The second type of project will not focus on acquiring the data per se, but rather the methods to acquire data and the functions associated with it.
This project will entail writing multiple functions for an API that currently has no R package associated with it and packaging it into an API client R package (with the potential for public release).
Presentation
For the API project, the evaluation will be based on technical difficulties involved, the breadth of skills displayed, and the variety of functions and options included for a user of the API.
A significant part of the package, beyond the documented code and functions themselves, is the use of vignettes. Using one or more vignettes, you should aim to show all functionalities of the package to a user of your package.
Output

The output for evaluation will be an API client wrapped in an R package. The focus here lies on the quality, usability, and documentation of the functions provided in the package.
The R Package
For both options, the documentation of the work will be in the form of an R package. There are lots of good examples for guidance on CRAN, e.g.
Data R Packages
HistData, USAboundaries, nasaweather, acs
API Client R Packages
rnoaa, WikipediR, ZillowR, rtweet
I highly recommend following the advice and guidelines presented in Hadley Wickham's book R Packages.
The following parts should be included:
All functions, data, and the package itself need to be documented and exported.
A license is specified.
Your package should pass check() without errors (warnings and notes are OK, though it would be great if there were none; try to address the issues pointed out by check()).
A readme file.
The data (if data package).
One or more vignettes to describe and discuss the data and/or functions.
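To illustrate what "documented and exported" can look like in practice, here is a minimal sketch of a function with roxygen2-style comments. The function fahrenheit_to_celsius() is a made-up example and is not part of the assignment.

```r
#' Convert temperatures from Fahrenheit to Celsius
#'
#' @param fahrenheit Numeric vector of temperatures in degrees Fahrenheit.
#' @return Numeric vector of temperatures in degrees Celsius.
#' @examples
#' fahrenheit_to_celsius(c(32, 212))
#' @export
fahrenheit_to_celsius <- function(fahrenheit) {
  # Fail early with a clear error if the input is not numeric.
  stopifnot(is.numeric(fahrenheit))
  (fahrenheit - 32) * 5 / 9
}

celsius <- fahrenheit_to_celsius(c(32, 212))
print(celsius)
```

Inside a package, running devtools::document() turns comments like these into help files and adds the export to NAMESPACE, and devtools::check() then runs the package checks mentioned in the list above.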
Choice of Topic and Proposal
You are free to choose the type of project (see above) and which kind of data or API to use. To make sure you are on the right track, on November 29 (at the latest; feel free to submit earlier and receive feedback) we ask you to submit a proposal.
The proposal should include the following information:
Name of project
Type of project: Data (A1/A2) or API Client (B)
Brief description of the purpose
Links to data sources / API etc.
Outline the technical steps / challenges you plan to address and include in your submission. Are there any significant hurdles that you have doubts about? Would not solving them render the project incomplete?
Bonus
With your own future in mind, it may be a good idea to show off your skills. For a bonus part, consider publishing a website for your completed package. See Hadley Wickham's pkgdown for instructions on how to do that.
http://hadley.github.io/pkgdown/
Submission

Please follow the instructions to submit your project. For the final project, please use your submission issue to tag the TAs and the instructor (using @tbrambor). On GitHub, please use a folder entitled Final_Project to submit all materials. In addition, you are welcome to use a personal folder on your own GitHub account to publish your package (and allow installation directly via the devtools::install_github() command).
The proposal is due on Friday, November 29 at 5pm. The final project is due on December 13 at 5pm.
