

Syllabus for EM 623 Data Science and Knowledge Discovery in Engineering Management Fall 2018
INSTRUCTOR:
Carlo Lipizzi
School of Systems & Enterprises
Office: Babbio #504
Email: [email protected]
Office hours: Mondays 2pm to 6pm, and by appointment (arranged by email)
PURPOSE:
This syllabus provides the student with information about the details and guidance necessary to complete EM 623.
TEXT:
1. Lecture Notes and Handouts
2. KNIME Essentials, Gabor Bakos, October 16, 2013 [Available on Canvas as pdf]
Additional and recommended texts:
1. Focus on Data Mining theory and algorithms. Discovering Knowledge in Data: An introduction to Data Mining, Daniel T. Larose, John Wiley, 2004
2. Focus on Applications. Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die, Eric Siegel, Wiley, February 2013
3. Focus on Rattle. Data Mining with Rattle and R: The Art of Excavating Data for Knowledge Discovery, Graham Williams, Springer, 2011
4. Focus on Text Mining using Python. Natural Language Processing with Python, Steven Bird, O'Reilly Media, 2009
5. Focus on Network Analysis. Social Network Analysis for Startups, M. Tsvetovat & A. Kouznetsov, O'Reilly Media, 2011
COURSE DESCRIPTION:
The digital tools we use every day create data from everything we do at an unprecedented rate: every day, 2.5 quintillion (2.5 × 10^18) bytes of data are created, and 90% of the data in the world today was created within the past two years.
Data can be structured (generated by business applications) or unstructured (generated by the web, often as text). Data piles up quickly, and compound annual data growth both threatens to bury today's application infrastructure and provides a great opportunity to gain insights into customers, processes and markets.
Getting usable information from such a vast amount of data may require more than intuition. The intuition we use to make judgments is an excellent guide some of the time, but gives a distorted view at other times. Creating views, extracting trends, defining patterns and identifying clusters are all things we need to do to actually manage large data.
This mining process requires a combination of tools, the ability to represent knowledge, and domain-specific expertise. A number of successful applications have been reported in areas such as credit rating, fraud detection, database marketing, customer relationship management, stock market investments and security. The field of data mining has evolved from the disciplines of statistics and artificial intelligence.

This course will examine methods that have emerged from both fields and have proven to be of value in recognizing patterns and making predictions from an applications perspective. We will survey applications and tools, and also provide opportunities for hands-on experimentation with data mining algorithms using software tools and cases. The final goal of the course is to provide students with a data toolbox they can use in their activities. This toolbox contains methods and tools that students will use themselves during the course on real-world applications.
COURSE OBJECTIVES:
The course aims to:
Provide the student with a way to understand the potential value of the data for application purposes and how to manage and prepare data to extract context-specific value
Help the student understand how analytical techniques, text mining and network analysis can enhance decision-making by converting data into information and insights
Provide insight into how to choose and use the most effective data mining techniques and tools based on the problem at hand
Provide the student with a software toolkit to apply models and techniques to real decision problems
Develop and modify data mining prototypes using R for data mining, Gephi for network analysis and Python for text mining
Have the students work on engineering management applications, such as Marketing/Entrepreneurship, Product indirect testing and Organizations analysis
Overall, the course will provide the student with an application-oriented data toolbox of methods, techniques and tools to be applied to the data-intensive applications students may encounter in their future professional activities.
COURSE OUTCOMES:
In addition to what is detailed in the course objectives, through this course students will develop:
Knowledge
o Ability to understand, analyze, plan and select the proper implementation strategy to extract actionable information from data
Attitude
o Ability to face structured and unstructured, potentially vast amounts of data with a pragmatic and solution-oriented attitude
Skills
o Basic ability to use some of the most popular tools and techniques for data/text mining and network analysis
GRADING:
Homework 35%
Midterm Exam 25%
Final Exam 40%
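As a rough illustration of how the weights above combine, the following minimal Python sketch computes a weighted course grade. It assumes each component is scored on a 0-100 scale, which the syllabus does not state; the function name `course_grade` and the example scores are hypothetical.

```python
# Weights taken from the grading breakdown above, as fractions of the course grade.
WEIGHTS = {"homework": 0.35, "midterm": 0.25, "final": 0.40}


def course_grade(scores):
    """Return the weighted grade from component scores (assumed 0-100 scale)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {"homework": 90.0, "midterm": 80.0, "final": 85.0}
print(round(course_grade(example), 2))  # 85.5
```

Because the final exam carries the largest weight, a one-point change in the final exam score moves the course grade by 0.4 points, against 0.35 for homework and 0.25 for the midterm.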

The Midterm Exam will be held in class and will last 3 hours. Students will work on a data mining case using one or more of the software tools presented in class. Students can use notes and books.
The Final Exam will be a project that students prepare individually at home and submit on Canvas. A selection of projects will be presented in class.
Homework will be done using one or more of the software tools used in class.
If part of an assignment is neither original nor attributed to a cited source, the case will be considered cheating/plagiarism. Cheating of any kind will result in a zero grade for the assignment.
MODULES:
Week  Topics Covered
1     Machine Learning and Data Mining: Introduction, life cycle and case studies
2     Assessing the value of data: understanding, cleaning and transforming
3     Data management: generalized tools and techniques (Excel and DBMS)
4     Data mining specific tools: introduction to R with Rattle GUI and to Knime
5     Supervised and unsupervised learning: theory and examples
6     Clustering and association analysis using kMeans and basket analysis (R/Rattle and Knime applications)
7     Decision Trees: definitions, algorithms, applications, optimizations and implementation using R/Rattle and Knime
8     Midterm exam discussion
9     Neural Networks for classification and prediction: applications and examples using R/Rattle and Knime
10    Network Analysis to mine complex data: introduction, applications and examples using Gephi
11    Network analysis implementations using Gephi
12    Text mining: introduction, applications and techniques
13    More on Text Mining & Course recap
14    Final exam discussion
SOFTWARE:
We will use:
Knime and R for most examples. R will be used with an IDE (RStudio) and a GUI for data mining (Rattle)
Knime and Wordij for text mining
Gephi for Network Analysis
All the software is Open Source and runs on all the major platforms (Mac, Windows and Linux). The latest versions are recommended.

A virtual machine with all the tools will also be provided.
Students are required to install all the software on their computers before the start of week 3. Contact the instructor for help.