In this project, you will apply what you learn in class to a real-world data mining problem. You may choose any problem that interests you, as long as it can be formulated as a data mining task. This is a team project; each team may have at most two members.
Complete the following tasks:
- Pick a real-world application that data mining may help.
- Formulate it as a data mining problem (clustering, classification, pattern mining, anomaly detection, recommendation, or a combination of these tasks).
- Collect relevant datasets. Some possible sources:
- If necessary, preprocess the datasets into a format that data mining algorithms can use.
- Apply your own implementation or any existing package to solve the proposed problem.
- Evaluate and discuss the data mining results you obtain.
- Prepare a short report covering the key points of your project. Name it project.pdf, project.doc, or project.docx.
- Log in to any CSE department server and submit your report as follows: submit_cse469 pdf
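As an illustration of the preprocessing, algorithm, and evaluation tasks above, here is a minimal Python sketch that uses only the standard library. The toy dataset, its feature meanings, and the helper names (`fit_min_max`, `apply_min_max`, `knn_predict`) are hypothetical. K-nearest neighbors is just one possible classifier, not a required choice.

```python
import random
from collections import Counter


def fit_min_max(rows):
    """Learn per-feature minimum and maximum from the training rows."""
    cols = list(zip(*rows))
    return [min(c) for c in cols], [max(c) for c in cols]


def apply_min_max(rows, lows, highs):
    """Rescale features with the learned ranges (constant columns map to 0)."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to the query."""
    by_distance = sorted(
        ((sum((a - b) ** 2 for a, b in zip(x, query)), y)
         for x, y in zip(train_x, train_y)),
        key=lambda pair: pair[0],
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two numeric features and a class label.
data = [
    ([150.0, 50.0], "small"), ([155.0, 53.0], "small"),
    ([152.0, 48.0], "small"), ([158.0, 55.0], "small"),
    ([149.0, 47.0], "small"),
    ([180.0, 85.0], "large"), ([185.0, 90.0], "large"),
    ([178.0, 82.0], "large"), ([190.0, 95.0], "large"),
    ([183.0, 88.0], "large"),
]

# Hold out 30% of the records for evaluation.
rng = random.Random(0)
rng.shuffle(data)
split = len(data) * 7 // 10
train, test = data[:split], data[split:]

# Preprocess: fit the scaling on the training split only, then apply to both.
lows, highs = fit_min_max([x for x, _ in train])
train_x = apply_min_max([x for x, _ in train], lows, highs)
test_x = apply_min_max([x for x, _ in test], lows, highs)
train_y = [y for _, y in train]
test_y = [y for _, y in test]

# Apply the algorithm and evaluate with accuracy on the held-out records.
predictions = [knn_predict(train_x, train_y, q) for q in test_x]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(f"held-out accuracy: {accuracy:.2f}")
```

Fitting the scaling ranges on the training split alone keeps information about the held-out records from leaking into preprocessing. A real project would use a much larger dataset and a metric suited to its task.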
Your report should include the following components.
- Introduction: What data mining problem are you trying to solve? What impact will solving it have?
- Formulation: Which data mining task can it be formulated as? What are the input and the expected output?
- Datasets: Where do you get the datasets? Give some statistics about the data. How do you preprocess the data?
- Algorithm: Which data mining algorithm do you apply?
- Experiments: Evaluate the output using an appropriate evaluation metric. Show the results you get and discuss whether they are meaningful.
- (Optional) Challenges: What challenges do you find in the data? How do you tackle these challenges?
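For the Datasets and Experiments components, the sketch below shows one way to report a basic dataset statistic (class distribution) and to compute precision, recall, and F1 for a binary classification task. The labels and the function names are hypothetical, and other tasks such as clustering or recommendation would need different metrics.

```python
from collections import Counter


def class_distribution(labels):
    """Return {label: fraction of records}, a basic dataset statistic."""
    counts = Counter(labels)
    total = len(labels)
    return {label: n / total for label, n in counts.items()}


def precision_recall_f1(y_true, y_pred, positive):
    """Compute precision, recall, and F1 for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical ground-truth labels and model predictions.
y_true = ["spam", "ham", "spam", "ham", "ham", "spam", "ham", "ham"]
y_pred = ["spam", "ham", "ham", "ham", "spam", "spam", "ham", "ham"]

distribution = class_distribution(y_true)
precision, recall, f1 = precision_recall_f1(y_true, y_pred, positive="spam")

print("class distribution:", distribution)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Reporting the class distribution alongside the metric helps readers judge whether a score is meaningful; on a heavily imbalanced dataset, plain accuracy can look high even for a trivial model.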