This is a group assessment worth 40% of the total mark for FIT5196. It consists of three tasks related to data analysis and manipulation.
- Input files: Group_dirty_data.csv, Group_outlier_data.csv, Group_missing_data.csv, warehouse.csv
- Output files: Group_dirty_data_solution.csv, Group_outlier_data_solution.csv, Group_missing_data_solution.csv, Group_ass2_task1.ipynb, Group_ass2_task1.py
The dataset contains transactional retail data from an online electronics store (DigiCO) in Melbourne, Australia. Each row represents a single order, with columns such as order_id, customer_id, date, etc.
- Detect and fix errors in Group_dirty_data.csv
- Impute the missing values in Group_missing_data.csv
- Detect and remove outlier rows in Group_outlier_data.csv (w.r.t. the delivery_charges attribute only)
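The outlier step can be illustrated with a short sketch. The brief does not say which detection rule to use, so the 1.5 x IQR rule below is an assumption chosen only for illustration, and the helper name remove_iqr_outliers and the sample values are hypothetical. Only the delivery_charges column is inspected, as the task requires.

```python
import statistics


def remove_iqr_outliers(rows, column="delivery_charges", k=1.5):
    """Drop rows whose `column` value lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    values = [row[column] for row in rows]
    # quantiles() with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [row for row in rows if lower <= row[column] <= upper]


# Synthetic rows; the values are made up for illustration only.
orders = [
    {"order_id": "ORD001", "delivery_charges": 60.0},
    {"order_id": "ORD002", "delivery_charges": 62.5},
    {"order_id": "ORD003", "delivery_charges": 65.0},
    {"order_id": "ORD004", "delivery_charges": 67.5},
    {"order_id": "ORD005", "delivery_charges": 70.0},
    {"order_id": "ORD006", "delivery_charges": 250.0},  # suspiciously large
]

cleaned = remove_iqr_outliers(orders)
print([row["order_id"] for row in cleaned])
```

If delivery charges depend on other attributes of the order, a rule applied to the raw column may be too crude, and applying the same idea to the residuals of a fitted model is one possible alternative.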
The group_id_ass2_task1.ipynb notebook should demonstrate the methodology used to achieve correct results. This includes using appropriate Python functions for input, processing, and output, and presenting the solution in an efficient and proper way.
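As a rough illustration of the input, process, and output structure, the sketch below reads CSV text, imputes missing values in one column, and writes the result back out. It is a minimal sketch under stated assumptions: median imputation is not prescribed by the brief, delivery_charges is used as the incomplete column purely for demonstration, and the function names are hypothetical. In-memory text stands in for the real files so the example is self-contained.

```python
import csv
import io
import statistics


def read_rows(handle):
    """Input: parse CSV text into a list of dictionaries (all values are strings)."""
    return list(csv.DictReader(handle))


def impute_with_median(rows, column):
    """Process: replace empty values in `column` with the median of the observed ones."""
    observed = [float(row[column]) for row in rows if row[column] != ""]
    fill = f"{statistics.median(observed):.2f}"
    return [
        {**row, column: row[column] if row[column] != "" else fill}
        for row in rows
    ]


def write_rows(rows, handle):
    """Output: write the rows back out as CSV, keeping the original column order."""
    writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# Made-up sample data; the second order has no delivery charge recorded.
raw = "order_id,delivery_charges\nORD001,60.0\nORD002,\nORD003,70.0\n"

rows = read_rows(io.StringIO(raw))
filled = impute_with_median(rows, "delivery_charges")

out = io.StringIO()
write_rows(filled, out)
result = out.getvalue()
print(result)
```

Keeping the three stages in separate functions makes each one easy to explain and check in its own notebook cell.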
- Input file: suburb_info.xlsx
- Output file: Group_ass2_task2.ipynb
Study the effect of different normalisation/transformation methods on the columns number_of_houses, number_of_units, population, aus_born_perc, median_income, and median_house_price, to prepare the data for a linear regression model that predicts median_house_price.
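One way to compare the methods is to check how each one changes the linear relationship between a predictor and median_house_price. The sketch below is illustrative only. The three transformations shown (z-score standardisation, min-max scaling, natural log) are common choices rather than a list taken from the brief, the numbers are synthetic, and the helper names are hypothetical. It relies on one certain fact: z-score and min-max scaling are linear rescalings, so they leave the Pearson correlation with the target unchanged, whereas a log transform changes the shape of the relationship.

```python
import math
import statistics


def z_score(values):
    """Standardise to mean 0 and (population) standard deviation 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def min_max(values):
    """Rescale linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def log_transform(values):
    """Natural log; only valid for strictly positive values."""
    return [math.log(v) for v in values]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Synthetic data: income doubles each step while price rises by a fixed amount.
median_income = [400, 800, 1600, 3200, 6400]
median_house_price = [500_000, 600_000, 700_000, 800_000, 900_000]

results = {
    "raw": pearson(median_income, median_house_price),
    "z_score": pearson(z_score(median_income), median_house_price),
    "min_max": pearson(min_max(median_income), median_house_price),
    "log": pearson(log_transform(median_income), median_house_price),
}
for name, r in results.items():
    print(f"{name:8s} r = {r:.4f}")
```

On this synthetic data the log transform makes the relationship linear, while the two scaling methods only change the units. Whether the same holds for the real suburb data has to be checked empirically.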
- Input file: None
- Output file: Group_report.pdf
- Feedback Session During Week 10 Applied Session: Present progress and future plans, record the TA’s suggestions, and continue work based on those suggestions.
- Group Reflection Presentation (Hurdle): Present methodology and answer questions during Week 12 applied sessions. Mandatory attendance.
- Reflective Report: Provide a report based on feedback, tailored solutions, and any related findings.
- Submit 7 files: Group_dirty_data_solution.csv, Group_missing_data_solution.csv, Group_outlier_data_solution.csv, Group_ass2_task1.ipynb, Group_ass2_task1.py, Group_ass2_task2.ipynb, Group_report.pdf
- Zip all files into Group_ass2.zip
- Follow file naming standards and ensure files are parsable and readable.
- Instructions for generating .py files from notebooks.
- Submission checklist details.
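The brief's own instructions for producing the .py file are not reproduced here, and Jupyter's `jupyter nbconvert --to script` command is the usual route. As a hedged alternative, the sketch below extracts the code cells directly from the notebook's JSON and then builds the submission archive. The function names are hypothetical, and the sketch assumes the standard notebook layout with a top-level "cells" list.

```python
import json
import zipfile
from pathlib import Path


def notebook_to_script(notebook_path, script_path):
    """Write the code cells of a notebook to a .py file, in notebook order."""
    notebook = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            source = cell.get("source", "")
            # A cell's source may be a single string or a list of lines.
            chunks.append(source if isinstance(source, str) else "".join(source))
    Path(script_path).write_text("\n\n".join(chunks) + "\n", encoding="utf-8")


def zip_files(paths, archive_path):
    """Bundle the given files into one zip, stored flat (no directory prefixes)."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in paths:
            archive.write(path, arcname=Path(path).name)
```

A call such as notebook_to_script("Group_ass2_task1.ipynb", "Group_ass2_task1.py") followed by zip_files([...], "Group_ass2.zip") covers both steps. Markdown cells are dropped, so the generated script should still be opened and checked before submission.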