Name: [SOLVED] New Assignment 3 2025

New Assignment 3 2025

Part 1: Select an appropriate model, train it on the dataset, and make predictions (3 Points)

The UCI Adult dataset, sometimes called the Census Income dataset, is a classic resource in machine learning for demonstrating classification tasks, particularly binary classification.

Dataset Description

· Number of Instances: Around 48,842 rows (depending on whether duplicates/missing rows are handled).

· Number of Attributes: 14 features (plus the target)

· Feature Types:

■ Numeric (e.g., age, hours-per-week, capital-gain).

■ Categorical (e.g., workclass, marital-status, occupation, sex).

· Target Column:

■ Labeled as income, with possible values >50K or <=50K.

· Common practice is to convert this to binary (1 for >50K, 0 for <=50K).
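The binary conversion described above can be sketched as follows. This is a minimal illustration on a hand-made Series, not the assignment's reference solution. The stripping of whitespace and of a trailing period is an assumption: some copies of the data are reported to carry labels such as "<=50K.", so it is worth checking the real column with value_counts() first.

```python
import pandas as pd

# Hand-made labels standing in for the real income column (illustrative only).
# The padded and period-suffixed variants are assumptions about possible noise.
income = pd.Series([">50K", "<=50K", "<=50K.", ">50K.", " <=50K"], name="income")

# Normalise the labels: remove surrounding whitespace and any trailing period.
cleaned = income.str.strip().str.rstrip(".")

# 1 for >50K, 0 for <=50K.
target = (cleaned == ">50K").astype(int)

print(target.tolist())
```

Comparing against ">50K" after normalising keeps the mapping explicit; any unexpected label would silently become 0, so inspecting cleaned.unique() beforehand is a sensible safeguard.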

Feature List

· age (numeric)

· workclass (categorical: Private, Self-emp, Government, etc.)

· fnlwgt (numeric: "final weight", representing how many people in the US population each record represents)

· education (categorical: Bachelors, HS-grad, etc.)

· education_num (numeric: 1-16, encoded years of education)

· marital_status (categorical)

· occupation (categorical)

· relationship (categorical: Husband, Wife, Not-in-family, etc.)

· race (categorical)

· sex (categorical: Male/Female)

· capital_gain (numeric)

· capital_loss (numeric)

· hours_per_week (numeric)

· native_country (categorical)

· income (target: >50K / <=50K)

Task Overview

Data Acquisition & Understanding (Code provided)

· Download the dataset (e.g., adult.data from the UCI Repository or Kaggle).

· Familiarize yourself with the 14 features and the target column (>50K / <=50K).
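One way to get familiar with the columns is to separate numeric from categorical features and look at the class balance. The sketch below uses a few made-up rows in place of the downloaded DataFrame; the hyphenated column names are an assumption about how the loaded data is labeled and may differ in your copy.

```python
import pandas as pd

# Toy rows standing in for the real DataFrame (values are illustrative only).
df = pd.DataFrame({
    "age": [39, 50, 38, 53],
    "workclass": ["State-gov", "Self-emp-not-inc", "Private", "Private"],
    "sex": ["Male", "Male", "Male", "Female"],
    "hours-per-week": [40, 13, 40, 40],
    "income": ["<=50K", "<=50K", ">50K", "<=50K"],
})

# Look only at the features when sorting columns by type.
feature_df = df.drop(columns="income")
numeric_cols = feature_df.select_dtypes(include="number").columns.tolist()
categorical_cols = feature_df.select_dtypes(exclude="number").columns.tolist()

# Class balance of the target.
class_counts = df["income"].value_counts()

print("numeric:", numeric_cols)
print("categorical:", categorical_cols)
print(class_counts)
```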

Data Cleaning

· Import the dataset into a DataFrame (Code provided)

· Identify and handle missing values (often represented by "?"). Decide whether to drop or impute those rows (0.25 points).

Feature Engineering & Encoding

· Convert the target (income) to a binary numeric: 1 if >50K, 0 if <=50K (0.25 points).

· Encode categorical columns appropriately (e.g., workclass, education, marital_status): (0.5 points)

■ One-hot encoding (dummy variables) or label encoding.

· Consider dropping high-cardinality or rarely occurring categories, or grouping them.
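The missing-value handling and categorical encoding steps above might look like the sketch below. It runs on a small made-up frame, shows both the drop and the impute option side by side, and uses one-hot encoding via pd.get_dummies. Imputing with the most frequent category is one possible choice, not something the assignment prescribes, and the hyphenated column names are assumptions.

```python
import numpy as np
import pandas as pd

# Toy rows standing in for the real data (values are illustrative only).
df = pd.DataFrame({
    "age": [39, 50, 38, 53, 28],
    "workclass": ["State-gov", "?", "Private", "Private", np.nan],
    "occupation": ["Adm-clerical", "Exec-managerial", "?",
                   "Handlers-cleaners", "Prof-specialty"],
    "hours-per-week": [40, 13, 40, 40, 40],
})

# Treat the "?" placeholder as a real missing value.
df = df.replace("?", np.nan)
missing_per_column = df.isna().sum()

# Option A: drop every row that has at least one missing value.
dropped = df.dropna()

# Option B: impute categorical gaps with the most frequent category.
imputed = df.copy()
for col in ["workclass", "occupation"]:
    imputed[col] = imputed[col].fillna(imputed[col].mode()[0])

# One-hot encode the categorical columns (one dummy column per category).
encoded = pd.get_dummies(imputed, columns=["workclass", "occupation"])

print(missing_per_column)
print(encoded.columns.tolist())
```

Dropping is simplest but discards rows; imputing keeps them at the cost of a small bias toward the majority category. For a column with many rare categories, such as native_country, grouping infrequent values into a single "Other" bucket before encoding keeps the number of dummy columns manageable.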

Data Splitting: Split into train and test sets (0.5 points)
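A split along these lines can be sketched with scikit-learn's train_test_split. The 80/20 ratio, the fixed random_state, and the stratify argument are assumptions chosen for illustration; stratifying is a common choice when the classes are imbalanced, as they are for income.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy frame standing in for the cleaned data: 15 negatives and 5 positives.
df = pd.DataFrame({
    "age": list(range(20, 40)),
    "hours-per-week": [40] * 20,
    "income": [0] * 15 + [1] * 5,
})

X = df.drop(columns="income")
y = df["income"]

# Hold out 20% for testing and keep the class ratio equal in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

print(len(X_train), len(X_test))
print(y_train.mean(), y_test.mean())
```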

Model Training: Select a suitable model and appropriate columns to train the model. (0.5 points)
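As one possible reading of this step, the sketch below trains a logistic regression inside a scikit-learn Pipeline that scales the numeric columns and one-hot encodes the categorical ones. The model choice, the selected columns, and the hyphenated column names are all assumptions made for illustration; a decision tree or random forest would fit the same structure. The data is a made-up toy set.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy training data standing in for the cleaned training split.
X_train = pd.DataFrame({
    "age": [22, 25, 28, 31, 45, 50, 52, 58],
    "hours-per-week": [20, 30, 35, 40, 45, 50, 50, 60],
    "workclass": ["Private", "Private", "State-gov", "Private",
                  "Self-emp-inc", "Private", "State-gov", "Self-emp-inc"],
    "education": ["HS-grad", "HS-grad", "Some-college", "HS-grad",
                  "Bachelors", "Masters", "Bachelors", "Doctorate"],
})
y_train = pd.Series([0, 0, 0, 0, 1, 1, 1, 1], name="income")

# Columns chosen for the model (an illustrative selection).
numeric_cols = ["age", "hours-per-week"]
categorical_cols = ["workclass", "education"]

preprocess = ColumnTransformer([
    ("num", StandardScaler(), numeric_cols),
    # handle_unknown="ignore" lets prediction cope with unseen categories.
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
])

clf = Pipeline([
    ("preprocess", preprocess),
    ("model", LogisticRegression(max_iter=1000)),
])
clf.fit(X_train, y_train)

train_pred = clf.predict(X_train)

# A row with a category that never appeared in training still gets a prediction.
unseen = pd.DataFrame([{"age": 30, "hours-per-week": 40,
                        "workclass": "Never-worked", "education": "HS-grad"}])
unseen_pred = clf.predict(unseen)
print(train_pred, unseen_pred)
```

Keeping the preprocessing inside the pipeline means the encoder and scaler are fitted on the training data only, which avoids leaking information from the test set.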

Evaluation: (0.5 points)

· Generate predictions on the test set and compute classification metrics:

■ Accuracy

■ Precision, Recall, F1-score

■ Confusion matrix

Prediction: Make an imaginary person and use the model to predict whether that person's income will be above 50K (0.5 points).
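The evaluation and prediction steps could be sketched as below. To stay short, the example fits a stand-in logistic regression on three numeric toy columns; the data, the column names, the model, and the imaginary person's attributes are all made up for illustration. With the real data you would reuse your own trained model and test split.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_recall_fscore_support)

features = ["age", "education-num", "hours-per-week"]

# Toy train and test splits (illustrative values only).
train = pd.DataFrame({
    "age": [22, 25, 28, 31, 45, 50, 52, 58],
    "education-num": [9, 9, 10, 10, 13, 14, 13, 16],
    "hours-per-week": [20, 30, 35, 40, 45, 50, 50, 60],
    "income": [0, 0, 0, 0, 1, 1, 1, 1],
})
test = pd.DataFrame({
    "age": [24, 33, 48, 55],
    "education-num": [9, 10, 13, 14],
    "hours-per-week": [25, 40, 45, 55],
    "income": [0, 0, 1, 1],
})

model = LogisticRegression(max_iter=1000)
model.fit(train[features], train["income"])

# Predictions and metrics on the held-out rows.
y_test = test["income"]
y_pred = model.predict(test[features])

acc = accuracy_score(y_test, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_test, y_pred, average="binary", zero_division=0
)
# Rows are true classes, columns are predicted classes, ordered [0, 1].
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])

print(f"accuracy={acc:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
print(cm)

# An imaginary person, described with the same columns the model was trained on.
person = pd.DataFrame([{"age": 40, "education-num": 13, "hours-per-week": 45}])
person_pred = int(model.predict(person[features])[0])
person_proba = float(model.predict_proba(person[features])[0, 1])
print("predicted >50K" if person_pred == 1 else "predicted <=50K", person_proba)
```

The imaginary person must go through exactly the same preprocessing as the training data; if the model was trained on one-hot encoded columns, wrapping the preprocessing and the model in a single pipeline makes this automatic.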

# If you have not installed the UCI Machine Learning Repository module (ucimlrepo), uncomment the next line to install it.
# !pip install ucimlrepo

# Download the dataset and convert it to a pandas DataFrame.
from ucimlrepo import fetch_ucirepo
import pandas as pd
import numpy as np

# id=2 selects the Adult dataset in the UCI repository.
adult = fetch_ucirepo(id=2)
A = adult.data.features  # feature columns
B = adult.data.targets   # target column (income)

# Combine features and target into a single DataFrame and display it.
df = pd.concat([A, B], axis=1)
df
