- Association rules:
One of the major techniques in data mining involves the discovery of association rules. These rules correlate the presence of a set of items with another range of values for another set of variables. The database in this context is regarded as a collection of transactions, each involving a set of items, as shown below.
Trans ID Items Purchased
- Meat, Potato, Onion
- Meat, Noodle
- Noodle, Spinach
- Meat, Potato, Onion
- Onion, Potato, Noodle
- Eggs, Spinach
- Eggs, Noodle
- Meat, Potato, Salt, Onion
- Salt, Spinach
- Meat, Potato
- Apply the Apriori algorithm on this dataset.
Note that the set of items is {Meat, Potato, Onion, Noodle, Spinach, Eggs, Salt}. You may use 0.3 for the minimum support value.
- Show the rules that have a confidence of 0.8 or greater for an itemset containing three items.
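One way to approach the two tasks above is sketched below in Python. It is a minimal illustration of Apriori (level-wise candidate generation, subset pruning, then rule extraction), not a reference solution. The function names `count`, `apriori`, and `rules` are made up for this sketch, and because the transaction IDs are not shown in the table, the transactions are simply listed in table order.

```python
from itertools import combinations

# The ten transactions, in the order they appear in the table.
transactions = [
    {"Meat", "Potato", "Onion"},
    {"Meat", "Noodle"},
    {"Noodle", "Spinach"},
    {"Meat", "Potato", "Onion"},
    {"Onion", "Potato", "Noodle"},
    {"Eggs", "Spinach"},
    {"Eggs", "Noodle"},
    {"Meat", "Potato", "Salt", "Onion"},
    {"Salt", "Spinach"},
    {"Meat", "Potato"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def apriori(transactions, min_support):
    """Return {frequent itemset: support count}, built level by level."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    level = [frozenset([i]) for i in items]
    level = [s for s in level if count(s, transactions) / n >= min_support]
    frequent = {}
    k = 1
    while level:
        frequent.update({s: count(s, transactions) for s in level})
        k += 1
        previous = set(level)
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        level = [
            c for c in candidates
            if all(frozenset(sub) in previous for sub in combinations(c, k - 1))
            and count(c, transactions) / n >= min_support
        ]
    return frequent


def rules(frequent, itemset_size, min_confidence):
    """Rules LHS -> RHS drawn from frequent itemsets of the given size."""
    found = []
    for itemset, itemset_count in frequent.items():
        if len(itemset) != itemset_size:
            continue
        for r in range(1, itemset_size):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                # confidence = support(itemset) / support(LHS)
                confidence = itemset_count / frequent[lhs]
                if confidence >= min_confidence:
                    found.append((lhs, itemset - lhs, confidence))
    return found


frequent = apriori(transactions, min_support=0.3)
strong = rules(frequent, itemset_size=3, min_confidence=0.8)
for lhs, rhs, confidence in strong:
    print(sorted(lhs), "->", sorted(rhs), "confidence", confidence)
```

With a minimum support of 0.3 (at least 3 of the 10 transactions), the only frequent three-item set is {Meat, Potato, Onion}, and the only rule from it that reaches a confidence of 0.8 is {Meat, Onion} -> {Potato}, with confidence 3/3 = 1.0. The other candidate rules fall at 0.75 or 0.6.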
- Classification:
Classification is the process of learning a model that describes different classes of data, where the classes are predetermined. Consider the following set of data records:
ID | Age | City | Gender | Education | Profile |
101 | 20-30 | NY | F | College | Employed |
102 | 31-40 | NY | F | College | Employed |
103 | 51-60 | NY | F | College | Unemployed |
104 | 20-30 | LA | M | High School | Unemployed |
105 | 41-50 | NY | F | College | Employed |
106 | 41-50 | NY | F | Graduate | Employed |
107 | 20-30 | LA | M | College | Employed |
108 | 20-30 | NY | F | High School | Unemployed |
109 | 20-30 | NY | F | College | Employed |
110 | 51-60 | SF | M | College | Unemployed |
Assuming that the class attribute is Profile, apply a classification algorithm to this dataset.
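The exercise leaves the choice of classifier open. As one possible choice, the sketch below builds an ID3-style decision tree that splits on the attribute with the highest information gain. The helper names (`entropy`, `information_gain`, `build_tree`, `classify`) are invented for this illustration, and the ID column is left out of the candidate attributes because it is unique per record and carries no predictive value.

```python
from collections import Counter
from math import log2

ATTRIBUTES = ["Age", "City", "Gender", "Education"]
TARGET = "Profile"

# Records 101..110 from the table, without the ID column.
records = [
    ("20-30", "NY", "F", "College", "Employed"),
    ("31-40", "NY", "F", "College", "Employed"),
    ("51-60", "NY", "F", "College", "Unemployed"),
    ("20-30", "LA", "M", "High School", "Unemployed"),
    ("41-50", "NY", "F", "College", "Employed"),
    ("41-50", "NY", "F", "Graduate", "Employed"),
    ("20-30", "LA", "M", "College", "Employed"),
    ("20-30", "NY", "F", "High School", "Unemployed"),
    ("20-30", "NY", "F", "College", "Employed"),
    ("51-60", "SF", "M", "College", "Unemployed"),
]
rows = [dict(zip(ATTRIBUTES + [TARGET], r)) for r in records]


def entropy(rows):
    """Shannon entropy (in bits) of the class labels in `rows`."""
    counts = Counter(r[TARGET] for r in rows)
    total = len(rows)
    return -sum(c / total * log2(c / total) for c in counts.values())


def information_gain(rows, attr):
    """Entropy reduction obtained by splitting `rows` on `attr`."""
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(rows) - remainder


def build_tree(rows, attrs):
    """Return a class label (leaf) or a tuple (attribute, {value: subtree})."""
    classes = Counter(r[TARGET] for r in rows)
    if len(classes) == 1 or not attrs:
        return classes.most_common(1)[0][0]
    best = max(attrs, key=lambda a: information_gain(rows, a))
    rest = [a for a in attrs if a != best]
    return (best, {
        value: build_tree([r for r in rows if r[best] == value], rest)
        for value in sorted({r[best] for r in rows})
    })


def classify(tree, row):
    """Walk the tree; raises KeyError for attribute values unseen in training."""
    while isinstance(tree, tuple):
        attr, branches = tree
        tree = branches[row[attr]]
    return tree


tree = build_tree(rows, ATTRIBUTES)
print(tree)
```

On these ten records, Age has the highest information gain at the root (about 0.49 bits, against about 0.37 for Education). The 31-40 and 41-50 branches are all Employed, the 51-60 branch is all Unemployed, and the 20-30 branch is split further on Education (College gives Employed, High School gives Unemployed).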
- Clustering: Consider the following set of two-dimensional records:
RID Age Years of Service
- 30 5
- 50 25
- 50 15
- 25 5
- 30 10
- 55 25
Use the K-means algorithm to cluster this dataset. Use K = 2 and assume that the records with RIDs 103 and 104 are the initial cluster centroids.
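The K-means steps for this question can be sketched as follows. The RID values are not shown in the table, so the sketch assumes the six rows are numbered 101 to 106 in order, which makes the initial centroids (50, 15) and (25, 5). That numbering is an assumption, as is the function name `k_means`. Euclidean distance is used, computed with the standard-library `math.dist` (Python 3.8 or later).

```python
from math import dist

# ASSUMPTION: the rows are numbered 101..106 in table order.
# Each point is (Age, Years of Service).
points = {
    101: (30, 5),
    102: (50, 25),
    103: (50, 15),
    104: (25, 5),
    105: (30, 10),
    106: (55, 25),
}


def k_means(points, initial_centroids, max_iterations=100):
    """Plain K-means: assign to nearest centroid, recompute means, repeat."""
    centroids = list(initial_centroids)
    assignment = {}
    for _ in range(max_iterations):
        # Assignment step: index of the nearest centroid for each record.
        new_assignment = {
            rid: min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            for rid, p in points.items()
        }
        if new_assignment == assignment:
            break  # no record changed cluster, so the algorithm has converged
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for i in range(len(centroids)):
            members = [points[rid] for rid, c in assignment.items() if c == i]
            if members:
                centroids[i] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return assignment, centroids


assignment, centroids = k_means(points, [points[103], points[104]])
print(assignment)
print(centroids)
```

Under the numbering assumption, the first pass already separates the records into {102, 103, 106} and {101, 104, 105}. The centroids move to roughly (51.67, 21.67) and (28.33, 6.67), and the second pass changes no assignments, so the algorithm stops there.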
What is the difference between describing discovered knowledge using clustering and describing it using classification?