[Solved] CS145 Homework #5

1 Frequent Pattern Mining for Set Data

Given the transaction database shown in Table 1, answer the following questions. Note that the parameter min_support is set to 2.

  • Find all the frequent patterns using Apriori Algorithm. Details of the procedure are expected.
  • Construct and draw the FP-tree of the transaction database.
  • For the item d, show its conditional pattern base (projected database) and conditional FP-tree.
  • Find frequent patterns based on d's conditional FP-tree.
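The level-wise search asked for in the first bullet can be sketched roughly as follows. This is only an illustration, not the course's reference solution: the transactions are copied by hand from Table 1, the names `TRANSACTIONS`, `support`, and `apriori` are hypothetical, and the join step uses a simple pairwise union followed by the usual subset-based pruning rather than a prefix-based join.

```python
from itertools import combinations

# Transactions copied from Table 1 (one set of items per TID).
TRANSACTIONS = [
    {"b", "c", "j"}, {"a", "b", "d"}, {"a", "c"}, {"b", "d"},
    {"a", "b", "c", "e"}, {"b", "c", "k"}, {"a", "c"},
    {"a", "b", "e", "i"}, {"b", "d"}, {"a", "b", "c", "d"},
]


def support(itemset, transactions):
    """Absolute support: number of transactions containing the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def apriori(transactions, min_support):
    """Return a dict mapping each frequent itemset to its support count."""
    items = set().union(*transactions)
    # L1: frequent single items.
    level = {frozenset([i]) for i in items
             if support(frozenset([i]), transactions) >= min_support}
    frequent = {}
    k = 1
    while level:
        for itemset in level:
            frequent[itemset] = support(itemset, transactions)
        k += 1
        # Join: unions of two frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in level
                             for s in combinations(c, k - 1))}
        # Count: keep candidates that meet the support threshold.
        level = {c for c in candidates
                 if support(c, transactions) >= min_support}
    return frequent


frequent_patterns = apriori(TRANSACTIONS, min_support=2)
for itemset, count in sorted(frequent_patterns.items(),
                             key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print("".join(sorted(itemset)), count)
```

Printing the intermediate `candidates` and `level` sets inside the loop would give the per-pass detail that the question asks for.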

Table 1: The transaction database for Question 1.

TID Items
1 b,c,j
2 a,b,d
3 a,c
4 b,d
5 a,b,c,e
6 b,c,k
7 a,c
8 a,b,e,i
9 b,d
10 a,b,c,d
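For the conditional pattern base of an item, one possible shortcut is to skip building the tree explicitly: each FP-tree path is a transaction with infrequent items removed and the rest sorted by descending frequency, so the prefix paths of an item can be collected directly from the reordered transactions. The sketch below assumes ties in frequency are broken alphabetically (a before c); a different tie-break gives a different but equally valid tree. All function names are hypothetical.

```python
from collections import Counter

# Transactions copied from Table 1.
TRANSACTIONS = [
    ["b", "c", "j"], ["a", "b", "d"], ["a", "c"], ["b", "d"],
    ["a", "b", "c", "e"], ["b", "c", "k"], ["a", "c"],
    ["a", "b", "e", "i"], ["b", "d"], ["a", "b", "c", "d"],
]
MIN_SUPPORT = 2

ITEM_COUNTS = Counter(item for t in TRANSACTIONS for item in t)
FREQUENT_ITEMS = {i for i, n in ITEM_COUNTS.items() if n >= MIN_SUPPORT}


def ordered(transaction):
    """Drop infrequent items; sort by descending count, then alphabetically."""
    kept = [i for i in transaction if i in FREQUENT_ITEMS]
    return sorted(kept, key=lambda i: (-ITEM_COUNTS[i], i))


def conditional_pattern_base(item):
    """Map each prefix path that precedes `item` to its count."""
    base = Counter()
    for t in TRANSACTIONS:
        path = ordered(t)
        if item in path:
            prefix = tuple(path[:path.index(item)])
            if prefix:
                base[prefix] += 1
    return base


def conditional_item_counts(base):
    """Item counts inside a pattern base, keeping only frequent items.

    These are the items that survive into the conditional FP-tree.
    """
    counts = Counter()
    for prefix, n in base.items():
        for i in prefix:
            counts[i] += n
    return {i: n for i, n in counts.items() if n >= MIN_SUPPORT}


d_base = conditional_pattern_base("d")
d_tree_items = conditional_item_counts(d_base)
print(dict(d_base))
print(d_tree_items)
```

Under the stated tie-break, the global item order is b, a, c, d, e, and the items remaining in `d_tree_items` are the ones from which patterns ending in d are grown.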

Introduction to Data Mining (UCLA CS 145) Homework #5

2 Apriori for Yelp

In apriori.py, fill in the missing lines, with the following parameters (already set in the code): min_support=50, min_conf=0.25, and ignore_one_item_set=True. Output the frequent patterns and rules associated with the Yelp data (the same one as the project) which we have stored in yelp.csv and id_name.csv. Do NOT modify the print_items_rules() function and directly copy the entire output of the following command in your report in plain text format (do NOT take a screenshot):

python2.7 apriori.py

What patterns and rules do you see? Where are these businesses located? What do these results mean? Do a quick Google search and briefly interpret the patterns and rules mined from Yelp in 50 words or less.
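As a reminder of how a confidence threshold such as min_conf acts on mined patterns, here is a small sketch of rule generation from a table of support counts. It is not the contents of apriori.py, whose internals are not shown here; `generate_rules` and `TOY_SUPPORTS` are hypothetical names, and the toy counts are taken by hand from Table 1 rather than from the Yelp data.

```python
from itertools import combinations

# Toy support counts (counted by hand from Table 1), not Yelp results.
TOY_SUPPORTS = {
    frozenset("a"): 6,
    frozenset("b"): 8,
    frozenset("d"): 4,
    frozenset("ab"): 4,
    frozenset("bd"): 4,
}


def generate_rules(supports, min_conf):
    """Return (antecedent, consequent, confidence) for every rule that
    meets min_conf, where confidence(X -> Y) = sup(X u Y) / sup(X)."""
    rules = []
    for itemset, sup in supports.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty antecedent and consequent
        for r in range(1, len(itemset)):
            for combo in combinations(sorted(itemset), r):
                antecedent = frozenset(combo)
                conf = float(sup) / supports[antecedent]
                if conf >= min_conf:
                    rules.append((antecedent, itemset - antecedent, conf))
    return rules


for lhs, rhs, conf in generate_rules(TOY_SUPPORTS, min_conf=0.25):
    print(sorted(lhs), "->", sorted(rhs), round(conf, 3))
```

A low threshold such as 0.25 lets through rules in both directions for most frequent pairs, which is worth keeping in mind when interpreting the output.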

3 Correlation Analysis

Table 2 shows how many of 10000 transactions contain beer and/or nuts. Answer the following questions based on Table 2.

  • Calculate confidence, lift, and all confidence between buying beer and buying nuts.
  • What are your conclusions of the relationship between buying beer and buying nuts, based on the above measures?

Table 2: Contingency table for Question 3.

Beer No Beer Total
Nuts 150 700 850
No Nuts 350 8800 9150
Total 500 9500 10000
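The three measures can be computed directly from the counts in Table 2. The sketch below uses the common definitions confidence(A -> B) = sup(A, B) / sup(A), lift = P(A, B) / (P(A) P(B)), and all_confidence = sup(A, B) / max(sup(A), sup(B)); the function name `correlation_measures` is hypothetical.

```python
def correlation_measures(n_both, n_a, n_b, n_total):
    """Confidence (both directions), lift, and all-confidence from counts."""
    n_both, n_a, n_b, n_total = map(float, (n_both, n_a, n_b, n_total))
    return {
        "conf_a_to_b": n_both / n_a,
        "conf_b_to_a": n_both / n_b,
        # Lift compares the joint probability with what independence predicts.
        "lift": (n_both / n_total) / ((n_a / n_total) * (n_b / n_total)),
        # All-confidence is the smaller of the two confidences.
        "all_confidence": n_both / max(n_a, n_b),
    }


# Counts from Table 2: A = beer, B = nuts.
measures = correlation_measures(n_both=150, n_a=500, n_b=850, n_total=10000)
for name, value in sorted(measures.items()):
    print(name, round(value, 4))
```

A lift above 1 points to positive correlation, while all-confidence, unlike lift, does not depend on the number of transactions that contain neither item.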

4 Sequential Pattern Mining (GSP Algorithm)

  • For a sequence s = <ab(cd)(ef)>, how many events or elements does it contain? What is the length of s? How many non-empty subsequences does s contain?
  • Suppose we have L3 = {<(ac)e>, <b(cd)>, <bce>, <a(cd)>, <(ab)d>, <(ab)c>} as the frequent 3-sequences. Write down all the candidate 4-sequences C4, with the details of the join and pruning steps.
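The GSP join and prune steps for the second bullet can be sketched as below. The representation is an assumption made for illustration: a sequence is a tuple of elements, each element a tuple of items in alphabetical order, and the fourth entry of L3 is read as <a(cd)>. Two sequences s1 and s2 join when s1 without its first item equals s2 without its last item, and no gap or window constraints are assumed, so pruning checks every subsequence obtained by deleting one item. All function names are hypothetical.

```python
# Frequent 3-sequences; e.g. <(ab)c> is (("a", "b"), ("c",)).
L3 = {
    (("a", "c"), ("e",)),
    (("b",), ("c", "d")),
    (("b",), ("c",), ("e",)),
    (("a",), ("c", "d")),
    (("a", "b"), ("d",)),
    (("a", "b"), ("c",)),
}


def drop_first(seq):
    """Remove the first item of the first element."""
    head = seq[0][1:]
    return ((head,) if head else ()) + seq[1:]


def drop_last(seq):
    """Remove the last item of the last element."""
    tail = seq[-1][:-1]
    return seq[:-1] + ((tail,) if tail else ())


def join(s1, s2):
    """Return the candidate from joining s1 with s2, or None if they do not join."""
    if drop_first(s1) != drop_last(s2):
        return None
    last_item = s2[-1][-1]
    if len(s2[-1]) == 1:
        # The last item of s2 was its own element: append a new element.
        return s1 + ((last_item,),)
    # Otherwise it shared an element: merge it into the last element of s1.
    return s1[:-1] + (tuple(sorted(s1[-1] + (last_item,))),)


def drop_one(seq):
    """Yield every subsequence obtained by deleting exactly one item."""
    for i, element in enumerate(seq):
        for j in range(len(element)):
            rest = element[:j] + element[j + 1:]
            yield seq[:i] + ((rest,) if rest else ()) + seq[i + 1:]


# Join step: try every ordered pair of frequent 3-sequences.
joined = {join(s1, s2) for s1 in L3 for s2 in L3} - {None}
# Prune step: keep candidates whose 3-subsequences are all frequent.
C4 = {c for c in joined if all(sub in L3 for sub in drop_one(c))}

print("after join: ", sorted(joined))
print("after prune:", sorted(C4))
```

Printing the result of `drop_one` for each joined candidate shows which subsequence, if any, causes it to be pruned.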
