Introduction to AI
Uncertainty and Probability
20% of Final Mark
1 Question Description
Part 1: Reasoning Under Uncertainty Basics [30 marks]
This part contains several questions about the basics of reasoning under uncertainty. You need to write your answers to each of these questions in your report, and Show your working.
For calculations, you need to show the steps in the form like, to demonstrate that you know how to calculate them.
For proving, you also need to show the steps during the proof.
Question 1 [10 marks]
The tables below give the prior distribution P(X), and two conditional distributions P(Y |X) and P(Z|Y ). It is also known that Z is independent from X given Y . All the three variables (X, Y , and Z) are binary variables. Compute the table of their joint distribution based on the chain rule.
|
|
- Create the full joint probability table of X and Y , i.e. the table containing the following four joint probabilities P(X = 0,Y = 0), P(X = 0,Y = 1), P(X = 1,Y = 0), P(X = 1,Y = 1).
Also explain which probability rules you used.
- If given P(X = 1,Y = 0,Z = 0) = 0.336, P(X = 0,Y = 1,Z = 0) = 0.168, P(X = 0,Y = 0,Z = 1) = 0.036, and P(X = 0,Y = 1,Z = 1) = 0.042, create the full joint probability table of the three variables X, Y , and Z. Also explain which probability rules you used.
- From the above joint probability table of X, Y , and Z:
- calculate the probability of P(Z = 0) and P(X = 0,Z = 0), (ii) judge whether X and Z are independent to each other and explain why.
- From the above joint probability table of X, Y , and Z:
- calculate the probability of P(X = 1,Y = 0|Z = 1), (ii) calculate the probability of P(X = 0|Y = 0,Z = 0).
Question 2 [10 marks]
Consider three Boolean variables A, B, and C, A B|C, P(B) = 0.7, P(C) = 0.4, P(A|B) = 0.3, P(A|C) = 0.5, and P(B|C) = 0.2, calculate the following probabilities. Show your working.
- P(B,C)
- P(A|B)
- P(A,B|C)
- P(A|B,C)
- P(A,B,C)
Question 3 [10 marks]
Dragonfly has a rare species, which always has an extra set of wings. However, common dragonflies can sometimes mutate and get an extra set of wings. A dragonfly either belongs to the common species or the rare species with the extra wings. There are 0.3% dragonflies belonging to the rare species with the extra set of wings. For the common dragonflies, the probability of the extra-wing mutation is 0.1%. Now you see a dragonfly with an extra pair of wings. What is the probability that it belongs to the rare species? Show your working.
Question 4 [for COMP420 ONLY, 10 marks]
Prove the following statements. Show your working.
- If P(A|B,C) = P(B|A,C), then P(A|C) = P(B|C)
- If P(A|B,C) = P(A), then P(B,C|A) = P(B,C)
- If P(A,B|C) = P(A|C) P(B|C), then P(A|B,C) = P(A|C)
Part 2: Naive Bayes Method [25 marks]
This part is to implement the Naive Bayes algorithm, and evaluate the program on the spam data set to be described below. The program should build a Naive Bayes classifier from the labelled data set and apply it to the unlabelled set.
Problem Description
The labelled data set is in the file spamLabelled.dat, which describes 200 emails, labelled as spam or non-spam. Each email is specified by 12 binary attributes, indicating the presence of features such as Viagra, MILLION DOLLARS, significant amounts of text in CAPS, an invalid reply-to address, and so on. Note that there are 212=4096 possible input patterns, compared to a data set of just 200 examples.
The layout of the data is that each row is an instance of features from one email, and columns correspond to the features, which are binary: the feature is either there or not. The last (rightmost) column is the class: 1 = spam, 0 = non-spam.
The file spamUnlabelled.dat contains 10 new input patterns to be classified.
Theres a good entry in wikipedia (http://en.wikipedia.org/wiki/Naive Bayesian classifier that discusses exactly the domain were applying the algorithm to. You are recommended to read this article.
As we discussed during the lectures, zero probabilities are a problem for the Naive Bayes method. For example, if the training data did not include a C = 1 instance with attribute F8 = 1, the simplest version of the algorithm will assume that P(C = 1|F8 = 1) = 0, and never predict C = 1 if F8 = 1. This is generally a bad idea because P(C = 1|F8 = 1) is unlikely to be exactly zero, even if it is very low. The simplest solution is to initialise all the counts to 1, rather than 0, which means every P(C|F) has at least a low probability. As discussed in the lecture, you should divide by the right number when you convert the counts into probabilities.
Requirements
Your job is to use the Naive Bayes method to classify the unlabelled instances in the spamUnlabelled.dat file. The method should use the training data in spamLabelled.dat to construct the classifier (Naive Bayes probability tables), and then apply the classifier to the data in spamUnlabelled.dat
You should implement the Naive Bayes method from scratch (not call it from any machine learning library). Your program should take two file names as command line arguments, construct a classifier from the data in the first file, and then apply the classifier to the data in the second file.
You may write the program code in Java, C/C++, or any other programming language.
You should submit the following files electronically and also a report.
- (15 marks) Program code for your Naive Bayes Classifier (both the source code and the executable program running on ECS School machines),
- (2 marks) txt containing the output of your program on the unlabelled data set, and
- (8 marks) A report in PDF, text or DOC format. The report should include:
- the probabilities P(Fi|c) for each feature i.
- For each instance in the unlabelled set, given the input vector F, the probability P(S|D), the probability P(S|D), and the predicted class of the input vector. Here D is an email represented by F, S refers to class spam and S refers to class non-spam.
- The derivation of the Naive Bayes algorithm assumes that the attributes are conditionallyindependent. Why is this like to be an invalid assumption for the spam data? Discuss the possible effect of two attributes not being independent.
Part 3: Bayesian Networks: Questions [30 marks]
This part contains several questions about Bayesian networks. You will need to write your answers in the report and show your working.
Consider the Bayesian network below, where all the variables are boolean.
Question 1 [5 marks] Write the factorisation of this Bayesian network.
Question 2 [5 marks] Describe under which condition the following pair of nodes will be (conditionally) independent. For example, A and E are conditionally independent given C.
- A and B.
- A and D.(iii) B and G.
- D and F.
- C and G.
Question 3 [10 marks] Draw the Bayesian network if the node ordering is E D F G
B C A.
Question 4 [10 marks] Suppose P(A) = 0.7, P(B) = 0.2, P(C|A) = 0.6, P(C|A) = 0.3, P(D|B,C) = 0.7, P(D|B,C) = 0.6, P(D|B,C) = 0.5, P(D|B,C) = 0.2. Calculate the probability P(D).
Part 4: Building Bayesian Network [30 marks]
This part is to build a Bayesian Network for the problem described below.
Problem Description
Dr. Rachel Nicholson is a Professor, who lives far away from her university. So, she prefer to work at home and she only comes to her office if she has research meetings with her postgraduate students, or teaching lectures for undergraduate students, or she has both meetings and teaching:
- The probability for Rachel to have meetings is 70%, the probability of Rachel has lectures is 60%.
- If Rachel has both meetings and lectures, the probability of Rachel comes to her office is 95%.
- If Rachel only has meetings (without lectures), the probability of Rachel comes to her office is 75% because she can Skype with her students.
- If Rachel only has lectures (without meetings), the probability of Rachel comes to her office is 80%.
- If Rachel has neither meetings nor lectures, there is a only 6% chance that she comes to the office.
- When Rachel is in her office, half the time her light is off (when she is trying to hide from others to get work done quickly).
- When she is not in her office, she leaves her light on only 2% of the time since the cleaners come for cleaning.
- When Rachel is in her office, 80% of the time she logged onto the computer.
- Because she sometimes work from home, 20% of the time she is not in her office, she is still logged onto the computer.
Note regarding the calculation, you should show your working process of the calculation to demonstrate that you know how to calculate them.
Requirements
- Construct a Bayesian network to represent the above scenario. (Hint: First decide what your domain variables are; these will be your network nodes. Then decide what the causal relationships are between the domain variables and add directed arcs in the network from cause to effect. Finally, you have to add the prior probabilities for nodes without parents, and the conditional probabilities for nodes that have parents.)
- Calculate how many free parameters in your Bayesian network ?
- What is the joint probability that Rachel has lectures, has no meetings, she is in her officeand logged on her computer but with lights off.
- Calculate the probability that Rachel is in the office.
- If Rachel is in the office, what is the probability that she is logged on, but her light is off.
- Suppose a student checks Rachels login status and sees that she is logged on. What effectdoes this have on the students belief that Rachels light is on ?
Part 5: Bayesian Network: Applications [For COMP420 ONLY, 20 marks]
Identify a real-world application (different from the examples given in this assignment and the lectures) that can be described using Bayesian network. There should be at least 5 random variables in this Bayesian network.
In your report, you should:
- Clearly define the random variables and their domains.
- Clearly describe their relationships (using plain language).
- Draw the Bayesian network that can reflect the described relationship.
- Write the factorisation of the Bayesian network.
2 Relevant Data Files and Program Files
The relevant data files, information files about the data sets, and some utility programs files can be found from the following directory:
/vol/comp307/assignment3/
3 Assessment
We will endeavour to mark your work and return it to you as soon as possible, hopefully in 2 weeks. The tutor(s) will run a number of helpdesks to provide assistance.
4 Submission Guidelines
4.1 Submission Requirements
- Programs for all individual parts. To avoid confusion, all the individual parts should usedirectories part1/, part2/, and all pieces of programs should be stored in their corresponding directories. Within each directory, please provide a readme file that specifies how to compile and run your program. A script file called txt should also be provided to show how your program run properly. If you programs cannot run properly, you should provide a buglist file.
- A report document that consist of the answers of all the individual parts. The document should mark each part clearly. The document can be written in PDF, text or the DOC format.
- For drawing the diagram such as the Bayesian network, you need to make the diagram very clear to be marked. We highly recommend using drawing tools rather than by hand.
4.2 Submission Method
The programs and the PDF/Text/DOC version of the document should be submitted through web submission system from the COMP307/420 course web site by the due time.
4.3 Late Penalties
All assignments must be submitted on time unless you have made a prior arrangement with the lecturer or have a valid medical excuse (for minor illnesses it is sufficient to discuss this with the lecturer.) The penalty for assignments that are handed in late without prior arrangement is one grade reduction per day. Assignments that are more than one week late will not be marked.
Reviews
There are no reviews yet.