Student Number:
The University of Melbourne Semester 2 Assessment 2022
School of Computing and Information Systems
COMP90073 Security Analytics
Reading Time: 15 minutes.
Writing Time: 2 hours.
This paper has 19 pages including this cover page. Common Content Papers: None
Authorised Materials: Lecture notes, books, computer, online material.
Instructions to Students:
-
This paper counts for 60% of your final grade, and is worth 60 marks in total.
-
There are 22 questions, with marks as indicated.
-
Answer all the questions on the exam paper if possible, and then upload the completed exam paper containing your solutions. If you are unable to print the exam paper or electronically edit the exam paper, you may write on your own blank paper and then upload images of your written answers.
-
You may upload your exam answers multiple times if you need to revise an answer at any time during the exam.
-
You must not communicate with other students or seek assistance from anyone else whilst taking this exam, e.g. using messaging, chat rooms, email, telephone or face-to-face. Also, you must not assist anyone else taking the exam. You must not post answers to the questions or discussion of the questions online. Failure to comply with these instructions may be considered academic misconduct.
-
You are free to use the course materials and your laptop/PC in this exam, but note that there is a 2-hour time window for the exam, so you should be mindful of the time spent using such resources.
-
Answer the questions as clearly and precisely as you can.
-
Your writing should be clear. Unreadable answers will be deemed wrong. Excessively long answers or irrelevant information may be penalised.
-
For numerical methods, marks will be given for applying the correct method.
Library: This paper may not be reproduced or held by the Baillieu Library.
Section A: Short Answer Questions (Use your own words to provide a short explanation to each question) [10 marks in total]
-
What is the pattern for a typical DNS amplification attack, and why?
[1 marks]
Answer:
-
What type of information cannot be obtained from firewall logs? [1 marks]
-
Account name and process name
-
Source and destination IP addresses
-
Application name and whether it has known vulnerability
-
Whether the application has been used by malware before
Answer:
-
-
The output of an anomaly detection method can be either a score or a label. Provide an example scenario where one prefers a label over a score. [1 marks]
Answer:
-
Supervised machine learning models are not used for anomaly detection because in anomaly detection problems the data is highly imbalanced. One common way to mitigate class imbalance in training is to over-sample the positive class (or under-sample the negative class). Are such solutions effective for anomaly detection problems? Justify your answer. [1 marks]
Answer:
-
Most of the anomaly detection methods introduced in this subject assume the training data is clean (i.e., not noisy); otherwise, their performance can be significantly impacted. Name two anomaly detection methods that are less susceptible to noisy data and discuss why they are more resilient. [1 marks]
Answer:
-
In Support Vector Data Description (SVDD), what are the training samples with 0 < α < C, and what’s their role in decision making? [1 marks]
Answer:
-
Which of the following options is not a valid adversarial sample for the given data point x? [1 marks]
Answer:
-
Adversarial training is an effective defence method against adversarial attacks. How does it augment the training dataset? [1 marks]
Answer:
-
One limitation of adversarial training is that it degrades the model’s performance on clean data. Why is that? [1 marks]
Answer:
-
In adversarial attacks against reinforcement learning models, the attacker does not need to perturb every state observed by the agent. What is the heuristic method that decides whether to poison an observed state? [1 marks]
Answer:
Section B: Method and calculation Questions [30 marks in total]
-
You are a security expert working for MBank Financial Group. Your responsibility is to secure the company’s IT systems, in particular the Payroll system, the Customer Relationship Management System and the brochure hosting site.
-
How do you measure the confidentiality of the information you need to protect, and how can it be applied to information in those three systems? [2 marks]
Answer:
-
What are the controls to ensure the CIA triad is supported? [1 marks]
Answer:
-
-
One recently disclosed critical vulnerability on MBank Financial Group’s online share trading platform allows an attacker to gain unauthorised access to a customer’s share portfolio. Should it be exploited, this will cause a major impact on MBank Financial Group. The detailed metrics and ratings of the exploit are tabled below.
Metrics | Rating
Skill (high skill level required → low or no skill required) | 2
Ease of Access (very difficult to do → very simple to do) | 2
Incentive (high incentive → low incentive) | 5
Resource (requires expensive or rare equipment → no resources required) | 3
-
What is the likelihood score? [1 marks]
Answer:
-
What is the risk level? [1 marks]
Answer:
-
What is the recommended action, and why? Choose the appropriate answer, and briefly explain your choice.
-
Immediate action required to mitigate the risk or decide to not proceed
-
Action should be taken to compensate for the risk
-
Action should be taken to monitor the risk
[1 marks]
Answer:
-
-
-
The XLeague company designed a new version of a game, whose Intellectual Property is worth $1,600,000. The exposure factor is 80%, and the annualised rate of occurrence is 25%.
-
What is the single loss expectancy? [1 marks]
Answer:
-
What is the annualised loss expectancy? [1 marks]
Answer:
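For reference, single loss expectancy (SLE) is conventionally defined as asset value × exposure factor, and annualised loss expectancy (ALE) as SLE × annualised rate of occurrence (ARO). The following is a minimal Python sketch of those two formulas applied to the figures in the question. The function and variable names are illustrative assumptions, not taken from the subject materials.

```python
def single_loss_expectancy(asset_value: float, exposure_factor: float) -> float:
    """SLE = asset value x exposure factor."""
    return asset_value * exposure_factor


def annualised_loss_expectancy(sle: float, aro: float) -> float:
    """ALE = SLE x annualised rate of occurrence (ARO)."""
    return sle * aro


# Figures from the question.
asset_value = 1_600_000   # value of the intellectual property, in dollars
exposure_factor = 0.80    # fraction of the asset value lost per incident
aro = 0.25                # expected number of incidents per year

sle = single_loss_expectancy(asset_value, exposure_factor)
ale = annualised_loss_expectancy(sle, aro)

print(f"SLE = ${sle:,.0f}")
print(f"ALE = ${ale:,.0f}")
```

With these inputs the sketch gives an SLE of about $1,280,000 and an ALE of about $320,000.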
-
-
The table below shows a list of transactions. Use FP-growth to identify frequent patterns with min_sup = 3. Your work should include the FP-tree, the conditional pattern bases, the conditional FP-trees, and the frequent patterns. [3 marks]
TID | List of items
T100 | {a, b, c, k, l, m, s}
T200 | {f, a, b, c, d, g, i, m, p}
T300 | {a, b, d, h, j, k, w}
T400 | {b, c, k, m, p}
T500 | {a, f, c, e, l, p, k, n}
T600 | {f, c, e, p, w}
Answer:
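The first pass of FP-growth counts the support of each item, discards items below the minimum support, and reorders every transaction by descending support before the FP-tree is built. The following is a minimal Python sketch of that first pass only, using the transactions from the table. Tree construction and pattern mining are not shown. Breaking ties alphabetically is an assumption made here for reproducibility, and the helper names are illustrative.

```python
from collections import Counter

MIN_SUP = 3

transactions = {
    "T100": ["a", "b", "c", "k", "l", "m", "s"],
    "T200": ["f", "a", "b", "c", "d", "g", "i", "m", "p"],
    "T300": ["a", "b", "d", "h", "j", "k", "w"],
    "T400": ["b", "c", "k", "m", "p"],
    "T500": ["a", "f", "c", "e", "l", "p", "k", "n"],
    "T600": ["f", "c", "e", "p", "w"],
}


def frequent_items(transactions, min_sup):
    """Count how many transactions contain each item; keep those with support >= min_sup."""
    counts = Counter(item for items in transactions.values() for item in set(items))
    return {item: n for item, n in counts.items() if n >= min_sup}


def order_transaction(items, supports):
    """Drop infrequent items and sort the rest by descending support.

    Ties are broken alphabetically (an assumption; any fixed order works).
    """
    kept = [item for item in set(items) if item in supports]
    return sorted(kept, key=lambda item: (-supports[item], item))


supports = frequent_items(transactions, MIN_SUP)
ordered = {tid: order_transaction(items, supports) for tid, items in transactions.items()}

print(sorted(supports.items(), key=lambda kv: (-kv[1], kv[0])))
for tid, items in ordered.items():
    print(tid, items)
```

The ordered transactions are what would be inserted into the FP-tree one path at a time. A different tie-breaking rule changes the tree's shape but not the frequent patterns that are eventually mined.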
-
Local outlier factor (LOF) is one of the most effective anomaly detection techniques. However, it struggles to identify group anomalies, which can appear frequently in cyber security problems. How would you extend LOF to detect group anomalies as well as point anomalies? Discuss how your solution achieves this goal. [1 marks]
Answer:
-
Which of the following statements is/are not true about isolation forest (iForest) and half-space tree (HS-tree)? Justify your choices. [2 marks]
-
iForest splits the space randomly, while HS-tree splits the space from the middle point.
-
iForest and HS-tree both identify regions of normal and anomaly.
-
iForest generates trees with arbitrary shapes and different lengths, while in HS-tree all branches have the same depth.
-
iForest is not able to identify local anomalies, while HS-tree can.
Answer:
-
-
In your own words, explain how graph convolutional networks (GCNs) adapt the idea of convolution to graph networks and why such a solution is needed.
[1 marks]
Answer:
-
The one-class support vector machine (OCSVM) solves the following quadratic problem to generate the decision boundary:
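\[
\begin{aligned}
\min_{w,\,\xi_i,\,\rho}\quad & \frac{1}{2}\lVert w\rVert^{2} + \frac{1}{\nu n}\sum_{i=1}^{n}\xi_i - \rho \\
\text{s.t.}\quad & (w \cdot \phi(x_i)) \ge \rho - \xi_i, \quad \forall i = 1, \dots, n \\
& \xi_i \ge 0, \quad \forall i = 1, \dots, n
\end{aligned}
\]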
What are the roles of ν, ξ, and ρ in this equation? And why do we want to maximise the distance of the hyperplane from the centre? [2 marks]
Answer:
-
In Task 1 of Assignment 2, we asked you to train an anomaly detection algorithm on extracted features from network traffic, and gave you a training, a test, and a validation set. To address this task, one of your classmates, Flora, takes the following steps:
-
Flora starts by fitting a PCA (n_component = 20) on the validation set with all the 15 features (including stream ID, without label), and calls the output model “PCAfitted”.
-
Then, Flora applies the PCA to the validation set, and denotes the reduced dataset (processed by PCA) as “Dataval_PCA”.
-
Afterwards, Flora trains DBSCAN on Dataval_PCA, and fine-tunes the parameters to get the highest accuracy.
-
Finally, Flora extracts features from the training and test datasets by applying the PCAfitted model, and applies DBSCAN to both datasets.
Flora finds that the False Positive (FP) rate is too high for the trained DBSCAN model. Can you suggest how Flora can effectively reduce the FP rate? Will your method affect the True Positive (TP) rate? [2 marks]
Answer:
-
-
A binary linear Support Vector Machine (SVM) model (f) classifies input x using the following: f(x) = w · x + b, i.e., if w · x + b > 0, x is classified into the positive class; otherwise, it is classified into the negative class. As demonstrated in Figure 1, in order to generate an adversarial sample x′ against f for input x, one option is to perturb x in a direction orthogonal to the decision boundary hyperplane.
[Figure 1 labels: x, Positive Class (+), Decision boundary, x′, Negative Class (–)]
Figure 1: Generating an adversarial sample against a binary linear SVM classifier by moving the original input in a direction orthogonal to the decision boundary.
Suppose that w = [4 3], b = 2, and x = [x1 x2]^T, i.e., the input x is two-dimensional. Generate an adversarial sample x′ for the point (−1, 3) using the following two approaches:
-
Fast gradient sign method (FGSM).
-
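For the linear classifier f(x) = w · x + b above, the gradient of f with respect to x is w, so an FGSM-style step moves every coordinate by the same amount ε in the direction given by the sign of w, towards the opposite class. The following is a minimal Python sketch under that assumption. The step size `epsilon` is a hypothetical value chosen for illustration, since the question does not specify one, and the helper names are illustrative.

```python
def sign(v: float) -> int:
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def decision(w, b, x):
    """Linear decision function f(x) = w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def fgsm_linear(w, b, x, epsilon):
    """Take one FGSM-style step of size epsilon towards the opposite class.

    For a linear model the input gradient of f is w, so each coordinate
    moves by epsilon * sign(w_i), against the currently predicted class.
    """
    direction = -1 if decision(w, b, x) > 0 else 1
    return [xi + direction * epsilon * sign(wi) for xi, wi in zip(x, w)]


w, b = [4.0, 3.0], 2.0
x = [-1.0, 3.0]
epsilon = 1.5  # assumed step size; not given in the question

x_adv = fgsm_linear(w, b, x, epsilon)
print("f(x)     =", decision(w, b, x))
print("x_adv    =", x_adv)
print("f(x_adv) =", decision(w, b, x_adv))
```

Here f(x) = 7, and each unit of ε lowers f by |4| + |3| = 7, so under these assumptions any ε > 1 moves the point across the decision boundary.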