UNIVERSITY OF WATERLOO Cheriton School of Computer Science
CS 458/658 Computer Security and Privacy Winter 2021
ASSIGNMENT 2
Total marks: 89 + 5 Bonus Marks
Written Response TA: Sajin Sasy
Programming Response TAs: Miti Mazmudar, Matthew Rafuse
TA Office Hours: Mondays 10:00-11:00 EDT
Please use Piazza for questions and clarifications. We will be using BigBlueButton for TA office hours this term; we have separate online rooms for the written and programming parts. To attend office hours, access the corresponding URL for that assignment part and use the corresponding access code when prompted. When asked for your name please enter both your first and last name as they appear in LEARN.
Written: https://bbb.crysp.org/b/you-9rz-fv4 Access code: 190863
Programming: https://bbb.crysp.org/b/you-nv9-7pn Access code: 367905
Milestone due date: February 26th 2021 at 3:00pm (optional)
Assignment due date: March 12th 2021 at 3:00 pm (the usual 48-hour automatic extension applies).
What to hand-in
All assignment submission takes place on the student.cs machines (not ugster or the virtual environments), using the submit facility. In particular, log in to the Linux student environment (linux.student.cs.uwaterloo.ca), go to the directory that contains your solution, and submit using the following command: submit cs458 2 . (dot included). CS 658 students should also use this command and ignore the warning message.
1. A2 milestone deadline:
Note that the A2 Milestone is optional. If you submit your (functional) A2 milestone by the deadline, you will receive an additional 5 bonus marks. These marks are on top of the total marks for the assignment, and not submitting the milestone does not preclude you from getting full marks on the assignment.
src.tar: Your source files for the first question of the programming assignment, in your supported language of choice, inside a tarball. After the milestone, your submissions for the first question will not be tested. See the instructions below on how to create the tarball.
2. A2 deadline:
a2.pdf: A PDF file containing your answers for the written-response questions. It must contain, at the top of the first page, your name, UW userid, and student number. If it does not, a three mark penalty will be assessed. Be sure to embed all fonts into your PDF files. Some students' files were unreadable in the past; if we can't read it, we can't mark it. Note that renaming a .txt file to a .pdf file does not make it a PDF file.
src.tar: Your source files for the 2nd to the 7th questions of the programming assignment, in your supported language of choice, inside a tarball.
Re-read the output requirements and testing and marking sections, before creating your tarball. To create the tarball, cd to the directory containing your code, and run the command
tar cvf src.tar .
(including the .). If you are using an interpreted language, your source should include an executable script named ids that runs your code using the proper shebang. For compiled languages, include a Makefile in your source with a default target that builds an executable named ids.
1 Written Response Questions [42 marks]
1.1 [19 marks] Intelligent Agents of Intelligence fight for Intelligence
CipherIsland Intelligence Service (CIIS), the central espionage service of CipherIsland, uses the Bell-LaPadula confidentiality model to protect its documents with the following sensitivity/clearance levels:
Director >c Executive >c Handler >c Agent >c Support >c Unclassified
CIIS also compartmentalizes all of its documents by projects, with the respective project codenames for access control. The project codenames are typically Greek letters.
1. [8 marks] Sterling Archer, the best field agent at CIIS, absolutely detests access control mechanisms and has no intention of understanding how Bell-LaPadula works. All he knows is that his clearance level is (Agent, {, , }). For each of the following documents, help Sterling Archer figure out whether he has read access, write access, both, or neither under the Bell-LaPadula Confidentiality Model:
(i) F320: (Executive, {, })
(ii) F210: (Director, {})
(iii) F102: (Support, {, })
(iv) F513: (Support, {, })
(v) F219: (Agent, {, , })
(vi) F924: (Support, )
(vii) F100: (Agent, {, , })
(viii) F465: (Director, {, , , , })
2. [6 marks] CIIS is actively under attack by its rival agency, the central espionage service of CryptoLand, CryptoLand Intelligence Service (CLIS). With the help of their secret mind-control weapon, CLIS has successfully infiltrated CIIS by controlling two of their employees, Ray Gillete and Cyril Figgis. Unfortunately, while their secret weapon allows them to control the actions of an individual, it does not retain the target individual's memory. Hence CLIS has no idea what credentials Ray Gillete and Cyril Figgis hold.
CLIS knows from its previous infiltration attempts that CIIS has a strict security policy that triggers an alarm if an employee tries to access files without appropriate credentials more than once. On an employee's first attempt to access a file without appropriate credentials, the employee is warned by the system. The second attempt results in an internal alarm and the employee being locked out of the system. To avoid triggering the alarm, CLIS decides to access files strategically to figure out the credentials their infiltrators hold. However, it turns out fine-grained muscle control with their mind-control weapon is extremely hard; Ray and Cyril effectively end up trying to access random files.
Ray successfully reads files F111 (Support, {,}) and F331 (Handler, {,}), but triggers a warning when attempting to read F222 (Agent, {, , }).
Cyril successfully reads files F212 (Unclassified, {}) and F396 (Agent, {, }), but triggers a warning when attempting to read F579 (Agent, {, }).
Given that CIIS only has {, , , , , , } as the set of compartments, and the above interactions of Ray and Cyril:
(a) What is the lowest clearance level and minimal set of compartments Ray must hold? What is the highest clearance level and maximal set of compartments Ray could have?
(b) What is the lowest clearance level and minimal set of compartments Cyril holds? What is the highest clearance level and maximal set of compartments Cyril could have?
(c) CLIS desperately wants to read the file F999 (Handler, {, , }). Which of their two infiltrators is the better choice to attempt reading this file, and why?
3. [5 marks] Meanwhile, CIIS is engaged in infiltrating CLIS and gaining access to their intelligence. From their undercover mole Barry Dilton they come to know that CLIS uses a Biba integrity model for its documents, specifically one with a Low Watermark property and with the same sensitivity/clearance levels as CIIS. CLIS too uses Greek letters for its project codenames.
Barry has been tasked with exposing the file F123 (Agent, {.,}), by dropping down F123's integrity to (Unclassified, ). However, he is aware that CLIS has an alarm that will trigger if the clearance level of a subject or object changes by more than one level in an action; similarly the alarm also triggers if the integrity level of the subject or object changes by more than one compartment in an action. (The system will allow for a change in one level of clearance as well as one change in compartment within the same action.) The alarm will also instantly lock out the subject whose actions triggered the alarm, preventing them from taking any further actions that might harm CLIS.
Detail the steps Barry needs to take to complete his task, without setting off the alarm, given that Barry's credential is (Handler, {, , , , }) and that he can see the following set of files in the system:
(i) F546: (Handler, {, , })
(ii) F101: (Unclassified, )
(iii) F513: (Agent, {, })
(iv) F121: (Executive, {, , })
(v) F676: (Agent, {, , })
(vi) F917: (Agent, {, , })
(vii) F369: (Support, {})
(viii) F129: (Agent, {, , , })
(Note that a simple way to manipulate the credentials of a file would be to add an empty string to the file in question, so that the file itself doesn't change but the integrity level gets updated by the Low Watermark property.)
1.2 [13 marks] Securing password authentication
CIIS, among other precautionary mechanisms, is auditing the security of their password authentication mechanism. Currently, they store the hash of a password (fingerprint) in a file, and authenticate their employee login attempts against this fingerprint. Their scheme for generating and verifying fingerprints is sketched below:
Every password entry P maintains an 8-bit random salt S used for generating its fingerprint F, and the system uses a hash function H.
The fingerprint of a password is computed as F = H(P) ⊕ S and is stored in their password fingerprint file (essentially their version of /etc/shadow) along with the username and S for that user, where ⊕ is the bitwise XOR operator.
When an employee attempts to log in with a password P, the system verifies the password by computing H(P) ⊕ S where S is the salt for that user in their password fingerprint file.
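The scheme as described can be written out in a few lines. This is only a sketch: the source does not say at this point which hash H is, so stand_in_hash below is a hypothetical 8-bit stand-in (the first byte of SHA-256), and the function names are illustrative. Comparing the recomputed value against the stored fingerprint is assumed to be the verification step.

```python
import hashlib
import secrets


def stand_in_hash(password: bytes) -> int:
    """Hypothetical 8-bit H for illustration only (first byte of SHA-256)."""
    return hashlib.sha256(password).digest()[0]


def new_salt() -> int:
    """Draw a random 8-bit salt S."""
    return secrets.randbelow(256)


def fingerprint(password: bytes, salt: int, h=stand_in_hash) -> int:
    """Compute F = H(P) XOR S as described in the scheme."""
    return h(password) ^ salt


def verify(password: bytes, salt: int, stored: int, h=stand_in_hash) -> bool:
    """Recompute H(P) XOR S for the login attempt and compare with the stored F."""
    return fingerprint(password, salt, h) == stored
```

A stored entry would then hold the username, the salt returned by new_salt(), and the value returned by fingerprint().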
1. [3 marks] Is this scheme secure? If no, what attacks are they currently susceptible to?
2. [2 marks] How can they improve their scheme without changing the underlying hash function?
Roger Pollock, CIIS's resident cryptographer, realizes during the audit that the hash function (H) they have been using for their password authentication is actually just an 8-bit CRC (Cyclic Redundancy Check), and is furious at this oversight on the company's part.
Roger: We absolutely need to use a Cryptographic Hash Function for password authentication, since they provide the following desirable properties:
Pre-image resistance: Given a hash value h, it should be difficult to find any message m such that H(m) = h.
Second pre-image resistance: Given an input m1, it should be difficult to find a different input m2 such that H(m1) = H(m2).
Collision resistance: It should be difficult to find two different messages m1 and m2 such that H(m1) = H(m2). [1]
So please make sure we use a strong cryptographic hash function.
3. [3 marks] Which of the above listed desirable properties of a cryptographic hash function does an 8-bit CRC have?
4. [3 marks] Erythrina, the CIIS employee responsible for re-implementing the password authentication module, remembered a cryptographic hash function from a security course she took as an undergraduate student about a decade ago. Here is a hash of a password she deems very secure, which she generated using the hash function she remembered (the hash value below is before the salt is added to it):
EA0C04513C32717F3A09FF7B1FA882C4D8424B2A
Name and justify a candidate hash function that could have produced this hash. What is the password that hashes to that value, and how did you determine it?
5. [2 marks] Propose an alternate hash function that could provide better security properties and justify your choice.
[1] To clarify, the difference between second pre-image resistance and collision resistance is that second pre-image resistance starts with a fixed m1 and states that it should be hard to find an m2 that hashes to the same value as m1. Collision resistance states that it should be hard to find any pair m1 and m2 that hash to the same value, which is clearly a stronger property to ask for. Hence second pre-image resistance is also often referred to as weak collision resistance.
1.3 [10 marks] Firing up the Firewall
After a recent security breach and the loss of several confidential files, CIIS has decided to set up its firewall again. You are tasked with this ordeal alongside the current network security expert Lana Kane. CIIS owns the IP address range 17.27.13.0/25. The following are the network functionalities that CIIS requires for its day-to-day operations:
All employees of CIIS should be able to browse the internet from within their network (i.e. browse all HTTP and HTTPS web pages).
Their public webpage, which is hosted on an internal server (with the IP address 17.27.13.7 and served with HTTPS), must be accessible on the internet.
Employees should be able to ssh into their work devices in the company network from anywhere in the world.
CIIS only trusts a special DNS server (located at the IP address 33.99.22.101) hosted by an allied organization to handle all of its DNS lookups. This DNS server is unique in that it serves requests on port 1551 (normally DNS servers serve requests on port 53) and also expects the clients to send these requests from the ports in range 5000 to 5100.
CIIS also maintains an IRC server (on port 3223 of a server with IP address 17.27.13.17) which is meant to facilitate communications of their covert agent (with the IP address 9.19.11.217) with the rest of the organization.
1. [2 marks] Lana Kane is of the opinion that they should reinstate a deny-list with the list of all known malicious IP addresses along with the source IP address of the recent breach to protect against future attacks. Is this a good defense strategy? Why or why not?
2. [2 marks] While configuring the firewall, you notice a series of IP packets from outside the company network that have their source IP addresses as 32.23.11.17. What kind of an attack is this? What type of firewall can be used to defend against it?
3. [6 marks] Configure the firewall by adding the required rules to meet the aforementioned requirements of CIIS. Rules must include the following:
DROP or ALLOW
Source IP Address(es)
Destination IP Address(es)
Source Port(s)
Destination Port(s)
TCP or UDP or BOTH
Here is an example rule to allow access to HTTP pages from a server with IP address 5.5.5.5:
ALLOW 5.5.5.5 => 32.23.11.0/25 FROM PORT 80 to all BY TCP
HINTS:
CIDR Notation may be helpful for this portion of the assignment.
Some requirements may need more than one rule.
Ports can be specified as a single value, a range, a set, or all, as seen in the example above.
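To see what a CIDR block covers, Python's standard ipaddress module can be used. The sketch below checks membership in the CIIS range stated above; the helper name in_network is hypothetical.

```python
import ipaddress

# The CIIS address range given in the question.
CIIS_NET = ipaddress.ip_network("17.27.13.0/25")


def in_network(addr: str, network=CIIS_NET) -> bool:
    """Return True if the dotted-quad address falls inside the CIDR block."""
    return ipaddress.ip_address(addr) in network


if __name__ == "__main__":
    print(CIIS_NET.num_addresses)        # number of addresses in a /25
    print(CIIS_NET[0], CIIS_NET[-1])     # first and last address of the block
    print(in_network("17.27.13.7"))      # the public web server
    print(in_network("9.19.11.217"))     # the covert agent's address
```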
2 Programming Question [47 marks]
In this part of the assignment, you will write some software that interacts with real-world networking technologies. The goal of this section is to introduce you to the details of network security, as well as some specific attacks.
This assignment typically involves a substantial amount of programming. For this reason, we suggest that you start working on your solutions well in advance of the deadline.
An incredibly brief networking primer:
Information is sent across the network in packets: small units of information. Packets contain multiple layers of information. Each layer serves a different conceptual purpose. For example, a typical packet containing data as part of a TCP connection contains the following layers:
1. Ethernet Layer: This layer contains the source and destination MAC addresses, which are used for sending data between machines on a local network segment.
2. Internet Protocol (IP) Layer: This layer contains source and destination IP addresses, which are used for routing packets between networks.
3. Transmission Control Protocol (TCP) Layer: This layer contains source and destination port numbers, packet type flags, and connection state information. This information is used for creating the concept of stateful connections on the packet-based network, and for differentiating services on the recipient machine.
Each layer typically consists of some headers containing information specific to that layer, an integer specifying the type of the next layer, and then the next layer. Some layers, such as Ethernet, have headers with known fixed sizes. Other layers may contain the header and content length as part of the packet. The TCP and UDP headers do not specify the type of data that they contain; you should guess the format of the data based on the port numbers. Most application protocols have well-known port numbers (e.g., DNS on port 53, HTTP on port 80, NTP on port 123).
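The layering described above can be walked with the standard struct module. The following is a minimal sketch, assuming an Ethernet II frame with no VLAN tag that carries IPv4 and TCP; the function name parse_layers and the returned dictionary keys are illustrative, not part of any provided skeleton.

```python
import socket
import struct

ETHERTYPE_IPV4 = 0x0800   # "type of the next layer" field in the Ethernet header
PROTO_TCP = 6             # "type of the next layer" field in the IPv4 header


def parse_layers(frame: bytes):
    """Return selected header fields, or None if the frame is not Ethernet/IPv4/TCP."""
    if len(frame) < 14:
        return None
    # Ethernet has a fixed 14-byte header: destination MAC, source MAC, EtherType.
    dst_mac, src_mac, ethertype = struct.unpack("!6s6sH", frame[:14])
    if ethertype != ETHERTYPE_IPV4:
        return None
    ip = frame[14:]
    if len(ip) < 20:
        return None
    # IPv4 carries its own header length (IHL, in 32-bit words) in the first byte.
    ihl = (ip[0] & 0x0F) * 4
    if ihl < 20 or ip[9] != PROTO_TCP or len(ip) < ihl + 20:
        return None
    tcp = ip[ihl:]
    src_port, dst_port = struct.unpack("!HH", tcp[:4])
    return {
        "src_mac": src_mac.hex(":"),
        "dst_mac": dst_mac.hex(":"),
        "src_ip": socket.inet_ntoa(ip[12:16]),
        "dst_ip": socket.inet_ntoa(ip[16:20]),
        "src_port": src_port,
        "dst_port": dst_port,
        "flags": tcp[13],                     # TCP flag bits (SYN, ACK, ...)
        "payload": tcp[(tcp[12] >> 4) * 4:],  # data after the TCP header
    }
```

The destination port in the result is what you would use to guess the application protocol, since TCP itself does not name it.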
For this assignment, most questions involve packets that contain information inside an IP layer. Important protocols include ICMP, which is often used for network troubleshooting (e.g., pings); UDP, which is a simple connectionless protocol for efficiently sending self-contained pieces of information; and TCP, which is a protocol that provides the notion of streaming connections. You will also need to work with the DNS, HTTP, and ARP protocols. DNS is a protocol that typically operates over UDP. It allows machines to ask questions about domain names (e.g., example.com) by retrieving records (e.g., the A, or address, record) containing values set by the domain's owner (e.g., the A record for example.com is 93.184.216.34). HTTP is the protocol that is used by the World Wide Web, allowing web browsers to request web pages from web servers. ARP is a protocol that deals with low-level LAN infrastructure such as mapping IP addresses to MAC addresses. We recommend reading Section 10.6 of the van Oorschot textbook to familiarize yourself with the terminology.
The Setting
Your security consultancy has been approached by a large technology business, Initrode. Initrode has recently been the victim of several cyberattacks that have evaded their firewalls. They have hired you to improve their security systems so that they can identify attacks as they occur.
Initrode has purchased a powerful new internal switch that will host an intrusion detection system (IDS). This IDS is network-based, and it will silently monitor all network traffic for known attacks based on provided signatures. Your job is to implement the application that runs on this machine. Your application will receive suspicious packet capture files from a network monitoring program and output any detected attacks, as well as some details about them. The output from your program will be used by other scripts to send alerts to the network administrators, or to dynamically add rules to the firewall, as deemed appropriate.
The Initrode network has the following structure:
[Figure: Initrode network diagram showing the Internet, Firewall, Router, Switch, and IDS, labelled with Public IP Address: 8.5.4.92 and Private Network: 10.0.0.0/8.]
The public IP address of the Initrode network is 8.5.4.92. Within the corporate LAN, machines are assigned IP addresses in the 10.0.0.0/8 range (in CIDR notation). The router connecting the Initrode network to the internet uses network address translation (NAT) to move packets between these networks, much like the consumer routers found in homes. Initrode does not permit IPv6 packets to be transmitted, so you will only need to parse IPv4 packets. Note that the switch (and thus the IDS) is placed on the LAN side of the router and can thus silently observe all internal traffic.
Your Task
The host machine for the IDS will monitor network traffic using the popular tcpdump utility for Unix-like operating systems. Another programmer has written software that causes suspicious network traffic to be saved in pcap files, the file format used by tcpdump to save captured packet sequences. Your program will be responsible for analyzing these pcap files and raising an alarm if certain attacks are detected.
Your program must accept a single command-line argument: a file path for a pcap file. You will need to read the packets from this file to determine if any attacks have occurred. You may assume that the packets in the pcap file are complete and sorted by timestamp. When an attack is detected, you will need to print an alert to the standard output stream. The output format for question 1 is indicated in the question. For questions 2-7, your alert messages should conform exactly to the following format:
[attack]: details
In your output, both attack and details should be replaced with the information specified in the relevant section of the assignment. The output from your program will be processed by a series of scripts written by another programmer. These scripts will react to your alerts in a manner determined by the network administrators.
Outline: The following sections describe the attacks that you should detect. Marks for the programming part are allocated as follows:
Detections: Detect a variety of network-based attacks
[6 marks] Anomaly detection: Count packets and sizes
[6 marks] Spoofed packets: Detect packets with clearly spoofed addresses
[6 marks] ARP spoofing: Detect ARP cache poisoning attacks
[6 marks] Unauthorized servers: Detect LAN-based servers
[6 marks] IIS worms: Detect the presence of famous worms
[6 marks] Sinkhole lookups: Detect DNS queries for sinkholed domains
[6 marks] NTP reflection DDoS: Detect amplified denial-of-service attacks
[5 marks] Output requirements: Use correct output formatting
The remainder of the assignment describes each part in detail. Please go through the References section, even if you are familiar with networks, as it will help you develop a strategy to solve each question and point you to necessary references that specify details of attacks. We strongly recommend that you install the Wireshark application to help you with this assignment, as described in that section. Before you start coding in your preferred language of choice, go through the programming languages section and familiarize yourself with the testing and marking procedures. You will find test files, project skeletons, and additional information on LEARN. Refer to the start of this document on what to hand-in for each deadline.
2.1 [6 marks] Anomaly detection
Despite our best efforts, sometimes attacks can pass by our IDS undetected. However, it is sometimes possible to detect that something is unusual, even if we're not sure what the problem is. Anomaly detection is the process of detecting when a system, such as a network, is behaving unusually. One simple way to do this for a network is to detect when there is more traffic than usual for the time of day or day of the week; this may indicate that a virus is performing a denial-of-service attack, or that corporate secrets are being stolen.
Luckily, the IDS machine already has some scripts to determine if a given amount of bandwidth is unusual; all your program needs to do is report how many packets are contained in the input file, as well as the sum of the packet sizes, in bytes. Given this information, the external scripts will determine if there is an unusual amount of suspicious activity.
This task does not use the same output format as other questions. Your program should output the following line to the standard output stream after processing the pcap file:
Analyzed packet-count packets, size bytes
In your output, packet-count should be replaced with the number of packets contained in the pcap file, and size should be replaced with the sum of the packet sizes, in bytes. Note that you should not include the size of the pcap headers, which contain information such as timestamps, in your computation of size; only the sizes of the captured packets should be summed.
Sample input for testing: q1-anomaly.pcap
Sample output for testing: q1-anomaly-output.log
In this assignment, you may assume that all capture files provide the complete contents of every packet; no packets will be truncated (i.e., caplen == len for every packet in the file).
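One way to approach this task without any third-party library is to walk the classic pcap layout directly: a 24-byte global header followed by records, each with a 16-byte per-packet header. The sketch below assumes the classic pcap format (not pcapng); the names read_pcap and summarize are hypothetical, and the provided project skeletons may offer a different way to read packets.

```python
import struct


def read_pcap(data: bytes):
    """Yield (timestamp_seconds, packet_bytes) for each record of a classic pcap file."""
    magic = data[:4]
    # The magic number tells us the byte order the capturing machine used.
    if magic in (b"\xd4\xc3\xb2\xa1", b"\x4d\x3c\xb2\xa1"):
        endian = "<"
    elif magic in (b"\xa1\xb2\xc3\xd4", b"\xa1\xb2\x3c\x4d"):
        endian = ">"
    else:
        raise ValueError("not a classic pcap file")
    offset = 24  # skip the global header
    while offset + 16 <= len(data):
        # Per-packet header: ts_sec, ts_usec (or ts_nsec), caplen, len.
        ts_sec, _ts_frac, caplen, _length = struct.unpack(
            endian + "IIII", data[offset:offset + 16])
        offset += 16
        yield ts_sec, data[offset:offset + caplen]
        offset += caplen


def summarize(data: bytes) -> str:
    """Build the question 1 output line; pcap headers are not counted in the size."""
    count = 0
    size = 0
    for _ts, packet in read_pcap(data):
        count += 1
        size += len(packet)
    return f"Analyzed {count} packets, {size} bytes"
```

A caller would read the file named by the command-line argument in binary mode and print summarize(contents).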
2.2 [6 marks] Spoofed packets
Packets with spoofed sources or destinations are usually part of an attack, such as a distributed denial-of-service attack, and are unexpected in Initrode's network. In fact, this problem is common enough to prompt a best current practice entry from the Internet Engineering Task Force (BCP 38). Network Ingress Filtering restricts outgoing network traffic from invalid source IP addresses.
Your IDS can monitor all of the network traffic on the corporate LAN, where local computers are within the 10.0.0.0/8 IP range. Every packet visible to your IDS is expected to be coming from or traveling to one of these local machines. Write a rule for your IDS that detects packets that do not satisfy these constraints (i.e., packets that clearly must contain spoofed information).
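The constraint above amounts to a membership test on both addresses. A minimal sketch follows, assuming the source and destination have already been extracted as dotted-quad strings; the function name spoofed_alert is hypothetical, and the alert text follows the attack and details values given for this question.

```python
import ipaddress

LAN = ipaddress.ip_network("10.0.0.0/8")


def spoofed_alert(src: str, dst: str):
    """Return an alert line if neither endpoint is on the LAN, otherwise None."""
    if ipaddress.ip_address(src) in LAN or ipaddress.ip_address(dst) in LAN:
        return None  # at least one endpoint is a local machine
    return f"[Spoofed IP address]: src:{src}, dst:{dst}"
```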
Value for attack in your output: Spoofed IP address
Value for details in your output: src:source, dst:destination where source and destination are the IP addresses from the packet.
Sample input for testing: q2-spoofed.pcap
Expected output for testing: q2-spoofed-output.log
2.3 [6 marks] ARP Spoofing
How ARP works: On a local network, machines typically communicate by transmitting Ethernet packets. Ethernet packets are sent to and from MAC addresses that (in theory) uniquely identify particular networking hardware. The Ethernet frame wraps data for a higher-level protocol, such as the Internet Protocol (IP). IP packets are sent to and from IP addresses. However, since all IP information must be wrapped inside of Ethernet frames, sending an IP packet to another machine requires addressing the packet to a particular MAC address. To communicate with other machines on the LAN, each computer maintains a dynamic table in memory that maps IP addresses to MAC addresses. This table is populated using the Address Resolution Protocol (ARP).
ARP packets are wrapped inside Ethernet frames, just like any other packets sent through the LAN. When Alice wants to talk to Bob, Alice first sends an ARP packet to a special broadcast address asking for the MAC address for Bob. The network switch ensures that this packet is delivered to all machines on the LAN. When Bob's machine receives the packet, it responds with an ARP response packet indicating Bob's MAC address. Alice then updates her ARP table to map Bob's IP to his MAC address. All future packets sent from Alice to Bob can now be addressed to the proper MAC address.
ARP spoofing: ARP spoofing is an attack where a machine on the LAN maliciously manipulates ARP tables in order to insert itself as a man in the middle. Mallory can send an ARP response to Alice saying that Bob is located at Mallory's MAC address. She can then send another ARP response to Bob saying that Alice is located at Mallory's MAC address. Now all communications from Alice to Bob are covertly redirected through Mallory. If Bob is the gateway providing access to the Internet, then Mallory can effectively monitor and modify all of Alice's online communications.
Your IDS should detect possible ARP spoofing by maintaining its own mapping from IP addresses to MAC addresses. Whenever you observe an ARP reply packet, update your table to record the new mapping for the source address. If any existing entry in the table is ever changed (i.e., if a given IP address was previously mapped to an old MAC address A, and is now updated to a new MAC address B), then an alert should be raised. The network administrator should investigate these events in order to identify if the action was legitimate (e.g., if a user replaced a network card in their computer) or malicious.
For the purposes of this assignment, you only need to record mappings that appear as the source MAC/IPv4 address in ARP reply packets.
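The table described above can be kept in a dictionary keyed by IP address. The sketch below assumes raw Ethernet frames carrying ARP for IPv4 over Ethernet (EtherType 0x0806, opcode 2 for replies); the class name ArpMonitor is hypothetical, and the lowercase colon-separated MAC formatting is an assumption that should be checked against the sample output.

```python
import socket
import struct

ETHERTYPE_ARP = 0x0806
ARP_REPLY = 2


class ArpMonitor:
    """Tracks IP-to-MAC mappings seen in ARP replies and flags any change."""

    def __init__(self):
        self.table = {}

    def observe(self, frame: bytes):
        """Process one Ethernet frame; return an alert string or None."""
        if len(frame) < 42:  # 14-byte Ethernet header + 28-byte ARP body
            return None
        (ethertype,) = struct.unpack("!H", frame[12:14])
        if ethertype != ETHERTYPE_ARP:
            return None
        htype, ptype, hlen, plen, oper = struct.unpack("!HHBBH", frame[14:22])
        if (htype, ptype, hlen, plen) != (1, 0x0800, 6, 4) or oper != ARP_REPLY:
            return None
        # Only the sender (source) MAC/IPv4 pair of the reply is recorded.
        mac = frame[22:28].hex(":")
        ip = socket.inet_ntoa(frame[28:32])
        old = self.table.get(ip)
        self.table[ip] = mac
        if old is not None and old != mac:
            return f"[Potential ARP spoofing]: ip:{ip}, old:{old}, new:{mac}"
        return None
```

A single ArpMonitor instance would be fed every frame of the capture in order, printing whatever alerts observe returns.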
Value for attack in your output: Potential ARP spoofing
Value for details in your output: ip:ip, old:oldmac, new:newmac where ip is the IP address, oldmac is the previous MAC address, and newmac is the new MAC address.
Sample input for testing: q3-arp.pcap
Expected output for testing: q3-arp-output.log
2.4 [6 marks] Unauthorized servers
Many computer viruses allow the virus author to issue commands to the infected system over the Internet. A very simple approach for facilitating this remote control is to set up a server on the infected machine. The virus author can then connect to the server and issue commands. Initrode has a corporate policy that prohibits any remotely accessible servers on machines in the LAN, as they are intended to be workstations used by employees. (Here, the server does not need to be an application-level server, e.g. an HTTP server, but rather just a machine that accepts TCP connections via sockets.)
Write two rules for your IDS:
1. Detect when an external computer (i.e., one outside of the 10.0.0.0/8 IP range) attempts to connect to a server running within the LAN. These requests should be blocked by the firewall, so if they are visible to your IDS then this indicates that the firewall has failed or has been somehow subverted.
Value for attack in your output: Attempted server connection
2. Detect when a server running within the LAN accepts a connection from an external computer. Note that it is not necessary for the connection to be established; you should raise an alert as soon as a machine on the LAN expresses the intent to accept a connection from an external machine.
Value for attack in your output: Accepted server connection
In both cases, the value for details in your output should be:
rem:remote, srv:server, port:port where remote is the IP address of the external computer, server is the IP address of the LAN-based server, and port is the port that the external computer attempted to connect to.
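One possible reading of the two rules, stated here as an assumption rather than the required approach, is in terms of the TCP handshake: an inbound SYN without ACK is a connection attempt, and an outbound SYN+ACK is the LAN machine expressing intent to accept. The sketch below takes already-parsed header fields; the function name server_alert is hypothetical.

```python
import ipaddress

LAN = ipaddress.ip_network("10.0.0.0/8")
SYN = 0x02  # TCP flag bits
ACK = 0x10


def server_alert(src_ip, dst_ip, src_port, dst_port, flags):
    """Return an alert line for one TCP segment, or None (handshake-based assumption)."""
    src_local = ipaddress.ip_address(src_ip) in LAN
    dst_local = ipaddress.ip_address(dst_ip) in LAN
    syn = bool(flags & SYN)
    ack = bool(flags & ACK)
    if syn and not ack and not src_local and dst_local:
        # External machine opens a connection to a LAN machine.
        return (f"[Attempted server connection]: "
                f"rem:{src_ip}, srv:{dst_ip}, port:{dst_port}")
    if syn and ack and src_local and not dst_local:
        # LAN machine answers an external SYN; its source port is the server port.
        return (f"[Accepted server connection]: "
                f"rem:{dst_ip}, srv:{src_ip}, port:{src_port}")
    return None
```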
2.5 [6 marks] IIS worms
Several of the most iconic computer worms in history (Code Red, Code Red II, the Sadmind worm, and Nimda) all attacked Microsoft IIS web servers by using a directory traversal vulnerability caused by incorrect parsing of Unicode characters. By sending requests for maliciously crafted pages, the worms could cause the web servers to execute programs anywhere on the server's hard drive. Specifically, it was possible to cause the servers to execute the Windows command-line interpreter with arguments specifying a command to execute.
While these vulnerabilities were patched by Microsoft over a decade ago, abandoned and infected machines around the world continue to scan the Internet to this day, looking for potential targets. Initrode has noticed that one of their employees is constantly being infected by these worms due to carelessly downloading vintage video games from disreputable websites. Unfortunately, the employee in question is the CEO and founder of the company and cannot be reprimanded for political reasons. Instead, you will need to write an IDS rule to detect when their computer has been infected so that the IT department can quietly remove the worm.
More information about the vulnerability exploited by these worms is available from the SANS Institute.
Write a rule for your IDS that detects malicious web requests attempting to exploit these Unicode vulnerabilities. Note that you should not attempt to detect a specific worm such as Nimda; instead, you should detect any of these directory traversal attacks. Your solution for this task should examine each packet and perform the following steps:
1. Determine if the packet is an IP packet. If so, continue.
2. Determine if the packet is a TCP packet. If so, continue.
3. Determine if the packet is likely to contain an HTTP request (hint: check the destination port). If so, continue.
4. Parse the TCP packet contents to locate the page that has been requested from the server. If you found a web request, continue.
5. Check the page for the malicious unicode characters mentioned in the SANS article. If found, raise an alert.
Your solution should be able to detect all of the sensible examples provided in the table in the SANS article irrespective of the worm's payload. (As mentioned in the article, table entries 42, 44, 45, 47, 52, 53, 54, and 55 are not valid attacks and do not need to be detected. You should be able to detect all of the other table entries.) The sample file provided for this task contains all of these examples; your IDS should produce 62 alerts for this input, as in the sample output. Hint: many of the examples in the article use the same Unicode exploit pattern; you will only need to look for approximately 15 patterns to detect all of the valid cases.
You are not required to detect attacks that take place over TLS connections (i.e., HTTPS). [2] You should not attempt to match the payload of the worms (e.g., executing cmd.exe); instead, detect the Unicode attacks. Your IDS should detect attacks that are made via all valid HTTP 1.1 request types (GET, POST, HEAD, PUT, DELETE, and OPTIONS).
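Steps 4 and 5 can be sketched as two small helpers: one that pulls the requested page out of the HTTP request line, and one that searches it for a list of patterns. The function names are hypothetical, and the pattern list is deliberately left as a parameter because the actual patterns must come from the SANS article; the single pattern in the usage note is only a placeholder.

```python
HTTP_METHODS = (b"GET", b"POST", b"HEAD", b"PUT", b"DELETE", b"OPTIONS")


def requested_path(payload: bytes):
    """Return the request target from an HTTP request line, or None if absent."""
    line, _, _ = payload.partition(b"\r\n")
    parts = line.split(b" ")
    if len(parts) != 3:
        return None
    method, target, version = parts
    if method not in HTTP_METHODS or not version.startswith(b"HTTP/"):
        return None
    return target


def matches_any(path: bytes, patterns) -> bool:
    """Case-insensitive substring search of the path for each supplied pattern."""
    lowered = path.lower()
    return any(pattern.lower() in lowered for pattern in patterns)
```

For example, matches_any(requested_path(payload), [b"%c0%af"]) checks a TCP payload against one placeholder pattern; comparing in lowercase is a design choice so that hex digits match regardless of case.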
Value for attack in your output: Unicode IIS exploit
Value for details in your output: src:source, dst:destination where source and destination are the source and destination IP addresses from the packet in the pcap file.
2 If you were able to reliably detect attacks within encrypted web connections in the given setting, then you would have broken TLS. If you have broken TLS, please let us know!
Sample input for testing: q5-unicode.pcap
Expected output for testing: q5-unicode-output.log
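Steps 4 and 5 of this task can be sketched as follows, assuming that steps 1 to 3 have already produced the raw TCP payload as a bytes object. The function names requested_page and is_unicode_exploit are hypothetical, and SUSPICIOUS_PATTERNS holds only two well-known overlong UTF-8 encodings (of "/" and "\") as an illustrative subset; the complete list must be derived from the SANS article. Treating the percent-encoded hex digits as case-insensitive is also an assumption of this sketch.

```python
# Illustrative subset only (assumption): overlong encodings of '/' and '\'.
# The remaining patterns must be taken from the SANS article.
SUSPICIOUS_PATTERNS = (b"%c0%af", b"%c1%9c")
HTTP_METHODS = (b"GET", b"POST", b"HEAD", b"PUT", b"DELETE", b"OPTIONS")


def requested_page(tcp_payload):
    """Return the request target from an HTTP request line, or None."""
    # The request line is everything before the first CRLF.
    line, _, _ = tcp_payload.partition(b"\r\n")
    method, _, rest = line.partition(b" ")
    if method not in HTTP_METHODS:
        return None
    # The HTTP version is the last space-separated token on the line.
    target, _, version = rest.rpartition(b" ")
    if not target or not version.startswith(b"HTTP/"):
        return None
    return target


def is_unicode_exploit(tcp_payload):
    """True if the requested page contains one of the suspicious patterns."""
    page = requested_page(tcp_payload)
    if page is None:
        return False
    page = page.lower()  # hex digits in percent-encoding may be upper case
    return any(pattern in page for pattern in SUSPICIOUS_PATTERNS)
```

Because the check looks only at the requested page and not at the rest of the payload, it does not depend on what the worm tries to execute.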
2.6 [6 marks] Sinkhole Lookups
Since connections to servers running within the LAN are easily detectable, many viruses receive commands by making outbound connections to the Internet instead. For example, a virus might connect to a website operated by the virus author in order to download new commands. In order to dismantle botnets (networks of infected machines), Internet authorities will often collaborate to disable these command and control servers by seizing control of the domain names and redirecting them to harmless IP addresses. These harmless servers, called sinkholes, do not return any commands to the infected machines, effectively disabling the botnet. They can also log the incoming connections in order to gauge the size of the former botnet, and possibly notify owners of infected machines.
In order to detect infected machines on the Initrode network, your IDS should identify DNS requests that resolve to known sinkhole IP addresses. Your IDS should read a list of IP addresses from sinkholes.txt, which will be located in the current working directory. Each line of this file contains the plaintext form of an IP address of a known sinkhole. If your IDS observes any DNS responses specifying any of these IP addresses as the A record for a domain, it should raise an alert.
You may assume that:
DNS takes place over UDP only.
DNS requests contain only a single request, and that the request is for an A record.
Output format:
Value for attack in your output: Sinkhole lookup
Value for details in your output: src:source, host:host, ip:ip where source is the source IP address of the machine performing the DNS query, host is the hostname being queried, and ip is the IP address of the sinkhole.
Sample input for testing: q6-sinkholes.pcap
Expected output for testing: q6-sinkholes-output.log
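The sinkhole check can be sketched as follows, assuming Python 3 and that the UDP payload of a port-53 packet has already been extracted as a bytes object. The helper names load_sinkholes, read_name, and sinkhole_answers are hypothetical. The sketch relies on the stated assumption that each DNS message carries exactly one question, and it reports the question name as the host. It does not guard against truncated or malformed messages, which a full solution would need to handle.

```python
import socket
import struct


def load_sinkholes(path="sinkholes.txt"):
    """Read one IP address per line into a set, skipping blank lines."""
    with open(path) as handle:
        return {line.strip() for line in handle if line.strip()}


def read_name(msg, offset):
    """Decode a (possibly compressed) DNS name; return (name, next_offset)."""
    labels = []
    next_offset = None  # where parsing resumes after the first pointer
    hops = 0
    while True:
        length = msg[offset]
        if length & 0xC0 == 0xC0:  # two-byte compression pointer
            pointer = struct.unpack("!H", msg[offset:offset + 2])[0] & 0x3FFF
            if next_offset is None:
                next_offset = offset + 2
            offset = pointer
            hops += 1
            if hops > 32:  # guard against pointer loops
                raise ValueError("DNS compression loop")
            continue
        offset += 1
        if length == 0:  # the root label terminates the name
            break
        labels.append(msg[offset:offset + length].decode("ascii", "replace"))
        offset += length
    if next_offset is None:
        next_offset = offset
    return ".".join(labels), next_offset


def sinkhole_answers(dns_msg, sinkholes):
    """Return [(host, ip)] for A records in a DNS response that are sinkholes."""
    if len(dns_msg) < 12:
        return []
    _ident, flags, qdcount, ancount, _ns, _ar = struct.unpack("!6H", dns_msg[:12])
    if not flags & 0x8000 or qdcount != 1:  # QR bit set means "response"
        return []
    host, offset = read_name(dns_msg, 12)
    offset += 4  # skip QTYPE and QCLASS
    hits = []
    for _index in range(ancount):
        _owner, offset = read_name(dns_msg, offset)
        rtype, rclass, _ttl, rdlength = struct.unpack(
            "!HHIH", dns_msg[offset:offset + 10])
        offset += 10
        rdata = dns_msg[offset:offset + rdlength]
        offset += rdlength
        if rtype == 1 and rclass == 1 and rdlength == 4:  # IN A record
            ip = socket.inet_ntoa(rdata)
            if ip in sinkholes:
                hits.append((host, ip))
    return hits
```

Since the alert is raised on the DNS response, the machine that performed the query is the destination of the packet being inspected; whether the alert should be raised once per response or once per matching record is best checked against the expected output file.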
2.7 [6 marks] NTP reflection DDoS attacks
A common way to attack the availability of an Internet resource is to perform a Distributed Denial-of-Service (DDoS) attack. In a DDoS attack, a large set of computers sends traffic to the victim machine as quickly as possible. The traffic in question might be packets full of meaningless data, or actual requests for the service that the victim provides. The large amount of bandwidth overwhelms the victim's service capacity, preventing it from responding to legitimate requests. In this way, attackers controlling large networks of infected machines (botnets) can remove others from the Internet. Common uses of DDoS attacks include politically-motivated attacks on websites, extortion of gambling websites before significant sporting events occur, and disconnecting players from online video games.
The Network Time Protocol (NTP) allows computers to synchronize their system clocks over the Internet. A few years ago, attackers discovered that many NTP servers support a command that returns a lot of data in response to a small query. Moreover, NTP requests are delivered using UDP packets, which, unlike TCP-based protocols, do not require a connection to be established. These factors allow innocent NTP servers to be used to launch amplified DDoS attacks.
To perform the attack, the attacker sends a MON_GETLIST_1 request to multiple NTP servers. The UDP packets containing these requests have a forged source address: they appear to originate from the victim of the attack. The NTP servers then dutifully send lists of their last 600 clients to the victim, which they believe to be the source of the requests. These responses are typically 50 times larger than the request, resulting in a massive amplification of bandwidth and thus a more powerful DDoS attack.
Note that these attacks would be identified by your rule that detects packets with obviously spoofed source addresses. Nonetheless, it is often useful to have your IDS produce more specific alerts. Initrode has no need to allow outgoing MON_GETLIST_1 requests, so the presence of one indicates that there is likely an infected machine on the LAN. When observing such a request, your IDS should output an NTP DDoS alert (in addition to the spoofed packet alert introduced earlier in the assignment).
Value for attack in your output: NTP DDoS
Value for details in your output: vic:victim, srv:server where victim is the IP address of the intended victim and server is the IP address of the NTP server.
Sample input for testing: q7-ntp.pcap
Expected output for testing: q7-ntp-output.log
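One way to recognize a MON_GETLIST_1 request is sketched below, assuming the UDP destination port and payload have already been extracted. The sketch relies on the layout commonly documented for ntpd private (mode 7) messages: the low three bits of the first byte hold the mode, the top bit marks a response, and the fourth byte holds the request code, with 42 corresponding to MON_GETLIST_1. These constants are stated here as assumptions and should be confirmed against the sample capture in Wireshark. The function name is_monlist_request is hypothetical.

```python
NTP_PORT = 123
MODE_PRIVATE = 7        # NTP mode 7: implementation-specific private messages
REQ_MON_GETLIST_1 = 42  # assumed request code for MON_GETLIST_1 in ntpd


def is_monlist_request(udp_dst_port, udp_payload):
    """True if the UDP payload looks like an NTP MON_GETLIST_1 request."""
    if udp_dst_port != NTP_PORT or len(udp_payload) < 4:
        return False
    header = bytearray(udp_payload[:4])  # integer access on Python 2 and 3
    is_response = bool(header[0] & 0x80)
    mode = header[0] & 0x07
    request_code = header[3]
    return (not is_response and mode == MODE_PRIVATE
            and request_code == REQ_MON_GETLIST_1)
```

For a matching packet, the victim is the (forged) source IP address and the server is the destination IP address of the request.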
2.8 [5 marks] Output requirements
The output format for question 1 is:
Analyzed packet-count packets, size bytes
The output format for questions 2 to 7 is:
[attack]: details
where both attack and details should be replaced with the information specified in the respective question.
Ensure that your output conforms to the given format. In particular, ensure that there are no differences between your output and the expected output for the sample files. Read the expected alert formats carefully. Common mistakes that students have made in the past are:
Incorrect labels (e.g., writing server instead of srv)
Omitting commas and/or spaces between alert details
Inserting spaces around, or completely omitting, the colons
Since we mark your output using an automated testing suite, simple discrepancies in your output tend to lead to incorrect zero marks, needless remark requests, and loss of marks for the output requirements. Be sure to double check your outputs carefully to avoid this hassle!
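Building every alert through a single helper reduces the risk of the formatting mistakes listed above. The following is a minimal sketch; the name format_alert is hypothetical, and the sketch assumes that the square brackets in the format are literal characters, which should be confirmed against the expected output files.

```python
def format_alert(attack, details):
    """Format an alert line from an attack name and (label, value) pairs.

    details is an ordered list of pairs so that the label order is fixed.
    """
    body = ", ".join("{}:{}".format(label, value) for label, value in details)
    return "[{}]: {}".format(attack, body)


if __name__ == "__main__":
    print(format_alert("NTP DDoS", [("vic", "10.0.0.5"), ("srv", "192.0.2.1")]))
```

A list of pairs is used instead of keyword arguments because keyword argument order is not guaranteed in the Python versions available in the marking environment.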
2.9 Testing and Marking
Your program must be called ids. This single program should detect all of the different attacks. It will be invoked in the following manner:
ids /path/to/capture.pcap
For each attack, a file containing an example of the attack has been provided. To test your IDS, you can provide the path of the sample file as the first argument to your program. Do not attempt to test your IDS by running actual attacks against systems that you do not own. Each of the sample files comes with a text file containing the expected output. To ensure that your output is the expected one for the problem, you can compare the attacks that your program detected against the expected attacks using diff:
ids sample-file.pcap | diff expected-output-file.log -
If your IDS is working properly, the output from this command should be empty. However, you should ensure that you are identifying attacks using the requested methodology (e.g., hard-coding output for the sample files is not a valid solution). Upon submission, your program will be tested using additional sample files. Your IDS should avoid raising false positives as we will be testing it to ensure that it does not issue spurious alerts. Your IDS should not require Internet access to download any libraries or complete any tasks; it is important that network-based IDS software does not reveal its presence to attackers under any circumstances. For this reason, Initrode's network switch executes your IDS inside a virtual machine that does not have any network connectivity.
For marking, we will compile and execute your IDS in a virtual machine with no access to the Internet. The following steps will be performed:
1. Your submission files will be extracted into the (initially empty) current working directory.
2. If Makefile is found, then make will be executed to compile your code.
3. sinkholes.txt will be copied into the current working directory, overwriting any existing version.
4. (For students writing their solutions in Go) If there are bin, pkg, and src subdirectories in the submission directory, and at least one .go file is found in the submission, then gopacket will be copied into src and go install will be executed for the package.
5. ./ids path will be executed as a non-root user, where path is the absolute path to a pcap-format packet capture file.
6. The output of step 5 will be compared to the expected output for the test case.
7. If there are more tests to run, go to step 3. Otherwise, derive a mark based on the program outputs.
We will provide you with access to a system where you can submit your files to ensure that they will compile successfully in the marking environment. We will make an announcement through LEARN and/or email when this system becomes available. You will be expected to ensure that your code compiles in this environment before your final submission.
2.10 Programming Languages
You may implement your solution in any of the languages supported by our marking system, which include several of the most popular programming languages. You are expected to use libpcap to parse the contents of the pcap files. For C and C++, you can use libpcap directly. For other languages, designated wrapper libraries will be made available to you. While it is very natural to parse packet contents in C and C++ due to native libpcap access and pointer arithmetic, libraries for other languages may offer richer processing capabilities.
The marking system runs Ubuntu 16.04; your submission is expected to operate in this environment. The following table enumerates the supported languages, the available pcap libraries in the marking environment, and, for interpreted languages, the shebang (the line starting with #!) that you should include as the first line of your source file:
Language   Version     Shebang                   pcap Library
C          gcc 5.4.0   (Makefile)                libpcap
C++        g++ 5.4.0   (Makefile)                libpcap
Go         1.6.2       (Makefile)                gopacket
Python     2.7.12      #!/usr/bin/env python     Scapy & dpkt
Python     3.5.2       #!/usr/bin/env python3    Scapy-python3
You may not use any third-party libraries other than the pcap libraries explicitly mentioned in the table, and the standard libraries for your chosen language. We do not guarantee support for parsing libraries in other languages within our marking environment. (In line with the testing and marking procedures, your IDS should not require Internet access to download any libraries or to complete any tasks as your code will be executed inside a virtual machine that does not have any network connectivity.)
On LEARN and/or Piazza, we have provided project skeletons for each of the supported languages; these skeletons are known to compile in the marking environment. We highly recommend that you use these files as the base for your implementation.
For solutions implemented in Go, your submission is extracted into the $GOPATH. Consequently, you will need to have bin, pkg, and src subdirectories inside your submission. However, you will still need to include a Makefile that copies your final executable to $GOPATH/ids. The provided skeleton accomplishes all of this. It places your code in src/ids/. If the marking script detects that your submission resembles a GOPATH, then it will copy and install the gopacket library within it before running your code. This means that you do not need to include gopacket with your submission.
2.11 Background & Hints
Writing solutions: To solve the problems in the programming part, you may find it helpful to follow these general steps:
1. Determine what the question is asking you to detect. What networking protocols are involved? What do these protocols accomplish in general, and how do they work at a high level? The short textbook sections mentioned in the references will briefly describe each of the protocols discussed above.
2. Determine what information you will need to check in the packet to identify the attack. (You may assume that all servers listen on the default ports for their protocols.) What fields will you need to check? What protocol layer contains the relevant fields? How can you determine the byte offset for the field? Examine the sample files and locate information on the web to help you. Installing Wireshark, as discussed below, will help you process the capture files manually and understand their structure. The Wikipedia links will be handy references for the structure of packets while you code your parser.
3. Write your implementation and test its correctness using the given sample files. Did you detect the attack in the sample file? Is your solution likely to produce false positives (i.e., is your IDS likely to encounter packets that would trigger your alert but are not attacks)?
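To illustrate how byte offsets follow from the header length fields, here is a minimal Python 3 sketch that walks from a raw frame down to the transport-layer payload. It assumes an Ethernet link layer without VLAN tags and IPv4 only; the function name locate_payload and its return shape are hypothetical, and the wrapper libraries listed above can perform this parsing for you.

```python
import struct

ETH_HEADER_LEN = 14
ETHERTYPE_IPV4 = 0x0800
PROTO_TCP, PROTO_UDP = 6, 17


def locate_payload(frame):
    """Return (protocol, src_port, dst_port, payload) for IPv4 TCP/UDP.

    Returns None for anything else (e.g., ARP or truncated frames).
    """
    if len(frame) < ETH_HEADER_LEN + 20:
        return None
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ETHERTYPE_IPV4:
        return None
    ip = ETH_HEADER_LEN
    if frame[ip] >> 4 != 4:  # IP version field
        return None
    ip_header_len = (frame[ip] & 0x0F) * 4  # IHL counts 32-bit words
    (total_length,) = struct.unpack("!H", frame[ip + 2:ip + 4])
    protocol = frame[ip + 9]
    l4 = ip + ip_header_len       # start of the TCP/UDP header
    end = ip + total_length       # excludes any Ethernet padding
    if protocol == PROTO_UDP and len(frame) >= l4 + 8:
        src_port, dst_port = struct.unpack("!HH", frame[l4:l4 + 4])
        return "udp", src_port, dst_port, frame[l4 + 8:end]
    if protocol == PROTO_TCP and len(frame) >= l4 + 20:
        src_port, dst_port = struct.unpack("!HH", frame[l4:l4 + 4])
        data_offset = (frame[l4 + 12] >> 4) * 4  # TCP header length in bytes
        return "tcp", src_port, dst_port, frame[l4 + data_offset:end]
    return None
```

Both the IP header and the TCP header have variable lengths, so the payload offset must be computed from the IHL and data offset fields instead of being hard-coded.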
References:
ARP: A quick animation of how ARP works. Wikipedia: structure of ARP packets. Section 11.5 of the van Oorschot textbook: ARP and ARP spoofing descriptions. Figure 11.8 illustrates an ARP spoofing attack.
IPv4: Wikipedia. Section 10.6: Figure 10.14 shows an IP packet encapsulating a TCP segment/UDP datagram.
TCP: Wikipedia: TCP server and clients. Section 10.6 of the van Oorschot textbook: TCP header, TCP connection set-up, and Figure 10.15 for the TCP header structure. Section 11.6: Figure 11.9 illustrates the TCP three-way handshake. Wikipedia: TCP segment structure.
UDP: UDP datagram structure.
DNS: Section 11.5 of the van Oorschot textbook: DNS description and DNS resolution example. Wikipedia: message format and protocol transport.
NTP: Overview of the amplification-based DDoS. Structure of NTP packets and detecting the attack.
Wireshark: Wireshark is a graphical application that can read the packet captures in the sample files. Using this tool, you can browse the packets in the files and examine their contents. For each packet, Wireshark will show you the contents of the various protocol layers. It will also interpret the fields of each layer and highlight the bytes that correspond to each field. Using Wireshark, in combination with the references, you can discover the fields that your program will need to parse.
Note that integers within packets are stored in network order, which typically means big-endian format: the most significant bytes come first. On most hosts, this is the opposite of how integers are stored in memory. Depending on your language and library of choice, you may need to swap the byte orders before processing integers within your program. If you are writing in C or C++, you should use the ntoh family of functions. You may also find inet_ntop to be useful.
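For Python solutions, the standard struct and socket modules play the role of the ntoh family and inet_ntop. The short sketch below shows the difference that byte order makes for a 16-bit port field; the helper names read_port and read_ipv4 are hypothetical.

```python
import socket
import struct


def read_port(raw):
    """Decode a 2-byte field in network (big-endian) order."""
    (value,) = struct.unpack("!H", raw)  # "!" selects network byte order
    return value


def read_ipv4(raw):
    """Convert a 4-byte IPv4 address to dotted-quad text."""
    return socket.inet_ntoa(raw)


if __name__ == "__main__":
    raw_port = b"\x00\x50"
    print(read_port(raw_port))                # 80
    print(struct.unpack("<H", raw_port)[0])   # 20480 if misread as little-endian
    print(read_ipv4(b"\xc0\x00\x02\x01"))     # 192.0.2.1
```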