The goals of this project:
Additional information:
Accessing project resources:
Setup (0 points)
Please note that the file is over 9GB, so it will take some time to download. Do not wait until the last minute to download it. Do it right away!
When you begin each phase, make sure to change into the directory for that part of the project.
Phase 1 (50 points):
You will investigate and label some of the more sophisticated malware behaviors from the five malware reports we provided. Use the included JoeSandbox reports to identify each malware sample's behavior. Note that malware samples can share behaviors, so initially you should assume that each sample we ask about below has every behavior listed. It's your job to determine whether that assumption is actually true.
Hint: Look at the API/system call sequence under each process generated by the malware sample and determine what the malware is doing. Note that each JoeSandbox report may contain multiple processes with many different system call sequences. If any of the behaviors are seen (or attempted, but not necessarily successful) in any process in the report, then that malware has attempted that behavior. This is, of course, not completely practical, as legitimate applications may perform the same actions in a benign fashion. We are not concerned with differentiating the two in this assignment, but it is some food for thought.
Clarification for “attempted”: by “attempted” we mean that a specific action was tried but failed. By “specific” we mean that it is clear which action was attempted. For instance, if the malware tries to change a registry key that is unambiguous (say, one used only to set a startup option) but fails to change it, that is an attempt for our purposes. But if the registry key is a more generic one that governs multiple settings, we cannot tell for sure which setting or settings it is targeting, so the action would not count as an “attempt”.
You will notice that the same API function name can end with either a W or an A. This is standard practice in the Windows API (either variant could in theory be present in the wild), and this document explains the difference: https://docs.microsoft.com/en-us/windows/desktop/intl/unicode-in-the-windows-api
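When comparing call sequences across reports, it can help to treat the A (ANSI) and W (wide-character, Unicode) variants of a function as the same call. The following is a minimal sketch of that idea; the helper names `base_api_name` and `same_api` are hypothetical, and the suffix rule is only a heuristic that may misclassify a function whose real name happens to end in an uppercase A or W.

```python
import re

# Heuristic: a trailing uppercase "A" or "W" that directly follows a lowercase
# letter or digit is treated as the ANSI/wide suffix and removed.
_SUFFIX = re.compile(r"(?<=[a-z0-9])[AW]$")


def base_api_name(name: str) -> str:
    """Return the API name without its A/W encoding suffix (heuristic)."""
    return _SUFFIX.sub("", name)


def same_api(first: str, second: str) -> bool:
    """True when two names differ at most by their A/W suffix."""
    return base_api_name(first) == base_api_name(second)


# Illustrative call names, not taken from any of the provided reports.
calls = ["RegSetValueExW", "RegSetValueExA", "CreateFileW", "ShowWindow", "VirtualAlloc"]
print(sorted({base_api_name(call) for call in calls}))
```

Names such as ShowWindow are left untouched because they end in a lowercase letter.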
For each of the following questions, mark (true/false) which of the malware samples exhibit the identified behavior:
DELIVERABLE: Your deliverable for this part of the assignment will be your final JSON file with your answers to the 20 questions.
Download the submission template or use the JSON format below for your answers:
{
  "sample1": {
    "behavior01": ,
    "behavior02": ,
    "behavior03": ,
    "behavior04": ,
    "behavior05": ,
    "behavior06": ,
    "behavior07": ,
    "behavior08": ,
    "behavior09": ,
    "behavior10": ,
    "behavior11": ,
    "behavior12": ,
    "behavior13": ,
    "behavior14": ,
    "behavior15": ,
    "behavior16": ,
    "behavior17": ,
    "behavior18": ,
    "behavior19": ,
    "behavior20":
  },
  "sample2": {
    "behavior01": ,
    "behavior02": ,
    "behavior03": ,
    "behavior04": ,
    "behavior05": ,
    "behavior06": ,
    "behavior07": ,
    "behavior08": ,
    "behavior09": ,
    "behavior10": ,
    "behavior11": ,
    "behavior12": ,
    "behavior13": ,
    "behavior14": ,
    "behavior15": ,
    "behavior16": ,
    "behavior17": ,
    "behavior18": ,
    "behavior19": ,
    "behavior20":
  },
  "sample3": {
    "behavior01": ,
    "behavior02": ,
    "behavior03": ,
    "behavior04": ,
    "behavior05": ,
    "behavior06": ,
    "behavior07": ,
    "behavior08": ,
    "behavior09": ,
    "behavior10": ,
    "behavior11": ,
    "behavior12": ,
    "behavior13": ,
    "behavior14": ,
    "behavior15": ,
    "behavior16": ,
    "behavior17": ,
    "behavior18": ,
    "behavior19": ,
    "behavior20":
  },
  "sample4": {
    "behavior01": ,
    "behavior02": ,
    "behavior03": ,
    "behavior04": ,
    "behavior05": ,
    "behavior06": ,
    "behavior07": ,
    "behavior08": ,
    "behavior09": ,
    "behavior10": ,
    "behavior11": ,
    "behavior12": ,
    "behavior13": ,
    "behavior14": ,
    "behavior15": ,
    "behavior16": ,
    "behavior17": ,
    "behavior18": ,
    "behavior19": ,
    "behavior20":
  },
  "sample5": {
    "behavior01": ,
    "behavior02": ,
    "behavior03": ,
    "behavior04": ,
    "behavior05": ,
    "behavior06": ,
    "behavior07": ,
    "behavior08": ,
    "behavior09": ,
    "behavior10": ,
    "behavior11": ,
    "behavior12": ,
    "behavior13": ,
    "behavior14": ,
    "behavior15": ,
    "behavior16": ,
    "behavior17": ,
    "behavior18": ,
    "behavior19": ,
    "behavior20":
  }
}
The submitted answers should be in the format (this is an example only):
{
  "sample1": {
    "behavior01": true,
    "behavior02": false,
    "behavior03": true,
    "behavior04": true,
    .
    .
    .
}
The name of the submission file is not important, as long as the file is valid JSON (“submission.json” is an example). Incorrectly formatted JSON files, or files with typos that cause the submission to fail, still count as a submission attempt. We have provided a validation script named “json_validator.py” that checks your file for proper formatting. To run the validator on your file, use the following command at the command line in the /home/malware directory: “python json_validator.py /path/to/solution.json”. The validator will either return “JSON file correctly formatted.” if the submission file is correct, or return the errors found. Using the validation script is not required, but it is highly recommended to prevent erroneous submissions. We will not provide extra submission attempts. This validation script works only for Phase 1 and the Extra Credit portions of the project.
For Phase 1, you will have 5 attempts to submit your answers. Improperly formatted JSON files will fail and count as a submission. If you make more submissions than the limit, any submission past the fifth will receive a grade of ZERO. You must fill in “true” or “false” for all 100 behaviors (5 samples x 20 behaviors) or the submission will fail and count as one submission attempt. You will want to manually select the best of your first 5 submissions in Gradescope, but this MUST be done BEFORE the project deadline. No late submissions or requests to update the submission will be accepted after the project deadline. Please submit your answers as a JSON file to the Gradescope assignment Project Malware Analysis – Phase I.
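Because every failed upload uses up an attempt, a quick local check of the 5 x 20 structure can be worthwhile before submitting. The sketch below is not the provided json_validator.py and makes no claim about what the autograder checks beyond the requirements stated above; `find_problems` and `check_file` are hypothetical helper names.

```python
import json

SAMPLES = [f"sample{i}" for i in range(1, 6)]          # sample1 .. sample5
BEHAVIORS = [f"behavior{i:02d}" for i in range(1, 21)]  # behavior01 .. behavior20


def find_problems(answers):
    """Return a list of human-readable problems; an empty list means none found."""
    if not isinstance(answers, dict):
        return ["top level must be a JSON object"]
    problems = []
    for sample in SAMPLES:
        entry = answers.get(sample)
        if not isinstance(entry, dict):
            problems.append(f"{sample}: missing or not an object")
            continue
        for behavior in BEHAVIORS:
            # Only the JSON literals true/false are accepted, not strings.
            if not isinstance(entry.get(behavior), bool):
                problems.append(f"{sample}.{behavior}: must be true or false")
    return problems


def check_file(path):
    """Load a submission file and report problems; raises if it is not valid JSON."""
    with open(path, encoding="utf-8") as handle:
        return find_problems(json.load(handle))


# Illustrative draft with one deliberate mistake: a string instead of a boolean.
draft = {sample: {behavior: False for behavior in BEHAVIORS} for sample in SAMPLES}
draft["sample3"]["behavior07"] = "true"
print(find_problems(draft))
```

Running the provided validator afterwards is still advisable, since it is the script the course staff supplied for this purpose.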
Phase 2 (50 points)
For this phase, we will be going over some of the basic concepts of malware analysis. None of the samples or scripts provided here are actually malicious, but they are provided as a way to understand the basic concepts of static and dynamic analysis.
To do so, we will de-obfuscate and execute the provided samples as needed to understand how they function. The overall goal of each task is to run the program, or call the correct endpoint with the correct data, to get your flag to send to the autograder.
NOTE:
When handling actual malware, additional due diligence is needed to ensure that you don’t accidentally infect your own machine or other machines on your network. The process for setting up such an environment is outside the scope of this project, but you can find many helpful resources online, along with CS6747: “Advanced Malware Analysis”, if you wish to continue your studies on your own. There are no actually malicious samples in the VM.
To get started we will work through a number of simple scripts to understand some basics about deobfuscation that will be helpful in later exercises. Malware authors will often obfuscate their payloads through various means to attempt to bypass IPS and AV systems, as well as to increase the effort required by analysts to contain and remediate a breach. Understanding some of these techniques will be important when we go to analyze some of the other samples in this project.
These are some basic concepts of static analysis, and the same techniques are often used by malware authors and red team operators (penetration testers) in their work. Each of these warm-ups provides a script for you to execute with your GTID; if you do so correctly, you get a flag. De-obfuscate the samples below and execute them to get your flag.
We saw this sample come in earlier. It performs some simple encoding to execute the command. It looks like it spits out a flag, but we aren’t totally sure.
Can you figure out how to get your flag?
base64 -d <<< IyEgL3Vzci9iaW4vc2gKYTEoKXsKICBlY2hvICJPaCBsb29rLCB0aGlzIGlzIGRlZmluaXRlbHkgYSBmbGFnOiAkKGVj
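Before executing anything recovered from an encoded blob, it is usually worth decoding it to text and reading it first. The following is a minimal sketch of that step in Python; `decode_for_review` is a hypothetical helper, and the example script is made up for illustration rather than taken from the sample above.

```python
import base64


def decode_for_review(blob: str) -> str:
    """Decode a Base64 string to text so it can be read before being run."""
    compact = "".join(blob.split())  # drop line breaks and stray whitespace
    raw = base64.b64decode(compact, validate=True)
    # errors="replace" keeps the output printable even if the payload is binary.
    return raw.decode("utf-8", errors="replace")


# Illustrative payload only: a harmless two-line shell script.
example_script = '#!/bin/sh\necho "hello from a decoded script"\n'
example_blob = base64.b64encode(example_script.encode("utf-8")).decode("ascii")

print(decode_for_review(example_blob))
```

If the decoded text contains many replacement characters, the payload is probably binary (for example, compressed data) rather than a script.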
Great job on the last one. This one is a little less straightforward though. The attacker left this long string behind. We think that they were trying to pack something in this string by compressing it, but we aren’t sure what.
Can you figure out what is going on here?
Hints:
N3q8ryccAASoqGIr+RMAAAAAAAAVAAAAAAAAAH2lL03gE50TlF0AKBK8YCl3X7OgZocDYaJosK2umXg2E4a5Nb0ICtCLgteu5MrmbJIpgu
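One common way to find out how a blob was packed is to decode it and compare its first bytes against well-known file signatures. The sketch below assumes the string is Base64 and checks only a handful of formats; `guess_container` is a hypothetical helper and the signature table is intentionally incomplete.

```python
import base64
import gzip

# A few well-known file signatures ("magic bytes"); not an exhaustive list.
MAGIC = {
    bytes.fromhex("1f8b"): "gzip",
    b"BZh": "bzip2",
    bytes.fromhex("fd377a585a00"): "xz",
    bytes.fromhex("377abcaf271c"): "7z",
    b"PK\x03\x04": "zip",
}


def guess_container(blob: str) -> str:
    """Guess the archive or compression format of a Base64-encoded blob."""
    text = "".join(blob.split())
    # Trim to a multiple of 4 characters so a truncated string still decodes
    # far enough to expose its header.
    text = text[: len(text) - len(text) % 4]
    raw = base64.b64decode(text)
    for magic, name in MAGIC.items():
        if raw.startswith(magic):
            return name
    return "unknown"


# Illustrative data: compress a short message, then encode it.
packed = base64.b64encode(gzip.compress(b"illustrative payload")).decode("ascii")
print(guess_container(packed))
```

Once the format is known, the matching tool or library can be used to unpack the decoded bytes.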
One more to go! This long string of text was left in another file on the system. Can you figure out what is going on?
These samples are set up to roughly approximate some Command and Control (C2) traffic between the client samples and the server we will run. To perform this analysis, you will start the server container, and then you will execute the client scripts to see what actions they perform.
Much of the dynamic network analysis can be performed with Wireshark, and some additional static analysis work may need to be done to look at the samples and what they are executing.
Additionally, you will need to craft your own requests to send to the C2 server to get your flag. You are welcome to do this using cURL, python, or whatever other HTTP request program you like to use. To get your flag, you will need to send a request to the correct endpoint followed by your GTID.
Example provided below: http://localhost:8085/path/to/endpoint/9999999999
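For those who prefer Python over cURL, a request of this shape can be sketched with the standard library alone. The example assumes a plain GET request and reuses the placeholder path and GTID from the example URL; the real endpoint, method, and any required headers or data have to come from your own analysis. `build_flag_url` and `fetch` are hypothetical helper names.

```python
import urllib.parse
import urllib.request


def build_flag_url(endpoint_path, gtid, base="http://localhost:8085"):
    """Join the server base, an endpoint path, and a GTID into one URL."""
    path = endpoint_path.strip("/")
    return f"{base}/{path}/{urllib.parse.quote(str(gtid), safe='')}"


def fetch(url, timeout=10.0):
    """Send a GET request and return the response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Placeholder values copied from the example URL, not a real endpoint.
url = build_flag_url("path/to/endpoint", "9999999999")
print(url)
# body = fetch(url)  # only works while the server container is running
```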
Once you analyze the samples and submit a successful request to get your flag, you’ll receive a JSON message that looks something like the following:
{
  "flag": "Now that's a flag: <your flag value will be here>"
}
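A response of this shape can be parsed with the standard `json` module. The sketch below only pulls out the string stored under "flag"; whether the submission expects the whole string or just the value after the prefix is not specified here, so that choice is left open. `extract_flag` is a hypothetical helper.

```python
import json


def extract_flag(body):
    """Return the string stored under "flag", or None if it is absent."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return None  # the server answered with something other than JSON
    flag = data.get("flag") if isinstance(data, dict) else None
    return flag if isinstance(flag, str) else None


# Illustrative response body mirroring the message format shown above.
sample_body = '{"flag": "Now that\'s a flag: <your flag value will be here>"}'
print(extract_flag(sample_body))
```

A return value of None is a hint that the request hit the wrong endpoint or was sent with the wrong data.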
This is a simple example to get started and make sure that you have all of your pieces set up correctly to capture the traffic between the client and server.
In this sample, the initial client-1 program acts as the first stage of the malware sample. Your goal is as follows:
In this sample, the client-2 program makes a couple of calls and performs some familiar obfuscation techniques. Perform the following steps:
Submit your flags in Gradescope as a JSON file named ‘phase2.json’ with the following format:
{
  "warmup1": "replace_the_placeholder_flag",
  "warmup2": "replace_the_placeholder_flag",
  "warmup3": "replace_the_placeholder_flag",
  "client0": "replace_the_placeholder_flag",
  "client1": "replace_the_placeholder_flag",
  "client2": "replace_the_placeholder_flag"
}
You can also use the provided template file to build your submission.
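If you would rather generate the file than edit it by hand, the following sketch builds the six-key structure shown above and refuses to produce output while any placeholder is still in place. `build_submission` and `write_submission` are hypothetical helpers, and the flag values in the demo are made up.

```python
import json

KEYS = ["warmup1", "warmup2", "warmup3", "client0", "client1", "client2"]
PLACEHOLDER = "replace_the_placeholder_flag"


def build_submission(flags):
    """Return the submission as JSON text, or raise if a flag is missing."""
    missing = [key for key in KEYS if not flags.get(key) or flags[key] == PLACEHOLDER]
    if missing:
        raise ValueError("flags still missing for: " + ", ".join(missing))
    return json.dumps({key: flags[key] for key in KEYS}, indent=2)


def write_submission(flags, path="phase2.json"):
    """Write the submission to disk using straight-quoted, valid JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(build_submission(flags) + "\n")


# Made-up values for illustration; real flags come from the exercises.
demo = {key: f"example-flag-{index}" for index, key in enumerate(KEYS, start=1)}
print(build_submission(demo))
```

Generating the file this way also avoids the curly quotes that some editors insert, which would make the JSON invalid.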