[SOLVED] CS6035 Malware Analysis Project Solved

File Name: CS6035_Malware_Analysis_Project_Solved.zip
File Size: 357.96 KB

VM download link: https://cs6035.s3.amazonaws.com/cs6035_p2_p3_p4_spring2023.ova

Please note that the file is over 13 GB, so it will take some time to download. Do not wait until the last minute to download it. Do it right away!

SHA-256 for this file:
6C1119C82B18B5C2CB87D52D708FAA9CFDE80D319E3C44C2D9A36C9D603AA859

You will investigate and label some of the more sophisticated malware behaviors from the five malware samples we provided. Use the included JoeSandbox reports to identify each malware's behavior. Note that malware samples can share behaviors, so initially you should assume that each malware we question you about below has every behavior listed. It's your job to determine whether that assumption is actually true.

Hint: Look at the API/system call sequence under each process generated by the malware sample and determine what the malware is doing. Note that each JoeSandbox report may contain multiple processes with many different system call sequences. If any of the behaviors is seen (or attempted, but not necessarily successful) in any process in the report, then that malware has attempted that behavior. This is, of course, not completely practical, as legitimate applications may perform the same actions in a benign fashion. We are not concerned with differentiating the two in this assignment, but it is some food for thought.

Clarification for "attempted": By "attempted" we mean that a specific action was attempted but failed. By "specific" we mean that it is clear which action was attempted. For instance, if the malware tries to change a registry key that is unambiguous (say, one used only to set a startup option) but fails to change it, that is an attempt for our purposes. But if it targets a more generic registry key that governs multiple settings, we don't know for sure which setting or settings it is attacking, and so the action would not count as an attempt.

You will find that the same API function name can end with either a W or an A.
This is standard practice in the Windows API, and this document explains the difference (either one could in theory be present in the wild): https://docs.microsoft.com/en-us/windows/desktop/intl/unicode-in-the-windows-api

For each of the following questions, mark which of the malware exhibit the identified behavior:

DELIVERABLE: Your deliverable for this part of the assignment will be your final JSON file with your answers to the 20 questions.

Download the submission template or use the JSON format below for your answers:

The submitted answers should be in the format (this is an example only):

The naming of the submission file is not important, as long as it is JSON (submission.json is an example). You will have 20 attempts to submit your answers. If you attempt to make more submissions than the limit, your grade will be a ZERO for this Phase. You will be able to choose your best submission of the 20 manually in Gradescope, but this MUST be done BEFORE the project deadline. No late submissions or requests to update the submission will be accepted after the project deadline. Please submit the answers in the JSON file in the Gradescope assignment "Project Malware Analysis Phase I".

In this phase you will learn how to apply Machine Learning concepts to malware classification. You'll be given a dataset of malware samples. Using Malheur, the software used for clustering malware in this project, you'll run an unsupervised clustering algorithm in order to classify them by behavior.

All other values can be changed in the configuration file. Refer to the malheur manual for specifics on each configuration parameter.

Each semester, many students are concerned about the value of ngram_len in the configuration file and how it relates to this project. The ngram_len parameter is one of the parameters that can be changed in the configuration file, and you may submit any value for this parameter.
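
As a rough illustration of what an n-gram length controls, the sketch below slides a window of ngram_len events over a sequence of API call names and counts the resulting n-grams. It is only a minimal sketch: the function name event_ngrams and the sample trace are hypothetical, and this is not how Malheur itself parses reports or embeds them in a vector space.

```python
from collections import Counter


def event_ngrams(events, ngram_len):
    """Return every contiguous run of ngram_len events as a tuple."""
    if ngram_len < 1:
        raise ValueError("ngram_len must be at least 1")
    return [
        tuple(events[i:i + ngram_len])
        for i in range(len(events) - ngram_len + 1)
    ]


# Hypothetical API call trace, invented for illustration only.
trace = ["CreateFileW", "WriteFile", "CloseHandle", "CreateFileW", "WriteFile"]

# With ngram_len = 1, the order of events is ignored entirely.
unigram_counts = Counter(event_ngrams(trace, 1))

# With ngram_len = 2, each feature records which call followed which.
bigram_counts = Counter(event_ngrams(trace, 2))

for ngram, count in bigram_counts.most_common():
    print(" -> ".join(ngram), count)
```

With a length of 1, two traces that make the same calls in a different order look identical; with longer n-grams, ordering starts to matter, at the cost of a larger and sparser feature space.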
The malheur manual states the following about ngram_len:

"This parameter specifies the length of n-grams. If the events in the reports are not sequential, this parameter should be set to 1. In all other cases, it determines the length of event sequences to be mapped to the vector space, so called n-grams." (Malheur manual)

The malware behavior is encoded by listing all API calls in sequential order (see "Understanding the dataset" below). Even so, if you get better performance with a parameter value other than the default provided to you, you may select that value.

Then, we combine the closest two clusters.

And repeat the process.

We do this until the two closest remaining clusters are fairly far away from one another. The exact minimum distance between the final clusters can usually be set as a parameter to the clustering algorithm. This gives us the clusters we were expecting:

0bc19b9304d5c409b9f480a9121c8c8abcef2f3a595ed6b2758daeb2d679b74a.dinwod
(Since the file extension for this malware sample is dinwod, it belongs to the malware family Dinwod.)
https://www.virustotal.com/gui/file/0bc19b9304d5c409b9f480a9121c8c8abcef2f3a595ed6b2758daeb2d679b74a

We have already provided the malheur binaries, datasets, and example configuration files. Once you are inside the Project 2 VM, just enter the avml directory on the VM desktop using a terminal:

$ cd /home/debian/Desktop/avml/

Now it's time to get familiar with malheur. As an initial warm-up exercise, let's learn some sample commands:

Cluster samples (training phase):
$ malheur -c config.mlw -o training.txt -vv cluster dataset/training/; head training.txt

Verify clustering (testing phase):
$ malheur -c config.mlw -o testing.txt -vv classify dataset/testing/; head testing.txt

Classify Project 2 samples (classification phase):
$ malheur -c config.mlw -o classify.txt -vv classify subjects/; head classify.txt

Goal 1 is a prerequisite for Goal 2.
Therefore, any results for Goal 2 are only considered valid if your configuration parameters also allow Goal 1 to be achieved.

Now that you have some experience analyzing malware, take a moment to read this brief reflection. For this project you used tools that do the analysis for you. In practice, entire teams of people are devoted to working on a single malware executable at a time: debugging it, disassembling it and studying its binary, and applying static analysis, dynamic analysis, and other techniques to thoroughly understand what the malware is doing. Luckily for you, it takes an enormous amount of time to develop malware analysis skills, so we don't require them for this project. However, to give you a sense of how much work this all takes, consider that antivirus companies receive somewhere on the order of 250,000 samples of (possible) malware every day. We had you analyze 5 binaries. Imagine the types of systems needed to handle this amount of malware and study it thoroughly enough within that day, because the next day they're going to receive 250,000 new samples. If a malware analysis engine is unable to analyze a piece of malware within a day, the analysts have already lost to the malware authors. Also consider that not all of the 250,000 samples will be malicious. According to "Prudent Practices for Designing Malware Experiments: Status Quo and Outlook" (Rossow et al., 2012), as many as 3-30% may be benign!

For another way to look at the scale problem in malware analysis, consider the paper "Needles in a Haystack: Mining Information from Public Dynamic Analysis Sandboxes for Malware Intelligence" (Graziano et al., 2015), where the authors discovered that notorious malware samples had actually been submitted months, even years, before the malware was detected and classified as malicious in the wild.

Remember, analyzing malware is a delicate and potentially dangerous act. Please be cautious and use good practices when analyzing malware in the future.
If you let malware run for too long, you may be contributing to the problem and may be contacted by the FBI (and/or other authorities) as a result of this unintentional malicious contribution. At Georgia Tech, researchers, professors, and graduate students are able to analyze malware in controlled environments and have been given permission by the research community to perform these analyses long-term. We make efforts to contact the general research community and Georgia Tech's OIT Department to inform them that we are running malware, so they won't raise red flags if they detect malicious activity coming out of our analysis servers.

For your curious mind:

There is disagreement in the malware research community as to what exactly qualifies as malicious activity. For example, some say that adware is a form of malware, while others do not. Can you think of arguments for either side? Let's take this kind of thinking one step further. As a thought experiment, ask yourself this: if a piece of software has malicious code contained within it, but the malicious code is never executed when it is run, is that software malicious, and should it be considered so? What if the malware author intentionally put in a buffer overflow vulnerability that allows someone to execute that malicious code? Then the only way of knowing the code can be executed is to exploit the malware. This seems like it would be a much more advanced form of trigger malware, doesn't it? Think of other tricks malware authors may employ to prevent researchers from discovering a malware's true intentions.

Be careful if you ever get your hands on malware source code. We always make sure we read and fully understand malware source code before we compile and run it. Remember, safety is the number one priority in malware analysis.

If you're interested in reading more about researching malware, we recommend you read The Art of Computer Virus Research and Defense by Peter Szor.
It's known in the research community as a must-read for those interested in studying malware.

References

Rossow, C., Dietrich, C. J., Grier, C., Kreibich, C., Paxson, V., Pohlmann, N., ... Steen, M. V. (2012). Prudent practices for designing malware experiments: Status quo and outlook. 2012 IEEE Symposium on Security and Privacy. doi:10.1109/sp.2012.14 http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6234405

Graziano, M., Canali, D., Bilge, L., Lanzi, A., & Balzarotti, D. (2015). Needles in a haystack: Mining information from public dynamic analysis sandboxes for malware intelligence. In J. Jung & T. Holz (Eds.), USENIX Security Symposium (pp. 1057-1072). USENIX Association. https://www.usenix.org/system/files/conference/usenixsecurity15/sec15-paper-graziano.pdf