{"id":549,"date":"2022-03-09T15:16:09","date_gmt":"2022-03-09T15:16:09","guid":{"rendered":"https:\/\/fde.cat\/index.php\/2022\/03\/09\/detecting-multithreaded-exfiltration-in-zeek\/"},"modified":"2022-03-09T15:16:09","modified_gmt":"2022-03-09T15:16:09","slug":"detecting-multithreaded-exfiltration-in-zeek","status":"publish","type":"post","link":"https:\/\/fde.cat\/index.php\/2022\/03\/09\/detecting-multithreaded-exfiltration-in-zeek\/","title":{"rendered":"Detecting Multithreaded Exfiltration in Zeek"},"content":{"rendered":"<p>In a stereotypical heist, the thief\u2019s \u201cgrab-and-go\u201d plan often takes one of two forms. In the first, the thief grabs all they can and makes a loud and fast exit with explosives visible from miles away. In the second, the thief slowly and quietly amasses a fortune by stealing a few dollars at a time. It happens so slowly and subtly that only the most astute guards ever notice. What if a heist can be executed with both speed and stealth? You can imagine my surprise when I discovered that a thief in cyberspace could smuggle valuable data out of a network with both speed and stealth by utilizing a technique invented in\u00a01968.<\/p>\n<p>Data exfiltration (or exfil for short) is the unauthorized movement of data outside of a network. When an organization is attacked by malicious actors in cyberspace, stealing and exfiltrating data\u200a\u2014\u200abusiness plans, intellectual property, or customer information\u200a\u2014\u200ais often their goal. Multithreading is a common technique in a computer\u2019s central processing unit that divides a large task into many smaller parts called threads. A task\u2019s threads are executed in parallel, greatly reducing the overall runtime of the task. 
When network exfiltration leverages the concurrency of multithreading, threat actors can evade many traditional network detection mechanisms.<\/p>\n<p>Additionally, there is a decreasing barrier to entry for multithreading as it is supported by many programming languages (such as Python) and can be implemented with only a few additional lines of code. Thus, the Salesforce security team designed a simple method for detecting multithreaded exfiltration. Implemented as a module for the Zeek (formerly Bro) network security monitor, the script generates a notification when exfiltration via multithreading occurs.<\/p>\n<p>We have open sourced the Zeek script for detecting multithreaded exfiltration here: <a href=\"https:\/\/github.com\/salesforce\/multithreaded-exfil-detection\/\">https:\/\/github.com\/salesforce\/multithreaded-exfil-detection\/<\/a><\/p>\n<h3>Intro to Multithreading<\/h3>\n<p>If you are not familiar with multiprogramming concepts, here is a quick\u00a0review:<\/p>\n<p>A thread is a sequence of instructions within a program that can be executed independently of other code. For simplicity, you can assume that a thread is simply a subset of a process. Multithreading is defined as the ability of a processor to execute multiple threads concurrently.<\/p>\n<p>In a simple, single-core CPU, multithreading is achieved by frequently switching between threads. This is called \u201ccontext switching.\u201d In context switching, the state of a thread is saved and the state of another thread is loaded whenever an interrupt occurs. Context switching takes place so frequently that all the threads appear to be running in parallel.<\/p>\n<p>Threading can also be applied to networking tasks, including exfiltration. 
When threading is used to send data over a network, multiple connections and data streams concurrently send data over the\u00a0network.<\/p>\n<h3>Exfiltration\u200a\u2014\u200aThe Story so\u00a0Far<\/h3>\n<p>To understand why multithreaded exfiltration might evade standard detection logic, it is necessary to give a brief overview of the history of the hunt for network-based exfiltration. When a threat actor\u2019s goal is to exfiltrate data out of a corporate network to their command-and-control (C2) server, there are two main options: fast and loud or slow and stealthy.<\/p>\n<p>To use the analogy from earlier, \u201cfast and loud\u201d network exfiltration is like a thief that leaves a bank vault by blasting a hole in the wall with explosives and quickly escapes in a getaway car. When a threat actor tries to exfiltrate \u201cfast,\u201d it typically takes the form of massive file transfers leaving the internal network. These massive file transfers are \u201cloud\u201d because they are easily detected. When a massive file transfer occurs, the outbound byte rate and byte count of the network connection rapidly increases\u200a\u2014\u200aa clear indication of exfiltration activity. If the exfiltration is successful, the threat actor steals a lot of data in a single fell swoop, leaving incident responders with very few options since their information is already stolen and the damage is already done. However, the exfiltration attempt is often so blatantly obvious that it will alert security analysts to the\u00a0threat.<\/p>\n<p>In contrast, \u201cslow and stealthy\u201d network exfiltration is comparable to the thief that quietly steals a dollar here and a dollar there without anyone noticing that money is missing. This kind of exfiltration is \u201cslow\u201d because threat actors exfiltrate data to their C2 server over weeks or months. Why does it take such a long time to exfiltrate? 
In order to be \u201cstealthy,\u201d the exfiltration of data blends in with the noise of normal business operations. By exfiltrating a few bytes at a time over a long period of time, security analysts are unlikely to notice anything anomalous in their network\u00a0logs.<\/p>\n<p>Figure 1: This is a visualization of the traditional exfiltration methods. The thickness of the arrow represents the amount of data being transmitted and the length of the arrow represents the\u00a0time.<\/p>\n<h3>Multithreaded Exfiltration<\/h3>\n<p>Now that both multithreading and exfiltration have been introduced, it is time to see what happens when data exfiltration utilizes multithreading. Multithreaded exfiltration not only has the short time duration of \u201cfast and loud\u201d exfiltration but also evades detection more often than not. Network detection of fast and loud exfil often relies on a fundamental assumption: the exfiltrated information is sent over a single connection. Under this assumption, an alert will trigger if any single data stream meets a particular byte rate or threshold of exfiltrated bytes. However, when threading is used in data transfers over the network, many data streams\u200a\u2014\u200aeach a single thread\u200a\u2014\u200atransmit a small portion of the data to the attacker\u2019s C2 server. Individually, none of these data streams will raise any alerts as each one only carries a few bytes. Unfortunately, most network detection logic is not designed to aggregate numerous data streams to see a larger picture. Moreover, constantly tracking every data stream will tax the resources of any enterprise network detection system. Ultimately, however, the detection logic that only looks for a large burst of data in a single stream fails to detect multithreaded exfiltration.<\/p>\n<p>Figure 2: A comparison of fast and loud exfil with multithreaded exfil. 
The single massive data stream (represented by the large arrow) in the fast and loud exfil diagram is noticeable enough to raise an alert. In multithreaded exfil, no individual data stream (represented by the small arrows) is notable enough to raise an alert, even though the same amount of data is being exfiltrated.<\/p>\n<p>In addition to fooling traditional threshold-based detection, multithreaded exfil can also fly under the radar of the producer-consumer ratio (PCR). Devised by Carter Bullard and John Gerth, PCR was introduced to the world through their <a href=\"https:\/\/qosient.com\/argus\/presentations\/Argus.FloCon.2014.PCR.Presentation.pdf\">FloCon presentation<\/a>. Formally, the producer-consumer ratio is defined as \u201cA normalized value indicating directionality of application information transfer, independent of data load or rate.\u201d Simply put, the PCR is a value between -1 and 1. If a device sends as much data as it receives, it has a PCR of 0. If the PCR is negative, this indicates that the device has downloaded more data than it has uploaded. If the PCR is a positive value, more data was sent than received. Thus, if a device is exfiltrating large amounts of data, its ratio should be more positive than\u00a0usual.<\/p>\n<p>However, when testing multithreaded exfiltration over TLS, the PCR did not immediately skew to positive values. In fact, the overall PCR of the multithreaded exfiltration connections was negative! This was the case for two reasons. First, in this testing scenario, only 530 bytes of data were sent per packet. Second, every connection required a TLS handshake that included the C2 server sending a large TLS certificate. As a result, each connection had a PCR of -0.334184, indicating that more bytes were received than sent. This did not match the assumed attack scenario where the exfiltration connection has a positive PCR value. 
Thus, multithreaded exfiltration was able to go undetected by PCR\u00a0values.<\/p>\n<p>Trying to accurately detect and alert upon multithreaded exfil with current solutions is hard. Solutions such as NetFlow collection can theoretically provide the logs and visibility necessary for discovering multithreaded exfiltration attempts. However, a NetFlow-based solution would still require security analysts to sift through the logs to find the exfiltration.<\/p>\n<h3>Detection Strategy<\/h3>\n<p>Based on our analysis, multithreading over the network can be detected in one of two\u00a0ways:<\/p>\n<p><strong>Option 1<\/strong>: Calculate the ratio of outbound data to the number of connections. We expect a multithreaded upload to have many connections with a small amount of data in each. A high ratio therefore suggests the traffic is probably not a multithreaded upload, whereas a low ratio makes a multithreaded upload more likely.<\/p>\n<p><strong>Option 2<\/strong>: Track the number of connections between a source and destination address with unique source ports. A multithreaded upload should open numerous connections, since each thread sends its share of the data over its own connection.<\/p>\n<p>While both are theoretically viable options, we chose to go with Option 2 and track when multiple connections are seen with the same source IP address, destination IP address, and destination port, but different source\u00a0ports.<\/p>\n<p>We built our multithreaded exfiltration detection by modifying the <a href=\"https:\/\/github.com\/reservoirlabs\/bro-scripts\/tree\/master\/exfil-detection-framework\">exfil framework made by Reservoir Labs<\/a>. The exfil framework is a suite of Zeek scripts that detect file uploads in TCP connections, including TCP sessions that have encrypted payloads. The script tracks every established TCP connection to determine if exfiltration is occurring. 
To detect multithreaded exfil, we added a table that tracks whether multiple connections have the same source IP, destination IP, and destination port. We also added an event and a function that aggregate the total number of exfiltrated bytes from the various exfil connections. If both the minimum connection count and the minimum byte count are surpassed, then an alert is written to Zeek\u2019s notice\u00a0log.<\/p>\n<p>When it comes to setting thresholds for exfiltrated data bytes, there is no one-size-fits-all approach, as environments vary. Fortunately, both the minimum number of bytes and the minimum number of connections can be easily modified in Zeek scripts. Ultimately, the detection logic is nothing more than a simple aggregation of all the threads. When testing the script against packet captures of multithreaded exfiltration, the script successfully raised a notice when the byte count threshold was surpassed, even though the bytes came from over 300 different data streams. Thus, we were able to accurately detect when exfiltration via multithreading had occurred on our\u00a0network.<\/p>\n<p>Of course, no detection tool is a silver bullet. In a production environment, the alerts from the exfiltration detection logic ought to be correlated with other log sources based on the connection parameters.<\/p>\n<h3>Testing<\/h3>\n<p>The script was tested on production sensors over a 10-day trial run. During this window, notices were raised when one of the machines in our environment made large outbound transmissions over multiple data streams. This indicated that the script was working as intended, detecting multiple streams of exchange between two\u00a0systems.<\/p>\n<p>During this time, resource utilization and performance metrics were also actively tracked. Despite our initial concerns, there was no abnormal behavior in the performance of our Zeek\u00a0sensors.<\/p>\n<p><strong>Conclusion<\/strong><\/p>\n<p>Detecting exfiltration is a game of cat and mouse. 
Threat actors will continually find new techniques for smuggling data out of networks, and security teams will find new ways of detecting them. While exfiltration leveraging a technique more than half a century old may evade traditional detection logic, we were able to improve detection scripts on our network monitoring sensors to help address this new play. As attackers implement new methods of exfiltration, detection products and security teams must match their\u00a0pace.<\/p>\n<p>We encourage you to experiment with our exfil scripts on your Zeek sensors: <a href=\"https:\/\/github.com\/salesforce\/multithreaded-exfil-detection\/\">https:\/\/github.com\/salesforce\/multithreaded-exfil-detection\/<\/a><\/p>\n<h3>Resources<\/h3>\n<h4>More on Exfiltration<\/h4>\n<p><a href=\"https:\/\/www.tripwire.com\/state-of-security\/mitre-framework\/the-mitre-attck-framework-exfiltration\/\">The MITRE ATT&amp;CK Framework: Exfiltration<\/a><\/p>\n<p><a href=\"https:\/\/www.blackhat.com\/presentations\/bh-usa-00\/Ron-Gula\/ron_gula.ppt\">Blackhat Presentation\u200a\u2014\u200aBypassing Intrusion Detection Systems by Ron\u00a0Gula<\/a><\/p>\n<p><a href=\"https:\/\/github.com\/reservoirlabs\/bro-producer-consumer-ratio\">Producer-Consumer Ratio in\u00a0Zeek<\/a><\/p>\n<h4>Other Detection Tools in\u00a0Zeek<\/h4>\n<p><a href=\"https:\/\/engineering.salesforce.com\/tls-fingerprinting-with-ja3-and-ja3s-247362855967\">JA3\u200a\u2014\u200aTLS Fingerprinting Standard<\/a><\/p>\n<p><a href=\"https:\/\/engineering.salesforce.com\/open-sourcing-hassh-abed3ae5044c\">HASSH\u200a\u2014\u200aA profiling method for SSH Clients and\u00a0Servers<\/a><\/p>\n<p><a href=\"https:\/\/engineering.salesforce.com\/gquic-protocol-analysis-and-fingerprinting-in-zeek-a4178855d75f\">GQUIC Protocol Analyzer\u200a\u2014\u200aVisibility into the increasingly popular UDP\u00a0protocol<\/a><\/p>\n<h3>Credits<\/h3>\n<p><a href=\"https:\/\/www.linkedin.com\/in\/manjulalwani\/\">Manju Lalwani (Research &amp; 
Project\u00a0Lead)<\/a><a href=\"https:\/\/twitter.com\/Ca8l3\">Caleb Yu (Implementation)<\/a><a href=\"https:\/\/www.reservoir.com\/\">Reservoir Labs (Original exfil framework inventors)<\/a><a href=\"https:\/\/salesforce.wd1.myworkdayjobs.com\/External_Career_Site\/1\/refreshFacet\/318c8bb6f553100021d223d9780d30be\">Salesforce Threat Detection Team (Hiring in US and\u00a0India)<\/a><\/p>\n<p><a href=\"https:\/\/engineering.salesforce.com\/detecting-multithreaded-exfiltration-in-zeek-e8a244885896\">Detecting Multithreaded Exfiltration in Zeek<\/a> was originally published in <a href=\"https:\/\/engineering.salesforce.com\/\">Salesforce Engineering<\/a> on Medium, where people are continuing the conversation by highlighting and responding to this story.<\/p>\n<p><a href=\"https:\/\/engineering.salesforce.com\/detecting-multithreaded-exfiltration-in-zeek-e8a244885896?source=rss----cfe1120185d3---4\" target=\"_blank\" class=\"feedzy-rss-link-icon\" rel=\"noopener\">Read More<\/a><\/p>","protected":false},"excerpt":{"rendered":"<p>In a stereotypical heist, the thief\u2019s \u201cgrab-and-go\u201d plan often takes one of two forms. In the first, the thief grabs all they can and makes a loud and fast exit with explosives visible from miles away. 
In the second, the thief slowly and quietly amasses a fortune by stealing a few dollars at a time.&hellip; <a class=\"more-link\" href=\"https:\/\/fde.cat\/index.php\/2022\/03\/09\/detecting-multithreaded-exfiltration-in-zeek\/\">Continue reading <span class=\"screen-reader-text\">Detecting Multithreaded Exfiltration in Zeek<\/span><\/a><\/p>\n","protected":false},"author":0,"featured_media":0,"comment_status":"","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","footnotes":""},"categories":[7],"tags":[],"class_list":["post-549","post","type-post","status-publish","format-standard","hentry","category-technology","entry"],"jetpack_featured_media_url":"","jetpack-related-posts":[{"id":588,"url":"https:\/\/fde.cat\/index.php\/2022\/02\/09\/detecting-multithreaded-exfiltration-in-zeek-2\/","url_meta":{"origin":549,"position":0},"title":"Detecting Multithreaded Exfiltration in Zeek","date":"February 9, 2022","format":false,"excerpt":"Detecting Multithreaded Exfiltration in Zeek In a stereotypical heist, the thief\u2019s \u201cgrab-and-go\u201d plan often takes one of two forms. In the first, the thief grabs all they can and makes a loud and fast exit with explosives visible from miles away. In the second, the thief slowly and quietly amasses\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":879,"url":"https:\/\/fde.cat\/index.php\/2024\/06\/12\/how-meta-trains-large-language-models-at-scale\/","url_meta":{"origin":549,"position":1},"title":"How Meta trains large language models at scale","date":"June 12, 2024","format":false,"excerpt":"As we continue to focus our AI research and development on solving increasingly complex problems, one of the most significant and challenging shifts we\u2019ve experienced is the sheer scale of computation required to train large language models (LLMs). 
Traditionally, our AI model training has involved a training massive number of\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":670,"url":"https:\/\/fde.cat\/index.php\/2023\/01\/27\/watch-metas-engineers-discuss-optimizing-large-scale-networks\/","url_meta":{"origin":549,"position":2},"title":"Watch Meta\u2019s engineers discuss optimizing large-scale networks","date":"January 27, 2023","format":false,"excerpt":"Managing network solutions amidst a growing scale inherently brings challenges around performance, deployment, and operational complexities.\u00a0 At Meta, we\u2019ve found that these challenges broadly fall into three themes: 1.) \u00a0 Data center networking: Over the past decade, on the physical front, we have seen a rise in vendor-specific hardware that\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":604,"url":"https:\/\/fde.cat\/index.php\/2022\/07\/06\/watch-metas-engineers-discuss-quic-and-tcp-innovations-for-our-network\/","url_meta":{"origin":549,"position":3},"title":"Watch Meta\u2019s engineers discuss QUIC and TCP innovations for our network","date":"July 6, 2022","format":false,"excerpt":"With more than 75 percent of our internet traffic set to use QUIC and HTTP\/3 together, QUIC is slowly moving to become the de facto protocol used for internet communication at Meta. 
For Meta\u2019s data center network, TCP remains the primary network transport protocol that supports thousands of services on\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":501,"url":"https:\/\/fde.cat\/index.php\/2021\/11\/09\/ocp-summit-2021-open-networking-hardware-lays-the-groundwork-for-the-metaverse\/","url_meta":{"origin":549,"position":4},"title":"OCP Summit 2021: Open networking hardware lays the groundwork for the metaverse","date":"November 9, 2021","format":false,"excerpt":"Open infrastructure technologies and networking hardware will play an important role as we build new technologies for the metaverse, where billions of people will someday come together in virtual spaces. As we head toward the next major computing platform with a continued spirit of embracing openness and disaggregation, we\u2019re announcing\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":710,"url":"https:\/\/fde.cat\/index.php\/2023\/05\/03\/the-malware-threat-landscape-nodestealer-ducktail-and-more\/","url_meta":{"origin":549,"position":5},"title":"The malware threat landscape: NodeStealer, DuckTail, and more","date":"May 3, 2023","format":false,"excerpt":"We\u2019re sharing our latest threat research and technical analysis into persistent malware campaigns targeting businesses across the internet, including threat indicators to help raise our industry\u2019s collective defenses across the internet. 
These malware families \u2013 including Ducktail, NodeStealer and newer malware posing as ChatGPT and other similar tools\u2013 targeted people\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/549","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/comments?post=549"}],"version-history":[{"count":0,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/549\/revisions"}],"wp:attachment":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/media?parent=549"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/categories?post=549"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/tags?post=549"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}