{"id":480,"date":"2021-09-29T18:00:06","date_gmt":"2021-09-29T18:00:06","guid":{"rendered":"https:\/\/fde.cat\/index.php\/2021\/09\/29\/open-sourcing-mariana-trench-analyzing-android-and-java-app-security-in-depth\/"},"modified":"2021-09-29T18:00:06","modified_gmt":"2021-09-29T18:00:06","slug":"open-sourcing-mariana-trench-analyzing-android-and-java-app-security-in-depth","status":"publish","type":"post","link":"https:\/\/fde.cat\/index.php\/2021\/09\/29\/open-sourcing-mariana-trench-analyzing-android-and-java-app-security-in-depth\/","title":{"rendered":"Open-sourcing Mariana Trench: Analyzing Android and Java app security in depth"},"content":{"rendered":"<p><span>We\u2019re sharing details about<\/span><a href=\"https:\/\/mariana-tren.ch\/\"> <span>Mariana Trench<\/span><\/a><span> (MT), a tool we use to spot and prevent security and privacy bugs in Android and Java applications. As part of our effort to help scale security through building automation, we recently open-sourced MT to support security engineers at Facebook and across the industry.\u00a0<\/span><\/p>\n<p><span>This post is the third in our series of deep dives into the static and dynamic analysis tools we rely on. MT is the latest system, following <\/span><a href=\"https:\/\/engineering.fb.com\/2019\/08\/15\/security\/zoncolan\/\"><span>Zoncolan<\/span><\/a><span> and<\/span><a href=\"https:\/\/engineering.fb.com\/2020\/08\/07\/security\/pysa\/\"><span> Pysa<\/span><\/a><span>, built for Hack and Python code respectively.\u00a0<\/span><\/p>\n<p><span>Facebook\u2019s mobile applications, including Facebook, Instagram, and WhatsApp, run on millions of lines of code and are constantly evolving to enable new functionality and improve our services. To handle this volume of code, we build sophisticated systems that help our security engineers detect and review code for potential issues, rather than requiring them to rely only on manual code reviews. 
In the first half of 2021, over 50 percent of the security vulnerabilities we found across our family of apps were detected using automated tools.<\/span><\/p>\n<p><span>We built MT to focus particularly on Android applications. There are differences in patching and ensuring the adoption of code updates between mobile and web applications, so they require different approaches. While server-side code can be updated almost instantaneously for web apps, mitigating a security bug in an Android application relies on each user updating the application on the device they own in a timely way. This makes it that much more important for any app developer to put systems in place to help prevent vulnerabilities from making it into mobile releases, whenever possible.<\/span><\/p>\n<p><span>MT is designed to be able to scan large mobile codebases and flag potential issues on pull requests before they make it into production. It was built as a result of close collaboration between security and software engineers at Facebook who train MT to look at code and analyze how data flows through it. Analyzing data flows is useful because many security and privacy issues can be modeled as data flowing into a place it shouldn\u2019t.<\/span><\/p>\n<p><span>You can find MT on<\/span><a href=\"https:\/\/github.com\/facebook\/mariana-trench\/\"> <span>GitHub<\/span><\/a><span>, and we\u2019ve released a binary distribution on<\/span><a href=\"https:\/\/pypi.org\/project\/mariana-trench\/\"> <span>PyPI<\/span><\/a><span>. We\u2019ve also written a <\/span><a href=\"https:\/\/mariana-tren.ch\/docs\/getting-started\"><span>short tutorial<\/span><\/a><span> to help get you started. Our teams are actively developing and continuing to improve MT. 
We welcome your feedback: If you are interested in collaborating with us, please open an issue or reach out to us on GitHub.<\/span><\/p>\n<h2><span>How Mariana Trench works<\/span><\/h2>\n<p><span>MT works very similarly to <\/span><a href=\"https:\/\/engineering.fb.com\/2019\/08\/15\/security\/zoncolan\/\"><span>Zoncolan<\/span><\/a><span> and <\/span><a href=\"https:\/\/engineering.fb.com\/2020\/08\/07\/security\/pysa\/\"><span>Pysa<\/span><\/a><span>. The main difference is that MT is optimized for analyzing Android and Java applications. We briefly cover the basics in this blog post and encourage our readers to review our previous write-ups for a more in-depth technical explanation.<\/span><\/p>\n<p><span>Security engineers often think of vulnerabilities in terms of data flows that they don\u2019t want to see in their applications. For example, an application should not be logging sensitive data or be subject to injection vulnerabilities that would allow attackers to insert malicious code.<\/span><\/p>\n<p><span>In MT, a data flow can be described by:<\/span><\/p>\n<p><span>Source: a point of origin. This can be a user-controlled string entering the app through `Intent.getData`.<\/span><br \/>\n<span>Sink: a destination. In Android, this can be a call to `Log.w` or `Runtime.exec`.<\/span><span>\u00a0<\/span><\/p>\n<p><span>A large codebase can contain many different kinds of corresponding sources and sinks. We can tell MT to show us specific flows by defining <\/span><span>rules<\/span><span>. For example, to find <\/span><a href=\"https:\/\/support.google.com\/faqs\/answer\/9267555?hl=en\"><span>intent redirections<\/span><\/a><span> (issues that allow attackers to intercept sensitive data), we could define a rule that shows us all traces from \u201cuser-controlled\u201d sources to an \u201cintent redirection\u201d sink.<\/span><\/p>\n<p><span>MT finds possible paths from each source to its corresponding sink. 
It does this by computing a model for each Java method it sees in the codebase. The models are computed using a static analysis technique called <\/span><a href=\"https:\/\/en.wikipedia.org\/wiki\/Abstract_interpretation\"><span>abstract interpretation<\/span><\/a><span>.<\/span><\/p>\n<h2><span>How security engineers use Mariana Trench<\/span><\/h2>\n<p><span>MT is how security engineers scale their work as part of Facebook\u2019s defense-in-depth application security efforts.\u00a0<\/span><span>In a typical scenario, a security engineer would start by broadly defining the boundaries of the data flows she is interested in scanning the codebase for. For example, if she wants to find SQL injections, she would need to specify where user-controlled data is entering the code (e.g., intents in Android, the filesystem, etc.) and where it is not meant to go (e.g., any API constructing SQL queries). However, this is only the start \u2014 defining a rule connecting the two is not enough. Engineers also have to review the identified issues and refine the rules until the results are sufficiently high-signal.<\/span><\/p>\n<p><span>As with all engineering efforts, any tool that automatically scans code comes with inherent trade-offs. Traditionally, static analysis research has heavily focused on minimizing false positives. For security, that calculus can be very different. In using MT at Facebook, we prioritize finding more potential issues, even if it means showing more false positives. 
<\/span><span>This is because we care about edge cases: data flows that are theoretically possible and exploitable but rarely happen in production.\u00a0<\/span><\/p>\n<p><span>To help security engineers manage and triage the output, we built MT so they can quickly determine whether an issue is in fact a true positive by searching through results based on criteria such as the length of a trace or the specific functions encountered on a trace.\u00a0<\/span><\/p>\n<p><span>Once the rule has been created and has proved effective, we promote it to run on every pull request. If MT finds a flow violating the rule, the flow can then be surfaced to either an on-call security engineer or directly to the software engineer who made the pull request.\u00a0<\/span><\/p>\n<p><span>Rather than relying on MT as a silver bullet, we use it as part of the broader <\/span><a href=\"https:\/\/newsroom.fb.com\/news\/2019\/01\/designing-security-for-billions\/\"><span>defense-in-depth approach<\/span><\/a><span>. As Facebook invests in improving the fidelity of signals MT generates, security engineers continually iterate to refine rules and diagnose limitations of MT in collaboration with the software engineers building our apps.<\/span><\/p>\n<h2><span>Navigating the results: Static Analysis Post Processor<\/span><\/h2>\n<p><span>In addition to building the static analysis systems themselves, we\u2019ve created open source tooling to review and analyze the results produced by MT (as well as Pysa). We call our standalone processing tool Static Analysis Post Processor (<\/span><a href=\"https:\/\/github.com\/facebook\/sapp\"><span>SAPP<\/span><\/a><span>).<\/span><span>\u00a0<\/span><\/p>\n<p><span>We first <\/span><a href=\"https:\/\/www.youtube.com\/watch?v=8I3zlvtpOww\"><span>shared our work<\/span><\/a><span> on SAPP and how to use its command line interface (CLI) to navigate Pysa results at DEF CON in 2020. 
SAPP was purposely built to support different static analysis tools, and it supports MT out of the box.\u00a0<\/span><\/p>\n<p><span>SAPP takes the raw output from MT and makes it easy to triage the results. SAPP is designed to visually demonstrate how data can potentially flow from source to sink so it is easier for experts to quickly evaluate whether they agree with the tool\u2019s assessment.\u00a0<\/span><\/p>\n<p><span>SAPP\u2019s trace view illustrates the data flow step-by-step. It highlights the relevant lines of code, allowing the security engineer to walk through possible paths that eventually reach the same sink location in the code.\u00a0<\/span><\/p>\n<p><span>To give you an idea of what this looks like, here is a quick demo of how MT runs on a sample app:<\/span><\/p>\n<div class=\"fb-video\"><\/div>\n<p><span>As you can see, SAPP presents a list of issues, each of which is a potential vulnerability. Each issue contains one or more traces; if several traces are materially similar, they are grouped into the same issue to help evaluate whether the overall issue is valid. SAPP supports extensive filtering and search functionality to allow security engineers to focus on the results they want to explore within each list.<\/span><\/p>\n<h2><span>How to get started with Mariana Trench<\/span><\/h2>\n<p><span>MT is available on<\/span><a href=\"https:\/\/github.com\/facebook\/mariana-trench\/\"> <span>GitHub<\/span><\/a><span>, and we\u2019ve released a binary distribution on<\/span><a href=\"https:\/\/pypi.org\/project\/mariana-trench\/\"> <span>PyPI<\/span><\/a><span>. We\u2019ve also written a <\/span><a href=\"https:\/\/mariana-tren.ch\/docs\/getting-started\"><span>short tutorial<\/span><\/a><span> to help get you started.<\/span><\/p>\n<p><span>Our teams are actively developing MT to continue to improve it. 
If you have feedback or are interested in collaborating with us, please open an issue or reach out to us on GitHub.<\/span><\/p>\n<p><em><span>We\u2019d like to thank\u00a0<\/span><span>Maxime Arthaud, <\/span><span>Amar Bhosale,\u00a0<\/span><span>Gerben Janssen van Doorn,\u00a0<\/span><span>Yuh Shin Ong, <\/span><span>Chenguang Shen, <\/span><span>Simran Virk, <\/span><span>Shannon Zhu, <\/span><span>and everyone else who worked on Mariana Trench.<\/span><\/em><\/p>\n<p>The post <a href=\"https:\/\/engineering.fb.com\/2021\/09\/29\/security\/mariana-trench\/\">Open-sourcing Mariana Trench: Analyzing Android and Java app security in depth<\/a> appeared first on <a href=\"https:\/\/engineering.fb.com\/\">Facebook Engineering<\/a>.<\/p>\n<p>Facebook Engineering<\/p>","protected":false},"excerpt":{"rendered":"<p>We\u2019re sharing details about Mariana Trench (MT), a tool we use to spot and prevent security and privacy bugs in Android and Java applications. As part of our effort to help scale security through building automation, we recently open-sourced MT to support security engineers at Facebook and across the industry.\u00a0 This post is the third&hellip; <a class=\"more-link\" href=\"https:\/\/fde.cat\/index.php\/2021\/09\/29\/open-sourcing-mariana-trench-analyzing-android-and-java-app-security-in-depth\/\">Continue reading <span class=\"screen-reader-text\">Open-sourcing Mariana Trench: Analyzing Android and Java app security in 
depth<\/span><\/a><\/p>\n","protected":false},"author":0,"featured_media":0,"comment_status":"","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","footnotes":""},"categories":[7],"tags":[],"class_list":["post-480","post","type-post","status-publish","format-standard","hentry","category-technology","entry"],"jetpack_featured_media_url":"","jetpack-related-posts":[{"id":492,"url":"https:\/\/fde.cat\/index.php\/2021\/10\/20\/facebook-engineers-receive-2021-ieee-computer-society-cybersecurity-award-for-static-analysis-tools\/","url_meta":{"origin":480,"position":0},"title":"Facebook engineers receive 2021 IEEE Computer Society Cybersecurity Award for static analysis tools","date":"October 20, 2021","format":false,"excerpt":"Until recently, static analysis tools weren\u2019t seen by our industry as a reliable element of securing code at scale. After nearly a decade of investing in refining these systems, I\u2019m so proud to celebrate our engineering teams today for being awarded the IEEE Computer Society\u2019s Cybersecurity Award for Practice for\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":548,"url":"https:\/\/fde.cat\/index.php\/2022\/03\/08\/an-open-source-compositional-deadlock-detector-for-android-java\/","url_meta":{"origin":480,"position":1},"title":"An open source compositional deadlock detector for Android Java","date":"March 8, 2022","format":false,"excerpt":"What the research is: We\u2019ve developed a new static analyzer that catches deadlocks in Java code for Android without ever running the code. What distinguishes our analyzer from past research is its ability to analyze revisions in codebases with hundreds of millions of lines of code. 
We have deployed our\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":643,"url":"https:\/\/fde.cat\/index.php\/2022\/10\/24\/from-zero-to-10-million-lines-of-kotlin\/","url_meta":{"origin":480,"position":2},"title":"From zero to 10 million lines of Kotlin","date":"October 24, 2022","format":false,"excerpt":"We\u2019re sharing lessons learned from shifting our Android development from Java to Kotlin. Kotlin is a popular language for Android development and offers some key advantages over Java.\u00a0 As of today, our Android codebase contains over 10 million lines of Kotlin code. We\u2019re open sourcing various examples and utilities we\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":637,"url":"https:\/\/fde.cat\/index.php\/2022\/09\/30\/launching-a-new-chromium-based-webview-for-android\/","url_meta":{"origin":480,"position":3},"title":"Launching a new Chromium-based WebView for Android","date":"September 30, 2022","format":false,"excerpt":"Our in-app browser for Facebook on Android has historically relied on an Android System WebView based on Chromium, the open source project that powers many browsers on Android and other operating systems. On other mobile operating systems, the System WebView component cannot be updated without updating the entire operating system.\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":656,"url":"https:\/\/fde.cat\/index.php\/2022\/11\/22\/retrofitting-null-safety-onto-java-at-meta\/","url_meta":{"origin":480,"position":4},"title":"Retrofitting null-safety onto Java at Meta","date":"November 22, 2022","format":false,"excerpt":"We developed a new static analysis tool called Nullsafe that is used at Meta to detect NullPointerException (NPE) errors in Java code. 
Interoperability with legacy code and gradual deployment model were key to Nullsafe\u2019s wide adoption and allowed us to recover some null-safety properties in the context of an otherwise\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":334,"url":"https:\/\/fde.cat\/index.php\/2021\/08\/31\/building-data-pipelines-using-kotlin\/","url_meta":{"origin":480,"position":5},"title":"Building Data Pipelines Using Kotlin","date":"August 31, 2021","format":false,"excerpt":"Co-written by Alex\u00a0OscherovUp until recently, we, like many companies, built our data pipelines in any one of a handful of technologies using Java or Scala, including Apache Spark, Storm, and Kafka. But Java is a very verbose language, so writing these pipelines in Java involves a lot of boilerplate code.\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/480","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/comments?post=480"}],"version-history":[{"count":0,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/480\/revisions"}],"wp:attachment":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/media?parent=480"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/categories?post=480"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/tags?post=480"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}