{"id":656,"date":"2022-11-22T14:00:11","date_gmt":"2022-11-22T14:00:11","guid":{"rendered":"https:\/\/fde.cat\/index.php\/2022\/11\/22\/retrofitting-null-safety-onto-java-at-meta\/"},"modified":"2022-11-22T14:00:11","modified_gmt":"2022-11-22T14:00:11","slug":"retrofitting-null-safety-onto-java-at-meta","status":"publish","type":"post","link":"https:\/\/fde.cat\/index.php\/2022\/11\/22\/retrofitting-null-safety-onto-java-at-meta\/","title":{"rendered":"Retrofitting null-safety onto Java at Meta"},"content":{"rendered":"<p><span>We developed a new static analysis tool called Nullsafe that is used at Meta to detect NullPointerException (NPE) errors in Java code.<\/span><br \/>\n<span>Interoperability with legacy code and gradual deployment model were key to Nullsafe\u2019s wide adoption and allowed us to recover some null-safety properties in the context of an otherwise null-unsafe language in a multimillion-line codebase.<\/span><br \/>\n<span>Nullsafe has helped significantly reduce the overall number of NPE errors and improved developers\u2019 productivity. This shows the value of static analysis in solving real-world problems at scale.<\/span><\/p>\n<p><span>Null dereferencing is a common type of programming error in Java. On Android, <\/span><span>NullPointerException<\/span><span> (NPE) errors are the <\/span><a href=\"https:\/\/developer.android.com\/games\/optimize\/crash#prevent-crashes-null-pointer\" target=\"_blank\" rel=\"noopener\"><span>largest cause of app crashes on Google Play<\/span><\/a><span>. Since Java doesn\u2019t provide tools to express and check nullness invariants, developers have to rely on testing and dynamic analysis to improve reliability of their code. 
These techniques are essential but have their own limitations in terms of time-to-signal and coverage.<\/span><\/p>\n<p><span>In 2019, we started a project called <\/span><span>0NPE<\/span><span> with the goal of addressing this challenge within our apps and significantly improving null-safety of Java code through static analysis.<\/span><\/p>\n<p><span>Over the course of two years, we developed Nullsafe, a static analyzer for detecting NPE errors in Java, integrated it into the core developer workflow, and ran a large-scale code transformation to make many million lines of Java code Nullsafe-compliant.<\/span><\/p>\n<p>Figure 1: Percent null-safe code over time (approx.).<\/p>\n<p><span>Taking <a href=\"https:\/\/engineering.fb.com\/2022\/11\/04\/web\/instagram-video-processing-encoding-reduction\/\" target=\"_blank\" rel=\"noopener\">Instagram<\/a>, one of Meta\u2019s largest Android apps, as an example, we observed a 27 percent reduction in production NPE crashes during the 18 months of code transformation. Moreover, NPEs are no longer a leading cause of crashes in both alpha and beta channels, which is a direct reflection of improved developer experience and development velocity.<\/span><\/p>\n<h2><span>The problem of <\/span><span>null<\/span><span>s<\/span><\/h2>\n<p><span>Null pointers are notorious for causing bugs in programs. 
Even in a tiny snippet of code like the one below, things can go wrong in a number of ways:<\/span><\/p>\n<p><span>Listing 1<\/span><span>: <\/span><span>buggy <\/span><span>getParentName<\/span><span> method<\/span><\/p>\n<p>Path getParentName(Path path) {<br \/>\n  return path.getParent().getFileName();<br \/>\n}<\/p>\n<p><span><span>getParent<\/span><span>()<\/span><\/span><span> may produce <\/span><span>null<\/span><span> and cause a <\/span><span>NullPointerException <\/span>locally<span> in <\/span><span>getParentName(\u2026)<\/span><span>.<\/span><br \/>\n<span><span>getFileName<\/span><span>()<\/span><\/span><span> may return <\/span><span>null<\/span><span>, which may propagate further and cause a crash in some other place.<\/span><\/p>\n<p><span>The former is relatively easy to spot and debug, but the latter may prove challenging, especially as the codebase grows and evolves.<\/span><\/p>\n<p><span>Figuring out nullness of values and spotting potential problems is easy in toy examples like the one above, but it becomes extremely hard at the scale of millions of lines of code. Adding thousands of code changes a day on top of that makes it impossible to manually ensure that no single change leads to a <\/span><span>NullPointerException<\/span><span> in some other component. As a result, users suffer from crashes and application developers need to spend an inordinate amount of mental energy tracking nullness of values.<\/span><\/p>\n<p><span>The problem, however, is not the <\/span><span>null<\/span><span> value itself but rather the lack of explicit nullness information in APIs and the lack of tooling to validate that the code properly handles nullness.<\/span><\/p>\n<h3><span>Java and nullness<\/span><\/h3>\n<p><span>In response to these challenges, Java 8 introduced the <\/span><span>java.util.Optional&lt;T&gt;<\/span><span> class. 
But its performance impact and legacy API compatibility issues meant that <\/span><span>Optional<\/span><span> could not be used as a general-purpose substitute for nullable references.<\/span><\/p>\n<p><span>At the same time, annotations have been used with success as a language extension point. In particular, adding annotations such as <\/span><span>@Nullable<\/span><span> and <\/span><span>@NotNull<\/span><span> to regular nullable reference types is a viable way to extend Java\u2019s types with explicit nullness while avoiding the downsides of <\/span><span>Optional<\/span><span>. However, this approach requires an external checker.<\/span><\/p>\n<p><span>An annotated version of the code from <\/span><span>Listing 1 <\/span><span>might look like this:<\/span><\/p>\n<p><span>Listing 2<\/span><span>: <\/span><span>correct and annotated <\/span><span>getParentName<\/span><span> method<\/span><\/p>\n<p>\/\/ (2)                          (1)<br \/>\n@Nullable Path getParentName(Path path) {<br \/>\n  Path parent = path.getParent(); \/\/ (3)<br \/>\n  return parent != null ? parent.getFileName() : null;<br \/>\n            \/\/ (4)<br \/>\n}<\/p>\n<p><span>Compared to a null-safe but not annotated version, this code adds a single annotation on the return type. There are several things worth noting here, numbered to match the markers in the listing:<\/span><\/p>\n<p>(1) Unannotated types are considered not-nullable<span>. This convention greatly reduces the annotation burden but is applied only to first-party code.<\/span><br \/>\n(2) The return type is marked <span>@Nullable<\/span><span> because the method can return <\/span><span>null<\/span><span>.<\/span><br \/>\n(3) The local<span> variable <\/span><span>parent<\/span><span> is not annotated, as its <\/span>nullness must be inferred<span> by the static analysis checker. This further reduces the annotation burden.<\/span><br \/>\n<span>(4) Checking a value for <\/span><span>null <\/span>refines its type<span> to be not-nullable in the corresponding branch. 
This is called <\/span><span>flow-sensitive typing, <\/span><span>and it allows writing code idiomatically and handling nullness only where it\u2019s really necessary.<\/span><\/p>\n<p><span>Code annotated for nullness can be statically checked for null-safety. The analyzer can protect the codebase from regressions and allow developers to move faster with confidence.<\/span><\/p>\n<h3><span>Kotlin and nullness<\/span><\/h3>\n<p><span><a href=\"https:\/\/engineering.fb.com\/2022\/10\/24\/android\/android-java-kotlin-migration\/\" target=\"_blank\" rel=\"noopener\">Kotlin<\/a> is a modern programming language designed to interoperate with Java. In Kotlin, nullness is explicit in the types, and the compiler checks that the code is handling nullness correctly, giving developers instant feedback.\u00a0<\/span><\/p>\n<p><span>We recognize these advantages and, in fact, <\/span><a href=\"https:\/\/engineering.fb.com\/2022\/10\/24\/android\/android-java-kotlin-migration\/\" target=\"_blank\" rel=\"noopener\"><span>use Kotlin heavily at Meta<\/span><\/a><span>. 
But we also recognize the fact that there is a lot of business-critical Java code that cannot, and sometimes should not, be moved to Kotlin overnight.<\/span><\/p>\n<p><span>The two languages, Java and Kotlin, have to coexist, which means there is still a need for a null-safety solution for Java.<\/span><\/p>\n<h2><span>Static analysis for nullness checking at scale<\/span><\/h2>\n<p><span>Meta\u2019s success building other static analysis tools such as <a href=\"https:\/\/fbinfer.com\/\" target=\"_blank\" rel=\"noopener\">Infer<\/a>, <\/span><a href=\"https:\/\/docs.hhvm.com\/hack\/\" target=\"_blank\" rel=\"noopener\"><span>Hack<\/span><\/a><span>, and <\/span><a href=\"https:\/\/flow.org\/\" target=\"_blank\" rel=\"noopener\"><span>Flow<\/span><\/a><span> and applying them to real-world codebases made us confident that we could build a nullness checker for Java that is:<\/span><\/p>\n<p>Ergonomic:<span> understands the flow of control in the code, doesn\u2019t require developers to bend over backward to make their code compliant, and adds minimal annotation burden.<\/span><br \/>\nScalable:<span> able to scale from hundreds of lines of code to millions.<\/span><br \/>\nCompatible with Kotlin:<span> for seamless interoperability.<\/span><\/p>\n<p><span>In retrospect, implementing the static analysis checker itself was probably the easy part. The real effort went into integrating this checker with the development infrastructure, working with the developer communities, and then making millions of lines of production Java code null-safe.<\/span><\/p>\n<p><span>We implemented the first version of our nullness checker for Java as <\/span><a href=\"https:\/\/fbinfer.com\/docs\/checker-eradicate\" target=\"_blank\" rel=\"noopener\"><span>part of Infer<\/span><\/a><span>, and it served as a great foundation. Later on, we moved to a compiler-based infrastructure. 
Tighter integration with the compiler allowed us to improve the accuracy of the analysis and streamline the integration with development tools.<\/span><\/p>\n<p><span>This second version of the analyzer is called Nullsafe, and we cover it below.<\/span><\/p>\n<h3><span>Null-checking under the hood<\/span><\/h3>\n<p><span>The Java compiler API was introduced via <\/span><a href=\"https:\/\/jcp.org\/en\/jsr\/detail?id=199\" target=\"_blank\" rel=\"noopener\"><span>JSR-199<\/span><\/a><span>. This API gives access to the compiler\u2019s internal representation of a compiled program and allows custom functionality to be added at different stages of the compilation process. We use this API to extend Java\u2019s type-checking with an extra pass that runs the Nullsafe analysis and then collects and reports nullness errors.<\/span><\/p>\n<p><span>The two main data structures used in the analysis are the abstract syntax tree (AST) and the control flow graph (CFG). See Listing 3 and Figures 2 and 3 for examples.<\/span><\/p>\n<p><span>The AST represents the syntactic structure of the source code without superfluous details like punctuation. We get a program\u2019s AST via the compiler API, together with the type and annotation information.<\/span><br \/>\n<span>The CFG is a flowchart of a piece of code: blocks of instructions connected with arrows representing a change in control flow. 
We use the <\/span><a href=\"https:\/\/github.com\/typetools\/checker-framework\/tree\/master\/dataflow\" target=\"_blank\" rel=\"noopener\"><span>Dataflow<\/span><\/a><span> library to build a CFG for a given AST.<\/span><\/p>\n<p><span>The analysis itself is split into two phases:<\/span><\/p>\n<p><span>The <\/span>type inference<span> phase is responsible for figuring out the nullness of various pieces of code, answering questions such as:<\/span><\/p>\n<p><span>Can this method invocation return <\/span><span>null<\/span><span> at program point X<\/span><span>?<\/span><br \/>\n<span>Can this variable be <\/span><span>null<\/span><span> at program point Y<\/span><span>?<\/span><\/p>\n<p><span>The <\/span>type checking<span> phase is responsible for validating that the code doesn\u2019t do anything unsafe, such as dereferencing a nullable value or passing a nullable argument where it\u2019s not expected.<\/span><\/p>\n<p><span>Listing 3<\/span><span>: <\/span><span>example <\/span><span>getOrDefault<\/span><span> method<\/span><\/p>\n<p>String getOrDefault(@Nullable String str, String defaultValue) {<br \/>\n  if (str == null) { return defaultValue; }<br \/>\n  return str;<br \/>\n}<\/p>\n<p>Figure 2: CFG for code from Listing 3.<br \/>\nFigure 3: AST for code from Listing 3.<\/p>\n<h4><span>Type-inference phase<\/span><\/h4>\n<p><span>Nullsafe does type inference based on the code\u2019s CFG. The result of the inference is a mapping from expressions to nullness-extended types at different program points.<\/span><\/p>\n<p><em><span>state = expression \u00d7 <\/span><span>program point \u2192 <\/span><span>nullness-extended type<\/span><\/em><\/p>\n<p><span>The inference engine traverses the CFG and <\/span><span>executes<\/span><span> every instruction according to the analysis\u2019 rules. 
For a program from <\/span><span>Listing 3<\/span><span> this would look like this:<\/span><\/p>\n<p><span>We start with a mapping at <\/span><span>&lt;entry&gt;<\/span><span> point:\u00a0<\/span><\/p>\n<p><span><span>{str <em>\u2192 <\/em><\/span><span> @Nullable String, defaultValue <em>\u2192 <\/em><\/span><span>String}<\/span><\/span><span>.<\/span><\/p>\n<p><span>When we execute the comparison <\/span><span>str<\/span><span> == <\/span><span>null<\/span><span>, the control flow splits and we produce two mappings:<\/span><\/p>\n<p><span>THEN: <\/span><span>{<span>str <em>\u2192 <\/em><\/span><\/span><span> @Nullable String, defaultValue <span><span><em>\u2192 <\/em><\/span><\/span><\/span><span><span> String<\/span>}<\/span><span>.<\/span><br \/>\n<span>ELSE: <\/span><span>{<\/span><span><span>str <em>\u2192 <\/em><\/span><span> String<\/span><span>, defaultValue <em>\u2192 <\/em><\/span><\/span><span><span> String<\/span>}<\/span><span>.<\/span><\/p>\n<p><span>When the control flow joins, the inference engine needs to produce a mapping that over-approximates the state in both branches. If we have <\/span><span>@Nullable String<\/span><span> in one branch and <\/span><span>String<\/span><span> in another, the over-approximated type would be <\/span><span>@Nullable String<\/span><span>.<\/span><\/p>\n<p>Figure 4: CFG with the analysis results<\/p>\n<p><span>The main benefit of using a CFG for inference is that it allows us to make the analysis flow-sensitive, which is crucial for an analysis like this to be useful in practice.<\/span><\/p>\n<p><span>The example above demonstrates a very common case where nullness of a value is refined according to the control flow. To accommodate real-world coding patterns, Nullsafe has support for more advanced features, ranging from contracts and complex invariants where we use SAT solving to interprocedural object initialization analysis. 
Discussion of these features, however, is outside the scope of this post.<\/span><\/p>\n<h4><span>Type-checking phase<\/span><\/h4>\n<p><span>Nullsafe does type checking based on the program\u2019s AST. By traversing the AST, we can compare the information specified in the source code with the results from the inference step.<\/span><\/p>\n<p><span>In our example from Listing 3, when we visit the <\/span><span>return str<\/span><span> node we fetch the inferred type of the <\/span><span>str<\/span><span> expression, which happens to be <\/span><span>String<\/span><span>, and check whether this type is compatible with the return type of the method, which is declared as <\/span><span>String<\/span><span>.<\/span><\/p>\n<p>Figure 5: Checking types during AST traversal.<\/p>\n<p><span>When we see an AST node corresponding to an object dereference, we check that the inferred type of the receiver excludes <\/span><span>null<\/span><span>. Implicit unboxing is treated in a similar way. For method call nodes, we check that the inferred types of the arguments are compatible with the method\u2019s declared parameter types. And so on.<\/span><\/p>\n<p><span>Overall, the type-checking phase is much more straightforward than the type-inference phase. One nontrivial aspect here is error rendering, where we need to augment a type error with context, such as a type trace, code origin, and potential quick fix.<\/span><\/p>\n<h4><span>Challenges in supporting generics<\/span><\/h4>\n<p><span>Examples of the nullness analysis given above covered only the so-called root nullness, or nullness of a value itself. Generics add a whole new dimension of expressivity to the language and, similarly, nullness analysis can be extended to support generic and parameterized classes to further improve the expressivity and precision of APIs.<\/span><\/p>\n<p><span>Supporting generics is obviously a good thing. But extra expressivity comes at a cost. 
In particular, type inference gets a lot more complicated.<\/span><\/p>\n<p><span>Consider a parameterized class <\/span><span>Map&lt;K, List&lt;Pair&lt;V1, V2&gt;&gt;&gt;<\/span><span>. In the case of a <\/span>non-generic <span>nullness checker, there is only the root nullness to infer:<\/span><\/p>\n<p>\/\/ NON-GENERIC CASE<br \/>\n   \u2423 Map&lt;K, List&lt;Pair&lt;V1, V2&gt;&gt;&gt;<br \/>\n\/\/ ^<br \/>\n\/\/ --- Only the root nullness needs to be inferred<\/p>\n<p><span>The <\/span>generic <span>case requires a lot more gaps to fill on top of an already complex flow-sensitive analysis:<\/span><\/p>\n<p>\/\/ GENERIC CASE<br \/>\n   \u2423 Map&lt;\u2423 K, \u2423 List&lt;\u2423 Pair&lt;\u2423 V1, \u2423 V2&gt;&gt;&gt;<br \/>\n\/\/ ^     ^    ^      ^      ^      ^<br \/>\n\/\/ -----|----|------|------|------|--- All these need to be inferred<\/p>\n<p><span>This is not all. Generic types that the analysis infers must closely follow <\/span><span>the shape<\/span><span> of the types that Java itself inferred to avoid bogus errors. For example, consider the following snippet of code:<\/span><\/p>\n<p>interface Animal {}<br \/>\nclass Cat implements Animal {}<br \/>\nclass Dog implements Animal {}<\/p>\n<p>void targetType(@Nullable Cat catMaybe) {<br \/>\n  List&lt;@Nullable Animal&gt; animalsMaybe = List.of(catMaybe);<br \/>\n}<\/p>\n<p><span>List.&lt;T&gt;of(T\u2026)<\/span><span> is a generic method, and in isolation the type of <\/span><span>List.of(catMaybe)<\/span><span> could be inferred as <\/span><span>List&lt;@Nullable Cat&gt;<\/span><span>. 
This would be problematic because generics in Java are invariant, which means that <\/span><span>List&lt;Animal&gt;<\/span><span> is not compatible with <\/span><span>List&lt;Cat&gt;<\/span><span> and the assignment would produce an error.<\/span><\/p>\n<p><span>The reason this code type checks is that the Java compiler knows the type of the target of the assignment and uses this information to tune how the type inference engine works in the context of the assignment (or a method argument, for that matter). This feature is called <\/span><span>target typing<\/span><span>, and although it improves the ergonomics of working with generics, it doesn\u2019t play nicely with the kind of forward CFG-based analysis we described before, and it required extra care to handle.<\/span><\/p>\n<p><span>In addition to the above, the Java compiler itself has bugs (e.g., <\/span><a href=\"https:\/\/bugs.openjdk.org\/browse\/JDK-8225377\" target=\"_blank\" rel=\"noopener\"><span>JDK-8225377<\/span><\/a><span>) that require various workarounds in Nullsafe and in other static analysis tools that work with type annotations.<\/span><\/p>\n<p><span>Despite these challenges, we see <\/span>significant value in supporting generics<span>. In particular:<\/span><\/p>\n<p>Improved ergonomics<span>. Without support for generics, developers cannot define and use certain APIs in a null-aware way: from collections and functional interfaces to streams. They are forced to circumvent the nullness checker, which harms reliability and reinforces a bad habit. We have found many places in the codebase where the lack of null-safe generics led to <\/span>brittle code and bugs<span>.<\/span><br \/>\nSafer Kotlin interoperability<span>. 
Meta is a heavy user of Kotlin, and a nullness analysis that supports generics closes the gap between the two languages and significantly <\/span>improves the safety of the interop<span> and the development experience in a heterogeneous codebase.<\/span><\/p>\n<h3><span>Dealing with legacy and third-party code<\/span><\/h3>\n<p><span>Conceptually, the static analysis performed by Nullsafe adds a new set of semantic rules to Java in an attempt to retrofit null-safety onto an otherwise null-unsafe language. The ideal scenario is that all code follows these rules, in which case diagnostics raised by the analyzer are relevant and actionable. The reality is that there\u2019s a lot of null-safe code that knows nothing about the new rules, and there\u2019s even more null-unsafe code. Running the analysis on such legacy code or even newer code that calls into legacy components would produce too much noise, which would add friction and undermine the value of the analyzer.<\/span><\/p>\n<p><span>To deal with this problem in Nullsafe, we separate code into three tiers:<\/span><\/p>\n<p>Tier 1: Nullsafe compliant code.<span> This includes first-party code marked as <\/span><span>@Nullsafe<\/span><span> and checked to have no errors. This also includes known good annotated third-party code or third-party code for which we have added nullness models.<\/span><br \/>\nTier 2: First-party code not compliant with Nullsafe.<span> This is internal code written without explicit nullness tracking in mind. This code is checked optimistically by Nullsafe.<\/span><br \/>\nTier 3: Unvetted third-party code.<span> This is third-party code that Nullsafe knows nothing about. 
When using such code, the uses are checked pessimistically and developers are urged to add proper nullness models.<\/span><\/p>\n<p><span>The important aspect of this tiered system is that when Nullsafe type-checks Tier <\/span><span>X<\/span><span> code that calls into Tier <\/span><span>Y<\/span><span> code, it uses Tier <\/span><span>Y<\/span><span>\u2019s rules. In particular:<\/span><\/p>\n<p><span>A. Calls from Tier 1 to Tier 2 are checked optimistically,<\/span><br \/>\n<span>B. Calls from Tier 1 to Tier 3 are checked pessimistically,<\/span><br \/>\n<span>C. Calls from Tier 2 to Tier 1 are checked according to the Tier 1 component\u2019s nullness.<\/span><\/p>\n<p><span>Two things are worth noting here:<\/span><\/p>\n<p><span>According to point A, Tier 1 code can have unsafe dependencies or safe dependencies used unsafely. This unsoundness is the price we had to pay to streamline the rollout and adoption of Nullsafe in the codebase and make it gradual. We tried other approaches, but extra friction rendered them extremely hard to scale. The good news is that as more Tier 2 code is migrated to Tier 1 code, this point becomes less of a concern.<\/span><br \/>\n<span>Pessimistic treatment of third-party code (point B) adds extra friction to the nullness checker adoption. But in our experience, the cost was not prohibitive, while the improvement in the safety of Tier 1 and Tier 3 code interoperability was real.<\/span><\/p>\n<p>Figure 6: Three tiers of null-safety rules.<\/p>\n<h3><span>Deployment, automation, and adoption<\/span><\/h3>\n<p><span>A nullness checker alone is not enough to make a real impact. The effect of the checker is proportional to the amount of code compliant with this checker. Thus a migration strategy, developer adoption, and protection from regressions become primary concerns.<\/span><\/p>\n<p><span>We found three main points to be essential to our initiative\u2019s success:<\/span><\/p>\n<p>Quick fixes<span> are incredibly helpful. 
The codebase is full of trivial null-safety violations. Teaching a static analysis to not only check for errors but also to come up with quick fixes can cover a lot of ground and give developers the space to work on meaningful fixes.<\/span><br \/>\nDeveloper adoption<span> is key. This means that the checker and related tooling should integrate well with the main development tools: build tools, IDEs, CLIs, and CI. But more important, there should be a working feedback loop between application and static analysis developers. <\/span><br \/>\nData and metrics<span> are important to keep the momentum. Knowing where you are, the progress you\u2019ve made, and the next best thing to fix really helps facilitate the migration.<\/span><\/p>\n<h2><span>Longer-term reliability impact<\/span><\/h2>\n<p><span>As one example, looking at 18 months of reliability data for the Instagram Android app:<\/span><\/p>\n<p><span>The portion of the app\u2019s code compliant with Nullsafe grew from 3 percent to 90 percent.<\/span><br \/>\n<span>There was a significant decrease in the relative volume of <\/span><span>NullPointerException<\/span><span> (NPE) errors across all release channels (see Figure 7). Particularly, in production, the volume of NPEs was reduced by 27 percent.<\/span><\/p>\n<p><span>This data is validated against other types of crashes and shows a real improvement in reliability and null-safety of the app.\u00a0<\/span><\/p>\n<p><span>At the same time, individual product teams also reported significant reduction in the volume of NPE crashes after addressing nullness errors reported by Nullsafe.\u00a0<\/span><\/p>\n<p><span>The drop in production NPEs varied from team to team, with <\/span>improvements ranging from 35 percent to 80 percent<span>.<\/span><\/p>\n<p><span>One particularly interesting aspect of the results is the <\/span>drastic drop in NPEs in the alpha-channel<span>. 
This directly reflects the improvement in developer productivity that comes from using and relying on a nullness checker.<\/span><\/p>\n<p><span>Our north star goal, and an ideal scenario, would be to completely eliminate NPEs. However, real-world reliability is complex, and there are more factors at play:<\/span><\/p>\n<p><span>There is still null-unsafe code that is, in fact, responsible for a large percentage of top NPE crashes. But now we are in a position where targeted null-safety improvements can make a significant and lasting impact.<\/span><\/p>\n<p><span>The volume of crashes is not the best metric to measure reliability improvement because one bug that slips into production can become very hot and single-handedly skew the results. A better metric might be the number of new unique crashes per release, where we see an <\/span><span>n<\/span><span>-fold improvement.<\/span><br \/>\n<span>Not all NPE crashes are caused by bugs in the app\u2019s code alone. A mismatch between the client and the server is another major source of production issues that need to be addressed via other means.<\/span><br \/>\n<span>The static analysis itself has limitations and unsound assumptions that let certain bugs slip into production.<\/span><\/p>\n<p><span>It is important to note that this is the <\/span>aggregate effect of hundreds of engineers using Nullsafe<span> to improve the safety of their code as well as the effect of<\/span> other reliability initiatives<span>, so we can\u2019t attribute the improvement solely to the use of Nullsafe. However, based on reports and our own observations over the course of the last few years, we\u2019re confident that Nullsafe played a significant role in driving down NPE-related crashes.<\/span><\/p>\n<p>Figure 7: Percent NPE crashes by release channel.<\/p>\n<h2><span>Beyond Meta<\/span><\/h2>\n<p><span>The problems outlined above are hardly specific to Meta. 
Unexpected <\/span><span>null<\/span><span>-dereferences have caused <\/span><a href=\"https:\/\/www.infoq.com\/presentations\/Null-References-The-Billion-Dollar-Mistake-Tony-Hoare\/\" target=\"_blank\" rel=\"noopener\"><span>countless problems in different companies<\/span><\/a><span>. Languages like C# evolved into having <\/span><a href=\"https:\/\/docs.microsoft.com\/en-us\/dotnet\/csharp\/nullable-references\" target=\"_blank\" rel=\"noopener\"><span>explicit nullness<\/span><\/a><span> in their type system, while others, like Kotlin, had it from the very beginning.<\/span><\/p>\n<p><span>When it comes to Java, there were multiple attempts to add nullness, starting with <\/span><a href=\"https:\/\/stackoverflow.com\/questions\/2289694\/what-is-the-status-of-jsr-305\" target=\"_blank\" rel=\"noopener\"><span>JSR-305<\/span><\/a><span>, but none was widely successful. Currently, there are many great static analysis tools for Java that can check nullness, including the Checker Framework, SpotBugs, Error Prone, and NullAway, to name a few. In particular, Uber walked <\/span><a href=\"https:\/\/arxiv.org\/abs\/1907.02127\" target=\"_blank\" rel=\"noopener\"><span>the same path<\/span><\/a><span> by making their Android codebase null-safe using the NullAway checker. But in the end, all the checkers perform nullness analysis in different and subtly incompatible ways. The lack of standard annotations with precise semantics has constrained the use of static analysis for Java throughout the industry.<\/span><\/p>\n<p><span>This problem is exactly what the <\/span><a href=\"https:\/\/jspecify.dev\/\" target=\"_blank\" rel=\"noopener\"><span>JSpecify workgroup<\/span><\/a><span> aims to address. JSpecify started in 2019 and is a collaboration between individuals representing companies such as Google, JetBrains, Uber, Oracle, and others. 
Meta has also been part of JSpecify since late 2019.<\/span><\/p>\n<p><span>Although the <\/span><a href=\"https:\/\/jspecify.dev\/docs\/spec\" target=\"_blank\" rel=\"noopener\"><span>standard for nullness<\/span><\/a><span> is not yet finalized, there has been a lot of progress on the specification itself and on the tooling, with more exciting announcements following soon. Participation in JSpecify has also influenced how we at Meta think about nullness for Java and about our own codebase evolution.<\/span><\/p>\n<p>The post <a href=\"https:\/\/engineering.fb.com\/2022\/11\/22\/developer-tools\/meta-java-nullsafe\/\">Retrofitting null-safety onto Java at Meta<\/a> appeared first on <a href=\"https:\/\/engineering.fb.com\/\">Engineering at Meta<\/a>.<\/p>\n<p>Engineering at Meta<\/p>","protected":false},"excerpt":{"rendered":"<p>We developed a new static analysis tool called Nullsafe that is used at Meta to detect NullPointerException (NPE) errors in Java code. Interoperability with legacy code and gradual deployment model were key to Nullsafe\u2019s wide adoption and allowed us to recover some null-safety properties in the context of an otherwise null-unsafe language in a multimillion-line&hellip; <a class=\"more-link\" href=\"https:\/\/fde.cat\/index.php\/2022\/11\/22\/retrofitting-null-safety-onto-java-at-meta\/\">Continue reading <span class=\"screen-reader-text\">Retrofitting null-safety onto Java at Meta<\/span><\/a><\/p>\n","protected":false},"author":0,"featured_media":0,"comment_status":"","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","footnotes":""},"categories":[7],"tags":[],"class_list":["post-656","post","type-post","status-publish","format-standard","hentry","category-technology","entry"],"jetpack_featured_media_url":"","jetpack-related-posts":[{"id":334,"url":"https:\/\/fde.cat\/index.php\/2021\/08\/31\/building-data-pipelines-using-kotlin\/","url_meta":{"origin":656,"position":0},"title":"Building 
Data Pipelines Using Kotlin","date":"August 31, 2021","format":false,"excerpt":"Co-written by Alex\u00a0OscherovUp until recently, we, like many companies, built our data pipelines in any one of a handful of technologies using Java or Scala, including Apache Spark, Storm, and Kafka. But Java is a very verbose language, so writing these pipelines in Java involves a lot of boilerplate code.\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":271,"url":"https:\/\/fde.cat\/index.php\/2021\/08\/31\/faster-more-efficient-systems-for-finding-and-fixing-regressions\/","url_meta":{"origin":656,"position":1},"title":"Faster, more efficient systems for finding and fixing regressions","date":"August 31, 2021","format":false,"excerpt":"Every workday, Facebook engineers commit thousands of diffs (which is a change consisting of one or more files) into production. This code velocity allows us to rapidly ship new features, deliver bug fixes and optimizations, and run experiments. However, a natural downside to moving quickly in any industry is the\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":643,"url":"https:\/\/fde.cat\/index.php\/2022\/10\/24\/from-zero-to-10-million-lines-of-kotlin\/","url_meta":{"origin":656,"position":2},"title":"From zero to 10 million lines of Kotlin","date":"October 24, 2022","format":false,"excerpt":"We\u2019re sharing lessons learned from shifting our Android development from Java to Kotlin. Kotlin is a popular language for Android development and offers some key advantages over Java.\u00a0 As of today, our Android codebase contains over 10 million lines of Kotlin code. 
We\u2019re open sourcing various examples and utilities we\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":170,"url":"https:\/\/fde.cat\/index.php\/2020\/12\/14\/infer-powering-microsofts-infer-a-new-static-analyzer-for-c\/","url_meta":{"origin":656,"position":3},"title":"Infer powering Microsoft\u2019s Infer#, a new static analyzer for C#","date":"December 14, 2020","format":false,"excerpt":"What it is: Infer# brings the Infer static analysis platform to developers who use Microsoft\u2019s C# programming language. It can already detect null-pointer dereference and resource leak bugs, thanks to bi-abduction analysis. Detection of race conditions based on RacerD analysis is also in the works. Infer# has been used to\u2026","rel":"","context":"In &quot;External&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":548,"url":"https:\/\/fde.cat\/index.php\/2022\/03\/08\/an-open-source-compositional-deadlock-detector-for-android-java\/","url_meta":{"origin":656,"position":4},"title":"An open source compositional deadlock detector for Android Java","date":"March 8, 2022","format":false,"excerpt":"What the research is: We\u2019ve developed a new static analyzer that catches deadlocks in Java code for Android without ever running the code. What distinguishes our analyzer from past research is its ability to analyze revisions in codebases with hundreds of millions of lines of code. We have deployed our\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":728,"url":"https:\/\/fde.cat\/index.php\/2023\/06\/27\/meta-developer-tools-working-at-scale\/","url_meta":{"origin":656,"position":5},"title":"Meta developer tools: Working at scale","date":"June 27, 2023","format":false,"excerpt":"Every day, thousands of developers at Meta are working in repositories with millions of files. 
Those developers need tools that help them at every stage of the workflow while working at extreme scale. In this article we\u2019ll go through a few of the tools in the development process. And, as\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/656","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/comments?post=656"}],"version-history":[{"count":0,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/656\/revisions"}],"wp:attachment":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/media?parent=656"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/categories?post=656"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/tags?post=656"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}