In principle, this overhead can be mitigated by adding a layer of types to the codebase, and building tools that use type information to solve the above problems. For example, types can be used to identify bugs, to document interfaces of libraries, and so on.
In other words, we want Flow’s analysis to be precise in practice—it must model essential characteristics of the language accurately enough to understand the difference between idiomatic code and unintentional mistakes.
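To make the distinction concrete, here is a minimal sketch of idiomatic code next to an unintentional mistake. The function names are hypothetical and chosen only for illustration. The annotations use Flow's comment-type syntax so the file still runs as plain JavaScript.

```javascript
// @flow
// Illustrative only: both function names are hypothetical.

// Idiomatic: the `== null` check rules out null and undefined, so a
// precise analysis can treat `name` as a string on the last line.
function nameLength(name /*: ?string */) /*: number */ {
  if (name == null) {
    return 0;
  }
  return name.length;
}

// Likely mistake: `name` may be null or undefined here, so a precise
// analysis is expected to report the property access below as an error.
// At run time, calling this with null throws a TypeError.
function nameLengthUnchecked(name /*: ?string */) /*: number */ {
  return name.length;
}
```

An analysis that rejected the first function would be too imprecise to be useful on idiomatic code, while one that accepted the second would miss a real bug.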
Precision also has other desirable consequences. When types are trustworthy, developers tend to rely on them to structure their code and reason about it, leading to cleaner and more efficient code with fewer dynamic checks. When type errors are trustworthy, developers can focus on what their code does rather than thinking about how to rewrite their code to satisfy (or work around) the type system.
Finally, precision enables useful developer tools to be built. In particular, when a developer uses an IDE to ask for the type of an expression, the definition reaching a reference, or the set of possible completions at a point, the quality of the results Flow reports is correlated with the precision of its analysis.
In other words, we must engineer Flow’s analysis to be extremely fast—it must respond to code changes without noticeable delay, while still being precise enough in practice.
Like precision, speed also has other significant effects. When bugs are reported as the developer makes changes to code, they become part of the editing process—the developer doesn’t need to run the code to detect bugs, and tracing bugs back to the code becomes simpler. Similarly, when the IDE can show the type of an expression, the definition reaching a reference, etc. as the developer is coding, we have observed that productivity can improve dramatically.
The key to Flow’s speed is modularity: the ability to break the analysis into file-sized chunks that can be assembled later.
With modularity, we can aggressively parallelize our analysis. Furthermore, when files change, we can incrementally analyze only those files that depend on the changed files. Together, these choices have helped scale the analysis to millions of lines of code.
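The incremental step can be sketched as a graph traversal. This is a simplified illustration under stated assumptions, not a description of Flow's actual implementation: the file names, the `imports` map, and the helpers `invertGraph` and `filesToRecheck` are all hypothetical.

```javascript
// Hypothetical dependency map: each file lists the files it imports.
const imports = new Map([
  ["app.js", ["ui.js", "api.js"]],
  ["ui.js", ["util.js"]],
  ["api.js", ["util.js"]],
  ["util.js", []],
  ["docs.js", []],
]);

// Invert the map: for each file, collect the files that import it.
function invertGraph(imports) {
  const dependents = new Map();
  for (const file of imports.keys()) {
    dependents.set(file, []);
  }
  for (const [file, deps] of imports) {
    for (const dep of deps) {
      if (!dependents.has(dep)) {
        dependents.set(dep, []);
      }
      dependents.get(dep).push(file);
    }
  }
  return dependents;
}

// Breadth-first walk from the changed files through their dependents.
// Returns the changed files plus everything that transitively depends
// on them, sorted by name; all other files are left untouched.
function filesToRecheck(imports, changed) {
  const dependents = invertGraph(imports);
  const result = new Set(changed);
  const queue = [...changed];
  while (queue.length > 0) {
    const file = queue.shift();
    for (const dependent of dependents.get(file) || []) {
      if (!result.has(dependent)) {
        result.add(dependent);
        queue.push(dependent);
      }
    }
  }
  return [...result].sort();
}

console.log(filesToRecheck(imports, ["ui.js"])); // [ 'app.js', 'ui.js' ]
```

In this toy graph, a change to `ui.js` triggers a recheck of two files out of five. Because each file in the resulting set is a separate unit of work, the same set could also be split across parallel workers.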
Under the hood, Flow relies on a high-throughput low-latency systems infrastructure that enables distribution of tasks among parallel workers, and communication of results in parallel via shared memory. Combined with an architecture where the analysis of a codebase is updated automatically in the background on file system changes, Flow delivers near-instantaneous feedback as the developer edits and rebases code, even in a large repository.