Saturday, December 9, 2017

Spare me the tedium of “simple” linting, please

In the development environments of singly-typed languages like JavaScript and Python, one popular build step for improving code quality is the “linting” or “style” tool. Tools like jshint and pyflakes point out misspellings and references to missing variables, idioms that frequently lead to bugs like == usage, calls with apparently the wrong number of arguments, and many other things.

Much of this is meant to mechanize the enforcement of a team’s style guidelines—at their best, sincere and effective tactics for avoiding common sources of bugs. Unfortunately, many of these guidelines can seem excessively pedantic, forcing the programmer to deal with cases that could not possibly happen.

Normally, it would make sense to tell the programmers to just suck it up and follow the rules. However, this tactic can lead to a couple bad outcomes.

  1. The lint check can lose support among developers, for being more trouble than it’s worth. If programmers feel that a lint is causing more code quality issues than it’s solving, they’ll sensibly support its removal.
  2. Lints without developer support tend to be disabled sooner or later, or simply mechanically suppressed at each point they would be triggered, via “ignore” markers and the like. At that point, the bug-catching benefits of the lint are completely eliminated; in the worst case, you have a universally ignored warning, and are even worse off than if the warning was simply disabled.
  3. If the errors spotted by the lint are serious enough, but the lint warns for too many exceptional cases, the developers might decide to move the style rule to manual code review, with an opportunity to argue for exceptional cases. This is labor-intensive, carries a much longer feedback cycle, and makes it easy to accept an argument for an exception to the style rule that is actually erroneous, as it is not machine-checked.

Many of these “too many false positives” warnings could be a lot better if they simply had more information to work with about what will happen at runtime. That way, they can avoid emitting the warning where, according to the extra information, a construct that appears dangerous will not be a problem in practice.

That is one thing that type systems are very good at. So lint users are on the right track; the solution to their woes of needless ritual is more static analysis, rather than less.

Let’s consider some common lints in JavaScript and see how the knowledge derived from types can improve them, reducing their false positive rate or simply making them more broadly useful.

Suspicious truthiness tests

(This example has the benefit of a recent implementation along the semantic lines I describe, in Flow 0.52. I take no credit for any of the specific lint suggestions I make in this article.)

In JavaScript, a common idiom for checking whether an “object” is not null or undefined is to use its own “truthiness”.

if (o)        // checking whether 'o' is defined
if (o.magic)  // checking whether 'o's 'magic' property is defined

This is concise and does the trick perfectly—if the value being tested isn’t possibly a number, string, or boolean. If it can be only an object, null, or undefined, then this is fine, because even the empty object {} is truthy, while null and undefined are both falsey. Unfortunately, in JavaScript, other classes of data have “falsey” values among them, such as 0 for number.

The “lint” solution to this problem is to always compare to null directly.

if (o != null)                     // not null or undefined
if (o !== null && o !== undefined) // required in more pedantic code styles

This might be encapsulated in a function, but still doesn’t approach the gentle idiom afforded by exploiting objects’ universal truthiness. But the object idiom is simply unsafe if the value could possibly be something like a number. So the question of whether you should allow this exception to the “always compare to null explicitly” lint boils down to “can I be sure that this can only be an object?” And if in review you decide “yes”, this is a decision that must constantly be revisited as code changes elsewhere change the potential types of the expression.

You want to mechanically rule out “possible bugs” that are not really possible in your use of the idiom, so that the linter will not warn about benign use of truthiness—it will save the warnings for code where it could actually be a problem. Ruling out impossible cases so that you only need cope with the possible is just the sort of programming job where the type system fits in perfectly.

A type system can say “this expression is definitely an object, null, or undefined”, and type-aware linting can use that information to allow use of the truthiness test. If that data comes from an argument, it can enforce that callers—wherever else in the program they might be—will not violate the premise of its allowance by passing in numbers or something else.

A type system can also say “this expression will definitely be an object, never null”, or vice versa, thus taking the lint even further—it can now tell the programmer that the if check is useless, because it will always be truthy or falsey, respectively. This is just the sort of premise that’s incredibly hard for humans to verify across the whole program, continuously, but is child’s play for a type-checker.

A type system such as Flow can even say “this expression could have been something dangerous for the truthy test, like number | object, but you ruled out the dangerous possibilities with earlier if tests, so it’s fine here”. Manual exemption can regress in cases like this with something as simple as reordering “if-else if-else” chains a little carelessly—keep in mind here the decades of failure that “be sufficiently careful” has had as a bug-avoidance tactic in programming—but type-aware linting will catch this right away, waking up to declare its previous exemption from the style rule null and void.

The more precise your types in your program, the more understanding of your use of this idiom—in only valid places—the type-aware linter will be. It will not make the reasoning mistakes that human review would make when allowing its use, so you can use it with more confidence, knowing that the lint will only call out cases where there’s a genuine concern of a non-null falsey value slipping in. And there is no need to argue in code review about which cases need != null and which don’t, nor to revisit those decisions as the program evolves; as circumstances change, the type checker will point out when the verbose check becomes unnecessary, or when the succinct check becomes unsafe.

References to undeclared variables

It’s very common to mistake a variable name that isn’t defined for one that is. The larger a program gets, the easier this mistake is to make.

This mistake comes in a few forms. It may be a simple misspelling. It may be a variable you thought was defined here, but is actually defined somewhere else. It may be a variable defined later, but not yet.

Whatever the meaning of the error, linters can catch the problem via a relatively simple method. As the file is scanned, the linter keeps track of what variables are currently in scope. When encountering a variable reference, it checks the variable name against its current working list of variables. If one is not in the list, the linter reports it.

This is better than nothing. Compared to what you can do with a type checker, though, it’s not very good at all.

Suppose that you have a few local functions defined at the top level of your module, foo, bar, and baz. A linter will point out an undeclared variable if you try to call fop, gar, or bax. So you don’t have to wait for the browser reload or test cycle; you can correct these errors right away.

Later on, your module is getting larger, so you decide to group some functions into objects to clean up the top-level namespace. You decide that foo, bar, and baz fit under the top-level object q.

const q = {
    foo(...) {...}

    bar(...) {...}

    baz(...) {...}
}

During your refactoring, references to these three functions are rewritten to q.foo, q.bar, and q.baz, respectively. This is a nice way to avoid ad hoc name prefixes as a grouping mechanism; you’re using a first-class language feature to do the grouping instead.

But let’s give a moment’s consideration to q.fop, q.gar, and q.bax. The linter will verify that the reference to q is sound; it’s declared at the module level as const q. However, the linter will not then verify that the “member” references are sound; that is a fact about the structure of q, not its mere existence.

When all you have is “simple” linting, this becomes a tax on modularity, so to speak. If a variable is defined locally—very locally—references to it are checked. If it is defined remotely, whether in another module or simply “grouped” in a submodule, references to it are not checked.

A type system cuts the modularity tax by tracking the names that are beyond the purview of the simple linter. In the case of q, type-checking tracks more than its existence; its statically-known structural features are reqorded as part of its type.

  • In this module, q is defined.
    • It is an object,
    • and is known to have properties:
      • foo,
        • a function which is…;
      • bar,
        • a function which is…, and
      • baz,
        • a function which is…

This continues to work at whatever depth of recursion you like. It works across modules, too: if I want to call baz from another module, I can import this module in that one, perhaps as mm, and then the reference mm.q.baz will be allowed, but mm.q.bax flagged as an error.

An undeclared-variable linter is all well and good, but if you want to take it to its logical conclusion, you need type checking.

hasOwnProperty guards

The best lints focus on elements of code hygiene we’re reluctant to faithfully practice while writing code, but come back to fill us with regret when we fail to do so. One example arises with using for to iterate over objects in JavaScript; the lint checks that you’re guarding the loop with hasOwnProperty.

The purpose of these guards is to handle non-plain objects; that is to say, objects “with class”. The guards are always suggested by the linter to avoid nasty surprises should you try iterating over the properties of an object “with class”.

The irony of this check is that the intent of such code is usually to work with plain objects only, that is, classy objects should not be present in the first place! The construct is still perfectly usable with classy objects; it’s just that more caution is called for when using it in that fashion.

As such, there are two basic scenarios of for iteration over objects.

  1. iteration over plain objects only, and
  2. iteration over potentially classy objects.

The focus of the hasOwnProperty guard ought to be #2, but this concern bleeds over into #1 cases, needlessly or not depending on how pessimistic you are about Object.prototype extension. But this question is moot for the linter, which can’t tell the difference between the two scenarios in the first place.

By contrast, a type checker can make this distinction. It can decide whether a variable is definitely a plain object, definitely not one, or might or might not be one. With that information, depending on your preferences, it could choose not to warn about a missing hasOwnProperty guard if it knows it’s looking at scenario #1. So if a hasOwnProperty is needless, it need not pollute your code for the sake of the scenarios where it will be needed.

Computer, please try to keep up

Humans are pretty good at coming up with complex contractual preconditions and postconditions for their programs, and making those programs’ correct operation depend on the satisfaction of those contracts. But humans are very bad at verifying those contracts, even as they make them more complex and depend more on them.

“Simple” linting tools are good at checking for the fulfillment of very simple rules to help ensure correctness, freeing the human mind to focus on the more complex cases. What makes them a chore to deal with—a reason to put solving linter warnings on the “technical debt backlog” instead of addressing them immediately—is “they don’t know what we know”; they can handle laborious low-level checks, but lack the sophisticated analysis humans use to decide the difference between a correct program and an incorrect one.

Happily, through static analysis, we can get much closer to human-level understanding’s inferences about a program, while preserving the tireless calculation of contractual fulfillment that makes linting such a helpful companion in the development cycle, doing the parts that humans are terrible at. When your linter is so “simple” that it becomes a hindrance, it’s time to put it down, and pick up a type system instead.

No comments:

Post a Comment