Add documentation for AbstractWalker and Performance best practices (p…

…alantir#2179)
felixfbecker · Feb 9, 2017 · bca26e4 · bca26e4
1 parent d68ba8e
commit bca26e4
Show file tree

Hide file tree

Showing 2 changed files with 263 additions and 0 deletions.
diff --git a/docs/develop/custom-rules/migration.md b/docs/develop/custom-rules/migration.md
@@ -0,0 +1,124 @@
+## Using `WalkContext<T>` and `Rule#applyWithFunction`
+If you have a rule with a pretty simple implementation, you don't need to declare a class which extends the `Walker` class. Instead, you can define a callback function that accepts following argument:
+
+- `ctx: WalkContext<T>`: An object containing rule information, an object `options: T` containing the parsed rule arguments, the `ts.sourceFile` object, and functions for adding failures
+
+Use this callback as an argument to `applyWithFunction`. You can also pass your parsed rule arguments as optional 3rd parameter.
+
+Let's look at `no-null-keyword` as an example:
+```ts
+import * as ts from "typescript";
+import * as Lint from "tslint";
+
+export class Rule extends Lint.Rules.AbstractRule {
+    public static FAILURE_STRING = "Use 'undefined' instead of 'null'";
+
+    public apply(sourceFile: ts.SourceFile): Lint.RuleFailure[] {
+        // Call `applyWithFunction` with your callback function, `walk`.
+        // This creates a `WalkContext<T>` and passes it in as an argument.
+        // An optional 3rd parameter allows you to pass in a parsed version of `this.ruleArguments`. If used, it is not recommended to
+        //     simply pass in `this.getOptions()`, but to parse it into a more useful object instead.
+        return this.applyWithFunction(sourceFile, walk);
+    }
+}
+
+// Here, the options object type is `void` because we don't pass any options in this example.
+function walk(ctx: Lint.WalkContext<void>) {
+    // Recursively walk the AST starting with root node, `ctx.sourceFile`.
+    // Call the function `cb` (defined below) for each child.
+    return ts.forEachChild(ctx.sourceFile, cb);
+
+    function cb(node: ts.Node): void {
+        // Stop recursing further into the AST by returning early. Here, we ignore type nodes.
+        if (node.kind >= ts.SyntaxKind.FirstTypeNode && node.kind <= ts.SyntaxKind.LastTypeNode) {
+            return;
+        }
+
+        // Add failures using the `WalkContext<T>` object. Here, we add a failure if we find the null keyword.
+        if (node.kind === ts.SyntaxKind.NullKeyword) {
+            return ctx.addFailureAtNode(node, Rule.FAILURE_STRING);
+        }
+
+        // Continue recursion into the AST by calling function `cb` for every child of the current node.
+        return ts.forEachChild(node, cb);
+    }
+}
+```
+
+## Using `AbstractWalker<T>`
+If your rule implementation is a bit more involved than the above example, you can also implement it as a class.
+Simply extend `AbstractWalker` and implement the `walk` method.
+
+```ts
+import * as ts from "typescript";
+import * as Lint from "tslint";
+
+export class Rule extends Lint.Rules.AbstractRule {
+    public static FAILURE_STRING = "'magic numbers' are not allowed";
+
+    public static ALLOWED_NODES = new Set<ts.SyntaxKind>([
+        ...
+    ]);
+
+    public apply(sourceFile: ts.SourceFile): Lint.RuleFailure[] {
+        // We convert the `ruleArguments` into a useful format before passing it to the constructor of AbstractWalker.
+        return this.applyWithWalker(new NoMagicNumbersWalker(sourceFile, this.ruleName, new Set(this.ruleArguments.map(String))));
+    }
+}
+
+// The type parameter of AbstractWalker corresponds to the third constructor parameter.
+class NoMagicNumbersWalker extends Lint.AbstractWalker<Set<string>> {
+    public walk(sourceFile: ts.SourceFile) {
+        const cb = (node: ts.Node): void => {
+            // Finds specific node types and do checking.
+            if (node.kind === ts.SyntaxKind.NumericLiteral) {
+                this.checkNumericLiteral(node, (node as ts.NumericLiteral).text);
+            } else if (node.kind === ts.SyntaxKind.PrefixUnaryExpression &&
+                       (node as ts.PrefixUnaryExpression).operator === ts.SyntaxKind.MinusToken) {
+                this.checkNumericLiteral(node, "-" + ((node as ts.PrefixUnaryExpression).operand as ts.NumericLiteral).text);
+            } else {
+                // Continue rescursion: call function `cb` for all children of the current node.
+                return ts.forEachChild(node, cb);
+            }
+        };
+
+        // Start recursion for all children of `sourceFile`.
+        return ts.forEachChild(sourceFile, cb);
+    }
+
+    private checkNumericLiteral(node: ts.Node, num: string) {
+        // `this.options` is the third constructor parameter from above (the Set we created in `Rule.apply`)
+        if (!Rule.ALLOWED_NODES.has(node.parent!.kind) && !this.options.has(num)) {
+            // Add failures to the Walker.
+            this.addFailureAtNode(node, Rule.FAILURE_STRING);
+        }
+    }
+}
+```
+
+## Migrating from `RuleWalker` to `AbstractWalker`
+The main difference between `RuleWalker` and `AbstractWalker` is that you need to implement the AST recursion yourself. But why would you want to do that?
+__Performance!__ `RuleWalker` wants to be "one walker to rule them all" (pun intended). It's easy to use but that convenience 
+makes it slow by default. When implementing the walking yourself, you only need to do as much work as needed.
+
+Besides that you *should* convert the `ruleArguments` to a useful format before passing it to `AbstractWalker` as seen above.
+
+This table describes the equivalent methods between the two classes:
+
+`RuleWalker` | `AbstractWalker`
+------------ | --------------
+`this.createFailure()` and `this.addFailure()` | `this.addFailureAt()`
+`this.addFailureFromStartToEnd()` | `this.addFailure()`
+`this.createReplacement()` | `new Lint.Replacement()`
+`this.deleteText()` | `Lint.Replacement.deleteText()`
+`this.deleteFromTo()` | `Lint.Replacement.deleteFromTo()`
+`this.appendText()` | `Lint.Replacement.appendText()`
+`this.hasOption()` and `this.getOptions()` | use `this.options` directly
+`this.getLineAndCharacterOfPosition()` | `ts.getLineAndCharacterOfPosition(this.sourceFile, ...)`
+`this.getLimit()` | `this.sourceFile.end` 
+`this.getSourceFile()` | is available to be compatible, but prefer `this.sourceFile` 
+`this.getFailures()` | is available to be compatible, but prefer `this.failures`
+`this.skip()` | just don't use it, it's a noop
+`this.getRuleName()` | `this.ruleName`
+
+
diff --git a/docs/develop/custom-rules/performance.md b/docs/develop/custom-rules/performance.md
@@ -0,0 +1,139 @@
+## Performance tips
+
+### Don't call the LanguageService repeatedly
+The LanguageService is designed to serve editors. By design it does as little work to serve requests as possible.
+For most requests no cache is used.
+
+Let's say you need all usages of a variable. The LanguageService needs to check the whole AST subtree in which the variable is in scope.
+Doing that once is barely noticable. But doing it over and over again, will result in pretty bad performance (looking at you `no-unused-variable`).
+
+### Use the TypeChecker only when needed
+The TypeChecker is a really mighty tool, but that comes with a cost. To create a TypeChecker the Program first has to locate, read, parse and bind all SourceFiles referenced.
+To avoid that cost, try to avoid the TypeChecker where possible.
+
+If you are interested in the JSDoc of a function for example, you *could* ask the TypeChecker.
+But there's another way: call `.getChildren()` on the FunctionDeclaration and search for nodes of kind `ts.SyntaxKind.JSDocComment`.
+Those nodes will precede other nodes in the array.
+
+### Avoid walking the AST if possible
+Some rules work directly on the content of the source file.
+
+`max-file-line-count` and `linebreak-style` don't need to walk the AST at all.
+
+Other rules define exceptions: `no-consecutive-blank-lines` ignores template strings.
+To optimize for the best case, this rule can first look for failures in the source. 
+If and only if there are any failures, walk the AST to find the location of all template strings to filter the failures.
+
+### Implement your own walking algorithm
+Convenience comes with a price. When using `SyntaxWalker` or any subclass thereof like `RuleWalker` you pay the price for the
+big switch statement in `visitNode` which then calls the appropriate `visitXXX` method for **every** node in the AST, even if you don't use them.
+
+Use `AbstractWalker` instead and implement the `walk` method to fit the needs of your rule.
+It's as simple as this:
+```ts
+class MyWalker extends Lint.AbstractWalker<MyOptionsType> {
+    public walk(sourceFile: ts.SourceFile) {
+        const cb = (node: ts.Node): void => {
+            if (someCondition) {
+                // do stuff
+            }
+            // Wondering why return is used below? Refer to "Make use of tail calls"
+            return ts.forEachChild(node, cb); // recurse deeper
+        };
+        return ts.forEachChild(sourceFile, cb); // start recursion with children of sourceFile
+    }
+```
+
+### Don't walk the whole AST if possible
+__The Spec is your friend:__
+The language spec defines where each statement can occur. If you are interested in `import` statements for example, you only need to search 
+in `sourceFile.statements` and nested `NamespaceDeclaration`s.
+
+__Don't visit AST branches you're not interested in:__
+For example `no-null-keyword` creates no failure if the null keyword is part of another type.
+There are two ways to achieve this:
+
+* Recurse into the AST until you find a token of kind NullKeyword and then walk up its parent chain to find out if it is part of a type node
+* Stop recursing deeper into that branch as soon as you hit a type node (preferred)
+
+### Avoid frequently creating one-time closures in the hot path
+```ts
+class SomeClass {
+    // this is a simplified version of what SyntaxWalker does under the hood
+    doStuff(node: ts.Node) {
+        // do stuff ...
+
+        ts.forEachChild(node, (n) => this.doStuff(n));
+                           // ~~~~~~~~~~~~~~~~~~~~~~ [a new closure is created for EVERY node in the AST and remains on the call stack
+                           //                         until processing of all children is done]
+    }
+}
+```
+Instead use the same closure for every call like the example in [Implement your own walking algorithm](#Implement_your_own_walking_algorithm).
+
+### Create small specialized functions / methods
+Instead of stuffing the whole logic in a single closure, consider splitting it up into smaller functions or methods.
+Each function should handle similar kinds of nodes. Don't worry too much about the function call, since V8 eventually inlines the function
+if possible.
+
+The AST nodes have different properties, therefore they have a different hidden class in V8. A function can only be optimized for a certain 
+amount of different hidden classes. Above that threshold the function will be deoptimized and is never optimized again.
+
+### Pass the optional `sourceFile` parameter
+There are serveral methods that have an optional parameter `sourceFile`. Don't omit this parameter if you care for performance.
+If ommitted, typescript needs to walk up the node's parent chain until it reaches the SourceFile. This *can* be quite costly when done 
+frequently on deeply nested nodes.
+
+Some examples:
+
+* `node.getStart()`
+* `node.getWidth()`
+* `node.getText()`
+* `node.getChildren()`
+* `node.getFirstToken()`
+* `node.getLeadingTriviaWidth()`
+
+### Avoid excessive calls to `node.getStart()`, `node.getWidth()` and `node.getText()`
+`node.getStart()` scans the source to skip all the leading trivia. Although barely noticeable, this operation is not for free.
+If you need the start position of a node more than once per function, consider caching it.
+
+`node.getWidth()` is most of the time used together with `node.getStart()` to get the node's span. Internally it uses `node.getStart() - node.getEnd()` which effectively doubles the calls to `node.getStart()`. Consider using `node.getEnd()` instead and calculate the width yourself if necessary.
+
+`node.getText()` calculates the start of the node and returns a substring until the end of the token.
+Most of the time this not needed, because this substring is already contained in the node.
+```ts
+declare node: ts.Identifier;
+node.getText() === node.text; // prefer node.text where available
+```
+
+__Bonus points:__ If you know the width of the node (either from the `text` property or because it is a keyword of known width), 
+you can use `node.getEnd() - width` to calculate the node's start.
+`node.getEnd()` is effectively for free as it only returns the `end` property. This way you avoid the cost of skipping leading trivia.
+
+### Make use of tail calls
+Tail calls are function or method calls at the end of the control flow of a function. It's only a tail call if the return value of that call
+is directly returned unchanged. Browsers can optimize this pattern for performance.
+Further optimization is specced in ES2015 as "Proper Tail Calls".
+With proper tail calls the browser reuses the stack frame of the current function. When done right this allows for infinite recursion. 
+```ts
+function foo() {
+    if (condition)
+        return bar(); // tail call
+    if (someOtherCondition)
+        return foo() + 1; // no tail call, return value is modified
+    return baz(); // tail call
+}
+function bas() {
+    if (cond)
+        return someGlobalVariable = bar(); // no tail call, return value is stored in value before it is returned
+    foo(); // no tail call because there is no return
+}
+```
+
+### Typeguards
+Typeguard functions are very small by default. These functions will be inlined into the containing function.
+After inlining you no longer pay the cost of the function call.
+
+But beware of the inlining limit. If a function is big enough or already has many inlined functions, V8 will stop inlining other functions.
+
+Try to use a discriminated union if possible. A typeguard makes sense if you can save up multiple type assertions.