Skip to content

Commit

Permalink
Add documentation for AbstractWalker and Performance best practices (p…
Browse files Browse the repository at this point in the history
  • Loading branch information
ajafff authored and nchen63 committed Feb 9, 2017
1 parent d68ba8e commit bca26e4
Show file tree
Hide file tree
Showing 2 changed files with 263 additions and 0 deletions.
124 changes: 124 additions & 0 deletions docs/develop/custom-rules/migration.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,124 @@
## Using `WalkContext<T>` and `Rule#applyWithFunction`
If you have a rule with a pretty simple implementation, you don't need to declare a class which extends the `Walker` class. Instead, you can define a callback function that accepts following argument:

- `ctx: WalkContext<T>`: An object containing rule information, an object `options: T` containing the parsed rule arguments, the `ts.sourceFile` object, and functions for adding failures

Use this callback as an argument to `applyWithFunction`. You can also pass your parsed rule arguments as optional 3rd parameter.

Let's look at `no-null-keyword` as an example:
```ts
import * as ts from "typescript";
import * as Lint from "tslint";

export class Rule extends Lint.Rules.AbstractRule {
public static FAILURE_STRING = "Use 'undefined' instead of 'null'";

public apply(sourceFile: ts.SourceFile): Lint.RuleFailure[] {
// Call `applyWithFunction` with your callback function, `walk`.
// This creates a `WalkContext<T>` and passes it in as an argument.
// An optional 3rd parameter allows you to pass in a parsed version of `this.ruleArguments`. If used, it is not recommended to
// simply pass in `this.getOptions()`, but to parse it into a more useful object instead.
return this.applyWithFunction(sourceFile, walk);
}
}

// Here, the options object type is `void` because we don't pass any options in this example.
function walk(ctx: Lint.WalkContext<void>) {
// Recursively walk the AST starting with root node, `ctx.sourceFile`.
// Call the function `cb` (defined below) for each child.
return ts.forEachChild(ctx.sourceFile, cb);

function cb(node: ts.Node): void {
// Stop recursing further into the AST by returning early. Here, we ignore type nodes.
if (node.kind >= ts.SyntaxKind.FirstTypeNode && node.kind <= ts.SyntaxKind.LastTypeNode) {
return;
}

// Add failures using the `WalkContext<T>` object. Here, we add a failure if we find the null keyword.
if (node.kind === ts.SyntaxKind.NullKeyword) {
return ctx.addFailureAtNode(node, Rule.FAILURE_STRING);
}

// Continue recursion into the AST by calling function `cb` for every child of the current node.
return ts.forEachChild(node, cb);
}
}
```

## Using `AbstractWalker<T>`
If your rule implementation is a bit more involved than the above example, you can also implement it as a class.
Simply extend `AbstractWalker` and implement the `walk` method.

```ts
import * as ts from "typescript";
import * as Lint from "tslint";

export class Rule extends Lint.Rules.AbstractRule {
public static FAILURE_STRING = "'magic numbers' are not allowed";

public static ALLOWED_NODES = new Set<ts.SyntaxKind>([
...
]);

public apply(sourceFile: ts.SourceFile): Lint.RuleFailure[] {
// We convert the `ruleArguments` into a useful format before passing it to the constructor of AbstractWalker.
return this.applyWithWalker(new NoMagicNumbersWalker(sourceFile, this.ruleName, new Set(this.ruleArguments.map(String))));
}
}

// The type parameter of AbstractWalker corresponds to the third constructor parameter.
class NoMagicNumbersWalker extends Lint.AbstractWalker<Set<string>> {
public walk(sourceFile: ts.SourceFile) {
const cb = (node: ts.Node): void => {
// Finds specific node types and do checking.
if (node.kind === ts.SyntaxKind.NumericLiteral) {
this.checkNumericLiteral(node, (node as ts.NumericLiteral).text);
} else if (node.kind === ts.SyntaxKind.PrefixUnaryExpression &&
(node as ts.PrefixUnaryExpression).operator === ts.SyntaxKind.MinusToken) {
this.checkNumericLiteral(node, "-" + ((node as ts.PrefixUnaryExpression).operand as ts.NumericLiteral).text);
} else {
// Continue rescursion: call function `cb` for all children of the current node.
return ts.forEachChild(node, cb);
}
};

// Start recursion for all children of `sourceFile`.
return ts.forEachChild(sourceFile, cb);
}

private checkNumericLiteral(node: ts.Node, num: string) {
// `this.options` is the third constructor parameter from above (the Set we created in `Rule.apply`)
if (!Rule.ALLOWED_NODES.has(node.parent!.kind) && !this.options.has(num)) {
// Add failures to the Walker.
this.addFailureAtNode(node, Rule.FAILURE_STRING);
}
}
}
```

## Migrating from `RuleWalker` to `AbstractWalker`
The main difference between `RuleWalker` and `AbstractWalker` is that you need to implement the AST recursion yourself. But why would you want to do that?
__Performance!__ `RuleWalker` wants to be "one walker to rule them all" (pun intended). It's easy to use but that convenience
makes it slow by default. When implementing the walking yourself, you only need to do as much work as needed.

Besides that you *should* convert the `ruleArguments` to a useful format before passing it to `AbstractWalker` as seen above.

This table describes the equivalent methods between the two classes:

`RuleWalker` | `AbstractWalker`
------------ | --------------
`this.createFailure()` and `this.addFailure()` | `this.addFailureAt()`
`this.addFailureFromStartToEnd()` | `this.addFailure()`
`this.createReplacement()` | `new Lint.Replacement()`
`this.deleteText()` | `Lint.Replacement.deleteText()`
`this.deleteFromTo()` | `Lint.Replacement.deleteFromTo()`
`this.appendText()` | `Lint.Replacement.appendText()`
`this.hasOption()` and `this.getOptions()` | use `this.options` directly
`this.getLineAndCharacterOfPosition()` | `ts.getLineAndCharacterOfPosition(this.sourceFile, ...)`
`this.getLimit()` | `this.sourceFile.end`
`this.getSourceFile()` | is available to be compatible, but prefer `this.sourceFile`
`this.getFailures()` | is available to be compatible, but prefer `this.failures`
`this.skip()` | just don't use it, it's a noop
`this.getRuleName()` | `this.ruleName`


139 changes: 139 additions & 0 deletions docs/develop/custom-rules/performance.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,139 @@
## Performance tips

### Don't call the LanguageService repeatedly
The LanguageService is designed to serve editors. By design it does as little work to serve requests as possible.
For most requests no cache is used.

Let's say you need all usages of a variable. The LanguageService needs to check the whole AST subtree in which the variable is in scope.
Doing that once is barely noticable. But doing it over and over again, will result in pretty bad performance (looking at you `no-unused-variable`).

### Use the TypeChecker only when needed
The TypeChecker is a really mighty tool, but that comes with a cost. To create a TypeChecker the Program first has to locate, read, parse and bind all SourceFiles referenced.
To avoid that cost, try to avoid the TypeChecker where possible.

If you are interested in the JSDoc of a function for example, you *could* ask the TypeChecker.
But there's another way: call `.getChildren()` on the FunctionDeclaration and search for nodes of kind `ts.SyntaxKind.JSDocComment`.
Those nodes will precede other nodes in the array.

### Avoid walking the AST if possible
Some rules work directly on the content of the source file.

`max-file-line-count` and `linebreak-style` don't need to walk the AST at all.

Other rules define exceptions: `no-consecutive-blank-lines` ignores template strings.
To optimize for the best case, this rule can first look for failures in the source.
If and only if there are any failures, walk the AST to find the location of all template strings to filter the failures.

### Implement your own walking algorithm
Convenience comes with a price. When using `SyntaxWalker` or any subclass thereof like `RuleWalker` you pay the price for the
big switch statement in `visitNode` which then calls the appropriate `visitXXX` method for **every** node in the AST, even if you don't use them.

Use `AbstractWalker` instead and implement the `walk` method to fit the needs of your rule.
It's as simple as this:
```ts
class MyWalker extends Lint.AbstractWalker<MyOptionsType> {
public walk(sourceFile: ts.SourceFile) {
const cb = (node: ts.Node): void => {
if (someCondition) {
// do stuff
}
// Wondering why return is used below? Refer to "Make use of tail calls"
return ts.forEachChild(node, cb); // recurse deeper
};
return ts.forEachChild(sourceFile, cb); // start recursion with children of sourceFile
}
```
### Don't walk the whole AST if possible
__The Spec is your friend:__
The language spec defines where each statement can occur. If you are interested in `import` statements for example, you only need to search
in `sourceFile.statements` and nested `NamespaceDeclaration`s.
__Don't visit AST branches you're not interested in:__
For example `no-null-keyword` creates no failure if the null keyword is part of another type.
There are two ways to achieve this:
* Recurse into the AST until you find a token of kind NullKeyword and then walk up its parent chain to find out if it is part of a type node
* Stop recursing deeper into that branch as soon as you hit a type node (preferred)
### Avoid frequently creating one-time closures in the hot path
```ts
class SomeClass {
// this is a simplified version of what SyntaxWalker does under the hood
doStuff(node: ts.Node) {
// do stuff ...

ts.forEachChild(node, (n) => this.doStuff(n));
// ~~~~~~~~~~~~~~~~~~~~~~ [a new closure is created for EVERY node in the AST and remains on the call stack
// until processing of all children is done]
}
}
```
Instead use the same closure for every call like the example in [Implement your own walking algorithm](#Implement_your_own_walking_algorithm).
### Create small specialized functions / methods
Instead of stuffing the whole logic in a single closure, consider splitting it up into smaller functions or methods.
Each function should handle similar kinds of nodes. Don't worry too much about the function call, since V8 eventually inlines the function
if possible.
The AST nodes have different properties, therefore they have a different hidden class in V8. A function can only be optimized for a certain
amount of different hidden classes. Above that threshold the function will be deoptimized and is never optimized again.
### Pass the optional `sourceFile` parameter
There are serveral methods that have an optional parameter `sourceFile`. Don't omit this parameter if you care for performance.
If ommitted, typescript needs to walk up the node's parent chain until it reaches the SourceFile. This *can* be quite costly when done
frequently on deeply nested nodes.
Some examples:
* `node.getStart()`
* `node.getWidth()`
* `node.getText()`
* `node.getChildren()`
* `node.getFirstToken()`
* `node.getLeadingTriviaWidth()`
### Avoid excessive calls to `node.getStart()`, `node.getWidth()` and `node.getText()`
`node.getStart()` scans the source to skip all the leading trivia. Although barely noticeable, this operation is not for free.
If you need the start position of a node more than once per function, consider caching it.
`node.getWidth()` is most of the time used together with `node.getStart()` to get the node's span. Internally it uses `node.getStart() - node.getEnd()` which effectively doubles the calls to `node.getStart()`. Consider using `node.getEnd()` instead and calculate the width yourself if necessary.
`node.getText()` calculates the start of the node and returns a substring until the end of the token.
Most of the time this not needed, because this substring is already contained in the node.
```ts
declare node: ts.Identifier;
node.getText() === node.text; // prefer node.text where available
```
__Bonus points:__ If you know the width of the node (either from the `text` property or because it is a keyword of known width),
you can use `node.getEnd() - width` to calculate the node's start.
`node.getEnd()` is effectively for free as it only returns the `end` property. This way you avoid the cost of skipping leading trivia.
### Make use of tail calls
Tail calls are function or method calls at the end of the control flow of a function. It's only a tail call if the return value of that call
is directly returned unchanged. Browsers can optimize this pattern for performance.
Further optimization is specced in ES2015 as "Proper Tail Calls".
With proper tail calls the browser reuses the stack frame of the current function. When done right this allows for infinite recursion.
```ts
function foo() {
if (condition)
return bar(); // tail call
if (someOtherCondition)
return foo() + 1; // no tail call, return value is modified
return baz(); // tail call
}
function bas() {
if (cond)
return someGlobalVariable = bar(); // no tail call, return value is stored in value before it is returned
foo(); // no tail call because there is no return
}
```
### Typeguards
Typeguard functions are very small by default. These functions will be inlined into the containing function.
After inlining you no longer pay the cost of the function call.
But beware of the inlining limit. If a function is big enough or already has many inlined functions, V8 will stop inlining other functions.
Try to use a discriminated union if possible. A typeguard makes sense if you can save up multiple type assertions.

0 comments on commit bca26e4

Please sign in to comment.