Merge pull request cockroachdb#6135 from solongordon/improve-storing-…
…description

Improve STORING description
solongordon authored Dec 12, 2019
2 parents 253b54a + a1d44a9 commit e1626b3
Showing 7 changed files with 267 additions and 70 deletions.
51 changes: 41 additions & 10 deletions v1.0/indexes.md
@@ -50,10 +50,10 @@ To maximize your indexes' performance, we recommend following a few [best practi

## Best Practices

We recommend creating indexes for all of your common queries. To design the most useful indexes, look at each query's `WHERE` and `FROM` clauses, and create indexes that:
We recommend creating indexes for all of your common queries. To design the most useful indexes, look at each query's `WHERE` and `SELECT` clauses, and create indexes that:

- [Index all columns](#indexing-columns) in the `WHERE` clause.
- [Store columns](#storing-columns) that are _only_ in the `FROM` clause.
- [Store columns](#storing-columns) that are _only_ in the `SELECT` clause.

### Indexing Columns

@@ -66,24 +66,55 @@ When designing indexes, it's important to consider which columns you index and t

### Storing Columns

Storing a column optimizes the performance of queries that retrieve its values (i.e., in the `FROM` clause) but don’t filter them. This is because indexing values is only useful when they're filtered, but it's still faster for SQL to retrieve values in the index it's already scanning rather than reaching back to the table itself.

However, for SQL to use stored columns, queries must filter another column in the same index.
The `STORING` clause specifies columns which are not part of the index key but should be stored in the index. This optimizes queries which retrieve those columns without filtering on them, because it prevents the need to read the primary index.

### Example

If you wanted to optimize the performance of the following queries:
Say we have a table with three columns, two of which are indexed:

{% include copy-clipboard.html %}
~~~ sql
> SELECT col1 FROM tbl WHERE col1 = 10;
> CREATE TABLE tbl (col1 INT, col2 INT, col3 INT, INDEX (col1, col2));
~~~

> SELECT col1, col2, col3 FROM tbl WHERE col1 = 10 AND col2 > 1;
If we filter on the indexed columns but retrieve the unindexed column, this requires reading `col3` from the primary index via an "index join."

{% include copy-clipboard.html %}
~~~ sql
> EXPLAIN SELECT col3 FROM tbl WHERE col1 = 10 AND col2 > 1;
~~~

~~~
tree | field | description
+-----------------+-------------+-----------------------+
render | |
└── index-join | |
│ | table | tbl@primary
│ | key columns | rowid
└── scan | |
| table | tbl@tbl_col1_col2_idx
| spans | /10/2-/11
~~~

You could create a single index of `col1` and `col2` that stores `col3`:
However, if we store `col3` in the index, the index join is no longer necessary. This means our query only needs to read from the secondary index, so it will be more efficient.

{% include copy-clipboard.html %}
~~~ sql
> CREATE INDEX ON tbl (col1, col2) STORING (col3);
> CREATE TABLE tbl (col1 INT, col2 INT, col3 INT, INDEX (col1, col2) STORING (col3));
~~~

{% include copy-clipboard.html %}
~~~ sql
> EXPLAIN SELECT col3 FROM tbl WHERE col1 = 10 AND col2 > 1;
~~~

~~~
tree | field | description
+-----------+-------------+-------------------+
render | |
└── scan | |
| table | tbl@tbl_col1_col2_idx
| spans | /10/2-/11
~~~
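
To make the difference between the two plans concrete, here is a minimal Python sketch of a toy model. It is not CockroachDB's implementation: the names `SecondaryIndex`, `build_primary`, `build_index`, and `select_col3` are hypothetical, rows are plain tuples, and `NULL` values are ignored. It runs the example query against an index on `(col1, col2)` with and without `col3` stored, and counts how many primary index reads each version needs.

```python
import bisect
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Row = Tuple[int, int, int]  # (col1, col2, col3)


@dataclass
class SecondaryIndex:
    """Toy secondary index on (col1, col2), kept sorted by key."""
    keys: List[Tuple[int, int]]
    rowids: List[int]
    stored_col3: Optional[List[int]]  # None when col3 is not stored


def build_primary(rows: List[Row]) -> Dict[int, Row]:
    """Toy primary index: an implicit rowid maps to the full row."""
    return {rowid: row for rowid, row in enumerate(rows, start=1)}


def build_index(primary: Dict[int, Row], store_col3: bool) -> SecondaryIndex:
    """Build the index, optionally copying col3 into each entry."""
    ordered = sorted((row[0], row[1], rowid) for rowid, row in primary.items())
    keys = [(c1, c2) for c1, c2, _ in ordered]
    rowids = [rowid for _, _, rowid in ordered]
    stored = [primary[r][2] for r in rowids] if store_col3 else None
    return SecondaryIndex(keys, rowids, stored)


def select_col3(primary: Dict[int, Row], index: SecondaryIndex) -> Tuple[List[int], int]:
    """Model SELECT col3 FROM tbl WHERE col1 = 10 AND col2 > 1.

    Returns the col3 values and the number of primary index reads.
    """
    # Scan only the index span from (10, 2) up to, but excluding, col1 = 11.
    lo = bisect.bisect_left(index.keys, (10, 2))
    hi = bisect.bisect_left(index.keys, (11,))
    values: List[int] = []
    primary_reads = 0
    for i in range(lo, hi):
        if index.stored_col3 is not None:
            # col3 is stored in the index entry: no index join needed.
            values.append(index.stored_col3[i])
        else:
            # Index join: follow the rowid back to the primary index.
            primary_reads += 1
            values.append(primary[index.rowids[i]][2])
    return values, primary_reads


if __name__ == "__main__":
    tbl = build_primary([(10, 0, 100), (10, 2, 200), (10, 5, 300), (11, 3, 400)])
    print(select_col3(tbl, build_index(tbl, store_col3=False)))
    print(select_col3(tbl, build_index(tbl, store_col3=True)))
```

In this model both versions return the same values. The only difference is that the index without stored columns performs one primary index read per matching row, while the storing index performs none.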

## See Also
51 changes: 41 additions & 10 deletions v1.1/indexes.md
@@ -50,10 +50,10 @@ To maximize your indexes' performance, we recommend following a few [best practi

## Best Practices

We recommend creating indexes for all of your common queries. To design the most useful indexes, look at each query's `WHERE` and `FROM` clauses, and create indexes that:
We recommend creating indexes for all of your common queries. To design the most useful indexes, look at each query's `WHERE` and `SELECT` clauses, and create indexes that:

- [Index all columns](#indexing-columns) in the `WHERE` clause.
- [Store columns](#storing-columns) that are _only_ in the `FROM` clause.
- [Store columns](#storing-columns) that are _only_ in the `SELECT` clause.

### Indexing Columns

@@ -66,24 +66,55 @@ When designing indexes, it's important to consider which columns you index and t

### Storing Columns

Storing a column optimizes the performance of queries that retrieve its values (i.e., in the `FROM` clause) but don’t filter them. This is because indexing values is only useful when they're filtered, but it's still faster for SQL to retrieve values in the index it's already scanning rather than reaching back to the table itself.

However, for SQL to use stored columns, queries must filter another column in the same index.
The `STORING` clause specifies columns which are not part of the index key but should be stored in the index. This optimizes queries which retrieve those columns without filtering on them, because it prevents the need to read the primary index.

### Example

If you wanted to optimize the performance of the following queries:
Say we have a table with three columns, two of which are indexed:

{% include copy-clipboard.html %}
~~~ sql
> SELECT col1 FROM tbl WHERE col1 = 10;
> CREATE TABLE tbl (col1 INT, col2 INT, col3 INT, INDEX (col1, col2));
~~~

> SELECT col1, col2, col3 FROM tbl WHERE col1 = 10 AND col2 > 1;
If we filter on the indexed columns but retrieve the unindexed column, this requires reading `col3` from the primary index via an "index join."

{% include copy-clipboard.html %}
~~~ sql
> EXPLAIN SELECT col3 FROM tbl WHERE col1 = 10 AND col2 > 1;
~~~

~~~
tree | field | description
+-----------------+-------------+-----------------------+
render | |
└── index-join | |
│ | table | tbl@primary
│ | key columns | rowid
└── scan | |
| table | tbl@tbl_col1_col2_idx
| spans | /10/2-/11
~~~

You could create a single index of `col1` and `col2` that stores `col3`:
However, if we store `col3` in the index, the index join is no longer necessary. This means our query only needs to read from the secondary index, so it will be more efficient.

{% include copy-clipboard.html %}
~~~ sql
> CREATE INDEX ON tbl (col1, col2) STORING (col3);
> CREATE TABLE tbl (col1 INT, col2 INT, col3 INT, INDEX (col1, col2) STORING (col3));
~~~

{% include copy-clipboard.html %}
~~~ sql
> EXPLAIN SELECT col3 FROM tbl WHERE col1 = 10 AND col2 > 1;
~~~

~~~
tree | field | description
+-----------+-------------+-------------------+
render | |
└── scan | |
| table | tbl@tbl_col1_col2_idx
| spans | /10/2-/11
~~~

## See Also
46 changes: 36 additions & 10 deletions v19.1/indexes.md
@@ -49,10 +49,10 @@ To maximize your indexes' performance, we recommend following a few [best practi

## Best practices

We recommend creating indexes for all of your common queries. To design the most useful indexes, look at each query's `WHERE` and `FROM` clauses, and create indexes that:
We recommend creating indexes for all of your common queries. To design the most useful indexes, look at each query's `WHERE` and `SELECT` clauses, and create indexes that:

- [Index all columns](#indexing-columns) in the `WHERE` clause.
- [Store columns](#storing-columns) that are _only_ in the `FROM` clause.
- [Store columns](#storing-columns) that are _only_ in the `SELECT` clause.

{{site.data.alerts.callout_success}}
For more information about how to tune CockroachDB's performance, see [SQL Performance Best Practices](performance-best-practices-overview.html) and the [Performance Tuning](performance-tuning.html) tutorial.
@@ -69,29 +69,55 @@ When designing indexes, it's important to consider which columns you index and t

### Storing columns

Storing a column optimizes the performance of queries that retrieve its values (i.e., in the `FROM` clause) but do not filter them. This is because indexing values is only useful when they're filtered, but it's still faster for SQL to retrieve values in the index it's already scanning rather than reaching back to the table itself.

However, for SQL to use stored columns, queries must filter another column in the same index.
The `STORING` clause specifies columns which are not part of the index key but should be stored in the index. This optimizes queries which retrieve those columns without filtering on them, because it prevents the need to read the primary index.

### Example

If you wanted to optimize the performance of the following queries:
Say we have a table with three columns, two of which are indexed:

{% include copy-clipboard.html %}
~~~ sql
> SELECT col1 FROM tbl WHERE col1 = 10;
> CREATE TABLE tbl (col1 INT, col2 INT, col3 INT, INDEX (col1, col2));
~~~

If we filter on the indexed columns but retrieve the unindexed column, this requires reading `col3` from the primary index via an "index join."

{% include copy-clipboard.html %}
~~~ sql
> SELECT col1, col2, col3 FROM tbl WHERE col1 = 10 AND col2 > 1;
> EXPLAIN SELECT col3 FROM tbl WHERE col1 = 10 AND col2 > 1;
~~~

~~~
tree | field | description
+-----------------+-------------+-----------------------+
render | |
└── index-join | |
│ | table | tbl@primary
│ | key columns | rowid
└── scan | |
| table | tbl@tbl_col1_col2_idx
| spans | /10/2-/11
~~~

You could create a single index of `col1` and `col2` that stores `col3`:
However, if we store `col3` in the index, the index join is no longer necessary. This means our query only needs to read from the secondary index, so it will be more efficient.

{% include copy-clipboard.html %}
~~~ sql
> CREATE TABLE tbl (col1 INT, col2 INT, col3 INT, INDEX (col1, col2) STORING (col3));
~~~

{% include copy-clipboard.html %}
~~~ sql
> CREATE INDEX ON tbl (col1, col2) STORING (col3);
> EXPLAIN SELECT col3 FROM tbl WHERE col1 = 10 AND col2 > 1;
~~~

~~~
tree | field | description
+-----------+-------------+-------------------+
render | |
└── scan | |
| table | tbl@tbl_col1_col2_idx
| spans | /10/2-/11
~~~

## See also
46 changes: 36 additions & 10 deletions v19.2/indexes.md
@@ -49,10 +49,10 @@ To maximize your indexes' performance, we recommend following a few [best practi

## Best practices

We recommend creating indexes for all of your common queries. To design the most useful indexes, look at each query's `WHERE` and `FROM` clauses, and create indexes that:
We recommend creating indexes for all of your common queries. To design the most useful indexes, look at each query's `WHERE` and `SELECT` clauses, and create indexes that:

- [Index all columns](#indexing-columns) in the `WHERE` clause.
- [Store columns](#storing-columns) that are _only_ in the `FROM` clause.
- [Store columns](#storing-columns) that are _only_ in the `SELECT` clause.

{{site.data.alerts.callout_success}}
For more information about how to tune CockroachDB's performance, see [SQL Performance Best Practices](performance-best-practices-overview.html) and the [Performance Tuning](performance-tuning.html) tutorial.
@@ -69,29 +69,55 @@ When designing indexes, it's important to consider which columns you index and t

### Storing columns

Storing a column optimizes the performance of queries that retrieve its values (i.e., in the `FROM` clause) but do not filter them. This is because indexing values is only useful when they're filtered, but it's still faster for SQL to retrieve values in the index it's already scanning rather than reaching back to the table itself.

However, for SQL to use stored columns, queries must filter another column in the same index.
The `STORING` clause specifies columns which are not part of the index key but should be stored in the index. This optimizes queries which retrieve those columns without filtering on them, because it prevents the need to read the primary index.

### Example

If you wanted to optimize the performance of the following queries:
Say we have a table with three columns, two of which are indexed:

{% include copy-clipboard.html %}
~~~ sql
> SELECT col1 FROM tbl WHERE col1 = 10;
> CREATE TABLE tbl (col1 INT, col2 INT, col3 INT, INDEX (col1, col2));
~~~

If we filter on the indexed columns but retrieve the unindexed column, this requires reading `col3` from the primary index via an "index join."

{% include copy-clipboard.html %}
~~~ sql
> SELECT col1, col2, col3 FROM tbl WHERE col1 = 10 AND col2 > 1;
> EXPLAIN SELECT col3 FROM tbl WHERE col1 = 10 AND col2 > 1;
~~~

~~~
tree | field | description
+-----------------+-------------+-----------------------+
render | |
└── index-join | |
│ | table | tbl@primary
│ | key columns | rowid
└── scan | |
| table | tbl@tbl_col1_col2_idx
| spans | /10/2-/11
~~~

You could create a single index of `col1` and `col2` that stores `col3`:
However, if we store `col3` in the index, the index join is no longer necessary. This means our query only needs to read from the secondary index, so it will be more efficient.

{% include copy-clipboard.html %}
~~~ sql
> CREATE TABLE tbl (col1 INT, col2 INT, col3 INT, INDEX (col1, col2) STORING (col3));
~~~

{% include copy-clipboard.html %}
~~~ sql
> CREATE INDEX ON tbl (col1, col2) STORING (col3);
> EXPLAIN SELECT col3 FROM tbl WHERE col1 = 10 AND col2 > 1;
~~~

~~~
tree | field | description
+-----------+-------------+-------------------+
render | |
└── scan | |
| table | tbl@tbl_col1_col2_idx
| spans | /10/2-/11
~~~

## See also
51 changes: 41 additions & 10 deletions v2.0/indexes.md
@@ -50,10 +50,10 @@ To maximize your indexes' performance, we recommend following a few [best practi

## Best Practices

We recommend creating indexes for all of your common queries. To design the most useful indexes, look at each query's `WHERE` and `FROM` clauses, and create indexes that:
We recommend creating indexes for all of your common queries. To design the most useful indexes, look at each query's `WHERE` and `SELECT` clauses, and create indexes that:

- [Index all columns](#indexing-columns) in the `WHERE` clause.
- [Store columns](#storing-columns) that are _only_ in the `FROM` clause.
- [Store columns](#storing-columns) that are _only_ in the `SELECT` clause.

### Indexing Columns

@@ -66,24 +66,55 @@ When designing indexes, it's important to consider which columns you index and t

### Storing Columns

Storing a column optimizes the performance of queries that retrieve its values (i.e., in the `FROM` clause) but don’t filter them. This is because indexing values is only useful when they're filtered, but it's still faster for SQL to retrieve values in the index it's already scanning rather than reaching back to the table itself.

However, for SQL to use stored columns, queries must filter another column in the same index.
The `STORING` clause specifies columns which are not part of the index key but should be stored in the index. This optimizes queries which retrieve those columns without filtering on them, because it prevents the need to read the primary index.

### Example

If you wanted to optimize the performance of the following queries:
Say we have a table with three columns, two of which are indexed:

{% include copy-clipboard.html %}
~~~ sql
> SELECT col1 FROM tbl WHERE col1 = 10;
> CREATE TABLE tbl (col1 INT, col2 INT, col3 INT, INDEX (col1, col2));
~~~

> SELECT col1, col2, col3 FROM tbl WHERE col1 = 10 AND col2 > 1;
If we filter on the indexed columns but retrieve the unindexed column, this requires reading `col3` from the primary index via an "index join."

{% include copy-clipboard.html %}
~~~ sql
> EXPLAIN SELECT col3 FROM tbl WHERE col1 = 10 AND col2 > 1;
~~~

~~~
tree | field | description
+-----------------+-------------+-----------------------+
render | |
└── index-join | |
│ | table | tbl@primary
│ | key columns | rowid
└── scan | |
| table | tbl@tbl_col1_col2_idx
| spans | /10/2-/11
~~~

You could create a single index of `col1` and `col2` that stores `col3`:
However, if we store `col3` in the index, the index join is no longer necessary. This means our query only needs to read from the secondary index, so it will be more efficient.

{% include copy-clipboard.html %}
~~~ sql
> CREATE INDEX ON tbl (col1, col2) STORING (col3);
> CREATE TABLE tbl (col1 INT, col2 INT, col3 INT, INDEX (col1, col2) STORING (col3));
~~~

{% include copy-clipboard.html %}
~~~ sql
> EXPLAIN SELECT col3 FROM tbl WHERE col1 = 10 AND col2 > 1;
~~~

~~~
tree | field | description
+-----------+-------------+-------------------+
render | |
└── scan | |
| table | tbl@tbl_col1_col2_idx
| spans | /10/2-/11
~~~

## See Also