[SQL] Documentation: Explain cacheTable command
add the `cacheTable` specification

Author: CrazyJvm <[email protected]>

Closes apache#1681 from CrazyJvm/sql-programming-guide-cache and squashes the following commits:

0a231e0 [CrazyJvm] grammar fixes
a04020e [CrazyJvm] modify title to Cached tables
18b6594 [CrazyJvm] fix format
2cbbf58 [CrazyJvm] add cacheTable guide
CrazyJvm authored and pwendell committed Aug 1, 2014
1 parent c0b47ba commit c82fe47
Showing 1 changed file with 10 additions and 0 deletions.
10 changes: 10 additions & 0 deletions docs/sql-programming-guide.md
@@ -769,3 +769,13 @@ To start the Spark SQL CLI, run the following in the Spark directory:
Configuration of Hive is done by placing your `hive-site.xml` file in `conf/`.
You may run `./bin/spark-sql --help` for a complete list of all available
options.

# Cached tables

Spark SQL can cache tables in an in-memory columnar format when you call `cacheTable("tableName")`.
Spark SQL will then scan only the required columns and automatically tune compression to minimize
memory usage and GC pressure. Call `uncacheTable("tableName")` to remove the table from memory.

Note that if you call `cache` rather than `cacheTable`, tables will _not_ be cached using the
in-memory columnar format, so we strongly recommend using `cacheTable` whenever you want to
cache tables.
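
As a rough illustration of the calls described above, here is a minimal sketch that assumes the Spark 1.0/1.1-era `SchemaRDD` API. The `Person` case class, the `people` table name, the sample rows, and the local master setting are all hypothetical and not part of the guide; `registerAsTable` is the 1.0.x name for the registration call, which later releases renamed.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical record type used only for this sketch.
case class Person(name: String, age: Int)

object CacheTableSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a local master, so the sketch can run without a cluster.
    val conf = new SparkConf().setAppName("CacheTableSketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Implicit conversion from an RDD of case classes to a SchemaRDD.
    import sqlContext.createSchemaRDD

    // Hypothetical sample data registered under the table name "people".
    val people = sc.parallelize(Seq(Person("Alice", 29), Person("Bob", 35)))
    people.registerAsTable("people") // Spark 1.0.x name; an assumption about the version in use

    // Ask Spark SQL to cache the table in its in-memory columnar format.
    sqlContext.cacheTable("people")

    // Only the `name` and `age` columns are needed to answer this query.
    sqlContext.sql("SELECT name FROM people WHERE age > 30").collect().foreach(println)

    // Release the cached table once it is no longer needed.
    sqlContext.uncacheTable("people")

    sc.stop()
  }
}
```

The table is addressed by its registered name in both `cacheTable` and `uncacheTable`, which is why the sketch registers the RDD as a table before caching it.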
