[SPARK-27811][CORE][DOCS] Improve docs about spark.driver.memoryOverhead and spark.executor.memoryOverhead.

## What changes were proposed in this pull request?

I found that the docs for `spark.driver.memoryOverhead` and `spark.executor.memoryOverhead` are a little ambiguous.
For example, the original doc for `spark.driver.memoryOverhead` starts with `The amount of off-heap memory to be allocated per driver in cluster mode`.
But `MemoryManager` also manages a memory area named off-heap, used for allocation in Tungsten mode.
So the description of `spark.driver.memoryOverhead` is easy to confuse with that area.

`spark.executor.memoryOverhead` has the same ambiguity as `spark.driver.memoryOverhead`.
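
As a hedged illustration (not part of this patch), the following sketch contrasts the two settings: `spark.executor.memoryOverhead` sizes container-level non-heap memory, while `spark.memory.offHeap.size` sizes the `MemoryManager` off-heap area used in Tungsten mode. The values are placeholders, not recommendations.

```scala
import org.apache.spark.sql.SparkSession

// Illustrative values only; these are typically passed via spark-submit --conf.
val spark = SparkSession.builder()
  .appName("overhead-vs-offheap")
  // Container-level overhead: VM overheads, interned strings, native
  // allocations, a PySpark worker process, etc. ("non-heap" in this patch).
  .config("spark.executor.memoryOverhead", "1g")
  // A different setting: MemoryManager's off-heap area used by Tungsten.
  .config("spark.memory.offHeap.enabled", "true")
  .config("spark.memory.offHeap.size", "2g")
  .getOrCreate()
```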

## How was this patch tested?

Existing unit tests.

Closes apache#24671 from beliefer/improve-docs-of-overhead.

Authored-by: gengjiaan <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
beliefer authored and srowen committed Jun 1, 2019
1 parent aec0869 commit 8feb80a
Showing 2 changed files with 23 additions and 7 deletions.
4 changes: 2 additions & 2 deletions core/src/main/scala/org/apache/spark/internal/config/package.scala
@@ -71,7 +71,7 @@ package object config {
     .createWithDefaultString("1g")
 
   private[spark] val DRIVER_MEMORY_OVERHEAD = ConfigBuilder("spark.driver.memoryOverhead")
-    .doc("The amount of off-heap memory to be allocated per driver in cluster mode, " +
+    .doc("The amount of non-heap memory to be allocated per driver in cluster mode, " +
       "in MiB unless otherwise specified.")
     .bytesConf(ByteUnit.MiB)
     .createOptional
@@ -196,7 +196,7 @@ package object config {
     .createWithDefaultString("1g")
 
   private[spark] val EXECUTOR_MEMORY_OVERHEAD = ConfigBuilder("spark.executor.memoryOverhead")
-    .doc("The amount of off-heap memory to be allocated per executor in cluster mode, " +
+    .doc("The amount of non-heap memory to be allocated per executor in cluster mode, " +
       "in MiB unless otherwise specified.")
     .bytesConf(ByteUnit.MiB)
     .createOptional
26 changes: 21 additions & 5 deletions docs/configuration.md
@@ -181,10 +181,16 @@ of the most common options to set are:
   <td><code>spark.driver.memoryOverhead</code></td>
   <td>driverMemory * 0.10, with minimum of 384 </td>
   <td>
-    The amount of off-heap memory to be allocated per driver in cluster mode, in MiB unless
-    otherwise specified. This is memory that accounts for things like VM overheads, interned strings,
+    Amount of non-heap memory to be allocated per driver process in cluster mode, in MiB unless
+    otherwise specified. This is memory that accounts for things like VM overheads, interned strings,
     other native overheads, etc. This tends to grow with the container size (typically 6-10%).
     This option is currently supported on YARN, Mesos and Kubernetes.
+    <em>Note:</em> Non-heap memory includes off-heap memory
+    (when <code>spark.memory.offHeap.enabled=true</code>), memory used by other driver processes
+    (e.g. the Python process that accompanies a PySpark driver), and memory used by other
+    non-driver processes running in the same container. The maximum container memory size for the
+    driver is determined by the sum of <code>spark.driver.memoryOverhead</code>
+    and <code>spark.driver.memory</code>.
   </td>
 </tr>
 <tr>
@@ -244,10 +250,17 @@ of the most common options to set are:
   <td><code>spark.executor.memoryOverhead</code></td>
   <td>executorMemory * 0.10, with minimum of 384 </td>
   <td>
-    The amount of off-heap memory to be allocated per executor, in MiB unless otherwise specified.
-    This is memory that accounts for things like VM overheads, interned strings, other native
-    overheads, etc. This tends to grow with the executor size (typically 6-10%).
+    Amount of non-heap memory to be allocated per executor process in cluster mode, in MiB unless
+    otherwise specified. This is memory that accounts for things like VM overheads, interned strings,
+    other native overheads, etc. This tends to grow with the executor size (typically 6-10%).
     This option is currently supported on YARN and Kubernetes.
+    <br/>
+    <em>Note:</em> Non-heap memory includes off-heap memory
+    (when <code>spark.memory.offHeap.enabled=true</code>), memory used by other executor processes
+    (e.g. the Python process that accompanies a PySpark executor), and memory used by other
+    non-executor processes running in the same container. The maximum container memory size for the
+    executor is determined by the sum of <code>spark.executor.memoryOverhead</code> and
+    <code>spark.executor.memory</code>.
   </td>
 </tr>
 <tr>
@@ -1283,6 +1296,9 @@ Apart from these, the following properties are also available, and may be useful
   <td>
     If true, Spark will attempt to use off-heap memory for certain operations. If off-heap memory
     use is enabled, then <code>spark.memory.offHeap.size</code> must be positive.
+    <em>Note:</em> If off-heap memory is enabled, you may need to raise the non-heap memory size
+    (e.g. increase <code>spark.driver.memoryOverhead</code> or
+    <code>spark.executor.memoryOverhead</code>).
   </td>
 </tr>
 <tr>
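
As a worked illustration of the default listed above (`executorMemory * 0.10, with minimum of 384` MiB), the hedged Scala sketch below (not part of this commit; Spark's exact rounding may differ slightly) computes the overhead and the resulting container request:

```scala
// Mirrors the documented default: overhead = max(0.10 * executorMemory, 384) MiB,
// used when spark.executor.memoryOverhead is not set explicitly.
val executorMemoryMiB = 4096L                                        // spark.executor.memory=4g
val overheadMiB = math.max((executorMemoryMiB * 0.10).toLong, 384L)  // 409 MiB
// Per the note above, the container requested from YARN/Kubernetes is the sum:
val containerMiB = executorMemoryMiB + overheadMiB                   // 4096 + 409 = 4505 MiB
println(s"overhead=$overheadMiB MiB, container=$containerMiB MiB")
```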
