
Commit 7c61c2a

uncleGen authored and srowen committed
[DOCS] Fix typo in docs
## What changes were proposed in this pull request?

Fix typo in docs

## How was this patch tested?

Author: uncleGen <[email protected]>

Closes apache#16658 from uncleGen/typo-issue.
1 parent f27e024 commit 7c61c2a

5 files changed (+7 -7 lines changed)

docs/configuration.md (+1 -1)

@@ -435,7 +435,7 @@ Apart from these, the following properties are also available, and may be useful
 <td><code>spark.jars.packages</code></td>
 <td></td>
 <td>
-  Comma-separated list of maven coordinates of jars to include on the driver and executor
+  Comma-separated list of Maven coordinates of jars to include on the driver and executor
   classpaths. The coordinates should be groupId:artifactId:version. If <code>spark.jars.ivySettings</code>
   is given artifacts will be resolved according to the configuration in the file, otherwise artifacts
   will be searched for in the local maven repo, then maven central and finally any additional remote
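
The property above takes the same groupId:artifactId:version coordinates as the `--packages` flag of the launcher scripts; as a rough sketch, it can also be supplied at launch time with `--conf` (the coordinate below is a placeholder, not taken from the docs):

$ ./bin/spark-shell --conf spark.jars.packages=com.example:example-lib:0.1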

docs/index.md (+1 -1)

@@ -15,7 +15,7 @@ It also supports a rich set of higher-level tools including [Spark SQL](sql-prog
 Get Spark from the [downloads page](http://spark.apache.org/downloads.html) of the project website. This documentation is for Spark version {{site.SPARK_VERSION}}. Spark uses Hadoop's client libraries for HDFS and YARN. Downloads are pre-packaged for a handful of popular Hadoop versions.
 Users can also download a "Hadoop free" binary and run Spark with any Hadoop version
 [by augmenting Spark's classpath](hadoop-provided.html).
-Scala and Java users can include Spark in their projects using its maven cooridnates and in the future Python users can also install Spark from PyPI.
+Scala and Java users can include Spark in their projects using its Maven coordinates and in the future Python users can also install Spark from PyPI.
 
 
 If you'd like to build Spark from

docs/programming-guide.md (+3 -3)

@@ -185,7 +185,7 @@ In the Spark shell, a special interpreter-aware SparkContext is already created
 variable called `sc`. Making your own SparkContext will not work. You can set which master the
 context connects to using the `--master` argument, and you can add JARs to the classpath
 by passing a comma-separated list to the `--jars` argument. You can also add dependencies
-(e.g. Spark Packages) to your shell session by supplying a comma-separated list of maven coordinates
+(e.g. Spark Packages) to your shell session by supplying a comma-separated list of Maven coordinates
 to the `--packages` argument. Any additional repositories where dependencies might exist (e.g. Sonatype)
 can be passed to the `--repositories` argument. For example, to run `bin/spark-shell` on exactly
 four cores, use:
@@ -200,7 +200,7 @@ Or, to also add `code.jar` to its classpath, use:
 $ ./bin/spark-shell --master local[4] --jars code.jar
 {% endhighlight %}
 
-To include a dependency using maven coordinates:
+To include a dependency using Maven coordinates:
 
 {% highlight bash %}
 $ ./bin/spark-shell --master local[4] --packages "org.example:example:0.1"
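
A sketch of combining `--packages` with the `--repositories` argument mentioned above (the coordinate and repository URL are placeholders):

$ ./bin/spark-shell --master local[4] \
    --packages "com.example:example:0.1" \
    --repositories https://repo.example.com/releases
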
@@ -217,7 +217,7 @@ In the PySpark shell, a special interpreter-aware SparkContext is already create
 variable called `sc`. Making your own SparkContext will not work. You can set which master the
 context connects to using the `--master` argument, and you can add Python .zip, .egg or .py files
 to the runtime path by passing a comma-separated list to `--py-files`. You can also add dependencies
-(e.g. Spark Packages) to your shell session by supplying a comma-separated list of maven coordinates
+(e.g. Spark Packages) to your shell session by supplying a comma-separated list of Maven coordinates
 to the `--packages` argument. Any additional repositories where dependencies might exist (e.g. Sonatype)
 can be passed to the `--repositories` argument. Any Python dependencies a Spark package has (listed in
 the requirements.txt of that package) must be manually installed using `pip` when necessary.
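
A minimal sketch of the `--py-files` and `--packages` flags described above, using placeholder file and coordinate names:

$ ./bin/pyspark --master local[4] \
    --py-files deps.zip \
    --packages "com.example:example:0.1"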

docs/streaming-kafka-0-10-integration.md (+1 -1)

@@ -183,7 +183,7 @@ stream.foreachRDD(new VoidFunction<JavaRDD<ConsumerRecord<String, String>>>() {
 Note that the typecast to `HasOffsetRanges` will only succeed if it is done in the first method called on the result of `createDirectStream`, not later down a chain of methods. Be aware that the one-to-one mapping between RDD partition and Kafka partition does not remain after any methods that shuffle or repartition, e.g. reduceByKey() or window().
 
 ### Storing Offsets
-Kafka delivery semantics in the case of failure depend on how and when offsets are stored. Spark output operations are [at-least-once](streaming-programming-guide.html#semantics-of-output-operations). So if you want the equivalent of exactly-once semantics, you must either store offsets after an idempotent output, or store offsets in an atomic transaction alongside output. With this integration, you have 3 options, in order of increasing reliablity (and code complexity), for how to store offsets.
+Kafka delivery semantics in the case of failure depend on how and when offsets are stored. Spark output operations are [at-least-once](streaming-programming-guide.html#semantics-of-output-operations). So if you want the equivalent of exactly-once semantics, you must either store offsets after an idempotent output, or store offsets in an atomic transaction alongside output. With this integration, you have 3 options, in order of increasing reliability (and code complexity), for how to store offsets.
 
 #### Checkpoints
 If you enable Spark [checkpointing](streaming-programming-guide.html#checkpointing), offsets will be stored in the checkpoint. This is easy to enable, but there are drawbacks. Your output operation must be idempotent, since you will get repeated outputs; transactions are not an option. Furthermore, you cannot recover from a checkpoint if your application code has changed. For planned upgrades, you can mitigate this by running the new code at the same time as the old code (since outputs need to be idempotent anyway, they should not clash). But for unplanned failures that require code changes, you will lose data unless you have another way to identify known good starting offsets.

docs/submitting-applications.md (+1 -1)

@@ -189,7 +189,7 @@ This can use up a significant amount of space over time and will need to be clea
 is handled automatically, and with Spark standalone, automatic cleanup can be configured with the
 `spark.worker.cleanup.appDataTtl` property.
 
-Users may also include any other dependencies by supplying a comma-delimited list of maven coordinates
+Users may also include any other dependencies by supplying a comma-delimited list of Maven coordinates
 with `--packages`. All transitive dependencies will be handled when using this command. Additional
 repositories (or resolvers in SBT) can be added in a comma-delimited fashion with the flag `--repositories`.
 (Note that credentials for password-protected repositories can be supplied in some cases in the repository URI,
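
A sketch of the `--packages` and `--repositories` flags described above for `spark-submit` (coordinate, repository URL, class and jar names are placeholders):

$ ./bin/spark-submit \
    --class com.example.MyApp \
    --packages "com.example:example:0.1" \
    --repositories https://repo.example.com/releases \
    my-app.jar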
