
Commit

[hotfix] [docs] Fix typos
This closes apache#5289.
greghogan authored and zentol committed Jan 15, 2018
1 parent 5623ac6 commit 0755324
Showing 51 changed files with 85 additions and 85 deletions.
2 changes: 1 addition & 1 deletion docs/dev/api_concepts.md
@@ -460,7 +460,7 @@ The following example shows a key selector function that simply returns the fiel
// some ordinary POJO
public class WC {public String word; public int count;}
DataStream<WC> words = // [...]
-KeyedStream<WC> kyed = words
+KeyedStream<WC> keyed = words
.keyBy(new KeySelector<WC, String>() {
public String getKey(WC wc) { return wc.word; }
});
2 changes: 1 addition & 1 deletion docs/dev/batch/hadoop_compatibility.md
@@ -42,7 +42,7 @@ This document shows how to use existing Hadoop MapReduce code with Flink. Please

### Project Configuration

-Support for Haddop input/output formats is part of the `flink-java` and
+Support for Hadoop input/output formats is part of the `flink-java` and
`flink-scala` Maven modules that are always required when writing Flink jobs.
The code is located in `org.apache.flink.api.java.hadoop` and
`org.apache.flink.api.scala.hadoop` in an additional sub-package for the
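As background for the corrected sentence, here is a minimal sketch of wrapping a Hadoop `mapreduce` input format with the classes in `org.apache.flink.api.java.hadoop` (the input path is hypothetical, and the constructor and usage shown are assumptions based on the surrounding guide, not part of this change):

{% highlight java %}
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.hadoop.mapreduce.HadoopInputFormat;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

// wrap Hadoop's TextInputFormat so Flink can use it as a regular InputFormat
Job job = Job.getInstance();
HadoopInputFormat<LongWritable, Text> hadoopIF =
    new HadoopInputFormat<>(new TextInputFormat(), LongWritable.class, Text.class, job);
TextInputFormat.addInputPath(job, new Path("hdfs:///tmp/input")); // hypothetical path

// each record arrives as a (key, value) tuple
DataSet<Tuple2<LongWritable, Text>> input = env.createInput(hadoopIF);
{% endhighlight %}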
22 changes: 11 additions & 11 deletions docs/dev/batch/index.md
@@ -293,7 +293,7 @@ result = input1.join(input2)
pick the best strategy according to those estimates.
{% highlight java %}
// This executes a join by broadcasting the first data set
-// using a hash table for the broadcasted data
+// using a hash table for the broadcast data
result = input1.join(input2, JoinHint.BROADCAST_HASH_FIRST)
.where(0).equalTo(1);
{% endhighlight %}
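The same `JoinHint` enum covers the other strategies as well; a brief hedged sketch for comparison (these constants come from the standard DataSet API and are not touched by this commit):

{% highlight java %}
// broadcast and build a hash table from the second data set instead
result = input1.join(input2, JoinHint.BROADCAST_HASH_SECOND)
               .where(0).equalTo(1);

// re-partition both inputs and run a sort-merge join
result = input1.join(input2, JoinHint.REPARTITION_SORT_MERGE)
               .where(0).equalTo(1);
{% endhighlight %}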
@@ -613,7 +613,7 @@ val result = input1.join(input2).where(0).equalTo(1)
pick the best strategy according to those estimates.
{% highlight scala %}
// This executes a join by broadcasting the first data set
-// using a hash table for the broadcasted data
+// using a hash table for the broadcast data
val result = input1.join(input2, JoinHint.BROADCAST_HASH_FIRST)
.where(0).equalTo(1)
{% endhighlight %}
@@ -658,7 +658,7 @@ val data1: DataSet[Int] = // [...]
val data2: DataSet[String] = // [...]
val result: DataSet[(Int, String)] = data1.cross(data2)
{% endhighlight %}
-<p>Note: Cross is potentially a <b>very</b> compute-intensive operation which can challenge even large compute clusters! It is adviced to hint the system with the DataSet sizes by using <i>crossWithTiny()</i> and <i>crossWithHuge()</i>.</p>
+<p>Note: Cross is potentially a <b>very</b> compute-intensive operation which can challenge even large compute clusters! It is advised to hint the system with the DataSet sizes by using <i>crossWithTiny()</i> and <i>crossWithHuge()</i>.</p>
</td>
</tr>
<tr>
@@ -994,7 +994,7 @@ Collection-based:
- `fromParallelCollection(SplittableIterator)` - Creates a data set from an iterator, in
parallel. The class specifies the data type of the elements returned by the iterator.

-- `generateSequence(from, to)` - Generates the squence of numbers in the given interval, in
+- `generateSequence(from, to)` - Generates the sequence of numbers in the given interval, in
parallel.

Generic:
@@ -1146,7 +1146,7 @@ using an
Flink comes with a variety of built-in output formats that are encapsulated behind operations on the
DataSet:

-- `writeAsText()` / `TextOuputFormat` - Writes elements line-wise as Strings. The Strings are
+- `writeAsText()` / `TextOutputFormat` - Writes elements line-wise as Strings. The Strings are
obtained by calling the *toString()* method of each element.
- `writeAsFormattedText()` / `TextOutputFormat` - Write elements line-wise as Strings. The Strings
are obtained by calling a user-defined *format()* method for each element.
@@ -1972,15 +1972,15 @@ Collection.
<div class="codetabs" markdown="1">
<div data-lang="java" markdown="1">
{% highlight java %}
-// 1. The DataSet to be broadcasted
+// 1. The DataSet to be broadcast
DataSet<Integer> toBroadcast = env.fromElements(1, 2, 3);

DataSet<String> data = env.fromElements("a", "b");

data.map(new RichMapFunction<String, String>() {
@Override
public void open(Configuration parameters) throws Exception {
-    // 3. Access the broadcasted DataSet as a Collection
+    // 3. Access the broadcast DataSet as a Collection
Collection<Integer> broadcastSet = getRuntimeContext().getBroadcastVariable("broadcastSetName");
}

@@ -1993,13 +1993,13 @@ data.map(new RichMapFunction<String, String>() {
{% endhighlight %}

Make sure that the names (`broadcastSetName` in the previous example) match when registering and
-accessing broadcasted data sets. For a complete example program, have a look at
+accessing broadcast data sets. For a complete example program, have a look at
{% gh_link /flink-examples/flink-examples-batch/src/main/java/org/apache/flink/examples/java/clustering/KMeans.java "K-Means Algorithm" %}.
</div>
<div data-lang="scala" markdown="1">

{% highlight scala %}
-// 1. The DataSet to be broadcasted
+// 1. The DataSet to be broadcast
val toBroadcast = env.fromElements(1, 2, 3)

val data = env.fromElements("a", "b")
@@ -2008,7 +2008,7 @@ data.map(new RichMapFunction[String, String]() {
var broadcastSet: Traversable[String] = null

override def open(config: Configuration): Unit = {
-      // 3. Access the broadcasted DataSet as a Collection
+      // 3. Access the broadcast DataSet as a Collection
broadcastSet = getRuntimeContext().getBroadcastVariable[String]("broadcastSetName").asScala
}

@@ -2019,7 +2019,7 @@ data.map(new RichMapFunction[String, String]() {
{% endhighlight %}

Make sure that the names (`broadcastSetName` in the previous example) match when registering and
-accessing broadcasted data sets. For a complete example program, have a look at
+accessing broadcast data sets. For a complete example program, have a look at
{% gh_link /flink-examples/flink-examples-batch/src/main/scala/org/apache/flink/examples/scala/clustering/KMeans.scala "KMeans Algorithm" %}.
</div>
</div>
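Step 2 of the broadcast-variable example is elided by the hunks above; for reference, a minimal Java sketch of how the broadcast set is registered on the consuming operator (assuming the standard `withBroadcastSet` call, with a placeholder map body):

{% highlight java %}
// 2. Broadcast the DataSet when defining the operation that uses it
data.map(new RichMapFunction<String, String>() {
    @Override
    public String map(String value) throws Exception {
        return value; // placeholder body for this sketch
    }
}).withBroadcastSet(toBroadcast, "broadcastSetName");
{% endhighlight %}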
4 changes: 2 additions & 2 deletions docs/dev/batch/iterations.md
@@ -119,13 +119,13 @@ setFinalState(state);

### Example: Incrementing Numbers

-In the following example, we **iteratively incremenet a set numbers**:
+In the following example, we **iteratively increment a set numbers**:

<p class="text-center">
<img alt="Iterate Operator Example" width="60%" src="{{site.baseurl}}/fig/iterations_iterate_operator_example.png" />
</p>

-1. **Iteration Input**: The inital input is read from a data source and consists of five single-field records (integers `1` to `5`).
+1. **Iteration Input**: The initial input is read from a data source and consists of five single-field records (integers `1` to `5`).
2. **Step function**: The step function is a single `map` operator, which increments the integer field from `i` to `i+1`. It will be applied to every record of the input.
3. **Next Partial Solution**: The output of the step function will be the output of the map operator, i.e. records with incremented integers.
4. **Iteration Result**: After ten iterations, the initial numbers will have been incremented ten times, resulting in integers `11` to `15`.
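A minimal sketch of this example as a bulk iteration in the Java DataSet API (using the standard `iterate`/`closeWith` pattern; variable names are illustrative):

{% highlight java %}
ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

// Iteration input: integers 1 to 5, iterated ten times
IterativeDataSet<Integer> initial = env.fromElements(1, 2, 3, 4, 5).iterate(10);

// Step function: increment every record by one
DataSet<Integer> iteration = initial.map(new MapFunction<Integer, Integer>() {
    @Override
    public Integer map(Integer i) {
        return i + 1;
    }
});

// Iteration result: integers 11 to 15 after ten iterations
DataSet<Integer> result = initial.closeWith(iteration);
{% endhighlight %}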
4 changes: 2 additions & 2 deletions docs/dev/batch/python.md
@@ -560,7 +560,7 @@ class MapperBcv(MapFunction):
factor = self.context.get_broadcast_variable("bcv")[0][0]
return value * factor

-# 1. The DataSet to be broadcasted
+# 1. The DataSet to be broadcast
toBroadcast = env.from_elements(1, 2, 3)
data = env.from_elements("a", "b")

@@ -569,7 +569,7 @@ data.map(MapperBcv()).with_broadcast_set("bcv", toBroadcast)
{% endhighlight %}

Make sure that the names (`bcv` in the previous example) match when registering and
-accessing broadcasted data sets.
+accessing broadcast data sets.

**Note**: As the content of broadcast variables is kept in-memory on each node, it should not become
too large. For simpler things like scalar values you can simply parameterize the rich function.
2 changes: 1 addition & 1 deletion docs/dev/connectors/cassandra.md
@@ -30,7 +30,7 @@ under the License.
This connector provides sinks that writes data into a [Apache Cassandra](https://cassandra.apache.org/) database.

<!--
-TODO: Perhaps worth mentioning current DataStax Java Driver version to match Cassandra versoin on user side.
+TODO: Perhaps worth mentioning current DataStax Java Driver version to match Cassandra version on user side.
-->

To use this connector, add the following dependency to your project:
2 changes: 1 addition & 1 deletion docs/dev/connectors/kafka.md
@@ -633,7 +633,7 @@ the consumers until `transaction1` is committed or aborted. This has two implica

* First of all, during normal working of Flink applications, user can expect a delay in visibility
of the records produced into Kafka topics, equal to average time between completed checkpoints.
-* Secondly in case of Flink application failure, topics into which this application was writting,
+* Secondly in case of Flink application failure, topics into which this application was writing,
will be blocked for the readers until the application restarts or the configured transaction
timeout time will pass. This remark only applies for the cases when there are multiple
agents/applications writing to the same Kafka topic.
2 changes: 1 addition & 1 deletion docs/dev/connectors/kinesis.md
@@ -331,7 +331,7 @@ Otherwise, the returned stream name is used.

### Threading Model

-Since Flink 1.4.0, `FlinkKinesisProducer` switches its default underlying KPL from a one-thread-per-request mode to a thread-pool mode. KPL in thread-pool mode uses a queue and thread pool to execute requests to Kinesis. This limits the number of threads that KPL's native process may create, and therefore greatly lowers CPU utilizations and improves efficiency. **Thus, We highly recommend Flink users use thread-pool model.** The default thread pool size is `10`. Users can set the pool size in `java.util.Properties` instance with key `ThreadPoolSize`, as shown in the above example.
+Since Flink 1.4.0, `FlinkKinesisProducer` switches its default underlying KPL from a one-thread-per-request mode to a thread-pool mode. KPL in thread-pool mode uses a queue and thread pool to execute requests to Kinesis. This limits the number of threads that KPL's native process may create, and therefore greatly lowers CPU utilization and improves efficiency. **Thus, We highly recommend Flink users use thread-pool model.** The default thread pool size is `10`. Users can set the pool size in `java.util.Properties` instance with key `ThreadPoolSize`, as shown in the above example.

Users can still switch back to one-thread-per-request mode by setting a key-value pair of `ThreadingModel` and `PER_REQUEST` in `java.util.Properties`, as shown in the code commented out in above example.
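A short sketch of the two configuration keys named above (key names as given in the text; connection and credential settings are omitted):

{% highlight java %}
Properties producerConfig = new Properties();

// stay in the default thread-pool mode, but size the pool explicitly
producerConfig.setProperty("ThreadPoolSize", "10");

// or opt back into the old one-thread-per-request behavior
// producerConfig.setProperty("ThreadingModel", "PER_REQUEST");
{% endhighlight %}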

2 changes: 1 addition & 1 deletion docs/dev/connectors/rabbitmq.md
@@ -66,7 +66,7 @@ RabbitMQ source, the following is required -
- *Use correlation ids*: Correlation ids are a RabbitMQ application feature.
You have to set it in the message properties when injecting messages into RabbitMQ.
The correlation id is used by the source to deduplicate any messages that
-  have been reproccessed when restoring from a checkpoint.
+  have been reprocessed when restoring from a checkpoint.
- *Non-parallel source*: The source must be non-parallel (parallelism set
to 1) in order to achieve exactly-once. This limitation is mainly due to
RabbitMQ's approach to dispatching messages from a single queue to multiple
2 changes: 1 addition & 1 deletion docs/dev/datastream_api.md
@@ -490,7 +490,7 @@ env.generateSequence(1,10).map(new MyMapper()).setBufferTimeout(timeoutMillis);
LocalStreamEnvironment env = StreamExecutionEnvironment.createLocalEnvironment
env.setBufferTimeout(timeoutMillis)

-env.genereateSequence(1,10).map(myMap).setBufferTimeout(timeoutMillis)
+env.generateSequence(1,10).map(myMap).setBufferTimeout(timeoutMillis)
{% endhighlight %}
</div>
</div>
2 changes: 1 addition & 1 deletion docs/dev/java8.md
@@ -169,7 +169,7 @@ Create/Import your Eclipse project.

If you are using Maven, you also need to change the Java version in your `pom.xml` for the `maven-compiler-plugin`. Otherwise right click the `JRE System Library` section of your project and open the `Properties` window in order to switch to a Java 8 JRE (or above) that supports Lambda Expressions.

-The Eclipse JDT compiler needs a special compiler flag in order to store type information in `.class` files. Open the JDT configuration file at `{project directoy}/.settings/org.eclipse.jdt.core.prefs` with your favorite text editor and add the following line:
+The Eclipse JDT compiler needs a special compiler flag in order to store type information in `.class` files. Open the JDT configuration file at `{project directory}/.settings/org.eclipse.jdt.core.prefs` with your favorite text editor and add the following line:

~~~
org.eclipse.jdt.core.compiler.codegen.lambda.genericSignature=generate
8 changes: 4 additions & 4 deletions docs/dev/libs/cep.md
@@ -150,7 +150,7 @@ it to a looping one by using [Quantifiers](#quantifiers). Each pattern can have

#### Quantifiers

-In FlinkCEP, you can specifiy looping patterns using these methods: `pattern.oneOrMore()`, for patterns that expect one or more occurrences of a given event (e.g. the `b+` mentioned before); and `pattern.times(#ofTimes)`, for patterns that
+In FlinkCEP, you can specify looping patterns using these methods: `pattern.oneOrMore()`, for patterns that expect one or more occurrences of a given event (e.g. the `b+` mentioned before); and `pattern.times(#ofTimes)`, for patterns that
expect a specific number of occurrences of a given type of event, e.g. 4 `a`'s; and `pattern.times(#fromTimes, #toTimes)`, for patterns that expect a specific minimum number of occurrences and a maximum number of occurrences of a given type of event, e.g. 2-4 `a`s.

You can make looping patterns greedy using the `pattern.greedy()` method, but you cannot yet make group patterns greedy. You can make all patterns, looping or not, optional using the `pattern.optional()` method.
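A compact sketch of the quantifiers named in the corrected sentence (Java API; `start` is assumed to be a pattern defined as elsewhere on this page):

{% highlight java %}
start.oneOrMore();             // one or more occurrences, e.g. the b+ case
start.times(4);                // exactly four occurrences
start.times(2, 4);             // between two and four occurrences
start.times(2, 4).greedy();    // match as many occurrences as possible
start.times(2, 4).optional();  // the whole pattern may also be absent
{% endhighlight %}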
@@ -1089,7 +1089,7 @@ Pattern<Event, ?> notNext = start.notNext("not");
if other events occur between the matching (negative) event and the previous matching event
(relaxed contiguity):</p>
{% highlight java %}
-Pattern<Event, ?> notFollowedBy = start.notFllowedBy("not");
+Pattern<Event, ?> notFollowedBy = start.notFollowedBy("not");
{% endhighlight %}
</td>
</tr>
@@ -1211,7 +1211,7 @@ val notNext = start.notNext("not")
if other events occur between the matching (negative) event and the previous matching event
(relaxed contiguity):</p>
{% highlight scala %}
-val notFollowedBy = start.notFllowedBy("not")
+val notFollowedBy = start.notFollowedBy("not")
{% endhighlight %}
</td>
</tr>
@@ -1448,7 +1448,7 @@ To treat partial patterns, the `select` and `flatSelect` API calls offer an over
parameters

* `PatternTimeoutFunction`/`PatternFlatTimeoutFunction`
-* [OutputTag]({{ site.baseurl }}/dev/stream/side_output.html) for the side output in which the timeouted matches will be returned
+* [OutputTag]({{ site.baseurl }}/dev/stream/side_output.html) for the side output in which the timed out matches will be returned
* and the known `PatternSelectFunction`/`PatternFlatSelectFunction`.

<div class="codetabs" markdown="1">
6 changes: 3 additions & 3 deletions docs/dev/libs/gelly/graph_generators.md
@@ -555,10 +555,10 @@ val graph = new RMatGraph(env.getJavaEnv, rnd, vertexCount, edgeCount).generate(
</div>
</div>

-The default RMat contants can be overridden as shown in the following example.
-The contants define the interdependence of bits from each generated edge's source
+The default RMat constants can be overridden as shown in the following example.
+The constants define the interdependence of bits from each generated edge's source
and target labels. The RMat noise can be enabled and progressively perturbs the
-contants while generating each edge.
+constants while generating each edge.
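The example referenced here is elided from the hunk; a rough Java sketch of what overriding the constants can look like (the `setConstants`/`setNoise` setters and the specific values are assumptions drawn from the surrounding Gelly documentation, not part of this change):

{% highlight java %}
RandomGenerableFactory<JDKRandomGenerator> rnd = new JDKRandomGeneratorFactory();

Graph<LongValue, NullValue, NullValue> graph =
    new RMatGraph<>(env, rnd, vertexCount, edgeCount)
        .setConstants(0.57f, 0.19f, 0.19f) // assumed probabilities for the A, B, C partitions
        .setNoise(true, 0.10f)             // assumed: enable noise that perturbs the constants per edge
        .generate();
{% endhighlight %}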

The RMat generator can be configured to produce a simple graph by removing self-loops
and duplicate edges. Symmetrization is performed either by a "clip-and-flip" throwing away
2 changes: 1 addition & 1 deletion docs/dev/libs/ml/cross_validation.md
@@ -54,7 +54,7 @@ Traditionally, training and testing would be done to train an algorithms as norm

In a train-test-holdout strategy we sacrifice the sample size of the initial fitting algorithm for increased confidence that our model is not over-fit.

-When using `trainTestHoldout` splitter, the *fraction* `Double` is replaced by a *fraction* array of length three. The first element coresponds to the portion to be used for training, second for testing, and third for holdout. The weights of this array are *relative*, e.g. an array `Array(3.0, 2.0, 1.0)` would results in approximately 50% of the observations being in the training set, 33% of the observations in the testing set, and 17% of the observations in holdout set.
+When using `trainTestHoldout` splitter, the *fraction* `Double` is replaced by a *fraction* array of length three. The first element corresponds to the portion to be used for training, second for testing, and third for holdout. The weights of this array are *relative*, e.g. an array `Array(3.0, 2.0, 1.0)` would results in approximately 50% of the observations being in the training set, 33% of the observations in the testing set, and 17% of the observations in holdout set.

### K-Fold Splits

4 changes: 2 additions & 2 deletions docs/dev/libs/storm_compatibility.md
@@ -54,10 +54,10 @@ Add the following dependency to your `pom.xml` if you want to execute Storm code
**Please note**: Do not add `storm-core` as a dependency. It is already included via `flink-storm`.

**Please note**: `flink-storm` is not part of the provided binary Flink distribution.
-Thus, you need to include `flink-storm` classes (and their dependencies) in your program jar (also called ueber-jar or fat-jar) that is submitted to Flink's JobManager.
+Thus, you need to include `flink-storm` classes (and their dependencies) in your program jar (also called uber-jar or fat-jar) that is submitted to Flink's JobManager.
See *WordCount Storm* within `flink-storm-examples/pom.xml` for an example how to package a jar correctly.

-If you want to avoid large ueber-jars, you can manually copy `storm-core-0.9.4.jar`, `json-simple-1.1.jar` and `flink-storm-{{site.version}}.jar` into Flink's `lib/` folder of each cluster node (*before* the cluster is started).
+If you want to avoid large uber-jars, you can manually copy `storm-core-0.9.4.jar`, `json-simple-1.1.jar` and `flink-storm-{{site.version}}.jar` into Flink's `lib/` folder of each cluster node (*before* the cluster is started).
For this case, it is sufficient to include only your own Spout and Bolt classes (and their internal dependencies) into the program jar.

# Execute Storm Topologies
2 changes: 1 addition & 1 deletion docs/dev/linking_with_flink.md
@@ -109,7 +109,7 @@ import org.apache.flink.api.scala.createTypeInformation
{% endhighlight %}

The reason is that Flink analyzes the types that are used in a program and generates serializers
-and comparaters for them. By having either of those imports you enable an implicit conversion
+and comparators for them. By having either of those imports you enable an implicit conversion
that creates the type information for Flink operations.

If you would rather use SBT, see [here]({{ site.baseurl }}/quickstart/scala_api_quickstart.html#sbt).
2 changes: 1 addition & 1 deletion docs/dev/migration.md
@@ -165,7 +165,7 @@ public class BufferingSink implements SinkFunction<Tuple2<String, Integer>>,
{% endhighlight %}


-The `CountMapper` is a `RichFlatMapFuction` which assumes a grouped-by-key input stream of the form
+The `CountMapper` is a `RichFlatMapFunction` which assumes a grouped-by-key input stream of the form
`(word, 1)`. The function keeps a counter for each incoming key (`ValueState<Integer> counter`) and if
the number of occurrences of a certain word surpasses the user-provided threshold, a tuple is emitted
containing the word itself and the number of occurrences.
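As a hedged illustration of the class being described, a rough sketch only; the field names, constructor, and state descriptor arguments here are assumptions, not the code from the migration guide:

{% highlight java %}
public class CountMapper extends RichFlatMapFunction<Tuple2<String, Integer>, Tuple2<String, Integer>> {

    // keyed state holding the running count per word
    private transient ValueState<Integer> counter;

    // user-provided threshold (assumed name)
    private final int threshold;

    public CountMapper(int threshold) {
        this.threshold = threshold;
    }

    @Override
    public void open(Configuration parameters) {
        counter = getRuntimeContext().getState(
            new ValueStateDescriptor<>("counter", Integer.class));
    }

    @Override
    public void flatMap(Tuple2<String, Integer> value, Collector<Tuple2<String, Integer>> out) throws Exception {
        Integer current = counter.value();
        int updated = (current == null ? 0 : current) + value.f1;
        counter.update(updated);

        // emit the word and its count once the threshold is surpassed
        if (updated >= threshold) {
            out.collect(new Tuple2<>(value.f0, updated));
        }
    }
}
{% endhighlight %}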
2 changes: 1 addition & 1 deletion docs/dev/packaging.md
@@ -48,7 +48,7 @@ automatically when exporting JAR files.

### Packaging Programs through Plans

-Additionally, we support packaging programs as *Plans*. Instead of defining a progam in the main
+Additionally, we support packaging programs as *Plans*. Instead of defining a program in the main
method and calling
`execute()` on the environment, plan packaging returns the *Program Plan*, which is a description of
the program's data flow. To do that, the program must implement the
2 changes: 1 addition & 1 deletion docs/dev/scala_api_extensions.md
@@ -61,7 +61,7 @@ data.map {
{% endhighlight %}

This extension introduces new methods in both the DataSet and DataStream Scala API
-that have a one-to-one correspondance in the extended API. These delegating methods
+that have a one-to-one correspondence in the extended API. These delegating methods
do support anonymous pattern matching functions.

#### DataSet API
