
Commit

[hotfix] [docs] Fix typos
This closes apache#5289.
greghogan authored and zentol committed Jan 15, 2018
1 parent 5623ac6 commit 0755324
Showing 51 changed files with 85 additions and 85 deletions.
2 changes: 1 addition & 1 deletion docs/dev/api_concepts.md
@@ -460,7 +460,7 @@ The following example shows a key selector function that simply returns the fiel
// some ordinary POJO
public class WC {public String word; public int count;}
DataStream<WC> words = // [...]
-KeyedStream<WC> kyed = words
+KeyedStream<WC> keyed = words
.keyBy(new KeySelector<WC, String>() {
public String getKey(WC wc) { return wc.word; }
});
2 changes: 1 addition & 1 deletion docs/dev/batch/hadoop_compatibility.md
@@ -42,7 +42,7 @@ This document shows how to use existing Hadoop MapReduce code with Flink. Please

### Project Configuration

-Support for Haddop input/output formats is part of the `flink-java` and
+Support for Hadoop input/output formats is part of the `flink-java` and
`flink-scala` Maven modules that are always required when writing Flink jobs.
The code is located in `org.apache.flink.api.java.hadoop` and
`org.apache.flink.api.scala.hadoop` in an additional sub-package for the
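As background for the corrected sentence, here is a minimal sketch of wrapping a Hadoop `mapreduce` input format with the classes in `org.apache.flink.api.java.hadoop` (the input path is hypothetical, and the constructor and usage shown are assumptions based on the surrounding guide, not part of this change):

{% highlight java %}
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.hadoop.mapreduce.HadoopInputFormat;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

// wrap Hadoop's TextInputFormat so Flink can use it as a regular InputFormat
Job job = Job.getInstance();
HadoopInputFormat<LongWritable, Text> hadoopIF =
    new HadoopInputFormat<>(new TextInputFormat(), LongWritable.class, Text.class, job);
TextInputFormat.addInputPath(job, new Path("hdfs:///tmp/input")); // hypothetical path

// each record arrives as a (key, value) tuple
DataSet<Tuple2<LongWritable, Text>> input = env.createInput(hadoopIF);
{% endhighlight %}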
22 changes: 11 additions & 11 deletions docs/dev/batch/index.md
@@ -293,7 +293,7 @@ result = input1.join(input2)
pick the best strategy according to those estimates.
{% highlight java %}
// This executes a join by broadcasting the first data set
-// using a hash table for the broadcasted data
+// using a hash table for the broadcast data
result = input1.join(input2, JoinHint.BROADCAST_HASH_FIRST)
.where(0).equalTo(1);
{% endhighlight %}
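The same `JoinHint` enum covers the other strategies as well; a brief hedged sketch for comparison (these constants come from the standard DataSet API and are not touched by this commit):

{% highlight java %}
// broadcast and build a hash table from the second data set instead
result = input1.join(input2, JoinHint.BROADCAST_HASH_SECOND)
               .where(0).equalTo(1);

// re-partition both inputs and run a sort-merge join
result = input1.join(input2, JoinHint.REPARTITION_SORT_MERGE)
               .where(0).equalTo(1);
{% endhighlight %}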
@@ -613,7 +613,7 @@ val result = input1.join(input2).where(0).equalTo(1)
pick the best strategy according to those estimates.
{% highlight scala %}
// This executes a join by broadcasting the first data set
-// using a hash table for the broadcasted data
+// using a hash table for the broadcast data
val result = input1.join(input2, JoinHint.BROADCAST_HASH_FIRST)
.where(0).equalTo(1)
{% endhighlight %}
@@ -658,7 +658,7 @@ val data1: DataSet[Int] = // [...]
val data2: DataSet[String] = // [...]
val result: DataSet[(Int, String)] = data1.cross(data2)
{% endhighlight %}
-<p>Note: Cross is potentially a <b>very</b> compute-intensive operation which can challenge even large compute clusters! It is adviced to hint the system with the DataSet sizes by using <i>crossWithTiny()</i> and <i>crossWithHuge()</i>.</p>
+<p>Note: Cross is potentially a <b>very</b> compute-intensive operation which can challenge even large compute clusters! It is advised to hint the system with the DataSet sizes by using <i>crossWithTiny()</i> and <i>crossWithHuge()</i>.</p>
</td>
</tr>
<tr>
@@ -994,7 +994,7 @@ Collection-based:
- `fromParallelCollection(SplittableIterator)` - Creates a data set from an iterator, in
parallel. The class specifies the data type of the elements returned by the iterator.

-- `generateSequence(from, to)` - Generates the squence of numbers in the given interval, in
+- `generateSequence(from, to)` - Generates the sequence of numbers in the given interval, in
parallel.

Generic:
@@ -1146,7 +1146,7 @@ using an
Flink comes with a variety of built-in output formats that are encapsulated behind operations on the
DataSet:

-- `writeAsText()` / `TextOuputFormat` - Writes elements line-wise as Strings. The Strings are
+- `writeAsText()` / `TextOutputFormat` - Writes elements line-wise as Strings. The Strings are
obtained by calling the *toString()* method of each element.
- `writeAsFormattedText()` / `TextOutputFormat` - Write elements line-wise as Strings. The Strings
are obtained by calling a user-defined *format()* method for each element.
@@ -1972,15 +1972,15 @@ Collection.
<div class="codetabs" markdown="1">
<div data-lang="java" markdown="1">
{% highlight java %}
-// 1. The DataSet to be broadcasted
+// 1. The DataSet to be broadcast
DataSet<Integer> toBroadcast = env.fromElements(1, 2, 3);

DataSet<String> data = env.fromElements("a", "b");

data.map(new RichMapFunction<String, String>() {
@Override
public void open(Configuration parameters) throws Exception {
-    // 3. Access the broadcasted DataSet as a Collection
+    // 3. Access the broadcast DataSet as a Collection
Collection<Integer> broadcastSet = getRuntimeContext().getBroadcastVariable("broadcastSetName");
}

@@ -1993,13 +1993,13 @@ data.map(new RichMapFunction<String, String>() {
{% endhighlight %}

Make sure that the names (`broadcastSetName` in the previous example) match when registering and
-accessing broadcasted data sets. For a complete example program, have a look at
+accessing broadcast data sets. For a complete example program, have a look at
{% gh_link /flink-examples/flink-examples-batch/src/main/java/org/apache/flink/examples/java/clustering/KMeans.java "K-Means Algorithm" %}.
</div>
<div data-lang="scala" markdown="1">

{% highlight scala %}
-// 1. The DataSet to be broadcasted
+// 1. The DataSet to be broadcast
val toBroadcast = env.fromElements(1, 2, 3)

val data = env.fromElements("a", "b")
@@ -2008,7 +2008,7 @@ data.map(new RichMapFunction[String, String]() {
var broadcastSet: Traversable[String] = null

override def open(config: Configuration): Unit = {
-      // 3. Access the broadcasted DataSet as a Collection
+      // 3. Access the broadcast DataSet as a Collection
broadcastSet = getRuntimeContext().getBroadcastVariable[String]("broadcastSetName").asScala
}

@@ -2019,7 +2019,7 @@ data.map(new RichMapFunction[String, String]() {
{% endhighlight %}

Make sure that the names (`broadcastSetName` in the previous example) match when registering and
-accessing broadcasted data sets. For a complete example program, have a look at
+accessing broadcast data sets. For a complete example program, have a look at
{% gh_link /flink-examples/flink-examples-batch/src/main/scala/org/apache/flink/examples/scala/clustering/KMeans.scala "KMeans Algorithm" %}.
</div>
</div>
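Step 2 of the broadcast-variable example is elided by the hunks above; for reference, a minimal Java sketch of how the broadcast set is registered on the consuming operator (assuming the standard `withBroadcastSet` call, with a placeholder map body):

{% highlight java %}
// 2. Broadcast the DataSet when defining the operation that uses it
data.map(new RichMapFunction<String, String>() {
    @Override
    public String map(String value) throws Exception {
        return value; // placeholder body for this sketch
    }
}).withBroadcastSet(toBroadcast, "broadcastSetName");
{% endhighlight %}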
4 changes: 2 additions & 2 deletions docs/dev/batch/iterations.md
@@ -119,13 +119,13 @@ setFinalState(state);

### Example: Incrementing Numbers

-In the following example, we **iteratively incremenet a set numbers**:
+In the following example, we **iteratively increment a set numbers**:

<p class="text-center">
<img alt="Iterate Operator Example" width="60%" src="{{site.baseurl}}/fig/iterations_iterate_operator_example.png" />
</p>

-1. **Iteration Input**: The inital input is read from a data source and consists of five single-field records (integers `1` to `5`).
+1. **Iteration Input**: The initial input is read from a data source and consists of five single-field records (integers `1` to `5`).
2. **Step function**: The step function is a single `map` operator, which increments the integer field from `i` to `i+1`. It will be applied to every record of the input.
3. **Next Partial Solution**: The output of the step function will be the output of the map operator, i.e. records with incremented integers.
4. **Iteration Result**: After ten iterations, the initial numbers will have been incremented ten times, resulting in integers `11` to `15`.
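A minimal sketch of this example as a bulk iteration in the Java DataSet API (using the standard `iterate`/`closeWith` pattern; variable names are illustrative):

{% highlight java %}
ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

// Iteration input: integers 1 to 5, iterated ten times
IterativeDataSet<Integer> initial = env.fromElements(1, 2, 3, 4, 5).iterate(10);

// Step function: increment every record by one
DataSet<Integer> iteration = initial.map(new MapFunction<Integer, Integer>() {
    @Override
    public Integer map(Integer i) {
        return i + 1;
    }
});

// Iteration result: integers 11 to 15 after ten iterations
DataSet<Integer> result = initial.closeWith(iteration);
{% endhighlight %}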
4 changes: 2 additions & 2 deletions docs/dev/batch/python.md
@@ -560,7 +560,7 @@ class MapperBcv(MapFunction):
factor = self.context.get_broadcast_variable("bcv")[0][0]
return value * factor

-# 1. The DataSet to be broadcasted
+# 1. The DataSet to be broadcast
toBroadcast = env.from_elements(1, 2, 3)
data = env.from_elements("a", "b")

@@ -569,7 +569,7 @@ data.map(MapperBcv()).with_broadcast_set("bcv", toBroadcast)
{% endhighlight %}

Make sure that the names (`bcv` in the previous example) match when registering and
-accessing broadcasted data sets.
+accessing broadcast data sets.

**Note**: As the content of broadcast variables is kept in-memory on each node, it should not become
too large. For simpler things like scalar values you can simply parameterize the rich function.
2 changes: 1 addition & 1 deletion docs/dev/connectors/cassandra.md
@@ -30,7 +30,7 @@ under the License.
This connector provides sinks that writes data into a [Apache Cassandra](https://cassandra.apache.org/) database.

<!--
-TODO: Perhaps worth mentioning current DataStax Java Driver version to match Cassandra versoin on user side.
+TODO: Perhaps worth mentioning current DataStax Java Driver version to match Cassandra version on user side.
-->

To use this connector, add the following dependency to your project:
2 changes: 1 addition & 1 deletion docs/dev/connectors/kafka.md
@@ -633,7 +633,7 @@ the consumers until `transaction1` is committed or aborted. This has two implica

* First of all, during normal working of Flink applications, user can expect a delay in visibility
of the records produced into Kafka topics, equal to average time between completed checkpoints.
-* Secondly in case of Flink application failure, topics into which this application was writting,
+* Secondly in case of Flink application failure, topics into which this application was writing,
will be blocked for the readers until the application restarts or the configured transaction
timeout time will pass. This remark only applies for the cases when there are multiple
agents/applications writing to the same Kafka topic.
2 changes: 1 addition & 1 deletion docs/dev/connectors/kinesis.md
@@ -331,7 +331,7 @@ Otherwise, the returned stream name is used.

### Threading Model

-Since Flink 1.4.0, `FlinkKinesisProducer` switches its default underlying KPL from a one-thread-per-request mode to a thread-pool mode. KPL in thread-pool mode uses a queue and thread pool to execute requests to Kinesis. This limits the number of threads that KPL's native process may create, and therefore greatly lowers CPU utilizations and improves efficiency. **Thus, We highly recommend Flink users use thread-pool model.** The default thread pool size is `10`. Users can set the pool size in `java.util.Properties` instance with key `ThreadPoolSize`, as shown in the above example.
+Since Flink 1.4.0, `FlinkKinesisProducer` switches its default underlying KPL from a one-thread-per-request mode to a thread-pool mode. KPL in thread-pool mode uses a queue and thread pool to execute requests to Kinesis. This limits the number of threads that KPL's native process may create, and therefore greatly lowers CPU utilization and improves efficiency. **Thus, We highly recommend Flink users use thread-pool model.** The default thread pool size is `10`. Users can set the pool size in `java.util.Properties` instance with key `ThreadPoolSize`, as shown in the above example.

Users can still switch back to one-thread-per-request mode by setting a key-value pair of `ThreadingModel` and `PER_REQUEST` in `java.util.Properties`, as shown in the code commented out in above example.
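A short sketch of the two configuration keys named above (key names as given in the text; connection and credential settings are omitted):

{% highlight java %}
Properties producerConfig = new Properties();

// stay in the default thread-pool mode, but size the pool explicitly
producerConfig.setProperty("ThreadPoolSize", "10");

// or opt back into the old one-thread-per-request behavior
// producerConfig.setProperty("ThreadingModel", "PER_REQUEST");
{% endhighlight %}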

2 changes: 1 addition & 1 deletion docs/dev/connectors/rabbitmq.md
@@ -66,7 +66,7 @@ RabbitMQ source, the following is required -
- *Use correlation ids*: Correlation ids are a RabbitMQ application feature.
You have to set it in the message properties when injecting messages into RabbitMQ.
The correlation id is used by the source to deduplicate any messages that
-  have been reproccessed when restoring from a checkpoint.
+  have been reprocessed when restoring from a checkpoint.
- *Non-parallel source*: The source must be non-parallel (parallelism set
to 1) in order to achieve exactly-once. This limitation is mainly due to
RabbitMQ's approach to dispatching messages from a single queue to multiple
2 changes: 1 addition & 1 deletion docs/dev/datastream_api.md
@@ -490,7 +490,7 @@ env.generateSequence(1,10).map(new MyMapper()).setBufferTimeout(timeoutMillis);
LocalStreamEnvironment env = StreamExecutionEnvironment.createLocalEnvironment
env.setBufferTimeout(timeoutMillis)

-env.genereateSequence(1,10).map(myMap).setBufferTimeout(timeoutMillis)
+env.generateSequence(1,10).map(myMap).setBufferTimeout(timeoutMillis)
{% endhighlight %}
</div>
</div>
2 changes: 1 addition & 1 deletion docs/dev/java8.md
@@ -169,7 +169,7 @@ Create/Import your Eclipse project.

If you are using Maven, you also need to change the Java version in your `pom.xml` for the `maven-compiler-plugin`. Otherwise right click the `JRE System Library` section of your project and open the `Properties` window in order to switch to a Java 8 JRE (or above) that supports Lambda Expressions.

-The Eclipse JDT compiler needs a special compiler flag in order to store type information in `.class` files. Open the JDT configuration file at `{project directoy}/.settings/org.eclipse.jdt.core.prefs` with your favorite text editor and add the following line:
+The Eclipse JDT compiler needs a special compiler flag in order to store type information in `.class` files. Open the JDT configuration file at `{project directory}/.settings/org.eclipse.jdt.core.prefs` with your favorite text editor and add the following line:

~~~
org.eclipse.jdt.core.compiler.codegen.lambda.genericSignature=generate
8 changes: 4 additions & 4 deletions docs/dev/libs/cep.md
@@ -150,7 +150,7 @@ it to a looping one by using [Quantifiers](#quantifiers). Each pattern can have

#### Quantifiers

-In FlinkCEP, you can specifiy looping patterns using these methods: `pattern.oneOrMore()`, for patterns that expect one or more occurrences of a given event (e.g. the `b+` mentioned before); and `pattern.times(#ofTimes)`, for patterns that
+In FlinkCEP, you can specify looping patterns using these methods: `pattern.oneOrMore()`, for patterns that expect one or more occurrences of a given event (e.g. the `b+` mentioned before); and `pattern.times(#ofTimes)`, for patterns that
expect a specific number of occurrences of a given type of event, e.g. 4 `a`'s; and `pattern.times(#fromTimes, #toTimes)`, for patterns that expect a specific minimum number of occurrences and a maximum number of occurrences of a given type of event, e.g. 2-4 `a`s.

You can make looping patterns greedy using the `pattern.greedy()` method, but you cannot yet make group patterns greedy. You can make all patterns, looping or not, optional using the `pattern.optional()` method.
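A compact sketch of the quantifiers named in the corrected sentence (Java API; `start` is assumed to be a pattern defined as elsewhere on this page):

{% highlight java %}
start.oneOrMore();             // one or more occurrences, e.g. the b+ case
start.times(4);                // exactly four occurrences
start.times(2, 4);             // between two and four occurrences
start.times(2, 4).greedy();    // match as many occurrences as possible
start.times(2, 4).optional();  // the whole pattern may also be absent
{% endhighlight %}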
@@ -1089,7 +1089,7 @@ Pattern<Event, ?> notNext = start.notNext("not");
if other events occur between the matching (negative) event and the previous matching event
(relaxed contiguity):</p>
{% highlight java %}
-Pattern<Event, ?> notFollowedBy = start.notFllowedBy("not");
+Pattern<Event, ?> notFollowedBy = start.notFollowedBy("not");
{% endhighlight %}
</td>
</tr>
@@ -1211,7 +1211,7 @@ val notNext = start.notNext("not")
if other events occur between the matching (negative) event and the previous matching event
(relaxed contiguity):</p>
{% highlight scala %}
-val notFollowedBy = start.notFllowedBy("not")
+val notFollowedBy = start.notFollowedBy("not")
{% endhighlight %}
</td>
</tr>
@@ -1448,7 +1448,7 @@ To treat partial patterns, the `select` and `flatSelect` API calls offer an over
parameters

* `PatternTimeoutFunction`/`PatternFlatTimeoutFunction`
-* [OutputTag]({{ site.baseurl }}/dev/stream/side_output.html) for the side output in which the timeouted matches will be returned
+* [OutputTag]({{ site.baseurl }}/dev/stream/side_output.html) for the side output in which the timed out matches will be returned
* and the known `PatternSelectFunction`/`PatternFlatSelectFunction`.

<div class="codetabs" markdown="1">
6 changes: 3 additions & 3 deletions docs/dev/libs/gelly/graph_generators.md
@@ -555,10 +555,10 @@ val graph = new RMatGraph(env.getJavaEnv, rnd, vertexCount, edgeCount).generate(
</div>
</div>

-The default RMat contants can be overridden as shown in the following example.
-The contants define the interdependence of bits from each generated edge's source
+The default RMat constants can be overridden as shown in the following example.
+The constants define the interdependence of bits from each generated edge's source
and target labels. The RMat noise can be enabled and progressively perturbs the
-contants while generating each edge.
+constants while generating each edge.
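The example referenced here is elided from the hunk; a rough Java sketch of what overriding the constants can look like (the `setConstants`/`setNoise` setters and the specific values are assumptions drawn from the surrounding Gelly documentation, not part of this change):

{% highlight java %}
RandomGenerableFactory<JDKRandomGenerator> rnd = new JDKRandomGeneratorFactory();

Graph<LongValue, NullValue, NullValue> graph =
    new RMatGraph<>(env, rnd, vertexCount, edgeCount)
        .setConstants(0.57f, 0.19f, 0.19f) // assumed probabilities for the A, B, C partitions
        .setNoise(true, 0.10f)             // assumed: enable noise that perturbs the constants per edge
        .generate();
{% endhighlight %}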

The RMat generator can be configured to produce a simple graph by removing self-loops
and duplicate edges. Symmetrization is performed either by a "clip-and-flip" throwing away
2 changes: 1 addition & 1 deletion docs/dev/libs/ml/cross_validation.md
@@ -54,7 +54,7 @@ Traditionally, training and testing would be done to train an algorithms as norm

In a train-test-holdout strategy we sacrifice the sample size of the initial fitting algorithm for increased confidence that our model is not over-fit.

-When using `trainTestHoldout` splitter, the *fraction* `Double` is replaced by a *fraction* array of length three. The first element coresponds to the portion to be used for training, second for testing, and third for holdout. The weights of this array are *relative*, e.g. an array `Array(3.0, 2.0, 1.0)` would results in approximately 50% of the observations being in the training set, 33% of the observations in the testing set, and 17% of the observations in holdout set.
+When using `trainTestHoldout` splitter, the *fraction* `Double` is replaced by a *fraction* array of length three. The first element corresponds to the portion to be used for training, second for testing, and third for holdout. The weights of this array are *relative*, e.g. an array `Array(3.0, 2.0, 1.0)` would results in approximately 50% of the observations being in the training set, 33% of the observations in the testing set, and 17% of the observations in holdout set.

### K-Fold Splits

4 changes: 2 additions & 2 deletions docs/dev/libs/storm_compatibility.md
@@ -54,10 +54,10 @@ Add the following dependency to your `pom.xml` if you want to execute Storm code
**Please note**: Do not add `storm-core` as a dependency. It is already included via `flink-storm`.

**Please note**: `flink-storm` is not part of the provided binary Flink distribution.
-Thus, you need to include `flink-storm` classes (and their dependencies) in your program jar (also called ueber-jar or fat-jar) that is submitted to Flink's JobManager.
+Thus, you need to include `flink-storm` classes (and their dependencies) in your program jar (also called uber-jar or fat-jar) that is submitted to Flink's JobManager.
See *WordCount Storm* within `flink-storm-examples/pom.xml` for an example how to package a jar correctly.

-If you want to avoid large ueber-jars, you can manually copy `storm-core-0.9.4.jar`, `json-simple-1.1.jar` and `flink-storm-{{site.version}}.jar` into Flink's `lib/` folder of each cluster node (*before* the cluster is started).
+If you want to avoid large uber-jars, you can manually copy `storm-core-0.9.4.jar`, `json-simple-1.1.jar` and `flink-storm-{{site.version}}.jar` into Flink's `lib/` folder of each cluster node (*before* the cluster is started).
For this case, it is sufficient to include only your own Spout and Bolt classes (and their internal dependencies) into the program jar.

# Execute Storm Topologies
2 changes: 1 addition & 1 deletion docs/dev/linking_with_flink.md
@@ -109,7 +109,7 @@ import org.apache.flink.api.scala.createTypeInformation
{% endhighlight %}

The reason is that Flink analyzes the types that are used in a program and generates serializers
-and comparaters for them. By having either of those imports you enable an implicit conversion
+and comparators for them. By having either of those imports you enable an implicit conversion
that creates the type information for Flink operations.

If you would rather use SBT, see [here]({{ site.baseurl }}/quickstart/scala_api_quickstart.html#sbt).
2 changes: 1 addition & 1 deletion docs/dev/migration.md
@@ -165,7 +165,7 @@ public class BufferingSink implements SinkFunction<Tuple2<String, Integer>>,
{% endhighlight %}


-The `CountMapper` is a `RichFlatMapFuction` which assumes a grouped-by-key input stream of the form
+The `CountMapper` is a `RichFlatMapFunction` which assumes a grouped-by-key input stream of the form
`(word, 1)`. The function keeps a counter for each incoming key (`ValueState<Integer> counter`) and if
the number of occurrences of a certain word surpasses the user-provided threshold, a tuple is emitted
containing the word itself and the number of occurrences.
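As a hedged illustration of the class being described, a rough sketch only; the field names, constructor, and state descriptor arguments here are assumptions, not the code from the migration guide:

{% highlight java %}
public class CountMapper extends RichFlatMapFunction<Tuple2<String, Integer>, Tuple2<String, Integer>> {

    // keyed state holding the running count per word
    private transient ValueState<Integer> counter;

    // user-provided threshold (assumed name)
    private final int threshold;

    public CountMapper(int threshold) {
        this.threshold = threshold;
    }

    @Override
    public void open(Configuration parameters) {
        counter = getRuntimeContext().getState(
            new ValueStateDescriptor<>("counter", Integer.class));
    }

    @Override
    public void flatMap(Tuple2<String, Integer> value, Collector<Tuple2<String, Integer>> out) throws Exception {
        Integer current = counter.value();
        int updated = (current == null ? 0 : current) + value.f1;
        counter.update(updated);

        // emit the word and its count once the threshold is surpassed
        if (updated >= threshold) {
            out.collect(new Tuple2<>(value.f0, updated));
        }
    }
}
{% endhighlight %}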
2 changes: 1 addition & 1 deletion docs/dev/packaging.md
@@ -48,7 +48,7 @@ automatically when exporting JAR files.

### Packaging Programs through Plans

-Additionally, we support packaging programs as *Plans*. Instead of defining a progam in the main
+Additionally, we support packaging programs as *Plans*. Instead of defining a program in the main
method and calling
`execute()` on the environment, plan packaging returns the *Program Plan*, which is a description of
the program's data flow. To do that, the program must implement the
2 changes: 1 addition & 1 deletion docs/dev/scala_api_extensions.md
@@ -61,7 +61,7 @@ data.map {
{% endhighlight %}

This extension introduces new methods in both the DataSet and DataStream Scala API
-that have a one-to-one correspondance in the extended API. These delegating methods
+that have a one-to-one correspondence in the extended API. These delegating methods
do support anonymous pattern matching functions.

#### DataSet API
