diff --git a/docs/dev/connectors/kafka.md b/docs/dev/connectors/kafka.md
index 5d3e66d66e850..ad4cc2fe5894c 100644
--- a/docs/dev/connectors/kafka.md
+++ b/docs/dev/connectors/kafka.md
@@ -537,7 +537,7 @@ chosen by passing appropriate `semantic` parameter to the `FlinkKafkaProducer011
 * `Semantic.NONE`: Flink will not guarantee anything. Produced records can be lost or they can be duplicated.
 * `Semantic.AT_LEAST_ONCE` (default setting): similar to `setFlushOnCheckpoint(true)` in
-  `FlinkKafkaProducer010`. his guarantees that no records will be lost (although they can be duplicated).
+  `FlinkKafkaProducer010`. This guarantees that no records will be lost (although they can be duplicated).
 * `Semantic.EXACTLY_ONCE`: uses Kafka transactions to provide exactly-once semantic.
@@ -579,7 +579,7 @@ un-finished transaction. In other words after following sequence of events:
 3. User committed `transaction2`
 
 Even if records from `transaction2` are already committed, they will not be visible to
-the consumers until `transaction1` is committed or aborted. This hastwo implications:
+the consumers until `transaction1` is committed or aborted. This has two implications:
 
  * First of all, during normal working of Flink applications, user can expect a delay in visibility of
 the records produced into Kafka topics, equal to average time between completed checkpoints.
diff --git a/docs/dev/stream/operators/windows.md b/docs/dev/stream/operators/windows.md
index 3c0cd8509d3fa..e161854bdcd9d 100644
--- a/docs/dev/stream/operators/windows.md
+++ b/docs/dev/stream/operators/windows.md
@@ -29,7 +29,7 @@ programmer can benefit to the maximum from its offered functionality.
 The general structure of a windowed Flink program is presented below. The first snippet refers to
 *keyed* streams, while the second to *non-keyed* ones. As one can see, the only difference is the
 `keyBy(...)` call for the keyed streams
-and the `window(...)` which becomes `windowAll(...)` for non-keyed streams. These is also going to serve as a roadmap
+and the `window(...)` which becomes `windowAll(...)` for non-keyed streams. This is also going to serve as a roadmap
 for the rest of the page.
 
 **Keyed Windows**
@@ -1383,7 +1383,7 @@ and then calculating the top-k elements within the same window in the second ope
 
 Windows can be defined over long periods of time (such as days, weeks, or months) and therefore accumulate very large state. There are a couple of rules to keep in mind when estimating the storage requirements of your windowing computation:
 
-1. Flink creates one copy of each element per window to which it belongs. Given this, tumbling windows keep one copy of each element (an element belongs to exactly window unless it is dropped late). In contrast, sliding windows create several of each element, as explained in the [Window Assigners](#window-assigners) section. Hence, a sliding window of size 1 day and slide 1 second might not be a good idea.
+1. Flink creates one copy of each element per window to which it belongs. Given this, tumbling windows keep one copy of each element (an element belongs to exactly one window unless it is dropped late). In contrast, sliding windows create several of each element, as explained in the [Window Assigners](#window-assigners) section. Hence, a sliding window of size 1 day and slide 1 second might not be a good idea.
 
 2. `ReduceFunction`, `AggregateFunction`, and `FoldFunction` can significantly reduce the storage requirements, as they eagerly aggregate elements and store only one value per window. In contrast, just using a `ProcessWindowFunction` requires accumulating all elements.
diff --git a/docs/ops/production_ready.md b/docs/ops/production_ready.md
index 303e7a71bb6cc..0d11b8a1866b1 100644
--- a/docs/ops/production_ready.md
+++ b/docs/ops/production_ready.md
@@ -32,7 +32,7 @@ important and need **careful considerations** if you plan to bring your Flink jo
 Flink provides out-of-the-box defaults to make usage and adoption of Flink easier. For many users and scenarios, those
 defaults are good starting points for development and completely sufficient for "one-shot" jobs.
 
-However, once you are planning to bring a Flink appplication to production the requirements typically increase. For example,
+However, once you are planning to bring a Flink application to production the requirements typically increase. For example,
 you want your job to be (re-)scalable and to have a good upgrade story for your job and new Flink versions.
 In the following, we present a collection of configuration options that you should check before your job goes into production.
diff --git a/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/environment/StreamExecutionEnvironment.java b/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/environment/StreamExecutionEnvironment.java
index 46c821edfa2e8..cc45ddc580ffb 100644
--- a/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/environment/StreamExecutionEnvironment.java
+++ b/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/environment/StreamExecutionEnvironment.java
@@ -747,7 +747,7 @@ public final <OUT> DataStreamSource<OUT> fromElements(Class<OUT> type, OUT... da
 	 * elements, it may be necessary to manually supply the type information via
 	 * {@link #fromCollection(java.util.Collection, org.apache.flink.api.common.typeinfo.TypeInformation)}.
 	 *
-	 * <p>Note that this operation will result in a non-parallel data stream source, i.e. a data stream source with a
+	 * <p>Note that this operation will result in a non-parallel data stream source, i.e. a data stream source with
 	 * parallelism one.
 	 *
 	 * @param data
@@ -784,7 +784,7 @@ public <OUT> DataStreamSource<OUT> fromCollection(Collection<OUT> data) {
 	 * Creates a data stream from the given non-empty collection.
 	 *
 	 * <p>Note that this operation will result in a non-parallel data stream source,
-	 * i.e., a data stream source with a parallelism one.
+	 * i.e., a data stream source with parallelism one.
 	 *
 	 * @param data
 	 *		The collection of elements to create the data stream from
@@ -843,7 +843,7 @@ public <OUT> DataStreamSource<OUT> fromCollection(Iterator<OUT> data, Class
 	 * {@link #fromCollection(java.util.Iterator, Class)} does not supply all type information.
 	 *
 	 * <p>Note that this operation will result in a non-parallel data stream source, i.e.,
-	 * a data stream source with a parallelism one.
+	 * a data stream source with parallelism one.
 	 *
 	 * @param data
 	 *		The iterator of elements to create the data stream from