MINOR: fix typos for doc (apache#13883)
Reviewers: Divij Vaidya <[email protected]>
berg223 authored Jun 21, 2023
1 parent dd25753 commit 49c1697
Showing 6 changed files with 19 additions and 19 deletions.
2 changes: 1 addition & 1 deletion docs/design.html
@@ -339,7 +339,7 @@ <h3 class="anchor-heading"><a id="replication" class="anchor-link"></a><a href="
and mark the broker offline.
<p>
We refer to nodes satisfying these two conditions as being "in sync" to avoid the vagueness of "alive" or "failed". The leader keeps track of the set of "in sync" replicas,
which is known as the ISR. If either of these conditions fail to be satisified, then the broker will be removed from the ISR. For example,
which is known as the ISR. If either of these conditions fail to be satisfied, then the broker will be removed from the ISR. For example,
if a follower dies, then the controller will notice the failure through the loss of its session, and will remove the broker from the ISR.
On the other hand, if the follower lags too far behind the leader but still has an active session, then the leader can also remove it from the ISR.
The determination of lagging replicas is controlled through the <code>replica.lag.time.max.ms</code> configuration.
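<p>A quick way to see which replicas are currently in the ISR, together with the broker-side property that governs the lag threshold, is sketched below; the bootstrap address, topic name, and threshold value are illustrative placeholders rather than recommendations:</p>

<pre class="line-numbers"><code class="language-bash"> # List partitions, their leaders, and the current ISR (placeholder host and topic)
 &gt; bin/kafka-topics.sh --describe --bootstrap-server localhost:9092 --topic my-topic

 # server.properties: how long a follower may lag before the leader drops it from the ISR (example value)
 replica.lag.time.max.ms=30000</code></pre>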
14 changes: 7 additions & 7 deletions docs/ops.html
@@ -892,7 +892,7 @@ <h5 class="anchor-heading"><a id="georeplication-config-conflicts" class="anchor
</p>

<p>
It is therefore important to keep the MirrorMaker configration consistent across replication flows to the same target cluster. This can be achieved, for example, through automation tooling or by using a single, shared MirrorMaker configuration file for your entire organization.
It is therefore important to keep the MirrorMaker configuration consistent across replication flows to the same target cluster. This can be achieved, for example, through automation tooling or by using a single, shared MirrorMaker configuration file for your entire organization.
</p>
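<p>One way to achieve this is to drive every flow that targets the same cluster from a single shared properties file. A minimal sketch follows; the cluster aliases, bootstrap addresses, and topic pattern are placeholders:</p>

<pre class="line-numbers"><code class="language-bash"> # mm2.properties shared across the organization (illustrative values)
 clusters = primary, backup
 primary.bootstrap.servers = primary-broker:9092
 backup.bootstrap.servers = backup-broker:9092

 # Replicate all topics from primary to backup
 primary->backup.enabled = true
 primary->backup.topics = .*

 # Start MirrorMaker with the shared file
 &gt; bin/connect-mirror-maker.sh mm2.properties</code></pre>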

<h5 class="anchor-heading"><a id="georeplication-best-practice" class="anchor-link"></a><a href="#georeplication-best-practice">Best Practice: Consume from Remote, Produce to Local</a></h5>
@@ -1415,7 +1415,7 @@ <h5 class="anchor-heading"><a id="ext4" class="anchor-link"></a><a href="#ext4">
<h4 class="anchor-heading"><a id="replace_disk" class="anchor-link"></a><a href="#replace_disk">Replace KRaft Controller Disk</a></h4>
<p>When Kafka is configured to use KRaft, the controllers store the cluster metadata in the directory specified in <code>metadata.log.dir</code> -- or the first log directory, if <code>metadata.log.dir</code> is not configured. See the documentation for <code>metadata.log.dir</code> for details.</p>

<p>If the data in the cluster metdata directory is lost either because of hardware failure or the hardware needs to be replaced, care should be taken when provisioning the new controller node. The new controller node should not be formatted and started until the majority of the controllers have all of the committed data. To determine if the majority of the controllers have the committed data, run the <code>kafka-metadata-quorum.sh</code> tool to describe the replication status:</p>
<p>If the data in the cluster metadata directory is lost either because of hardware failure or the hardware needs to be replaced, care should be taken when provisioning the new controller node. The new controller node should not be formatted and started until the majority of the controllers have all of the committed data. To determine if the majority of the controllers have the committed data, run the <code>kafka-metadata-quorum.sh</code> tool to describe the replication status:</p>

<pre class="line-numbers"><code class="language-bash"> &gt; bin/kafka-metadata-quorum.sh --bootstrap-server broker_host:port describe --replication
NodeId LogEndOffset Lag LastFetchTimestamp LastCaughtUpTimestamp Status
@@ -1427,7 +1427,7 @@ <h4 class="anchor-heading"><a id="replace_disk" class="anchor-link"></a><a href=

<pre class="line-numbers"><code class="language-bash"> &gt; bin/kafka-storage.sh format --cluster-id uuid --config server_properties</code></pre>

<p>It is possible for the <code>bin/kafka-storage.sh format</code> command above to fail with a message like <code>Log directory ... is already formatted</code>. This can happend when combined mode is used and only the metadata log directory was lost but not the others. In that case and only in that case, can you run the <code>kafka-storage.sh format</code> command with the <code>--ignore-formatted</code> option.</p>
<p>It is possible for the <code>bin/kafka-storage.sh format</code> command above to fail with a message like <code>Log directory ... is already formatted</code>. This can happen when combined mode is used and only the metadata log directory was lost but not the others. In that case and only in that case, can you run the <code>kafka-storage.sh format</code> command with the <code>--ignore-formatted</code> option.</p>
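<p>In that specific situation, the format command might look like the following sketch (the cluster id and properties file are placeholders, as in the example above):</p>

<pre class="line-numbers"><code class="language-bash"> &gt; bin/kafka-storage.sh format --cluster-id uuid --config server_properties --ignore-formatted</code></pre>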

<p>Start the KRaft controller after formatting the log directories.</p>

@@ -1779,7 +1779,7 @@ <h4 class="anchor-heading"><a id="remote_jmx" class="anchor-link"></a><a href="#
<tr>
<td>ZooKeeper client request latency</td>
<td>kafka.server:type=ZooKeeperClientMetrics,name=ZooKeeperRequestLatencyMs</td>
<td>Latency in millseconds for ZooKeeper requests from broker.</td>
<td>Latency in milliseconds for ZooKeeper requests from broker.</td>
</tr>
<tr>
<td>ZooKeeper connection status</td>
@@ -2447,7 +2447,7 @@ <h4 class="anchor-heading"><a id="consumer_monitoring" class="anchor-link"></a><
<td>kafka.consumer:type=consumer-metrics,client-id=([-.\w]+)</td>
</tr>
<tr>
<td>commited-time-ns-total</td>
<td>committed-time-ns-total</td>
<td>The total time the Consumer spent in committed in nanoseconds.</td>
<td>kafka.consumer:type=consumer-metrics,client-id=([-.\w]+)</td>
</tr>
@@ -3461,7 +3461,7 @@ <h5 class="anchor-heading"><a id="kafka_streams_cache_monitoring" class="anchor-
</tr>
<tr>
<td>hit-ratio-min</td>
<td>The mininum cache hit ratio.</td>
<td>The minimum cache hit ratio.</td>
<td>kafka.streams:type=stream-record-cache-metrics,thread-id=([-.\w]+),task-id=([-.\w]+),record-cache-id=([-.\w]+)</td>
</tr>
<tr>
@@ -3562,7 +3562,7 @@ <h5 class="anchor-heading"><a id="kraft_dump_log" class="anchor-link"></a><a hre

<pre class="line-numbers"><code class="language-bash"> &gt; bin/kafka-dump-log.sh --cluster-metadata-decoder --files metadata_log_dir/__cluster_metadata-0/00000000000000000000.log</code></pre>

<p>This command decodes and prints the recrods in the a cluster metadata snapshot:</p>
<p>This command decodes and prints the records in the a cluster metadata snapshot:</p>

<pre class="line-numbers"><code class="language-bash"> &gt; bin/kafka-dump-log.sh --cluster-metadata-decoder --files metadata_log_dir/__cluster_metadata-0/00000000000000000100-0000000001.checkpoint</code></pre>
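<p>Snapshot files sit next to the log segments in the metadata log directory, so a simple listing (the path below matches the examples above and is a placeholder) is enough to find the newest checkpoint to decode:</p>

<pre class="line-numbers"><code class="language-bash"> &gt; ls metadata_log_dir/__cluster_metadata-0/*.checkpoint</code></pre>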

6 changes: 3 additions & 3 deletions docs/security.html
@@ -262,7 +262,7 @@ <h5>Host Name Verification</h5>
[ CA_default ]

base_dir = .
certificate = $base_dir/cacert.pem # The CA certifcate
certificate = $base_dir/cacert.pem # The CA certificate
private_key = $base_dir/cakey.pem # The CA private key
new_certs_dir = $base_dir # Location for new certs after signing
database = $base_dir/index.txt # Database index file
@@ -437,7 +437,7 @@ <h5>SSL key and certificates in PEM format</h5>
<li><b>Failure to copy extension fields</b><br>
CA operators are often hesitant to copy and requested extension fields from CSRs and prefer to specify these themselves as this makes it
harder for a malicious party to obtain certificates with potentially misleading or fraudulent values.
It is adviseable to double check signed certificates, whether these contain all requested SAN fields to enable proper hostname verification.
It is advisable to double check signed certificates, whether these contain all requested SAN fields to enable proper hostname verification.
The following command can be used to print certificate details to the console, which should be compared with what was originally requested:
<pre class="line-numbers"><code class="language-bash">&gt; openssl x509 -in certificate.crt -text -noout</code></pre>
</li>
@@ -1269,7 +1269,7 @@ <h3 class="anchor-heading"><a id="security_sasl" class="anchor-link"></a><a href
</ol>

<h3 class="anchor-heading"><a id="security_authz" class="anchor-link"></a><a href="#security_authz">7.5 Authorization and ACLs</a></h3>
Kafka ships with a pluggable authorization framework, which is configured with the <tt>authorizer.class.name</tt> property in the server confgiuration.
Kafka ships with a pluggable authorization framework, which is configured with the <tt>authorizer.class.name</tt> property in the server configuration.
Configured implementations must extend <code>org.apache.kafka.server.authorizer.Authorizer</code>.
Kafka provides default implementations which store ACLs in the cluster metadata (either Zookeeper or the KRaft metadata log).
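<p>For reference, a sketch of the corresponding server configuration is shown below; StandardAuthorizer is the implementation Kafka ships for KRaft-based clusters, and AclAuthorizer the ZooKeeper-based one:</p>

<pre class="line-numbers"><code class="language-bash"> # KRaft mode: ACLs are stored in the cluster metadata log
 authorizer.class.name=org.apache.kafka.metadata.authorizer.StandardAuthorizer

 # ZooKeeper mode alternative
 # authorizer.class.name=kafka.security.authorizer.AclAuthorizer</code></pre>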

6 changes: 3 additions & 3 deletions docs/streams/developer-guide/dsl-api.html
@@ -3324,8 +3324,8 @@ <h5><a class="toc-backref" href="#id34">KTable-KTable Foreign-Key
<div class="figure align-center" id="id35">
<img class="centered" src="/{{version}}/images/streams-sliding-windows.png">
<p class="caption"><span class="caption-text">This diagram shows windowing a stream of data records with sliding windows. The overlap of
the sliding window snapshots varies depending on the record times. In this diagram, the time numbers represent miliseconds. For example,
t=5 means &#8220;at the five milisecond mark&#8220;.</span></p>
the sliding window snapshots varies depending on the record times. In this diagram, the time numbers represent milliseconds. For example,
t=5 means &#8220;at the five millisecond mark&#8220;.</span></p>
</div>
<p>Sliding windows are aligned to the data record timestamps, not to the epoch. In contrast to hopping and tumbling windows,
the lower and upper window time interval bounds of sliding windows are both inclusive.</p>
@@ -3385,7 +3385,7 @@ <h5><a class="toc-backref" href="#id34">KTable-KTable Foreign-Key
Common examples of this are sending alerts or delivering results to a system that doesn't support updates.
</p>
<p>Suppose that you have an hourly windowed count of events per user.
If you want to send an alert when a user has <em>less than</em> three events in an hour, you have a real challange.
If you want to send an alert when a user has <em>less than</em> three events in an hour, you have a real challenge.
All users would match this condition at first, until they accrue enough events, so you cannot simply
send an alert when someone matches the condition; you have to wait until you know you won't see any more events for a particular window
and <em>then</em> send the alert.
2 changes: 1 addition & 1 deletion docs/streams/developer-guide/dsl-topology-naming.html
@@ -66,7 +66,7 @@ <h2>Readability Issues</h2>

<p>
By saying there is a readability trade-off, we are referring to viewing a description of the topology.
When you render the string description of your topology via the <code>Topology#desribe()</code>
When you render the string description of your topology via the <code>Topology#describe()</code>
method, you can see what the processor is, but you don't have any context for its business purpose.
For example, consider the following simple topology:

8 changes: 4 additions & 4 deletions docs/streams/upgrade-guide.html
@@ -158,8 +158,8 @@ <h3><a id="streams_api_changes_350" href="#streams_api_changes_350">Streams API

<p>
<a href="https://cwiki.apache.org/confluence/display/KAFKA/KIP-904%3A+Kafka+Streams+-+Guarantee+subtractor+is+called+before+adder+if+key+has+not+changed">KIP-904</a>
improves the implemenation of KTable aggregations. In general, an input KTable update triggers a result refinent for two rows;
however, prior to KIP-904, if both refinements happend to the same result row, two independent updates to the same row are applied, resulting in spurious itermediate results.
improves the implementation of KTable aggregations. In general, an input KTable update triggers a result refinent for two rows;
however, prior to KIP-904, if both refinements happen to the same result row, two independent updates to the same row are applied, resulting in spurious itermediate results.
KIP-904 allows us to detect this case, and to only apply a single update avoiding spurious intermediate results.
</p>

@@ -284,7 +284,7 @@ <h3><a id="streams_api_changes_300" href="#streams_api_changes_300">Streams API
<p>
The public <code>topicGroupId</code> and <code>partition</code> fields on TaskId have been deprecated and replaced with getters. Please migrate to using the new <code>TaskId.subtopology()</code>
(which replaces <code>topicGroupId</code>) and <code>TaskId.partition()</code> APIs instead. Also, the <code>TaskId#readFrom</code> and <code>TaskId#writeTo</code> methods have been deprecated
and will be removed, as they were never intended for public use. We have also deprecated the <code>org.apache.kafka.streams.processsor.TaskMetadata</code> class and introduced a new interface
and will be removed, as they were never intended for public use. We have also deprecated the <code>org.apache.kafka.streams.processor.TaskMetadata</code> class and introduced a new interface
<code>org.apache.kafka.streams.TaskMetadata</code> to be used instead. This change was introduced to better reflect the fact that <code>TaskMetadata</code> was not meant to be instantiated outside
of Kafka codebase.
Please note that the new <code>TaskMetadata</code> offers APIs that better represent the task id as an actual <code>TaskId</code> object instead of a String. Please migrate to the new
@@ -733,7 +733,7 @@ <h3 class="anchor-heading"><a id="streams_api_changes_210" class="anchor-link"><
Also, window sizes and retention times are now specified as <code>Duration</code> type in <code>Stores</code> class.
The <code>Window</code> class has new methods <code>#startTime()</code> and <code>#endTime()</code> that return window start/end timestamp as <code>Instant</code>.
For interactive queries, there are new <code>#fetch(...)</code> overloads taking <code>Instant</code> arguments.
Additionally, punctuations are now registerd via <code>ProcessorContext#schedule(Duration interval, ...)</code>.
Additionally, punctuations are now registered via <code>ProcessorContext#schedule(Duration interval, ...)</code>.
For more details, see <a href="https://cwiki.apache.org/confluence/display/KAFKA/KIP-358%3A+Migrate+Streams+API+to+Duration+instead+of+long+ms+times">KIP-358</a>.
</p>

