[STREAMING][KAFKA][DOC] clarify kafka settings needed for larger batches
## What changes were proposed in this pull request?

Minor doc change to mention the Kafka configuration needed for larger Spark batches.

## How was this patch tested?

Doc change only, confirmed via jekyll.

The configuration issue was discussed / confirmed with users on the mailing list.

Author: cody koeninger <[email protected]>

Closes apache#15570 from koeninger/kafka-doc-heartbeat.
koeninger authored and zsxwing committed Oct 21, 2016
1 parent 268ccb9 commit c9720b2
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions docs/streaming-kafka-0-10-integration.md
@@ -48,6 +48,7 @@ Each item in the stream is a [ConsumerRecord](http://kafka.apache.org/0100/javad
</div>

For possible kafkaParams, see [Kafka consumer config docs](http://kafka.apache.org/documentation.html#newconsumerconfigs).
If your Spark batch duration is larger than the default Kafka heartbeat session timeout (30 seconds), increase heartbeat.interval.ms and session.timeout.ms appropriately. For batches larger than 5 minutes, this will require changing group.max.session.timeout.ms on the broker.
Note that the example sets enable.auto.commit to false, for discussion see [Storing Offsets](streaming-kafka-0-10-integration.html#storing-offsets) below.
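
As an illustration of the timeout note above, here is a minimal sketch of how the two consumer settings might be derived from a batch duration. The class and method names are hypothetical, the 30-second headroom is an arbitrary assumption, and the one-third ratio follows the general Kafka guidance that the heartbeat interval should be no higher than a third of the session timeout. The 5-minute broker limit is taken from the text above. Only `java.util` is used, so the map would still have to be merged into the real `kafkaParams`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Spark or Kafka.
class KafkaParamsSketch {

    // Broker-side group.max.session.timeout.ms implied by the text above (5 minutes).
    static final int BROKER_MAX_SESSION_TIMEOUT_MS = 300_000;

    // Assumption: leave 30 seconds of headroom on top of the batch duration.
    static final int HEADROOM_MS = 30_000;

    // Builds the two consumer timeout settings for a given Spark batch duration.
    static Map<String, Object> timeoutParams(int batchDurationMs) {
        int sessionTimeoutMs = batchDurationMs + HEADROOM_MS;
        // Keep the heartbeat interval at one third of the session timeout.
        int heartbeatIntervalMs = sessionTimeoutMs / 3;

        Map<String, Object> params = new HashMap<>();
        params.put("session.timeout.ms", sessionTimeoutMs);
        params.put("heartbeat.interval.ms", heartbeatIntervalMs);
        return params;
    }

    // True when the session timeout is above the 5-minute broker limit,
    // in which case group.max.session.timeout.ms must be raised on the broker.
    static boolean exceedsBrokerLimit(int sessionTimeoutMs) {
        return sessionTimeoutMs > BROKER_MAX_SESSION_TIMEOUT_MS;
    }

    public static void main(String[] args) {
        // Example: a 2-minute batch duration.
        Map<String, Object> params = timeoutParams(120_000);
        System.out.println(params);
        int session = (Integer) params.get("session.timeout.ms");
        System.out.println("broker change needed: " + exceedsBrokerLimit(session));
    }
}
```

The values are stored as integers because both settings are integer-typed consumer configs. The headroom and ratio are starting points to tune, not recommendations from the patch.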

### LocationStrategies