Skip to content

Commit

Permalink
[FLINK-4314] [tests] Fix test instability in JobManagerHAJobGraphReco…
Browse files Browse the repository at this point in the history
…veryITCase

The test was relying on the JobManager shutting down before the
TaskManager, which is not necessarily the case. If the TaskManager
shuts down before the JobManager, the JobGraph could reach the final
state FAILED, in which case all HA state is removed.

To circumvent this, we add a restart strategy.
  • Loading branch information
uce committed Aug 5, 2016
1 parent 6590a4c commit 0870007
Showing 1 changed file with 9 additions and 0 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,8 @@
import akka.testkit.TestActorRef;
import org.apache.commons.io.FileUtils;
import org.apache.commons.io.filefilter.TrueFileFilter;
import org.apache.flink.api.common.ExecutionConfig;
import org.apache.flink.api.common.restartstrategy.RestartStrategies;
import org.apache.flink.configuration.ConfigConstants;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.runtime.akka.AkkaUtils;
Expand Down Expand Up @@ -140,6 +142,13 @@ public void testJobPersistencyWhenJobManagerShutdown() throws Exception {

JobGraph jobGraph = createBlockingJobGraph();

// Set restart strategy to guard against shut down races.
// If the TM fails before the JM, it might happen that the
// Job is failed, leading to state removal.
ExecutionConfig ec = new ExecutionConfig();
ec.setRestartStrategy(RestartStrategies.fixedDelayRestart(Integer.MAX_VALUE, 100));
jobGraph.setExecutionConfig(ec);

ActorGateway jobManager = flink.getLeaderGateway(deadline.timeLeft());

// Submit the job
Expand Down

0 comments on commit 0870007

Please sign in to comment.