Traverse dependency graphs in BFS order
Let’s assume that Flink is set up with YARN and HDFS as dependencies, i.e., the dependency graph looks like this:

flink-1.7.2 -> yarn-3.1.1
flink-1.7.2 -> hdfs-3.1.1
hdfs-3.1.1 -> ()
yarn-3.1.1 -> ()

To determine the order in which dependencies are set up, the graph is reversed:

hdfs-3.1.1 -> flink-1.7.2
yarn-3.1.1 -> flink-1.7.2
flink-1.7.2 -> ()
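
The reversal step can be sketched as follows. This is a minimal illustration, not the code of `DependencyGraph`: the adjacency-map representation and the names `ReverseSketch`, `Graph`, and `reverse` are assumptions made for this example.

```scala
object ReverseSketch {
  // Hypothetical representation: node -> the nodes it points to.
  type Graph = Map[String, Set[String]]

  /** Returns a graph with every edge flipped; every node is kept as a key. */
  def reverse(g: Graph): Graph = {
    // Start with all nodes and no edges, so nodes without incoming edges survive.
    val empty: Graph = g.keys.map(_ -> Set.empty[String]).toMap
    g.foldLeft(empty) { case (acc, (from, tos)) =>
      tos.foldLeft(acc) { (a, to) =>
        a.updated(to, a.getOrElse(to, Set.empty[String]) + from)
      }
    }
  }

  // The dependency graph from the example above.
  val dependencies: Graph = Map(
    "flink-1.7.2" -> Set("yarn-3.1.1", "hdfs-3.1.1"),
    "hdfs-3.1.1"  -> Set.empty[String],
    "yarn-3.1.1"  -> Set.empty[String]
  )

  val reversed: Graph = reverse(dependencies)
}
```

In this sketch, `ReverseSketch.reversed` maps hdfs-3.1.1 and yarn-3.1.1 to flink-1.7.2, and flink-1.7.2 to the empty set, matching the reversed graph listed above.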

The graph is then traversed by starting with the nodes that have in-degree 0 in the reversed graph (here hdfs-3.1.1 and yarn-3.1.1) and adding their successors to the list of nodes to visit. If DFS order is used, the following activation order is possible:

hdfs-3.1.1, flink-1.7.2, yarn-3.1.1

That is because the traversal starts with hdfs-3.1.1 and then follows the edge to flink-1.7.2 before continuing with yarn-3.1.1.

However, if a Flink YARN session is used, then Flink needs to connect to YARN at startup. Therefore, all dependencies of Flink have to be activated before it. This is achieved by traversing the graph in BFS order.
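
The difference between the two orders can be reproduced with a small sketch. It is not the actual `collect` method: the `List`-based worklist, the `enqueue` parameter, and the names `TraversalSketch`, `roots`, `dfsOrder`, and `bfsOrder` are assumptions made for this example. Only the two ways of combining the children with the remaining worklist mirror the change in this commit.

```scala
import scala.annotation.tailrec

object TraversalSketch {
  // Hypothetical representation: node -> its successors in the reversed graph.
  type Graph = Map[String, List[String]]

  // Reversed graph from the example: edges point from a dependency to its dependents.
  val reversed: Graph = Map(
    "hdfs-3.1.1"  -> List("flink-1.7.2"),
    "yarn-3.1.1"  -> List("flink-1.7.2"),
    "flink-1.7.2" -> Nil
  )

  /** Visits nodes from `toVisit`; `enqueue` decides where children go in the worklist. */
  @tailrec
  def collect(
      g: Graph,
      toVisit: List[String],
      visited: List[String],
      enqueue: (List[String], List[String]) => List[String]
  ): List[String] = toVisit match {
    case Nil => visited.reverse // `visited` is built newest-first
    case next :: rest if visited.contains(next) =>
      collect(g, rest, visited, enqueue) // already activated, skip
    case next :: rest =>
      val children = g.getOrElse(next, Nil).filterNot(visited.contains)
      collect(g, enqueue(rest, children), next :: visited, enqueue)
  }

  val roots = List("hdfs-3.1.1", "yarn-3.1.1")

  // Children in front of the worklist: depth-first (the old behaviour).
  val dfsOrder = collect(reversed, roots, Nil, (rest, children) => children ++ rest)

  // Children at the back of the worklist: breadth-first (the new behaviour).
  val bfsOrder = collect(reversed, roots, Nil, (rest, children) => rest ++ children)
}
```

On this example, `dfsOrder` is hdfs-3.1.1, flink-1.7.2, yarn-3.1.1, while `bfsOrder` is hdfs-3.1.1, yarn-3.1.1, flink-1.7.2, so only the breadth-first variant activates YARN before Flink.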
he-sk committed Sep 28, 2020
1 parent 9871cf4 commit 02983c5
Showing 1 changed file with 2 additions and 2 deletions.
@@ -173,7 +173,7 @@ class DependencyGraph[T: ClassTag] {
throw new Exception("Cannot reverse empty Graph!")
}

- /** Collects descendants in a depth-first manner starting from the given set.
+ /** Collects descendants in a breadth-first manner starting from the given set.
*
* @param toVisit A set of nodes that are yet to be visited.
* @param visited A list of already visited nodes.
@@ -191,7 +191,7 @@ class DependencyGraph[T: ClassTag] {
case x: Any => !visited.contains(x)
}

- collect(children ++ toVisit.tail, next :: visited, excluded)
+ collect(toVisit.tail ++ children, next :: visited, excluded)
}
}
