The watershed graph is an undirected graph containing one node for each watershed, and edges between watersheds that are adjacent. The edges are labeled with the lowest elevation that occurs along the boundary between the two watersheds corresponding to that edge.
In order to identify watersheds we must route flow for each cell in the (unfilled) terrain. Flow routing on the unfilled terrain can be accomplished for most cells by simply inspecting the neighboring cells. In order to assign directions in plateaus, it is necessary to examine the whole flat region. Plateaus have one or more cells on their boundary which have downslope neighbors. On each plateau, directions are assigned by doing a breadth-first search (BFS) starting from all the spill points of the plateau. When a node is discovered for the first time, it sets its flow direction towards the node which discovered it. This partitions the total flow between the multiple spill points.
Once we have determined flow directions for slopes and plateaus we can identify the sinks, which are the only cells with flow directions undefined. We give each sink a watershed label, which we now need to assign to each cell in the corresponding watershed. We sweep the terrain bottom-up with a horizontal plane, propagating watershed labels from each cell to its neighbors which flow into it. The sweep plane touches the cells in the grid in reverse topological order of the flow directions; when a cell u is processed, all the cells that it flows into have already been touched by the sweep plane and hence have set the watershed label of u to their value.
During the sweep, we propagate the watershed labels up, from each cell up to all neighbors that flow into it. The straightforward solution would be to keep a watershed label grid W and to access individual cells Wijas needed along the way. However, this may result in O(N) scattered accesses to W because cells are processed in topological order and are not necessarily well clustered spatially in this order. Instead we maintain an I/O efficient priority queue which contains the watershed label (``sent forward'' to the cell from one of its downslope neighbors). When we process a grid cell during the sweep, its watershed label is propagated to its upslope neighbors by inserting an element for each neighbor into the priority queue with key equal to the priority of the neighbor. The watershed label can then be found by performing extract_min operations on the priority queue. Because a cell can have multiple outgoing edges, we may need to perform several extract_min operations. If a cell receives multiple watershed labels, it must be on a boundary between watersheds. In this case we add an edge to the watershed graph labeled to indicate that the two watersheds merge at the elevation of this cell. We later eliminate all but the lowest edge between two watersheds. The algorithm for identifying watersheds and building the watershed graph is outlined below:
Step 1 uses Scan(N) I/Os. Steps 2, 3, 4 and 6 use Sort(N) I/Os. Step 5 can also be done in Sort(N) by using a standard I/O-efficient priority queue. Thus, partitioning the terrain into watersheds and constructing the watershed graph can be done in Sort(N) I/Os.