# TerraFlow Algorithm Outline

FILL | FLOW

Our approach to optimizing performance develops algorithms that explicitly manage data placement and movement, which we refer to as external memory algorithms or I/O-efficient algorithms. In order to be effective, I/O-efficient algorithms must be based on reasonably accurate, yet simple, model of the memory system's characteristics, the I/O-model.

TerraFlow takes as input an elevation grid and outputs the flow direction grid and the flow accumulation grid (it has options to output intermediate grids like the watershed grid, the plateaus grid a.s.o). TerraFlow has two components: FILL and FLOW.

• The FILL program performes the flow routing and computes the flow direction grid.
• The FLOW program computes the flow accumulation grid using the elevation and the flow direction grids.

FILL and FLOW are outlines below. For additional details see

## FILL

Flow routing is hard because of flat areas. Flat areas are of two types: plateaus and sinks.

• A plateau is a flat area which has at least a spill-point.
Plateaus pose a problem for flow routing because there is no obvious direction in which to route flow. The intuition is that the flow direction of the plateau cells should be such that, globally, the flow of the plateau is directed towards its spill-points.
• A sink is a flat area with no spill-points.
Sinks pose a problem for flow routing because they do not let water flow out; this violates a useful assumption of flow routing, which assumes that all water must eventually flow off the grid (hence there cannot be cells in the terrain which act as holes in the ground). Although sinks may represent actual geographic features, they are often considered to be artifacts of the input data generation and are removed to allow the complete definition of flow routes across the terrain. Flooding (filling) the terrain eliminates sinks by raising parts of the terrain so that all flow can be routed off the edge.
Partitioning the terrain into watersheds is a key element of our approach to flooding. A watershed consists of a set of cells around a sink that route some flow to the sink. In the case of cells that may route flow to multiple sinks, we make an arbitrary choice.

The three main phases of FILL are:

You may want to check out how we deal with nodata.

Each of the three phases of FILL can be solved using Sort(N) I/Os.

## FLOW

Flow accumulation of a cell represents the total amount of flow draining through a cell of the terrain. We assume that every grid cell initially has one unit of flow (water) and that each grid cell distributes the flow, initial as well as incoming, to the neighbors pointed to by its flow directions. The standard (in-memory) algorithm for flow accumulation uses O(N) I/Os.

..in progress..in the meanwhile see

<laura@cs.duke.edu>