Hierarchical Watershed Decomposition

Hydrologists use DEMs to study river networks of terrains in order to manage drinking water supplies, track pollutants, create flood maps, and more. Often it is not necessary to study the entire terrain or river network at once; frequently interest is only on regions that are downstream of a particular river, or the upstream areas that contribute flow to a particular river. By decomposing the terrain into a set of disjoint hydrologic units (or watersheds) — regions where all water within the region flows towards a single, common outlet — areas of interest can quickly be identified without having to examine the entire terrain. Since the desired size of a unit can vary from application to application, it is advantageous to have a hierarchical decomposition of the terrain into nested units of arbitrary small size. Furthermore, it is useful if the (smallest) units are assigned a unique label that also encodes topological properties such as upstream and downstream neighbors; thus making it possible to automatically identify hydrological units of interest based on the label alone.


Some of the main disadvantages of most of the current methods for determining hydrological units on grid DEMs are that they are either very ad-hoc and manual, or automatic but simple and without naturally defining a hierarchical decomposition that encode topological properties. For example, the Hydrologic Unit System developed by the Water Resources Division of the United States Geological Survey (USGS) defines a hierarchical decomposition of the terrain in the United States. At the top level, the US is divided into 21 regions which are further divided into 222 sub-regions; sub-regions are further divided into basins, sub-basins, watersheds and sub-watersheds, offering a total of six levels of decomposition. The USGS assigns a hydrologic unit code (HUC) to each hydrologic unit. A HUC is a two to twelve digit code where each pair of successive digits indicate the region, sub-region, basin, sub-basin, watershed, and sub-watershed, respectively. Refer to Figure 1. Two of the main disadvantages of the HUC codes are that they are obtained manually (i.e. there is no automatic way to compute the codes for a given DEM), and that they are for the most part assigned arbitrarily (i.e. they do not encode topological information).

Figure 1:

Upper Neuse River (HUC 03030301) Neuse River (HUC 030202)
Neuse Pamlico (HUC 0302) South Atlantic (HUC 03)

One automatic method, the Pfafstetter labeling method by Verdin and Verdin, can be used to hierarchically divide a terrain into arbitrarily small hydrological units, each with a unique label, such that these so-called Pfafstetter labels encode topological properties. Given a (main) river R in a terrain and its basin, that is, the contiguous area that drains into R, the method divides the basin into nine disjoint hydrological units: four basins and five interbasins. The four basins are the four largest basins of the rivers flowing into R; they are labeled 2, 4, 6, and 8 in the order they appear upstream along R. The interbasins are then defined by "breaking" R at the places where the four rivers with largest basins flow into R; they are labeled 1, 3, 5, 7, and 9, in the order they appear upstream along R. To obtain a hierarchy of hydrological units, the method is then applied recursively on each basin and interbasin, and labels of the subdivided regions are appended to the existing label of the original region. Thus, subdivisions further down the hierarchy have longer labels. Refer to Figure 2.

Figure 2:  
A division of a river basin into nine basin/interbasins and recursive subdivision of interbasin 9

Our Approach

We developed an efficient algorithm for computing Pfafstetter labels on grid DEMs. Our algorithm is efficient both when the DEM fits in main memory and when it does not. With the recent progress in remote sensing technology, such as LIDAR, massive DEMs that do not fit in memory are increasingly becoming available. Our algorithm process such a massive DEM I/O-efficiently, using no more than the number of I/Os needed to sort the output labels. The algorithm is based on I/O-efficient tree processing algorithms, as well as an I/O-efficient version of a so-called Cartesian tree.

Our Results

Figure 3: Comparison of Pfafstetter label regions to USGS HUCs in the French Broad–Holston river basin (HUC 060101). Common boundaries are generally in good agreement.

We implemented our I/O-efficient algorithm and performed experiments with real life grid DEMs (preprocessed using our TerraFlow flow modeling software) that showed that it is capable of handling truly massive DEMs. For example, we were able to process a 20 feet resolution DEM of the Neuse river basin in North Carolina, containing approximately 400 million cells, in around 3 hours. Interestingly, our experiments also showed that the Pfafstetter regions of grid DEMs obtained from high resolution LIDAR data corresponds nicely to the USGS HUC regions. Refer to Figure 3.


L. Arge, A. Danner, H. Haverkort and N. Zeh. I/O-efficient Hierarchical Watershed Decomposition of Grid Terrain Models. In Proc. 12th International Symposium on Spatial Data Handling, 2006. pdf


P. Seaber and F. Kapinos and G. Knapp. Hydrologic Unit Maps. USGS, Water Supply Paper 2294, 1987.

K. L. Verdin and J. P. Verdin. A topological system for delineation and codification of the Earth's river basins. Journal of Hydrology, 218, 1-12, 1999.



Elevation Derivatives for National Applications (EDNA)

Watershed Topology - The Pfafstetter System

Topologically Coding Global Drainage Basins

STREAM Projects: Grid Construction | TIN Construction | Noise Removal | Flow | Topographic Change | Watershed