Noise Removal on DEMs using Persistence

Almost all existing flow models proposed in the literature assume that once water flows into a minima, it never flows out. This is of course only true if the minima corresponds to the ocean or some big lake. However, on a grid or TIN constructed from the elevation data of a terrain, there are numerous minima (often called sinks), due to either measurement noises or real small pits on the terrain. As a result, various sink-removal techniques have been proposed to remove all the spurious sinks before computing the actual flows.

Flooding

Figure A:  (a) The terrain. (b) Flood the terrain until a steady-state is reached. (c) Partially flood the terrain.

Currently, the most popular sink-removal method is flooding, which has been used in many commercial and open-source GIS systems such as ArcInfo and GRASS. The idea of flooding is to simulate uniformly pouring water on the terrain until all sinks are filled and a steady-state is reached. Refer to Figure A (a) and (b). In the end, the only remaining sink is the "outside," which corresponds to the ocean.

However, one serious problem with this approach is that it floods all sinks, regardless of its importance. As such, many important geographical features, such as big lakes or river valleys that do not end up in the ocean will vanish after this flooding procedure. To preserve these features, users have to manually mark the important sinks as "real sinks" to distinguish them from the spurious ones. This is labor-consuming, and furthermore it undermines the value of flow computation, one of whose purposes is to determine where the big lakes and rivers are. We adopt the notion of topological persistence, introduced by Edelsbrunner et al., to measure the importance of the sinks on the terrain.

Topological persistence

Imagine that we sweep the terrain using a horizontal plane bottom-up. During the process we maintain the connected components of the portion of the terrain below the sweep plane, and for each component we maintain its minimum-height vertex as its representative. Every time we pass a minimum, a new component is created. When we pass a regular or a maximum vertex, the number of components and all the representatives do not change. When we pass a saddle, one or more components may be merged, and all their representatives but the lowest one gets destroyed, and the lowest representative continues to be the representative of the merged component. The idea behind topological persistence is to assign an importance value to a minimum v equal to the time span that it remains as a representative, i.e., the height difference between v and the saddle that merges the component represented by v into another, "older" component represented by a lower minimum.

We use persistence to measure the importance of sinks on a terrain. For a user-specified threshold τ, we declare all sinks with persistence greater than τ to be the real sinks, while the rest should be removed. The user can change the threshold to control the smallest feature size to be preserved. Thus we get an automated way for flooding that preserves features on the terrain based on their persistence.

Relation to flooding

Besides its many nice properties that root from Morse theory, the topological persistence also has a natural interpretation in the flooding model, which makes it very appropriate for the purpose of sink removal. Flooding computes an raise elevation for each sink, and then raise the sink to that height so that water can flows out to a real sink. It is then attempting to relate the importance of a minimum with the difference between the raise elevation and the original height of the minimum, since that is the amount to be modified. However there is a serious problem. Consider the case shown in Figure A (a). On the high level there are three major sinks, and suppose all of them are real sinks, but the rightmost one corresponds to a big lake with a lot of bumps and pits at the bottom. If we were to use the above definition of importance, then all those tiny pits would be more important than the other two major sinks, which does not make sense. On the other hand, persistence resolves the problem smoothly: The persistence value of a sink is actually how much higher the sink needs to be raised in order to for it to flow into another lower sink, which may not necessarily be the "outside," as opposed to the above naive definition. In the case of the right sink in Figure A, all the tiny pits except the lowest one will get a very small persistence value, while the lowest one gets a large value, which "represents" this big lake. By choosing an appropriate threshold τ, the flooding result will be the one in Figure A (c), as desired.

Implementation and results

 Figure B: Original terrain around a quarry north of Durham, NC. | enlarge Figure C: Terrain flooded with τ = ∞. | enlarge Figure D: Terrain flooded with persistence threshod τ=30. | enlarge

We have developed an I/O-efficient algorithm for computing the topological persistence on a massive terrain, which can be represented by either a grid or a TIN. The persistence algorithm also leads to a much simpler flooding algorithm than the one implemented in the earlier version of TerraFlow. We implemented the new flooding algorithm and performed some experiments with real terrain data.

We applied the flooding algorithm to the Neuse River Basin with a persistence threshold of τ=30. We also ran the algorithm with τ=∞, i.e., flooding all sinks, which gives the same results as the previous flooding algorithm. A portion of the original terrain, and the flooded terrains are shown in Figure B, Figure C, and Figure D. With a threshold of τ=30, around 99.5% of the sinks have been removed, while the major features have been preserved. On the other hand, the previous flooding procedure has eradicated some major features, and generated a few large flat areas, which are undesirable.

Publication

P. K. Agarwal, L. Arge, and K. Yi. I/O-Efficient Batched Union-Find and Its Applications to Terrain Analysis. In 22nd ACM Sympos. Comput. Geom. (SoCG'06). pdf

References

P.-T. Bremer, V. Pascucci, H. Edelsbrunner and B. Hamann. A topological hierarchy for functions on triangulated surfaces. IEEE Trans. Vis. Comput. Graphics 10 (2004), 385-396.

H. Edelsbrunner, J. Harer and A. Zomorodian. Hierarchical Morse-Smale complexes for piecewise linear 2-manifolds. Discrete Comput. Geom. 30 (2003), 87-107.