TerraFlow Speedup over other GIS software

We compare TerraFlow with ArcInfo, GRASS, and TARDEM.

We would also like to experiment with TOPAZ and TAPESG at some point. Together these constitute all the software we know of that implements flow modeling functions.

TerraFlow and ArcInfo

ArcInfo provides the grid functions flowdirection and flowaccumulation. Their behavior is very similar to that of TerraFlow: the ArcInfo flowdirection function takes as input an elevation grid and computes an SFD (D8) direction grid and a filled elevation grid. The ArcInfo flowaccumulation function takes as input the flow direction grid and computes a D8 flow accumulation grid.
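
To make the two steps concrete, here is a minimal Python sketch of SFD (D8) flow direction followed by flow accumulation on a tiny grid. It is an illustration only, not the code used by ArcInfo or TerraFlow. The function names, the sample grid, the convention that each cell counts its own unit of flow, and the assumption that the grid has already been filled (no pits or flat areas) are ours.

```python
# Illustrative D8 sketch under the assumptions stated above.
# Offsets (row, col) to the 8 neighbours of a cell.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1),
              (0, -1),           (0, 1),
              (1, -1),  (1, 0),  (1, 1)]


def flow_directions(elev):
    """For each cell, return the (row, col) of its steepest downslope
    neighbour, or None if no neighbour is strictly lower."""
    rows, cols = len(elev), len(elev[0])
    direction = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best_drop = 0.0
            for dr, dc in NEIGHBOURS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    # Diagonal neighbours are sqrt(2) cells away.
                    dist = (dr * dr + dc * dc) ** 0.5
                    drop = (elev[r][c] - elev[nr][nc]) / dist
                    if drop > best_drop:
                        best_drop = drop
                        direction[r][c] = (nr, nc)
    return direction


def flow_accumulation(elev, direction):
    """Each cell starts with one unit of flow and passes its total to the
    cell it drains to. Visiting cells from highest to lowest ensures a
    cell's total is final before it is passed downstream."""
    rows, cols = len(elev), len(elev[0])
    accum = [[1] * cols for _ in range(rows)]
    cells = sorted(((elev[r][c], r, c)
                    for r in range(rows) for c in range(cols)),
                   reverse=True)
    for _, r, c in cells:
        target = direction[r][c]
        if target is not None:
            tr, tc = target
            accum[tr][tc] += accum[r][c]
    return accum


# Hypothetical 3x3 elevation grid sloping towards the lower-right corner.
elev = [[9, 8, 7],
        [8, 6, 4],
        [7, 4, 1]]
direction = flow_directions(elev)
accum = flow_accumulation(elev, direction)
for row in accum:
    print(row)
```

On this grid every cell eventually drains to the lower-right corner, so that cell accumulates all nine units of flow.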

Platform: We installed ArcInfo 7.1.2 on the same type of machine as the one used to test TerraFlow, a Digital Personal Workstation with 500 MHz Alpha processors and 766MB of main memory running FreeBSD 4.0. For each experiment we rebooted the machine with the desired amount of main memory.

Figure: Comparison of the total running time of TerraFlow and ArcInfo's flowdirection and flowaccumulation commands at different main memory sizes (panels: 128MB main memory, 512MB main memory). Data size is in million elements and running time is in hours.

ArcInfo running times

512MB RAM runs:

Dataset             ArcInfo fill   ArcInfo flow   Total
KAWEAH              1.28 min       0.52 min       1:40 min
PUERTO RICO         2.6 min        0.8 min        3:24 min
SIERRA NEVADA       12.6 min       4 min          16:24 min
HAWAII              9.33 min       2.97 min       12:18 min
CUMBERLANDS         1.87 hr        1.1 hr         2:58 hr
LOWER NEW ENGLAND   1.78 hr        0.5 hr         2:16 hr
CAPDEM              5:25 hr        2:54 hr        8:20 hr
USADEM6             8.5 hr         69 hr          77:30 hr
USADEM2             13.5 hr        18.92 hr       32:25 hr

128MB RAM runs:

Dataset             ArcInfo fill   ArcInfo flow   Total
KAWEAH              1.23 min       0.48 min       1:42 min
PUERTO RICO         2.4 min        0.73 min       3:08 min
SIERRA NEVADA       13.63 min      5.45 min       19:05 min
HAWAII              9.22 min       2.9 min        12:07 min
CUMBERLANDS         1.96 hr        1.22 hr        3:10 hr
LOWER NEW ENGLAND   1.85 hr        0.6 hr         2:27 hr
USADEM6             8.63 hr        70 hr          78:38 hr
USADEM2             15.8 hr        20.1 hr        35:54 hr

64MB RAM runs:

Dataset             ArcInfo fill   ArcInfo flow   Total
KAWEAH              1.23 min       0.5 min        1:44 min
PUERTO RICO         2.62 min       0.78 min       3:24 min
SIERRA NEVADA       14 min         8.05 min       22:03 min
HAWAII              9.98 min       3.52 min       13:30 min
CUMBERLANDS         2.05 hr        1.4 hr         3:27 hr
LOWER NEW ENGLAND   1.92 hr        0.65 hr        2:34 hr
USADEM6             8.72 hr        70.25 hr       78:58 hr
USADEM2             16.95 hr       20.38 hr       37:20 hr



TerraFlow and GRASS

GRASS is an open-source GIS released under the GNU General Public License. It provides the function r.watershed, which uses a least-cost search algorithm to determine the flow accumulation. It takes as input an elevation grid and outputs a flow accumulation grid. It does not require a separate direction computation, and is thus the equivalent of TerraFlow's FILL and FLOW steps combined. Prior to running the command we mask out the nodata values with the r.mask command, in order to reduce both the memory needed to run the program and the processing time. The r.watershed function has two versions, ram and seg: ram uses virtual memory managed by the operating system to store all the data structures and is faster than seg; seg uses the GRASS segment library, which manages data in disk files. In our experiments we first run the ram version; if it runs out of memory we run the seg version.

Platform: We installed GRASS on an Intel 500MHz PIII with 1GB of main memory running FreeBSD 4.0 and having a local striped disk array consisting of 36GB 10000 RPM IBM drives. The platform used for GRASS and TARDEM is not the same as the one used for TerraFlow and ArcInfo, but the two are comparable, with GRASS and TARDEM at an advantage.

GRASS takes 12 minutes on the Kaweah dataset and 5 days on Puerto Rico. We let GRASS run for 17 days on the Hawaii dataset, during which time it completed 65% of the task. The estimated running time on Hawaii is thus 24 days, which is 960 times longer than the running time of FILL and FLOW together (38 minutes at 512MB main memory).
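
The kind of arithmetic behind such an estimate can be sketched as follows. This is a rough check under our own assumptions, not the calculation actually used for the figures above: the function names are hypothetical, progress is assumed to be linear in time, and the inputs are simply the rounded numbers quoted in the paragraph.

```python
def estimate_total_days(elapsed_days, fraction_done):
    """Linear extrapolation: if fraction_done of the work took
    elapsed_days, estimate how long the whole job would take."""
    return elapsed_days / fraction_done


def slowdown(slow_days, fast_minutes):
    """Ratio between a running time given in days and one given in minutes."""
    return slow_days * 24 * 60 / fast_minutes


# 65% of the Hawaii run completed in 17 days (figures quoted above).
linear_estimate = estimate_total_days(17, 0.65)

# 24 estimated days for GRASS versus 38 minutes for FILL and FLOW.
ratio = slowdown(24, 38)

print(f"linear estimate: {linear_estimate:.1f} days")
print(f"slowdown factor: {ratio:.0f}x")
```

With these rounded inputs the sketch yields roughly 26 days and a factor of roughly 900, the same order of magnitude as the 24 days and 960 times stated above. The exact inputs behind the stated figures are not given on this page, so the small differences are left as they are.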

Total running time of GRASS at 1GB main memory. Data size is in million elements and running time is in hours.

GRASS running times

Dataset             r.watershed                     Notes
KAWEAH              12:09 min (in memory)
PUERTO RICO         5 days
SIERRA NEVADA
HAWAII              (out of memory)                 CPU 99%, no paging.
                    Fri Nov 10: started
                    Sun Nov 12: done 6%
                    Mon Nov 22: done 39%
                    Mon Nov 27: done 65%
                    killed
                    done 65% in 17 days
                    estimated time: 24 days!
CUMBERLANDS
LOWER NEW ENGLAND
USADEM6
USADEM2


TerraFlow and TARDEM

TARDEM is a suite of programs for the analysis of digital elevation data developed at Utah State University. It provides the functions flood, d8 and aread8. The flood function takes as input an elevation grid, fills it and outputs a filled elevation grid. The d8 function takes as input a filled elevation grid and outputs SFD flow directions. The aread8 function takes as input the flow direction grid produced by the previous function and computes the flow accumulation grid.
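
As an illustration of what a filling step such as flood computes, the sketch below raises every depression to the level of its lowest spill point. It uses a priority-queue flooding approach chosen for brevity, which is not necessarily the algorithm TARDEM implements. The function name, the sample grid, and the assumption that water can leave the grid at any boundary cell are ours.

```python
import heapq


def fill_depressions(elev):
    """Return a copy of elev in which every pit has been raised to the
    lowest elevation at which water could spill out towards the boundary."""
    rows, cols = len(elev), len(elev[0])
    filled = [row[:] for row in elev]
    visited = [[False] * cols for _ in range(rows)]
    heap = []

    # Seed the queue with all boundary cells (assumed to drain off the grid).
    for r in range(rows):
        for c in range(cols):
            if r in (0, rows - 1) or c in (0, cols - 1):
                heapq.heappush(heap, (elev[r][c], r, c))
                visited[r][c] = True

    # Grow inwards from the lowest known cell; a neighbour lower than the
    # current water level is raised to that level.
    while heap:
        level, r, c = heapq.heappop(heap)
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr or dc) and 0 <= nr < rows and 0 <= nc < cols \
                        and not visited[nr][nc]:
                    visited[nr][nc] = True
                    filled[nr][nc] = max(elev[nr][nc], level)
                    heapq.heappush(heap, (filled[nr][nc], nr, nc))
    return filled


# Hypothetical grid: a pit of elevation 1 whose lowest outlet is at 4.
dem = [[5, 5, 5],
       [5, 1, 4],
       [5, 5, 5]]
filled_dem = fill_depressions(dem)
for row in filled_dem:
    print(row)
```

In this example the central pit is raised from 1 to 4, the elevation of its lowest neighbour on the boundary, so that a direction step run afterwards finds no cell without an outlet other than flat areas.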

Platform: For the TARDEM experiments we used a machine identical to the one running GRASS, with 1GB of main memory, running Windows 2000. The platform used for GRASS and TARDEM is not the same as the one used for TerraFlow and ArcInfo, but the two are comparable, with GRASS and TARDEM at an advantage.

TARDEM is competitive for small datasets, but starts breaking down as dataset size increases. It takes less than 3 minutes on the Kaweah dataset, less than 5 minutes on Puerto Rico, 6:20 hours on Sierra Nevada and 39.5 hours on Cumberlands. We tested it on a dataset twice as large as Cumberlands: the flood command completed in 20 days and the d8 command ran for 21 days before we killed it. During this time it was "thrashing", with a CPU utilization under 5% and a 3GB swap file.

Total running time of TARDEM at 1GB main memory. Data size is in million elements and running time is in hours.

Dataset             Flood                          d8 & aread8
KAWEAH              < 1 min                        < 2 min
PUERTO RICO         < 3 min                        < 2 min
SIERRA NEVADA       6.15 hr                        8 min
HAWAII
CUMBERLANDS         37.5 hr                        1.95 hr
LOWER NEW ENGLAND
CAPDEM              started Fri 10/27 4:05p        running!
                    Tue Oct 31: done 19%           started Fri Nov 17th, 9:34a
                    Sun Nov 5th: done 27%          running at 5% CPU, 3GB memory in use
                    Mon Nov 13: done 53%           Thu Nov 30th, running (5% CPU)
                    done Fri Nov 17th, 9:34a       Mon Dec 4th, running (5% CPU)
                    TOTAL: 20:17 days              Tue Dec 12th, running (5% CPU) (26 days)
USADEM6             flood error (started Thu 10/26 2:49p, died 3:09p)
USADEM2             flood error (started Fri 10/27 12:21p, died 1:20p)


<laura@cs.duke.edu>
Last modified: Thu Mar 1 00:32:24 2001 by laura.