Full text (gzip-compressed postscript)
In this paper we introduce input/output (I/O) overhead
as
a complexity measure for VLSI implementations of two-dimensional
lattice computations of the type arising in the simulation of
physical systems. We show by pebbling arguments that
when there are n2 processing elements
available. If the results are required to be observed at every
generation, and no on-chip storage is allowed, we show the lower bound
is the constant 2. We then examine four VLSI architectures and show
that one of them, the multi-generation sweep architecture, also has
I/O overhead proportional to n-1. We compare the constants of
proportionality between the lower bound and the architecture.
Finally, we prove a closed-form for the discrete minimization equation
giving the optimal number of generations to compute for the
multi-generation sweep architecture.