[AE93] Lars-Erik Andersson and Tommy Elfving. Two constrained Procrustes problems. LiTH-MAT-R-1993-39, Linköping University, Department of Mathematics, December 1993.

[AG92] R. C. Agarwal and F. G. Gustavson. Algorithm and architecture aspects of producing ESSL BLAS on POWER2. In PowerPC and POWER2: Technical Aspects of the New IBM RISC System/6000, pages 167--176. IBM Corporation, 1992.

[AGZ94] R. C. Agarwal, F. G. Gustavson, and M. Zubair. A high performance parallel algorithm for 1-D FFT. In Supercomputing'94, pages 34--40. IEEE Computer Society and ACM, IEEE Computer Society Press, November 1994.

[AS85] Harold Abelson and Gerald Jay Sussman. Structure and Interpretation of Computer Programs. The MIT Press, 1985.

[AS93] D.G. Antzoulatos and A.A. Sawchuk. Hypermatrix algebra: applications in parallel image processing. CVGIP: Image Understanding, 57(1):42--62, January 1993.

[Bar87] R. Barakat. Optical matrix-matrix multiplier based on Kronecker product decomposition. Applied Optics, 26(2):191--192, January 1987.

[BBDS94] D. Bailey, E. Barszcz, L. Dagum, and H. Simon. NAS parallel benchmark results 3-94. Technical Report RNR-94-006, NASA, Ames Research Center, March 1994.

[BCFH92] James M. Boyle, Maurice Clint, Stephen Fitzpatrick, and Terence J. Harmer. The construction of numerical mathematical software for the AMT DAP by program transformation. In L. Bougé, M. Cosnard, Y. Robert, and D. Trystram, editors, CONPAR'92 VAPP V, Lecture Notes in Computer Science 634, pages 761--767. Springer-Verlag, September 1992.

[BCR91] G. Beylkin, R. Coifman, and V. Rokhlin. Fast wavelet transforms and numerical algorithms I. Communications on Pure and Applied Mathematics, XLIV:141--183, 1991.

[Ber80] M. W. Berry. Multiprocessor sparse SVD algorithms and applications. PhD thesis, The University of Illinois at Urbana-Champaign, 1980.

[Ber85] Robert H. Berman. Fourier transform algorithms for spectral analysis derived with MACSYMA. In Richard Pavelle, editor, Applications of computer algebra, pages 210--241. Kluwer Academic Publishers, Boston, 1985.

[BH90] William L. Briggs and Van Emden Henson. The FFT as a multigrid algorithm. SIAM Review, 32(2):252--261, June 1990.

[BHSO87] William L. Briggs, Leslie B. Hart, Roland A. Sweet, and Abbie O'Gallagher. Multiprocessor FFT methods. SIAM Journal on Scientific and Statistical Computing, 8(1):s27--s42, January 1987.

[BP85] C. S. Burrus and T. W. Parks. DFT/FFT and Convolution Algorithms. John Wiley and Sons, 1985.

[Bra91] Bert L. Bradford. Fast Fourier Transforms for Direct Solution of Poisson's Equation. PhD thesis, Department of Mathematics, University of Colorado, 1991.

[Bro92] Lisa G. Brown. A survey of image registration techniques. ACM Computing Surveys, 24(4):325--376, December 1992.

[BS91] David H. Bailey and Paul N. Swarztrauber. The fractional Fourier transform and applications. SIAM Review, 33(3):389--404, September 1991.

[BV94] Dave M. Bond and Stephen A. Vavasis. Fast wavelet transforms for matrices arising from boundary element methods. Technical Report CTC94TR174, ACRI, Cornell University, 1994.

[CF92] B. Cernuschi-Frias. A generalization of the Bookstein constraint to algebraic surfaces. In 1992 IEEE International Conference on Systems, Man and Cybernetics, volume 1, pages 599--604. IEEE, 1992.

[CGM85] P. Concus, G. H. Golub, and G. Meurant. Block preconditioning for the conjugate gradient method. SIAM J Sci Stat Comput, 6(1):220--252, January 1985.

[Cha88] Raymond H. Chan. An optimal circulant preconditioner for Toeplitz systems. SIAM Journal on Scientific and Statistical Computing, 9(?):766--771, 1988.

[CHR94] Nikos Chrisochoides, Elias Houstis, and John Rice. Mapping algorithms and software environments for data parallel PDE iterative solvers. Journal of Parallel and Distributed Computing, 21:75--95, 1994.

[Chu92] Charles K. Chui. An Introduction to Wavelets, volume 1 of Wavelet Analysis and Its Applications. Academic Press, Inc., 1992.

[CJ92] Raymond H. Chan and Xiao-Qing Jin. A family of block preconditioners for block systems. SIAM Journal on Scientific and Statistical Computing, 13(5):1218--1235, September 1992.

[Coo87] James W. Cooley. How the FFT gained acceptance. In ACM Conference on the History of Numeric and Scientific Computing, May 1987.

[CS88] Hung-Yuan Chung and York-Yih Sun. Analysis and parameter estimation of nonlinear systems with Hammerstein model using Taylor series approach. IEEE Transactions on Circuits and Systems, 35(12):1539--1541, December 1988.

[CT65] James W. Cooley and John W. Tukey. An algorithm for the machine calculation of complex Fourier series. Mathematics of Computation, 19:297--301, April 1965.

[Dau92] Ingrid Daubechies. Ten Lectures on Wavelets, volume 61 of Regional Conference Series in Applied Mathematics. SIAM, 1992.

[DES82] Ron S. Dembo, Stanley C. Eisenstat, and Trond Steinhaug. Inexact Newton methods. SIAM J. Numer. Anal., 19(2):400--408, April 1982.

[DGK+94] D.L. Dai, S.K.S. Gupta, S.D. Kaushik, J.H. Lu, R.V. Singh, C.-H. Huang, P. Sadayappan, and R.W. Johnson. Extent: A portable programming environment for designing and implementing high-performance block recursive algorithms. In Supercomputing'94, pages 49--58. IEEE Computer Society and ACM, IEEE Computer Society Press, 1994.

[DLS94] W. De Launey and J. Seberry. The strong Kronecker product. Journal of Combinatorial Theory, Series A, 66(2):192--213, May 1994.

[DR93] A. Dutt and V. Rokhlin. Fast Fourier transforms for nonequispaced data. SIAM Journal on Scientific and Statistical Computing, 14(6):1368--1393, November 1993.

[DRGGP95] L. De Rose, K. Gallivan, E. Gallopoulos, and D. Padua. A MATLAB compiler and restructurer for the development of scientific libraries and applications. Technical Report CSRD 1430, CSRD, University of Illinois at Urbana-Champaign, May 1995.

[Els89] Ann C. Elster. Fast bit-reversal algorithms. In ICASSP'89, pages 1099--1102, 1989.

[EVLPP94] Brent L. Ellerbroek, Charles Van Loan, Nikos P. Pitsianis, and Robert J. Plemmons. Optimizing closed-loop adaptive-optics performance with use of multiple control bandwidths. Journal of the Optical Society of America A, 11(11):2871--2886, November 1994.

[FCK95] S. Fitzpatrick, M. Clint, and P. Kilpatrick. The automated derivation of sparse implementations of numerical algorithms through program transformation. Technical Report 1995, Department of Computer Science, The Queen's University of Belfast, April 1995.

[FF92] Donald W. Fausett and Charles T. Fulton. Large least squares problems involving Kronecker products. SIAM Journal on Matrix Analysis and Applications, ?(?):?, 1992.

[FHB92] Stephen Fitzpatrick, Terence J. Harmer, and James M. Boyle. Deriving efficient parallel implementations of program transformation. In L. Bougé, M. Cosnard, Y. Robert, and D. Trystram, editors, CONPAR'92 VAPP V, Lecture Notes in Computer Science 634, pages 761--767. Springer-Verlag, September 1992.

[Fle90] Roger Fletcher. Practical Methods of Optimization. John Wiley & Sons, 1990.

[GCT91] J. Granata, M. Conner, and R. Tolimieri. A tensor product factorization of the linear convolution matrix. IEEE Transactions on Circuits and Systems, 38(11):1364--1366, November 1991.

[GCT92a] J. Granata, M. Conner, and R. Tolimieri. Recursive fast algorithms and the role of the tensor product. IEEE Transactions on SP, 40(12):2921--2930, December 1992.

[GCT92b] J. Granata, M. Conner, and R. Tolimieri. The tensor product: A mathematical programming language for FFT's and other fast DSP operations. IEEE SP Magazine, pages 40--48, December 1992.

[GJ79] Michael R. Garey and David S. Johnson. Computers and Intractability, A Guide to the Theory of NP-Completeness. W. H. Freeman, 1979.

[GKH+92] S. K. S. Gupta, S. D. Kaushik, C. H. Huang, J. R. Johnson, R. W. Johnson, and P. Sadayappan. A methodology for generating data distributions from tensor product formulas. Technical report, The Ohio State University, 1992.

[GKH+94] S. Gupta, S. Kaushik, C-H Huang, J. Johnson, R. Johnson, and P. Sadayappan. A methodology for generating data distributions from tensor product formulas. Technical report, The Ohio State University, 1994.

[GKHS93] S. Gupta, S. Kaushik, C.-H. Huang, and P. Sadayappan. On compiling array expressions for efficient execution for distributed-memory machines. Technical report, The Ohio State University, 1993.

[GKS92] A. Gupta, V. Kumar, and A. Sameh. Performance and scalability of conjugate gradient methods on parallel computers. preprint, 1992.

[GLO81] Gene H. Golub, Franklin Luk, and Mike Overton. A block Lanczos method for computing the singular values and corresponding singular vectors of a matrix. ACM Transactions on Mathematical Software, 7:149--169, 1981.

[GR89] A. Greenbaum and G. H. Rodrigue. Optimal preconditioners of a given sparsity pattern. BIT, 29:610--634, 1989.

[GS91] Leslie Greengard and John Strain. The fast Gauss transform. SIAM Journal on Scientific and Statistical Computing, 12(1):79--94, January 1991.

[GVL89] Gene H. Golub and Charles F. Van Loan. Matrix Computations. The Johns Hopkins University Press, 1989.

[GW87] Rafael C. Gonzalez and Paul Wintz. Digital image processing. Addison-Wesley, second edition, 1987.

[HC89] David C. Hyland and Emmanuel G. Collins. Block Kronecker products and block norm matrices in large-scale analysis. SIAM Journal on Matrix Analysis and Applications, 10(1):18--29, January 1989.

[Heg95] Markus Hegland. An implementation of multiple and multivariate Fourier transforms on vector processors. SIAM Journal on Scientific Computing, 16(2):271--288, March 1995.

[Hig88] Nicholas J. Higham. The symmetric Procrustes problem. BIT, 28:133--143, 1988.

[HJB85] Michael T. Heideman, Don H. Johnson, and C. Sidney Burrus. Gauss and the history of the fast Fourier transform. Archive for History of Exact Sciences, 34(3):265--277, 1985.

[HJJ90] C-H Huang, J. Johnson, and R. Johnson. A tensor product formulation of Strassen's matrix multiplication algorithm. Applied Mathematics Letters, 3(3):67--71, 1990.

[HJJ92] C-H Huang, J. Johnson, and R. Johnson. Generating parallel programs from tensor product formulas: A case study of Strassen's matrix multiplication algorithm. Technical report, The Ohio State University, 1992.

[HL87] Chua-Huang Huang and Christian Lengauer. The derivation of systolic implementation of programs. Acta Informatica, 24:295--632, 1987.

[Hor86] B. K. P. Horn. Robot Vision. The MIT Press, 1986.

[HPS83] H. V. Henderson, F. Pukelsheim, and S. R. Searle. On the history of the Kronecker product. Linear and Multilinear Algebra, 14:113--120, 1983.

[HR93] L.G. Hassebrook and M. Rahmati. Training set selection with multiple out-of-plane rotation parameters. In Proceedings of the SPIE, volume 1959, pages 32--42. SPIE, April 1993.

[HRW92] Peter N. Heller, Howard L. Resnikoff, and Raymond O. Wells, Jr. Wavelet matrices and the representation of discrete functions. In Charles K. Chui, editor, Wavelets: A Tutorial in Theory and Applications, Wavelet Analysis and Its Applications, chapter I, pages 15--50. Academic Press, Inc., 1992.

[HS81] H. V. Henderson and S. R. Searle. The vec-permutation matrix, the vec operator and Kronecker products, a review. Linear and Multilinear Algebra, 9:271--288, 1981.

[IBM95] IBM Corp. Parallel Engineering and Scientific Subroutine Library Guide and Reference, 1995.

[Jac95] Paul B. Jackson. Enhancing the Nuprl Proof Development System and Applying it to Computational Abstract Algebra. PhD thesis, Department of Computer Science, Cornell University, 1995.

[Jai89] Anil K. Jain. Fundamentals of digital image processing. Information and System Sciences. Prentice-Hall International, 1989.

[JJRT90] J. Johnson, R. Johnson, D. Rodriguez, and R. Tolimieri. A methodology for designing, modifying, and implementing Fourier transform algorithms on various architectures. Circuits Systems Signal Process., 9(4):449--500, 1990.

[JKFM89] S. Lennart Johnsson, Robert L. Krawitz, Roger Frye, and Douglas MacDonald. Cooley-Tukey FFT on the Connection Machine. Technical Report YALEU/DCS/TR-750, Department of Computer Science, Yale University, 1989.

[Kau83] Linda Kaufman. Matrix methods for queuing problems. SIAM Journal on Scientific and Statistical Computing, 4(3):525--552, September 1983.

[KH75] C. D. Kuglin and D. C. Hines. The phase correlation image alignment method. In IEEE ICCS, pages 163--165. IEEE Computer Society, IEEE Computer Society Press, September 1975.

[KHJ+93] S. D. Kaushik, C.-H. Huang, J. R. Johnson, R. W. Johnson, and P. Sadayappan. Efficient transposition algorithms for large matrices. In Supercomputing'93, November 1993.

[KHJS92] S. D. Kaushik, C.-H. Huang, R. W. Johnson, and P. Sadayappan. A methodology for generating efficient disk-based algorithms from tensor product formulas. Technical report, The Ohio State University, 1992.

[KL92] Deepak Kapur and Yagiti N. Lakshman. Elimination methods: an introduction. In B. R. Donald, D. Kapur, and J. L. Mundy, editors, Symbolic and Numerical Computation for Artificial Intelligence, Computational Mathematics and Applications, pages 45--87. Academic Press, 1992.

[KSH+92a] S. Kaushik, S. Sharma, C-H Huang, J. Johnson, R. Johnson, and P. Sadayappan. An algebraic theory for modeling multistage interconnection networks. In International Conference on Parallel and Distributed Systems (ICPDS'92), pages 97--106, December 1992.

[KSH+92b] S. Kaushik, S. Sharma, C-H Huang, J. Johnson, R. Johnson, and P. Sadayappan. An algebraic theory for modeling direct interconnection networks. In Supercomputing'92, pages 488--497, November 1992.

[KSH+92c] S. Kaushik, S. Sharma, C-H Huang, J. Johnson, R. Johnson, and P. Sadayappan. A methodology for generating data distributions from tensor product formulas. Technical report, The Ohio State University, 1992.

[LCCM89] P. Lie Chin Cheong and S. D. Morgera. Iterative methods for restoring noisy images. IEEE Transactions on Acoustics, Speech, and Signal Processing, 37(4):580--585, April 1989.

[LCT93] Chao Lu, James W. Cooley, and Richard Tolimieri. FFT algorithms for prime transform sizes and their implementations on VAX, IBM3090VF, and IBM RS/6000. IEEE Transactions on Signal Processing, 41(2):638--647, February 1993.

[LH74] Charles L. Lawson and Richard J. Hanson. Solving least squares problems. Prentice-Hall, 1974.

[LLCC94] H. J. Lee, J. C. Liu, A. K. Chan, and C. K. Chui. Parallel implementation of wavelet decomposition/reconstruction algorithms. SPIE Wavelet Applications, 2242:248--259, 1994.

[Lu93] Jian Lu. Parallelizing Mallat algorithm for 2-D wavelet transform. Information Processing Letters, 45:255--259, April 1993.

[Mal89] Stephane G. Mallat. A theory for multiresolution signal decomposition: The wavelet representation. IEEE Trans. PAMI, 11(7):674--693, July 1989.

[Mat90] The MathWorks, Inc. Pro-Matlab, User's Guide, 1990.

[MP92] Manavendra Misra and Viktor K. Prasanna. Parallel computation of 2-D wavelet transforms. IEEE ?, ?:111--114, 1992.

[MR96] David K. Maslen and Daniel N. Rockmore. Generalized FFTs - a survey of some recent results. To appear, 1996.

[Nag95a] James G. Nagy. Applications of Toeplitz systems. SIAM News, 28(8):10--11, 1995.

[Nag95b] James G. Nagy. Iterative techniques for the solution of Toeplitz systems. SIAM News, 28(7):8--9, 1995.

[Neu81] Marcel F. Neuts. Matrix-Geometric Solutions in Stochastic Models, An Algorithmic Approach. The Johns Hopkins University Press, 1981.

[Nus82] Henri J. Nussbaumer. Fast Fourier Transform and Convolution Algorithms. Springer-Verlag, 1982.

[OR70] J. M. Ortega and W. C. Rheinboldt. Iterative Solution of Nonlinear Equations. Academic Press, 1970.

[Pit97] Nikos P. Pitsianis. The Kronecker Product in Approximation and Fast Transform Generation. PhD thesis, Department of Computer Science, Cornell University, 1997.

[Pra91] William K. Pratt. Digital Image Processing. John Wiley & Sons Inc., second edition, 1991.

[PS73] V. Pereyra and G. Scherer. Efficient computer manipulation of tensor products with applications to multidimensional approximation. Mathematics of Computation, 27(123):595--605, July 1973.

[QD88] L. Qiu and E. J. Davison. A new method for the stability robustness determination of state space models with real perturbations. In IEEE 27th Conference on Decision and Control, pages 538--543, Austin, Texas, December 1988.

[Rau80] Urho A. Rauhala. Introduction to array algebra. Photogrammetric Engineering and Remote Sensing, 46(2):117--192, February 1980.

[RM89] Phillip A. Regalia and Sanjit K. Mitra. Kronecker products, unitary matrices and signal processing applications. SIAM Review, 31(4):586--613, December 1989.

[SC93] David Sharp and Martin Cripps. Synthesis of the fast Fourier transform algorithm by functional language program transformation. In Euromicro Workshop on Parallel and Distributed Processing, pages 136--143, January 1993.

[SE92] L.J. Sciacca and R.J. Evans. Signal processing applied to ultrasonic imaging. In IEEE Sixth SP Workshop on Statistical Signal and Array Processing Conference Proceedings, pages 225--228. IEEE; Univ. Victoria; Naval Surface Warfare Center, 1992.

[SHFG95] M. Snir, P. Hochschild, D. D. Frye, and K. J. Gildea. The communication software and parallel environment of the IBM SP2. IBM Systems Journal, 34(2):205--221, 1995.

[Ste91] Willi-Hans Steeb. Kronecker Product of Matrices and Applications. Wissenschaftsverlag, 1991.

[Ste95] William J. Stewart. Introduction to the Numerical Solution of Markov Chains. Princeton University Press, 1995.

[Str86] Gilbert Strang. A proposal for Toeplitz matrix calculations. Stud. Appl. Math, 74:171--176, 1986.

[Str89] Gilbert Strang. Wavelets and dilation equations: A brief introduction. SIAM Review, 31(4):614--627, December 1989.

[Str92] Gilbert Strang. The optimal coefficients in Daubechies wavelets. Physica D, 60:239--244, 1992.

[Str93] Gilbert Strang. Wavelet transforms versus Fourier transforms. Bulletin of the AMS, 28(2):288--305, April 1993.

[Swa82] Paul N. Swarztrauber. Vectorizing the FFT's. In G. Rodrigue, editor, Parallel Computations, pages 51--83. Academic Press, New York, 1982.

[TAL89] R. Tolimieri, M. An, and C. Lu. Algorithms for Discrete Fourier Transform and Convolution. Springer-Verlag, 1989.

[Thi91] Thinking Machines Corp. Programming in Fortran and Fortran Reference Manual, 1991.

[Thi93] Thinking Machines Corp. CMSSL for CM Fortran: CM-5 Edition, 1993.

[VL92] Charles F. Van Loan. Computational frameworks for the Fast Fourier Transform. SIAM, 1992.

[VLP92] Charles F. Van Loan and Nikos P. Pitsianis. Approximation with Kronecker products. Technical Report CTC92TR109, Cornell Theory Center, November 1992.

[VLP93] Charles F. Van Loan and Nikos P. Pitsianis. Approximation with Kronecker products. In M. S. Moonen and G. H. Golub, editors, Linear Algebra for Large Scale and Real Time Applications, pages 293--314. Kluwer Publications, 1993.

[War94] J. Ward. Space-time adaptive processing for airborne radar. Technical Report TR1015, Lincoln Labs, MIT, December 1994.

[Wat94] Waterloo Maple Software. Maple V Release 4, 1994.

[Wil88] James Hardy Wilkinson. The Algebraic eigenvalue problem. Oxford University Press, 1988.

[Wol88] Stephen Wolfram. Mathematica, A system for Doing Mathematics by Computer. Addison-Wesley, 1988.

[WP91] H.R. Wu and F.J. Paoloni. A two-dimensional fast cosine transform algorithm based on Hou's approach. IEEE Transactions on Signal Processing, 39(2):544--546, February 1991.

[WW91] Lang Withers, Jr. and John Whelchel. The multidimensional phase-rotation FFT: A new parallel architecture. IEEE ?, ?(?):2889--2892, July 1991.

[WZ93] Raymond O. Wells, Jr. and Xiaodong Zhou. Wavelet interpolation and approximate solutions of elliptic partial differential equations. Technical report, Computational Mathematics Laboratory, Rice University, 1993.

[Zip88] Paul Zipkin. The use of phase-type distributions in inventory-control models. Naval Research Logistics, 35:247--257, 1988.

[Zip93] Richard E. Zippel. The Weyl computer algebra substrate. In Alfonso Miola, editor, Design and Implementation of Symbolic Computation Systems, volume 722 of Lecture Notes in Computer Science, pages 303--318. Springer-Verlag, 1993.

[ZKS94] M.E. Zervakis, Taek Mu Kwon, and A.E. Savakis. Operator decomposition using the wavelet transform: fundamental properties and image restoration applications. In Proceedings ICIP-94, volume 1, pages 56--60. IEEE Signal Process. Soc, November 1994.