CPS 296: Design and Analysis of Algorithms with Applications in GIS

Geographic Information Systems


Simply put, a Geographic Information System (GIS) is a database of spatially referenced data such as streets, rivers, mountains, houses, soil layers, pollution levels, and voting districts: everything that has a location. Such systems can be used to produce maps, but also to analyze data. Do you want to know the different routes to Chapel Hill? Where flooding is likely to occur during a hurricane? How to re-zone the political districts of NC so that your favorite candidate gets elected? A GIS should supply the tools to obtain the answers.

GIS systems are used in a growing number of application domains. With this widespread use and with continuously growing databases, efficiency has emerged as the major challenge in today's GIS.
Most GIS systems store map data, at some level, as a number of layers. Each layer is a thematic map, that is, it stores only one type of information. Examples are a layer storing all roads, a layer storing all cities, and so on. The theme of a layer can also be more abstract, for example population density or land utilization (farmland, forest, residential). Even though the information stored in different layers can vary widely, it is typically stored as geometric information such as line segments or points. A layer for a road map typically stores the roads as line segments, a layer for cities typically contains points labeled with city names, and a layer for land utilization could store a subdivision of the map into regions, each labeled with its use.
One of the most fundamental operations in many GIS systems is map overlaying: the computation of new scenes or maps from a number of existing maps. Some existing software packages are based entirely on this operation. Given two thematic maps, the problem is to compute a new map in which the thematic attributes of each location are a function of the thematic attributes of the corresponding locations in the two input maps. For example, the input maps could be a map of land utilization and a map of pollution levels. The map overlay operation could then be used to produce a new map of agricultural land where the degree of pollution is above a certain level. One of the main problems in overlaying maps stored as line segments is "line-breaking": the problem of computing the intersections between the line segments making up the maps. This problem can be abstracted as red/blue line segment intersection, a well-known problem in computational geometry. In this problem one is given a set of non-intersecting red segments and a set of non-intersecting blue segments, and should compute all intersecting red-blue segment pairs.
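
To make the red/blue problem statement concrete, here is a minimal Python sketch. It is a brute-force check of every red-blue pair, not one of the efficient algorithms studied in computational geometry, and it only illustrates the input and output of the problem. The function names and the representation of a segment as a pair of (x, y) tuples are assumptions made for this illustration.

```python
from itertools import product

def orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p): 1 left turn, -1 right turn, 0 collinear."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)

def on_segment(p, q, r):
    """Given that r is collinear with p and q, report whether r lies on segment pq."""
    return (min(p[0], q[0]) <= r[0] <= max(p[0], q[0])
            and min(p[1], q[1]) <= r[1] <= max(p[1], q[1]))

def segments_intersect(s, t):
    """Report whether closed segments s and t share at least one point."""
    (a, b), (c, d) = s, t
    o1, o2 = orient(a, b, c), orient(a, b, d)
    o3, o4 = orient(c, d, a), orient(c, d, b)
    if o1 != o2 and o3 != o4:
        return True  # the endpoints of each segment straddle the other segment
    # Remaining cases: an endpoint lies exactly on the other segment.
    return ((o1 == 0 and on_segment(a, b, c)) or (o2 == 0 and on_segment(a, b, d))
            or (o3 == 0 and on_segment(c, d, a)) or (o4 == 0 and on_segment(c, d, b)))

def red_blue_intersections(red, blue):
    """Return index pairs (i, j) such that red[i] intersects blue[j].

    Brute force: tests all len(red) * len(blue) pairs.
    """
    return [(i, j)
            for (i, r), (j, b) in product(enumerate(red), enumerate(blue))
            if segments_intersect(r, b)]

# Hypothetical data: two horizontal "red" roads and two vertical "blue" boundaries.
red = [((0, 0), (4, 0)), ((0, 2), (4, 2))]
blue = [((1, -1), (1, 3)), ((5, -1), (5, 3))]
print(red_blue_intersections(red, blue))  # [(0, 0), (1, 0)]
```

The brute-force approach examines every pair and so takes time proportional to the product of the two set sizes. It never uses the fact that segments of the same color do not intersect, which is the structure that dedicated red/blue algorithms can exploit.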

In general, many important problems from computational geometry are abstractions of important GIS operations. Examples are range searching, which can be used to find all objects inside a certain region; planar point location, which can be used to locate the region a given city lies in; and region decomposition problems such as trapezoidal decomposition, (Voronoi or Delaunay) triangulation, and convex hull computation. The latter problems are useful for rendering and modeling. Furthermore, GIS systems frequently store and manipulate enormous amounts of data, and therefore the problem of minimizing Input/Output (I/O) communication becomes essential: the time spent on communication between fast internal memory and slower external memory (such as disks), and not the internal computation time, is the bottleneck in many GIS applications. A good example of a large-scale GIS is NASA's Earth Observing System (EOS), which is expected to manipulate petabytes (thousands of terabytes, or millions of gigabytes) of data! In CPS 296 we consider the design and analysis of algorithms with applications in GIS, with special emphasis on I/O-efficient algorithms for large-scale problems.
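
As a rough illustration of why I/O communication dominates, the following Python sketch counts block transfers under a deliberately simplified model. It assumes that data lives on disk in blocks of a fixed number of items and that internal memory holds only one block at a time. The function names, the block size, and the one-block memory are assumptions of this sketch, not part of any particular system.

```python
import math

def scan_ios(n_items, block_size):
    """Block reads needed to scan n_items stored contiguously on disk."""
    return math.ceil(n_items / block_size)

def access_ios(indices, block_size):
    """Block reads for a sequence of item accesses, with room for one block in memory."""
    ios = 0
    cached_block = None
    for i in indices:
        block = i // block_size
        if block != cached_block:  # the needed block is not in memory: one I/O
            ios += 1
            cached_block = block
    return ios

N, B = 1000, 100  # hypothetical sizes: 1000 items, 100 items per block

# Visiting the items in storage order touches each block once.
sequential = list(range(N))

# Visiting the same items in an order that jumps to a different block every step.
scattered = [(i * B) % N + i // (N // B) for i in range(N)]

print(scan_ios(N, B))               # 10
print(access_ios(sequential, B))    # 10
print(access_ios(scattered, B))     # 1000
```

Both orders perform the same amount of internal computation, but in this model the scattered order costs a hundred times as many block transfers as the sequential one.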


Lars Arge
Wed Nov 27, 1996