Some topics of importance do not fall into categories by application subarea. We consider such topics here.
An important issue is that of human/machine and machine/machine interfaces. One pure case of machine/machine interface is the combined HOL-Nuprl effort. In that case the interface was a significant undertaking, with serious theoretical work needed to accomplish the interface (see comments on the HOL-Nuprl project in Section 3.2.1). Incorporation of one system into another (e.g., the use of Mathematica in Analytica) and even integration of procedures present interface problems. These interface problems are nontrivial in most cases, and real challenges in some cases, as noted above. Support for this aspect of AD research is important. It clearly is money efficiently spent, since it provides for combinations of existing powers, in effect getting the best of two or more worlds.
At least as important as the machine/machine interface issue is the human/machine issue. For the most part the systems discussed in this report are difficult to use, in that considerable training is needed to use them and interpret their output. An exception is the Chou-Gao-Zhang geometry programs. Some programs, such as Otter, that are used outside the AD research community have made an effort to be user-friendly, with some success at a primitive level. What is needed before AD systems really will be used by the outside community are significant interface upgrades, like input and output in the user's language at the level humans communicate to colleagues. Ideally such an interface would incorporate natural language input/output, discussed below, but certainly it includes input parameters determined only by the human description of the problem, not process-determined parameters. Interactive AD systems have dramatic human/machine interface concerns, but all systems must address this issue. Many AD researchers give human/machine interface research high priority in the list of research challenges for the AD field.
Attention has been given to making the output of automated provers more readable. The normal forms and inference rules used by most provers result in proofs very difficult for humans to understand. One early attempt to address this problem was undertaken by Andrews, who defined a procedure, later improved by Miller, for converting the logical structures of his proof system to a natural deduction proof style [3] [75]. More recently, this has been pushed one step further with investigations into translating proofs to natural language. Researchers in Germany have taken the lead here; we mention two major research efforts under way in Germany. One system, the ILF system, operates a mail server that accepts proofs directly from SETHEO, SPASS, and certain other first-order logic theorem provers. The ILF system prepares a LATEX file with a natural language presentation of the proof [32]. (More generally, the ILF system seeks to provide the nonexpert with a shell through which the user can use one of several theorem provers and related systems such as model checkers [31]). The second system, PROVERB, first takes a resolution proof as input and translates it into a natural deduction proof. After several levels of restructuring and abstraction, the latter stages based on state-of-the-art natural language generation techniques, a proof with English text is produced [51]. Some of the output is strikingly close to that appearing in a textbook. Because mathematics often can be presented with limited natural language support, the presentation of proofs in human readable form is an excellent limited domain for natural language generation research. (Natural language understanding and generation has made significant progress in recent years, but the complexity of the general task means that restricted domains are chosen for development and display of natural language capabilities.) The First International Workshop on Proof Transformation and Presentation was held this spring in Germany to bring together researchers in the different disciplines that impinge on this problem [1]. The AD field can contribute to this task significantly because it is central to one of two (perhaps complementary) approaches to this problem. One is better natural language capabilities, and the other is better logics and associated automated inference systems. More readable logics directly reduce the difficulty of translation.
Having completed our discussion of computer interface issues, we return to issues of the deductive structure of AD systems. A research area of importance yet to be addressed is proof planning. Although this area is difficult, it is indeed very important and warrants more attention than it has received. A few groups in the United States and Europe are investigating planning. Techniques under exploration include working with user-supplied diagrams [6] [7] and use of proof by analogy [74] [83]. Other techniques, such as learning, are being studied in the artificial intelligence (AI) community and should be applicable at some point. (For one attempt to apply learning to automated theorem proving, see [37].) Any success in proof planning will have a strong effect on the capabilities of provers, but the difficulty of the problem has discouraged many from undertaking serious work in this area. When good work is located, it should be strongly supported. (Examples are work of Barker-Plummer and Bailin in the U.S. on the use of diagrams in theorem proving noted above and Bundy's Edinburgh group in Europe addressing proof planning in various forms, some noted earlier; for another example see [19].) Relevant AI work is considered below.
We have discussed many facets of higher-order logic theorem provers within the various areas, particularly the verification area. However, the discussion usually has been applicable to all interactive provers, of which the higher-order logic provers are a subclass. The higher-order logic systems have special concerns that warrant separate attention. For example, the representation problem is greatly enhanced because of the much more flexible substitution capability. If unification (the heart of the resolution scheme at the first-order logic level) is used, then to representation problems are added computation problems as the full unification procedure is generally unbounded. The central focus is to make these provers more automated while dealing with complexities such as those just mentioned. The representation power of higher-order logic often allows much shorter proofs relative to first-order logic, with the potential for much smaller search spaces, but whether this will be realized in fully automated systems is not clear at present. There is certainly evidence of the power of higher-order logic in interactive mode with an appropriately trained person. (We remind the reader that the expressive powers are the same; first-order logic can express anything expressible in higher-order logic by use of set theory.) That a number of higher-order logic systems are under development is an indication of the potential many see for such systems. Verification, the most immediate commercial application, need not be done by using higher-order logic, as Nqthm and ACL2 demonstrate. But the success of PVS and other systems shows that higher-order logic can be well used in this domain. The potential of this approach justifies continued interest and support here.
In a discussion of the development of AD systems, it is important to understand a characteristic of development that can be called deduction engineering. It denotes the often nonglamorous, time-consuming process of adjusting a functioning deductive system for improved performance. This can involve anything from a new decision procedure to a new heuristic for finding an induction variable to a new data structure that speeds retrieval of terms used in rewriting or subsumption. This includes the important process of strategy formulation, itself having a range of outcomes, from publishable restrictions of considerable sophistication to the simple delaying of use of clauses that have many free variables. That this evolution-oriented difficult engineering process is key to much of the progress in the AD field has policy implications that need addressing. A major implication is that development is more resource consuming, in time and energy, than is expected, and sufficient support is needed to allow systems to progress. Another implication is that if most progress is found now by augmenting systems rather than implementing new basic procedures, then the systems grow in complexity. As such, they take more resources to sustain, so fewer systems are sustainable. Perhaps this should be anticipated by giving funding priority to more established centers, and isolated researchers would do better to cooperate with existing centers, much as physicists link to locations housing accelerators. This is a complex question, regarding the validity of both the reasoning and the resource allocation. We consider this point in the next section.
We illustrate the nature of deduction engineering by anecdote, to impart some feeling for the nature of development. One anecdote highlights the need to augment proof procedures by deduction strategies that restrict proof search size. We consider what may be the first instance where a restriction strategy was superimposed on a complete proof calculus, the resolution calculus as it happens. The resolution strategy with the subsumption test to purge redundant information was a powerful inference system; it appeared to Larry Wos and his colleagues at Argonne to far surpass the alternatives of the day (1964). However, the new resolution prover failed to prove a simple group theory problem regarding exponent 2 groups. Wos noted that many formulas (clauses) seemed to be general expressions pertaining to group theory and conjectured that more focus on the theorem under consideration was needed. He devised a strategy that separated clauses into two classes, one class to contain clauses that had used special theorem hypotheses or the conclusion's denial in their derivation. That class, the supported set, was given strong preference by requiring its use in each binary inference made. The effect was to obtain the theorem in 3 CPU seconds, whereas without the strategy, memory was exhausted and no proof found. The set of support strategy is one of the most important strategies currently used by Otter [107] [106].
Another story regarding deduction engineering addresses the difficulty often encountered in implementing what seems promising on paper. The example concerns the adoption of an improved version of the modification method [15] (a 1970s algorithm in base form) by SETHEO to ``build in'' equality, defining E-SETHEO. When first installed, the modification method performed far worse than direct use of the equality axioms. Only by eventually revising the very basic linear architecture of the underlying model elimination procedure [64] was the new feature made to yield performance superior to the naive approach. Given E-SETHEO's win in the competition reported earlier, the effort was justified. Sometimes progress is painful and accomplished only when sufficient resources are present.
The same pragmatic, resource-consuming energy goes into improving interactive provers, indeed into any deduction system that is performing at the state-of-the-art today. Successful systems are composites and will become more so. Resources must exist to let the systems grow in capability, for they usually grow more slowly than we would like.
The interaction between automated deduction and AI deserves comment because AD studies began together with AI forty years ago. Many of the topics covered here, such as program synthesis and proof discovery, are major AI topics. Outside these areas, AI has undergone a paradigm shift in methodology in recent years, to where stochastic and decision theoretic methods are predominant in areas such as vision, natural language understanding, expert systems, learning, and even certain types of problem solving. Of course, not all AI research disavows formal methods and use of AD tools. The study of formalizing contexts so that expert systems can know the bounds of their expertise, for example, is a current domain for formal methods.
AI planning is another area using formal methods. One of the major approaches to planning in AI is the explicit use of logic. Planning involves modeling and manipulating a dynamic world, and AI researchers have developed special logics to address action and change. The Situational Calculus introduced by McCarthy [67] [68] has recently been substantially enhanced in expressibility, and real-world applications are being pursued. For example, the Cognitive Robotics Group at the University of Toronto has developed the GOLOG programming language for tasks like controlling robots and software agents [63]. The need for AD may be substantial in this robot domain as the problem of incomplete knowledge is addressed, for example, in the setting of multiagent interaction. The other major approach to planning in the AI community is procedural, mostly using the STRIPS methodology, akin to databases with explicit updates. This approach blossomed in the 1970s when the theorem-proving approach seemed not to scale up. Recently, there has been a return to use of (propositional) logic with the discovery of fast stochastic local search methods for determining satisfiability and methods to encode planning as a satisfaction problem [57]. Although the local search methods usually outperform the systematic AD methods such as the Davis-Putnam procedure on the satisfiability problems that produce plans, the systematic procedures are useful in a complementary role. For example, to show that simpler plans do not exist generally requires more constrained presentations to be shown unsatisfiable, something the local search satisfiability programs cannot do [57]. As an aside we note that a recent paper shows how to extend recent satisfiability methods with the ``lifting'' technique central to automated deduction to reduce the size (complexity) of the problem, at least asymptotically [56]. That AD methodology and tools are reappearing in the planning realm is significant, and perhaps contains a lesson. The use of automated deduction that seemed so natural in AI in the early years of AI but never ``scaled up'' may indeed return to play a role in a number of AI domains. Again, it is simply that the problems, and thus the technology to solve the problems, are much more difficult than we understood.
Another area associated with AI is learning. In machine learning the investigation into inductive logic programming is having considerable success, primarily in Europe. The learning model has been applied to drug design, finite element analysis, natural language understanding, and other areas [17]. The technique calls for generalizations from positive examples so as not to intersect negative examples. The determination of logical consequence is essential here, and more powerful inference techniques, such as better methods for controlling search, could enable larger problems to be undertaken.
In the other direction, there is interest in using AI techniques to help AD. Experiments in use of neural nets to learn parameter control in theorem provers have experienced modest success. A track of the FLAIRS 1997 conference (Daytona Beach, Florida, May 11-14) addresses AI methods for controlling search in AD systems. Real progress here can have a powerful effect on theorem prover capability.