This chapter will include some of the history of computation and logic, showing the relationship between these two topics, showing the changes to them which have arisen from the advent of mathematical logic and the digital computer, leading to the idea of sound computation.
However it is the forward looking aspects which are the more important. This will include discussion of the step forward from ``sound computation'' to ``crisp AI'', making use of the idea of ``FAn oracle'' a refinement of the analytic oracle.
In the architecture which we later propose, proof and computation are intimately intertwined. Throughout their history these two concepts have frequently been transformed, and have frequently impacted upon each other. Far from stabilizing it seems that the pace of change in our ideas about how these concepts interrelate has accelerated.
The chapter begins historically, but ends in the contemporary context with some technical points intended to open up new possibilities for the relationship.
The history of computation goes back a long way. Our history begins with the idea of proof in mathematics, which we first know of in the beginnings of Greek philosophy and mathematics with the philosopher Thales. Whether the reasoning at this point could properly be called deductive is moot. Two later landmarks which may be thought of as pinnacles in the articulation and application of deductive methods were the logical writings of Aristotle's Organon and the mathematical tour de force of Euclid's Elements. These works became the standard for two millennia in philosophical logic and in deductive rigour and axiomatic method in mathematics. Substantial advances beyond these would not be made until the century.
It is generally said that the ancient Greeks invented mathematics and deduction.
This is probably true of the self conscious use of deduction and the development of mathematics as a theoretical discipline.
However, there are some points worth making about what happened before that. The man in the street, hearing the word ``mathematics'' thinks primarily of arithmetic, and of practical skills of computation. He may have little conception of mathematics as a theoretical discipline.
Mathematics in that sense preceded the Greeks. Notation for numbers, numerical computation and elementary geometrical capabilities probably go back as far as 10,000 years BC. Before classical Greece, the principal sources of expertise in these practical matters were Babylonia and Egypt. There are at this stage methods for computation and for solving certain kinds of algebraic problems, but these methods are presented without any attempt at justification, there is neither a theoretical discipline establishing results by deductive proof.
The ability to perform elementary steps of deductive reasoning is probably coeval with descriptive language, since one cannot be said to understand the language without grasping and being able to apply elementary conceptual inclusions. That is to say, for example, that if someone knows the meaning of ``azure'' then he knows that everything which is azure is also blue. However, the first signs of deduction being used elaborately and systematically, and the idea or deductive proof are found at the beginning of the classical period of ancient Greece at about 600BC.
The Greek civilization is thought to date back to about 2,800 BC, but mathematics as a theoretical discipline originates with the school of the philosopher Thales in Ionia at around 600 BC.
The idea here is to sketch the history of proof, and the history of computation.
Initially these two seem quite separate.
The potential for connection begins with the approach to formality found in Aristotle'sAristotle syllogistic, and with the axiomatic method for mathematics documented in Euclid's elements. Aristotle's work dominated the philosophical study of logic until the century, but his logic was the object of philosophical enquiry rather than a practical tool in deduction. The axiomatic method on the other hand had proved highly effective in Greek geometry. The standards then set were held as an ideal in mathematics, but were not matched in subsequent developments in mathematics, even (or especially) when mathematical innovation and the effectiveness of mathematics as a tool for science and engineering was scaling new heights, until we come to the end of the century.
Leibniz connects Aristotelian logic, by conceiving a scheme for arithmetisation of Aristotelian logic, which he thought would permit the truth of a subject/predicate proposition when arithmetically coded in his universal characteristic. Calculating machines were already in existence, and Leibniz designed such a machine contributing to the development of the technology. Important though these ideas were, they were neither applicable nor influential for hundreds of years.
The next important developments both occur in the century. The first was the design by BabbageBabbage of his analytical engine, the first conception of a universal computational engine. The second was the idea of formal proof first exhibited in Frege's Begriffsschrift[Fre67]. Frege's idea was the first advance in rigour beyond Euclid's axiomatic method, advancing that method by requiring not only that that the axioms, postulates and definitions on which a proof is based are made explicit, but also by requiring that the rules by which conclusions are drawn at each step in a proof from these premises and previous conclusions, be carefully specified in advance. This required that the language in which the proofs are conducted be formally specified.
There is a connection here, not explicitly made, between these two developments. By designing a universal analytic engine Babbage had implicitly defined a notion of computability. A function is computable if it can be calculated by Babbage's analytical engine. By requiring that the rules of inference used in a proof be fully specified Frege was demanding that the correctness of such a formal proof be mechanically checkable, that in principle the correctness of such a proof could be verified by Babbage's analytical engine.
Though we can now see this connection, it was not made explicit for a while.
Subsequent to these two developments the notion of proof and of the axiomatic method were refined as a part of an explosion of work in logic and its application to the derivation of mathematics. A key figure in this was David Hilbert, who began by advancing the axiomatic method, for the first time achieving levels of rigour surpassing those of Euclid, following the initial insights into the nature of proof by Frege. He did this by returning to the axiomatic theory of geometry and in the process found and resolved weaknesses in Euclidean geometry. By this time Frege's the logical system in which Frege had sought to formally derive arithmetic using his new logical methods had been found to be inconsistent (by Bertrand Russell ), precipitating a sense of crisis in the foundations of mathematics. Hilbert responded to this with a program for putting mathematics on a firm foundation, and a major part of this was a new domain of research which he called meta-mathematics.
This new domain of meta-mathematics was the study of the kinds of formal system in which mathematics might be developed, with a view to establishing the consistency of such systems.
Bertrand Russell came up with a new logical system which he called the ``theory of types'' for use in his collaboration with A.N.Whitehead on Principia Mathematica. With this formal logical foundation they carried through the formal development of a significant part of mathematics, starting (after more logical and abstract considerations) with arithmetic. It is notorious that in that monumental work they prove the elementary arithmetic theorem that 1+1=2, do not reach that point until page 362. This emphasises an important difference between computation and proof as it was then understood. It is possible to prove theorems which express the results of arithmetic computations, but the effort and complexity involved in such proofs outstrips by many orders of magnitude the difficulty of simply performing the calculation, which would be accepted in an informal proof which depended upon it without further justification. Formal proofs are so detailed that their complexity compares badly with the informal proofs which were then and still are the norm in mathematics, and in the case of elementary equations with the difficulty in performing arithmetic by the usual methods.
Early in the century radical changes have taken place in the conception of proof and of the role of proof in mathematics.
There were at this time several distinct attitudes towards proof in mathematics which are of interest here.
The simplest is that a formal deductive system consists of a collection of axioms which are simple enough to be self-evidently true, together with some rules of inference which likewise may be thought self-evidently to preserve truth. In consequence all theorems derivable from the axioms using the rules of inference will be true.
The discovery of the inconsistency of Frege's Grundgesetze undermined this position by showing that a system whose soundness in the above sense seemed self-evident, might nevertheless prove inconsistent.
In devising a new logical system intended to avoid the kinds of difficulty found in Frege's, Russell was therefore not able wholeheartedly to claim soundness on these grounds, and instead suggested a partly post-hoc rationale. Russell thought that the success of a logical system in deriving mathematics without proving any contradictions provides evidence that the system is sound.
The developments in mathematics which rendered mathematical analysis rigorous by eliminating the use of infinitesimals lead on to a liberal conception of the notion of function as an arbitrary graph which proved unacceptable to some mathematicians, notably Leopold Kronecker. Kronecker's scepticism about the new mathematics particularly as it is found in Cantor's Set Theory, was an early example of constructivism in mathematics in which mathematical objects are required to be constructible by limited means resulting in an ontology radically less expansive than that of set theory. Associated with this more frugal ontology there is a completely different attitude to proof. The effect is not just that mathematical proofs are expected to use more limited means, but that the subject matters of mathematics are modified because of ontological reservations, and in effect certain domains of enquiry are banished (of which again, set theory as conceived by Cantor is a principal exemplar).
Against this ``impoverishment'' of mathematics the mathematician David Hilbert reacted. He devised a program intended to establish the soundness of the whole of classical mathematics to a standard equivalent to that of constructive mathematics, and introduced the concept of meta-mathematics to that end. Meta-mathematics is the study of the kinds of formal deductive system which can be used in the derivation of mathematics. Hilbert's aim was by developing metamathematical techniques (now know as proof theory) to establish the consistency of the whole of mathematics by means which would be acceptable to constructivists.
We have now three attitudes toward proof.
The first consists in choosing a deductive system carefully, adopting some underlying philosophical rationale (in Russell's case this was the avoidance of ``vicious circles''), choosing axioms and rules guided by the chosen rationale, and then seeing whether the system proves to be adequate for the mathematics and seems to be immune from paradox.
The second rejects the idea of a post hoc justification, is ontologically conservative, and expects a close connection between the demand for a constructible ontology and the kind of proofs which are acceptable.
The third seeks a bridge between the two, with the aim of allowing mathematics the scope which it might assume under the first approach, while achieving the standards of constructive rigour sought in the second approach. This promising idea turned out to be unrealizable, but its failure spawned another, in which relative consistency proofs provide a partial ordering of formal deductive systems providing a measure of strength and of risk.
The next important connection between proof and computation comes from Gödel in connection with Hilbert's program, and is the first of a series of advances relevant to proof and computation which appeared in the 1930s and 40s. Gödel's famous incompleteness theorems were obtained by the technique of arithmetisation of which we saw a precursor in Leibniz's calculus ratiocinator. The result demonstrates a limitation on what can be done with certain kinds of consistent formal deductive system, viz that no such system can prove every statement of arithmetic. The precise definition of the kind of system within the scope of the proof provides formally the connection implicit in the idea of Frege that the notion of proof be computable.
We are now on the verge of an explicit theory of computability. The idea of computability first appearing implicitly in the design of Babbage's analytical engine, now comes under theoretical scrutiny in the new discipline of mathematical logic. Several different formal conceptions of effective process are enunciated (by Church, Kleene, Post, and Turing) and are shown to be equivalent in expressiveness, leading Alonzo Church to put forward Church's thesis that these equivalent formal ideas capture the informal notion of effective calculability.
In the meantime, the technology of computation has moved on. The analytic engine designed by Babbage was a mechanical computer, and its realization was beyond his resources. Computation was by the 1940s in transition from electro-mechanical to electronic technology, universal computers were now within reach.
When they arrived, though predominantly applied to the purposes of businesses or for scientific calculations, they were soon also applied in academia for the construction and checking of formal proofs. The new academic discipline of Computer Science and the rapidly growing computer industry became principal users and developers of formal languages, in which the algorithms to be executed by computers were described. The theory behind computing was also studied, and the techniques of mathematical logic extensively adopted. Mathematical methods were advocated for proving correctness of programs, and even of the ``hardware'' which executed the programs, and the expected complexity of these proofs encouraged the development of software to assist in constructing and checking the proofs.
These developments spawned new kinds of logical languages, new approaches to reasoning in those languages, and new applications for proof. Some of these are significant here because the notion of proof for which our project is to support and exploit is distant from those which predate the computer and computer science.
A principal influence in shaping a more diverse conception of proof is the development of automatic and semi-automatic or interactive theorem provers. Three different kinds of such software systems are of interest.
The first kind is the one closest to the new conception of formal proof which came from Frege. In such a system a proof is a sequence of theorems each of which is either an axiom of the logical system or is derived in a specified way according to an inference rule of the logic from one or more theorems appearing earlier in the list. The end result of the proof is the last sentence in this sequence.
A second approach to theorem proving simply leaves the details to the software, which is not required to construct or verify a detailed formal proof, but instead is written to check reliably for derivability in the relevant formal system and to deliver an informal proof of the result.5.1
The third approach which is of greatest relevance here began again at Edinburgh University as a theorem prover for a Logic for Computable Functions (LCF). This approach to proof later became known as ``the LCF paradigm''. It makes use of a feature of certain kinds of programming language which is called an ``abstract data-type'', whereby a new kind of structure is defined which can only be constructed by the use of a limited set of functions defined for that purpose. The implementation of such a theorem prover begins with the specification of a formal deductive system, and proceeds by the implementation of an abstract data type which corresponds closely to the deductive system. Each of the available constructors in for the abstract data type is required to correctly implement one of the axioms or inference rules of the logic delivering only theorems which could be obtained using the relevant axiom or inference rule.
By this means it is ensured that values which can be computed using this abstract data type must be theorems of the logic, for the structure of the computation yielding the value corresponds precisely to the structure of a proof in the logic.
In this approach to the automation of proof, the proof itself as an object has disappeared, instead of a sequence of theorems of which the result demonstrated is the last, we have a computation which by means guaranteed sound delivers as its result the desired theorem. Here the proof is a computation.
The implementation of the abstract data type provides a logical kernel, which ensures that any value which can be computed of type thm will be a theorem of the logic. This is by itself too primitive to be useful for a user of the theorem prover. More complex software is developed which partly automates the computation of the theorem (which is analogous to and serves as the discovery of a proof). These additional layers of software are then made available to the user through an interactive programming interface through which he can undertake proofs, possibly augmenting the proof capability by further programming of proof discovery methods.
Several further steps along this direction are significant for us here. As with the kind of formal proof found in Principia Mathematica complexity remains an issue. Even with a computer to do the work, the search for a formal proof, or even the checking of the proof can be onerous. For the sake of efficiency it is therefore usual to implement in the logical kernel of a theorem prover following the LCF paradigm a number of derived rules, the use of which significantly improves the computational efficiency of the theorem prover without impairing its soundness.
Taking this a step further we may consider the addition of any inference rule which is known only to perform inferences which could have been done in the primitive logic. Such enhancements in principle improve computational efficiency without alteration to the theorems which are provable in the system.
The difference between computation and proof may be seen as a difference in perception or interpretation.
A computation is an operation on data yielding data, an inference is an operation on propositions yielding a proposition. Propositions are interpreted data. The data transformed by a computation may have multiple possible propositional interpretations, relative to which the computation might be seen to be a sound logical inference.
A computable function becomes a sound inference rule if can be given a suitable specification. This is always possible, though this nominal possibility does not necessarily translate into something useful.
We know from recursion theory that there are unsolvable problems. One of these is the Turing machine halting problem, the problem of determining for any given specification of a Turing machine and its tape whether the Turing machine will halt.
Recursion theory studies degrees of unsolvability using the notion of an Oracle. An Oracle is a hypothetical machine which answers a problem which is not in fact effectively decidable, such as the Turing machine halting problem. This idea can be used to give an upper bound on what one could conceivably achieve by way of problem solving capability in a program running on a digital computer.
The halting problem is equivalent to the problem of deciding whether a sentence is a theorem in the kind of deductive system which is suitable as a formal foundation for mathematics. Automatic proof methods for first order logic can be arranged to deliver with the proof of an existential theorem a term which is a witness for that theorem, which satisfies the body of the existential. If a design problem is expressed as existential proposition in which the body stipulates the conditions required of a solution to the design problem and a solution to the problem is any term which satisfies the body of the existential, then degree of recursive solvability corresponds to a design capability.
Because of their intelligence, people can solve problems which have no feasible algorithmic solution. It doubtful that they do this because intelligence can draw upon resources which go beyond the limits of Turing computability, that possibility and its consequences are outside the scope of this book.
Let us assume that, as Turing himself believed, there is nothing in human intelligence which might not be realized in a Turing machine. That entails, that for any particular definite problem domain, there exists an algorithm which performs as well in that domain as an above average appropriately trained and experienced human being.
The algorithm might not be one which any of us will ever be able to code. It is possible that its complexity, like that of the human brain, might exceed anything which a human beings could design and implement. It might be that such an algorithm could only arise through a process of evolution.
Our project is intended to support the use of evolutionary processes to realize machine intelligence in solving crisp problems. Some details of how that might be realized are touched upon here.
Let us suppose that some implementation or extension of our architecture makes provision for its own evolution over time. Because the notion of proof is assurance sensitive, some assurance may be sought that such changes do not render the system unsound and thus degrade the assurance which we could attach to its results.
It is possible to implement a system which allows for self modification in a qualified way, permitting only modifications which preserve the trustworthiness of the system. This can be done by admitting only modifications which preserve that level of assurance, which they will do if they are conservative.
For any formally precise property of systems, there will be a system which undertakes self modification in ways which are guaranteed to preserve that property.
The implementation of such a system is not difficult, for any proof tool which implements a reflection principle of the kind discussed above will be a self-modifying proof system which preserves the property of infallibility in asserting theorem-hood. No simplistic approach to realizing intelligence in such a way would be likely to succeed.
Roger Bishop Jones 2012-09-23