In practice, even in quite rich foundational systems, let alone in pure set theory, representations of mathematical objects become unwieldy. For example, [bourbaki-sets] estimates that even the number 1, if fully expanded, would occupy thousands of symbols in his system. So some kind of definitional mechanism is needed, at least for human consumption, and possibly to stop overflow even for a machine! The fundamental property of definitions is that it should be possible in principle to do without them, though we need to say precisely what this means.
The simplest kinds of definitions are just shorthands for more complicated
expressions, for example 2 rather than {0,1}, (x,y) rather
than {{x},{x,y}} and
∃!x. P[x] rather than
∃x. P[x] ∧ ∀y. P[y] ⇒ y = x. We can understand such definitions in two ways,
either as metalogical abbreviations used to make formal texts more palatable to
the reader, or as a method of sanctioning the addition (in the object logic) of
new equational axioms. Generally, the latter is more efficient, since it keeps
the underlying terms small, but the former is more flexible: the defined notion
need not correspond to any kind of logical object, nor respect any underlying
type system. In fact the Nuprl system allows both kinds of definition.
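To make the first kind concrete, here is a minimal sketch in Lean (the name ExistsUnique' is ours, for illustration): a defined notion that is nothing but shorthand, eliminable by unfolding.

```lean
-- A definition as pure shorthand: the uniqueness quantifier is just an
-- abbreviation for the longer formula.  (Lean's built-in `∃!` notation
-- unfolds in essentially this way.)
def ExistsUnique' {α : Type} (p : α → Prop) : Prop :=
  ∃ x, p x ∧ ∀ y, p y → y = x

-- Nothing new is derivable: the shorthand expands definitionally.
example {α : Type} (p : α → Prop) :
    ExistsUnique' p ↔ ∃ x, p x ∧ ∀ y, p y → y = x :=
  Iff.rfl
```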
Assuming that the object language has facilities for variable abstraction
(something like lambda-calculus), any definitional constructs of this kind
can be reduced to new abbreviations or axioms of the form c = t, where c
is a new constant name and t any closed term. For example, instead of writing
f(x) = E[x], we use f = λx. E[x]. In any reasonable logical system the
addition of such axioms makes no difference: in principle we could do without
them (replace c by its expansion everywhere), and we do not derive any new
facts not mentioning c. They thus fulfil all the criteria we could demand of
definitions.
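As a Lean sketch of this reduction (square is an illustrative name): a 'definition with arguments' f(x) = E[x] becomes a single equation between the new constant and a closed lambda-term, and expanding it is mere β-reduction.

```lean
-- f(x) = E[x] recast as f = λx. E[x]: one new constant, one closed term.
def square : Nat → Nat := fun x => x * x

-- Replacing the constant by its expansion changes nothing provable.
example (x : Nat) : square x = x * x := rfl
```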
There is a more general means of extending the logic with new constant symbols:
using names for things which have been proved to exist. If we can prove
∃x. P[x], then adding a new constant c and an axiom P[c] is a
conservative extension of most logical systems, in the sense that anything not
involving c could equally well be deduced without using c or its axiom.
(Simply because something like the natural deduction ∃-elimination rule is
usually valid.) It's not quite conservative in the sense that the new constant
can be expanded away, though any formula Q[c] can be read as
(∃x. P[x]) ∧ (∀x. P[x] ⇒ Q[x]), for example. Given a theorem
∀x. ∃!y. P[x,y], it is usually conservative to add a new function symbol f
and an axiom ∀x. P[x,f(x)]. However, if unique existence is
weakened to existence, this can smuggle in the Axiom of Choice. The situation
for constructive logics is even more delicate.
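The following Lean sketch (witness and skolem are our illustrative names) shows both steps: naming a proved existent via classical choice, and the ∀∃-to-function step where the Axiom of Choice slips in once uniqueness is dropped.

```lean
-- From h : ∃ x, P x, a new "constant" together with its axiom P[c].
noncomputable def witness {α : Type} {P : α → Prop} (h : ∃ x, P x) : α :=
  Classical.choose h

theorem witness_spec {α : Type} {P : α → Prop} (h : ∃ x, P x) :
    P (witness h) :=
  Classical.choose_spec h

-- From ∀x. ∃y. P[x,y], a function symbol f with ∀x. P[x,f(x)]; with mere
-- existence (not unique existence) this is exactly a use of Choice.
noncomputable def skolem {α β : Type} {P : α → β → Prop}
    (h : ∀ x, ∃ y, P x y) : α → β :=
  fun x => Classical.choose (h x)

theorem skolem_spec {α β : Type} {P : α → β → Prop}
    (h : ∀ x, ∃ y, P x y) : ∀ x, P x (skolem h x) :=
  fun x => Classical.choose_spec (h x)
```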
A useful feature in a practical logic is a descriptor. For example, we
may allow a new term constructor ιx. P[x] ('the (unique) x such that
P[x]'), together with an axiom (∃!x. P[x]) ⇒ P[ιx. P[x]], or
εx. P[x] ('some x such that P[x]'), together with an
axiom (∃x. P[x]) ⇒ P[εx. P[x]]. These are called definite
and indefinite descriptors, respectively. Note that the latter is something
like the Axiom of Global Choice if we allow P[x] to contain other free
variables, and even the former is weakly nonconstructive because it effectively
allows us to introduce a total function even though ∃x. P[x] may be
conditional on some other property. For this reason descriptors in formal
presentations of constructive logic [troelstra-con1] are usually partial
operators --- this topic is discussed below.
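Here is a hedged Lean sketch of an indefinite descriptor (eps is our name; Mathlib offers a similar Classical.epsilon). Note that it is total: eps p denotes even when nothing satisfies p, which is precisely the weakly nonconstructive feature just mentioned.

```lean
open Classical

-- εx. P[x]: pick a witness if one exists, otherwise an arbitrary element.
noncomputable def eps {α : Type} [na : Nonempty α] (p : α → Prop) : α :=
  if h : ∃ x, p x then Classical.choose h else Classical.choice na

-- The characteristic axiom: (∃x. P[x]) ⇒ P[εx. P[x]].
theorem eps_spec {α : Type} [Nonempty α] {p : α → Prop}
    (h : ∃ x, p x) : p (eps p) := by
  unfold eps
  rw [dif_pos h]
  exact Classical.choose_spec h
```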
A descriptor is very useful for rendering informal notions of 'the x such
that ...' (it's much used by Bourbaki), and can even be used to implement
lambda-binding, set abstraction and so on, e.g. define λx. E[x] to be
εf. ∀x. f(x) = E[x]. If we have ∃x. P[x] and we have
an indefinite descriptor, then defining c = εx. P[x] gives us
P[c]. Conversely, for any term t we have ∃x. x = t, so the
existential kind of definition always sanctions the equational kind. There is
thus a close relationship between the two kinds of definition, given a
descriptor.
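Continuing the Lean sketch with the eps defined above, both directions of this relationship are one-liners:

```lean
-- Naming an existent via ε yields its defining property...
example {α : Type} [Nonempty α] {P : α → Prop} (h : ∃ x, P x) :
    P (eps P) :=
  eps_spec h

-- ...and any term t satisfies ∃x. x = t, so an equational definition
-- c = t is always sanctioned by an existential one.
example {α : Type} (t : α) : ∃ x, x = t :=
  ⟨t, rfl⟩
```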
There are more complex definitional mechanisms which cannot be expanded away in
such a simple-minded manner, but can nevertheless be understood as inessential
abbreviations that could in principle be done without. These are usually
referred to as contextual definitions. A good example is the use of ZF
proper classes, where set-like statements about classes are really encoded
using logical predicates. Another example: cardinal numbers can in principle be
done without for most purposes, since all statements about arithmetic and
equality of cardinals may be regarded as statements about equinumerosity etc.
of sets. For example, instead of |X| = |Y|²|Z| we can write 'there is a
bijection between X and (Y × Y) × Z'. The most sophisticated
example of all is using category-theoretic language like 'the category of all
sets' purely as syntactic sugar on top of quite different statements. One of
the interesting questions in formalized mathematics is the extent to which this
is possible.
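As a sketch of such a contextual reading in Lean (Bijective' and cardEq are illustrative names, not library ones): the cardinal equation is rendered directly as a statement about bijections, with no cardinal objects anywhere.

```lean
-- 'There is a bijection between X and (Y × Y) × Z', spelled out.
def Bijective' {X W : Type} (f : X → W) : Prop :=
  (∀ a b, f a = f b → a = b) ∧ ∀ w, ∃ a, f a = w

-- The contextual reading of |X| = |Y|²|Z|: no cardinals are mentioned.
def cardEq (X Y Z : Type) : Prop :=
  ∃ f : X → (Y × Y) × Z, Bijective' f
```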
In any case, we shouldn't lay too much stress on the theoretical eliminability of definitions. It's mainly a way of assuring ourselves that we aren't 'really' extending the logic. Life without definitions would be unbearably complicated, and anyway the whole significance of a theorem may subsist in the definitions it depends on.
There are other forms of definition, for example definition by recursion and definition of inductive sets or relations. However, given a mathematical theorem justifying the existence of such a function or set, it can then be defined using one of the mechanisms given above. In programmable systems, the tedious matter of deriving the required instance of this theorem can be automated completely.
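For instance, in Lean both forms are surface syntax elaborated away behind the scenes: recursive equations are compiled down to the recursion principle for the type, and an inductive predicate to a least relation closed under its rules (fact and Even' are our illustrative names).

```lean
-- Definition by recursion: elaborated via Nat's recursion principle.
def fact : Nat → Nat
  | 0 => 1
  | n + 1 => (n + 1) * fact n

-- An inductive relation: the least predicate closed under these rules.
inductive Even' : Nat → Prop
  | zero : Even' 0
  | step : ∀ {n : Nat}, Even' n → Even' (n + 2)
```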
John Harrison, 96/8/13