Proofs and Computations Helmut Schwichtenberg (Munich) and Stanley S. Wainer (Leeds)


Proofs and Computations

Helmut Schwichtenberg (Munich)

and

Stanley S. Wainer (Leeds)


Preface

This book is about the deep connections between proof theory and recursive function theory. Their interplay has continuously underpinned and motivated the more constructively oriented developments in mathematical logic ever since the pioneering days of Hilbert, Gödel, Church, Turing, Kleene, Ackermann, Gentzen, Péter, Herbrand, Skolem, Malcev, Kolmogorov and others in the 1930s. They were all concerned in one way or another with the links between logic and computability. Gödel's Theorem utilized the logical representability of recursive functions in number theory; Herbrand's Theorem extracted explicit loop-free programs (sets of witnessing terms) from existential proofs in logic; Ackermann and Gentzen analysed the computational content of ε-reduction and cut-elimination in terms of transfinite recursion; Turing not only devised the classical machine model of computation, but (what is less well known) already foresaw the potential of transfinite induction as a method for program verification; and of course the Herbrand-Gödel-Kleene Equation Calculus presented computability as a formal system of equational derivation (with "call by value" being modelled by a substitution rule which itself is a form of "cut", but at the level of terms).

That these two fields – proof and recursion – have developed side by side over the intervening seventy-five years so as to form now a cornerstone in the foundations of computer science, testifies to the power and importance of mathematical logic in transferring what was originally a body of philosophically inspired ideas and results, down to the frontiers of modern information technology. A glance through the contents of any good undergraduate text on the fundamentals of computing should lend conviction to this argument, but we hope also that some of the examples and applications in this book will support it further.

Our book is not about "technology transfer" however, but rather about a classical area of mathematical logic which underlies it, somewhat in the tradition of Kleene's "Introduction to Metamathematics" and Girard's more recent "Proof Theory and Logical Complexity", though we would not presume to compete with those excellent volumes. Rather, we aim to complement them and extend their range of proof-theoretic application, with a coherent, self-contained and up-to-date graduate-level treatment of topics which reflect our own personal interests over many years, and including some which have not previously been covered in text-book form. Thus the theory of proofs, recursions, provably recursive functions, their subrecursive hierarchy classifications, and the computational significance and application of these, will constitute the driving theme. The methods will be those now-classical ones of cut elimination, normalization and their associated ordinal analyses, but restricted to the "small-to-medium-sized" range of mathematically significant proof systems between Elementary Recursive Arithmetic and (restricted) Π¹₁-Comprehension or ID(<ω). Within this range we feel we have something new to contribute in terms of a unified (and we hope conceptually simple) presentational framework. Beyond it, the "outer limits" of ordinal analysis and the emerging connections there with large cardinal theory are presently undergoing rapid and surprising development. Who knows where that will lead? Others are far better equipped to comment.

The fundamental point of proof theory as we see it is Kreisel's dictum: a proof of a theorem conveys more information than the mere statement that it is true (at least it does if we know how to analyse its structure). In a computational context, knowledge of the truth of a "program specification"

∀x∈N ∃y∈N Spec(x, y)

tells us that there is a while-program

y := 0; while ¬Spec(x, y) do y := y + 1; p := y

which satisfies it in the sense that

∀x∈N Spec(x, p(x)).
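This trivial search program can be rendered directly. The following Python sketch is illustrative only; the predicate `spec_sqrt` is a hypothetical stand-in for Spec, not anything fixed by the text:

```python
def p(spec, x):
    """The while-program: y := 0; while not Spec(x, y) do y := y + 1.

    Terminates on input x exactly when some witness y exists.
    """
    y = 0
    while not spec(x, y):
        y += 1
    return y

# A hypothetical specification: y satisfies y * y >= x, so p computes
# the least such y (the ceiling of the square root).
def spec_sqrt(x, y):
    return y * y >= x
```

For instance, `p(spec_sqrt, 10)` returns 4. As the text goes on to stress, the bare truth of the specification tells us nothing about how long this search takes.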

However we know nothing about the complexity of the program without knowing why the specification was true in the first place. What we need is a proof! However, even when we have one it might use lemmas of logical complexity far greater than Σ⁰₁, and this would prevent us from analysing directly the computational structure embedded within it. So what is required is a method of reducing the proof, and the applications of lemmas in it, to a "computational" (Σ⁰₁) form, together with some means of measuring the cost or complexity of that reduction. The method is cut elimination or normalization, and the measurement is achieved by ordinal analysis.

One may wonder why transfinite ordinals enter into the measurement of program complexity. The reason is this: a program, say over the natural numbers, is a syntactic description of a type-2 recursive functional which takes variable "given functions" g to output functions f. By unravelling the intended operation of the program according to the various function calls it makes in the course of evaluation, one constructs a tree of subcomputations, each branch of which is determined by an input number for the function f being computed together with a particular choice of given function g. To say that the program "terminates everywhere" is to say that every branch of the computation tree ends with an output value after finitely many steps. Thus

termination = well-foundedness.

But what is the obvious way to measure the size of an infinite well-founded tree? Of course, by its ordinal height or rank!
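The slogan can be made concrete in a small sketch. Below, well-founded trees branching over the naturals stand for computation trees, and their ranks are Brouwer-style countable ordinals (zero, successor, limits of ω-sequences). The encodings are our own illustrative choices, not the book's formalism:

```python
# A tree is either a leaf (None) or a function from naturals to subtrees.
# Ordinals are encoded as tagged tuples:
#   ("zero",), ("succ", a), ("limit", f) with f a map from naturals to ordinals.

def rank(tree):
    """Ordinal rank of a well-founded tree: a leaf has rank zero; a node's
    rank is the limit of the successors of its subtrees' ranks."""
    if tree is None:
        return ("zero",)
    return ("limit", lambda n: ("succ", rank(tree(n))))

def steps(n, ordinal):
    """Follow the single branch indexed by n, counting successor steps:
    the (finite) length of that one computation path."""
    tag = ordinal[0]
    if tag == "zero":
        return 0
    if tag == "succ":
        return 1 + steps(n, ordinal[1])
    return steps(n, ordinal[1](n))  # limit: descend into the n-th branch
```

For the tree `t = lambda n: None if n == 0 else (lambda _: None)`, branch 0 terminates after 1 step and every other branch after 2; the single ordinal rank bounds all branches at once, which is exactly what a finite measure cannot do for an infinitely branching tree.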

We thus have a natural hierarchy of total recursive functionals in terms of the (recursive) ordinal ranks of their defining programs. Kleene was already aware in 1958 that this hierarchy continues to expand throughout the recursive ordinals, i.e., for each recursive ordinal α there is a total recursive functional which cannot be defined by any program of rank < α. The "subrecursive classification problem" therefore has a perfectly natural and satisfying solution when viewed in the light of type-2 functionals, in stark contrast to the rather disappointing state of affairs in the case of type-1 functions, where "intensionality" and the question "what is a natural well ordering?" are stumbling blocks which have long been a barrier to achieving any useful hierarchy classification of all recursive functions (in one go). Nevertheless there has been good progress in classifying subclasses of the recursive functions which arise naturally in a proof-theoretic context, and the later parts of this book will be much concerned with this.

The point is that, just as in other areas of mathematical logic, ordinals (in our case recursive ordinals) provide a fundamental abstract mathematical scale against which we can measure and compare the logical complexity of inductive proofs and the computational complexity of recursive programs specified by them. The bridge is formed by the fast-, medium- and slow-growing hierarchies of proof-theoretic bounding functions which are quite naturally associated with the ordinals themselves, and which also "model" in a clear way the basic computational paradigms: "functional", "while-loop" and "term-reduction".

Our aim is to bring all these issues together as two sides of the same coin: on one the proof-theoretic aspects of computation, and on the other the computational aspects of proof. We shall try to do this in progressive stages through three distinct parts, keeping in mind that we want the book to be self-contained, orderly and fairly complete in its presentation of our material, and also useful as a reference. Thus we begin with two basic chapters on proof theory and recursion theory, followed by Chapter 3 on Gödel's theorems, providing the fundamental material without which any book with this title would be incomplete. Part 2 deals with the now fairly classical results on hierarchies of provably recursive functions for a spectrum of theories ranging between I∆0(exp) and Π¹₁-CA0. We also bring out connections between fast-growing functions and combinatorial independence results such as the modified finite Ramsey theorem and Kruskal's theorem. Part 3 gives the fundamental theory underlying the first author's proof assistant and program extraction system Minlog¹. The implementation is not discussed here, but the underlying proof-theoretic ideas and the various aspects of constructive logic involved are dealt with in some detail. Thus: the domain of continuous functionals in which higher type computation naturally arises, functional interpretations, and finally implicit complexity, where ideas developed throughout the whole book are brought to bear on certain newer weak systems with more "feasible" provable functions. Every chapter is intended to contain some examples or applications illustrating our intended theme: the link between proof theory, recursion and computation.

Although we have struggled with this book project over many years, we have found the writing of it more and more stimulating as it got closer to fruition. The reason for this has been a divergence of our mathematical "standpoints": while one (S.W.) holds to a more pragmatic middle-of-the-road stance, the other (H.S.) holds a somewhat clearer and committed constructive view of the mathematical world. The difference has led to many happy hours of dispute, and this inevitably may be evident in the choice of topics and their presentations which follow. Despite these differences, both authors believe (to a greater or lesser extent) that it is a rather extreme position to hold that existence is really equivalent to the impossibility of non-existence. Foundational studies – even if classically inspired – should surely investigate these positions to see what relative properties the "strong" (∃) and "weak" (∃̃) existential quantifiers might possess.

¹See http://www.minlog-system.de

Acknowledgement. We would like to thank the many people who have contributed to the book in one way or another. The material in parts 1 and 2 has been used as a basis for graduate lecture courses by both authors, and we gratefully acknowledge the many useful student contributions to both the exposition and the content. Simon Huber – in his diploma thesis (2009) – provided many improvements and/or corrections to part 3. Our special thanks go to Josef Berger and Grigori Mints, who kindly agreed to critically read the manuscript.


Preliminaries

Referencing. References are by chapter, section and subsection: i.j.k refers to subsection k of section j in chapter i. Theorems and the like are referred to by their names or the number of the subsection they appear in. Equations are numbered within a chapter; reference to equation n in section j is in the form "(j.n)".

Mathematical notation. := is used for definitional equality. We write Y ⊆ X for "Y is a subset of X". Application for terms is left associative, and lambda abstraction binds stronger than application. For example, MNK means (MN)K and not M(NK), and λxMN means (λxM)N, not λx(MN).

We also save on parentheses by writing, e.g., Rxyz, Rt0t1t2 instead of R(x, y, z), R(t0, t1, t2), where R is some predicate symbol. Similarly for a unary function symbol with a (typographically) simple argument, so fx for f(x), etc. In this case no confusion will arise. But readability requires that we write in full R(fx, gy, hz), instead of Rfxgyhz.

Binary function and relation symbols are usually written in infix notation, e.g., x + y instead of +(x, y), and x < y instead of <(x, y). We write t ≠ s for ¬(t = s) and t ≮ s for ¬(t < s).

Logical formulas. We use the notation →, ∧, ∨, ⊥, ¬A, ∀xA, ∃xA, where ⊥ means logical falsity and negation is defined by ¬A := A → ⊥. Disjunction ∨ and the existential quantifier ∃ are understood in the strong (or "constructive") sense. Their weak (or "classical") counterparts are denoted by ∨̃ and ∃̃; they are defined by A ∨̃ B := ¬A → (¬B → ⊥) and ∃̃xA := ∀x(A → ⊥) → ⊥. Bounded quantifiers are written like ∀i<n A. In writing formulas we save on parentheses by assuming that ∀, ∃, ¬ bind more strongly than ∧, ∨, and that in turn ∧, ∨ bind more strongly than →, ↔ (where A ↔ B abbreviates (A → B) ∧ (B → A)). Outermost parentheses are also usually dropped. Thus A ∧ ¬B → C is read as ((A ∧ (¬B)) → C). In the case of iterated implications we sometimes use the short notation

A1 → A2 → · · · → An−1 → An for A1 → (A2 → · · · → (An−1 → An) . . . ).


Contents

Preface
Preliminaries

Part 1. Basic Proof Theory and Computability

Chapter 1. Logic
 1.1. Natural Deduction
 1.2. Normalization
 1.3. Soundness and Completeness for Tree Models
 1.4. Soundness and Completeness of the Classical Fragment
 1.5. Tait Calculus
 1.6. Notes

Chapter 2. Recursion Theory
 2.1. Register Machines
 2.2. Elementary Functions
 2.3. The Normal Form Theorem
 2.4. Recursive Definitions
 2.5. Primitive Recursion and For-Loops
 2.6. The Arithmetical Hierarchy
 2.7. The Analytical Hierarchy
 2.8. Recursive Type-2 Functionals and Well-Foundedness
 2.9. Inductive Definitions
 2.10. Notes

Chapter 3. Gödel's Theorems
 3.1. I∆0(exp)
 3.2. Gödel Numbers
 3.3. The Notion of Truth in Formal Theories
 3.4. Undecidability and Incompleteness
 3.5. Representability
 3.6. Unprovability of Consistency
 3.7. Notes

Part 2. Provable Recursion in Classical Systems

Chapter 4. The Provably Recursive Functions of Arithmetic
 4.1. Primitive Recursion and IΣ1
 4.2. ε0-Recursion in Peano Arithmetic
 4.3. Ordinal Bounds for Provable Recursion in PA
 4.4. Independence Results for PA
 4.5. Notes

Chapter 5. Accessible Recursive Functions
 5.1. The Subrecursive Stumblingblock
 5.2. Accessible Recursive Functions
 5.3. Proof Theoretic Characterizations of Accessibility
 5.4. ID<ω and Π¹₁-CA0
 5.5. An Independence Result – Kruskal's Theorem
 5.6. Notes

Part 3. Constructive Logic and Complexity

Chapter 6. Computability in Higher Types
 6.1. Abstract Computability via Information Systems
 6.2. Denotational and Operational Semantics
 6.3. Normalization
 6.4. Computable Functionals
 6.5. Total Functionals
 6.6. Notes

Chapter 7. Extracting Computational Content from Proofs
 7.1. Theory of Computable Functionals
 7.2. Realizability Interpretation
 7.3. Refined A-Translation
 7.4. Gödel's Dialectica Interpretation
 7.5. Optimal Decoration of Proofs
 7.6. Application: Euclid's Theorem
 7.7. Notes

Chapter 8. Linear Two-Sorted Arithmetic
 8.1. Provable Recursion and Complexity in EA(;)
 8.2. A Two-Sorted Variant T(;) of Gödel's T
 8.3. A Linear Two-Sorted Variant LT(;) of Gödel's T
 8.4. Two-Sorted Systems A(;), LA(;)
 8.5. Notes

Bibliography
Index


Part 1

Basic Proof Theory and Computability


CHAPTER 1

Logic

The main subject of Mathematical Logic is mathematical proof. In this introductory chapter we deal with the basics of formalizing such proofs and, via normalization, analysing their structure. The system we pick for the representation of proofs is Gentzen's natural deduction (1934). Our reasons for this choice are twofold. First, as the name says, this is a natural notion of formal proof, which means that the way proofs are represented corresponds very much to the way a careful mathematician writing out all details of an argument would go anyway. Second, formal proofs in natural deduction are closely related (via the so-called Curry-Howard correspondence) to terms in typed lambda calculus. This provides us not only with a compact notation for logical derivations (which otherwise tend to become somewhat unmanageable tree-like structures), but also opens up a route to applying (in part 3) the computational techniques which underpin lambda calculus.

Apart from classical logic we will also deal with more constructive logics: minimal and intuitionistic logic. This will reveal some interesting aspects of proofs, e.g., that it is possible and useful to distinguish between existential proofs that actually construct witnessing objects, and others that don't.

An essential point for Mathematical Logic is to fix a formal language to be used. We take implication → and the universal quantifier ∀ as basic. Then the logic rules correspond precisely to lambda calculus. The additional connectives: the existential quantifier ∃, disjunction ∨ and conjunction ∧, can then be added either as rules or axiom schemes. It is "natural" to treat them as rules, and that is what we do here. However later (in chapter 7) they will appear instead as axioms formalizing particular inductive definitions. In addition to the use of inductive definitions as a unifying concept, another reason for that change of emphasis will be that it fits more readily with the more computational viewpoint adopted there.

We shall not develop sequent-style logics, except for Tait's one-sided sequent calculus for classical logic, it (and the associated cut elimination process) being a most convenient tool for the ordinal analysis of classical theories, as done in part 2. There are many excellent treatments of sequent calculus in the literature and we have little of substance to add. Rather, we concentrate on those logical issues which have interested us. This chapter does not simply introduce basic proof theory, but in addition there is an underlying theme: to bring out the constructive content of logic, particularly in regard to the relationship between minimal and classical logic. For us the latter is most appropriately viewed as a subsystem of the former.


1.1. Natural Deduction

Rules come in pairs: we have an introduction and an elimination rule for each of the logical connectives. The resulting system is called minimal logic; it was introduced by Kolmogorov (1932), Gentzen (1934) and Johansson (1937). Notice that no negation is yet present. If we go on and require ex-falso-quodlibet for the nullary propositional symbol ⊥ ("falsum") we can embed intuitionistic logic with negation as A → ⊥. To embed classical logic, we need to go further and add as an axiom schema the principle of indirect proof, also called stability (∀~x(¬¬R~x → R~x) for relation symbols R), but then it is appropriate to restrict to the language based on →, ∀, ⊥ and ∧. The reason for this restriction is that we can neither prove ¬¬∃xA → ∃xA nor ¬¬(A ∨ B) → A ∨ B, for there are countermodels to both (the former is Markov's scheme). However, we can prove them for the classical existential quantifier and disjunction defined by ¬∀x¬A and ¬A → ¬B → ⊥. Thus we need to make a distinction between two kinds of "exists" and two kinds of "or": the classical ones are "weak" and the non-classical ones "strong" since they have constructive content. In situations where both kinds occur together we must mark the distinction, and we shall do this by writing a tilde above the weak disjunction and existence symbols thus ∨̃, ∃̃. Of course, in a classical context this distinction does not arise and the tilde is not necessary.

1.1.1. Terms and formulas. Let a countably infinite set {vi | i ∈ N} of variables be given; they will be denoted by x, y, z. A first order language L is then determined by its signature, which is to mean the following.

(i) For every natural number n ≥ 0 a (possibly empty) set of n-ary relation symbols (or predicate symbols). 0-ary relation symbols are called propositional symbols. ⊥ (read "falsum") is required as a fixed propositional symbol. The language will not, unless stated otherwise, contain = as a primitive. Binary relation symbols can be marked as infix.

(ii) For every natural number n ≥ 0 a (possibly empty) set of n-ary function symbols. 0-ary function symbols are called constants. Binary function symbols can also be marked as infix.

We assume that all these sets of variables, relation and function symbols are disjoint. L is kept fixed and will only be mentioned when necessary.

Terms are inductively defined as follows.

(i) Every variable is a term.
(ii) Every constant is a term.
(iii) If t1, . . . , tn are terms and f is an n-ary function symbol with n ≥ 1, then f(t1, . . . , tn) is a term. (If r, s are terms and ◦ is a binary function symbol marked as infix, then (r ◦ s) is a term.)

From terms one constructs prime formulas, also called atomic formulas or just atoms: If t1, . . . , tn are terms and R is an n-ary relation symbol, then R(t1, . . . , tn) is a prime formula. (If r, s are terms and ∼ is a binary relation symbol marked as infix, then (r ∼ s) is a prime formula.)

Formulas are inductively defined from prime formulas by

(i) Every prime formula is a formula.

Page 15: Schwichtenberg & Wainer- Proofs and Computations

1.1. NATURAL DEDUCTION 5

(ii) If A and B are formulas, then so are (A → B) ("if A then B"), (A ∧ B) ("A and B") and (A ∨ B) ("A or B").

(iii) If A is a formula and x is a variable, then ∀xA (“A holds for all x”)and ∃xA (“there is an x such that A”) are formulas.

Negation is defined by

¬A := (A → ⊥).

We shall often need to do induction on the height, denoted |A|, of formulas A. This is defined as follows: |P| = 0 for atoms P; |A ◦ B| = max(|A|, |B|) + 1 for the binary operators ◦ (i.e., →, ∧, ∨); and |∀xA| = |∃xA| = |A| + 1 for the quantifiers.
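These inductive clauses translate directly into a recursive datatype. In the Python sketch below (an illustrative encoding of our own, with formulas as tagged tuples) the height function mirrors the definition just given:

```python
# Formulas as tagged tuples:
#   ("atom", name)                 prime formula
#   ("imp", A, B)                  A -> B
#   ("and", A, B), ("or", A, B)    conjunction, disjunction
#   ("all", x, A), ("ex", x, A)    quantifiers

def height(a):
    """Height |A|: 0 for atoms, max of the parts plus 1 otherwise."""
    tag = a[0]
    if tag == "atom":
        return 0
    if tag in ("imp", "and", "or"):
        return max(height(a[1]), height(a[2])) + 1
    return height(a[2]) + 1  # "all" or "ex": (tag, variable, body)
```

For example, the formula (P → Q) → R ∧ ∀xS(x) used in the example of 1.1.3 has height 3.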

1.1.2. Substitution, free and bound variables. Expressions E, E′ which differ only in the names of bound variables will be regarded as identical. This is sometimes expressed by saying that E and E′ are α-equivalent. In other words, we are only interested in expressions "modulo renaming of bound variables". There are methods of finding unique representatives for such expressions, for example the name-free terms of de Bruijn (1972). For the human reader such representations are less convenient, so we shall stick to the use of bound variables.

In the definition of "substitution of expression E′ for variable x in expression E", either one requires that no variable free in E′ becomes bound by a variable-binding operator in E when the free occurrences of x are replaced by E′ (also expressed by saying that there must be no "clashes of variables"), "E′ is free for x in E", or the substitution operation is taken to involve a systematic renaming operation for the bound variables, avoiding clashes. Having stated that we are only interested in expressions modulo renaming of bound variables, we can without loss of generality assume that substitution is always possible.

Also, it is never a real restriction to assume that distinct quantifier occurrences are followed by distinct variables, and that the sets of bound and free variables of a formula are disjoint.

Notation. "FV" is used for the (set of) free variables of an expression; so FV(t) is the set of variables free in the term t, FV(A) the set of variables free in formula A, etc. A formula A is said to be closed if FV(A) = ∅.

E[x := t] denotes the result of substituting the term t for the variable x in the expression E. Similarly, E[~x := ~t] is the result of simultaneously substituting the terms ~t = t1, . . . , tn for the variables ~x = x1, . . . , xn, respectively.

In a given context we shall adopt the following convention. Once a formula has been introduced as A(x), i.e., A with a designated variable x, we write A(t) for A[x := t], and similarly with more variables.

1.1.3. Subformulas. Unless stated otherwise, the notion of subformula will be that defined by Gentzen.

Definition. (Gentzen) subformulas of A are defined by
(a) A is a subformula of A;
(b) if B ◦ C is a subformula of A then so are B, C, for ◦ = →, ∧, ∨;
(c) if ∀xB(x) or ∃xB(x) is a subformula of A, then so is B(t).

Page 16: Schwichtenberg & Wainer- Proofs and Computations

6 1. LOGIC

Definition. The notions of positive, negative, strictly positive subformula are defined in a similar style:
(a) A is a positive and a strictly positive subformula of itself;
(b) if B ∧ C or B ∨ C is a positive (negative, strictly positive) subformula of A, then so are B, C;
(c) if ∀xB(x) or ∃xB(x) is a positive (negative, strictly positive) subformula of A, then so is B(t);
(d) if B → C is a positive (negative) subformula of A, then B is a negative (positive) subformula of A, and C is a positive (negative) subformula of A;
(e) if B → C is a strictly positive subformula of A, then so is C.

A strictly positive subformula of A is also called a strictly positive part (s.p.p.) of A. Note that the set of subformulas of A is the union of the positive and negative subformulas of A.

Example. (P → Q) → R ∧ ∀xS(x) has as s.p.p.'s the whole formula, R ∧ ∀xS(x), R, ∀xS(x), S(t). The positive subformulas are the s.p.p.'s and in addition P; the negative subformulas are P → Q, Q.
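The example can be checked mechanically. The following self-contained Python sketch (formulas as tagged tuples, an encoding of our own; the quantifier clauses simply descend to the body rather than instantiating B(t)) computes the three classes of subformulas:

```python
# Formulas as tagged tuples: ("atom", name), ("imp", A, B),
# ("and", A, B), ("or", A, B), ("all", x, A), ("ex", x, A).

def spp(a):
    """Strictly positive parts: never pass into the premise of an implication."""
    tag = a[0]
    if tag == "atom":
        return [a]
    if tag == "imp":
        return [a] + spp(a[2])
    if tag in ("and", "or"):
        return [a] + spp(a[1]) + spp(a[2])
    return [a] + spp(a[2])  # quantifiers

def pos(a):
    """Subformulas occurring positively in a; implication flips the polarity
    of its premise."""
    tag = a[0]
    if tag == "atom":
        return [a]
    if tag == "imp":
        return [a] + neg(a[1]) + pos(a[2])
    if tag in ("and", "or"):
        return [a] + pos(a[1]) + pos(a[2])
    return [a] + pos(a[2])

def neg(a):
    """Subformulas occurring negatively in a."""
    tag = a[0]
    if tag == "atom":
        return []
    if tag == "imp":
        return pos(a[1]) + neg(a[2])
    if tag in ("and", "or"):
        return neg(a[1]) + neg(a[2])
    return neg(a[2])
```

On the example formula this reproduces the lists above: five strictly positive parts, the positive subformulas being these together with P, and the negative ones being P → Q and Q.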

1.1.4. Examples of derivations.

(A→ B → C)→ (A→ B)→ A→ C.

Informal proof. Assume A → B → C. To show: (A → B) → A → C. So assume A → B. To show: A → C. So finally assume A. To show: C. Using the third assumption twice we have B → C by the first assumption, and B by the second assumption. From B → C and B we then obtain C. Then A → C, cancelling the assumption on A; (A → B) → A → C, cancelling the second assumption; and the result follows by cancelling the first assumption.

∀x(A→ B)→ A→ ∀xB, if x /∈ FV(A).

Informal proof. Assume ∀x(A → B). To show: A → ∀xB. So assume A. To show: ∀xB. Let x be arbitrary; note that we have not made any assumptions on x. To show: B. We have A → B by the first assumption. Hence also B by the second assumption. Hence ∀xB. Hence A → ∀xB, cancelling the second assumption. Hence the result, cancelling the first assumption.

A characteristic feature of these proofs is that assumptions are introduced and eliminated again. At any point in time during the proof the free or "open" assumptions are known, but as the proof progresses, free assumptions may become cancelled or "closed" because of the implies-introduction rule.

We reserve the word proof for the informal level; a formal representation of a proof will be called a derivation.

An intuitive way to communicate derivations is to view them as labelled trees each node of which denotes a rule application. The labels of the inner nodes are the formulas derived as conclusions at those points, and the labels of the leaves are formulas or terms. The labels of the nodes immediately above a node k are the premises of the rule application. At the root of the tree we have the conclusion (or end formula) of the whole derivation. In natural deduction systems one works with assumptions at leaves of the tree; they can be either open or closed (cancelled). Any of these assumptions carries a marker. As markers we use assumption variables denoted u, v, w, u0, u1, . . . . The variables of the language previously introduced will now often be called object variables, to distinguish them from assumption variables. If at a node below an assumption the dependency on this assumption is removed (it becomes closed) we record this by writing down the assumption variable. Since the same assumption may be used more than once (this was the case in the first example above), the assumption marked with u (written u : A) may appear many times. Of course we insist that distinct assumption formulas must have distinct markers. An inner node of the tree is understood as the result of passing from premises to the conclusion of a given rule. The label of the node then contains, in addition to the conclusion, also the name of the rule. In some cases the rule binds or closes or cancels an assumption variable u (and hence removes the dependency of all assumptions u : A thus marked). An application of the ∀-introduction rule similarly binds an object variable x (and hence removes the dependency on x). In both cases the bound assumption or object variable is added to the label of the node.

Definition. A formula A is called derivable (in minimal logic), written ⊢ A, if there is a derivation of A (without free assumptions) using the natural deduction rules. A formula B is called derivable from assumptions A1, . . . , An, if there is a derivation of B with free assumptions among A1, . . . , An. Let Γ be a (finite or infinite) set of formulas. We write Γ ⊢ B if the formula B is derivable from finitely many assumptions A1, . . . , An ∈ Γ.

We now formulate the rules of natural deduction.

1.1.5. Introduction and elimination rules for → and ∀. First we have an assumption rule, allowing us to write down an arbitrary formula A together with a marker u:

u : A assumption.

The other rules of natural deduction split into introduction rules (I-rules for short) and elimination rules (E-rules) for the logical connectives which, for the time being, are just → and ∀. For implication → there is an introduction rule →+ and an elimination rule →−, also called modus ponens. The left premise A → B in →− is called the major (or main) premise, and the right premise A the minor (or side) premise. Note that with an application of the →+-rule all assumptions above it marked with u : A are cancelled (which is denoted by putting square brackets around these assumptions), and the u then gets written alongside. There may of course be other uncancelled assumptions v : A of the same formula A, which may get cancelled at a later stage.

  [u : A]
     |M
     B
  --------- →+ u
   A → B

    |M          |N
  A → B         A
  ----------------- →−
         B

For the universal quantifier ∀ there is an introduction rule ∀+ (again marked, but now with the bound variable x) and an elimination rule ∀− whose right premise is the term r to be substituted. The rule ∀+x with conclusion ∀xA is subject to the following (Eigen-)variable condition: the derivation M of the premise A should not contain any open assumption having x as a free variable.

    |M
    A
  ------- ∀+ x
   ∀xA

     |M
   ∀xA(x)      r
  ---------------- ∀−
       A(r)

We now give derivations of the two example formulas treated informally above. Since in many cases the rule used is determined by the conclusion, we suppress in such cases the name of the rule.

u : A → B → C    w : A        v : A → B    w : A
---------------------         ----------------
        B → C                        B
        ------------------------------
                      C
                  --------- →+ w
                    A → C
            ------------------ →+ v
            (A → B) → A → C
      ------------------------------- →+ u
      (A → B → C) → (A → B) → A → C

u : ∀x(A → B)    x
------------------
      A → B           v : A
      ----------------------
                B
             ------ ∀+ x
               ∀xB
          ----------- →+ v
           A → ∀xB
      --------------------- →+ u
      ∀x(A → B) → A → ∀xB

Note that the variable condition is satisfied: x is not free in A (and also not free in ∀x(A → B)).
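Via the Curry-Howard correspondence mentioned at the start of the chapter, these two derivations are nothing but typed lambda terms, and they can even be run. In the Python sketch below (illustrative only; curried functions stand for the derivations, and the ∀-bound object variable is passed as an ordinary argument):

```python
# Derivation 1: (A -> B -> C) -> (A -> B) -> A -> C,
# as the term  lambda u. lambda v. lambda w. (u w)(v w).
def d1(u):
    return lambda v: (lambda w: u(w)(v(w)))

# Derivation 2: forall x (A -> B) -> A -> forall x B  (x not free in A),
# as the term  lambda u. lambda v. lambda x. (u x) v.
def d2(u):
    return lambda v: (lambda x: u(x)(v))
```

Applying d1 to curried addition and doubling, `d1(lambda w: lambda b: w + b)(lambda w: 2 * w)(3)` evaluates to 9: the term really performs the computation the derivation describes.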

1.1.6. Properties of negation. Recall that negation is defined by ¬A := (A → ⊥). The following can easily be derived.

A → ¬¬A,
¬¬¬A → ¬A.

However, ¬¬A → A is in general not derivable (without stability; we will come back to this later on).
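Under the Curry-Howard reading the two derivable formulas are inhabited lambda terms. Since ⊥ is just a propositional symbol in minimal logic, we may model ¬A as a function from A into an arbitrary result; the Python sketch below is an illustrative rendering of our own:

```python
def dni(a):
    """A -> not not A: the term  lambda a. lambda k. k a."""
    return lambda k: k(a)

def triple_neg(f):
    """not not not A -> not A: the term  lambda f. lambda a. f (lambda k. k a)."""
    return lambda a: f(lambda k: k(a))
```

Instantiating the "falsum" position with a concrete type shows the terms compute, e.g., `dni(5)(lambda n: n + 1)` evaluates to 6; no analogous closed term exists for ¬¬A → A, matching its underivability.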

Lemma. The following are derivable.

(A → B) → ¬B → ¬A,
¬(A → B) → ¬B,
¬¬(A → B) → ¬¬A → ¬¬B,
(⊥ → B) → (¬¬A → ¬¬B) → ¬¬(A → B),
¬¬∀xA → ∀x¬¬A.

Derivations are left as an exercise.


1.1.7. Introduction and elimination rules for disjunction ∨, conjunction ∧ and existence ∃. For disjunction the introduction and elimination rules are

    |M                  |M
    A                   B
  -------- ∨+0       -------- ∨+1
  A ∨ B              A ∨ B

             [u : A]    [v : B]
    |M         |N         |K
  A ∨ B        C          C
  ------------------------------ ∨− u, v
               C

For conjunction we have

   |M     |N
   A      B
  ----------- ∧+
    A ∧ B

             [u : A] [v : B]
    |M             |N
  A ∧ B            C
  --------------------- ∧− u, v
           C

and for the existential quantifier

         |M
  r     A(r)
  ------------- ∃+
    ∃xA(x)

              [u : A]
    |M          |N
   ∃xA          B
  ------------------ ∃− x, u (var. cond.)
         B

The rule ∃− x, u is subject to the following (Eigen-)variable condition: in the derivation N the variable x should not occur free in B nor in any open assumption other than A.

Again, in each of the elimination rules ∨−, ∧− and ∃− the left premiseis called major (or main) premise, and the right premise is called the minor(or side) premise.

It is easy to see that for each of the connectives ∨, ∧, ∃ the rules and thefollowing axioms are equivalent over minimal logic; this is left as an exercise.For disjunction the introduction and elimination axioms are

∨+0 : A→ A ∨B,∨+

1 : B → A ∨B,∨− : A ∨B → (A→ C)→ (B → C)→ C.

For conjunction we have

∧+ : A→ B → A ∧B, ∧− : A ∧B → (A→ B → C)→ C

and for the existential quantifier

∃+ : A→ ∃xA, ∃− : ∃xA→ ∀x(A→ B)→ B (x /∈ FV(B)).

Remark. All these axioms can be seen as special cases of a generalschema, that of an inductively defined predicate, which is defined by someintroduction rules and one elimination rule. Later we will study this kindof definition in full generality.

We collect some easy facts about derivability; B ← A means A→ B.

Lemma. The following are derivable.

  (A ∧ B → C) ↔ (A → B → C),
  (A → B ∧ C) ↔ (A → B) ∧ (A → C),
  (A ∨ B → C) ↔ (A → C) ∧ (B → C),
  (∀xA → B) ← ∃x(A → B)    if x ∉ FV(B),
  (A → ∀xB) ↔ ∀x(A → B)    if x ∉ FV(A),
  (∃xA → B) ↔ ∀x(A → B)    if x ∉ FV(B),
  (A → ∃xB) ← ∃x(A → B)    if x ∉ FV(A).

The proof is left as an exercise.

As already mentioned, we distinguish between two kinds of "exists" and two kinds of "or": the "weak" or classical ones and the "strong" or non-classical ones, with constructive content. In the present context both kinds occur together and hence we must mark the distinction; we shall do this by writing a tilde above the weak disjunction and existence symbols (rendered ∨~ and ∃~ below), thus

  A ∨~ B := ¬A → ¬B → ⊥,    ∃~xA := ¬∀x¬A.

One can show easily that these weak variants of disjunction and the existential quantifier are no stronger than the proper ones (in fact, they are weaker):

A ∨B → A ∨ B, ∃xA→ ∃xA.

This can be seen easily by putting C := ⊥ in ∨− and B := ⊥ in ∃−.
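Read truth-functionally (i.e., classically), the weak and strong variants in fact coincide, while minimally only the displayed directions are derivable. A small brute-force check of the classical coincidence, on truth tables for ∨~ and on all predicates over a finite domain for ∃~ (an illustrative sketch; the names are ours):

```python
from itertools import product

imp = lambda a, b: (not a) or b   # material implication
neg = lambda a: imp(a, False)     # ¬A := A → ⊥

def weak_or(a, b):
    """A ∨~ B := ¬A → ¬B → ⊥, read truth-functionally."""
    return imp(neg(a), imp(neg(b), False))

def weak_ex(pred, dom):
    """∃~x A := ¬∀x¬A over a finite domain."""
    return not all(not pred(x) for x in dom)

# Strong implies weak, and classically the two even coincide.
for a, b in product([False, True], repeat=2):
    assert imp(a or b, weak_or(a, b))
    assert (a or b) == weak_or(a, b)

dom = range(4)
for mask in range(2 ** len(dom)):
    pred = lambda x, m=mask: bool((m >> x) & 1)
    assert any(pred(x) for x in dom) == weak_ex(pred, dom)
```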

1.1.8. Intuitionistic and classical derivability. In the definition of derivability in 1.1.4 falsity ⊥ plays no role. We may change this and require ex-falso-quodlibet axioms, of the form

  ∀~x(⊥ → R~x)

with R a relation symbol distinct from ⊥. Let Efq denote the set of all such axioms. A formula A is called intuitionistically derivable, written ⊢i A, if Efq ⊢ A. We write Γ ⊢i B for Γ ∪ Efq ⊢ B.

We may even go further and require stability axioms, of the form

  ∀~x(¬¬R~x → R~x)

with R again a relation symbol distinct from ⊥. Let Stab denote the set of all these axioms. A formula A is called classically derivable, written ⊢c A, if Stab ⊢ A. We write Γ ⊢c B for Γ ∪ Stab ⊢ B.

It is easy to see that intuitionistically (i.e., from Efq) we can derive ⊥ → A for an arbitrary formula A, using the introduction rules for the connectives. A similar generalization of the stability axioms is only possible for formulas in the language not involving ∨, ∃. However, it is still possible to use the substitutes ∨~ and ∃~.

Theorem (Stability, or principle of indirect proof).

(a) ⊢ (¬¬A → A) → (¬¬B → B) → ¬¬(A ∧ B) → A ∧ B.
(b) ⊢ (¬¬B → B) → ¬¬(A → B) → A → B.
(c) ⊢ (¬¬A → A) → ¬¬∀xA → A.
(d) ⊢c ¬¬A → A for every formula A without ∨, ∃.


Proof. (a) is left as an exercise. (b). For simplicity, in the derivation to be constructed we leave out applications of →+ at the end.

                               u2 : A → B    w : A
                               ------------------- →−
                   u1 : ¬B             B
                   --------------------- →−
                             ⊥
                        ---------- →+u2
  v : ¬¬(A → B)          ¬(A → B)
  ------------------------------- →−
                 ⊥
             -------- →+u1
  u : ¬¬B → B    ¬¬B
  ------------------ →−
          B

(c).

                        u2 : ∀xA    x
                        ------------- ∀−
              u1 : ¬A         A
              ----------------- →−
                       ⊥
                  --------- →+u2
  v : ¬¬∀xA         ¬∀xA
  ---------------------- →−
            ⊥
        -------- →+u1
  u : ¬¬A → A    ¬¬A
  ------------------ →−
          A

(d). Induction on A. The case R~t with R distinct from ⊥ is given by Stab. In the case ⊥ the desired derivation is

                           u : ⊥
                          ------- →+u
  v : (⊥ → ⊥) → ⊥         ⊥ → ⊥
  ------------------------------ →−
                ⊥

In the cases A ∧ B, A → B and ∀xA use (a), (b) and (c), respectively.

Using stability we can prove some well-known facts about the interaction of weak disjunction and the weak existential quantifier with implication. We first prove a more refined claim, stating to what extent we need to go beyond minimal logic.

Lemma. The following are derivable.

(1.1)  (∃~xA → B) → ∀x(A → B)                    if x ∉ FV(B),
(1.2)  (¬¬B → B) → ∀x(A → B) → ∃~xA → B          if x ∉ FV(B),
(1.3)  (⊥ → B[x:=c]) → (A → ∃~xB) → ∃~x(A → B)   if x ∉ FV(A),
(1.4)  ∃~x(A → B) → A → ∃~xB                     if x ∉ FV(A).

The last two items can also be seen as simplifying a weakly existentially quantified implication whose premise does not contain the quantified variable. In case the conclusion does not contain the quantified variable we have

(1.5)  (¬¬B → B) → ∃~x(A → B) → ∀xA → B          if x ∉ FV(B),
(1.6)  ∀x(¬¬A → A) → (∀xA → B) → ∃~x(A → B)      if x ∉ FV(B).

Proof. (1.1)

                u1 : ∀x¬A    x
                -------------- ∀−
                     ¬A           A
                ------------------- →−
                          ⊥
                     --------- →+u1
  ∃~xA → B            ¬∀x¬A
  ------------------------- →−
             B


(1.2)

                              ∀x(A → B)    x
                              -------------- ∀−
                                  A → B        u1 : A
                              ----------------------- →−
                   u2 : ¬B             B
                   --------------------- →−
                             ⊥
                         -------- →+u1
                            ¬A
                         -------- ∀+x
           ¬∀x¬A           ∀x¬A
           -------------------- →−
                     ⊥
                 -------- →+u2
  ¬¬B → B          ¬¬B
  -------------------- →−
           B

(1.3) Writing B0 for B[x:=c], and writing M for the auxiliary derivation

                                      u1 : B
                                      ------- →+
                ∀x¬(A → B)    x        A → B
                --------------- ∀−
                   ¬(A → B)            A → B
                ------------------------------ →−
                              ⊥
                          -------- →+u1
                             ¬B
                          -------- ∀+x
                            ∀x¬B

we have

                      A → ∃~xB    u2 : A
                      ------------------ →−           | M
                            ¬∀x¬B                     ∀x¬B
                      ------------------------------------ →−
  ∀x¬(A → B)    c                     ⊥
  --------------- ∀−     ⊥ → B0       ⊥
    ¬(A → B0)            -------------- →−
                                B0
                           ---------- →+u2
                             A → B0
  ----------------------------------- →−
                   ⊥

(1.4)

                ∀x¬B    x
                --------- ∀−
                    ¬B        u1 : A → B    A
                              --------------- →−
                                     B
                ------------------------------ →−
                              ⊥
                        ----------- →+u1
                         ¬(A → B)
                        ----------- ∀+x
  ∃~x(A → B)            ∀x¬(A → B)
  -------------------------------- →−
                 ⊥

(1.5)

                                       ∀xA    x
                                       -------- ∀−
                      u1 : A → B           A
                      ---------------------- →−
             u2 : ¬B            B
             -------------------- →−
                       ⊥
                  ----------- →+u1
                   ¬(A → B)
                  ----------- ∀+x
  ∃~x(A → B)      ∀x¬(A → B)
  -------------------------- →−
               ⊥
           -------- →+u2
  ¬¬B → B    ¬¬B
  --------------- →−
          B


(1.6) We derive ∀x(⊥ → A) → (∀xA → B) → ∀x¬(A → B) → ¬¬A. Writing Ax, Ay for A(x), A(y), let M be the derivation

                                           u1 : ¬Ax    u2 : Ax
                                           ------------------- →−
                            ∀y(⊥ → Ay)   y          ⊥
                            -------------- ∀−
                               ⊥ → Ay               ⊥
                            --------------------------- →−
                                        Ay
                                      ------- ∀+y
                      ∀xAx → B         ∀yAy
                      ---------------------- →−
                                B
                           ---------- →+u2
  ∀x¬(Ax → B)    x
  ---------------- ∀−
    ¬(Ax → B)               Ax → B
  --------------------------------- →−
                  ⊥
             --------- →+u1
                ¬¬Ax

Using this derivation M we obtain

                                  ∀x(¬¬Ax → Ax)    x
                                  ------------------ ∀−
                                     ¬¬Ax → Ax         | M
                                                       ¬¬Ax
                                  -------------------------- →−
                                              Ax
                                            ------ ∀+x
                           ∀xAx → B          ∀xAx
                           ----------------------- →−
                                      B
  ∀x¬(Ax → B)    x               ----------
  ---------------- ∀−              Ax → B
    ¬(Ax → B)
  --------------------------------- →−
                  ⊥

Since clearly ⊢ (¬¬A → A) → ⊥ → A, the claim follows.

Remark. An immediate consequence of (1.6) is the classical derivability of the "drinker formula" ∃x(Px → ∀xPx), to be read "in every non-empty bar there is a person such that, if this person drinks, then everybody drinks". To see this let A := Px and B := ∀xPx in (1.6).
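The drinker formula is easy to confirm semantically on finite domains: classically, either everybody drinks, in which case any x is a witness, or some x fails to drink and is itself the witness. A brute-force check over all predicates on a small domain (an illustrative sketch; the names are ours):

```python
from itertools import product

def drinker(dom, P):
    """∃x (P(x) → ∀y P(y)) over a finite non-empty domain, read classically."""
    return any((not P(x)) or all(P(y) for y in dom) for x in dom)

dom = range(5)
# Exhaust all 2^5 predicates on the domain; the formula holds for each.
for bits in product([False, True], repeat=len(dom)):
    P = lambda x, b=bits: b[x]
    assert drinker(dom, P)
```

Note that the check exercises only classical truth; the point of the Remark is the stronger fact that the formula is derivable from the stability axioms.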

Corollary.

  ⊢c (∃~xA → B) ↔ ∀x(A → B)     if x ∉ FV(B) and B without ∨, ∃,
  ⊢i (A → ∃~xB) ↔ ∃~x(A → B)    if x ∉ FV(A),
  ⊢c ∃~x(A → B) ↔ (∀xA → B)     if x ∉ FV(B) and A, B without ∨, ∃.

There is a similar lemma on weak disjunction:

Lemma. The following are derivable.

  (A ∨~ B → C) → (A → C) ∧ (B → C),
  (¬¬C → C) → (A → C) → (B → C) → A ∨~ B → C,
  (⊥ → B) → (A → B ∨~ C) → (A → B) ∨~ (A → C),
  (A → B) ∨~ (A → C) → A → B ∨~ C,
  (¬¬C → C) → (A → C) ∨~ (B → C) → A → B → C,
  (⊥ → C) → (A → B → C) → (A → C) ∨~ (B → C).


Proof. The derivation of the final formula is

                                   A → B → C    u1 : A
                                   ------------------- →−
                                        B → C            u2 : B
                                   ----------------------------- →−
                                                C
                                           --------- →+u1
                        ¬(A → C)             A → C
                        --------------------------- →−
                                   ⊥
              ⊥ → C                ⊥
              ---------------------- →−
                         C
                     --------- →+u2
  ¬(B → C)             B → C
  -------------------------- →−
              ⊥

The other derivations are similar to the ones above, if one views ∃~ as an infinitary version of ∨~.

Corollary.

  ⊢c (A ∨~ B → C) ↔ (A → C) ∧ (B → C)      for C without ∨, ∃,
  ⊢i (A → B ∨~ C) ↔ (A → B) ∨~ (A → C),
  ⊢c (A → C) ∨~ (B → C) ↔ (A → B → C)      for C without ∨, ∃.

Remark. It follows that weak disjunction and the weak existential quantifier satisfy the same axioms as the strong variants, if one restricts the conclusion of the elimination axioms to formulas without ∨, ∃:

  ⊢ A → A ∨~ B,    ⊢ B → A ∨~ B,
  ⊢c A ∨~ B → (A → C) → (B → C) → C      for C without ∨, ∃,
  ⊢ A → ∃~xA,
  ⊢c ∃~xA → ∀x(A → B) → B                if x ∉ FV(B) and B is without ∨, ∃.

1.1.9. Gödel-Gentzen translation. Classical derivability Γ ⊢c B was defined in 1.1.8 by Γ ∪ Stab ⊢ B. This embedding of classical logic into minimal logic can be expressed in a somewhat different and very explicit form, namely as a syntactic translation A ↦ Ag of formulas such that A is derivable in classical logic if and only if its translation Ag is derivable in minimal logic.

Definition (Gödel-Gentzen translation Ag).

  Pg := ¬¬P                for prime formulas P ≠ ⊥,
  (A ∨ B)g := Ag ∨~ Bg,
  (∃xA)g := ∃~x Ag,
  (A ∘ B)g := Ag ∘ Bg      for ∘ = →, ∧,
  (∀xA)g := ∀x Ag.
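As a sketch, the translation can be transcribed directly as a function on a small formula syntax. Here the weak connectives ∨~ and ∃~ on the right-hand sides are unfolded into their negated definitions from 1.1.7; the tuple-based representation and all names are ours, not the book's.

```python
# Formulas as tuples: ("bot",), ("prime", name), ("imp"/"and"/"or", A, B),
# ("all"/"ex", var, A).
BOT = ("bot",)

def neg(a):
    return ("imp", a, BOT)

def g(a):
    """Gödel-Gentzen translation, with ∨~ and ∃~ unfolded into negations."""
    tag = a[0]
    if tag == "prime":            # Pg := ¬¬P  (P ≠ ⊥)
        return neg(neg(a))
    if tag == "bot":
        return a
    if tag in ("imp", "and"):     # homomorphic cases
        return (tag, g(a[1]), g(a[2]))
    if tag == "all":              # (∀xA)g := ∀x Ag
        return ("all", a[1], g(a[2]))
    if tag == "or":               # (A ∨ B)g := ¬Ag → ¬Bg → ⊥
        return ("imp", neg(g(a[1])), ("imp", neg(g(a[2])), BOT))
    if tag == "ex":               # (∃xA)g := ¬∀x¬Ag
        return neg(("all", a[1], neg(g(a[2]))))

P, Q = ("prime", "P"), ("prime", "Q")
assert g(("imp", P, Q)) == ("imp", neg(neg(P)), neg(neg(Q)))
assert g(("ex", "x", P)) == neg(("all", "x", neg(neg(neg(P)))))
```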

Lemma. ⊢ ¬¬Ag → Ag.

Proof. Induction on A.

Case R~t. We must show ¬¬¬¬R~t → ¬¬R~t, which is a special case of ⊢ ¬¬¬B → ¬B.


Case A ∨ B. We must show ⊢ ¬¬(Ag ∨~ Bg) → Ag ∨~ Bg, which is a special case of ⊢ ¬¬(¬C → ¬D → ⊥) → ¬C → ¬D → ⊥:

                      u1 : ¬C → ¬D → ⊥    ¬C
                      ----------------------- →−
                             ¬D → ⊥             ¬D
                      ----------------------------- →−
                                    ⊥
                           ----------------- →+u1
  ¬¬(¬C → ¬D → ⊥)          ¬(¬C → ¬D → ⊥)
  ------------------------------------------ →−
                      ⊥

Case ∃xA. In this case we must show ⊢ ¬¬∃~xAg → ∃~xAg, but this is a special case of ⊢ ¬¬¬B → ¬B, because ∃~xAg is the negation ¬∀x¬Ag.

Case A ∧ B. We must show ⊢ ¬¬(Ag ∧ Bg) → Ag ∧ Bg. By induction hypothesis ⊢ ¬¬Ag → Ag and ⊢ ¬¬Bg → Bg. Now use part (a) of the stability lemma in 1.1.8.

The cases A → B and ∀xA are similar, using parts (b) and (c) of the stability lemma instead.

Theorem. (a) Γ ⊢c A implies Γg ⊢ Ag.
(b) Γg ⊢ Ag implies Γ ⊢c A for Γ, A without ∨, ∃.

Proof. (a). We use induction on Γ ⊢c A. For a stability axiom ∀~x(¬¬R~x → R~x) we must derive ∀~x(¬¬¬¬R~x → ¬¬R~x), which is easy (as above). For the rules →+, →−, ∀+, ∀−, ∧+ and ∧− the claim follows immediately from the induction hypothesis, using the same rule again. This works because the Gödel-Gentzen translation acts as a homomorphism for these connectives. For the rules ∨+i, ∨−, ∃+ and ∃− the claim follows from the induction hypothesis and the last lemma in 1.1.8. For example, in case ∃− the induction hypothesis gives

  | M
  ∃~xAg

and

  u : Ag
  | N
  Bg

with x ∉ FV(Bg). Now use ⊢ (¬¬Bg → Bg) → ∃~xAg → ∀x(Ag → Bg) → Bg. Its premise ¬¬Bg → Bg is derivable by the lemma above.

(b). First note that ⊢c (B ↔ Bg) if B is without ∨, ∃. Now assume that Γ, A are without ∨, ∃. From Γg ⊢ Ag we obtain Γ ⊢c A as follows. We argue informally. Assume Γ. Then Γg by the note, hence Ag because of Γg ⊢ Ag, hence A again by the note.

1.2. Normalization

A derivation in normal form does not make "detours", or more precisely, it cannot occur that an elimination rule immediately follows an introduction rule. We will use "conversions" to remove such "local maxima" of complexity, thus reducing any given derivation to normal form. However, there is a difficulty when we consider an elimination rule for ∨, ∧ or ∃. An introduced formula may be used as a minor premise of an application of ∨−, ∧− or ∃−, then stay the same throughout a sequence of applications of these rules, being eliminated at the end. This also constitutes a local maximum, which we should like to eliminate; permutative conversions are designed for exactly this situation. In a permutative conversion we permute an E-rule upwards over the minor premises of ∨−, ∧− or ∃−.


  Derivation                                   Term

  u : A                                        u^A

  [u : A]
  | M
  B
  -------- →+u                                 (λu M^B)^{A→B}
  A → B

  | M          | N
  A → B        A
  ----------------- →−                         (M^{A→B} N^A)^B
         B

  | M
  A
  ------ ∀+x (with var. cond.)                 (λx M^A)^{∀xA} (with var. cond.)
  ∀xA

  | M
  ∀xA(x)    r
  ------------ ∀−                              (M^{∀xA(x)} r)^{A(r)}
     A(r)

  Table 1. Derivation terms for → and ∀

We analyse the shape of derivations in normal form, and then prove the (crucial) subformula property, which says that every formula in a normal derivation is a subformula of the end-formula or else of an assumption.

It will be convenient to represent derivations as typed terms, where the derived formula is seen as the "type" of the term (and displayed as a superscript). This representation is known under the name Curry-Howard correspondence. We give an inductive definition of such derivation terms for the →, ∀-rules in Table 1, where for clarity we have written the corresponding derivations to the left. In Table 2 this is extended to cover the rules for ∨, ∧ and ∃.
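A minimal executable reading of the Curry-Howard correspondence for the →-fragment: derivation terms are λ-terms, and "checking a derivation" amounts to computing the formula (the type) a term proves. This is only an illustrative sketch covering assumption variables, →+ and →−; the data-type and function names are ours.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Imp:            # the formula A → B
    left: object
    right: object

@dataclass(frozen=True)
class Var:            # assumption variable u : A
    name: str

@dataclass(frozen=True)
class Lam:            # →+ : from M proving B under u : A, get λu.M proving A → B
    var: str
    var_type: object
    body: object

@dataclass(frozen=True)
class App:            # →− : from M proving A → B and N proving A, get MN proving B
    fun: object
    arg: object

def formula_of(term, ctx):
    """The formula a derivation term proves, given its open assumptions ctx."""
    if isinstance(term, Var):
        return ctx[term.name]
    if isinstance(term, Lam):
        return Imp(term.var_type,
                   formula_of(term.body, {**ctx, term.var: term.var_type}))
    f = formula_of(term.fun, ctx)
    assert isinstance(f, Imp) and formula_of(term.arg, ctx) == f.left
    return f.right

A, B, C = "A", "B", "C"
# λu λv λw. u w (v w)  proves  (A → B → C) → (A → B) → A → C
M = Lam("u", Imp(A, Imp(B, C)),
     Lam("v", Imp(A, B),
      Lam("w", A,
       App(App(Var("u"), Var("w")), App(Var("v"), Var("w"))))))
assert formula_of(M, {}) == Imp(Imp(A, Imp(B, C)), Imp(Imp(A, B), Imp(A, C)))
```

The final assertion reconstructs the first example derivation of 1.1.5 as a term and recomputes its end formula.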

1.2.1. Conversions. A conversion eliminates a detour in a derivation, i.e., an elimination immediately following an introduction. We now spell out in detail which conversions we shall allow. This is done for derivations written in tree notation and also as derivation terms.


  Derivation                                   Term

  | M                  | M
  A                    B
  ------ ∨+0           ------ ∨+1              (∨+0,B M^A)^{A∨B}   (∨+1,A M^B)^{A∨B}
  A ∨ B                A ∨ B

  | M        [u : A]   [v : B]
  A ∨ B      | N       | K
             C         C
  ---------------------------- ∨−u,v           (M^{A∨B}(u^A.N^C, v^B.K^C))^C
               C

  | M    | N
  A      B
  ----------- ∧+                               ⟨M^A, N^B⟩^{A∧B}
    A ∧ B

  | M        [u : A] [v : B]
  A ∧ B      | N
             C
  -------------------------- ∧−u,v             (M^{A∧B}(u^A, v^B.N^C))^C
             C

        | M
  r     A(r)
  ----------- ∃+                               (∃+_{x,A} r M^{A(r)})^{∃xA(x)}
    ∃xA(x)

  | M        [u : A]
  ∃xA        | N
             B
  ------------------ ∃−x,u (var. cond.)        (M^{∃xA}(u^A.N^B))^B (var. cond.)
          B

  Table 2. Derivation terms for ∨, ∧ and ∃

→-conversion.

  [u : A]
  | M               | N
  B                 A
  -------- →+u
  A → B
  -------------------- →−        ↦        | N
           B                              A
                                          | M
                                          B


or written as derivation terms, (λu M(u^A)^B)^{A→B} N^A ↦ M(N^A)^B. The reader familiar with λ-calculus should note that this is nothing other than β-conversion.

∀-conversion.

  | M
  A(x)
  --------- ∀+x
  ∀xA(x)       r
  --------------- ∀−        ↦        | M′
       A(r)                          A(r)

or written as derivation terms (λx M(x)^{A(x)})^{∀xA(x)} r ↦ M(r).

∨-conversion.

  | M
  A              [u : A]    [v : B]
  ------ ∨+0     | N        | K
  A ∨ B          C          C
  --------------------------------- ∨−u,v        ↦        | M
                 C                                        A
                                                          | N
                                                          C

or as derivation terms (∨+0,B M^A)^{A∨B}(u^A.N(u)^C, v^B.K(v)^C) ↦ N(M^A)^C, and similarly for ∨+1 with K instead of N.

∧-conversion.

  | M    | N
  A      B           [u : A] [v : B]
  ----------- ∧+     | K
    A ∧ B            C
  -------------------------- ∧−u,v        ↦        | M    | N
              C                                    A      B
                                                   | K
                                                   C

or ⟨M^A, N^B⟩^{A∧B}(u^A, v^B.K(u, v)^C) ↦ K(M^A, N^B)^C.

∃-conversion.

        | M
  r     A(r)         [u : A(x)]
  ----------- ∃+     | N
    ∃xA(x)           B
  --------------------------- ∃−x,u        ↦        | M
              B                                     A(r)
                                                    | N′
                                                    B

or (∃+_{x,A} r M^{A(r)})^{∃xA(x)}(u^{A(x)}.N(x, u)^B) ↦ N(r, M^{A(r)})^B.
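All three β-conversions above are term rewriting by substitution. As an illustrative sketch (the names are ours), here is the →-conversion (λu M)N ↦ M[u := N] on a toy term syntax; we assume bound variables are named apart, so the naive substitution below cannot capture.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class V:              # variable
    name: str

@dataclass(frozen=True)
class Lam:            # abstraction λvar.body
    var: str
    body: object

@dataclass(frozen=True)
class App:            # application fun arg
    fun: object
    arg: object

def subst(m, u, n):
    """m with every free occurrence of the variable u replaced by n."""
    if isinstance(m, V):
        return n if m.name == u else m
    if isinstance(m, Lam):
        return m if m.var == u else Lam(m.var, subst(m.body, u, n))
    return App(subst(m.fun, u, n), subst(m.arg, u, n))

def beta_step(t):
    """Contract t itself if it is a β-redex (λu M)N; otherwise return t unchanged."""
    if isinstance(t, App) and isinstance(t.fun, Lam):
        return subst(t.fun.body, t.fun.var, t.arg)
    return t

# (λu. u) v  ↦  v
assert beta_step(App(Lam("u", V("u")), V("v"))) == V("v")
```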

1.2.2. Permutative conversions.

∨-permutative conversion.

  | M       | N    | K
  A ∨ B     C      C
  ------------------- ∨−        | L
           C                    C′
  ------------------------------------ E-rule
                  D

  ↦

            | N     | L                 | K     | L
            C       C′                  C       C′
            -------------- E-rule       -------------- E-rule
  | M             D                           D
  A ∨ B
  ------------------------------------------------- ∨−
                         D

or with for instance →− as E-rule:

  (M^{A∨B}(u^A.N^{C→D}, v^B.K^{C→D}))^{C→D} L^C ↦ (M^{A∨B}(u^A.(N^{C→D}L^C)^D, v^B.(K^{C→D}L^C)^D))^D.


∧-permutative conversion.

  | M       | N
  A ∧ B     C
  ------------- ∧−        | K
        C                 C′
  ------------------------------ E-rule
               D

  ↦

            | N     | K
            C       C′
            -------------- E-rule
  | M             D
  A ∧ B
  ----------------------- ∧−
            D

or (M^{A∧B}(u^A, v^B.N^{C→D}))^{C→D} K^C ↦ (M^{A∧B}(u^A, v^B.(N^{C→D}K^C)^D))^D.

∃-permutative conversion.

  | M      | N
  ∃xA      B
  ------------ ∃−        | K
        B                C
  ---------------------------- E-rule
               D

  ↦

           | N     | K
           B       C
           -------------- E-rule
  | M            D
  ∃xA
  ---------------------- ∃−
           D

or (M^{∃xA}(u^A.N^{C→D}))^{C→D} K^C ↦ (M^{∃xA}(u^A.(N^{C→D}K^C)^D))^D.

1.2.3. Simplification conversions. These are somewhat trivial conversions, which remove unnecessary applications of the elimination rules for ∨, ∧ and ∃. For ∨ we have

  | M        [u : A]    [v : B]
  A ∨ B      | N        | K
             C          C
  ----------------------------- ∨−u,v        ↦        | N
               C                                      C

if u : A is not free in N, or (M^{A∨B}(u^A.N^C, v^B.K^C))^C ↦ N^C; similar for the second component. For ∧ there is the conversion

  | M        [u : A] [v : B]
  A ∧ B      | N
             C
  -------------------------- ∧−u,v           ↦        | N
             C                                        C

if neither u : A nor v : B is free in N, or (M^{A∧B}(u^A, v^B.N^C))^C ↦ N^C. For ∃ the simplification conversion is

  | M        [u : A]
  ∃xA        | N
             B
  ------------------ ∃−x,u                   ↦        | N
          B                                           B

if again u : A is not free in N, or (M^{∃xA}(u^A.N^B))^B ↦ N^B.


1.2.4. Strong normalization. We now show that no matter in which order we apply the conversion rules, they will always terminate and produce a derivation in "normal form", where no further conversions can be applied.

We shall write derivation terms without formula super- or subscripts. For instance, we write ∃+ instead of ∃+x,A. Hence we consider derivation terms M, N, K now of the forms

  u | λvM | λyM | ∨+0 M | ∨+1 M | ⟨M, N⟩ | ∃+rM |
  MN | Mr | M(v0.N0, v1.N1) | M(v, w.N) | M(v.N)

where, in these expressions, the variables v, y, v0, v1, w are bound.

To simplify the technicalities, we restrict our treatment to the rules for → and ∃. The argument easily extends to the full set of rules. Hence we consider

  u | λvM | ∃+rM | MN | M(v.N).

The strategy for strong normalization is set out below, but a word about notation is crucial here. Whenever we write an applicative term as M~N := MN1 . . . Nk the convention is that bracketing to the left operates. That is, M~N = (. . . (MN1) . . . Nk).

We reserve the letters E, F, G for eliminations, i.e., expressions of the form (v.N), and R, S, T for both terms and eliminations. Using this notation we obtain a second (and clearly equivalent) inductive definition of terms:

  u~M | u~ME | λvM | ∃+rM |
  (λvM)N~R | ∃+rM(v.N)~R | u~MER~S.

Here only the final three forms are not normal: (λvM)N~R and ∃+rM(v.N)~R both are β-redexes, and u~MER~S is a permutative redex. The conversion rules for them are

  (λvM(v))N ↦β M(N)                    β→-conversion,
  ∃+_{x,A} r M(v.N(x, v)) ↦β N(r, M)   β∃-conversion,
  M(v.N)R ↦π M(v.NR)                   permutative conversion.

In addition we also allow

  M(v.N) ↦σ N    if v : A is not free in N; a simplification conversion.

M(v.N) is then called a simplification redex. The closure of these conversions is defined by

(a) If M ↦ξ M′ for ξ = β, π, σ, then M → M′.
(b) If M → M′, then MR → M′R, NM → NM′, N(v.M) → N(v.M′), λvM → λvM′, ∃+rM → ∃+rM′ (inner reductions).

So M → N means that M reduces in one step to N, i.e., N is obtained from M by replacement of (an occurrence of) a redex M′ of M by a conversum M″ of M′, i.e., by a single conversion. The relation →+ ("properly reduces to") is the transitive closure of →, and →∗ ("reduces to") is the reflexive and transitive closure of →. A term M is in normal form, or M is normal, if M does not contain a redex. M has a normal form if there is a normal N such that M →∗ N. A reduction sequence is a (finite or infinite) sequence M0 → M1 → M2 → . . . such that Mi → Mi+1, for all i.


We inductively define a set SN. In doing so we take care that for a given M there is exactly one rule applicable to generate M ∈ SN. This will be crucial to make the later proofs work.

       ~M ∈ SN                  M ∈ SN                M ∈ SN
  ---------------- (Var0)    ------------ (λ)    --------------- (∃)
     u~M ∈ SN                 λvM ∈ SN             ∃+rM ∈ SN

      ~M, N ∈ SN                   u~M(v.NR)~S ∈ SN
  ------------------ (Var)      ---------------------- (Varπ)
   u~M(v.N) ∈ SN                 u~M(v.N)R~S ∈ SN

   M(N)~R ∈ SN    N ∈ SN             N(r, M)~R ∈ SN    M ∈ SN
  ------------------------ (β→)    -------------------------------- (β∃)
    (λvM(v))N~R ∈ SN                ∃+_{x,A} r M(v.N(x, v))~R ∈ SN

In (Varπ) we require that x (from ∃xA) and v are not free in R.

It is easy to see that SN is closed under substitution for object variables: if M(x) ∈ SN, then M(r) ∈ SN. The proof of this is by induction on M ∈ SN, applying the induction hypothesis first to the premise(s) and then reapplying the same rule.

We write M↓ to mean that M is strongly normalizing, i.e., that every reduction sequence starting from M terminates. By analysing the possible reduction steps we now show that the set {M | M↓} has the closure properties of the definition of SN above, and hence SN ⊆ {M | M↓}.

Lemma. Every term in SN is strongly normalizing.

Proof. We distinguish cases according to the generation rule of SN applied last. The following rules deserve special attention.

Case (Varπ). We prove, as an auxiliary lemma, that

  u~M(v.NR)~S↓ implies u~M(v.N)R~S↓,

by induction on u~M(v.NR)~S↓ (i.e., on the reduction tree of this term). We consider the possible reducts of u~M(v.N)R~S. The only interesting case is u~M(v.N)(v′.N′)T~T, and we have a permutative conversion of (v′.N′) with T, leading to the term M = u~M(v.N)(v′.N′T)~T. We show M↓. Consider an arbitrary reduction sequence starting from M; for simplicity assume ~T = R. Reductions inside ~M, N, N′T and R lead to u~M1(v.N1)(v′.K1)R1. Then we may have a permutative conversion to u~M1(v.N1)(v′.K1R1) and afterwards further inner reductions leading to u~M2(v.N2)(v′.K2). Such reductions must terminate, because any infinite reduction sequence caused by them would also lead to an infinite reduction sequence for u~M(v.N)(v′.N′)TR, which cannot happen by induction hypothesis. Therefore we are left with the case of a final permutative conversion to u~M2(v.N2(v′.K2)). But this term is also a reduct of u~M(v.N)(v′.K)TR (first three permutative conversions lead to u~M(v.N)(v′.KTR), and then inner reductions), which by induction hypothesis is strongly normalizing.


Case (β→). We show that M(N)~R↓ and N↓ imply (λvM(v))N~R↓. This is done by induction on N↓, with a side induction on M(N)~R↓. We need to consider all possible reducts of (λvM(v))N~R. In case of an outer β-reduction use the assumption. If N is reduced, use the induction hypothesis. Reductions in M and in ~R as well as permutative reductions within ~R are taken care of by the side induction hypothesis.

Case (β∃). We show that

  N(r, M)~R↓ and M↓ together imply ∃+rM(v.N(x, v))~R↓.

This is done by a threefold induction: first on M↓, second on N(r, M)~R↓ and third on the length of ~R. We need to consider all possible reducts of ∃+rM(v.N(x, v))~R. In case of an outer β-reduction it must reduce to N(r, M)~R, hence the result by assumption. If M is reduced, use the first induction hypothesis. Reductions in N(x, v) and in ~R as well as permutative reductions within ~R are taken care of by the second induction hypothesis. The only remaining case is when ~R = S~S and (v.N(x, v)) is permuted with S, to yield ∃+rM(v.N(x, v)S)~S, in which case the third induction hypothesis applies.

For later use we prove a slightly generalized form of the rule (Varπ):

Proposition. If M(v.NR)~S ∈ SN, then M(v.N)R~S ∈ SN.

Proof. Induction on the generation of M(v.NR)~S ∈ SN. We distinguish cases according to the form of M.

Case u~T(v.NR)~S ∈ SN. If ~T = ~M (i.e., ~T consists of derivation terms only), use (Varπ). Else we have u~M(v′.N′)~R(v.NR)~S ∈ SN. This must be generated by repeated applications of (Varπ) from u~M(v′.N′~R(v.NR)~S) ∈ SN, and finally by (Var) from ~M ∈ SN and N′~R(v.NR)~S ∈ SN. The induction hypothesis for the latter fact yields N′~R(v.N)R~S ∈ SN, hence u~M(v′.N′~R(v.N)R~S) ∈ SN by (Var) and finally u~M(v′.N′)~R(v.N)R~S ∈ SN by (Varπ).

Case ∃+rM~T(v.N(x, v)R)~S ∈ SN. Similar, with (β∃) instead of (Varπ). In detail: if ~T is empty, by (β∃) this came from N(r, M)R~S ∈ SN and M ∈ SN, hence ∃+rM(v.N(x, v))R~S ∈ SN again by (β∃). Otherwise we have ∃+rM(v′.N′(x′, v′))~T(v.NR)~S ∈ SN. This must be generated by (β∃) from N′(r, M)~T(v.NR)~S ∈ SN. The induction hypothesis yields N′(r, M)~T(v.N)R~S ∈ SN, hence ∃+rM(v′.N′(x′, v′))~T(v.N)R~S ∈ SN by (β∃).

Case (λvM(v))N′~R(w.NR)~S ∈ SN. By (β→) this came from N′ ∈ SN and M(N′)~R(w.NR)~S ∈ SN. But the induction hypothesis yields M(N′)~R(w.N)R~S ∈ SN, hence (λvM(v))N′~R(w.N)R~S ∈ SN by (β→).

We show, finally, that every term is in SN and hence is strongly normalizing. Given the definition of SN we only have to show that SN is closed under →− and ∃−. But in order to prove this we must prove simultaneously the closure of SN under substitution.

Theorem (Properties of SN). For all formulas A,


(a) for all M ∈ SN, if M proves A = A0 → A1 and N ∈ SN, then MN ∈ SN,
(b) for all M ∈ SN, if M proves A = ∃xB and N ∈ SN, then M(v.N) ∈ SN,
(c) for all M(v) ∈ SN, if N^A ∈ SN, then M(N) ∈ SN.

Proof. Induction on |A|. We prove (a) and (b) before (c), and hence have (a) and (b) available for the proof of (c). More formally, by induction on A we simultaneously prove that (a) holds, that (b) holds and that (a), (b) together imply (c).

(a). By side induction on M ∈ SN. Let M ∈ SN and assume that M proves A = A0 → A1 and N ∈ SN. We distinguish cases according to how M ∈ SN was generated. For (Var0), (Varπ), (β→) and (β∃) use the same rule again.

Case u~M(v.N′) ∈ SN by (Var) from ~M, N′ ∈ SN. Then N′N ∈ SN by side induction hypothesis for N′, hence u~M(v.N′N) ∈ SN by (Var), hence u~M(v.N′)N ∈ SN by (Varπ).

Case (λvM(v))^{A0→A1} ∈ SN by (λ) from M ∈ SN. Use (β→); for this we need to know M(N) ∈ SN. But this follows from induction hypothesis (c) for M, since N derives A0.

(b). By side induction on M ∈ SN. Let M ∈ SN and assume that M proves A = ∃xB and N ∈ SN. The goal is M(v.N) ∈ SN. We distinguish cases according to how M ∈ SN was generated. For (Varπ), (β→) and (β∃) use the same rule again.

Case u~M ∈ SN by (Var0) from ~M ∈ SN. Use (Var).

Case (∃+rM)^{∃xA} ∈ SN by (∃) from M ∈ SN. We must show that ∃+rM(v.N(x, v)) ∈ SN. Use (β∃); for this we need to know N(r, M) ∈ SN. But this follows from induction hypothesis (c) for N(r, v) (which is in SN by the remark above), since M derives A(r).

Case u~M(v′.N′) ∈ SN by (Var) from ~M, N′ ∈ SN. Then N′(v.N) ∈ SN by side induction hypothesis for N′, hence u~M(v′.N′(v.N)) ∈ SN by (Var) and therefore u~M(v′.N′)(v.N) ∈ SN by (Varπ).

(c). By side induction on M(v) ∈ SN. Let N^A ∈ SN; the goal is M(N) ∈ SN. We distinguish cases according to how M(v) ∈ SN was generated. For (λ), (∃), (β→) and (β∃) use the same rule again, after applying the induction hypothesis to the premise(s).

Case u~M(v) ∈ SN by (Var0) from ~M(v) ∈ SN. Then ~M(N) ∈ SN by side induction hypothesis (c). If u ≠ v, use (Var0) again. If u = v, we must show N~M(N) ∈ SN. Note that N proves A; hence the claim follows from ~M(N) ∈ SN by (a) with M = N.

Case u~M(v)(v′.N′(v)) ∈ SN by (Var) from ~M(v), N′(v) ∈ SN. If u ≠ v, use (Var) again. If u = v, we must show N~M(N)(v′.N′(N)) ∈ SN. Note that N proves A; hence in case ~M(v) is empty the claim follows from (b) with M = N, and otherwise from (a), (b) and the induction hypothesis.

Case u~M(v)(v′.N′(v))R(v)~S(v) ∈ SN has been obtained by (Varπ) from u~M(v)(v′.N′(v)R(v))~S(v) ∈ SN. If u ≠ v, use (Varπ) again. If u = v, from the side induction hypothesis we obtain N~M(N)(v′.N′(N)R(N))~S(N) ∈ SN. Now use the proposition above with M := N~M(N).


Corollary. Every derivation term is in SN and therefore strongly normalizing.

Proof. Induction on the (first) inductive definition of derivation terms. In cases u, λvM and ∃+rM the claim follows from the definition of SN, and in cases MN and M(v.N) from parts (a), (b) of the previous theorem.

1.2.5. On disjunction. Incorporating the full set of rules adds no other technical complications but merely increases the length. For the energetic reader, however, we include here the details necessary for disjunction. The conjunction case is entirely straightforward.

We have additional β-conversions

  ∨+i M(v0.N0, v1.N1) ↦β Ni[vi := M]    β∨i-conversion.

The definition of SN needs to be extended by

      M ∈ SN                        ~M, N0, N1 ∈ SN
  --------------- (∨i)        --------------------------- (Var∨)
   ∨+i M ∈ SN                  u~M(v0.N0, v1.N1) ∈ SN

   u~M(v0.N0R, v1.N1R)~S ∈ SN
  ----------------------------- (Var∨,π)
   u~M(v0.N0, v1.N1)R~S ∈ SN

   Ni[vi := M]~R ∈ SN    N1−i~R ∈ SN    M ∈ SN
  ---------------------------------------------- (β∨i)
         ∨+i M(v0.N0, v1.N1)~R ∈ SN

The former rules (Var), (Varπ) should then be renamed into (Var∃), (Var∃,π).

The lemma above stating that every term in SN is strongly normalizing needs to be extended by an additional clause:

Case (β∨i). We show that Ni[vi := M]~R↓, N1−i~R↓ and M↓ together imply ∨+i M(v0.N0, v1.N1)~R↓. This is done by a fourfold induction: first on M↓, second on Ni[vi := M]~R↓, third on N1−i~R↓ and fourth on the length of ~R. We need to consider all possible reducts of ∨+i M(v0.N0, v1.N1)~R. In case of an outer β-reduction use the assumption. If M is reduced, use the first induction hypothesis. Reductions in Ni and in ~R as well as permutative reductions within ~R are taken care of by the second induction hypothesis. Reductions in N1−i are taken care of by the third induction hypothesis. The only remaining case is when ~R = S~S and (v0.N0, v1.N1) is permuted with S, to yield (v0.N0S, v1.N1S). Apply the fourth induction hypothesis, since (NiS)[vi := M]~S = Ni[vi := M]S~S.

Finally the theorem above stating properties of SN needs an additional clause:

• for all M ∈ SN, if M proves A = A0 ∨ A1 and N0, N1 ∈ SN, then M(v0.N0, v1.N1) ∈ SN.

Proof. The new clause is proved by induction on M ∈ SN. Let M ∈ SN and assume that M proves A = A0 ∨ A1 and N0, N1 ∈ SN. The goal is M(v0.N0, v1.N1) ∈ SN. We distinguish cases according to how M ∈ SN was generated. For (Var∃,π), (Var∨,π), (β→), (β∃) and (β∨i) use the same rule again.


Case u~M ∈ SN by (Var0) from ~M ∈ SN. Use (Var∨).

Case (∨+i M)^{A0∨A1} ∈ SN by (∨i) from M ∈ SN. Use (β∨i); for this we need to know Ni[vi := M] ∈ SN and N1−i ∈ SN. The latter is assumed, and the former follows from main induction hypothesis (with Ni) for the substitution clause of the theorem, since M derives Ai.

Case u~M(v′.N′) ∈ SN by (Var∃) from ~M, N′ ∈ SN. For brevity let E := (v0.N0, v1.N1). Then N′E ∈ SN by side induction hypothesis for N′, so u~M(v′.N′E) ∈ SN by (Var∃) and therefore u~M(v′.N′)E ∈ SN by (Var∃,π).

Case u~M(v′0.N′0, v′1.N′1) ∈ SN by (Var∨) from ~M, N′0, N′1 ∈ SN. Let E := (v0.N0, v1.N1). Then N′iE ∈ SN by side induction hypothesis for N′i, so u~M(v′0.N′0E, v′1.N′1E) ∈ SN by (Var∨) and therefore u~M(v′0.N′0, v′1.N′1)E ∈ SN by (Var∨,π).

Clause (c) now needs additional cases, e.g.,

Case u~M(v0.N0, v1.N1) ∈ SN by (Var∨) from ~M, N0, N1 ∈ SN. If u ≠ v, use (Var∨). If u = v, we show N~M[v := N](v0.N0[v := N], v1.N1[v := N]) ∈ SN. Note that N proves A; hence in case ~M is empty the claim follows from (b), and otherwise from (a) and the induction hypothesis.

1.2.6. The structure of normal derivations. To analyse normal derivations, it will be useful to introduce the notions of a segment and of a track in a proof tree, which make sense for non-normal derivations as well.

Definition. A segment (of length n) in a derivation M is a sequence A1, . . . , An of occurrences of a formula A such that

(a) for 1 ≤ i < n, Ai is a minor premise of an application of ∨−, ∧− or ∃−, with conclusion Ai+1;
(b) An is not a minor premise of ∨−, ∧− or ∃−;
(c) A1 is not the conclusion of ∨−, ∧− or ∃−.

Notice that a formula occurrence (f.o.) which is neither a minor premise nor the conclusion of an application of ∨−, ∧− or ∃− always constitutes a segment of length 1. A segment is maximal, or a cut (segment), if An is the major premise of an E-rule, and either n > 1, or n = 1 and A1 = An is the conclusion of an I-rule.

We use σ, σ′ for segments. σ is called a subformula of σ′ if the formula A in σ is a subformula of the formula B in σ′.

The notion of a track is designed to retain the subformula property in case one passes through the major premise of an application of a ∨−, ∧−, ∃−-rule. In a track, when arriving at an Ai which is the major premise of an application of such a rule, we take for Ai+1 a hypothesis discharged by this rule.

Definition. A track of a derivation M is a sequence of f.o.'s A0, . . . , An such that

(a) A0 is a top f.o. in M not discharged by an application of a ∨−, ∧−, ∃−-rule;
(b) Ai for i < n is not the minor premise of an instance of →−, and either
    (i) Ai is not the major premise of an instance of a ∨−, ∧−, ∃−-rule and Ai+1 is directly below Ai, or
    (ii) Ai is the major premise of an instance of a ∨−, ∧−, ∃−-rule and Ai+1 is an assumption discharged by this instance;
(c) An is either
    (i) the minor premise of an instance of →−, or
    (ii) the end formula of M, or
    (iii) the major premise of an instance of a ∨−, ∧−, ∃−-rule in case there are no assumptions discharged by this instance.

Lemma. In a derivation each formula occurrence belongs to some track.

Proof. By induction on derivations. For example, suppose a derivation K ends with an ∃−-application:

  | M      [u : A]
  ∃xA      | N
           B
  --------------- ∃−x,u
        B

B in N belongs to a track π (induction hypothesis); either this does not start in u : A, and then π, B is a track in K which ends in the end formula; or π starts in u : A, and then there is a track π′ in M (induction hypothesis) such that π′, π, B is a track in K ending in the end formula. The other cases are left to the reader.

Definition. A track of order 0, or main track, in a derivation is a track ending either in the end formula of the whole derivation or in the major premise of an application of a ∨−, ∧− or ∃−-rule, provided there are no assumption variables discharged by the application. A track of order n + 1 is a track ending in the minor premise of an →−-application, with major premise belonging to a track of order n.

A main branch of a derivation is a branch π (i.e., a linearly ordered subtree) in the proof tree such that π passes only through premises of I-rules and major premises of E-rules, and π begins at a top node and ends in the end formula.

Since by simplification conversions we have removed every application of a ∨−, ∧− or ∃−-rule that discharges no assumption variables, each track of order 0 in a normal derivation is a track ending in the end formula of the whole derivation. Note also that if we search for a main branch going upwards from the end formula, the branch to be followed is unique as long as we do not encounter an ∧+-application. Now let us consider normal derivations. Recall the notion of a strictly positive part of a formula, defined in 1.1.3.

Proposition. Let M be a normal derivation, and let π = σ0, . . . , σn be a track in M. Then there is a segment σi in π, the minimum segment or minimum part of the track, which separates two (possibly empty) parts of π, called the E-part (elimination part) and the I-part (introduction part) of π, such that

(a) for each σj in the E-part one has j < i, σj is a major premise of an E-rule, and σj+1 is a strictly positive part of σj, and therefore each σj is a s.p.p. of σ0;


(b) for each σj which is the minimum segment or is in the I-part one has i ≤ j, and if j ≠ n, then σj is a premise of an I-rule and a s.p.p. of σj+1, so each σj is a s.p.p. of σn.

Proof. By tracing through the definitions.

Theorem (Subformula property). Let M be a normal derivation. Then each formula occurring in the derivation is a subformula of either the end formula or else an (uncancelled) assumption formula.

Proof. As noted above, each track of order 0 in M is a track ending in the end formula of M. Furthermore each track has an E-part above an I-part. Therefore any formula on a track of order 0 is either a subformula of the end formula or else a subformula of an (uncancelled) assumption. We can now prove the theorem for tracks of order n, by induction on n. So assume the result holds for tracks of order n. If A is any formula on a track of order n + 1, either A lies in the E-part, in which case it is a subformula of an assumption, or else it lies in the I-part and is therefore a subformula of the minor premise of an →− whose major premise belongs to a track of order n. In this case A is a subformula of a formula on a track of order n and we can apply the induction hypothesis.

Theorem (Disjunction property). If no strictly positive part of a formula in Γ is a disjunction, then Γ ⊢ A ∨ B implies Γ ⊢ A or Γ ⊢ B.

Proof. Consider a normal derivation M of A ∨ B from assumptions Γ not containing a disjunction as s.p.p. The end formula A ∨ B is the final formula of a (main) track. If the I-part of this track is empty, then the structure of main tracks ensures that A ∨ B would be a s.p.p. of an assumption in Γ, but this is not allowed. Hence A ∨ B lies in the I-part of a main track. If above A ∨ B this track goes through a minor premise of an ∨−, then the major premise would again be a disjunctive s.p.p. of an assumption, which is not allowed. Thus A ∨ B belongs to a segment within the I-part of the track, above which there can only be finitely many ∃− and ∧− followed by an ∨+_i. Its premise is either A or B, and therefore we can replace the segment of A ∨ B's by a segment of A's or a segment of B's, thus transforming the proof into either a proof of A or a proof of B.

There is a similar theorem for the existential quantifier:

Theorem (Explicit definability under hypotheses). If no strictly positive part of a formula in Γ is existential, then Γ ⊢ ∃xA(x) implies Γ ⊢ A(r1) ∨ · · · ∨ A(rn) for some terms r1, . . . , rn. If in addition no s.p.p. of a formula in Γ is disjunctive, then Γ ⊢ ∃xA(x) implies there is even a single term r such that Γ ⊢ A(r).

Proof. Consider a normal derivation M of ∃xA(x) from assumptions Γ not containing an existential s.p.p. We use induction on the derivation, and distinguish cases on the last rule.


By assumption the last rule cannot be ∃−, using a similar argument to the above. Again as before, the only critical case is when the last rule is ∨−.

               [u : B]     [v : C]
     | M         | N0        | N1
    B ∨ C      ∃xA(x)      ∃xA(x)
    --------------------------------- ∨− u, v
                ∃xA(x)

By assumption again neither B nor C can have an existential s.p.p. Applying the induction hypothesis to N0 and N1 we obtain

                 [u : B]                      [v : C]
                   |                            |
            ∨_{i=1}^{n} A(ri)           ∨_{i=n+1}^{n+m} A(ri)
            ------------------- ∨+      ---------------------- ∨+
     | M    ∨_{i=1}^{n+m} A(ri)         ∨_{i=1}^{n+m} A(ri)
    B ∨ C
    ---------------------------------------------------------- ∨− u, v
                     ∨_{i=1}^{n+m} A(ri)

The remaining cases are left to the reader.

The second part of the theorem is proved similarly; by assumption the last rule can be neither ∨− nor ∃−, so it may be an ∧−. In that case there is only one minor premise and so no need to duplicate instances of A(x).

1.3. Soundness and Completeness for Tree Models

It is an obvious question to ask whether the logical rules we have been considering suffice, i.e., whether we have forgotten some necessary rules. To answer this question we first have to fix the meaning of a formula, i.e., provide a semantics. This will be done by means of the tree models introduced by Beth (1956). Using this concept of a model we will prove soundness and completeness.

1.3.1. Tree models. Consider a finitely branching tree of “possible worlds”. The worlds are represented as nodes in this tree. They may be thought of as possible states such that all nodes “above” a node k are the ways in which k may develop in the future. The worlds are increasing, that is, if an atomic formula R~s is true in a world k, then R~s is true in all future worlds k′.

More formally, each tree model is based on a finitely branching tree T. A node k over a set S is a finite sequence k = ⟨a0, a1, . . . , an−1⟩ of elements of S; lh(k) is the length of k. We write k ⪯ k′ if k is an initial segment of k′. A tree on S is a set of nodes closed under initial segments. A tree T is finitely branching if every node in T has finitely many immediate successors. A tree T is infinite if for every n ∈ N there is a node k ∈ T such that lh(k) = n. A branch of T is a linearly ordered subtree of T. A leaf is a node without successors in T.
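These combinatorial notions translate directly into code. The following Python sketch (representation and helper names are ours, not the text's) models nodes as tuples over S and trees as prefix-closed sets of tuples:

```python
# Nodes over a set S as tuples; a tree is a set of nodes closed under
# initial segments. Helper names are illustrative only.

def lh(k):
    """lh(k): the length of node k."""
    return len(k)

def is_initial_segment(k, k2):
    """k is an initial segment of k2 (the ordering on nodes)."""
    return k == k2[:len(k)]

def is_tree(nodes):
    """A tree: every initial segment of a node is again a node."""
    return all(k[:i] in nodes for k in nodes for i in range(len(k)))

def leaves(nodes):
    """Nodes without proper extensions in the tree."""
    return {k for k in nodes
            if not any(len(k2) > len(k) and is_initial_segment(k, k2)
                       for k2 in nodes)}
```

For instance, the binary tree of depth 2 is the set of all 0-1 tuples of length at most 2; its root is the empty tuple and its leaves are the four length-2 tuples.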

For the proof of the completeness theorem, the complete binary tree over {0, 1} (whose branches constitute Cantor space) will suffice. The nodes are all the finite sequences of 0's and 1's, and the ordering is as above. The root is the empty sequence, and k0 is the sequence k with the element 0 added at the end; similarly for k1.


For the rest of this section, fix a countable formal language L.

Definition. Let T be a finitely branching tree. A tree model on T is a triple T = (D, I0, I1) such that
(a) D is a nonempty set;
(b) for every n-ary function symbol f (in the underlying language L), I0 assigns to f a map I0(f) : Dⁿ → D;
(c) for every n-ary relation symbol R and every node k ∈ T, a set I1(R, k) ⊆ Dⁿ is assigned in such a way that monotonicity is preserved:

    k ⪯ k′ → I1(R, k) ⊆ I1(R, k′).

If n = 0, then I1(R, k) is either true or false. There is no special requirement on I1(⊥, k). (Recall that minimal logic places no particular constraints on falsum ⊥.) We write R^T(~a, k) for ~a ∈ I1(R, k), and |T| to denote the domain D.

It is obvious from the definition that any tree T can be extended to a complete tree (one without leaves): for every leaf k ∈ T all sequences k0, k00, k000, . . . are added, and for every such node k0 . . . 0 we set I1(R, k0 . . . 0) := I1(R, k).

An assignment (or variable assignment) in D is a map η assigning to every variable x ∈ dom(η) a value η(x) ∈ D. Finite assignments will be written as [x1 := a1, . . . , xn := an] or else as [a1/x1, . . . , an/xn], with distinct x1, . . . , xn. If η is an assignment in D and a ∈ D, let η^a_x be the assignment in D mapping x to a and coinciding with η elsewhere:

    η^a_x(y) := η(y),  if y ≠ x,
    η^a_x(y) := a,     if y = x.

Let a tree model T = (D, I0, I1) and an assignment η in D be given. We define a homomorphic extension of η (denoted by η as well) to terms t whose variables lie in dom(η) by

    η(c) := I0(c),
    η(f(t1, . . . , tn)) := I0(f)(η(t1), . . . , η(tn)).

Observe that the extension of η depends on T; we often write t^T[η] for η(t).

Definition. T, k ⊩ A[η] (T forces A at node k for an assignment η) is defined inductively. We write k ⊩ A[η] when it is clear from the context what the underlying model T is, and ∀k′⪰_n k A for ∀k′⪰k (lh(k′) = lh(k) + n → A).

    k ⊩ (R~s)[η]    := ∃n ∀k′⪰_n k R^T(~s^T[η], k′),
    k ⊩ (A ∨ B)[η]  := ∃n ∀k′⪰_n k (k′ ⊩ A[η] ∨ k′ ⊩ B[η]),
    k ⊩ (∃xA)[η]    := ∃n ∀k′⪰_n k ∃a∈|T| (k′ ⊩ A[η^a_x]),
    k ⊩ (A → B)[η]  := ∀k′⪰k (k′ ⊩ A[η] → k′ ⊩ B[η]),
    k ⊩ (A ∧ B)[η]  := k ⊩ A[η] ∧ k ⊩ B[η],
    k ⊩ (∀xA)[η]    := ∀a∈|T| (k ⊩ A[η^a_x]).

Thus in the atomic, disjunctive and existential cases, the set of k′ whose length is lh(k) + n acts as a “bar” in the complete tree. Note that the implicational case is treated differently, and refers to the “unbounded future”.


In this definition, the logical connectives →, ∧, ∨, ∀, ∃ on the left hand side are part of the object language, whereas the same connectives on the right hand side are to be understood in the usual sense: they belong to the “metalanguage”. It should always be clear from the context whether a formula is part of the object language or the metalanguage.
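To see the forcing clauses in action, here is a small Python sketch for the propositional connectives on a finite binary tree, read as a complete tree in which each leaf repeats forever (so the valuation is stable beyond the leaves); the encoding and function names are our own, not the text's. On such a completed tree the bar quantifier ∃n∀k′⪰_n k in the atomic and disjunctive clauses can be computed recursively: it holds at k iff the condition holds at k outright or it holds below both immediate successors.

```python
# Forcing on a finite binary tree of the given depth, viewed as a complete
# tree in which each leaf repeats forever.  Formulas: ('atom', p),
# ('or', A, B), ('and', A, B), ('imp', A, B); atoms are strings.

def children(k, depth):
    return [] if len(k) == depth else [k + (0,), k + (1,)]

def forces(k, fml, val, depth):
    """k forces fml, where val maps each node to its set of true atoms
    (val must be monotone along the tree ordering)."""
    kids = children(k, depth)
    tag = fml[0]
    if tag == 'atom':
        # bar clause: true here, or eventually true on every branch above k
        return fml[1] in val[k] or (kids != [] and
                                    all(forces(c, fml, val, depth) for c in kids))
    if tag == 'or':
        return (forces(k, fml[1], val, depth) or forces(k, fml[2], val, depth)
                or (kids != [] and
                    all(forces(c, fml, val, depth) for c in kids)))
    if tag == 'and':
        return forces(k, fml[1], val, depth) and forces(k, fml[2], val, depth)
    if tag == 'imp':
        # every extension forcing the premise must force the conclusion
        return ((not forces(k, fml[1], val, depth)
                 or forces(k, fml[2], val, depth))
                and all(forces(c, fml, val, depth) for c in kids))
```

A nice feature of the bar clause is visible at once: a disjunction can be forced at the root although neither disjunct is, as soon as each branch eventually settles on one of them. The sketch can also be used to check monotonicity and the covering lemma of the next subsection on concrete finite models.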

1.3.2. Covering lemma. It is easily seen (using the definition and monotonicity) that from k ⊩ A[η] and k ⪯ k′ we can conclude k′ ⊩ A[η]. The converse is also true:

Lemma (Covering).

    ∀k′⪰_n k (k′ ⊩ A[η]) → k ⊩ A[η].

Proof. Induction on A. We write k ⊩ A for k ⊩ A[η].

Case R~s. Assume

    ∀k′⪰_n k (k′ ⊩ R~s),

hence by definition

    ∀k′⪰_n k ∃m ∀k′′⪰_m k′ R^T(~s^T[η], k′′).

Since T is a finitely branching tree,

    ∃m ∀k′⪰_m k R^T(~s^T[η], k′).

Hence k ⊩ R~s. The cases A ∨ B and ∃xA are handled similarly.

Case A → B. Let k′ ⊩ A → B for all k′ ⪰ k with lh(k′) = lh(k) + n.

We show

    ∀l⪰k (l ⊩ A → l ⊩ B).

Let l ⪰ k and l ⊩ A. We must show l ⊩ B. To this end we apply the induction hypothesis to B and m := max(lh(k) + n, lh(l)). So assume l′ ⪰ l and lh(l′) = m. It is sufficient to show l′ ⊩ B. If lh(l′) = lh(l), then l′ = l and we are done. If lh(l′) = lh(k) + n > lh(l), then l′ is an extension of l as well as of k and has length lh(k) + n, and hence l′ ⊩ A → B by assumption. Moreover, l′ ⊩ A, since l′ ⪰ l and l ⊩ A. It follows that l′ ⊩ B.

The cases A ∧ B and ∀xA are easy.

1.3.3. Soundness.

Lemma (Coincidence). Let T be a tree model, t a term, A a formula and η, ξ assignments in |T|.
(a) If η(x) = ξ(x) for all x ∈ vars(t), then η(t) = ξ(t).
(b) If η(x) = ξ(x) for all x ∈ FV(A), then T, k ⊩ A[η] if and only if T, k ⊩ A[ξ].

Proof. Induction on terms and formulas.

Lemma (Substitution). Let T be a tree model, t, r terms, A a formula and η an assignment in |T|. Then
(a) η(r(t)) = η^{η(t)}_x(r(x)).
(b) T, k ⊩ A(t)[η] if and only if T, k ⊩ A(x)[η^{η(t)}_x].

Proof. Induction on terms and formulas.


Theorem (Soundness). Let Γ ∪ {A} be a set of formulas such that Γ ⊢ A. Then, if T is a tree model, k any node and η an assignment in |T|, it follows that T, k ⊩ Γ[η] implies T, k ⊩ A[η].

Proof. Induction on derivations. We begin with the axiom schemes ∨+_0, ∨+_1, ∨−, ∧+, ∧−, ∃+ and ∃−. k ⊩ C[η] is abbreviated k ⊩ C when η is known from the context.

Case ∨+_0 : A → A ∨ B. We show k ⊩ A → A ∨ B. Assume for k′ ⪰ k that k′ ⊩ A. Show: k′ ⊩ A ∨ B. This follows from the definition, since k′ ⊩ A. The case ∨+_1 : B → A ∨ B is symmetric.

Case ∨− : A ∨ B → (A → C) → (B → C) → C. We show that k ⊩ A ∨ B → (A → C) → (B → C) → C. Assume for k′ ⪰ k that k′ ⊩ A ∨ B, k′ ⊩ A → C and k′ ⊩ B → C (we can safely assume that k′ is the same for all three premises). Show that k′ ⊩ C. By definition, there is an n such that for all k′′ ⪰_n k′, k′′ ⊩ A or k′′ ⊩ B. In both cases it follows that k′′ ⊩ C, since k′ ⊩ A → C and k′ ⊩ B → C. By the covering lemma, k′ ⊩ C.

The cases ∧+, ∧− are easy.

Case ∃+ : A → ∃xA. We show k ⊩ (A → ∃xA)[η]. Assume k′ ⪰ k and k′ ⊩ A[η]. We show k′ ⊩ (∃xA)[η]. Since η = η^{η(x)}_x there is an a ∈ |T| (namely a := η(x)) such that k′ ⊩ A[η^a_x]. Hence, k′ ⊩ (∃xA)[η].

Case ∃− : ∃xA → ∀x(A → B) → B with x ∉ FV(B). We show that k ⊩ (∃xA → ∀x(A → B) → B)[η]. Assume that k′ ⪰ k and k′ ⊩ (∃xA)[η] and k′ ⊩ ∀x(A → B)[η]. We show k′ ⊩ B[η]. By definition, there is an n such that for every k′′ ⪰_n k′ there is an a ∈ |T| with k′′ ⊩ A[η^a_x]. From k′ ⊩ ∀x(A → B)[η] it follows that k′′ ⊩ B[η^a_x], and since x ∉ FV(B), from the coincidence lemma, k′′ ⊩ B[η]. Then, finally, by the covering lemma k′ ⊩ B[η].

This concludes the treatment of the axioms. We now consider the rules. In case of the assumption rule u : A we have A ∈ Γ and the claim is obvious.

Case →+. Assume k ⊩ Γ. We show k ⊩ A → B. Assume k′ ⪰ k and k′ ⊩ A. Our goal is k′ ⊩ B. We have k′ ⊩ Γ ∪ {A}. Thus, k′ ⊩ B by induction hypothesis.

Case →−. Assume k ⊩ Γ. The induction hypothesis gives us k ⊩ A → B and k ⊩ A. Hence k ⊩ B.

Case ∀+. Assume k ⊩ Γ[η] and x ∉ FV(Γ). We show k ⊩ (∀xA)[η], i.e., k ⊩ A[η^a_x] for an arbitrary a ∈ |T|. We have

    k ⊩ Γ[η^a_x]   by the coincidence lemma, since x ∉ FV(Γ),
    k ⊩ A[η^a_x]   by induction hypothesis.

Case ∀−. Let k ⊩ Γ[η]. We show that k ⊩ A(t)[η]. This follows from

    k ⊩ (∀xA(x))[η]        by induction hypothesis,
    k ⊩ A(x)[η^{η(t)}_x]   by definition,
    k ⊩ A(t)[η]            by the substitution lemma.

This concludes the proof.

1.3.4. Counter models. With soundness at hand, it is easy to build counter models for derivations not valid in minimal or intuitionistic logic. A


tree model for intuitionistic logic is a tree model T = (D, I0, I1) in which ⊥ is never forced, and consequently T, ⟨⟩ ⊩ Efq. This is equivalent to saying that I1(⊥, k) is false for all k.

Lemma. Given any tree model T, ⊥^T(k) is false at all nodes k if and only if k ⊮ ⊥ for all nodes k.

Proof. Clearly if k ⊮ ⊥ then ⊥^T(k) is false at node k. Conversely, suppose ⊥^T(k′) is false at all nodes k′. We must show ∀k (k ⊮ ⊥). Let k be given. Then, since ⊥^T(k′) is false at all nodes k′, it is certainly false at some k′ ⪰_n k, for every n. This means k ⊮ ⊥ by definition.

Therefore, by unravelling the implication clause in the forcing definition, one sees that in any tree model for intuitionistic logic,

    (k ⊩ ¬A) ↔ ∀k′⪰k (k′ ⊮ A),
    (k ⊩ ¬¬A) ↔ ∀k′⪰k (k′ ⊮ ¬A)
              ↔ ∀k′⪰k ∃k′′⪰k′ (k′′ ⊩ A).

As an example we show that ⊬_i ¬¬P → P. We describe the desired tree model by means of a diagram below. Next to every node we write all propositions forced at that node.

         •
        / \
      •P   •
          / \
        •P   •
            / \
          •P   …

This is a tree model because monotonicity clearly holds. Observe also that I1(⊥, k) is false at all nodes k. Hence this is an intuitionistic tree model, and moreover ⟨⟩ ⊮ P. Using the remark above, it is easily seen that ⟨⟩ ⊩ ¬¬P. Thus ⟨⟩ ⊮ (¬¬P → P) and hence ⊬_i (¬¬P → P). The model also shows that the Peirce formula ((P → Q) → P) → P is not derivable in intuitionistic logic.

As another example we show that the drinker formula ∃x(Px → ∀xPx) from 1.1.8 is intuitionistically underivable, using a quite different tree model. In this case the underlying tree is the full binary one, i.e., its nodes are the finite sequences k = ⟨i0, i1, . . . , in−1⟩ of numbers 0 or 1. For the language determined by ⊥ and a unary predicate symbol P consider T := (D, I1) with I1(⊥, k) false, D := N and

    I1(P, ⟨i0, . . . , in−1⟩) := { a ∈ D | i0, . . . , in−1 contains at least a zeros }.

Clearly T is an intuitionistic tree model (monotonicity is easily checked), k ⊮ ∀xPx for every k, and ∀a,k ∃l⪰k (l ⊩ Px[x := a]). Therefore

    ∀a,k (k ⊮ (Px → ∀xPx)[x := a]),
    ⟨⟩ ⊩ ∀x¬(Px → ∀xPx).

Hence ⊬_i ¬∀x¬(Px → ∀xPx).
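The interpretation I1(P, ·) in this countermodel is concretely computable. A small Python check (helper names are ours) confirms its monotonicity and the fact, used above, that every a ∈ D is forced at a suitable extension of any node:

```python
# The drinker-formula countermodel: at node k (a 0-1 sequence, here a
# tuple) the number a belongs to I1(P, k) iff k contains at least a zeros.

def interprets_P(a, k):
    """a ∈ I1(P, k): the node k contains at least a zeros."""
    return list(k).count(0) >= a

def extension_forcing(a, k):
    """An extension of k at which a ∈ I1(P, ·): append a zeros."""
    return tuple(k) + (0,) * a
```

Extending a node can only add zeros, so I1(P, ·) is monotone; and appending a zeros to any node puts a into the interpretation there, which is why k ⊮ ∀xPx fails to propagate to k ⊩ ¬(Px → ∀xPx) only in the way the argument above exploits.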


1.3.5. Completeness.

Theorem (Completeness). Let Γ ∪ {A} be a set of formulas. Then the following propositions are equivalent.
(a) Γ ⊢ A.
(b) Γ ⊩ A, i.e., for all tree models T, nodes k and assignments η,

    T, k ⊩ Γ[η] → T, k ⊩ A[η].

Proof. Soundness already gives “(a) implies (b)”. For the other direction we employ a technique due to Harvey Friedman and construct a tree model T (over the set T01 of all finite 0-1-sequences) whose domain D is the set of all terms of the underlying language, with the property that Γ ⊢ B is equivalent to T, ⟨⟩ ⊩ B[id]. We can assume here that Γ and also A are closed.

In order to define T, we will need an enumeration A0, A1, A2, . . . of the underlying language L (assumed countable), in which every formula occurs infinitely often. We also fix an enumeration x0, x1, . . . of distinct variables. Since Γ is countable it can be written Γ = ⋃_n Γn with finite sets Γn such that Γn ⊆ Γn+1. With every node k ∈ T01, we associate a finite set ∆k of formulas and a set Vk of variables, by induction on the length of k.

Let ∆⟨⟩ := ∅ and V⟨⟩ := ∅. Take a node k such that lh(k) = n and suppose that ∆k, Vk are already defined. Write ∆ ⊢_n B to mean that there is a derivation of length ≤ n of B from ∆. We define ∆k0, Vk0 and ∆k1, Vk1 as follows:

Case 0. FV(An) ⊈ Vk. Then let

    ∆k0 := ∆k1 := ∆k and Vk0 := Vk1 := Vk.

Case 1. FV(An) ⊆ Vk and Γn, ∆k ⊬_n An. Let

    ∆k0 := ∆k and ∆k1 := ∆k ∪ {An}, Vk0 := Vk1 := Vk.

Case 2. FV(An) ⊆ Vk and Γn, ∆k ⊢_n An = A′n ∨ A′′n. Let

    ∆k0 := ∆k ∪ {An, A′n} and ∆k1 := ∆k ∪ {An, A′′n}, Vk0 := Vk1 := Vk.

Case 3. FV(An) ⊆ Vk and Γn, ∆k ⊢_n An = ∃xA′n(x). Let

    ∆k0 := ∆k1 := ∆k ∪ {An, A′n(xi)} and Vk0 := Vk1 := Vk ∪ {xi},

where xi is the first variable ∉ Vk.

Case 4. FV(An) ⊆ Vk and Γn, ∆k ⊢_n An, with An neither a disjunction nor an existentially quantified formula. Let

    ∆k0 := ∆k1 := ∆k ∪ {An} and Vk0 := Vk1 := Vk.

Obviously FV(∆k) ⊆ Vk, and k ⪯ k′ implies that ∆k ⊆ ∆k′. Notice also that, because ⊢ ∃x(⊥ → ⊥) and this formula is repeated infinitely often in the given enumeration, for every variable xi there is an m such that xi ∈ Vk for all k with lh(k) = m.

We note that

(1.7)    ∀k′⪰_n k (Γ, ∆k′ ⊢ B) → Γ, ∆k ⊢ B,    provided FV(B) ⊆ Vk.


It is sufficient to show that, for FV(B) ⊆ Vk,

    (Γ, ∆k0 ⊢ B) ∧ (Γ, ∆k1 ⊢ B) → (Γ, ∆k ⊢ B).

In cases 0, 1 and 4, this is obvious. For case 2, the claim follows immediately from the axiom schema ∨−. In case 3, we have FV(An) ⊆ Vk and Γn, ∆k ⊢_n An = ∃xA′n(x). Assume Γ, ∆k ∪ {An, A′n(xi)} ⊢ B with xi ∉ Vk and FV(B) ⊆ Vk. Then xi ∉ FV(∆k ∪ {An, B}), hence Γ, ∆k ∪ {An} ⊢ B by ∃− and therefore Γ, ∆k ⊢ B.

Next, we show

(1.8)    Γ, ∆k ⊢ B → ∃n ∀k′⪰_n k (B ∈ ∆k′),    provided FV(B) ⊆ Vk.

Choose n ≥ lh(k) such that B = An and Γn, ∆k ⊢_n An. For all k′ ⪰ k, if lh(k′) = n + 1 then An ∈ ∆k′ (cf. cases 2–4).

Using the sets ∆k we can define a tree model T as (Ter, I0, I1), where Ter denotes the set of terms of the underlying language, I0(f)(~s) := f~s and

    R^T(~s, k) = I1(R, k)(~s) := (R~s ∈ ∆k).

Obviously, t^T[id] = t for all terms t. Now write k ⊩ B for T, k ⊩ B[id]. We show:

Claim. Γ, ∆k ⊢ B ↔ k ⊩ B, provided FV(B) ⊆ Vk.

The proof is by induction on B.

Case R~s. Assume FV(R~s) ⊆ Vk. The following are equivalent:

    Γ, ∆k ⊢ R~s,
    ∃n ∀k′⪰_n k (R~s ∈ ∆k′)    by (1.8) and (1.7),
    ∃n ∀k′⪰_n k R^T(~s, k′)    by definition of T,
    k ⊩ R~s                    by definition of ⊩, since t^T[id] = t.

Case B ∨ C. Assume FV(B ∨ C) ⊆ Vk. For the implication → let Γ, ∆k ⊢ B ∨ C. Choose an n ≥ lh(k) such that Γn, ∆k ⊢_n An = B ∨ C. Then, for all k′ ⪰ k such that lh(k′) = n,

    ∆k′0 = ∆k′ ∪ {B ∨ C, B} and ∆k′1 = ∆k′ ∪ {B ∨ C, C},

and therefore by induction hypothesis

    k′0 ⊩ B and k′1 ⊩ C.

Then by definition we have k ⊩ B ∨ C. For the reverse implication ← argue as follows:

    k ⊩ B ∨ C,
    ∃n ∀k′⪰_n k (k′ ⊩ B ∨ k′ ⊩ C),
    ∃n ∀k′⪰_n k ((Γ, ∆k′ ⊢ B) ∨ (Γ, ∆k′ ⊢ C))    by induction hypothesis,
    ∃n ∀k′⪰_n k (Γ, ∆k′ ⊢ B ∨ C),
    Γ, ∆k ⊢ B ∨ C    by (1.7).

Case B ∧ C is evident.

Case B → C. Assume FV(B → C) ⊆ Vk. For → let Γ, ∆k ⊢ B → C. We must show k ⊩ B → C, i.e.,

    ∀k′⪰k (k′ ⊩ B → k′ ⊩ C).


Let k′ ⪰ k be such that k′ ⊩ B. By induction hypothesis, it follows that Γ, ∆k′ ⊢ B, and Γ, ∆k′ ⊢ C follows by assumption. Then again by induction hypothesis k′ ⊩ C.

For ← let k ⊩ B → C, i.e., ∀k′⪰k (k′ ⊩ B → k′ ⊩ C). We show that Γ, ∆k ⊢ B → C, using (1.7). Choose n ≥ lh(k) such that B = An. For all k′ ⪰_m k with m := n − lh(k) we show that Γ, ∆k′ ⊢ B → C.

If Γ, ∆k′ ⊢_n An, then k′ ⊩ B by induction hypothesis, and k′ ⊩ C by assumption. Hence Γ, ∆k′ ⊢ C again by induction hypothesis and thus Γ, ∆k′ ⊢ B → C.

If Γ, ∆k′ ⊬_n An, then by definition ∆k′1 = ∆k′ ∪ {B}. Hence Γ, ∆k′1 ⊢ B, and thus k′1 ⊩ B by induction hypothesis. Now k′1 ⊩ C by assumption, and finally Γ, ∆k′1 ⊢ C by induction hypothesis. From ∆k′1 = ∆k′ ∪ {B} it follows that Γ, ∆k′ ⊢ B → C.

Case ∀xB(x). Assume FV(∀xB(x)) ⊆ Vk. For → let Γ, ∆k ⊢ ∀xB(x). Fix a term t. Then Γ, ∆k ⊢ B(t). Choose n such that FV(B(t)) ⊆ Vk′ for all k′ ⪰_n k. Then ∀k′⪰_n k (Γ, ∆k′ ⊢ B(t)), hence ∀k′⪰_n k (k′ ⊩ B(t)) by induction hypothesis, hence k ⊩ B(t) by the covering lemma. This holds for every term t, hence k ⊩ ∀xB(x).

For ← assume k ⊩ ∀xB(x). Pick k′ ⪰_n k such that Am = ∃x(⊥ → ⊥), for m := lh(k) + n. Then at height m we put some xi into the variable sets: for k′ ⪰_n k we have xi ∉ Vk′ but xi ∈ Vk′j. Clearly k′j ⊩ B(xi), hence Γ, ∆k′j ⊢ B(xi) by induction hypothesis, hence (since at this height we consider the trivial formula ∃x(⊥ → ⊥)) also Γ, ∆k′ ⊢ B(xi). Since xi ∉ Vk′ we obtain Γ, ∆k′ ⊢ ∀xB(x). This holds for all k′ ⪰_n k, hence Γ, ∆k ⊢ ∀xB(x) by (1.7).

Case ∃xB(x). Assume FV(∃xB(x)) ⊆ Vk. For → let Γ, ∆k ⊢ ∃xB(x). Choose an n ≥ lh(k) such that Γn, ∆k ⊢_n An = ∃xB(x). Then, for all k′ ⪰ k with lh(k′) = n,

    ∆k′0 = ∆k′1 = ∆k′ ∪ {∃xB(x), B(xi)},

where xi ∉ Vk′. Hence by the induction hypothesis for B(xi) (applicable since FV(B(xi)) ⊆ Vk′j for j = 0, 1)

    k′0 ⊩ B(xi) and k′1 ⊩ B(xi).

It follows by definition that k ⊩ ∃xB(x).

For ← assume k ⊩ ∃xB(x). Then ∀k′⪰_n k ∃t∈Ter (k′ ⊩ B(x)[id^t_x]) for some n, hence ∀k′⪰_n k ∃t∈Ter (k′ ⊩ B(t)). For each of the finitely many k′ ⪰_n k pick an m such that ∀k′′⪰_m k′ (FV(B(tk′)) ⊆ Vk′′). Let m0 be the maximum of all these m. Then

    ∀k′′⪰_{m0+n} k ∃t∈Ter ((k′′ ⊩ B(t)) ∧ FV(B(t)) ⊆ Vk′′).

The induction hypothesis for B(t) yields

    ∀k′′⪰_{m0+n} k ∃t∈Ter (Γ, ∆k′′ ⊢ B(t)),
    ∀k′′⪰_{m0+n} k (Γ, ∆k′′ ⊢ ∃xB(x)),
    Γ, ∆k ⊢ ∃xB(x)    by (1.7),

and this completes the proof of the claim.

Now we can finish the proof of the completeness theorem by showing that (b) implies (a). We apply (b) to the tree model T constructed above from Γ,


the empty node ⟨⟩ and the assignment η = id. Then T, ⟨⟩ ⊩ Γ[id] by the claim (since each formula in Γ is derivable from Γ). Hence T, ⟨⟩ ⊩ A[id] by (b), and therefore Γ ⊢ A by the claim again.

Completeness of intuitionistic logic follows as a corollary.

Corollary. Let Γ ∪ {A} be a set of formulas. The following propositions are equivalent.
(a) Γ ⊢_i A.
(b) Γ, Efq ⊩ A, i.e., for all tree models T for intuitionistic logic, nodes k and assignments η,

    T, k ⊩ Γ[η] → T, k ⊩ A[η].

1.4. Soundness and Completeness of the Classical Fragment

We give a proof of completeness of classical logic relying on the completeness proof for minimal logic above.

1.4.1. Models. We define the notion of a (classical) model (or, more accurately, L-model), and what the value of a term and the meaning of a formula in a model should be. The latter definition is by induction on formulas, where in the quantifier case we need a quantifier in the definition.

For the rest of this section, fix a countable formal language L; we do not mention the dependence on L in the notation. Since we deal with classical logic, we only consider formulas built without ∨, ∃.

Definition. A model is a triple M = (D, I0, I1) such that
(a) D is a nonempty set;
(b) for every n-ary function symbol f, I0 assigns to f a map I0(f) : Dⁿ → D;
(c) for every n-ary relation symbol R, I1 assigns to R an n-ary relation I1(R) ⊆ Dⁿ. In case n = 0, I1(R) is either true or false. We require that I1(⊥) is false.

We write |M| for the carrier set D of M, and f^M, R^M for the interpretations I0(f), I1(R) of the function and relation symbols. Assignments η and their homomorphic extensions are defined as in 1.3.1. Again we write t^M[η] for η(t).

Definition (Validity). For every model M, assignment η in |M| and formula A such that FV(A) ⊆ dom(η) we define M |= A[η] (read: A is valid in M under the assignment η) by induction on A.

    M |= (R~s)[η]    := R^M(~s^M[η]),
    M |= (A → B)[η]  := ((M |= A[η]) → (M |= B[η])),
    M |= (A ∧ B)[η]  := ((M |= A[η]) ∧ (M |= B[η])),
    M |= (∀xA)[η]    := ∀a∈|M| (M |= A[η^a_x]).

Since I1(⊥) is false, we have M ⊭ ⊥[η].


1.4.2. Soundness of classical logic.

Lemma (Coincidence). Let M be a model, t a term, A a formula and η, ξ assignments in |M|.
(a) If η(x) = ξ(x) for all x ∈ vars(t), then η(t) = ξ(t).
(b) If η(x) = ξ(x) for all x ∈ FV(A), then M |= A[η] if and only if M |= A[ξ].

Proof. Induction on terms and formulas.

Lemma (Substitution). Let M be a model, t, r terms, A a formula and η an assignment in |M|. Then
(a) η(r(t)) = η^{η(t)}_x(r(x)).
(b) M |= A(t)[η] if and only if M |= A(x)[η^{η(t)}_x].

Proof. Induction on terms and formulas.

A model M is called classical if ¬¬R^M(~a) → R^M(~a) for all relation symbols R and all ~a ∈ |M|. We prove that every formula derivable in classical logic is valid in an arbitrary classical model.

Theorem (Soundness of classical logic). Let Γ ∪ {A} be a set of formulas such that Γ ⊢_c A. Then, if M is a classical model and η an assignment in |M|, it follows that M |= Γ[η] implies M |= A[η].

Proof. Induction on derivations. We begin with the axioms in Stab and the axiom schemes ∧+, ∧−. M |= C[η] is abbreviated M |= C when η is known from the context.

For the stability axiom ∀~x(¬¬R~x → R~x) the claim follows from our assumption that M is classical, i.e., ¬¬R^M(~a) → R^M(~a) for all ~a ∈ |M|. The axioms ∧+, ∧− are clearly valid.

This concludes the treatment of the axioms. We now consider the rules. In case of the assumption rule u : A we have A ∈ Γ and the claim is obvious.

Case →+. Assume M |= Γ. We show M |= (A → B). So assume in addition M |= A. We must show M |= B. By induction hypothesis (with Γ ∪ {A} instead of Γ) this clearly holds.

Case →−. Assume M |= Γ. We must show M |= B. By induction hypothesis, M |= (A → B) and M |= A. The claim follows from the definition of |=.

Case ∀+. Assume M |= Γ[η] and x ∉ FV(Γ). We show M |= (∀xA)[η], i.e., M |= A[η^a_x] for an arbitrary a ∈ |M|. We have

    M |= Γ[η^a_x]   by the coincidence lemma, since x ∉ FV(Γ),
    M |= A[η^a_x]   by induction hypothesis.

Case ∀−. Let M |= Γ[η]. We show that M |= A(t)[η]. This follows from

    M |= (∀xA(x))[η]        by induction hypothesis,
    M |= A(x)[η^{η(t)}_x]   by definition,
    M |= A(t)[η]            by the substitution lemma.

This concludes the proof.


1.4.3. Completeness of classical logic. We give a constructive analysis of the completeness of classical logic by using, in the metatheory below, constructively valid arguments only, mentioning explicitly any assumptions which go beyond. When dealing with the classical fragment we of course need to restrict to classical models. The only non-constructive principle will be the use of the axiom of dependent choice for the weak existential quantifier:

    ∃x A(0, x) → ∀n,x (A(n, x) → ∃y A(n + 1, y)) → ∃f ∀n A(n, fn).

Recall that we only consider formulas without ∨,∃.
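The dependent choice axiom has a natural computational reading: start from a witness for A(0, x) and iterate the step hypothesis to build the choice function f. A minimal Python sketch (function names are ours; only finite initial segments of f can be exhibited):

```python
# A computational reading of dependent choice: given x0 with A(0, x0) and
# a step producing, from any x with A(n, x), some y with A(n+1, y),
# iterate the step to obtain f with A(n, f(n)) for all n.

def dependent_choice(x0, step, bound):
    """Return the finite initial segment [f(0), ..., f(bound-1)] of the
    choice sequence (a full f would be the lazy infinite iteration)."""
    f, x = [], x0
    for n in range(bound):
        f.append(x)
        x = step(n, x)
    return f
```

For instance, with A(n, x) taken as "x = 2^n", the witness x0 = 1 and the step (n, x) ↦ 2x, the resulting sequence satisfies A(n, f(n)) at every stage.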

Theorem (Completeness of classical logic). Let Γ ∪ {A} be a set of formulas. Assume that, for all classical models M and assignments η,

    M |= Γ[η] → M |= A[η].

Then there must exist a derivation of A from Γ ∪ Stab.

Proof. Since “there must exist a derivation” expresses the weak existential quantifier in the metalanguage, we need to prove a contradiction from the assumption Γ, Stab ⊬ A.

By the completeness theorem for minimal logic, there must be a tree model T = (Ter, I0, I1) on the complete binary tree T01 and a node l0 such that l0 ⊩ Γ, Stab and l0 ⊮ A.

Call a node k consistent if k ⊮ ⊥, and stable if k ⊩ Stab. We prove

(1.9)    k ⊮ B → ∃k′⪰k (k′ ⊩ ¬B ∧ k′ ⊮ ⊥)    (k stable).

Let k be a stable node, and B a formula (without ∨, ∃). Then Stab ⊢ ¬¬B → B by the stability lemma, and therefore k ⊩ ¬¬B → B. Hence from k ⊮ B we obtain k ⊮ ¬¬B. By a remark in 1.3.4 this implies that ¬∀k′⪰k (k′ ⊩ ¬B → k′ ⊩ ⊥), which proves (1.9).

Let α be a branch in the underlying tree T01. We define

    α ⊩ A           := ∃k∈α (k ⊩ A),
    α is consistent := α ⊮ ⊥,
    α is stable     := ∃k∈α (k ⊩ Stab).

Note that from α ⊩ ~A and ⊢ ~A → B it follows that α ⊩ B. To see this, consider α ⊩ ~A. Then k ⊩ ~A for some k ∈ α, since α is linearly ordered. From ⊢ ~A → B it follows that k ⊩ B, i.e., α ⊩ B.

A branch α is generic (in the sense that it generates a classical model) if it is consistent and stable, if in addition for all formulas B

(1.10)    (α ⊩ B) ∨ (α ⊩ ¬B),

and if for all formulas ∀~y B(~y) with B(~y) not a universal formula,

(1.11)    ∀~s∈Ter (α ⊩ B(~s)) → α ⊩ ∀~y B(~y).

For a branch α, we define a classical model Mα = (Ter, I0, I^α_1) by

    I^α_1(R)(~s) := ∃k∈α I1(R, k)(~s)    (R ≠ ⊥).

Since ∃ is used in this definition, Mα is stable.


We show that for every generic branch α and formula B (without ∨, ∃)

(1.12)    α ⊩ B ↔ Mα |= B.

The proof is by induction on the logical complexity of B.

Case R~s with R ≠ ⊥. Then (1.12) holds for all α.

Case ⊥. We have α ⊮ ⊥ since α is consistent.

Case B → C. Let α ⊩ B → C and Mα |= B. We must show that Mα |= C. Note that α ⊩ B by induction hypothesis, hence α ⊩ C, hence Mα |= C again by induction hypothesis. Conversely let Mα |= B → C. Clearly (Mα |= B) ∨ (Mα ⊭ B). If Mα |= B, then Mα |= C. Hence α ⊩ C by induction hypothesis, and therefore α ⊩ B → C. If Mα ⊭ B then α ⊮ B by induction hypothesis. Hence α ⊩ ¬B by (1.10), and therefore α ⊩ B → C, since α is stable (and ⊢ (¬¬C → C) → ⊥ → C). [Note that for this argument to be constructively valid one needs to observe that the formula α ⊩ B → C is a negation, and therefore we can argue by the case distinction based on ∨. This is because, with P1 := (Mα |= B), P2 := (Mα ⊭ B) and Q := (α ⊩ B → C), the formula (P1 ∨ P2) → (P1 → Q) → (P2 → Q) → Q is derivable in minimal logic.]

Case B ∧ C. Easy.

Case ∀~y B(~y) (~y not empty), where B(~y) is not a universal formula. The following are equivalent:

    α ⊩ ∀~y B(~y),
    ∀~s∈Ter (α ⊩ B(~s))      by (1.11),
    ∀~s∈Ter (Mα |= B(~s))    by induction hypothesis,
    Mα |= ∀~y B(~y).

This concludes the proof of (1.12).

Next we show that for every consistent and stable node k there must be a generic branch containing k:

(1.13)    k ⊮ ⊥ → k ⊩ Stab → ∃α (α generic ∧ k ∈ α).

For the proof, let A0, A1, . . . enumerate all formulas. We define a sequence k = k0 ⪯ k1 ⪯ k2 ⪯ · · · of consistent stable nodes by dependent choice. Let k0 := k. Assume that kn is defined. We write An in the form ∀~y B(~y) (with ~y possibly empty), where B is not a universal formula. In case kn ⊩ ∀~y B(~y) let kn+1 := kn. Otherwise we have kn ⊮ B(~s) for some ~s, and by (1.9) there must be a consistent node k′ ⪰ kn such that k′ ⊩ ¬B(~s). Let kn+1 := k′. Since kn ⪯ kn+1, the node kn+1 is stable.

Let α := { l | ∃n (l ⪯ kn) }; hence k ∈ α. We show that α is generic. Clearly α is consistent and stable. We now prove both (1.10) and (1.11). Let C = ∀~y B(~y) (with ~y possibly empty), where B(~y) is not a universal formula, and choose n such that C = An. In case kn ⊩ ∀~y B(~y) we are done. Otherwise by construction kn+1 ⊩ ¬B(~s) for some ~s. For (1.10) we get kn+1 ⊩ ¬∀~y B(~y) since ⊢ ∀~y B(~y) → B(~s), and (1.11) follows from the consistency of α. This concludes the proof of (1.13).

Now we can finalize the completeness proof. Recall that l0 ⊩ Γ, Stab and l0 ⊮ A. Since l0 ⊮ A and l0 is stable, (1.9) yields a consistent node k ⪰ l0 such that k ⊩ ¬A. Evidently, k is stable as well. By (1.13) there


must be a generic branch α such that k ∈ α. Since k ⊩ ¬A it follows that α ⊩ ¬A, hence Mα |= ¬A by (1.12). Moreover, α ⊩ Γ, thus Mα |= Γ by (1.12). This contradicts our assumption.

1.4.4. Compactness and Löwenheim-Skolem theorems. Among the many corollaries of the completeness theorem, the compactness and Löwenheim-Skolem theorems stand out as particularly important. A set Γ of formulas is consistent if Γ ⊬_c ⊥, and satisfiable if there is (in the weak sense) a classical model M and an assignment η in |M| such that M |= Γ[η].

Corollary. Let Γ be a set of formulas.
(a) If Γ is consistent, then Γ is satisfiable.
(b) (Compactness). If each finite subset of Γ is satisfiable, then Γ is satisfiable.

Proof. (a). Assume Γ ⊬_c ⊥ and that for all classical models M we have M ⊭ Γ, i.e., M |= Γ implies M |= ⊥. Then the completeness theorem yields a contradiction.

(b). Otherwise, by the completeness theorem there must be a derivation of ⊥ from Γ ∪ Stab, hence also from Γ0 ∪ Stab for some finite subset Γ0 ⊆ Γ. This contradicts the assumption that Γ0 is satisfiable.

Corollary (Löwenheim and Skolem). Let Γ be a set of formulas (we assume that L is countable). If Γ is satisfiable, then Γ is satisfiable in a model with a countably infinite carrier set.

Proof. Assume that Γ is not satisfiable in a countable model. Then by the completeness theorem Γ ∪ Stab ⊢ ⊥. Therefore by the soundness theorem Γ cannot be satisfiable.

Of course one often wishes to incorporate equality into the formal language. One adds the equality axioms

x = x (reflexivity),

x = y → y = x (symmetry),

x = y → y = z → x = z (transitivity),

x1 = y1 → · · · → xn = yn → f(x1, . . . , xn) = f(y1, . . . , yn),

x1 = y1 → · · · → xn = yn → R(x1, . . . , xn)→ R(y1, . . . , yn).

Clearly they induce a congruence relation on any model. By “collapsing” the domain to congruence classes, any model becomes a “normal” model in which = is interpreted as identity. One thus obtains completeness, compactness etc. for theories with equality and their normal models.

1.5. Tait Calculus

In this section we deal with classical logic only and hence disregard the distinction between strong and weak existential quantifiers and disjunctions. In classical logic one has the de Morgan laws, and these allow any formula to be brought into negation normal form, i.e., built up from atoms or negated atoms by applying ∨, ∧, ∃, ∀. For such formulas Tait (1968) devised a deceptively simple calculus with just one rule for each symbol. However it depends crucially on the principle that finite sets of formulas Γ, ∆ etc. are derived. The rules of Tait's calculus are as follows where, in order to single out a particular formula from a finite set, the convention is that Γ, A denotes the finite set Γ ∪ {A}.

(Ax)   Γ, R(~t ), ¬R(~t )

(∨)    from Γ, A0, A1 derive Γ, A0 ∨ A1

(∧)    from Γ, A0 and Γ, A1 derive Γ, A0 ∧ A1

(∃)    from Γ, A(t) derive Γ, ∃xA(x)

(∀)    from Γ, A derive Γ, ∀xA

(Cut)  from Γ, C and Γ, ¬C derive Γ

where in the axioms R(~t ) is an atom, and in the (∀)-rule x is not free in Γ.

That this is an equivalent formulation of classical logic is easy to see. First notice that any finite set derivable as above is, when considered as a disjunction, valid in all classical models and therefore (by completeness) classically derivable. In the opposite direction, if Γ ⊢c A, then ¬Γ, A is derivable in the Tait calculus (where ¬Γ is the finite set consisting of the negation normal forms of ¬B for all B ∈ Γ). We treat some examples.

(→−). The →−-rule from assumptions Γ embeds into the Tait calculusas follows: from ¬Γ, A → B (which is equiderivable with ¬Γ,¬A,B) and¬Γ, A derive ¬Γ, B by (Cut), after first weakening ¬Γ, A to ¬Γ, A,B.

(→+). From ¬Γ, ¬A, B one obtains ¬Γ, ¬A ∨ B and hence ¬Γ, A → B.

(∀−). First note that the Tait calculus easily derives A, ¬A for any A.

From A(t),¬A(t) derive A(t),∃x¬A(x) by (∃). Hence from ¬Γ,∀xA(x) (andsome weakenings) we have ¬Γ, A(t) by (Cut).

(∀+) is given by the Tait (∀)-rule.

It is well known that from any derivation in the Tait calculus one can eliminate the (Cut) rule. Cut elimination plays a role analogous to normalization in natural deduction. We do not treat it here because it will appear in much more detail in Part 2, where cut elimination will be the principal tool in extracting bounds for existential theorems in a hierarchy of theories based on arithmetic. Of course normalization could be used instead, but the main point behind the use of the Tait calculus is that the natural dualities between ∃ and ∀, ∨ and ∧, simplify the reduction processes involved and reduce the number of cases to be considered.

1.6. Notes

Logic in natural deduction style was first considered by Kolmogorov (1925), Gentzen (1934) and Johansson (1937); Gentzen gave a particularly convincing exposition. The first proof of the existence of a normal form for arbitrary derivations in natural deduction is due to Prawitz (1965). He also considered permutative and simplification conversions. The proof presented in 1.2 uses the so-called SN-technique introduced by van Raamsdonk and Severi (1995), which was further developed and extended by Joachimski and Matthes (2003).


Tree models as used here were first introduced by Beth (1956), and are often called Beth models in the literature, for instance in Troelstra and van Dalen (1988).

Tait introduced his calculus in (1968), as a convenient refinement of the sequent calculus of Gentzen (1934). Due to its use of negation normal form it is particularly well suited for classical logic. The cut elimination theorem for his sequent calculus was proved by Gentzen (1934); for more recent expositions see Troelstra and van Dalen (1988), Troelstra and Schwichtenberg (2000), and Negri and von Plato (2001).


CHAPTER 2

Recursion Theory

In this chapter we develop the basics of recursive function theory, or, as it is more generally known, computability theory. Its history goes back to the seminal works of Turing, Kleene and others in the 1930's.

A computable function is one defined by a program whose operational semantics tell an idealized computer what to do to its storage locations as it proceeds deterministically from input to output, without any prior restrictions on storage space or computation time. We shall be concerned with various program-styles and the relationships between them, but the emphasis throughout will be on one underlying data-type, namely the natural numbers, since it is there that the most basic foundational connections between proof theory and computation are to be seen in their clearest light.

The two best-known models of machine computation are the Turing Machine and the (Unlimited) Register Machine of Shepherdson and Sturgis (1963). We base our development on the latter since it affords the quickest route to the results we want to establish.

2.1. Register Machines

2.1.1. Programs. A register machine stores natural numbers in registers denoted u, v, w, x, y, z, possibly with subscripts, and it responds step by step to a program consisting of an ordered list of basic instructions:

I0
I1
...
Ik−1

Each instruction has one of the following three forms, whose meanings are obvious:

Zero: x := 0,
Succ: x := x + 1,
Jump: [if x = y then In else Im].

The instructions are obeyed in order starting with I0, except when a conditional jump instruction is encountered, in which case the next instruction will be either In or Im according as the numerical contents of registers x and y are equal or not at that stage. The computation terminates when it runs out of instructions, that is, when the next instruction called for is Ik. Thus if a program of length k contains a jump instruction as above then it must satisfy the condition n, m ≤ k, and Ik means “halt”. Notice of course that some programs do not terminate, for example the following one-liner:

[if x = x then I0 else I1]


2.1.2. Program constructs. We develop some shorthand for building up standard sorts of programs.

Transfer. “x := y” is the program

x := 0
[if x = y then I4 else I2]
x := x + 1
[if x = x then I1 else I1],

which copies the contents of register y into register x.

Predecessor. The program “x := y −· 1” copies the modified predecessor

of y into x, and simultaneously copies y into z:

x := 0
z := 0
[if x = y then I8 else I3]
z := z + 1
[if z = y then I8 else I5]
z := z + 1
x := x + 1
[if z = y then I8 else I5].
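Constructs like these are easy to experiment with. The following is a minimal interpreter sketch (the tuple encoding of instructions is ours, not the book's); it runs the transfer and predecessor programs above, and returns None when a step limit is exhausted, since some programs never terminate:

```python
def run(program, regs, max_steps=10_000):
    """Run a register machine program given as a list of instructions
    ('zero', x), ('succ', x) or ('jump', x, y, n, m); `regs` maps register
    names to natural numbers.  Halting = running off the end of the list."""
    i = steps = 0
    while i < len(program):
        if steps == max_steps:
            return None            # give up: the program may not terminate
        steps += 1
        ins = program[i]
        if ins[0] == 'zero':
            regs[ins[1]] = 0
            i += 1
        elif ins[0] == 'succ':
            regs[ins[1]] += 1
            i += 1
        else:                      # conditional jump
            _, x, y, n, m = ins
            i = n if regs[x] == regs[y] else m
    return regs

# Transfer "x := y" (instructions I0..I3 as above):
transfer = [('zero', 'x'),
            ('jump', 'x', 'y', 4, 2),
            ('succ', 'x'),
            ('jump', 'x', 'x', 1, 1)]
print(run(transfer, {'x': 7, 'y': 3}))       # {'x': 3, 'y': 3}

# Predecessor "x := y -. 1" (instructions I0..I7 as above):
pred = [('zero', 'x'), ('zero', 'z'),
        ('jump', 'x', 'y', 8, 3),
        ('succ', 'z'),
        ('jump', 'z', 'y', 8, 5),
        ('succ', 'z'), ('succ', 'x'),
        ('jump', 'z', 'y', 8, 5)]
print(run(pred, {'x': 0, 'y': 5, 'z': 0}))   # {'x': 4, 'y': 5, 'z': 5}
```

Note that, as the text requires, the predecessor run leaves y with its original value and copies it into z.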

Composition. “P ; Q” is the program obtained by concatenating program P with program Q. However, in order to ensure that jump instructions in Q of the form “[if x = y then In else Im]” still operate properly within Q, they need to be re-numbered by changing the addresses n, m to k + n, k + m respectively, where k is the length of program P. Thus the effect of this program is to do P until it halts (if ever) and then do Q.

Conditional. “if x = y then P else Q fi” is the program

[if x = y then I1 else Ik+2]
  P
[if x = x then Ik+2+l else I2]
  Q

where k, l are the lengths of the programs P, Q respectively, and again their jump instructions must be appropriately renumbered by adding 1 to the addresses in P and k + 2 to the addresses in Q. Clearly if x = y then program P is obeyed and the next jump instruction automatically bypasses Q and halts. If x ≠ y then program Q is performed.

For Loop. “for i = 1 . . . x do P od” is the program

i := 0
[if x = i then Ik+4 else I2]
i := i + 1
  P
[if x = i then Ik+4 else I2]

where again, k is the length of program P and the jump instructions in P must be appropriately re-addressed by adding 3. The intention of this new program is that it should iterate the program P x times (do nothing if x = 0). This requires the restriction that the register x and the “local” counting-register i are not re-assigned new values inside P.


While Loop. “while x ≠ 0 do P od” is the program

y := 0
[if x = y then Ik+3 else I2]
  P
[if x = y then Ik+3 else I2]

where again, k is the length of program P and the jump instructions in P must be re-addressed by adding 2. This program keeps on doing P until (if ever) the register x becomes 0; it requires the restriction that the auxiliary register y is not re-assigned new values inside P.

2.1.3. Register machine computable functions. A register machine program P may have certain distinguished “input registers” and “output registers”. It may also use other “working registers” for scratchwork, and these will initially be set to zero. We write P(x1, . . . , xk; y) to signify that program P has input registers x1, . . . , xk and one output register y, which are distinct.

Definition. The program P(x1, . . . , xk; y) is said to compute the k-ary partial function ϕ : N^k → N if, starting with any numerical values n1, . . . , nk in the input registers, the program terminates with the number m in the output register if and only if ϕ(n1, . . . , nk) is defined with value m. In this case, the input registers hold their original values.

A function is register machine computable if there is some program which computes it.

Here are some examples.

Addition. “Add(x, y; z)” is the program

z := x ; for i = 1, . . . , y do z := z + 1 od

which adds the contents of registers x and y into register z.

Subtraction. “Subt(x, y; z)” is the program

z := x ; for i = 1, . . . , y do w := z −· 1 ; z := w od

which computes the modified subtraction function x −· y.

Bounded Sum. If P(x1, . . . , xk, w; y) computes the (k + 1)-ary function ϕ

then the program Q(x1, . . . , xk, z;x):

x := 0 ;
for i = 1, . . . , z do w := i −· 1 ; P(~x, w; y) ; v := x ; Add(v, y; x) od

computes the function

ψ(x1, . . . , xk, z) = ∑_{w<z} ϕ(x1, . . . , xk, w)

which will be undefined if for some w < z, ϕ(x1, . . . , xk, w) is undefined.

Multiplication. Deleting “w := i −· 1 ; P” from the last example gives a program Mult(z, y; x) which places the product of y and z into x.


Bounded Product. If in the bounded sum example the instruction x := x + 1 is inserted immediately after x := 0, and if Add(v, y; x) is replaced by Mult(v, y; x), then the resulting program computes the function

ψ(x1, . . . , xk, z) = ∏_{w<z} ϕ(x1, . . . , xk, w).

Composition. If Pj(x1, . . . , xk; yj) computes ϕj for each j = 1, . . . , n and if P0(y1, . . . , yn; y0) computes ϕ0, then the program Q(x1, . . . , xk; y0):

P1(x1, . . . , xk; y1) ; . . . ; Pn(x1, . . . , xk; yn) ; P0(y1, . . . , yn; y0)

computes the function

ψ(x1, . . . , xk) = ϕ0(ϕ1(x1, . . . , xk) , . . . , ϕn(x1, . . . , xk))

which will be undefined if any of the ϕ-subterms on the right hand side is undefined.

Unbounded Minimization. If P(x1, . . . , xk, y; z) computes ϕ then the program Q(x1, . . . , xk; z):

y := 0 ; z := 0 ; z := z + 1 ;
while z ≠ 0 do P(x1, . . . , xk, y; z) ; y := y + 1 od ;
z := y −· 1

computes the function

ψ(x1, . . . , xk) = µy(ϕ(x1, . . . , xk, y) = 0)

that is, the least number y such that ϕ(x1, . . . , xk, y′) is defined for every y′ ≤ y and ϕ(x1, . . . , xk, y) = 0.

2.2. Elementary Functions

2.2.1. Definition and simple properties. The elementary functions of Kalmár (1943) are those number-theoretic functions which can be defined explicitly by compositional terms built up from variables and the constants 0, 1 by repeated applications of addition +, modified subtraction −· , bounded sums and bounded products.

By omitting bounded products, one obtains the subelementary functions.

The examples in the previous section show that all elementary functions are computable and totally defined. Multiplication and exponentiation are elementary since

m · n = ∑_{i<n} m   and   m^n = ∏_{i<n} m,

and hence, by repeated composition, all exponential polynomials are elementary.

In addition the elementary functions are closed under definition by cases,

f(~n ) = { g0(~n )   if h(~n ) = 0,
        { g1(~n )   otherwise,

since f can be defined from g0, g1 and h by

f(~n ) = g0(~n ) · (1−· h(~n )) + g1(~n ) · (1−· (1−· h(~n ))).


Bounded Minimization.

f(~n,m) = µk<m(g(~n, k) = 0)

since f can be defined from g by

f(~n, m) = ∑_{i<m} (1 −· ∑_{k≤i} (1 −· g(~n, k))).

Note: this definition gives value m if there is no k < m such that g(~n, k) = 0. It shows that not only the elementary, but in fact the subelementary functions are closed under bounded minimization. Furthermore, we define µk≤m(g(~n, k) = 0) as µk<m+1(g(~n, k) = 0).
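These schemata translate directly into code. Below is a small sketch (the combinator names bsum and bmin are ours): bounded sum as a primitive, and bounded minimization defined from it by exactly the subelementary formula above:

```python
def monus(a, b):          # modified subtraction a -. b
    return max(a - b, 0)

def bsum(g):              # (~n, m) -> sum of g(~n, k) for k < m
    return lambda *args: sum(g(*args[:-1], k) for k in range(args[-1]))

def bmin(g):
    """mu k < m (g(~n,k) = 0), with value m if there is no such k,
    via f(~n,m) = sum_{i<m} (1 -. sum_{k<=i} (1 -. g(~n,k)))."""
    def f(*args):
        *n, m = args
        return sum(monus(1, sum(monus(1, g(*n, k)) for k in range(i + 1)))
                   for i in range(m))
    return f

g = lambda n, k: monus(n, k * k)       # zero exactly when k*k >= n
ceil_sqrt = bmin(g)                    # least k < m with k*k >= n
print(ceil_sqrt(10, 100))              # 4  (4*4 = 16 >= 10)
print(bmin(lambda n, k: 1)(10, 5))     # 5  (no zero below the bound)
print(bsum(lambda n, k: k)(0, 5))      # 10
```

Bounded product is entirely analogous, with math.prod in place of sum.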

Lemma.
(a) For every elementary function f : N^r → N there is a number k such that for all ~n = n1, . . . , nr,

f(~n ) < 2_k(max(~n )),

where 2_0(m) := m and 2_{k+1}(m) := 2^{2_k(m)}.
(b) Hence the function n ↦ 2_n(1) is not elementary.

Proof. (a). By induction on the build-up of the compositional term defining f. The result clearly holds if f is any one of the base functions:

f(~n ) = 0 or 1 or ni or ni + nj or ni −· nj .

If f is defined from g by application of bounded sum or product:

f(~n, m) = ∑_{i<m} g(~n, i)   or   ∏_{i<m} g(~n, i),

where g(~n, i) < 2_k(max(~n, i)), then we have

f(~n, m) ≤ 2_k(max(~n, m))^m < 2_{k+2}(max(~n, m)),

using n^n < (2^n)^n ≤ 2^{2^n}.

If f is defined from g0, g1, . . . , gl by composition:

f(~n ) = g0(g1(~n ), . . . , gl(~n ))

where for each j ≤ l we have gj(−) < 2_{kj}(max(−)), then with k = max_j kj,

f(~n ) < 2_k(2_k(max(~n ))) = 2_{2k}(max(~n )),

and this completes the first part.

(b). If 2_n(1) were an elementary function of n then by (a) there would be a positive k such that for all n,

2_n(1) < 2_k(n),

but then putting n = 2_k(1) yields 2_{2_k(1)}(1) < 2_k(2_k(1)) = 2_{2k}(1), a contradiction, since 2_k(1) ≥ 2k for positive k and hence the left hand side is at least 2_{2k}(1).


2.2.2. Elementary relations. A relation R on N^k is said to be elementary if its characteristic function

cR(~n ) = { 1   if R(~n ),
         { 0   otherwise,

is elementary. In particular, the “equality” and “less than” relations are elementary since their characteristic functions can be defined as follows:

c<(n,m) = 1−· (1−· (m−· n)), c=(n,m) = 1−· (c<(n,m) + c<(m,n)).
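These two characteristic functions can be checked mechanically; a quick sketch in code (names ours):

```python
def monus(a, b):                 # modified subtraction a -. b
    return max(a - b, 0)

def c_less(n, m):                # 1 -. (1 -. (m -. n))
    return monus(1, monus(1, monus(m, n)))

def c_eq(n, m):                  # 1 -. (c<(n,m) + c<(m,n))
    return monus(1, c_less(n, m) + c_less(m, n))

print([c_less(2, 5), c_less(5, 2), c_less(3, 3)])   # [1, 0, 0]
print([c_eq(4, 4), c_eq(4, 7)])                     # [1, 0]
```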

Furthermore if R is elementary then so is the function

f(~n,m) = µk<mR(~n, k)

since R(~n, k) is equivalent to 1−· cR(~n, k) = 0.

Lemma. The elementary relations are closed under applications of propositional connectives and bounded quantifiers.

Proof. For example, the characteristic function of ¬R is

1−· cR(~n ).

The characteristic function of R0 ∧R1 is

cR0(~n ) · cR1(~n ).

The characteristic function of ∀i<mR(~n, i) is

c=(m,µi<m(cR(~n, i) = 0)).

Examples. The above closure properties enable us to show that many “natural” functions and relations of number theory are elementary; thus

⌊n/m⌋ = µk<n(n < (k + 1)·m),

n mod m = n −· ⌊n/m⌋·m,

Prime(n) ↔ 1 < n ∧ ¬∃m<n(1 < m ∧ n mod m = 0),

pn = µm<2^{2^n}(Prime(m) ∧ n = ∑_{i<m} cPrime(i)),

so p0, p1, p2, . . . gives the enumeration of primes in increasing order. The estimate pn ≤ 2^{2^n} for the nth prime pn can be proved by induction on n: For n = 0 this is clear, and for n ≥ 1 we obtain

pn ≤ p0 p1 · · · pn−1 + 1 ≤ 2^{2^0} 2^{2^1} · · · 2^{2^{n−1}} + 1 = 2^{2^n − 1} + 1 < 2^{2^n}.
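The definition of pn can be transcribed directly, with the bounded search µm < 2^{2^n} written as a loop (a sketch; Python's % plays the role of mod, and a running count of primes replaces the inner bounded sum):

```python
def is_prime(n):
    """Prime(n) <-> 1 < n and no proper divisor m with 1 < m < n."""
    return 1 < n and not any(n % m == 0 for m in range(2, n))

def nth_prime(n):
    """p_n = mu m < 2^(2^n) (Prime(m) and n = sum_{i<m} c_Prime(i))."""
    bound = 2 ** (2 ** n)
    count = 0                      # number of primes below m so far
    for m in range(bound):
        if is_prime(m):
            if count == n:
                return m
            count += 1
    return bound                   # never reached, by the estimate p_n <= 2^(2^n)

print([nth_prime(n) for n in range(5)])                       # [2, 3, 5, 7, 11]
print(all(nth_prime(n) <= 2 ** (2 ** n) for n in range(5)))   # True
```

The search terminates quickly in practice because the nth prime is found long before the (enormous) bound is exhausted.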

2.2.3. The class E.

Definition. The class E consists of those number-theoretic functions which can be defined from the initial functions: constant 0, successor S, projections (onto the ith coordinate), addition +, modified subtraction −· , multiplication · and exponentiation 2^x, by applications of composition and bounded minimization.


The remarks above show immediately that the characteristic functions of the equality and less than relations lie in E, and that (by the proof of the lemma) the relations in E are closed under propositional connectives and bounded quantifiers.

Furthermore the above examples show that all the functions in the class E are elementary. We now prove the converse, which will be useful later.

Lemma. There are “pairing functions” π, π1, π2 in E with the following properties:
(a) π maps N × N bijectively onto N,
(b) π(a, b) + b + 2 ≤ (a + b + 1)^2 for a + b ≥ 1, hence π(a, b) < (a + b + 1)^2,
(c) π1(c), π2(c) ≤ c,
(d) π(π1(c), π2(c)) = c,
(e) π1(π(a, b)) = a,
(f) π2(π(a, b)) = b.

Proof. Enumerate the pairs of natural numbers as follows:

...
6 ...
3 7 ...
1 4 8 ...
0 2 5 9 ...

At position (0, b) we clearly have the sum of the lengths of the preceding diagonals, and along each diagonal a + b remains constant. Let π(a, b) be the number written at position (a, b). Then we have

π(a, b) = (∑_{i≤a+b} i) + a = ½(a + b)(a + b + 1) + a.

Clearly π : N × N → N is bijective. Moreover, a, b ≤ π(a, b), and in case π(a, b) ≠ 0 also a < π(a, b). Let

π1(c) := µx≤c∃y≤c(π(x, y) = c),

π2(c) := µy≤c∃x≤c(π(x, y) = c).

Then clearly πi(c) ≤ c for i ∈ {1, 2} and

π1(π(a, b)) = a, π2(π(a, b)) = b, π(π1(c), π2(c)) = c.

π, π1 and π2 are elementary by definition. For π(a, b) we have the estimate

π(a, b) + b+ 2 ≤ (a+ b+ 1)2 for a+ b ≥ 1.

This follows with n := a + b from

½ n(n + 1) + n + 2 ≤ (n + 1)^2 for n ≥ 1,

which is equivalent to n(n + 1) + 2(n + 1) ≤ 2((n + 1)^2 − 1), and hence to (n + 2)(n + 1) ≤ 2n(n + 2), which holds for n ≥ 1.

The proof shows that π, π1 and π2 are in fact subelementary.
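The pairing apparatus is easy to check by computation. In the sketch below (function names ours), π1 and π2 are implemented by exactly the bounded searches of the proof:

```python
def pair(a, b):                  # pi(a, b) = (a+b)(a+b+1)/2 + a
    return (a + b) * (a + b + 1) // 2 + a

def unpair1(c):                  # pi_1(c) = mu x <= c . exists y <= c . pi(x,y) = c
    return next(x for x in range(c + 1)
                if any(pair(x, y) == c for y in range(c + 1)))

def unpair2(c):                  # pi_2(c), symmetrically
    return next(y for y in range(c + 1)
                if any(pair(x, y) == c for x in range(c + 1)))

# properties (b)-(f) on a small range:
for a in range(10):
    for b in range(10):
        c = pair(a, b)
        assert unpair1(c) == a and unpair2(c) == b
        assert unpair1(c) <= c and unpair2(c) <= c
        if a + b >= 1:
            assert pair(a, b) + b + 2 <= (a + b + 1) ** 2

# bijectivity: the first six diagonals are enumerated by 0..20 without gaps
assert sorted(pair(a, b) for a in range(6) for b in range(6 - a)) == list(range(21))
print("pairing checks passed")
```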


Theorem (Gödel's β-function). There is in E a function β with the following property: For every sequence a0, . . . , an−1 < b of numbers less than b we can find a number c ≤ 4 · 4^{n(b+n+1)^4} such that β(c, i) = ai for all i < n.

Proof. Let

a := π(b, n) and d := ∏_{i<n} (1 + π(ai, i)·a!).

From a! and d we can, for each given i < n, reconstruct the number ai as the unique x < b such that

(1 + π(x, i)·a!) | d.

For clearly ai is such an x, and if some x < b were to satisfy the same condition, then because π(x, i) < a and the numbers 1 + k·a! are relatively prime for k ≤ a, we would have π(x, i) = π(aj, j) for some j < n. Hence x = aj and i = j, thus x = ai. Therefore

ai = µx<b ∃z<d((1 + π(x, i)·a!)·z = d).

We can now define Gödel's β-function as

β(c, i) := µx<π1(c)∃z<π2(c)((1 + π(x, i) · π1(c)) · z = π2(c)).

Clearly β is in E. Furthermore with c := π(a!, d) we see that β(c, i) = ai. It is then not difficult to estimate the given bound on c, using π(b, n) < (b + n + 1)^2.

The above definition of β shows that it is subelementary.
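The proof can be animated on a small example. In the sketch below (ours), the code c = π(a!, d) itself would be an astronomically large numeral, so we keep its two components a! and d separately; the bounded quantifier ∃z < d is replaced by the equivalent test that the factor exceeds 1 and divides d:

```python
from math import factorial, prod

def pair(a, b):                        # Cantor pairing pi(a, b)
    return (a + b) * (a + b + 1) // 2 + a

def beta_data(seq, b):
    """Encode a_0,...,a_{n-1} < b as the two components (a!, d) of the
    code c = pi(a!, d) from the proof."""
    a = pair(b, len(seq))
    af = factorial(a)
    d = prod(1 + pair(ai, i) * af for i, ai in enumerate(seq))
    return af, d

def beta(af, d, i, b):
    """mu x < b with exists z < d . (1 + pi(x,i) a!) z = d; the strict
    bound z < d amounts to: the factor divides d and is greater than 1."""
    for x in range(b):
        f = 1 + pair(x, i) * af
        if f > 1 and d % f == 0:
            return x
    return b                           # convention: the bound, if the search fails

seq, b = [3, 1, 4, 1, 5], 6
af, d = beta_data(seq, b)
print([beta(af, d, i, b) for i in range(len(seq))])   # [3, 1, 4, 1, 5]
```

Python's arbitrary-precision integers make the huge intermediate values a! and d unproblematic here.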

2.2.4. Closure properties of E.

Theorem. The class E is closed under limited recursion. Thus if g, h, k are given functions in E and f is defined from them according to the schema

f(~m, 0) = g(~m),

f(~m, n+ 1) = h(n, f(~m, n), ~m),

f(~m, n) ≤ k(~m, n),

then f is in E also.

Proof. Let f be defined from g, h and k in E by limited recursion as above. Using Gödel's β-function as in the last lemma we can find for any given ~m, n a number c such that β(c, i) = f(~m, i) for all i ≤ n. Let R(~m, n, c) be the relation

β(c, 0) = g(~m) ∧ ∀i<n(β(c, i+ 1) = h(i, β(c, i), ~m))

and note by the remarks above that its characteristic function is in E. It is clear, by induction, that if R(~m, n, c) holds then β(c, i) = f(~m, i) for all i ≤ n. Therefore we can define f explicitly by the equation

f(~m, n) = β(µcR(~m, n, c), n).

f will lie in E if µc can be bounded by an E function. However, the lemma on Gödel's β-function gives a bound 4 · 4^{(n+1)(b+n+2)^4}, where in this case b can be taken as the maximum of k(~m, i) for i ≤ n. But this can be defined


in E as k(~m, i0), where i0 = µi≤n∀j≤n(k(~m, j) ≤ k(~m, i)). Hence µc can be bounded by an E function.

Remark. Note that it is in this proof only that the exponential function is required, in providing a bound for µ.

Corollary. E is the class of all elementary functions.

Proof. It is sufficient merely to show that E is closed under bounded sums and bounded products. Suppose, for instance, that f is defined from g in E by bounded summation: f(~m, n) = ∑_{i<n} g(~m, i). Then f can be

defined by limited recursion, as follows

f(~m, 0) = 0

f(~m, n+ 1) = f(~m, n) + g(~m, n)

f(~m, n) ≤ n · max_{i<n} g(~m, i),

and the functions (including the bound) from which it is defined are in E. Thus f is in E by the theorem. If instead f is defined by bounded product, then proceed similarly.

2.2.5. Coding finite lists. Computation on lists is a practical necessity, so because we are basing everything here on the single data type N we must develop some means of “coding” finite lists or sequences of natural numbers into N itself. There are various ways to do this and we shall adopt one of the most traditional, based on the pairing functions π, π1, π2.

The empty sequence is coded by the number 0 and a sequence n0, n1,. . . , nk−1 is coded by the “sequence number”

〈n0, n1, . . . , nk−1〉 = π′(. . . π′(π′(0, n0), n1), . . . , nk−1)

with π′(a, b) := π(a, b) + 1, thus recursively,

〈〉 := 0,

〈n0, n1, . . . , nk〉 := π′(〈n0, n1, . . . , nk−1〉, nk).

Because of the bijectivity of π, every number a can be decoded uniquely as a sequence number a = 〈n0, n1, . . . , nk−1〉. If a is greater than zero, hd(a) := π2(a −· 1) is the “head” (i.e., rightmost element) and tl(a) := π1(a −· 1) is the “tail” of the list. The kth iterate of tl is denoted tl^(k), and since tl(a) is less than or equal to a, tl^(k)(a) is elementarily definable (by limited recursion). Thus we can define elementarily the “length” and “decoding” functions:

lh(a) := µk≤a(tl^(k)(a) = 0),

(a)i := hd(tl^(lh(a) −· (i+1))(a)).

Then if a = 〈n0, n1, . . . , nk−1〉 it is easy to check that

lh(a) = k and (a)i = ni for each i < k.

Furthermore (a)i = 0 when i ≥ lh(a). We shall write (a)i,j for ((a)i)j and (a)i,j,k for (((a)i)j)k. This elementary coding machinery will be used at various crucial points in the following.

Note that our previous remarks show that the functions lh(·) and (a)i are subelementary, and so is 〈n0, n1, . . . , nk−1〉 for each fixed k.
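The coding machinery is compact enough to transcribe. In this sketch (ours), unpair inverts π directly by locating the diagonal of its argument, rather than by the bounded searches used in the text:

```python
def pair(a, b):                        # pi(a, b)
    return (a + b) * (a + b + 1) // 2 + a

def unpair(c):
    """Inverse of pair: find the diagonal s = a + b, then read off a."""
    s = 0
    while (s + 1) * (s + 2) // 2 <= c:
        s += 1
    a = c - s * (s + 1) // 2
    return a, s - a

def code(ns):                          # <n0,...,nk-1> with pi'(a,b) = pi(a,b)+1
    a = 0
    for n in ns:
        a = pair(a, n) + 1
    return a

hd = lambda a: unpair(a - 1)[1]        # head: rightmost element
tl = lambda a: unpair(a - 1)[0]        # tail: code of the remaining list

def lh(a):                             # length: iterate tl until 0
    k = 0
    while a != 0:
        a, k = tl(a), k + 1
    return k

def at(a, i):                          # (a)_i
    for _ in range(lh(a) - (i + 1)):
        a = tl(a)
    return hd(a)

a = code([5, 0, 7])
print(lh(a), [at(a, i) for i in range(3)])   # 3 [5, 0, 7]
```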


Lemma (Estimate for sequence numbers). For k entries n,

(n + 1)·k ≤ 〈n, . . . , n〉 < (n + 1)^{2^k}.

Proof. We prove a slightly strengthened form of the second estimate:

〈n, . . . , n〉 + n + 1 ≤ (n + 1)^{2^k}   (k entries n),

by induction on k. For k = 0 the claim is clear. In the step k ↦ k + 1 we have (with k + 1 entries on the left and k entries inside π)

〈n, . . . , n〉 + n + 1 = π(〈n, . . . , n〉, n) + n + 2
                     ≤ (〈n, . . . , n〉 + n + 1)^2   by the lemma in 2.2.3
                     ≤ (n + 1)^{2^{k+1}}   by induction hypothesis.

For the first estimate the base case k = 0 is clear, and in the step we have (again with k + 1 entries on the left and k entries inside π)

〈n, . . . , n〉 = π(〈n, . . . , n〉, n) + 1
             ≥ 〈n, . . . , n〉 + n + 1
             ≥ (n + 1)(k + 1)   by induction hypothesis.

Concatenation of sequence numbers b ∗ a is defined thus:

b ∗ 〈〉 := b,

b ∗ 〈n0, n1, . . . , nk〉 := π(b ∗ 〈n0, n1, . . . , nk−1〉, nk) + 1.

To check that this operation is also elementary, define h(b, a, i) by recursion on i as follows.

h(b, a, 0) = b,

h(b, a, i+ 1) = π(h(b, a, i), (a)i) + 1

and note that since

h(b, a, i) = 〈(b)0, . . . , (b)_{lh(b) −· 1}, (a)0, . . . , (a)_{i −· 1}〉 for i ≤ lh(a),

it follows from the estimate above that h(b, a, i) ≤ (b + a)^{2^{lh(b)+i}}. Thus h is definable by limited recursion from elementary functions and hence is itself elementary. Finally

b ∗ a = h(b, a, lh(a)).

Lemma. The class E is closed under limited course-of-values recursion.Thus if h, k are given functions in E and f is defined from them accordingto the schema

f(~m, n) = h(n, 〈f(~m, 0), . . . , f(~m, n − 1)〉, ~m),

f(~m, n) ≤ k(~m, n)

then f is in E also.


Proof. f̄(~m, n) := 〈f(~m, 0), . . . , f(~m, n − 1)〉 is definable by

f̄(~m, 0) = 0,
f̄(~m, n + 1) = f̄(~m, n) ∗ 〈h(n, f̄(~m, n), ~m)〉,
f̄(~m, n) ≤ (∑_{i≤n} k(~m, i) + 1)^{2^n},

using the estimate 〈n, . . . , n〉 < (n + 1)^{2^k} for k entries. But f(~m, n) = (f̄(~m, n + 1))_n.

The next lemma gives closure of E under limited course-of-values recursion but with parameter substitution allowed. Here we are working at the extremity of elementary definability, but this generalized schema will be crucially important for the elementary arithmetization of syntax which is developed prior to Gödel's theorems in the next chapter (particularly in regard to the substitution function). Unfortunately this last closure property of E is rather complicated to state, because it requires notational details to do with iteration of parameter substitutions.

Lemma. The class E is closed under limited course-of-values recursion with parameter substitution. Suppose g, h, k, pi and ai (for i ≤ l) are all in E and let f be defined from them as follows.

f(m, n) = { g(m)                                                       if n = 0,
          { h(n, f(p0(m, n), a0(n)), . . . , f(pl(m, n), al(n)), m)    otherwise,

f(m, n) ≤ k(m, n),

where ai(n) < n when n > 0. Then f is also in E provided that the iterated parameter function p(σ, m, n) defined below is elementarily bounded.

For any sequence σ := 〈i0, i1, . . . , ir−1〉 of numbers ≤ l define n(σ) by: n(〈〉) := n, n(σ ∗ 〈i〉) := ai(n(σ)) if n(σ) ≠ 0 and := 0 otherwise. Then p(σ, m, n) is given by the course-of-values recursion:

p(〈〉, m, n) = m,

p(σ ∗ 〈i〉, m, n) = { pi(p(σ, m, n), n(σ))   if n(σ) ≠ 0,
                   { p(σ, m, n)             if n(σ) = 0.

Proof. First note that since p(σ, m, n) is defined by a course-of-values recursion and, by supposition, is elementarily bounded, it is itself in E by the last lemma. Similarly, n(σ) is elementary.

We code the computation of f(m, n) as a finitely branching tree of height ≤ n + 1, growing downwards. Nodes are sequence numbers σ = 〈i0, i1, . . . , ir−1〉 with ij ≤ l, and each such node is bounded in value by (l + 1)^{2^{n+1}}. To each node σ are attached the value of f at the current parameter substitution p(σ, m, n) and the current stage n(σ). Let Q(m, n, z) be the elementary relation expressing the fact that z correctly encodes the computation tree for f(m, n), with (z)σ being the correct value at current node σ. Thus Q(m, n, z) is the following condition, for all nodes σ ≤ (l + 1)^{2^{n+1}}: if n(σ) ≠ 0 then (z)σ = h(n(σ), (z)_{σ∗〈0〉}, . . . , (z)_{σ∗〈l〉}, p(σ, m, n)), and if n(σ) = 0 then (z)σ = g(p(σ, m, n)). Clearly Q is an elementary relation, and if


z is the least number such that Q(m, n, z) holds, then f(m, n) = (z)〈〉. Therefore f will be elementary if z can be bounded by an elementary function. This is now easy, because z = 〈(z)0, (z)1, . . . , (z)_{(l+1)^{2^{n+1}}}〉 where each (z)σ = f(p(σ, m, n), n(σ)) ≤ k(p(σ, m, n), n(σ)). Therefore

z ≤ (max{k(p(σ, m, n), n(σ)) | σ ≤ (l + 1)^{2^{n+1}}} + 1)^{2^{(l+1)^{2^{n+1}}}}

and this is elementary.

2.3. The Normal Form Theorem

2.3.1. Program numbers. The three types of register machine instructions I can be coded by “instruction numbers” ♯I as follows, where v0, v1, v2, . . . is a list of all variables used to denote registers:

If I is “vj := 0” then ♯I = 〈0, j〉.
If I is “vj := vj + 1” then ♯I = 〈1, j〉.
If I is “if vj = vl then Im else In” then ♯I = 〈2, j, l, m, n〉.

Clearly, using the sequence coding and decoding apparatus above, we can check elementarily whether or not a given number is an instruction number.

Any register machine program P = I0, I1, . . . , Ik−1 can then be coded by a “program number” or “index” ♯P thus:

♯P = 〈♯I0, ♯I1, . . . , ♯Ik−1〉,

and again (although it is tedious) we can elementarily check whether or not a given number is indeed of the form ♯P for some program P. Tradition has it that e is normally reserved as a variable over putative program numbers.

Standard program constructs such as those in 2.1 have associated “index-constructors”, i.e., functions which, given indices of the subprograms, produce an index for the constructed program. The point is that for standard program constructs the associated index-constructor functions are elementary. For example there is an elementary index-constructor comp such that, given programs P0, P1 with indices e0, e1, comp(e0, e1) is an index of the program P0 ; P1. A moment's thought should convince the reader that the appropriate definition of comp is as follows:

comp(e0, e1) = e0 ∗ 〈r(e0, e1, 0), r(e0, e1, 1), . . . , r(e0, e1, lh(e1) −· 1)〉

where

r(e0, e1, i) = { 〈2, (e1)i,1, (e1)i,2, (e1)i,3 + lh(e0), (e1)i,4 + lh(e0)〉   if (e1)i,0 = 2,
              { (e1)i                                                       otherwise

re-addresses the jump instructions in P1. Clearly r, and hence comp, are elementary functions.
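The index-constructor is easiest to see on decoded instruction sequences. The sketch below (ours) performs exactly the re-addressing r: jump targets in the second program are shifted by the length of the first:

```python
def comp(p0, p1):
    """Concatenate two register machine programs, re-addressing the jump
    targets of p1 by the length of p0, as the index-constructor comp does
    on program numbers.  Here programs are decoded instruction lists, with
    instructions ('zero', j), ('succ', j), ('jump', j, l, m, n)."""
    k = len(p0)
    def readdr(ins):
        if ins[0] == 'jump':           # the (e1)_{i,0} = 2 case
            _, j, l, m, n = ins
            return ('jump', j, l, m + k, n + k)
        return ins                     # other instructions are unchanged
    return p0 + [readdr(ins) for ins in p1]

# transfer "v0 := v1" followed by the always-looping one-liner:
p0 = [('zero', 0), ('jump', 0, 1, 4, 2), ('succ', 0), ('jump', 0, 0, 1, 1)]
p1 = [('jump', 0, 0, 0, 1)]
print(comp(p0, p1))                    # the jump of p1 now targets index 4
```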

Definition. Henceforth, ϕ_e^{(r)} denotes the partial function computed by the register machine program with program number e, operating on the input registers v1, . . . , vr and with output register v0. There is no loss of generality here, since the variables in any program can always be renamed so that v1, . . . , vr become the input registers and v0 the output. If e is not a program number, or it is but does not operate on the right variables, then we adopt the convention that ϕ_e^{(r)}(n1, . . . , nr) is undefined for all inputs n1, . . . , nr.

2.3.2. Normal form.

Theorem (Kleene's Normal Form). For each arity r there is an elementary function U and an elementary relation T such that, for all e and all inputs n1, . . . , nr,

(a) ϕ_e^{(r)}(n1, . . . , nr) is defined if and only if ∃s T(e, n1, . . . , nr, s),
(b) ϕ_e^{(r)}(n1, . . . , nr) = U(e, n1, . . . , nr, µs T(e, n1, . . . , nr, s)).

Proof. A computation of a register machine program P(v1, . . . , vr; v0) on numerical inputs ~n = n1, . . . , nr proceeds deterministically, step by step, each step corresponding to the execution of one instruction. Let e be its program number, and let v0, . . . , vl be all the registers used by P, including the “working registers”, so r ≤ l.

The “state” of the computation at step s is defined to be the sequence number

state(e, ~n, s) = 〈e, i, m0, m1, . . . , ml〉,

where m0, m1, . . . , ml are the values stored in the registers v0, v1, . . . , vl after step s is completed, and the next instruction to be performed is the ith one; thus (e)i is its instruction number.

The “state transition function” tr : N → N computes the “next state”. So suppose that x = 〈e, i, m0, m1, . . . , ml〉 is any putative state. Then in what follows, e = (x)0, i = (x)1, and mj = (x)j+2 for each j ≤ l. The definition of tr(x) is therefore as follows:

tr(x) = 〈e, i′, m′0, m′1, . . . , m′l〉

where
(i) If (e)i = 〈0, j〉 where j ≤ l, then i′ = i + 1, m′j = 0, and all other registers remain unchanged, i.e., m′k = mk for k ≠ j.
(ii) If (e)i = 〈1, j〉 where j ≤ l, then i′ = i + 1, m′j = mj + 1, and all other registers remain unchanged.
(iii) If (e)i = 〈2, j0, j1, i0, i1〉 where j0, j1 ≤ l and i0, i1 ≤ lh(e), then i′ = i0 or i′ = i1 according as mj0 = mj1 or not, and all registers remain unchanged, i.e., m′j = mj for all j ≤ l.
(iv) Otherwise, i.e., if x is not a sequence number, or if e is not a program number, or if it refers to a register vk with l < k, or if lh(e) ≤ i, then tr(x) simply repeats the same state x, so i′ = i and m′j = mj for every j ≤ l.

Clearly tr is an elementary function, since it is defined by elementarily decidable cases, with (a great deal of) elementary decoding and re-coding involved in each case.

Consequently, the “state function” state(e, ~n, s) is also elementary, because it can be defined by iterating the transition function by limited recursion on s as follows:

state(e, ~n, 0) = 〈e, 0, n1, . . . , nr, 0, . . . , 0〉,
state(e, ~n, s + 1) = tr(state(e, ~n, s)),


state(e, ~n, s) ≤ h(e, ~n, s),

where for the bounding function h we can take

h(e, ~n, s) = 〈e, e〉 ∗ 〈max(~n ) + s, . . . , max(~n ) + s〉.

This is because the maximum value of any register at step s cannot be greater than max(~n ) + s. Now this expression clearly is elementary, since 〈m, . . . , m〉 with i occurrences of m is definable by a limited recursion with bound (m + i)^{2^i}, as is easily seen by induction on i.

Now recall that if program P has program number e then the computation terminates when instruction Ilh(e) is encountered. Thus we can define the “termination relation” T(e, ~n, s), meaning “the computation terminates at step s”, by

T(e, ~n, s) := ((state(e, ~n, s))1 = lh(e)).

Clearly T is elementary and

ϕ(r)e (~n ) is defined ↔ ∃s T(e, ~n, s).

The output on termination is the value of register v0, so if we define the “output function” U(e, ~n, s) by

U(e, ~n, s) := (state(e, ~n, s))2

then U is also elementary and

ϕ(r)e (~n ) = U(e, ~n, µsT (e, ~n, s)).
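To make the machinery above concrete, here is a small Python sketch of a register-machine simulator in the same shape as tr, state, T and U. It is our own illustration: states are plain pairs (i, regs) and programs are lists of instruction tuples, rather than the Gödel-numbered sequences 〈e, i, m0, . . . , ml〉 used in the text.

```python
# A toy register-machine simulator mirroring the structure of tr, state, T and U.

def tr(prog, st):
    """One step of the transition function: returns the next state."""
    i, regs = st
    if i >= len(prog):               # case (iv): halted (or ill-formed) states repeat
        return st
    ins = prog[i]
    regs = list(regs)
    if ins[0] == 0:                  # 〈0, j〉 : zero register j
        regs[ins[1]] = 0
        return (i + 1, regs)
    if ins[0] == 1:                  # 〈1, j〉 : increment register j
        regs[ins[1]] += 1
        return (i + 1, regs)
    j0, j1, i0, i1 = ins[1:]         # 〈2, j0, j1, i0, i1〉 : branch on equality
    return (i0 if regs[j0] == regs[j1] else i1, regs)

def state(prog, inputs, s, nregs):
    st = (0, list(inputs) + [0] * (nregs - len(inputs)))
    for _ in range(s):               # iterate tr, as in the limited recursion on s
        st = tr(prog, st)
    return st

def T(prog, inputs, s, nregs):       # termination: next instruction is I_lh(e)
    return state(prog, inputs, s, nregs)[0] == len(prog)

def U(prog, inputs, s, nregs):       # output: contents of register v0
    return state(prog, inputs, s, nregs)[1][0]

# Example: v0 := v0 + v1, by incrementing v0 and a counter v2 until v2 = v1.
ADD = [(2, 2, 1, 4, 1),  # I0: if v2 = v1 then goto I4 (halt) else goto I1
       (1, 0),           # I1: v0 := v0 + 1
       (1, 2),           # I2: v2 := v2 + 1
       (2, 0, 0, 0, 0)]  # I3: goto I0 (unconditional jump)

s = next(s for s in range(100) if T(ADD, [3, 2], s, 3))   # µs T(e, ~n, s)
print(U(ADD, [3, 2], s, 3))   # → 5
```

The search for the least s with T true is exactly the µ-operator appearing in the normal form.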

2.3.3. Σ⁰₁-definable relations and µ-recursive functions. A relation R of arity r is said to be Σ⁰₁-definable if there is an elementary relation E, say of arity r + l, such that for all ~n = n1, . . . , nr,

R(~n ) ↔ ∃k1 . . . ∃kl E(~n, k1, . . . , kl).

A partial function ϕ is said to be Σ⁰₁-definable if its graph

{ (~n, m) | ϕ(~n ) is defined and = m }

is Σ⁰₁-definable.
To say that a non-empty relation R is Σ⁰₁-definable is equivalent to saying that the set of all sequences 〈~n〉 satisfying R can be enumerated (possibly with repetitions) by some elementary function f : N → N. Such relations are called elementarily enumerable. For choose any fixed sequence 〈a1, . . . , ar〉 satisfying R and define

f(m) = 〈(m)1, . . . , (m)r〉  if E((m)1, . . . , (m)r+l),
f(m) = 〈a1, . . . , ar〉  otherwise.

Conversely, if R is elementarily enumerated by f then

R(~n ) ↔ ∃m (f(m) = 〈~n〉)

is a Σ⁰₁-definition of R.
The µ-recursive functions are those (partial) functions which can be defined from the initial functions (constant 0, successor S, projections onto the i-th coordinate, addition +, modified subtraction −· and multiplication ·) by applications of composition and unbounded minimization. Note that it is through unbounded minimization that partial functions may arise.
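Unbounded minimization can be sketched in a few lines of Python (an illustration under our own naming; mu and isqrt are not from the text). The while-loop makes the source of partiality plain: if no witness exists, the search never stops.

```python
# The µ operator: search m = 0, 1, 2, ... for the least m with p(m) = 0.

def mu(p):
    m = 0
    while p(m) != 0:   # diverges if no such m exists -- this is where partiality arises
        m += 1
    return m

# Example: the (total) integer square root as a µ-recursive definition:
# isqrt(n) = µm ((m + 1)² > n), with 0 encoding "true".
def isqrt(n):
    return mu(lambda m: 0 if (m + 1) ** 2 > n else 1)

print(isqrt(10))  # → 3
```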


Lemma. Every elementary function is µ-recursive.

Proof. By simply removing the bounds on µ in the lemmas in 2.2.3 one obtains µ-recursive definitions of the pairing functions π, π1, π2 and of Gödel’s β-function. Then by removing all mention of bounds from the Theorem in 2.2.4 one sees that the µ-recursive functions are closed under (unlimited) primitive recursive definitions: f(~m, 0) = g(~m), f(~m, n + 1) = h(n, f(~m, n)). Thus one can µ-recursively define bounded sums and bounded products, and hence all elementary functions.

2.3.4. Computable functions.

Definition. The while-programs are those programs which can be built up from assignment statements x := 0, x := y, x := y + 1, x := y −· 1, by Conditionals, Composition, For-Loops and While-Loops as in 2.1 (on program constructs).

Theorem. The following are equivalent:
(a) ϕ is register machine computable,
(b) ϕ is Σ⁰₁-definable,
(c) ϕ is µ-recursive,
(d) ϕ is computable by a while-program.

Proof. The Normal Form Theorem shows immediately that every register machine computable function ϕ(r)e is Σ⁰₁-definable, since

ϕ(r)e (~n ) = m ↔ ∃s (T(e, ~n, s) ∧ U(e, ~n, s) = m)

and the relation T(e, ~n, s) ∧ U(e, ~n, s) = m is clearly elementary. If ϕ is Σ⁰₁-definable, say

ϕ(~n ) = m ↔ ∃k1 . . . ∃kl E(~n, m, k1, . . . , kl),

then ϕ can be defined µ-recursively by

ϕ(~n ) = ( µm E(~n, (m)0, (m)1, . . . , (m)l) )0 ,

using the fact (above) that elementary functions are µ-recursive. The examples of computable functionals in 2.1 show how the definition of any µ-recursive function translates automatically into a while-program. Finally, 2.1 shows how to implement any while-program on a register machine.

Henceforth computable means “register machine computable” or any of its equivalents.

Corollary. The function ϕ(r)e (n1, . . . , nr) is a computable partial function of the r + 1 variables e, n1, . . . , nr.

Proof. Immediate from the Normal Form.

Lemma. A relation R is computable if and only if both R and its complement N^n \ R are Σ⁰₁-definable.

Proof. We can assume that both R and N^n \ R are not empty, and (for simplicity) also n = 1.
⇒. By the theorem above every computable relation is Σ⁰₁-definable, and with R clearly its complement is computable.


⇐. Let f, g ∈ E enumerate R and N \R, respectively. Then

h(n) := µi(f(i) = n ∨ g(i) = n)

is a total µ-recursive function, and R(n)↔ f(h(n)) = n.
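The ⇐ direction of this proof is easy to run. In the sketch below (our own toy instance) R is the set of even numbers, f and g are elementary enumerations of R and its complement, and h and the final test follow the proof verbatim.

```python
# Deciding R from enumerations of R and its complement:
# h(n) = µi (f(i) = n ∨ g(i) = n) is total, and n ∈ R iff f(h(n)) = n.

def f(i): return 2 * i          # enumerates R = {0, 2, 4, ...}
def g(i): return 2 * i + 1      # enumerates the complement {1, 3, 5, ...}

def h(n):
    i = 0
    while f(i) != n and g(i) != n:   # terminates: every n is hit by f or g
        i += 1
    return i

def R(n):                        # the decision procedure from the proof
    return f(h(n)) == n

print(R(6), R(7))  # → True False
```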

2.3.5. Undecidability of the halting problem. The above corollary says that there is a single “universal” program which, given numbers e and ~n, computes ϕ(r)e (~n ) if it is defined. However, we cannot decide in advance whether or not it will be defined. There is no program which, given e and ~n, computes the total function

h(e, ~n ) = 1  if ϕ(r)e (~n ) is defined,
h(e, ~n ) = 0  if ϕ(r)e (~n ) is undefined.

For suppose there were such a program. Then the function

ψ(~n ) = µm(h(n1, ~n ) = 0)

would be computable, say with fixed program number e0, and therefore

ϕ(r)e0 (~n ) = 0  if h(n1, ~n ) = 0,
ϕ(r)e0 (~n ) undefined  if h(n1, ~n ) = 1.

But then fixing n1 = e0 gives:

ϕ(r)e0 (~n ) defined ↔ h(e0, ~n ) = 0 ↔ ϕ(r)e0 (~n ) undefined,

a contradiction. Hence the relation R(e, ~n ), which holds if and only if ϕ(r)e (~n ) is defined, is not recursive. It is however Σ⁰₁-definable.

There are numerous attempts to classify total computable functions according to the complexity of their termination proofs.

2.4. Recursive Definitions

2.4.1. Least fixed points of recursive definitions. By a recursive definition of a partial function ϕ of arity r from given partial functions ψ1, . . . , ψm of fixed but unspecified arities, we mean a defining equation of the form

ϕ(n1, . . . , nr) = t(ψ1, . . . , ψm, ϕ; n1, . . . , nr)

where t is any compositional term built up from the numerical variables ~n = n1, . . . , nr and the constant 0 by repeated applications of the successor and predecessor functions, the given functions ψ1, . . . , ψm, the function ϕ itself, and the “definition by cases” function:

dc(x, y, u, v) = u  if x, y are both defined and equal,
dc(x, y, u, v) = v  if x, y are both defined and unequal,
dc(x, y, u, v) undefined  otherwise.

There may be many partial functions ϕ satisfying such a recursive definition, but the one we wish to single out is the least defined one, i.e., the one whose defined values arise inevitably by lazy evaluation of the term t “from the outside in”, making only those function calls which are absolutely necessary. This presupposes that each of the functions from which t is constructed already comes equipped with an evaluation strategy. In particular


if a subterm dc(t1, t2, t3, t4) is called then it is to be evaluated according to the program construct:

x := t1 ; y := t2 ; [if x = y then t3 else t4].

Some of the function calls demanded by the term t may be for further values of ϕ itself, and these must be evaluated by repeated unravellings of t (in other words, by recursion).

This “least solution” ϕ will be referred to as the function defined by that recursive definition, or its least fixed point. Its existence and its computability are guaranteed by Kleene’s Recursion Theorem below.

2.4.2. The principles of finite support and monotonicity, and the effective index property. Suppose we are given any fixed partial functions ψ1, . . . , ψm and ψ, of the appropriate arities, and fixed inputs ~n. If the term t = t(ψ1, . . . , ψm, ψ; ~n ) evaluates to a defined value k then the following principles clearly hold:

Finite Support Principle. Only finitely many values of ψ1, . . . , ψm and ψ are used in that evaluation of t.

Monotonicity Principle. The same value k will be obtained no matter how the partial functions ψ1, . . . , ψm and ψ are extended.

Note also that any such term t satisfies the
Effective Index Property. There is an elementary function f such that if ψ1, . . . , ψm and ψ are computable partial functions with program numbers e1, . . . , em and e respectively, then according to the lazy evaluation strategy just described,

t(ψ1, . . . , ψm, ψ; ~n )

defines a computable function of ~n with program number f(e1, . . . , em, e).
The proof of the Effective Index Property is by induction over the build-up of the term t. The base case is where t is just one of the constants 0, 1 or a variable nj, in which case it defines either a constant function ~n ↦ 0 or ~n ↦ 1, or a projection function ~n ↦ nj. Each of these is trivially computable with a fixed program number, and it is this program number we take as the value of f(e1, . . . , em, e). Since in this case f is a constant function, it is clearly elementary. The induction step is where t is built up by applying one of the given functions: successor, predecessor, definition by cases or ψ (with or without a subscript) to previously constructed subterms ti(ψ1, . . . , ψm, ψ; ~n ), i = 1, . . . , l, thus:

t = χ(t1, . . . , tl).

Inductively we can assume that for each i = 1, . . . , l, ti defines a partial function of ~n = n1, . . . , nr which is register machine computable by some program Pi with program number given by an already-constructed elementary function fi = fi(e1, . . . , em, e). Therefore if χ is computed by a program Q with program number e′, we can put P1, . . . , Pl and Q together to construct a new program obeying the evaluation strategy for t. Furthermore, by the remark on index-constructions in 2.3.1, we will be able to compute its program number f(e1, . . . , em, e) from the given numbers f1, . . . , fl and e′, by some elementary function.


2.4.3. Recursion Theorem.

Theorem (Kleene’s Recursion Theorem). For given partial functions ψ1, . . . , ψm, every recursive definition

ϕ(~n ) = t(ψ1, . . . , ψm, ϕ;~n )

has a least fixed point, i.e., a least defined solution, ϕ. Moreover, if ψ1, . . . , ψm are computable, so is the least fixed point ϕ.

Proof. Let ψ1, . . . , ψm be fixed partial functions of the appropriate arities. Let Φ be the functional from partial functions of arity r to partial functions of arity r defined by lazy evaluation of the term t as described above:

Φ(ψ)(~n ) = t(ψ1, . . . , ψm, ψ; ~n ).

Let ϕ0, ϕ1, ϕ2, . . . be the sequence of partial functions of arity r generated by Φ thus: ϕ0 is the completely undefined function, and ϕi+1 = Φ(ϕi) for each i. Then by induction on i, using the Monotonicity Principle above, we see that each ϕi is a subfunction of ϕi+1. That is, whenever ϕi(~n ) is defined with a value k then ϕi+1(~n ) is defined with that same value. Since their defined values are consistent with one another we can therefore construct the “union” ϕ of the ϕi’s as follows:

ϕ(~n ) = k ↔ ∃i (ϕi(~n ) = k).

(i) This ϕ is then the required least fixed point of the recursive definition.
To see that it is a fixed point, i.e., ϕ = Φ(ϕ), first suppose ϕ(~n ) is defined with value k. Then by the definition of ϕ just given, there is an i > 0 such that ϕi(~n ) is defined with value k. But ϕi = Φ(ϕi−1), so Φ(ϕi−1)(~n ) is defined with value k. Therefore by the Monotonicity Principle for Φ, since ϕi−1 is a subfunction of ϕ, Φ(ϕ)(~n ) is defined with value k. Hence ϕ is a subfunction of Φ(ϕ).

It remains to show the converse, that Φ(ϕ) is a subfunction of ϕ. So suppose Φ(ϕ)(~n ) is defined with value k. Then by the Finite Support Principle, only finitely many defined values of ϕ are called for in this evaluation. By the definition of ϕ there must be some i such that ϕi already supplies all of these required values, and so already at stage i we have Φ(ϕi)(~n ) = ϕi+1(~n ) defined with value k. Since ϕi+1 is a subfunction of ϕ it follows that ϕ(~n ) is defined with value k. Hence Φ(ϕ) is a subfunction of ϕ.

To see that ϕ is the least such fixed point, suppose ϕ′ is any fixed point of Φ. Then Φ(ϕ′) = ϕ′, so by the Monotonicity Principle, since ϕ0 is a subfunction of ϕ′ it follows that Φ(ϕ0) = ϕ1 is a subfunction of Φ(ϕ′) = ϕ′. Then again by Monotonicity, Φ(ϕ1) = ϕ2 is a subfunction of Φ(ϕ′) = ϕ′, et cetera, so that for each i, ϕi is a subfunction of ϕ′. Since ϕ is the union of the ϕi’s it follows that ϕ itself is a subfunction of ϕ′. Hence ϕ is the least fixed point of Φ.

(ii) Finally we have to show that ϕ is computable if the given functions ψ1, . . . , ψm are. For this we need the Effective Index Property of the term t, which supplies an elementary function f such that if ψ is computable with program number e then Φ(ψ) is computable with program number f(e) = f(e1, . . . , em, e). Thus if u is any fixed program number for the completely undefined function of arity r, then f(u) is a program number for ϕ1 = Φ(ϕ0), f^2(u) = f(f(u)) is a program number for ϕ2 = Φ(ϕ1), and in general f^i(u) is a program number for ϕi. Therefore in the notation of the Normal Form Theorem,

ϕi(~n ) = ϕ(r)f^i(u)(~n )

and by the corollary (in 2.3.4) to the Normal Form Theorem, this is a computable function of i and ~n, since f^i(u) is a computable function of i, definable (informally) say by a for-loop of the form “for j = 1 . . . i do f od”. Therefore by the earlier equivalences, ϕi(~n ) is a Σ⁰₁-definable function of i and ~n, and hence so is ϕ itself because

ϕ(~n ) = m ↔ ∃i (ϕi(~n ) = m).

So ϕ is computable and this completes the proof.
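The stages ϕ0, ϕ1, ϕ2, . . . of this proof can be watched directly by representing partial functions as finite dictionaries. The recursive definition below is our own example, ϕ(n) = dc(n, 0, 1, n · ϕ(n − 1)), whose least fixed point is the factorial.

```python
# Approximating the least fixed point: ϕ0 is the empty function, ϕ(i+1) = Φ(ϕi).

def Phi(phi, N=8):
    """Apply the functional Φ once: evaluate the defining term where possible."""
    new = {}
    for n in range(N):
        if n == 0:
            new[n] = 1               # dc(n, 0, 1, ...) with n = 0
        elif n - 1 in phi:           # the call ϕ(n−1) is defined at this stage
            new[n] = n * phi[n - 1]
    return new

phi = {}                              # ϕ0: the completely undefined function
stages = [phi]
for _ in range(5):
    phi = Phi(phi)
    stages.append(phi)

print([sorted(s) for s in stages[:4]])  # domains grow: [[], [0], [0, 1], [0, 1, 2]]
print(stages[5][4])                     # → 24: ϕ5 already supplies ϕ(4) = 4!
```

Each stage is a subfunction of the next, exactly as the Monotonicity Principle predicts.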

Note. The above proof works equally well if ϕ is a vector-valued function. In other words if, instead of defining a single partial function ϕ, the recursive definition in fact defines a finite list ~ϕ of such functions simultaneously. For example, the individual components of the machine state of any register machine at step s are clearly defined by a simultaneous recursive definition, from zero and successor.

2.4.4. Recursive programs and partial recursive functions. A recursive program is a finite sequence of possibly simultaneous recursive definitions:

~ϕ0(n1, . . . , nr0) = t0(~ϕ0; n1, . . . , nr0)
~ϕ1(n1, . . . , nr1) = t1(~ϕ0, ~ϕ1; n1, . . . , nr1)
~ϕ2(n1, . . . , nr2) = t2(~ϕ0, ~ϕ1, ~ϕ2; n1, . . . , nr2)
...
~ϕk(n1, . . . , nrk) = tk(~ϕ0, . . . , ~ϕk−1, ~ϕk; n1, . . . , nrk).

A partial function is said to be partial recursive if it is one of the functions defined by some recursive program as above. A partial recursive function which happens to be totally defined is called simply a recursive function.

Theorem. A function is partial recursive if and only if it is computable.

Proof. The Recursion Theorem tells us immediately that every partial recursive function is computable. For the converse we use the equivalence of computability with µ-recursiveness already established in 2.3.4. Thus we need only show how to translate any µ-recursive definition into a recursive program:

The constant 0 function is defined by the recursive program

ϕ(~n ) = 0

and similarly for the constant 1 function.
The addition function ϕ(m, n) = m + n is defined by the recursive program

ϕ(m, n) = dc(n, 0, m, ϕ(m, n −· 1) + 1)

and the subtraction function ϕ(m, n) = m −· n is defined similarly, but with the successor function +1 replaced by the predecessor −· 1. Multiplication is defined recursively from addition in much the same way. Note that in each case the right hand side of the recursive definition is an allowed term.

The composition schema is a recursive definition as it stands.
Finally, given a recursive program defining ψ, if we add to it the recursive definition

ϕ(~n, m) = dc(ψ(~n, m), 0, m, ϕ(~n, m + 1))

followed by

ϕ′(~n ) = ϕ(~n, 0)

then the computation of ϕ′(~n ) proceeds as follows:

ϕ′(~n ) = ϕ(~n, 0)
        = ϕ(~n, 1)  if ψ(~n, 0) ≠ 0
        = ϕ(~n, 2)  if ψ(~n, 1) ≠ 0
        ...
        = ϕ(~n, m)  if ψ(~n, m − 1) ≠ 0
        = m  if ψ(~n, m) = 0.

Thus the recursive program for ϕ′ defines unbounded minimization:

ϕ′(~n ) = µm (ψ(~n,m) = 0).
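A Python rendering makes the role of lazy evaluation visible: if dc evaluated both branches eagerly, the recursive call ϕ(~n, m + 1) would be unfolded even when ψ(~n, m) = 0 and the program would never return. Below (our own sketch) the two branches are passed as thunks, so only the needed one is evaluated.

```python
# dc with lazy branches: u and v are thunks, evaluated only on demand.

def dc(x, y, u, v):
    return u() if x == y else v()

def make_phi(psi):
    # ϕ(n, m) = dc(ψ(n, m), 0, m, ϕ(n, m + 1)): search upward from m for a zero of ψ.
    def phi(n, m):
        return dc(psi(n, m), 0, lambda: m, lambda: phi(n, m + 1))
    return phi

# Illustration: ψ(n, m) = 0 iff m² ≥ n, so ϕ(n, 0) = µm (m² ≥ n).
psi = lambda n, m: 0 if m * m >= n else 1
phi = make_phi(psi)
print(phi(10, 0))  # → 4, the least m with m² ≥ 10
```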

2.5. Primitive Recursion and For-Loops

2.5.1. Primitive recursive functions. A primitive recursive program over N is a recursive program in which each recursive definition is of one of the following five special kinds:

(Z)     fi(n) = 0,
(S)     fi(n) = n + 1,
(U^k_j) fi(n1, . . . , nk) = nj,
(C^k_r) fi(n1, . . . , nk) = fi0(fi1(n1, . . . , nk), . . . , fir(n1, . . . , nk)),
(PR)    fi(n1, . . . , nk, 0) = fi0(n1, . . . , nk),
        fi(n1, . . . , nk, m + 1) = fi1(n1, . . . , nk, m, fi(n1, . . . , nk, m)),

where, in (C) and (PR), i0, i1, . . . , ir < i. Recall that functions are allowed to be 0-ary, so k may be 0. Note that the two equations in the (PR) schema can easily be combined into one recursive definition using the dc and −· functions. The reason for using f rather than ϕ to denote the functions in such a program is that they are obviously totally defined (we try to maintain the convention that f, g, h, . . . denote total functions).

Definition. The primitive recursive functions are those which are definable by primitive recursive programs. The class of all primitive recursive functions is denoted “Prim”.

Lemma (Explicit Definitions). If t is a term built up from numerical constants, variables n1, . . . , nk and function symbols f1, . . . , fm denoting previously defined primitive recursive functions, then the function f defined from them by

f(n1, . . . , nk) = t(f1, . . . , fm; n1, . . . , nk)

is also primitive recursive.

Proof. By induction over the generation of the term t.
If t is a constant l then, using the (Z), (S) and (U) schemas,

f(n1, . . . , nk) = (S S . . . S Z U^k_1)(n1, . . . , nk)

(with l applications of S). If t is one of the variables nj then, using the (U^k_j) schema,

f(n1, . . . , nk) = nj.

If t is an applicative term fi(t1, . . . , tr) then, by the (C^k_r) schema,

f(n1, . . . , nk) = fi(t1(n1, . . . , nk), . . . , tr(n1, . . . , nk)).

Lemma. Every elementary function is primitive recursive, but not conversely.

Proof. Addition f(n, m) = n + m is defined from successor by the primitive recursion

f(n, 0) = n,  f(n, m + 1) = f(n, m) + 1,

and modified subtraction f(n, m) = n −· m is defined similarly, replacing +1 by −· 1. Note that the predecessor −· 1 is definable by a trivial primitive recursion:

f(0) = 0,  f(m + 1) = m.

The bounded sum f(~n, m) = Σ_{i<m} g(~n, i) is definable from + by another primitive recursion:

f(~n, 0) = 0,  f(~n, m + 1) = f(~n, m) + g(~n, m).

Multiplication is then defined explicitly by a bounded sum, and bounded product by a further primitive recursion. The above lemma then gives closure under all explicit definitions using these principles. Hence every elementary function is primitive recursive.

We have already seen that the function n ↦ 2ₙ(1) is not elementary. However, it can be defined primitive recursively from the (elementary) exponential function thus:

2₀(1) = 1,  2ₙ₊₁(1) = 2^(2ₙ(1)).
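The (PR) schema itself is easy to render as a higher-order combinator, and the chain of definitions in the proof (addition, then multiplication as a bounded sum) can then be replayed. The names PR, add and mul are ours, not the text's.

```python
# The (PR) schema as a combinator: f(ns, 0) = g(ns), f(ns, m+1) = h(ns, m, f(ns, m)).

def PR(g, h):
    def f(ns, m):
        acc = g(ns)
        for i in range(m):          # the for-loop view of primitive recursion
            acc = h(ns, i, acc)
        return acc
    return f

# Addition from successor: f(n, 0) = n, f(n, m+1) = f(n, m) + 1.
add = PR(lambda ns: ns[0], lambda ns, i, acc: acc + 1)

# Multiplication as a bounded sum of the constant n: f(n, 0) = 0, f(n, m+1) = f(n, m) + n.
mul = PR(lambda ns: 0, lambda ns, i, acc: add((acc,), ns[0]))

print(add((3,), 4), mul((3,), 4))  # → 7 12
```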

2.5.2. Loop-Programs. The loop-programs over N are built up from
• assignments x := 0, x := x + 1, x := y, x := y −· 1, using
• compositions . . . ; . . . ,
• conditionals if x = y then . . . else . . . fi, and
• for-loops for i = 1 . . . y do . . . od,
where i is not reset between do and od.

Lemma. Every primitive recursive function is computable by a loop-program.


Proof. Composition corresponds to “;” and primitive recursion

f(~n, 0) = g(~n ), f(~n,m+ 1) = h(~n,m, f(~n,m))

can be recast as a for-loop (with input variables ~x, y and output variable z) thus:

z := g(~x ); for i = 1 . . . y do z := h(~x, i− 1, z) od.

We now describe the operational semantics of loop-programs. Each loop-program P on “free variables” ~x = x1, . . . , xk (i.e., those not “bound” by for-loops) can be considered as a “state-transformer” function from N^k to N^k, and we write P(~n ) to denote the output state (n′1, . . . , n′k) which results after applying program P to input (n1, . . . , nk). Note that loop-programs always terminate! The definition of P(~n ) runs as follows, according to the form of the program P:

Assignments. For example if P is “xi := xj −· 1” then

P (n1, . . . , ni, . . . , nk) = (n1, . . . , nj −· 1, . . . , nk).

Composition. If P is “Q ; R” then

P(~n ) = (R ∘ Q)(~n ).

Conditionals. If P is “if xi = xj then Q else R fi” then

P(~n ) = Q(~n )  if ni = nj,
P(~n ) = R(~n )  if ni ≠ nj.

For-loops. If P is “for i = 1 . . . xj do Q(i, ~x ) od” then P is defined by P(n1, . . . , nj, . . . , nk) = Q∗(nj, n1, . . . , nj, . . . , nk), with Q∗ defined by primitive recursion on i thus:

Q∗(0, n1, . . . , nj, . . . , nk) = (n1, . . . , nj, . . . , nk),
Q∗(i + 1, n1, . . . , nj, . . . , nk) = Q(i + 1, Q∗(i, n1, . . . , nj, . . . , nk)).

Note that the above description actually gives P as a primitive recursive function from N^k to N^k, and not from N^k to N as the formal definition of primitive recursion requires. However, this is immaterial when working over N, because we can work with “coded” sequences 〈~n〉 ∈ N instead of vectors (~n ) ∈ N^k so as to define

P (n1, . . . , nk) = 〈n′1, . . . , n′k〉.

The coding and decoding can all be done elementarily, so for any loop-program P the output function P(~n ) will always be primitive recursive. We therefore have:

Theorem. The primitive recursive functions are exactly those computed by loop-programs.

2.5.3. Reduction to primitive recursion. Various somewhat more general kinds of recursion can be transformed into ordinary primitive recursion. Two important examples are:


Course-of-values recursion. A trivial example is the Fibonacci function

f(0) = 1,  f(1) = 2,  f(n + 2) = f(n) + f(n + 1),

which calls for several “previous” values (in this case two) in order to compute the “next” value. This is not formally a primitive recursion, but it can be transformed into one, because it can be computed by the for-loop (with x, y as input and output variables):

y := 1 ; z := 1 ; for i = 1 . . . x do u := y ; y := y + z ; z := u od.
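Transcribed into Python (our own illustration), the loop and the recursion can be checked against each other directly; y and z carry the two previous values, which is what turns course-of-values recursion into ordinary iteration.

```python
# The recursion f(0)=1, f(1)=2, f(n+2)=f(n)+f(n+1), computed two ways.

def fib_rec(n):
    return 1 if n == 0 else 2 if n == 1 else fib_rec(n - 2) + fib_rec(n - 1)

def fib_loop(x):
    y, z = 1, 1
    for i in range(x):   # for i = 1 ... x do u := y ; y := y + z ; z := u od
        y, z = y + z, y
    return y

print([fib_loop(n) for n in range(7)])  # → [1, 2, 3, 5, 8, 13, 21]
```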

Recursion with parameter substitution. This has the form:

f(n, 0) = g(n),
f(n, m + 1) = h(n, m, f(p(n, m), m)).

Again this is not formally a primitive recursion as it stands, but it can be transformed to the following primitive recursive program:

(PR)  q(n, m, 0) = n,
      q(n, m, i + 1) = p(q(n, m, i), m −· (i + 1)),
(C)   g′(n, m) = g(q(n, m, m)),
(C)   h′(n, m, i, j) = h(q(n, m, m −· (i + 1)), i, j),
(PR)  f′(n, m, 0) = g′(n, m),
      f′(n, m, i + 1) = h′(n, m, i, f′(n, m, i)),
(C)   f(n, m) = f′(n, m, m).

We leave it as an exercise to check that this program defines the correct function f.
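One way to do the exercise numerically: pick arbitrary sample functions g, h, p (the choices below are ours, purely for testing), implement both the direct recursion and the transformed program, and compare them on a grid of inputs.

```python
# A numerical check of the parameter-substitution transformation.

def g(n): return n + 1
def h(n, m, j): return n + 2 * m + j
def p(n, m): return n + m
def msub(a, b): return max(a - b, 0)          # truncated subtraction −·

def f_direct(n, m):                            # f(n, 0) = g(n); f(n, m+1) = h(n, m, f(p(n, m), m))
    return g(n) if m == 0 else h(n, m - 1, f_direct(p(n, m - 1), m - 1))

def q(n, m, i):                                # q unravels the parameter substitutions
    return n if i == 0 else p(q(n, m, i - 1), msub(m, i))
def g1(n, m): return g(q(n, m, m))
def h1(n, m, i, j): return h(q(n, m, msub(m, i + 1)), i, j)
def f1(n, m, i):
    return g1(n, m) if i == 0 else h1(n, m, i - 1, f1(n, m, i - 1))
def f_transformed(n, m): return f1(n, m, m)

print(all(f_direct(n, m) == f_transformed(n, m)
          for n in range(6) for m in range(6)))  # → True
```

The invariant behind the check is f′(n, m, i) = f(q(n, m, m −· i), i), proved by induction on i.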

2.5.4. A complexity hierarchy for Prim. Given a register machine program I0, I1, . . . , Im, . . . , Ik−1 where, for example, Im is a jump instruction “if xp = xq then Ir else Is fi”, and given numerical inputs in the registers ~x, the ensuing computation as far as step y can be performed by a single for-loop as follows, where j counts the “next instruction” to be obeyed:

j := 0 ;
for i = 1 . . . y do
  if j = 0 then I0 ; j := 1 else
  if j = 1 then I1 ; j := 2 else
  . . .
  if j = m then if xp = xq then j := r else j := s fi else
  . . .
  fi . . . fi fi
od.

Definition. Lk consists of all loop-programs which contain nested for-loops with maximum depth of nesting k. Thus L0-programs are loop-free, and Lk+1-programs only contain for-loops of the form for i = 1 . . . y do P od where P is an Lj-program for some j ≤ k.


Definition. A bounding function for a loop-program P is an increasing function BP : N → N (that is, BP(n) ≥ n) such that for all n ∈ N we have

BP(n) ≥ n + max_{~i≤n} #P(~i)

where #P(~i) denotes the number of steps executed by P when called with input ~i. Note that BP(n) will also bound the size of the output for any input ~i ≤ n, since at most 1 can be added to any register at any step.

With each loop-program there is a naturally associated bounding function as follows:

P = assignment:                    BP(n) = n + 1,
P = if xi = xj then Q else R fi:   BP(n) = max(BQ(n), BR(n)) + 1,
P = Q ; R:                         BP(n) = BR(BQ(n)),
P = for i = 1 . . . xk do Q od:    BP(n) = BQ^n(n),

where BQ^n denotes the n-times iterate of BQ.

It is obvious that the defined BP is a bounding function when P is an assignment or a conditional. When P is a composed program P = Q ; R then, given any input ~i ≤ n, let s := #Q(~i). Then n + s ≤ BQ(n), and so the output ~j of the computation of Q on ~i is also ≤ BQ(n). Now let s′ := #R(~j). Then BR(BQ(n)) ≥ BQ(n) + s′ ≥ n + s + s′. Hence BR(BQ(n)) ≥ n + max_{~i≤n} #P(~i), and therefore BR ∘ BQ is an appropriate bounding function for P. Finally, if P is a for-loop as indicated, then for any input ~i ≤ n the computation simply composes Q a certain number of times, say k, where k ≤ n. Therefore, by what we have just done for composition, BQ^n(n) ≥ BQ^k(n) ≥ n + #P(~i). Again this justifies our choice of bounding functions for for-loops.

Definition. The sequence F0, F1, . . . , Fk, . . . of Prim functions is given by

F0(n) = n + 1,  Fk+1(n) = Fk^n(n).

Definition. For each increasing function g : N → N let Comp(g) denote the class of all total functions f : N^r → N which can be computed by register machines in such a way that on (all but finitely many) inputs ~n, the number of steps required to compute f(~n ) is bounded by g(max(~n )).

Theorem. For each k ≥ 1 we have

Lk-computable = ⋃_i Comp(Fk^i)

and hence

Prim = ⋃_k Comp(Fk).

Proof. The second part follows immediately from the first, since for all n ≥ i, Fk^i(n) ≤ Fk^n(n) = Fk+1(n).

To prove the left-to-right containment of the first part, proceed by induction on k ≥ 0 to show that for every Lk-program P there is a fixed i such that BP ≤ Fk^i, where BP is the bounding function associated with P as above. It then follows that the function computed by P lies in Comp(BP), which is contained in Comp(Fk^i). The basis of the induction is trivial, since L0-programs terminate in a constant number of steps i, so that BP(n) = n + i = F0^i(n). For the induction step the crucial case is where P is an Lk+1-program of the form for j = 1 . . . xm do Q od with Q ∈ Lk. By the induction hypothesis there is an i such that BQ ≤ Fk^i and hence, using F1(n) = 2n ≤ Fk+1(n), we have

BP(n) = BQ^n(n) ≤ Fk^(in)(n) ≤ Fk+1(in) ≤ Fk+1(2^(i−1) · n) ≤ Fk+1^i(n)

as required.
For the right-to-left containment, suppose f ∈ Comp(Fk^i) for some fixed i and k. Then there is a register machine which computes f(~n ) within Fk^i(max(~n )) steps. Now Fk is defined by k successive iterations (nested for-loops) starting with F0 = succ. So Fk is Lk-computable and (by composing i times) so is Fk^i. Therefore if k ≥ 1 we can compute f(~n ) by an Lk-program:

x := max(~n ) ; y := Fk^i(x) ; compute y steps in the computation of f

since, as we have already noted, an L1-program suffices to perform any predetermined number of steps of a register machine program. This completes the proof.

Corollary. The “Ackermann-Péter function” F : N² → N defined by

F(k, n) = Fk(n)

is not primitive recursive.

Proof. Since every loop-program has one of the Fk^i as a bounding function, it follows that every Prim function f is dominated by some Fk^i, and therefore for all n ≥ max(k + 1, i) we have

f(n) < Fk^i(n) ≤ Fk^n(n) = Fk+1(n) = F(k + 1, n) ≤ F(n, n).

Thus the binary function F cannot be primitive recursive, for otherwise we could take f(n) = F(n, n) and obtain a contradiction.
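For small arguments the definition can be evaluated directly; the following sketch (our own code) also confirms the closed forms F1(n) = 2n and F2(n) = n · 2^n used in the proofs above.

```python
# The Ackermann-Péter function from F0(n) = n + 1, F(k+1)(n) = Fk^n(n).

def F(k, n):
    if k == 0:
        return n + 1
    m = n
    for _ in range(n):       # apply F(k-1) exactly n times, starting from n
        m = F(k - 1, m)
    return m

print([F(1, n) for n in range(5)])   # → [0, 2, 4, 6, 8], i.e. F1(n) = 2n
print([F(2, n) for n in range(5)])   # → [0, 2, 8, 24, 64], i.e. F2(n) = n * 2^n
print(F(3, 2))                       # → 2048, which is F2(F2(2)) = F2(8)
```

Already F(3, n) grows out of reach for n ≥ 3, which is the point of the corollary.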

Corollary. The elementary functions are just those definable by L2-programs, since

Elem = ⋃_i Comp(F2^i)

where F2(n) = n · 2^n.

Proof. It is very easy to see that the elementary functions (like the primitive recursive ones) form an “honest” class, in the sense that every elementary function is computable within a number of steps bounded by some (other) elementary function, and hence by some iterated exponential, and hence by F2^i for some i. Conversely, if f ∈ Comp(F2^i) then by the Normal Form Theorem there is a program number e such that for all ~n,

f(~n ) = U(e, ~n, µs T(e, ~n, s))

and furthermore the number of computation steps µs T(e, ~n, s) is bounded elementarily by F2^i(max(~n )). Thus the unbounded minimization is in this case replaced by an elementarily bounded minimization, and since U and T are both elementary, so therefore is f.


2.6. The Arithmetical Hierarchy

The goal of this section is to give a classification of the relations definable by arithmetical formulas. We have already made a step in this direction when we discussed the Σ⁰₁-definable relations.
As a preparatory step we prove the Substitution Lemma and, as its corollary, the Fixed Point Lemma, also known as Kleene’s Second Recursion Theorem.

2.6.1. Kleene’s Second Recursion Theorem.

Lemma (Substitution Lemma). There is a binary elementary function S such that

ϕ(q+1)e (m, ~n ) = ϕ(q)S(e,m)(~n ).

Proof. The details are left as an exercise; we only describe the basic idea here. To construct S(e, m) we view e as the code of a register machine program computing a (q + 1)-ary function ϕ. Then S(e, m) is to be a code of a register machine program computing the q-ary function obtained from ϕ by fixing its first argument to be m. So the program coded by S(e, m) should work as follows: shift all inputs one register to the right, and write m in the first register; then compute as prescribed by e.

Theorem (Fixed Point Lemma, or Kleene’s Second Recursion Theorem). Fix an arity q. Then for every e we can find an e0 such that for all ~n = n1, . . . , nq,

ϕ(q)e0 (~n ) = ϕ(q+1)e (e0, ~n ).

Proof. Let h be such that ϕh(m, ~n ) = ϕe(S(m, m), ~n ), and let e0 := S(h, h). Then by the Substitution Lemma

ϕe0(~n ) = ϕS(h,h)(~n ) = ϕh(h, ~n ) = ϕe(S(h, h), ~n ) = ϕe(e0, ~n ).

2.6.2. Characterization of Σ⁰₁-definable and recursive relations. We now give a useful characterization of the Σ⁰₁-definable relations, which will lead us to the arithmetical hierarchy. Let

W(q)e := { ~n | ∃s T(e, ~n, s) }.

The Σ⁰₁-definable relations are also called recursively enumerable (r.e.) relations.

Lemma. (a) The W(q)e enumerate, for e = 0, 1, 2, . . . , the q-ary Σ⁰₁-definable relations.
(b) For fixed arity q, W(q)e (~n ) as a relation of e, ~n is Σ⁰₁-definable, but not recursive.

Proof. (a) If R = W(q)e , then R is Σ⁰₁-definable by definition. For the converse assume that R is Σ⁰₁-definable, i.e., that there is an elementary relation E, say of arity q + r, such that for all ~n = n1, . . . , nq,

R(~n ) ≡ ∃k1 . . . ∃kr E(~n, k1, . . . , kr).

Then clearly R is the domain of the partial recursive function ϕ given by the following µ-recursive definition:

ϕ(~n ) = µm [lh(m) = r ∧ E(~n, (m)0, (m)1, . . . , (m)r−1)].


For ϕ = ϕe we have, by the Normal Form Theorem, R(~n ) ≡ ∃s T(e, ~n, s).
(b) It suffices to show that We(~n ) is not recursive. So assume it were. Then we could pick e0 such that

We0(e, ~n ) ≡ ¬We(e, ~n );

for e = e0 we obtain a contradiction.

From the Substitution Lemma above we can immediately infer

W(q+1)e (m, ~n ) ≡ W(q)S(e,m)(~n );

this fact is sometimes called the Substitution Lemma for Σ⁰₁-definable relations.

Note. We have already seen in 2.3.4 that a relation R is recursive if and only if both R and its complement ¬R are Σ⁰₁-definable.

2.6.3. Arithmetical relations. A relation R of arity q is said to be arithmetical if there is an elementary relation E, say of arity q + r, such that for all ~n = n1, . . . , nq,

R(~n ) ≡ (Q1)k1 . . . (Qr)kr E(~n, k1, . . . , kr)  with Qi ∈ {∀, ∃}.

Note that we may assume that the quantifiers Qi are alternating, since, e.g.,

∀n ∀m R(n, m) ≡ ∀k R((k)0, (k)1).
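The decoding (k)0, (k)1 can be realized by any elementary pairing; the sketch below uses the inverse of Cantor's pairing function as a stand-in (our choice, not the coding fixed earlier in the text), and checks that every pair of numbers is hit exactly once.

```python
# Contracting two like quantifiers into one: k ranges over codes of pairs.

def unpair(k):
    """Inverse of Cantor's pairing: enumerate pairs diagonal by diagonal."""
    w = 0
    while (w + 1) * (w + 2) // 2 <= k:
        w += 1                        # w = index of the diagonal containing k
    j = k - w * (w + 1) // 2
    return w - j, j                   # ((k)0, (k)1)

# As k runs through N, ((k)0, (k)1) runs through all of N x N without repetition:
seen = {unpair(k) for k in range(15)}
print(sorted(seen)[:5])  # → [(0, 0), (0, 1), (0, 2), (0, 3), (0, 4)]
```

So ∀n ∀m R(n, m) holds exactly when ∀k R(unpair(k)) does, which is the equivalence above.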

A relation R of arity q is said to be Σ⁰ᵣ-definable if there is an elementary relation E such that for all ~n,

R(~n ) ≡ ∃k1 ∀k2 . . . Qkr E(~n, k1, . . . , kr)

with Q = ∀ if r is even and Q = ∃ if r is odd. Similarly, a relation R of arity q is said to be Π⁰ᵣ-definable if there is an elementary relation E such that for all ~n,

R(~n ) ≡ ∀k1 ∃k2 . . . Qkr E(~n, k1, . . . , kr)

with Q = ∃ if r is even and Q = ∀ if r is odd. A relation R is said to be ∆⁰ᵣ-definable if it is Σ⁰ᵣ-definable as well as Π⁰ᵣ-definable.

A partial function ϕ is said to be arithmetical ( Σ0r-definable, Π0

r-defi-nable, ∆0

r-definable) if its graph (~n,m) | ϕ(~n ) is defined and = m is.By the note above a relationR is ∆0

1-definable if and only if it is recursive.

Example. Let Tot := { e | ϕ^(1)_e is total }. Then we have

e ∈ Tot ≡ ϕ^(1)_e is total
        ≡ ∀n ∃m (ϕ_e(n) = m)
        ≡ ∀n ∃m ∃s (T(e, n, s) ∧ U(e, n, s) = m).

Therefore Tot is Π^0_2-definable. We will show below that Tot is not Σ^0_2-definable.


[Diagram omitted: Δ^0_1 ⊆ Σ^0_1, Π^0_1 ⊆ Δ^0_2 ⊆ Σ^0_2, Π^0_2 ⊆ Δ^0_3 ⊆ Σ^0_3, Π^0_3 ⊆ ...]

Figure 1. The arithmetical hierarchy

2.6.4. Closure properties.

Lemma. The Σ^0_r-, Π^0_r- and Δ^0_r-definable relations are closed under conjunction, disjunction and the bounded quantifiers ∃m<n and ∀m<n. The Δ^0_r-definable relations are closed under negation. Moreover, for r > 0 the Σ^0_r-definable relations are closed under the existential quantifier ∃, and the Π^0_r-definable relations are closed under the universal quantifier ∀.

Proof. This can be seen easily. For instance, closure under the bounded universal quantifier ∀m<n follows from

∀m<n ∃k R(~n, n, m, k) ≡ ∃l ∀m<n R(~n, n, m, (l)_m).

The relative positions of the Σ^0_r-, Π^0_r- and Δ^0_r-definable relations are shown in Figure 1.
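The quantifier exchange in the proof — packaging the witnesses k_m for m < n into a single code l — can be checked with any recursive coding of finite sequences. Here is a Gödel-style prime-power coding in Python; the helper names are illustrative, and this is not the book's exact elementary coding:

```python
def nth_prime(m):
    """The m-th prime, 0-indexed (p_0 = 2).  Naive but sufficient."""
    found, n = 0, 1
    while found <= m:
        n += 1
        if all(n % d for d in range(2, n)):
            found += 1
    return n

def encode(seq):
    """<k_0, ..., k_{n-1}> := p_0^(k_0+1) * ... * p_{n-1}^(k_{n-1}+1)."""
    l = 1
    for i, k in enumerate(seq):
        l *= nth_prime(i) ** (k + 1)
    return l

def component(l, m):
    """(l)_m: the exponent of p_m in l, minus 1."""
    p, e = nth_prime(m), 0
    while l % p == 0:
        l //= p
        e += 1
    return e - 1

# forall m<n exists k R(m, k)  <->  exists l forall m<n R(m, (l)_m):
# from witnesses k_m we build one code l covering them all.
R = lambda m, k: k > m * m
n = 6
l = encode([m * m + 1 for m in range(n)])
assert all(R(m, component(l, m)) for m in range(n))
```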

2.6.5. Universal Σ^0_{r+1}-definable relations. We now generalize the enumeration W^(q)_e of the Σ^0_1-definable relations and construct binary universal Σ^0_{r+1}-definable relations U^0_{r+1}:

U^0_1(e, n) :≡ ∃s T(e, n, s)  (≡ n ∈ W^(1)_e),
U^0_{r+1}(e, n) :≡ ∃m ¬U^0_r(e, n ∗ 〈m〉).

For example,

U^0_2(e, n) ≡ ∃m ∀s ¬T(e, n ∗ 〈m〉, s),
U^0_3(e, n) ≡ ∃m1 ∀m2 ∃s T(e, n ∗ 〈m1, m2〉, s).

Clearly the relations U^0_{r+1}(e, 〈~n〉) enumerate, for e = 0, 1, 2, ..., the q-ary Σ^0_{r+1}-definable relations, and their complements the q-ary Π^0_{r+1}-definable relations.

Now it easily follows that all inclusions in Figure 1 are proper. To see this, assume for example that ∃m ∀s ¬T(e, 〈n, m〉, s) were Π^0_2. Pick e_0 such that

∀m ∃s T(e_0, 〈n, m〉, s) ≡ ∃m ∀s ¬T(n, 〈n, m〉, s);

for n := e_0 we obtain a contradiction. As another example, assume that

A := { 2〈e, n〉 | ∃m ∀s ¬T(e, 〈n, m〉, s) } ∪ { 2〈e, n〉 + 1 | ∀m ∃s T(e, 〈n, m〉, s) },

which is a Δ^0_3-set, were Σ^0_2. Then we would have

∀m ∃s T(e, 〈n, m〉, s) ≡ 2〈e, n〉 + 1 ∈ A,

and hence { (e, n) | ∀m ∃s T(e, 〈n, m〉, s) } would be a Σ^0_2-definable relation, a contradiction.

2.6.6. Σ^0_r-complete relations. We now develop an easy method to obtain precise classifications in the arithmetical hierarchy. Since by sequence coding we can pass in an elementary way between relations R of arity q and relations R′(n) :≡ R((n)_1, ..., (n)_q) of arity 1, it is no real loss of generality if we henceforth restrict to q = 1 and only deal with sets A, B ⊆ N (i.e., unary relations). First we introduce the notion of (many-one) reducibility.

Let A, B ⊆ N. B is said to be reducible to A if there is a total recursive function f such that for all n

n ∈ B ≡ f(n) ∈ A.

A set A is said to be Σ^0_r-complete if
(1) A is Σ^0_r-definable, and
(2) every Σ^0_r-definable set B is reducible to A.
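Many-one reducibility can be illustrated in Python with decidable toy sets. For genuinely Σ^0_r-complete sets membership is of course undecidable, but the reduction f is still just a total recursive function; the sets here are hypothetical examples, not from the text:

```python
# B is reduced to A by the total recursive function f:
#     n in B  <->  f(n) in A.
in_A = lambda n: n % 3 == 0              # "target" set A
in_B = lambda n: (n * n * n) % 3 == 0    # "source" set B

def f(n):
    """Total recursive reduction witnessing B <=_m A."""
    return n * n * n

assert all(in_B(n) == in_A(f(n)) for n in range(2000))

# Reducibility transfers decidability (and each level of the
# hierarchy) downwards: a decision procedure for A yields one for B
# simply by composing with f.
decide_B = lambda n: in_A(f(n))
assert all(decide_B(n) == in_B(n) for n in range(2000))
```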

Lemma. If A is Σ^0_r-complete, then A is Σ^0_r-definable but not Π^0_r-definable.

Proof. Let A be Σ^0_r-complete and assume that A is Π^0_r-definable. Pick a set B which is Σ^0_r-definable but not Π^0_r-definable. By the Σ^0_r-completeness of A, the set B is reducible to A via a recursive function f:

n ∈ B ≡ f(n) ∈ A.

But then B would be Π^0_r-definable too, contradicting the choice of B.

Remark. In the definition and the lemma above we can replace Σ^0_r by Π^0_r. This gives the notion of Π^0_r-completeness, and the proposition that every Π^0_r-complete set A is Π^0_r-definable but not Σ^0_r-definable.

Example. We have seen above that the set Tot := { e | ϕ^(1)_e is total } is Π^0_2-definable. We can now show that Tot is not Σ^0_2-definable. By the lemma it suffices to prove that Tot is Π^0_2-complete. So let B be an arbitrary Π^0_2-definable set. Then, for some e ∈ N,

n ∈ B ≡ ∀m ∃s T(e, n, m, s).

Consider the partial recursive function

ϕ_e(n, m) := U(e, n, m, μs T(e, n, m, s)).

By the Substitution Lemma we have

n ∈ B ≡ ∀m (ϕ_e(n, m) is defined)
      ≡ ∀m (ϕ_{S(e,n)}(m) is defined)
      ≡ ϕ_{S(e,n)} is total
      ≡ S(e, n) ∈ Tot.

Therefore B is reducible to Tot.


2.7. The Analytical Hierarchy

We now generalize the arithmetical hierarchy and give a classification of the relations definable by analytical formulas, i.e., formulas involving number as well as function quantifiers.

2.7.1. Analytical relations. First note that the Substitution Lemma as well as the Fixed Point Lemma in 2.6.1 continue to hold if function arguments are present, with the same function S in the Substitution Lemma. We also extend the enumeration W^(q)_e of the Σ^0_1-definable relations: by 2.6.2 the sets

W^(p,q)_e := { (~g, ~n) | ∃s T2(e, ~g, ~n, s) }

enumerate, for e = 0, 1, 2, ..., the (p, q)-ary Σ^0_1-definable relations. With the same argument as in 2.6 we see that, for fixed arity (p, q), W^(p,q)_e(~g, ~n) as a relation of ~g, e, ~n is Σ^0_1-definable, but not recursive. The treatment of the arithmetical hierarchy can now be extended without difficulties to (p, q)-ary relations.

Examples. (a) The set R of all recursive functions is Σ^0_3-definable, since

R(f) ≡ ∃e ∀n ∃s [T(e, n, s) ∧ U(e, n, s) = f(n)].

(b) Let LinOrd denote the set of all functions f such that

≤f := { (n, m) | f〈n, m〉 = 1 }

is a linear ordering of its field

Mf := { n | ∃m (f〈n, m〉 = 1 ∨ f〈m, n〉 = 1) }.

LinOrd is Π^0_1-definable, since

LinOrd(f) ≡ ∀n (n ∈ Mf → f〈n, n〉 = 1) ∧
            ∀n,m (f〈n, m〉 = 1 ∧ f〈m, n〉 = 1 → n = m) ∧
            ∀n,m,k (f〈n, m〉 = 1 ∧ f〈m, k〉 = 1 → f〈n, k〉 = 1) ∧
            ∀n,m (n, m ∈ Mf → f〈n, m〉 = 1 ∨ f〈m, n〉 = 1).

Here we have written n ∈ Mf for ∃m (f〈n, m〉 = 1 ∨ f〈m, n〉 = 1).
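The four LinOrd clauses can be checked mechanically on a finite portion of the field. A Python sketch, with `pair`/`unpair` standing in for the coding 〈n, m〉 (the names are illustrative, not from the text):

```python
def pair(n, m):
    return (n + m) * (n + m + 1) // 2 + m   # stand-in for <n, m>

def unpair(k):
    w = 0
    while (w + 1) * (w + 2) // 2 <= k:
        w += 1
    m = k - w * (w + 1) // 2
    return w - m, m

def lin_ord(f, field):
    """Check the four clauses of LinOrd(f) on a finite field."""
    le = lambda n, m: f(pair(n, m)) == 1
    return (all(le(n, n) for n in field)                              # reflexivity
            and all(n == m or not (le(n, m) and le(m, n))
                    for n in field for m in field)                    # antisymmetry
            and all(not (le(n, m) and le(m, k)) or le(n, k)
                    for n in field for m in field for k in field)     # transitivity
            and all(le(n, m) or le(m, n)
                    for n in field for m in field))                   # linearity

# f codes the usual ordering on N: f<n, m> = 1 iff n <= m
f = lambda c: 1 if unpair(c)[0] <= unpair(c)[1] else 0
assert lin_ord(f, range(8))

# the everywhere-true relation is not antisymmetric, so it fails
g = lambda c: 1
assert not lin_ord(g, range(8))
```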

A relation R of arity (p, q) is said to be analytical if there is an arithmetical relation P, say of arity (r + p, q), such that for all ~g = g1, ..., gp and ~n = n1, ..., nq,

R(~g, ~n) ≡ (Q1 f1) ... (Qr fr) P(f1, ..., fr, ~g, ~n)  with Qi ∈ {∀, ∃}.

Note that we may assume that the quantifiers Qi are alternating, since for instance

∀f ∀g R(f, g) ≡ ∀h R((h)_0, (h)_1),

where (h)_i(n) := (h(n))_i. A relation R of arity (p, q) is said to be Σ^1_r-definable if there is an (r + p, q)-ary arithmetical relation P such that for all ~g, ~n,

R(~g, ~n) ≡ ∃f1 ∀f2 ... Q fr P(f1, ..., fr, ~g, ~n)

with Q = ∀ if r is even and Q = ∃ if r is odd. Similarly, a relation R of arity (p, q) is said to be Π^1_r-definable if there is an arithmetical relation P such that for all ~g, ~n,

R(~g, ~n) ≡ ∀f1 ∃f2 ... Q fr P(f1, ..., fr, ~g, ~n)


with Q = ∃ if r is even and Q = ∀ if r is odd. A relation R is said to be Δ^1_r-definable if it is Σ^1_r-definable as well as Π^1_r-definable.

A partial functional Φ is said to be analytical (Σ^1_r-definable, Π^1_r-definable, Δ^1_r-definable) if its graph { (~g, ~n, m) | Φ(~g, ~n) is defined and = m } is.

Lemma. A relation R is Σ^1_r-definable if and only if it can be written in the form

R(~g, ~n) ≡ ∃f1 ∀f2 ... Q fr Q̄ m P(f1, ..., fr, ~g, ~n, m)

with Q ∈ {∀, ∃} and Q̄ the dual quantifier (Q̄ := ∃ if Q = ∀, and Q̄ := ∀ if Q = ∃), with an elementary relation P. Similarly, a relation R is Π^1_r-definable if and only if it can be written in the form

R(~g, ~n) ≡ ∀f1 ∃f2 ... Q fr Q̄ m P(f1, ..., fr, ~g, ~n, m)

with Q, Q̄ as above and an elementary relation P.

Proof. Use

∀n ∃f R(f, n) ≡ ∃g ∀n R((g)_n, n)  with (g)_n(m) := g〈n, m〉,
∀n R(n) ≡ ∀f R(f(0)).

E.g., the prefix ∀f ∃n ∀m is transformed first into ∀f ∃n ∀g, then into ∀f ∀h ∃n, and finally into ∀g ∃n.

Example. Define

WOrd(f) :≡ (≤f is a well-ordering of its field Mf).

Then WOrd satisfies

WOrd(f) ≡ LinOrd(f) ∧ ∀g [∀n (f〈g(n+1), g(n)〉 = 1) → ∃m (g(m+1) = g(m))].

Hence WOrd is Π^1_1-definable.

2.7.2. Closure Properties.

Lemma (Closure Properties). The Σ^1_r-, Π^1_r- and Δ^1_r-definable relations are closed under conjunction, disjunction and the numerical quantifiers ∃n and ∀n. The Δ^1_r-definable relations are closed under negation. Moreover, for r > 0 the Σ^1_r-definable relations are closed under the existential function quantifier ∃f, and the Π^1_r-definable relations are closed under the universal function quantifier ∀f.

Proof. This can be seen easily. For instance, closure of the Σ^1_1-definable relations under universal numerical quantifiers follows from the transformation of ∀n ∃f ∀m first into ∃g ∀n ∀m and then into ∃g ∀k.

The relative positions of the Σ^1_r-, Π^1_r- and Δ^1_r-definable relations are shown in Figure 2. Here

Δ^0_∞ := ⋃_{r≥1} Σ^0_r  (= ⋃_{r≥1} Π^0_r)

is the set of all arithmetical relations, and

Δ^1_∞ := ⋃_{r≥1} Σ^1_r  (= ⋃_{r≥1} Π^1_r)

is the set of all analytical relations.

[Diagram omitted: Δ^0_1 ⊆ Σ^0_1, Π^0_1 ⊆ Δ^0_2 ⊆ Σ^0_2, Π^0_2 ⊆ Δ^0_3 ⊆ ... ⊆ Δ^0_∞ ⊆ Δ^1_1 ⊆ Σ^1_1, Π^1_1 ⊆ Δ^1_2 ⊆ Σ^1_2, Π^1_2 ⊆ ... ⊆ Δ^1_∞]

Figure 2. The analytical hierarchy

2.7.3. Universal Σ^1_{r+1}-definable relations.

Lemma (Universal relations). Among the Σ^1_{r+1}- (Π^1_{r+1}-)definable relations there is a (p, q+1)-ary relation enumerating all (p, q)-ary Σ^1_{r+1}- (Π^1_{r+1}-)definable relations.

Proof. As an example, we prove the lemma for Σ^1_2 and Σ^1_1. All Σ^1_2-definable relations are enumerated by

∃g ∀h ∃s T2(e, ~f, ~n, g, h, s),

and all Σ^1_1-definable relations are enumerated by

∃g ∀s ¬T2(e, ~f, ~n, g, s).

Lemma. All inclusions in Figure 2 are proper.

Proof. We postpone (to 2.9.8) the proof of Δ^0_∞ ⊊ Δ^1_1. The rest of the proof is obvious from the following examples. Assume ∃g ∀h ∃s T2(e, n, g, h, s) were Π^1_2. Pick e_0 such that

∀g ∃h ∀s ¬T2(e_0, n, g, h, s) ≡ ∃g ∀h ∃s T2(n, n, g, h, s);

for n := e_0 we obtain a contradiction. As another example, assume that

A := { 2〈e, n〉 | ∃g ∀h ∃s T2(e, n, g, h, s) } ∪ { 2〈e, n〉 + 1 | ∀g ∃h ∀s ¬T2(e, n, g, h, s) },

which is a Δ^1_3-set, were Σ^1_2. Then from

∀g ∃h ∀s ¬T2(e, n, g, h, s) ≡ 2〈e, n〉 + 1 ∈ A

it would follow that { (e, n) | ∀g ∃h ∀s ¬T2(e, n, g, h, s) } is a Σ^1_2-definable relation, a contradiction.

2.7.4. Σ^1_r-complete relations. A set A ⊆ N is said to be Σ^1_r-complete if
(1) A is Σ^1_r-definable, and
(2) every Σ^1_r-definable set B ⊆ N is reducible to A.

Lemma. If A ⊆ N is Σ^1_r-complete, then A is Σ^1_r-definable but not Π^1_r-definable.

Proof. Let A be Σ^1_r-complete and assume that A is Π^1_r-definable. Pick a set B ⊆ N which is Σ^1_r-definable but not Π^1_r-definable. By the Σ^1_r-completeness of A, the set B is reducible to A via a recursive function f:

n ∈ B ≡ f(n) ∈ A.

But then B would be Π^1_r-definable too, contradicting the choice of B.

Remark. In the definition and the lemma above we can replace Σ^1_r by Π^1_r. This gives the notion of Π^1_r-completeness, and the proposition that every Π^1_r-complete set A is Π^1_r-definable but not Σ^1_r-definable.

2.8. Recursive Type-2 Functionals and Well-Foundedness

2.8.1. Computation trees. To each oracle program with index e, associate its "tree of non-past-secured sequence numbers"

Tree(e) := { 〈n0, ..., n_{l−1}〉 | ∀k<l ¬T1(e, n0, 〈n1, ..., n_{k−1}〉) },

called the computation tree of the given program.

We imagine the computation tree as growing downwards by extension; that is, if σ and τ are any two sequence numbers (or nodes) in the tree, then σ comes below τ if and only if σ is a proper extension of τ, that is, lh(τ) < lh(σ) and ∀i<lh(τ) ((σ)_i = (τ)_i). We write σ ⊃ τ to denote this. Note that if σ is in the tree and σ ⊃ τ then τ is automatically in the tree, by definition. An infinite branch of the tree is thus determined by a number n and a function g: N → N such that ∀s ¬T1(e, n, ḡ(s)). Therefore, by the Relativized Normal Form, an infinite branch is a witness to the fact that for some n and some g, Φ_e(g)(n) is not defined. To say that the tree is "well-founded" is to say that there are no infinite branches, and hence:

Theorem. Φ_e is total if and only if Tree(e) is well-founded.

2.8.2. Ordinal assignments; recursive ordinals. This equivalence is the basis for a natural theory of ordinal assignments, measuring (in some sense) the "complexity" of those oracle programs which terminate "everywhere" (on all oracles and all numerical inputs). We shall later investigate in some detail these ordinal assignments and the ways in which they measure complexity, but to begin with we shall merely describe the hierarchy which immediately arises. It is due to Kleene (1958), but appears there only as a brief footnote to the first page.


Definition. If Tree(e) is well-founded we can assign to each of its nodes τ an ordinal ‖τ‖ by recursion "up the tree" as follows: if τ is a terminal node (no extension of it belongs to the tree) then ‖τ‖ := 0; otherwise

‖τ‖ := sup { ‖σ‖ + 1 | σ ⊃ τ ∧ σ ∈ Tree(e) }.

Then we can assign an ordinal to the whole tree by defining ‖e‖ := ‖〈〉‖.

Example. The for-loop (with input variable x and output variable y)

y := 0; for i = 1 ... x do y := g(y) od

computes the iteration functional It(g)(n) = g^n(0). For fixed g and n the branch through its computation tree will terminate in a node

〈n, g(0), ..., g^2(0), ..., g^{n−1}(0), ..., g^n(0), ..., g(s − 1)〉

where s is the least number such that (i) ḡ(s) contains all the necessary oracle information concerning g, so s > g^{n−1}(0), and (ii) the computation of the program terminates by step s.

Working down this g-branch (and remembering that g is any function at all) we see that for i < n, once the value of g^i(0) is chosen, it determines the length of the ensuing segment as far as g^{i+1}(0). The greater the value of g^i(0), the greater is the length of this segment. Therefore as we take the supremum over all branches issuing from a node

〈n, g(0), ..., g^2(0), ..., g(g^{i−1}(0) − 1)〉

the successive segments g^i(0), ..., g^{i+1}(0) have unbounded length, depending on the value of g^i(0). So each such segment adds one more ω to the ordinal height of the tree. Since there are n − 1 such segments, the height of the subtree below the node 〈n〉 will be ω · (n − 1). Therefore the height of the computation tree for this loop-program is sup_n ω · (n − 1) = ω².
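The loop of the example is a direct transcription away from executable form; in Python:

```python
def It(g, n):
    """The iteration functional It(g)(n) = g^n(0), computed by
    'y := 0; for i = 1 .. n do y := g(y) od'."""
    y = 0
    for _ in range(n):
        y = g(y)
    return y

g = lambda y: 2 * y + 1
assert It(g, 0) == 0
assert It(g, 4) == 15        # g^4(0): 0 -> 1 -> 3 -> 7 -> 15
```

The ordinal analysis above depends only on which oracle values g(0), g(g(0)), ..., g^{n−1}(0) the computation consults, not on how the loop body is written.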

Definition. An ordinal is recursive if it is the order type of some recursive well-ordering ≺ ⊆ N × N. Any predecessor of a recursive ordinal is recursive and so is its successor, so the recursive ordinals form an initial segment of the countable ordinals. The least non-recursive ordinal is a limit, denoted ω_1^CK, the "CK" standing for Church-Kleene.

Note that if Φ_e is total recursive then Tree(e) can be well-ordered by the so-called Kleene-Brouwer ordering: σ <_KB τ if and only if either σ ⊃ τ, or else there is an i < min(lh(σ), lh(τ)) such that ∀j<i ((σ)_j = (τ)_j) and (σ)_i < (τ)_i. This is a recursive (in fact elementary) well-ordering with order type ≥ ‖e‖. Hence ‖e‖ is a recursive ordinal.
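The Kleene-Brouwer comparison is simple enough to transcribe directly; a Python sketch on tuples standing in for coded sequence numbers:

```python
def kb_less(sigma, tau):
    """sigma <_KB tau: sigma properly extends tau, or sigma branches
    left of tau at the first position where they differ."""
    n = min(len(sigma), len(tau))
    for i in range(n):
        if sigma[i] != tau[i]:
            return sigma[i] < tau[i]
    return len(sigma) > len(tau)

assert kb_less((1, 2, 3), (1, 2))      # proper extensions come below
assert kb_less((1, 1, 9), (1, 2))      # left branches come first
assert not kb_less((1, 2), (1, 2))
```

On a well-founded tree this linearizes the tree order: descending <_KB sequences correspond to a leftmost-branch search, which is why the order type bounds ‖e‖ from above.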

2.8.3. A hierarchy of total recursive functionals. Kleene's hierarchy of total recursive functionals consists of the classes

R2(α) := { Φ_e | Φ_e total ∧ ‖e‖ < α }

where α ranges over all recursive ordinals. Thus R2(α) ⊆ R2(β) if α < β.

Theorem (Hierarchy Theorem). Every total recursive functional belongs to R2(α) for some recursive ordinal α. Furthermore the hierarchy continues to expand as α increases through ω_1^CK; that is, for every recursive ordinal α there is a total recursive functional F such that F ∉ R2(α).


Proof. The first part is immediate, since if Φ_e is total it belongs to R2(α + 1), where α is the order type of the Kleene-Brouwer ordering on Tree(e).

For the second part suppose α is any fixed recursive ordinal, and let ≺_α be a fixed recursive well-ordering with that order type. We define a total recursive functional V_α(f, g, e, σ) with two unary function arguments f and g, where e ranges over indices of oracle programs and σ ranges over sequence numbers. Note first that if σ = 〈n0, n1, ..., n_{k−1}〉 is a non-terminal node in Tree(e), then for any function g: N → N the sequence number

σ ∗ 〈g(lh(σ) − 1)〉 = 〈n0, n1, ..., n_{k−1}, g(k − 1)〉

is also a node in Tree(e), below σ. The definition of V_α is as follows, by recursion down the g-branch of Tree(e) starting with node σ, but controlled by the well-ordering ≺_α via the other function argument f:

V_α(f, g, e, σ) =
  V_α(f, g, e, σ ∗ 〈g(lh(σ) − 1)〉)   if σ ∈ Tree(e) and f(σ ∗ 〈g(lh(σ) − 1)〉) ≺_α f(σ),
  U1(e, (σ)_0, 〈(σ)_1, ..., (σ)_{k−1}〉)   otherwise.

This is a recursive definition, and furthermore it is always defined, since repeated application of the first clause leads to a descending sequence

· · · ≺_α f(σ′′) ≺_α f(σ′) ≺_α f(σ),

which must terminate after finitely many steps because ≺_α is a well-ordering. Hence the second clause must eventually apply and the computation terminates. Therefore V_α is total recursive.

Now if Φ_e is any total recursive functional such that ‖e‖ < α, then there will be an order-preserving map from Tree(e) into α, and hence a function f_e: N → N such that whenever τ ⊃ σ in Tree(e) then f_e(τ) ≺_α f_e(σ). For this particular e and f_e it is easy to see, by induction up the computation tree and using the Relativized Normal Form Theorem, that for all g and n,

Φ_e(g)(n) = V_α(f_e, g, e, 〈n〉).

Consequently the total recursive functional F defined from V_α by

F(g)(n) = V_α(λx g(x + 1), g, g(0), 〈n〉) + 1

cannot lie in R2(α). For if it did, there would be an e and f_e as above such that F = Φ_e, and hence for all g and all n,

V_α(λx g(x + 1), g, g(0), 〈n〉) + 1 = V_α(f_e, g, e, 〈n〉).

A contradiction follows immediately by choosing g so that g(0) = e and g(x + 1) = f_e(x). This completes the proof.

Remark. For relatively simple but fundamental reasons based in effective descriptive set theory, no such "nice" hierarchy exists for the recursive functions. For whereas the class of all indices e of total recursive functionals is definable by the Π^1_1 condition

∀g ∀n ∃s T1(e, n, ḡ(s)),

the set of all indices of total recursive functions is given merely by the arithmetical Π^0_2 condition

∀n ∃s T(e, n, s).

So by the so-called "boundedness property" of hyperarithmetical theory, any inductive hierarchy classification of all the recursive functions is sure to "collapse" before ω_1^CK. In practice this usually occurs at the very first limit stage ω, and the hierarchy gives no interesting information.

Nevertheless, if we adopt a more constructive view and take into account also the ways in which a countable ordinal may be presented as a well-ordering, rather than just accepting its set-theoretic existence, then interesting hierarchies of proof-theoretically important subclasses of the recursive functions begin to emerge.

2.9. Inductive Definitions

We have already used an inductive definition in our proof of Kleene's Recursion Theorem in 2.4.3. Now we treat inductive definitions quite generally, and discuss how far they will carry us in the analytical hierarchy. We also discuss the rather important dual concept of "coinductive" definitions.

2.9.1. Monotone operators. Let U be a fixed non-empty set. A map Γ: P(U) → P(U) is called an operator on U. Γ is called monotone if X ⊆ Y implies Γ(X) ⊆ Γ(Y), for all X, Y ⊆ U.

IΓ := ⋂ { X ⊆ U | Γ(X) ⊆ X }

is the set defined inductively by the monotone operator Γ; so IΓ is the intersection of all Γ-closed subsets of U. Dually,

CΓ := ⋃ { X ⊆ U | X ⊆ Γ(X) }

is the set defined coinductively by the monotone operator Γ; so CΓ is the union of all subsets of U that are extended by Γ. Definitions of this kind are called (generalized) monotone inductive or coinductive definitions.

Theorem (Knaster-Tarski). Let Γ be a monotone operator. Then:
(a) If Γ(X) ⊆ X, then IΓ ⊆ X.
(b) If X ⊆ Γ(X), then X ⊆ CΓ.
(c) Γ(IΓ) = IΓ and Γ(CΓ) = CΓ.
In particular IΓ is the least fixed point of Γ, and CΓ is the greatest fixed point of Γ.

Proof. (a), (b) follow immediately from the definitions of IΓ and CΓ. (c). From Γ(X) ⊆ X we can conclude IΓ ⊆ X by (a), hence Γ(IΓ) ⊆ Γ(X) ⊆ X by the monotonicity of Γ. By definition of IΓ we obtain Γ(IΓ) ⊆ IΓ. Using monotonicity of Γ we can infer Γ(Γ(IΓ)) ⊆ Γ(IΓ), hence IΓ ⊆ Γ(IΓ), again by definition of IΓ. The argument for Γ(CΓ) = CΓ is the same.
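On a finite set U both Knaster-Tarski fixed points can actually be computed, by iterating Γ from ∅ upwards for IΓ and from U downwards for CΓ (this is the approximation of 2.9.3, which on a finite U stops after finitely many steps). A Python sketch with illustrative names `lfp`/`gfp` and a hypothetical sample operator:

```python
def lfp(gamma, U):
    """Least fixed point of a monotone gamma: iterate up from the
    empty set until nothing changes."""
    X = frozenset()
    while gamma(X) != X:
        X = gamma(X)
    return X

def gfp(gamma, U):
    """Greatest fixed point: iterate down from all of U."""
    X = frozenset(U)
    while gamma(X) != X:
        X = gamma(X)
    return X

U = range(10)
# Gamma(X) = {0} union {v+2 | v in X, v+2 < 10}   (monotone)
gamma = lambda X: frozenset([0]) | frozenset(v + 2 for v in X if v + 2 < 10)
assert lfp(gamma, U) == frozenset({0, 2, 4, 6, 8})
assert gfp(gamma, U) == frozenset({0, 2, 4, 6, 8})
```

For this operator the two fixed points coincide; in general (e.g. for the accessibility operator of 2.9.5 on a relation with cycles) the greatest fixed point is strictly larger than the least.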

Example. Let 0 ∈ U and consider an arbitrary function S: U → U. For every set X ⊆ U we define

Γ(X) := {0} ∪ { S(v) | v ∈ X }.

Clearly Γ is monotone, and both IΓ and CΓ consist of the (not necessarily distinct) elements 0, S(0), S(S(0)), ...

2.9.2. Induction and coinduction principles. The premise Γ(X) ⊆ X in part (a) of the Knaster-Tarski Theorem is, in the special case of the example above, equivalent to

∀u (u = 0 ∨ ∃v (v ∈ X ∧ u = S(v)) → u ∈ X),

i.e., to

0 ∈ X ∧ ∀v (v ∈ X → S(v) ∈ X),

and the conclusion is ∀u (u ∈ IΓ → u ∈ X). Hence part (a) of the Knaster-Tarski Theorem expresses some kind of general induction principle. However, in the "induction step" we do not quite have the desired form: instead of ∀v (v ∈ X → S(v) ∈ X) we would like to have ∀v∈IΓ (v ∈ X → S(v) ∈ X) (called the strengthened form of induction). But this can be achieved easily. The theorem below formulates this in the general case.

Theorem (Induction principle for monotone inductive definitions). Let Γ be a monotone operator. If Γ(X ∩ IΓ) ⊆ X, then IΓ ⊆ X.

Proof. Because of Γ(X ∩ IΓ) ⊆ Γ(IΓ) = IΓ we obtain from the premise Γ(X ∩ IΓ) ⊆ X ∩ IΓ. Therefore we have IΓ ⊆ X ∩ IΓ by definition of IΓ, hence IΓ ⊆ X.

Similarly, the premise X ⊆ Γ(X) in part (b) is, in the special case of the example above, equivalent to

∀u (u ∈ X → u = 0 ∨ ∃v (v ∈ X ∧ u = S(v))),

and the conclusion is ∀u (u ∈ X → u ∈ CΓ). This can be viewed as a dual form of the induction principle, called coinduction. Again we obtain a more appropriate form of the "coinduction step": instead of ∃v (v ∈ X ∧ u = S(v)) we can have ∃v∈CΓ∪X (u = S(v)) (called the strengthened form of coinduction). Generally:

Theorem (Coinduction principle for monotone inductive definitions). Let Γ be a monotone operator. If X ⊆ Γ(X ∪ CΓ), then X ⊆ CΓ.

Proof. Because of CΓ = Γ(CΓ) ⊆ Γ(X ∪ CΓ) we obtain from the premise X ∪ CΓ ⊆ Γ(X ∪ CΓ). Then X ∪ CΓ ⊆ CΓ by definition of CΓ, hence X ⊆ CΓ.

2.9.3. Approximation of the least and greatest fixed point. The least fixed point IΓ of the monotone operator Γ was defined "from above", as the intersection of all sets X such that Γ(X) ⊆ X. We now show that it can also be obtained by stepwise approximation "from below". In the general situation considered here we need a transfinite iteration of the approximation steps along the ordinals. Similarly, the greatest fixed point CΓ was defined "from below", as the union of all sets X such that X ⊆ Γ(X). We show that it can also be obtained by stepwise approximation "from above". For an arbitrary operator Γ: P(U) → P(U) we define Γ↑α and Γ↓α by transfinite recursion on ordinals α:

Γ↑0 := ∅,                 Γ↓0 := U,
Γ↑(α + 1) := Γ(Γ↑α),      Γ↓(α + 1) := Γ(Γ↓α),
Γ↑λ := ⋃_{ξ<λ} Γ↑ξ,       Γ↓λ := ⋂_{ξ<λ} Γ↓ξ,

where λ denotes a limit ordinal. It turns out that not only monotone but also certain other operators Γ have fixed points that can be approximated by these Γ↑α or Γ↓α. Call an operator Γ inclusive if X ⊆ Γ(X), and selective if X ⊇ Γ(X), for all X ⊆ U.

Lemma. Let Γ be a monotone or inclusive operator.
(a) Γ↑α ⊆ Γ↑(α + 1) for all ordinals α.
(b) If Γ↑α = Γ↑(α + 1), then Γ↑(α + β) = Γ↑α for all ordinals β.
(c) Γ↑α = Γ↑(α + 1) for some α such that Card(α) ≤ Card(U).
So Γ↑∞ := ⋃_{β∈On} Γ↑β = Γ↑α, where α is the least ordinal such that Γ↑α = Γ↑(α + 1), and On denotes the class of all ordinals. This α is called the closure ordinal of Γ and is denoted by |Γ|↑. The set Γ↑∞ is called the closure of the operator Γ. Clearly Γ↑∞ is a fixed point of Γ.

Proof. (a). For monotone Γ we use transfinite induction on α. The case α = 0 is trivial. In the successor case we have

Γ↑α = Γ(Γ↑(α − 1)) ⊆ Γ(Γ↑α) = Γ↑(α + 1).

Here we have used the induction hypothesis and the monotonicity of Γ. In the limit case we obtain

Γ↑λ = ⋃_{ξ<λ} Γ↑ξ ⊆ ⋃_{ξ<λ} Γ↑(ξ + 1) = ⋃_{ξ<λ} Γ(Γ↑ξ) ⊆ Γ(⋃_{ξ<λ} Γ↑ξ) = Γ↑(λ + 1).

Again we have used the induction hypothesis and the monotonicity of Γ. In case Γ is inclusive we simply have

Γ↑α ⊆ Γ(Γ↑α) = Γ↑(α + 1).

(b). By transfinite induction on β. The case β = 0 is trivial. In the successor case we have by the induction hypothesis

Γ↑(α + β + 1) = Γ(Γ↑(α + β)) = Γ(Γ↑α) = Γ↑(α + 1) = Γ↑α,

and in the limit case, again by the induction hypothesis,

Γ↑(α + β) = ⋃_{γ<β} Γ↑(α + γ) = Γ↑α.

(c). Assume that for all α such that Card(α) ≤ Card(U) we have Γ↑α ⊊ Γ↑(α + 1), and let u_α ∈ Γ↑(α + 1) \ Γ↑α. This defines an injective map from { α | Card(α) ≤ Card(U) } into U. But the set { α | Card(α) ≤ Card(U) } is exactly the least cardinal larger than Card(U), so this is impossible.

Similarly we obtain:

Lemma. Let Γ be a monotone or selective operator.
(a) Γ↓α ⊇ Γ↓(α + 1) for all ordinals α.
(b) If Γ↓α = Γ↓(α + 1), then Γ↓(α + β) = Γ↓α for all ordinals β.
(c) Γ↓α = Γ↓(α + 1) for some α such that Card(α) ≤ Card(U).
So Γ↓∞ := ⋂_{β∈On} Γ↓β = Γ↓α, where α is the least ordinal such that Γ↓α = Γ↓(α + 1), and On denotes the class of all ordinals. This α is called the coclosure ordinal of Γ and is denoted by |Γ|↓. The set Γ↓∞ is called the coclosure of the operator Γ. Clearly Γ↓∞ is a fixed point of Γ.

We now show that for a monotone operator Γ its closure Γ↑∞ is in fact its least fixed point IΓ, and its coclosure Γ↓∞ is its greatest fixed point CΓ.

Lemma. Let Γ be a monotone operator. Then for all ordinals α we have:
(a) Γ↑α ⊆ IΓ.
(b) If Γ↑α = Γ↑(α + 1), then Γ↑α = IΓ.
(c) Γ↓α ⊇ CΓ.
(d) If Γ↓α = Γ↓(α + 1), then Γ↓α = CΓ.

Proof. (a). By transfinite induction on α. The case α = 0 is trivial. In the successor case we have by the induction hypothesis Γ↑(α − 1) ⊆ IΓ. Since Γ is monotone this implies

Γ↑α = Γ(Γ↑(α − 1)) ⊆ Γ(IΓ) = IΓ.

In the limit case we obtain from the induction hypothesis Γ↑ξ ⊆ IΓ for all ξ < λ. This implies

Γ↑λ = ⋃_{ξ<λ} Γ↑ξ ⊆ IΓ.

(b). Let Γ↑α = Γ↑(α + 1), hence Γ↑α = Γ(Γ↑α). Then Γ↑α is a fixed point of Γ, hence IΓ ⊆ Γ↑α. The reverse inclusion follows from (a).

For (c) and (d) the proofs are similar.

2.9.4. Continuous operators. We now consider the important special case of continuous operators. A subset Z ⊆ P(U) is called directed if for every finite Z0 ⊆ Z there is an X ∈ Z such that Y ⊆ X for all Y ∈ Z0. An operator Γ: P(U) → P(U) is called continuous if

Γ(⋃Z) = ⋃ { Γ(X) | X ∈ Z }

for every directed subset Z ⊆ P(U). We also need a dual notion: a subset Z ⊆ P(U) is called codirected if for every finite Z0 ⊆ Z there is an X ∈ Z such that X ⊆ Y for all Y ∈ Z0. An operator Γ: P(U) → P(U) is called cocontinuous if

Γ(⋂Z) = ⋂ { Γ(X) | X ∈ Z }

for every codirected subset Z ⊆ P(U).

Lemma. Every continuous or cocontinuous operator Γ is monotone.

Proof. For X, Y ⊆ U such that X ⊆ Y we obtain Γ(Y) = Γ(X ∪ Y) = Γ(X) ∪ Γ(Y) from the continuity of Γ, and hence Γ(X) ⊆ Γ(Y). Similarly we obtain Γ(X) = Γ(X ∩ Y) = Γ(X) ∩ Γ(Y) from the cocontinuity of Γ, and hence Γ(X) ⊆ Γ(Y).

For a continuous (cocontinuous) operator the transfinite approximation of its least (greatest) fixed point stops after ω steps. Hence in this case we have an easy characterization of the least fixed point "from below", and of the greatest fixed point "from above".


Lemma. (a) Let Γ be a continuous operator. Then IΓ = Γ↑ω.
(b) Let Γ be a cocontinuous operator. Then CΓ = Γ↓ω.

Proof. (a). It suffices to show Γ↑(ω + 1) = Γ↑ω:

Γ↑(ω + 1) = Γ(Γ↑ω) = Γ(⋃_{n<ω} Γ↑n) = ⋃_{n<ω} Γ(Γ↑n) = ⋃_{n<ω} Γ↑(n + 1) = Γ↑ω,

where in the third equation we have used the continuity of Γ (the chain Γ↑0 ⊆ Γ↑1 ⊆ ... is directed).

(b). Similarly it suffices to show Γ↓(ω + 1) = Γ↓ω:

Γ↓(ω + 1) = Γ(Γ↓ω) = Γ(⋂_{n<ω} Γ↓n) = ⋂_{n<ω} Γ(Γ↓n) = ⋂_{n<ω} Γ↓(n + 1) = Γ↓ω,

where in the third equation we have used the cocontinuity of Γ.

2.9.5. The accessible part of a relation. An important example of a monotone inductive definition is the following construction of the accessible part of a binary relation ≺ on U. Note that ≺ is not required to be transitive, so (U, ≻) may be viewed as a reduction system. For X ⊆ U let Γ≺(X) be the set of all u whose ≺-predecessors all belong to X:

Γ≺(X) := { u | ∀v≺u (v ∈ X) }.

Clearly Γ≺ is monotone; its least fixed point IΓ≺ is called the accessible part of (U, ≺) and denoted by acc(≺) or acc≺. If IΓ≺ = U, then the relation ≺ is called well-founded; the inverse relation ≻ is then called noetherian or terminating. In this special case the Knaster-Tarski Theorem and the induction principle for monotone inductive definitions in 2.9.2 can be combined as follows.

(2.1) ∀u (∀v≺u (v ∈ X ∩ acc≺) → u ∈ X) → ∀u∈acc≺ (u ∈ X).
(2.2) acc≺ is Γ≺-closed, i.e., ∀v≺u (v ∈ acc≺) implies u ∈ acc≺.
(2.3) Every u ∈ acc≺ is in Γ≺(acc≺), i.e., ∀u∈acc≺ ∀v≺u (v ∈ acc≺).

Note that (2.1) expresses an induction principle: to show that all elements u ∈ acc≺ are in a set X it suffices to prove the "induction step": we can infer u ∈ X from the assumption that all smaller v ≺ u are accessible and in X.

By a reduction sequence we mean a finite or infinite sequence u1, u2, ... such that u_i ≻ u_{i+1}. As an easy application one can show that u ∈ acc≺ if and only if every reduction sequence starting with u terminates after finitely many steps. For the direction from left to right we use induction on u ∈ acc≺. So let u ∈ acc≺ and assume that for every u′ such that u ≻ u′, every reduction sequence starting with u′ terminates after finitely many steps. Then clearly every reduction sequence starting with u must also terminate, since its second member is such a u′. Conversely, suppose we had a u ∉ acc≺. We construct an infinite reduction sequence u = u1, u2, ..., un, ... such that u_n ∉ acc≺; this yields the desired contradiction. So let u_n ∉ acc≺. By (2.2) we then have a v ∉ acc≺ such that u_n ≻ v; pick u_{n+1} as such a v.


2.9.6. Inductive definitions over N. We now turn to inductive definitions over the set N and their relation to the arithmetical and analytical hierarchies. An operator Γ: P(N) → P(N) is called Σ^0_r-definable if there is a Σ^0_r-definable relation QΓ such that for all A ⊆ N and all n ∈ N

n ∈ Γ(A) ↔ QΓ(cA, n),

where cA denotes the characteristic function of A. Π^0_r-, Δ^0_r-, Σ^1_r-, Π^1_r- and Δ^1_r-definable operators are defined similarly.

It is easy to show that every Σ^0_1-definable monotone operator Γ is continuous, and hence by the lemma in 2.9.4 has closure ordinal |Γ|↑ ≤ ω. We now show that this consequence still holds for inclusive operators.

Lemma. Let Γ be a monotone or inclusive Σ^0_1-definable operator. Then |Γ|↑ ≤ ω.

Proof. By assumption

n ∈ Γ(A) ↔ ∃s T1(e, n, c̄A(s))

for some e ∈ N. It suffices to show that Γ(Γ↑ω) ⊆ Γ↑ω. Suppose n ∈ Γ(Γ↑ω), so T1(e, n, c̄_{Γ↑ω}(s)) for some s. Since Γ↑ω is the union of the increasing chain Γ↑0 ⊆ Γ↑1 ⊆ Γ↑2 ⊆ ..., for some r we must have c̄_{Γ↑ω}(s) = c̄_{Γ↑r}(s). Therefore n ∈ Γ(Γ↑r) = Γ↑(r + 1) ⊆ Γ↑ω.

2.9.7. Definability of least fixed points for monotone operators.Next we prove that the closure of a monotone Σ0

1-definable operator is Σ01-

definable as well (this will be seen to be false for inclusive operators). Asa tool in the proof we need Konig’s Lemma. Here and later we use starredfunction variables f∗, g∗, h∗, . . . to range over 0-1-valued functions.

Lemma (Konig). Let T be a binary tree, i.e., T consists of (codes for)sequences of 0 and 1 only and is closed against the formation of initialsegments. Then

∀n∃x(lh(x) = n ∧ ∀i<n (x)i ≤ 1 ∧ x ∈ T )↔ ∃f∗∀s(f∗(s) ∈ T ).

Proof. The direction from right to left is obvious. For the converse assume the left hand side and let

M := { y | ∀i<lh(y) (y)i ≤ 1 ∧ ∀m ∃z (lh(z) = m ∧ ∀i<m (z)i ≤ 1 ∧ ∀j≤lh(y)+m Init(y ∗ z, j) ∈ T) }.

M can be seen as the set of all "fertile" nodes, possessing arbitrarily long extensions within T. To construct the required infinite path f* we use the axiom of dependent choice

∃y A(0, y) → ∀n,y (A(n, y) → ∃z A(n + 1, z)) → ∃f ∀n A(n, f(n)),

with A(n, y) expressing that y is a fertile node of length n:

A(n, y) := (y ∈ M ∧ lh(y) = n).

Now ∃y A(0, y) is obvious (take y := ⟨⟩). For the step case assume that y is a fertile node of length n. Then at least one of the two possible extensions y ∗ ⟨0⟩ and y ∗ ⟨1⟩ must be fertile, i.e., in M; pick z accordingly.
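For a tree given as a finite, prefix-closed set of 0-1 sequences the fertile-node idea can be played out directly: at each step extend the current node by a child that still admits the longest extensions, mirroring the choice between y ∗ ⟨0⟩ and y ∗ ⟨1⟩. A sketch (the sample tree and the depth bound are our own):

```python
def fertile_path(tree, depth):
    """tree: a prefix-closed set of 0-1 tuples. Return a node of the given
    depth reached by always stepping to the child whose subtree extends
    deepest (the 'fertile' child), or None if no node of that depth exists."""
    def reach(y):  # length of the longest extension of y inside the tree
        return max((len(z) for z in tree if z[:len(y)] == y), default=-1)
    path = ()
    for _ in range(depth):
        child = max((path + (b,) for b in (0, 1) if path + (b,) in tree),
                    key=reach, default=None)
        if child is None or reach(child) < depth:
            return None
        path = child
    return path

tree = {(), (0,), (1,), (1, 0), (1, 0, 0), (1, 0, 0, 1)}
```

On this tree `fertile_path(tree, 3)` follows the fertile branch to (1, 0, 0), while `fertile_path(tree, 5)` is None because no node of length 5 exists.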

Corollary. If R is Π^0_1-definable, then so is

Q(~g, ~n) := ∃f* ∀s R(f*(s), ~g, ~n).


Proof. By König's Lemma we have

Q(~g, ~n) ↔ ∀n ∃x≤⟨1,...,1⟩ (lh(x) = n ∧ ∀i<n (x)i ≤ 1 ∧ R(x, ~g, ~n)).

We now show that the Π^1_1- and Σ^0_1-definable relations are closed against monotone inductive definitions.

Theorem. Let Γ: P(N) → P(N) be a monotone operator.
(a) If Γ is Π^1_1-definable, then so is its least fixed point IΓ.
(b) If Γ is Σ^0_1-definable, then so is its least fixed point IΓ.

Proof. Let Γ: P(N) → P(N) be a monotone operator and n ∈ Γ(A) ↔ QΓ(cA, n).

(a) Assume QΓ is Π^1_1-definable. Then IΓ is the intersection of all Γ-closed sets, so

n ∈ IΓ ↔ ∀f (∀m (QΓ(f, m) → f(m) = 1) → f(n) = 1).

This shows that IΓ is Π^1_1-definable.

(b) Assume QΓ is Σ^0_1-definable. Then IΓ can be represented in the form

n ∈ IΓ ↔ ∀f* (∀m (QΓ(f*, m) → f*(m) = 1) → f*(n) = 1)
       ↔ ∀f* ∃m R(f*, m, n)   with R recursive
       ↔ ∀f* ∃s T1(e, n, f*(s))   for some e.

By the corollary to König's Lemma (applied to the complement of IΓ, which has the form ∃f* ∀s ¬T1(e, n, f*(s))) IΓ is Σ^0_1-definable.

2.9.8. Some counterexamples. If Γ is a non-monotone but only inclusive Σ^0_1-definable operator, then its closure Γ^∞ need not even be arithmetical. Recall from 2.6.5 the definition of the universal Σ^0_{r+1}-definable relations U^0_{r+1}(e, n):

U^0_1(e, n) := ∃s T(e, n, s)   (↔ n ∈ W^(1)_e),
U^0_{r+1}(e, n) := ∃m ¬U^0_r(e, n ∗ ⟨m⟩).

Let

U^0_ω := { ⟨r, e, ~n⟩ | U^0_{r+1}(e, ⟨~n⟩) }.

Clearly for every arithmetical relation R there are r, e such that R(~n) ↔ ⟨r, e, ~n⟩ ∈ U^0_ω. Hence U^0_ω cannot be arithmetical, for if it were, say, Σ^0_{r+1}-definable, then every arithmetical relation R would be Σ^0_{r+1}-definable, contradicting the fact that the arithmetical hierarchy is properly expanding. On the other hand we have

Lemma. There is an inclusive Σ^0_1-definable operator Γ such that Γ^∞ = U^0_ω, hence Γ^∞ is not arithmetical.

Proof. We define Γ such that

(2.4)   Γ↑r = { ⟨t, e, ~n⟩ | 0 < t ≤ r ∧ U^0_t(e, ⟨~n⟩) }.

Let

s ∈ Γ(A) := s ∈ A ∨
  ∃e,~n (s = ⟨1, e, ~n⟩ ∧ U^0_1(e, ⟨~n⟩)) ∨
  ∃t,e,~n (s = ⟨t + 1, e, ~n⟩ ∧ ∃e1,~m ⟨t, e1, ~m⟩ ∈ A ∧ ∃m ⟨t, e, ~n, m⟩ ∉ A).

We now prove (2.4) by induction on r. The base case r = 0 is obvious. In the step case we have

s ∈ Γ(Γ↑r) ↔ s ∈ Γ↑r ∨
    ∃e,~n (s = ⟨1, e, ~n⟩ ∧ U^0_1(e, ⟨~n⟩)) ∨
    ∃t,e,~n (s = ⟨t + 1, e, ~n⟩ ∧ ∃e1,~m ⟨t, e1, ~m⟩ ∈ Γ↑r ∧ ∃m ⟨t, e, ~n, m⟩ ∉ Γ↑r)
  ↔ s ∈ Γ↑r ∨
    ∃e,~n (s = ⟨1, e, ~n⟩ ∧ U^0_1(e, ⟨~n⟩)) ∨
    ∃t,e,~n (s = ⟨t + 1, e, ~n⟩ ∧ 0 < t ≤ r ∧ ∃m ¬U^0_t(e, ⟨~n, m⟩))
  ↔ ∃t,e,~n (s = ⟨t, e, ~n⟩ ∧ 0 < t ≤ r ∧ U^0_t(e, ⟨~n⟩)) ∨
    ∃e,~n (s = ⟨1, e, ~n⟩ ∧ U^0_1(e, ⟨~n⟩)) ∨
    ∃t,e,~n (s = ⟨t + 1, e, ~n⟩ ∧ 0 < t ≤ r ∧ U^0_{t+1}(e, ⟨~n⟩))
  ↔ s ∈ { ⟨t, e, ~n⟩ | 0 < t ≤ r + 1 ∧ U^0_t(e, ⟨~n⟩) }.

Clearly Γ is a Σ^0_1-definable inclusive operator. By 2.9.6 its closure ordinal |Γ| is ≤ ω, so Γ↑ω = Γ^∞. But clearly Γ↑ω = ⋃_r Γ↑r = U^0_ω.

On the positive side we have

Lemma. For every inclusive ∆^1_1-definable operator Γ its closure Γ^∞ is ∆^1_1-definable.

Proof. Let Γ be an inclusive operator such that n ∈ Γ(A) ↔ QΓ(cA, n) for some ∆^1_1-definable relation QΓ. Let

f*(p) := 1 if p = ⟨r + 1, n⟩ and n ∈ Γ↑r,
         0 otherwise,

and consider the following ∆^1_1-definable relation R:

R(g) := ∀p (g(p) ≤ 1) ∧
        ∀p (lh(p) ≠ 2 ∨ (p)0 = 0 → g(p) = 0) ∧
        ∀r ∀n (g⟨r + 1, n⟩ = 1 ↔ QΓ(λx g⟨r, x⟩, n)).

Clearly R(f*). Moreover, for every g such that R(g) we have g(p) = 0 for all p not of the form ⟨r + 1, n⟩, and it is easy to prove that

g⟨r + 1, n⟩ = 1 ↔ n ∈ Γ↑r.

Therefore f* is the unique member of R, and we have

n ∈ Γ^∞ ↔ ∃r n ∈ Γ↑r ↔ ∃r f*⟨r + 1, n⟩ = 1
       ↔ ∃g (R(g) ∧ ∃r g⟨r + 1, n⟩ = 1)
       ↔ ∀g (R(g) → ∃r g⟨r + 1, n⟩ = 1).

Hence Γ^∞ is ∆^1_1-definable.

Corollary. U^0_ω ∈ ∆^1_1 \ ∆^0_∞.


2.10. Notes

The history of recursive function theory goes back to the pioneering work of Turing, Kleene and others in the 1930's. We have based our approach to the theory on the concept of an (Unlimited) Register Machine of Shepherdson and Sturgis (1963), which allows a particularly simple development. The Normal Form Theorem and the undecidability of the halting problem are classical results from the 1930's, due to Kleene and Church, respectively.

The subclass of the elementary functions treated in 2.2.2 was introduced by Kalmár (1943).

Our notion of recursive definition in 2.4 is essentially a reformulation of the Herbrand-Gödel-Kleene equation calculus; see (Kleene, 1952). Similar ideas have also been developed by McCarthy.

Grzegorczyk (1953) was the first to classify the primitive recursive functions by means of a hierarchy E^n, which coincides with levels of L_k-computability for n = k + 1 ≥ 3. In addition, E^2 is the class of subelementary functions.


CHAPTER 3

Gödel's Theorems

This is the point at which we bring proof and recursion together and begin to study connections between the computational complexity of recursive functions and the logical complexity of their formal termination or existence proofs. The rest of the book will largely be motivated by this theme, and will make repeated use of the basics laid out here and the proof theoretic methods developed earlier. It should be stressed that by "computational complexity" we mean complexity "in the large" or "in theory"; not necessarily feasible or practical complexity. Feasibility is always desirable if one can achieve it, but the fact is that natural formal theories of even modest logical strength prove the termination of functions with enormous growth rate, way beyond the realm of practical computability. Since our aim is to unravel the computational constraints implicit in the logic of a given theory, we do not wish to have any prior bounds imposed on the levels of complexity allowed.

At the base of our hierarchy of theories lie ones with polynomially- or at most exponentially-bounded complexity, and these are studied in part 3 at the end of the book. The principal objects of study in this chapter are the elementary functions, which (i) will be characterized as those provably terminating in the theory I∆0(exp) of bounded induction, and (ii) will be shown to be adequate for the arithmetization of syntax leading to Gödel's theorems, a fact which most logicians believe but which has rarely received a complete treatment elsewhere. We believe (i) to be a fundamental theorem of mathematical logic, and one which – along with realizability interpretations (see part 3) – underlies the whole area of "proofs as programs" now actively developed in computer science. The proof is completely straightforward, but it will require us, once and for all, to develop some routine basic arithmetic inside I∆0(exp).

Later (in part 3) we shall see how to build alternative versions of this theory, without the explicit bounds of ∆0-formulas, but which still characterize the elementary functions and natural subclasses such as polytime. Such theories reflect a more recent research trend towards "implicit complexity". At first sight they resemble theories of full arithmetic, but they incorporate ideas of Bellantoni and Cook (1992), and of Leivant (1995a), in which composition (quantification) and recursion (induction) act on different kinds of variables. It is this variable separation which brings apparently strong theories down to "more feasible" levels.

One cannot write a text on proof theory without bowing to Gödel, and this chapter seems the obvious place in which to give a short but, we hope, reasonably complete treatment of the two incompleteness theorems.


All of the results in this chapter are developed as if the logic is classical. However, every result goes through in much the same way in a constructive context.

3.1. I∆0(exp)

I∆0(exp) is a theory in classical logic, based on the language

=, 0, S, P, +, −·, ·, exp2

where S, P denote the successor and predecessor functions. We shall generally use infix notations x + 1, x −· 1, 2^x rather than the more formal S(x), P(x), exp2(x) etcetera. The axioms of I∆0(exp) are the usual axioms for equality, the following defining axioms for the constants:

x + 1 ≠ 0             x + 1 = y + 1 → x = y
0 −· 1 = 0            (x + 1) −· 1 = x
x + 0 = x             x + (y + 1) = (x + y) + 1
x −· 0 = x            x −· (y + 1) = (x −· y) −· 1
x · 0 = 0             x · (y + 1) = (x · y) + x
2^0 = 1 (= 0 + 1)     2^(x+1) = 2^x + 2^x

and the axiom-schema of “bounded induction”:

B(0) ∧ ∀x(B(x)→ B(x+ 1))→ ∀xB(x)

for all “bounded” formulas B as defined below.
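Read as left-to-right rewrite rules, the defining axioms above already compute their constants. A direct transcription into executable form (purely illustrative, with numerals as ordinary integers):

```python
def pred(x):            # P: 0 -. 1 = 0, (x+1) -. 1 = x
    return 0 if x == 0 else x - 1

def add(x, y):          # x + 0 = x,  x + (y+1) = (x + y) + 1
    return x if y == 0 else add(x, y - 1) + 1

def monus(x, y):        # x -. 0 = x,  x -. (y+1) = (x -. y) -. 1
    return x if y == 0 else pred(monus(x, y - 1))

def mult(x, y):         # x . 0 = 0,  x . (y+1) = (x . y) + x
    return 0 if y == 0 else add(mult(x, y - 1), x)

def exp2(x):            # 2^0 = 1,  2^(x+1) = 2^x + 2^x
    return 1 if x == 0 else add(exp2(x - 1), exp2(x - 1))
```

In particular `monus(3, 5)` yields 0, reflecting that −· is truncated subtraction.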

Definition. We write t1 ≤ t2 for t1−· t2 = 0 and t1 < t2 for t1 +1 ≤ t2,where t1, t2 denote arbitrary terms of the language.

A ∆0- or bounded formula is a formula in the language of I∆0(exp), in which all quantifiers occur bounded; thus ∀x<t B(x) stands for ∀x(x < t → B(x)) and ∃x<t B(x) stands for ∃x(x < t ∧ B(x)) (similarly with ≤ instead of <).

A Σ1-formula is any formula of the form ∃x1 ∃x2 . . . ∃xk B where B is a bounded formula. The prefix of unbounded existential quantifiers is allowed to be empty, thus bounded formulas are Σ1.

3.1.1. Basic Arithmetic in I∆0(exp). The first task in any axiomatic theory is to develop, from the axioms, those basic algebraic properties which are going to be used frequently without further reference. Thus, in the case of I∆0(exp) we need to establish the usual associativity, commutativity and distributivity laws for addition and multiplication, the laws of exponentiation, and rules governing the relations ≤ and < just defined.

Lemma. In I∆0(exp) one can prove (the universal closures of) the case-distinction

x = 0 ∨ x = (x −· 1) + 1,

the associativity laws for addition and multiplication

x + (y + z) = (x + y) + z and x · (y · z) = (x · y) · z,

the distributivity law

x · (y + z) = x · y + x · z,

the commutativity laws

x + y = y + x and x · y = y · x,

the law

x −· (y + z) = (x −· y) −· z,

and the exponentiation law

2^(x+y) = 2^x · 2^y.

Proof. Since 0 = 0 and x + 1 = ((x + 1) −· 1) + 1 by axioms, a trivial induction on x gives the case-distinction. A straightforward induction on z gives associativity for +, and distributivity follows from this by an equally straightforward induction, again on z. Associativity of multiplication is proven similarly, but requires distributivity. The commutativity of + is done by induction on y (or x) using sub-inductions to first prove 0 + x = x and (y + x) + 1 = (y + 1) + x. Commutativity of · is done similarly using 0 · x = 0 and y · x + x = (y + 1) · x, this latter requiring both associativity and commutativity of +. That x −· (y + z) = (x −· y) −· z follows easily by a direct induction on z. The base case for the exponentiation law is 2^(x+0) = 2^x = 0 + 2^x = 2^x · 0 + 2^x = 2^x · (0 + 1) = 2^x · 2^0, and the induction step needs distributivity to give 2^(x+y+1) = 2^x · 2^y + 2^x · 2^y = 2^x · 2^(y+1).

Lemma. The following (and their universal closures) are provable in I∆0(exp):

(1) x ≤ 0 ↔ x = 0 and ¬ x < 0
(2) 0 ≤ x and x ≤ x and x < x + 1
(3) x < y + 1 ↔ x ≤ y
(4) x ≤ y ↔ x < y ∨ x = y
(5) x ≤ y ∧ y ≤ z → x ≤ z and x < y ∧ y < z → x < z
(6) x ≤ y ∨ y < x
(7) x < y → x + z < y + z
(8) x < y → x · (z + 1) < y · (z + 1)
(9) x < 2^x and x < y → 2^x < 2^y.

Proof. (1) This is an immediate consequence of the axioms x −· 0 = x and x + 1 ≠ 0. (2) A simple induction proves 0 −· x = 0, that is 0 ≤ x. Another induction on y gives (x + 1) −· (y + 1) = x −· y, and then a further induction proves x −· x = 0, which is x ≤ x. Replacing x by x + 1 then gives x < x + 1. (3) This follows straight from the equation (x + 1) −· (y + 1) = x −· y. (4) From x ≤ x we obtain x = y → x ≤ y, and from x −· y = (x + 1) −· (y + 1) we obtain x < y → x ≤ y, hence x < y ∨ x = y → x ≤ y. The converse x ≤ y → x < y ∨ x = y is proven by a case-distinction on y, the case y = 0 being immediate from part 1. In the other case y = (y −· 1) + 1 and one obtains x ≤ y → x −· (y −· 1) = 0 ∨ x −· (y −· 1) = 1 by a case-distinction on x −· (y −· 1). Since (x + 1) −· y = x −· (y −· 1) this gives x ≤ y → x < y ∨ x −· (y −· 1) = 1. It therefore remains only to prove x −· (y −· 1) = 1 → x = y. But this follows immediately from x −· z ≠ 0 → x = z + (x −· z), which is proven by induction on z using (z + 1) + (x −· (z + 1)) = z + (x −· (z + 1)) + 1 = z + ((x −· z) −· 1) + 1 = z + (x −· z). (5) Transitivity of ≤ is proven by induction on z using parts 1 for the basis and 4 for the induction step. Then, by replacing x by x + 1 and y by y + 1, the transitivity of < follows.


(6) can be proved by induction on x. The basis is immediate from 0 ≤ y. The induction step is straightforward since y < x → y < x + 1 by transitivity, and x ≤ y → x < y ∨ x = y → x + 1 ≤ y ∨ y < x + 1 by previous facts. (7) requires a simple induction on z, the induction step being x + z < y + z → x + z + 1 ≤ y + z < y + z + 1. (8) follows from part 7 and transitivity by another easy induction on z. (9) Using part 7 and transitivity again, one easily proves by induction 2^x < 2^(x+1). Then x < 2^x follows straightforwardly by another induction, as does x < y → 2^x < 2^y by induction on y, the induction step being x < y + 1 → x ≤ y → 2^x ≤ 2^y → 2^x < 2^(y+1) by means of transitivity.

Note. All of the inductions used in the lemmas above are inductionson “open”, i.e., quantifier-free, formulas.

3.1.2. Provable recursion in I∆0(exp). Of course in any theory manynew functions and relations can be defined out of the given constants. Whatwe are interested in are those which can not only be defined in the languageof the theory, but also can be proven to exist. This gives rise to one of themain definitions in this book.

Definition. We say that a function f : N^k → N is provably Σ1 or provably recursive in an arithmetical theory T if there is a Σ1-formula F(~x, y), called a "defining formula" for f, such that

• f(~n) = m if and only if F(~n, m) is true (in the standard model),
• T ⊢ ∃y F(~x, y),
• T ⊢ F(~x, y) ∧ F(~x, y′) → y = y′.

If, in addition, F is a bounded formula and there is a bounding term t(~x) for f such that T ⊢ F(~x, y) → y < t(~x), then we say that f is provably bounded in T. In this case we clearly have T ⊢ ∃y<t(~x) F(~x, y).

The importance of this definition is brought out by the following:

Theorem. If f is provably Σ1 in T we may conservatively extend Tby adding a new function symbol for f together with the defining axiomF (~x, f(~x )).

Proof. This is simply because any model M of T can be made into amodel (M, f) of the extended theory, by interpreting f as the function onM uniquely determined by the second and third conditions above. So if Ais a closed formula not involving f , provable in the extended theory, thenit is true in (M, f) and hence true in M. Then by Completeness, A mustalready be provable in T .

Since Σ1-definable functions are recursive, we shall often use the terms“provably Σ1” and “provably recursive” synonymously. We next develop thestock of functions provably Σ1 in I∆0(exp), and prove that they are exactlythe elementary functions.

Lemma. Each term defines a provably bounded function of I∆0(exp).

Proof. Let f be the function defined explicitly by f(~n) = t(~n) where t is any term of I∆0(exp). Then we may take y = t(~x) as the defining formula for f, since ∃y (y = t(~x)) derives immediately from the axiom t(~x) = t(~x), and y = t(~x) ∧ y′ = t(~x) → y = y′ is an equality axiom. Furthermore, as y = t(~x) is a bounded formula and y = t → y < t + 1 is provable, f is provably bounded.

Lemma. Define 2_k(x) by 2_0(x) := x and 2_{k+1}(x) := 2^(2_k(x)). Then for every term t(x1, . . . , xn) built up from the constants 0, S, P, +, −·, ·, exp2 there is a k such that

I∆0(exp) ⊢ t(x1, . . . , xn) < 2_k(x1 + · · · + xn).

Proof. We can prove in I∆0(exp) both 0 < 2^x and x < 2^x. Now suppose t is any term constructed from subterms t0, t1 by application of one of the function constants. Assume inductively that t0 < 2_{k0}(s0) and t1 < 2_{k1}(s1) are both provable, where s0, s1 are the sums of all variables appearing in t0, t1 respectively. Let s be the sum of all variables appearing in either t0 or t1, and let k be the maximum of k0 and k1. Then, by the various arithmetical laws in the preceding lemmas, we can prove t0 < 2_k(s) and t1 < 2_k(s), and it is then a simple matter to prove t0 + 1 < 2_{k+1}(s), t0 −· 1 < 2_k(s), t0 −· t1 < 2_k(s), t0 + t1 < 2_{k+1}(s), t0 · t1 < 2_{k+1}(s) and 2^{t0} < 2_{k+1}(s). Hence I∆0(exp) proves t < 2_{k+1}(s).
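The bounding functions 2_k(x) are immediate to compute, and illustrate how quickly they majorize terms; the sample term 2^(x·y) below is our own choice, not from the text:

```python
def tower(k, x):
    """2_k(x): 2_0(x) = x and 2_{k+1}(x) = 2^(2_k(x))."""
    return x if k == 0 else 2 ** tower(k - 1, x)

# Since x*y < 2^(x+y), the term 2^(x*y) already stays below
# tower(2, x + y) = 2^(2^(x+y)), in the spirit of the lemma.
```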

Lemma. Suppose f is defined by composition

f(~n ) = g0( g1(~n ), . . . , gm(~n ) )

from functions g0, g1, . . . , gm, each of which is provably bounded in I∆0(exp).Then f is provably bounded in I∆0(exp).

Proof. By the definition of "provably bounded" there is, for each gi (i ≤ m), a bounded defining formula Gi and (by the last lemma) a number ki such that, for 1 ≤ i ≤ m, I∆0(exp) ⊢ ∃yi<2_{ki}(s) Gi(~x, yi), where s is the sum of the variables ~x; and for i = 0,

I∆0(exp) ⊢ ∃y<2_{k0}(s0) G0(y1, . . . , ym, y),

where s0 is the sum of the variables y1, . . . , ym. Let k := max(k0, k1, . . . , km) and let F(~x, y) be the bounded formula

∃y1<2_k(s) . . . ∃ym<2_k(s) C(~x, y1, . . . , ym, y)

where C(~x, y1, . . . , ym, y) is the conjunction

G1(~x, y1) ∧ · · · ∧Gm(~x, ym) ∧G0(y1, . . . , ym, y).

Then clearly, F is a defining formula for f , and by prenex operations,

I∆0(exp) ` ∃y F (~x, y).

Furthermore, by the uniqueness condition on each Gi, we can also prove inI∆0(exp)

C(~x, y1, . . . , ym, y) ∧ C(~x, z1, . . . , zm, y′)
  → y1 = z1 ∧ · · · ∧ ym = zm ∧ G0(y1, . . . , ym, y) ∧ G0(y1, . . . , ym, y′)
  → y = y′

and hence by the quantifier rules of logic,

I∆0(exp) ` F (~x, y) ∧ F (~x, y′)→ y = y′.


Thus f is provably Σ1 with F as a bounded defining formula, and it only remains to find a bounding term. But I∆0(exp) proves

C(~x, y1, . . . , ym, y) → y1 < 2_k(s) ∧ · · · ∧ ym < 2_k(s) ∧ y < 2_k(y1 + · · · + ym)

and

y1 < 2_k(s) ∧ · · · ∧ ym < 2_k(s) → y1 + · · · + ym < 2_k(s) · m.

Therefore by taking t(~x) to be the term 2_k(2_k(s) · m) we obtain

I∆0(exp) ⊢ C(~x, y1, . . . , ym, y) → y < t(~x)

and hence

I∆0(exp) ⊢ F(~x, y) → y < t(~x).

This completes the proof.

Lemma. Suppose f is defined by bounded minimization

f(~n,m) = µk<m( g(~n, k) = 0 )

from a function g which is provably bounded in I∆0(exp). Then f is provablybounded in I∆0(exp).

Proof. Let G be a bounded defining formula for g and let F (~x, z, y) bethe bounded formula

y ≤ z ∧ ∀i<y ¬G(~x, i, 0) ∧ (y = z ∨G(~x, y, 0)).

Obviously F(~n, m, k) is true in the standard model if and only if either k is the least number less than m such that g(~n, k) = 0, or there is no such k and k = m. But this is exactly what it means for k to be the value of f(~n, m), so F is a defining formula for f. Furthermore I∆0(exp) ⊢ F(~x, z, y) → y < z + 1, so t(~x, z) = z + 1 can be taken as a bounding term for f. Also it is clear that we can prove

F (~x, z, y) ∧ F (~x, z, y′) ∧ y < y′ → G(~x, y, 0) ∧ ¬G(~x, y, 0)

and similarly with y and y′ interchanged. Therefore

I∆0(exp) ` F (~x, z, y) ∧ F (~x, z, y′)→ ¬y < y′ ∧ ¬y′ < y

and hence, because y < y′ ∨ y′ < y ∨ y = y′ is provable, we have

I∆0(exp) ` F (~x, z, y) ∧ F (~x, z, y′)→ y = y′.

It remains to check that I∆0(exp) ` ∃y F (~x, z, y). This is the point wherebounded induction comes into play, since ∃y F (~x, z, y) is a bounded formula.We prove it by induction on z.

For the basis, recall that y ≤ 0 ↔ y = 0 and ¬ i < 0 are provable.Therefore F (~x, 0, 0) is provable, and hence so is ∃y F (~x, 0, y).

For the induction step from z to z+1, we can prove y ≤ z → y+1 ≤ z+1and, using i < y + 1↔ i < y ∨ i = y,

∀i<y ¬G(~x, i, 0) ∧ (y = z ∧ ¬G(~x, y, 0)) → ∀i<y+1 ¬G(~x, i, 0) ∧ y + 1 = z + 1.

Therefore

F(~x, z, y) → F(~x, z + 1, y + 1) ∨ F(~x, z + 1, y)

and hence

∃y F(~x, z, y) → ∃y F(~x, z + 1, y),

which completes the proof.
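Computationally, bounded minimization is nothing but a bounded search with default value m; a minimal sketch (the square-root example is our own illustration):

```python
def bounded_mu(g, ns, m):
    """f(~n, m) = mu_{k<m}(g(~n, k) = 0): the least k < m with
    g(~n, k) = 0, and m itself if there is no such k."""
    for k in range(m):
        if g(*ns, k) == 0:
            return k
    return m

# e.g. the least k with k*k >= n, searched below the bound n + 1:
isqrt_up = lambda n: bounded_mu(lambda n_, k: 0 if k * k >= n_ else 1, (n,), n + 1)
```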


Theorem. Every elementary function is provably bounded in the theoryI∆0(exp).

Proof. As we have seen earlier in 2.2, the elementary functions can be characterized as those definable from the constants 0, S, P, +, −·, ·, exp2 by composition and bounded minimization. The above lemmas show that each such function is provably bounded in I∆0(exp).

3.1.3. Proof theoretic characterization.

Definition. A closed Σ1-formula ∃~z B(~z ), with B a bounded formula,is said to be “true at m”, and we write m |= ∃~z B(~z ), if there are numbers~m = m1,m2, . . . ,ml all less than m, such that B(~m) is true (in the standardmodel). A finite set Γ of closed Σ1-formulas is “true at m”, written m |= Γ,if at least one of them is true at m.

If Γ(x1, . . . , xk) is a finite set of Σ1 formulas all of whose free variablesoccur among x1, . . . , xk, and if f : Nk → N, then we write f |= Γ to mean thatfor all numerical assignments ~n = n1, . . . , nk to the variables ~x = x1, . . . , xkwe have f(~n ) |= Γ(~n ).

Note. (Persistence) For sets Γ of closed Σ1-formulas, if m |= Γ andm < m′ then m′ |= Γ. Similarly for sets Γ(~x ) of Σ1-formulas with freevariables, if f |= Γ and f(~n ) ≤ f ′(~n ) for all ~n ∈ Nk then f ′ |= Γ.

Lemma. If Γ(~x) is a finite set of Σ1-formulas (whose disjunction is) provable in I∆0(exp) then there is an elementary function f, strictly increasing in each of its variables, such that f |= Γ.

Proof. It is convenient to use a Tait-style formalisation of I∆0(exp). The axioms will be all sets of formulas Γ which contain either a complementary pair of equations t1 = t2, t1 ≠ t2, or an identity t = t, or an equality axiom t1 ≠ t2, ¬e(t1), e(t2) where e(t) is any equation or inequation with a distinguished subterm t, or a substitution instance of one of the defining axioms for the constants. The axiom schema of bounded induction will be replaced by the induction rule

Γ, B(0)     Γ, ¬B(y), B(y + 1)
------------------------------
           Γ, B(t)

where B is any bounded formula, y is not free in Γ and t is any term.

Note that if Γ is provable in I∆0(exp) then it has a proof in the formalism just described, in which all cut formulas are Σ1. For if Γ is classically derivable from non-logical axioms A1, . . . , As then there is a cut-free proof in Tait-style logic of ¬A1, ∆, Γ where ∆ = ¬A2, . . . , ¬As. We show how to cancel ¬A1 using a Σ1 cut. If A1 is an induction axiom on the formula B we have a cut-free proof in logic of

B(0) ∧ ∀y(¬B(y) ∨B(y + 1)) ∧ ∃x¬B(x), ∆, Γ

and hence, by inversion, cut-free proofs of B(0), ∆, Γ and ¬B(y), B(y + 1), ∆, Γ and ∃x¬B(x), ∆, Γ. From the first two of these we obtain B(x), ∆, Γ by the induction rule above, then ∀xB(x), ∆, Γ, and then from the third we obtain ∆, Γ by a cut on the Σ1-formula ∃x¬B(x). If A1 is the universal closure of any other (quantifier-free) axiom then we immediately obtain ∆, Γ by a cut on the Σ1-formula ¬A1. Having thus cancelled ¬A1 we can similarly cancel each of ¬A2, . . . , ¬As in turn, so as to yield the desired proof of Γ which only uses cuts on Σ1-formulas.

Now, choosing such a proof for Γ(~x ), we proceed by induction on itsheight, showing at each new proof-step how to define the required elementaryfunction f such that f |= Γ.

(i) If Γ(~x) is an axiom then for all ~n, Γ(~n) contains a true atom. Therefore f |= Γ for any f. To make f sufficiently increasing choose f(~n) = n1 + · · · + nk.

(ii) If Γ, B0 ∨ B1 arises by an application of the ∨-rule from Γ, B0, B1 then (because of our definition of Σ1-formula) B0 and B1 must both be bounded formulas. Thus by our definition of "true at", any function f satisfying f |= Γ, B0, B1 must also satisfy f |= Γ, B0 ∨ B1.

(iii) Only a slightly more complicated argument applies to the dual casewhere Γ, B0 ∧ B1 arises by an application of the ∧-rule from the premisesΓ, B0 and Γ, B1. For if f0(~n ) |= Γ(~n ), B0(~n ) and f1(~n ) |= Γ(~n ), B1(~n )for all ~n, then it is easy to see (by persistence) that f |= Γ, B0 ∧ B1 wheref(~n ) = f0(~n ) + f1(~n ).

(iv) If Γ, ∀yB(y) arises from Γ, B(y) by the ∀-rule (y not free in Γ) then since all the formulas are Σ1, ∀yB(y) must be bounded and so B(y) must be of the form y ≮ t ∨ B′(y) for some term t. Now assume f0 |= Γ, y ≮ t, B′(y) for some increasing elementary function f0. Then for all assignments ~n to the free variables ~x, and all assignments k to the variable y,

f0(~n, k) |= Γ(~n), k ≮ t(~n), B′(~n, k).

Therefore by defining f(~n ) = Σk<g(~n ) f0(~n, k) where g is an increasing ele-mentary function bounding t, we easily see that either f(~n ) |= Γ(~n ) or else,by persistence, B′(~n, k) is true for every k < t(~n ). Hence f |= Γ, ∀yB(y) asrequired, and clearly f is elementary since f0 and g are.

(v) Now suppose Γ, ∃yA(y, ~x ) arises from Γ, A(t, ~x ) by the ∃-rule, whereA is Σ1. Then by the induction hypothesis there is an elementary f0 suchthat for all ~n,

f0(~n ) |= Γ(~n ), A(t(~n ), ~n ).

Then either f0(~n ) |= Γ(~n ) or else f0(~n ) bounds true witnesses for all theexistential quantifiers already in A(t(~n ), ~n ). Therefore by choosing anyelementary bounding function g for the term t, and defining f(~n ) = f0(~n )+g(~n ), we see that either f(~n ) |= Γ(~n ) or f(~n ) |= ∃yA(y, ~n ) for all ~n.

(vi) If Γ comes about by the cut rule with Σ1 cut formula C ≡ ∃~zB(~z )then the two premises are Γ, ∀~z ¬B(~z ) and Γ, ∃~z B(~z ). The universal quan-tifiers in the first premise can be inverted (without increasing proof-height)to give Γ, ¬B(~z ) and since B is bounded the induction hypothesis can beapplied to this to give an elementary f0 such that for all numerical assign-ments ~n to the (implicit) variables ~x and all assignments ~m to the new freevariables ~z,

f0(~n, ~m) |= Γ(~n ), ¬B(~n, ~m).

Applying the induction hypothesis to the second premise gives an elementary f1 such that for all ~n, either f1(~n) |= Γ(~n) or else there are fixed witnesses ~m < f1(~n) such that B(~n, ~m) is true. Therefore if we define f by substitution from f0 and f1 thus:

f(~n ) = f0(~n, f1(~n ), . . . , f1(~n ))

then f will be elementary, greater than or equal to f1, and strictly increasingsince both f0 and f1 are. Furthermore f |= Γ. For otherwise there wouldbe a tuple ~n such that Γ(~n ) is not true at f(~n ) and hence, by persistence,not true at f1(~n ). So B(~n, ~m) is true for certain numbers ~m < f1(~n ). Butthen f0(~n, ~m) < f(~n ) and so, again by persistence, Γ(~n ) cannot be trueat f0(~n, ~m). This means B(~n, ~m) is false, by the above, and so we have acontradiction.

(vii) Finally suppose Γ(~x ), B(~x, t) arises by an application of the induc-tion rule on the bounded formula B. The premises are Γ(~x ), B(~x, 0) andΓ(~x ), ¬B(~x, y), B(~x, y + 1). Applying the induction hypothesis to each ofthe premises one obtains increasing elementary functions f0 and f1 such thatfor all ~n and all k,

f0(~n ) |= Γ(~n ), B(~n, 0)

f1(~n, k) |= Γ(~n ), ¬B(~n, k), B(~n, k + 1).

Now define f(~n ) = f0(~n ) + Σk<g(~n ) f1(~n, k) where g is some increasingelementary bounding function for the term t. Then f is elementary andincreasing, and by persistence from the above properties of f0 and f1, eitherf(~n ) |= Γ(~n ), or else B(~n, 0) and B(~n, k) → B(~n, k + 1) are true for allk < t(~n ). In this latter case B(~n, t(~n )) is true by induction on k up to thevalue of t(~n ). Either way, we have f |= Γ(~x ), B(~x, t(~x )) and this completesthe proof.
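The bounding functions extracted in cases (vi) and (vii) are plain arithmetic combinations of the bounds for the premises. As combinators (the names and the Python rendering are ours, purely for illustration):

```python
def cut_bound(f0, f1, l):
    """Case (vi): f(~n) = f0(~n, f1(~n), ..., f1(~n)), with l copies of
    f1(~n) substituted for the inverted variables ~z."""
    return lambda *ns: f0(*ns, *([f1(*ns)] * l))

def induction_bound(f0, f1, g):
    """Case (vii): f(~n) = f0(~n) + sum_{k < g(~n)} f1(~n, k), where g
    bounds the induction term t."""
    return lambda *ns: f0(*ns) + sum(f1(*ns, k) for k in range(g(*ns)))
```

Both constructions preserve elementarity and (for increasing inputs) strict monotonicity, which is what the proof needs for persistence.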

Theorem. A number-theoretic function is elementary if and only if itis provably Σ1 in I∆0(exp).

Proof. We have already shown that every elementary function is provably bounded, and hence provably Σ1, in I∆0(exp). Conversely suppose f is provably Σ1. Then there is a Σ1-formula

F(~x, y) ≡ ∃z1 . . . ∃zk B(~x, y, z1, . . . , zk)

which defines f and such that

I∆0(exp) ` ∃y F (~x, y).

By the lemma immediately above, there is an elementary function g suchthat for every tuple of arguments ~n there are numbers m0,m1, . . . ,mk lessthan g(~n ) satisfying the bounded formula B(~n,m0,m1, . . . ,mk). Using theelementary sequence-coding schema developed earlier in 2.2, let

h(~n ) = 〈g(~n ), g(~n ), . . . , g(~n )〉

so that if m = ⟨m0, m1, . . . , mk⟩ where m0, m1, . . . , mk < g(~n), then m < h(~n). Then, because f(~n) is the unique m0 for which there are m1, . . . , mk satisfying B(~n, m0, m1, . . . , mk), we can define f as follows:

f(~n) = (µ_{m<h(~n)} B(~n, (m)0, (m)1, . . . , (m)k))0.

Since B is a bounded formula of I∆0(exp) it is elementarily decidable, and since the least number operator µ is bounded by the elementary function h, the entire definition of f therefore involves only elementary operations. Hence f is an elementary function.

3.2. Gödel Numbers

We will assign numbers – so-called Gödel numbers, GN for short – to the syntactical constructs developed in chapter 1: terms, formulas and derivations. Using the elementary sequence-coding and decoding machinery developed earlier we will be able to construct the code number of a composed object from its parts, and conversely to disassemble the code number of a composed object into the code numbers of its parts.

3.2.1. Gödel numbers of terms, formulas and derivations. Let L be a countable first order language. Assume that we have injectively assigned to every n-ary relation symbol R a symbol number SN(R) of the form ⟨1, n, i⟩ and to every n-ary function symbol f a symbol number SN(f) of the form ⟨2, n, j⟩. Call L elementarily presented if the set SymbL of all these symbol numbers is elementary. In what follows we shall always assume that the languages L considered are elementarily presented. In particular this applies to every language with finitely many relation and function symbols.

Let SN(Var) := ⟨0⟩. For every L-term r we define recursively its Gödel number ⌜r⌝ by

⌜xi⌝ := ⟨SN(Var), i⟩,
⌜f r1 . . . rn⌝ := ⟨SN(f), ⌜r1⌝, . . . , ⌜rn⌝⟩.

Assign numbers to the logical symbols by SN(→) := ⟨3, 0⟩ and SN(∀) := ⟨3, 1⟩. For simplicity we leave out the logical connectives ∧, ∨ and ∃ here; they could be treated similarly. We define for every L-formula A its Gödel number ⌜A⌝ by

⌜R r1 . . . rn⌝ := ⟨SN(R), ⌜r1⌝, . . . , ⌜rn⌝⟩,
⌜A → B⌝ := ⟨SN(→), ⌜A⌝, ⌜B⌝⟩,
⌜∀xi A⌝ := ⟨SN(∀), i, ⌜A⌝⟩.

We define symbol numbers for the names of the natural deduction rules: SN(AssVar) := ⟨4, 0⟩, SN(→+) := ⟨4, 1⟩, SN(→−) := ⟨4, 2⟩, SN(∀+) := ⟨4, 3⟩, SN(∀−) := ⟨4, 4⟩. For a derivation M we define its Gödel number ⌜M⌝ by

⌜u_i^A⌝ := ⟨SN(AssVar), i, ⌜A⌝⟩,
⌜λu_i^A M⌝ := ⟨SN(→+), i, ⌜A⌝, ⌜M⌝⟩,
⌜MN⌝ := ⟨SN(→−), ⌜M⌝, ⌜N⌝⟩,
⌜λxi M⌝ := ⟨SN(∀+), i, ⌜M⌝⟩,
⌜Mr⌝ := ⟨SN(∀−), ⌜M⌝, ⌜r⌝⟩.
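The clauses translate directly into a recursive computation of Gödel numbers once a concrete sequence coding ⟨. . .⟩ is fixed. The sketch below uses a simple Cantor-pairing-based code rather than the elementary coding of 2.2.5, and an invented tuple representation of terms; both are illustrative assumptions, not the book's definitions:

```python
def pair(a, b):                      # Cantor pairing, injective on N x N
    return (a + b) * (a + b + 1) // 2 + b

def seq(*ns):                        # <n0, ..., n_{k-1}>, with <> = 0
    code = 0
    for n in reversed(ns):
        code = pair(n, code) + 1     # +1 keeps nonempty-sequence codes > 0
    return code

SN_VAR = seq(0)                      # SN(Var) := <0>

def sn_fun(n, j):                    # SN(f) := <2, n, j> for the j-th n-ary f
    return seq(2, n, j)

def gn(term):
    """GN of x_i is <SN(Var), i>; GN of f r1...rn is <SN(f), GN(r1), ..., GN(rn)>.
    A term is ('var', i) or ('fun', arity, j, args)."""
    if term[0] == 'var':
        return seq(SN_VAR, term[1])
    _, arity, j, args = term
    return seq(sn_fun(arity, j), *(gn(r) for r in args))

x0 = ('var', 0)
zero = ('fun', 0, 0, [])             # a constant, i.e. 0-ary function symbol
succ_x0 = ('fun', 1, 0, [x0])        # a unary function symbol applied to x0
```

Distinct terms receive distinct code numbers, since the sequence code is injective and the symbol numbers are assigned injectively.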

It will be helpful in the sequel to have some general estimates on Gödel numbers, which we provide here. For a term r or formula A we define its sum of maximal sequence lengths ||r|| or ||A|| by

||xi|| := 2,
||f r0 . . . rk−1|| := k + 1 + max(||ri||),
||R r0 . . . rk−1|| := k + 1 + max(||ri||),
||A → B|| := 3 + max(||A||, ||B||),
||∀x A|| := 3 + ||A||,

and its symbol bound Symb(r) or Symb(A) by

Symb(xi) := max(SN(Var), i) + 1,
Symb(f r0 . . . rk−1) := max(SN(f), max(Symb(ri))),
Symb(R r0 . . . rk−1) := max(SN(R), max(Symb(ri))),
Symb(A → B) := max(Symb(A), Symb(B)),
Symb(∀x A) := Symb(A).

Lemma. ||r|| ≤ ⌜r⌝ < Symb(r)^(2^||r||) and ||A|| ≤ ⌜A⌝ < Symb(A)^(2^||A||).

Proof. We prove ||r|| ≤ ⌜r⌝ by induction on r. The case of a variable xi is easy:

||xi|| = 2 ≤ ⟨SN(Var), i⟩ = ⌜xi⌝,

and for a term f r0 . . . rk−1 we have by the sequence coding

⌜f r0 . . . rk−1⌝ = ⟨SN(f), ⌜r0⌝, . . . , ⌜rk−1⌝⟩
    ≥ k + 1 + max(⌜ri⌝)
    ≥ k + 1 + max(||ri||)   by induction hypothesis
    = ||f r0 . . . rk−1||.

The proof of ||A|| ≤ pAq is similar. For prq < Symb(r)2||r||

we again useinduction on r. For a variable xi we obtain by the estimate in 2.2.5

pxiq = 〈SN(Var), i〉 < Symb(xi)22

= Symb(xi)2||xi||

and for a term r := fr0 . . . rk−1 built with a function symbol f we have

pfr0 . . . rk−1q

= 〈SN(f), pr0q, . . . , prk−1q〉

≤ 〈n−· 1, n−· 1, . . . , n−· 1︸ ︷︷ ︸k+1

〉 with n := Symb(r)2max ||ri|| , by ind. hyp.

< n2k+1by the estimate in 2.2.5

= Symb(r)2k+1+max ||ri|| = Symb(r)2

||r||.

The proof of pAq < Symb(A)2||A||

is again similar, but we spell out thequantifier case A = ∀xiB:

pAq = 〈SN(∀), i, pBq〉 ≤ max(SN(∀), i,Symb(B)2||B||

)23 ≤ Symb(A)2

||A||.


3.2.2. Elementary functions on Gödel numbers. We shall define an elementary predicate Deriv such that Deriv(d) if and only if d is the Gödel number of a derivation. To this end we need a number of auxiliary functions and relations, which will all be elementary and have the properties described. (The convention is that relations are capitalized and functions are lower case). First we need some basic notions:

Ter(t) t is GN of a term,

For(a) a is GN of a formula,

FV(i, y) the variable xi is free in the term or formula with GN y,

fmla(d) GN of the formula derived by the derivation with GN d.

By the context of a derivation M we mean the set u_{i0}^{A0}, . . . , u_{in−1}^{An−1} of its free assumption variables, where i0 < · · · < in−1. Its Gödel number is defined to be the least number c such that ∀ν<n((c)_{iν} = ⌜Aν⌝).

ctx(d) GN of the context of the derivation with GN d,

Cons(c1, c2) the contexts with GN c1, c2 are consistent.

Then Deriv can be defined by course-of-values recursion, using the next-to-last lemma in 2.2.5.

Deriv(d) := ((d)0 = SN(AssVar) ∧ lh(d) = 3 ∧ For((d)2)) ∨
  ((d)0 = SN(→+) ∧ lh(d) = 4 ∧ For((d)2) ∧ Deriv((d)3) ∧
    ((ctx((d)3))_{(d)1} ≠ 0 → (ctx((d)3))_{(d)1} = (d)2)) ∨
  ((d)0 = SN(→−) ∧ lh(d) = 3 ∧ Deriv((d)1) ∧ Deriv((d)2) ∧
    Cons(ctx((d)1), ctx((d)2)) ∧
    (fmla((d)1))0 = SN(→) ∧ (fmla((d)1))1 = fmla((d)2)) ∨
  ((d)0 = SN(∀+) ∧ lh(d) = 3 ∧ Deriv((d)2) ∧
    ∀i<lh(ctx((d)2))((ctx((d)2))i ≠ 0 → ¬FV((d)1, (ctx((d)2))i))) ∨
  ((d)0 = SN(∀−) ∧ lh(d) = 3 ∧ Deriv((d)1) ∧ Ter((d)2) ∧
    (fmla((d)1))0 = SN(∀)).
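The course-of-values recursion defining Deriv can be mirrored in code. The following sketch is an illustration under simplifying assumptions, not the book's arithmetized definition: it covers only the implicational rules (AssVar, →+, →−), and it works on nested Python tuples tagged with strings instead of sequence numbers, so lh and (d)i become len and indexing, and contexts become dictionaries from assumption indices to formulas.

```python
IMP = '->'

def fmla(d):
    """Formula derived by derivation d (implicational fragment only)."""
    tag = d[0]
    if tag == 'assvar':        # d = ('assvar', i, A) derives A
        return d[2]
    if tag == '->+':           # d = ('->+', i, A, M) derives A -> fmla(M)
        return (IMP, d[2], fmla(d[3]))
    if tag == '->-':           # d = ('->-', M, N) derives B from A -> B and A
        return fmla(d[1])[2]

def ctx(d):
    """Free assumption variables of d, as a dict index -> formula."""
    tag = d[0]
    if tag == 'assvar':
        return {d[1]: d[2]}
    if tag == '->+':           # ->+ discharges assumption d[1]
        c = dict(ctx(d[3])); c.pop(d[1], None); return c
    if tag == '->-':
        return {**ctx(d[1]), **ctx(d[2])}

def deriv(d):
    """Mirror of the Deriv clauses: does d code a correct derivation?"""
    tag = d[0]
    if tag == 'assvar':
        return len(d) == 3
    if tag == '->+':
        # if assumption d[1] is actually used, its formula must be d[2]
        return (len(d) == 4 and deriv(d[3]) and
                ctx(d[3]).get(d[1], d[2]) == d[2])
    if tag == '->-':
        if not (deriv(d[1]) and deriv(d[2])):
            return False
        a, b = fmla(d[1]), fmla(d[2])
        cons = all(ctx(d[1])[i] == ctx(d[2])[i]          # Cons(c1, c2)
                   for i in ctx(d[1]).keys() & ctx(d[2]).keys())
        return cons and isinstance(a, tuple) and a[0] == IMP and a[1] == b
    return False
```

For example, the derivation λu0^A. u0^A of A → A becomes the tuple ('->+', 0, 'A', ('assvar', 0, 'A')), which deriv accepts and whose fmla is ('->', 'A', 'A').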

Still further auxiliary functions are needed. A substitution is a map xi0 ↦ r0, . . . , xin−1 ↦ rn−1 with i0 < · · · < in−1 from variables to terms; its Gödel number is the least number s such that ∀ν<n((s)_{iν} = ⌜rν⌝). Hence (s)_{iν} = 0 indicates that s leaves xiν unchanged.

union(c1, c2)   GN of the union of the consistent contexts with GN c1, c2,
remove(c, i)    GN of the result of removing ui from the context with GN c,
sub(x, s)       GN of the result of applying the substitution with GN s to the term or formula with GN x,
update(s, i, t) GN of the result of updating the substitution with GN s by changing its entry at i to the term with GN t.

We now give definitions of all these; from the form of the definitions it will be clear that they have the required properties, and are elementary.


Update. This can be defined explicitly, using the bounded least number operator:

update(s, i, t) := μ_{x<h(max(s,t),max(lh(s),i))}((x)i = t ∧ ∀k<max(lh(s),i)(k ≠ i → (x)k = (s)k)),

where h(n, k) := (n + 1)^(2^k) is the elementary function defined earlier with the property 〈n, . . . , n〉 ≤ h(n, k).
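The bounded least number operator used here, and repeatedly below, is just a bounded search. A minimal sketch with our own helper names; the "return 0 on failure" convention is an assumption on our part, since in the text the searches are always known to succeed within the bound. The companion update works on a list model of substitutions rather than on their sequence numbers.

```python
def mu_below(bound, pred):
    """Bounded least number operator: least x < bound with pred(x), else 0."""
    for x in range(bound):
        if pred(x):
            return x
    return 0

def update(s, i, t):
    """List-model analogue of update(s, i, t): the substitution s with its
    entry at slot i changed to t; missing slots are 0 ('leave unchanged')."""
    out = list(s) + [0] * max(0, i + 1 - len(s))
    out[i] = t
    return out
```

For instance, mu_below(10, lambda x: x * x >= 17) evaluates to 5, and update([3, 0, 4], 4, 9) pads the substitution with 0-entries before setting slot 4.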

Substitution. The substitution function defined next takes a formula or term with GN x and applies to it a substitution with GN s to produce a new formula with GN y. The substitution works by assigning specific terms to the free variables, but in order to avoid clashing it must also reassign new variables to the universally bound ones. This occurs in the final clause of the definition where, to be on the safe side, we (recursively) assign to a bound variable the new variable with index x + i(s), where i(s) is the maximum index of any variable occurring in a value term (s)j of s. We define substitution by a limited course-of-values recursion with parameter substitutions:

sub(x, s) :=
  x   if (x)0 = SN(Var) ∧ (s)_{(x)1} = 0,
  (s)_{(x)1}   if (x)0 = SN(Var) ∧ (s)_{(x)1} ≠ 0,
  μ_{y≤k(x,s)}(lh(x) = lh(y) ∧ (x)0 = (y)0 ∧ ∀i<l(sub((x)_{i+1}, s) = (y)_{i+1}))
      if (x)0,0 = 1 ∨ (x)0,0 = 2 ∨ (x)0 = SN(→),
  〈SN(∀), x + i(s), sub((x)2, update(s, (x)1, 〈SN(Var), x + i(s)〉))〉
      if (x)0 = SN(∀),
  0   otherwise,

sub(x, s) ≤ k(x, s),

where it is assumed that the relation and function symbols in the given language L all have arity ≤ l. The bound k(x, s) and a bound for the iterated parameter updates remain to be provided, so that the last lemma in 2.2.5 can be applied. Then sub will be elementary.

First notice that as s is continually updated by the recursion, for the sake of (the formula or term with GN) x, the first update assigns to a bound variable in x a "new" variable with index x + i(s). The next update will then assign to a bound variable in some subformula x′ of x a new variable with index x′ + x + i(s), etcetera. The final update will therefore be a sequence of length ≤ x² + i(s), whose entries are all < max(s, 〈SN(Var), x² + i(s)〉). Thus a bound for all iterated updates starting from s and x is this last expression to the power of 2^(x²+i(s)), which is elementary.

Using the lemma in 3.2.1 above one can see that if x is the GN of a term or a formula X and s is the GN of a substitution S, so that we may write sub(x, s) = ⌜X[S]⌝, then Symb(X[S]) ≤ max(s, x, x² + i(s)) ≤ x² + s and, clearly, ||X[S]|| ≤ x + s. The lemma then gives an elementary bound k(x, s) := (x² + s)^(2^(x+s)) for sub(x, s).


Remove, union, consistency, context. Removal of an assumption variable from a context is defined by

remove(c, i) := μ_{x≤c}((x)i = 0 ∧ ∀j<lh(c)(j ≠ i → (x)j = (c)j)).

The union of two consistent contexts can again be defined by the bounded μ-operator:

union(c1, c2) := μ_{c≤c1∗c2}∀i<max(lh(c1),lh(c2))((c)i = max((c1)i, (c2)i)).

Consistency of two contexts is defined by

Cons(c1, c2) := ∀i<max(lh(c1),lh(c2))((c1)i ≠ 0 → (c2)i ≠ 0 → (c1)i = (c2)i).

The context of a derivation is defined by

ctx(d) := μ_{c≤d}(((d)0 = SN(AssVar) ∧ (c)_{(d)1} = (d)2) ∨
  ((d)0 = SN(→+) ∧ c = remove(ctx((d)3), (d)1)) ∨
  ((d)0 = SN(→−) ∧ c = union(ctx((d)1), ctx((d)2))) ∨
  ((d)0 = SN(∀+) ∧ c = ctx((d)2)) ∨
  ((d)0 = SN(∀−) ∧ c = ctx((d)1))).

Formulas, free variables, terms. The end formula of a derivation is defined by

fmla(d) := μ_{a≤d^(2^d)}(((d)0 = SN(AssVar) ∧ a = (d)2) ∨
  ((d)0 = SN(→+) ∧ a = 〈SN(→), (d)2, fmla((d)3)〉) ∨
  ((d)0 = SN(→−) ∧ a = (fmla((d)1))2) ∨
  ((d)0 = SN(∀+) ∧ a = 〈SN(∀), (d)1, fmla((d)2)〉) ∨
  ((d)0 = SN(∀−) ∧ sub((fmla((d)1))2, μ_{s≤d}((s)_{(fmla((d)1))1} = (d)2)) = a)).

Notice that this is the only place in our definitions of auxiliary functions and relations where the substitution function is needed.

Freeness of a variable xi in a term or formula is defined by

FV(i, y) := ((y)0 = SN(Var) ∧ (y)1 = i) ∨
  ((y)0,0 = 1 ∧ ∃j<lh(y)−·1 FV(i, (y)_{j+1})) ∨
  ((y)0,0 = 2 ∧ ∃j<lh(y)−·1 FV(i, (y)_{j+1})) ∨
  ((y)0 = SN(→) ∧ (FV(i, (y)1) ∨ FV(i, (y)2))) ∨
  ((y)0 = SN(∀) ∧ i ≠ (y)1 ∧ FV(i, (y)2)).

The sets of formulas and terms are defined by

For(a) := ((a)0,0 = 1 ∧ Symb_L((a)0) ∧ lh(a) = (a)0,1 + 1 ∧ ∀j<(a)0,1 Ter((a)_{j+1})) ∨
  ((a)0 = SN(→) ∧ lh(a) = 3 ∧ For((a)1) ∧ For((a)2)) ∨
  ((a)0 = SN(∀) ∧ lh(a) = 3 ∧ For((a)2)),

Ter(t) := ((t)0 = SN(Var) ∧ lh(t) = 2) ∨
  ((t)0,0 = 2 ∧ Symb_L((t)0) ∧ lh(t) = (t)0,1 + 1 ∧ ∀j<(t)0,1 Ter((t)_{j+1})).


Recall that for simplicity we have left out the logical connectives ∧, ∨ and ∃. They could be added easily, including an extension of the notion of a derivation to also allow their axioms as listed in 1.1.7.

3.2.3. Axiomatized theories. Call a relation recursive if its (total) characteristic function is recursive. A set S of formulas is called recursive (elementary, primitive recursive, recursively enumerable) if ⌜S⌝ := { ⌜A⌝ | A ∈ S } is recursive (elementary, primitive recursive, recursively enumerable). Clearly the sets Stab_L of stability axioms and Eq_L of L-equality axioms are elementary. Now let L be an elementarily presented language with = in L. A theory T with L(T) ⊆ L is recursively (elementarily, primitive recursively) axiomatizable if there is a recursive (elementary, primitive recursive) set S of closed L-formulas such that T = { A ∈ L | S ∪ Eq_L ⊢ A }.

Theorem. For theories T with L(T) ⊆ L the following are equivalent.
(a) T is recursively axiomatizable.
(b) T is primitive recursively axiomatizable.
(c) T is elementarily axiomatizable.
(d) T is recursively enumerable.

Proof. (d) → (c). Let ⌜T⌝ be recursively enumerable. Then there is an elementary f such that ⌜T⌝ = ran(f). Let f(n) = ⌜An⌝. We define an elementary function g with the property g(n) = ⌜A0 ∧ · · · ∧ An⌝ by

g(0) := f(0),
g(n + 1) := g(n) ∧ f(n + 1),
g(n) ≤ max(SN(∧), m)^(2^(4n+m)) where m := max_{i≤n} f(i),

with a ∧ b := 〈SN(∧), a, b〉. For S := { A0 ∧ · · · ∧ An | n ∈ N } we have ⌜S⌝ = ran(g), and this set is elementary because of a ∈ ran(g) ↔ ∃n<a(a = g(n)). T is elementarily axiomatizable, since T = { A ∈ L | S ∪ Eq_L ⊢ A }.
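The step from the enumeration f to the conjunction function g can be illustrated with a toy coding. Everything below is a stand-in: AND(a, b) = 2^a · 3^b replaces 〈SN(∧), a, b〉, and f is an arbitrary hypothetical enumeration. What matters is only that g grows fast enough for membership in ran(g) to be decidable by the bounded search ∃n<a(a = g(n)).

```python
def AND(a, b):
    """Stand-in pairing for <SN(/\), a, b>; strictly monotone in both slots."""
    return 2 ** a * 3 ** b

def f(n):
    """A hypothetical enumeration of (codes of) axioms."""
    return n + 1

def g(n):
    """g(n) = code of A_0 /\ ... /\ A_n, built by the recursion in the text."""
    acc = f(0)
    for k in range(1, n + 1):
        acc = AND(acc, f(k))
    return acc

def in_ran_g(a):
    """a in ran(g) iff g(n) = a for some n < a; since g is increasing and
    g(n) > n, the search can stop as soon as g(n) overshoots a."""
    n = 0
    while n < a:
        v = g(n)
        if v == a:
            return True
        if v > a:
            return False
        n += 1
    return False
```

The early exit once g(n) > a keeps the search cheap even though g grows very quickly.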

(c) → (b) and (b) → (a) are clear.

(a) → (d). Let T be axiomatized by S with ⌜S⌝ recursive. Then

a ∈ ⌜T⌝ ↔ ∃d(Deriv(d) ∧ fmla(d) = a ∧ ∀i<a ¬FV(i, a) ∧
  ∀i<lh(ctx(d))((ctx(d))i ∈ ⌜Eq⌝ ∪ ⌜S⌝)).

Hence ⌜T⌝ is recursively enumerable. □

Call a theory T in our elementarily presented language L axiomatized if it is given by a recursively enumerable axiom system Ax_T. By the theorem just proved we can even assume that Ax_T is elementary. For such axiomatized theories we define a binary relation Prf_T by

Prf_T(d, a) := Deriv(d) ∧ fmla(d) = a ∧ ∀i<lh(ctx(d))((ctx(d))i ∈ ⌜Eq⌝ ∪ ⌜Ax_T⌝).

Clearly Prf_T is elementary, and Prf_T(d, a) if and only if d is the GN of a derivation of the formula with GN a from a context composed of equality axioms and formulas from Ax_T. A theory T is consistent if ⊥ ∉ T; otherwise T is inconsistent. A theory T is complete if for every closed formula A we have A ∈ T or ¬A ∈ T, and incomplete otherwise.

Corollary. Let T be a consistent theory. If T is axiomatized and complete then T is recursive.


Proof. We define the characteristic function c_⌜T⌝ of ⌜T⌝ as follows. c_⌜T⌝(a) is 0 if ¬For(a) or ∃i<a FV(i, a). Otherwise it is defined by

c_⌜T⌝(a) := (μx((Prf_T((x)0, a) ∧ (x)1 = 1) ∨ (Prf_T((x)0, ¬a) ∧ (x)1 = 0)))_1

with ¬a := 〈SN(→), a, SN(⊥)〉. Completeness of T implies that c_⌜T⌝ is total, and consistency that it is indeed the characteristic function of ⌜T⌝. □

3.2.4. Undefinability of the notion of truth. Let M be an L-structure. A relation R ⊆ |M|^n is called definable in M if there is an L-formula A(x1, . . . , xn) such that

R = { (a1, . . . , an) ∈ |M|^n | M ⊨ A(x1, . . . , xn)[x1 := a1, . . . , xn := an] }.

We assume in this section that |M| = N, 0 is a constant in L and S is a unary function symbol in L with 0^M = 0 and S^M(a) = a + 1. Recall that for every a ∈ N the numeral ā ∈ Ter_L is defined by 0̄ := 0 and by letting the numeral of n + 1 be S n̄. Observe that in this case the definability of R ⊆ N^n by A(x1, . . . , xn) is equivalent to

R = { (a1, . . . , an) ∈ N^n | M ⊨ A(ā1, . . . , ān) }.

Furthermore let L be an elementarily presented language. We will always assume in this chapter that every elementary relation is definable in M. A set S of formulas is called definable in M if ⌜S⌝ := { ⌜A⌝ | A ∈ S } is.

We shall show that already from these assumptions it follows that the notion of truth for M, more precisely the set Th(M) of all closed formulas valid in M, is undefinable in M. From this it will follow that the notion of truth is in fact undecidable, for otherwise the set Th(M) would be recursive (Church's Thesis), hence recursively enumerable, and hence definable, because we have assumed already that all elementary relations are definable in M and so their projections are definable also. For the proof we shall need the following Fixed Point Lemma, which will be generalized in 3.3.2.

Lemma (Semantical Fixed Point Lemma). If every elementary relation is definable in M, then for every L-formula B(z) we can find a closed L-formula A such that

M ⊨ A if and only if M ⊨ B(⌜A⌝).

Proof. Let s be the elementary function satisfying, for every formula C = C(z) with z := x0,

s(⌜C⌝, k) = sub(⌜C⌝, 〈⌜k̄⌝〉) = ⌜C(k̄)⌝,

where sub is the substitution function already defined in 3.2.2. Hence in particular

s(⌜C⌝, ⌜C⌝) = ⌜C(⌜C⌝)⌝.

By assumption the graph G_s of s is definable in M, by A_s(x1, x2, x3) say. Let

C := ∃x(B(x) ∧ A_s(z, z, x)), A := C(⌜C⌝),

and therefore

A = ∃x(B(x) ∧ A_s(⌜C⌝, ⌜C⌝, x)).

Hence M ⊨ A if and only if ∃a∈N((M ⊨ B(ā)) ∧ a = ⌜C(⌜C⌝)⌝), which is the same as M ⊨ B(⌜A⌝). □
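The diagonalization in this proof is the same device that makes self-reproducing programs (quines) work, and it can be sketched at the level of strings: quotation stands in for Gödel numbering, textual replacement for the function s. All names below are ours, and 'subst' is deliberately left uninterpreted inside the string, just as A_s merely describes s inside the formula.

```python
def quote(e):
    """Stand-in for passing the Goedel number of e (as a numeral)."""
    return repr(e)

def C_of(B):
    """From B(z) form C(z) := B(subst(z, z)), describing s(z, z) inside B."""
    return B.replace('z', 'subst(z, z)')

def diagonal(B):
    """The closed 'formula' A := C(quote(C)); its inner subst term denotes
    the whole string itself, mirroring s(<C>, <C>) = <C(<C>)> = <A>."""
    C = C_of(B)
    return C.replace('z', quote(C))
```

For B = 'Provable(z)' this yields Provable(subst('Provable(subst(z, z))', 'Provable(subst(z, z))')): a sentence asserting B of a term that denotes the sentence's own quotation.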


Theorem (Tarski’s Undefinability Theorem). Assume that every ele-mentary relation is definable in M. Then Th(M) is undefinable in M,hence in particular not recursively enumerable.

Proof. Assume that pTh(M)q is definable by BW (z). Then for allclosed formulas A

M |= A if and only if M |= BW (pAq).

Now consider the formula ¬BW (z) and choose by the Fixed Point Lemmaa closed L-formula A such that

M |= A if and only if M |= ¬BW (pAq).

This contradicts the equivalence above.We already have noticed that all recursively enumerable relations are

definable in M. Hence it follows that pTh(M)q cannot be recursively enu-merable.

3.3. The Notion of Truth in Formal Theories

We now want to generalize the arguments of the previous section. There we have made essential use of the notion of truth in a structure M, i.e., of the relation M ⊨ A. The set of all closed formulas A such that M ⊨ A has been called the theory of M, denoted Th(M).

Now instead of Th(M) we shall start more generally from an arbitrary theory T. We consider the question as to whether in T there is a notion of truth (in the form of a truth formula B(z)), such that B(z) "means" that z is "true". A consequence is that we have to explain all the notions used without referring to semantical concepts at all.

(i) z ranges over closed formulas (or sentences) A, or more precisely over their Gödel numbers ⌜A⌝.
(ii) "A is true" is to be replaced by T ⊢ A.
(iii) "C is equivalent to D" is to be replaced by T ⊢ C ↔ D.

Hence the question now is whether there is a truth formula B(z) such that T ⊢ A ↔ B(⌜A⌝) for all sentences A. The result will be that this is impossible, under rather weak assumptions on the theory T. Technically, the issue will be to replace the notion of definability by the notion of "representability" within a formal theory. We begin with a discussion of this notion.

In this section we assume that L is an elementarily presented language with 0, S and = in L, and T an L-theory containing the equality axioms Eq_L.

3.3.1. Representable relations and functions.

Definition. A relation R ⊆ N^n is representable in T if there is a formula A(x1, . . . , xn) such that

T ⊢ A(a1, . . . , an) if (a1, . . . , an) ∈ R,
T ⊢ ¬A(a1, . . . , an) if (a1, . . . , an) ∉ R.

A function f : N^n → N is called representable in T if there is a formula A(x1, . . . , xn, y) representing the graph G_f ⊆ N^(n+1) of f, i.e., such that

(3.1) T ⊢ A(a1, . . . , an, f(a1, . . . , an)),
(3.2) T ⊢ ¬A(a1, . . . , an, c) if c ≠ f(a1, . . . , an),

and such that in addition

(3.3) T ⊢ A(a1, . . . , an, y) ∧ A(a1, . . . , an, z) → y = z for all a1, . . . , an ∈ N.

Note that in case T ⊢ b ≠ c for b < c, condition (3.2) follows from (3.1) and (3.3).

Lemma. If the characteristic function c_R of a relation R ⊆ N^n is representable in T, then so is the relation R itself.

Proof. For simplicity assume n = 1. Let A(x, y) be a formula representing c_R. We show that A(x, 1) represents the relation R. Assume a ∈ R. Then c_R(a) = 1, hence (a, 1) ∈ G_{c_R}, hence T ⊢ A(a, 1). Conversely, assume a ∉ R. Then c_R(a) = 0, hence (a, 1) ∉ G_{c_R}, hence T ⊢ ¬A(a, 1). □

3.3.2. Undefinability of the notion of truth in formal theories.

Lemma (Fixed Point Lemma). Assume that all elementary functions are representable in T. Then for every formula B(z) we can find a closed formula A such that

T ⊢ A ↔ B(⌜A⌝).

Proof. The proof is very similar to the proof of the Semantical Fixed Point Lemma. Let s be the elementary function introduced there and A_s(x1, x2, x3) a formula representing s in T. Let

C := ∃x(B(x) ∧ A_s(z, z, x)), A := C(⌜C⌝),

and therefore

A = ∃x(B(x) ∧ A_s(⌜C⌝, ⌜C⌝, x)).

Because of s(⌜C⌝, ⌜C⌝) = ⌜C(⌜C⌝)⌝ = ⌜A⌝ we can prove in T

A_s(⌜C⌝, ⌜C⌝, x) ↔ x = ⌜A⌝,

hence by definition of A also

A ↔ ∃x(B(x) ∧ x = ⌜A⌝)

and therefore

A ↔ B(⌜A⌝). □

Note that for T = Th(M) we obtain the Semantical Fixed Point Lemma above as a special case.

Theorem. Let T be a consistent theory such that all elementary functions are representable in T. Then there cannot exist a formula B(z) defining the notion of truth, i.e., such that for all closed formulas A

T ⊢ A ↔ B(⌜A⌝).

Proof. Assume we had such a B(z). Consider the formula ¬B(z) and choose by the Fixed Point Lemma a closed formula A such that

T ⊢ A ↔ ¬B(⌜A⌝).

For this A we obtain T ⊢ A ↔ ¬A, contradicting the consistency of T. □

With T := Th(M) Tarski’s Undefinability Theorem is a special case.


3.4. Undecidability and Incompleteness

Consider a consistent formal theory T with the property that all recursive functions are representable in T. This is a very weak assumption, as we shall show in the next section: it is always satisfied if the theory allows us to develop a certain minimum of arithmetic. We shall show that such a theory necessarily is undecidable. First we shall prove a (weak) First Incompleteness Theorem saying that every axiomatized such theory must be incomplete, and then we prove a sharpened form of this theorem due to Gödel and then Rosser, which explicitly provides a closed formula A such that neither A nor ¬A is provable in the theory T.

In this section let L again be an elementarily presented language with 0, S, = in L and T a theory containing the equality axioms Eq_L.

3.4.1. Undecidability.

Theorem (Undecidability). Assume that T is a consistent theory such that all recursive functions are representable in T. Then T is not recursive.

Proof. Assume that T is recursive. By assumption there then exists a formula B(z) representing ⌜T⌝ in T. Choose by the Fixed Point Lemma a closed formula A such that

T ⊢ A ↔ ¬B(⌜A⌝).

We shall prove (∗) T ⊬ A and (∗∗) T ⊢ A; this is the desired contradiction.

Ad (∗). Assume T ⊢ A. Then A ∈ T, hence ⌜A⌝ ∈ ⌜T⌝, hence T ⊢ B(⌜A⌝) (because B(z) represents in T the set ⌜T⌝). By the choice of A it follows that T ⊢ ¬A, which contradicts the consistency of T.

Ad (∗∗). By (∗) we know T ⊬ A. Therefore A ∉ T, hence ⌜A⌝ ∉ ⌜T⌝, and therefore T ⊢ ¬B(⌜A⌝). By the choice of A it follows that T ⊢ A. □

3.4.2. Incompleteness.

Theorem (First Incompleteness Theorem). Assume that T is an axiomatized consistent theory with the property that all recursive functions are representable in T. Then T is incomplete.

Proof. This is an immediate consequence of the fact that every axiomatized consistent theory which is complete is also recursive (a corollary in 3.2.3), and the Undecidability Theorem above. □

As already mentioned, we now sharpen the Incompleteness Theorem in the sense that we actually produce a formula A such that neither A nor ¬A is provable. Gödel's first incompleteness theorem produced such an A under the assumption that the theory satisfied a stronger condition than mere consistency, namely "ω-consistency". Rosser then improved Gödel's result by showing, with a somewhat more complicated formula, that mere consistency is all that is required.

Theorem (Gödel-Rosser). Let T be axiomatized and consistent. Assume that there is a formula L(x, y) – written x < y – such that

(3.4) T ⊢ ∀x<n(x = 0 ∨ · · · ∨ x = n − 1),
(3.5) T ⊢ ∀x(x = 0 ∨ · · · ∨ x = n ∨ n < x).

Assume also that every elementary function is representable in T. Then we can find a closed formula A such that neither A nor ¬A is provable in T.

Proof. We first define Refut_T ⊆ N × N by

Refut_T(d, a) := Prf_T(d, ¬a).

Then Refut_T is elementary, and Refut_T(d, a) if and only if d is the GN of a derivation of the negation of a formula with GN a from a context composed of equality axioms and formulas from Ax_T. Let B_PrfT(x1, x2) and B_RefutT(x1, x2) be formulas representing Prf_T and Refut_T, respectively. Choose by the Fixed Point Lemma a closed formula A such that

T ⊢ A ↔ ∀x(B_PrfT(x, ⌜A⌝) → ∃y<x B_RefutT(y, ⌜A⌝)).

A expresses its own underivability, in the form (due to Rosser): "For every proof of me there is a shorter proof of my negation".

We shall show (∗) T ⊬ A and (∗∗) T ⊬ ¬A.

Ad (∗). Assume T ⊢ A. Choose n such that

Prf_T(n, ⌜A⌝).

Then we also have

not Refut_T(m, ⌜A⌝) for all m,

since T is consistent. Hence we have

T ⊢ B_PrfT(n, ⌜A⌝),
T ⊢ ¬B_RefutT(m, ⌜A⌝) for all m.

By (3.4) we can conclude

T ⊢ B_PrfT(n, ⌜A⌝) ∧ ∀y<n ¬B_RefutT(y, ⌜A⌝).

Hence we have

T ⊢ ∃x(B_PrfT(x, ⌜A⌝) ∧ ∀y<x ¬B_RefutT(y, ⌜A⌝)),
T ⊢ ¬A.

This contradicts the assumed consistency of T.

This contradicts the assumed consistency of T .Ad (∗∗). Assume T ` ¬A. Choose n such that

RefutT (n, pAq).

Then we also have

not PrfT (m, pAq) for all m,

since T is consistent. Hence we have

T ` BRefutT(n, pAq),

T ` ¬BPrfT(m, pAq) for all m.

This implies

T ` ∀x(BPrfT(x, pAq)→ ∃y<xBRefutT

(y, pAq)),

Page 117: Schwichtenberg & Wainer- Proofs and Computations

3.5. REPRESENTABILITY 107

as can be seen easily by cases on x, using (3.5). Hence T ` A. But thisagain contradicts the assumed consistency of T .

Finally we formulate a variant of this theorem which does not assume that the theory T talks about numbers only. Call T a theory with defined natural numbers if there is a formula N(x) – written Nx – such that T ⊢ N0 and T ⊢ ∀x∈N N(Sx), where ∀x∈N A is short for ∀x(Nx → A). Representing a function in such a theory of course means that the free variables in (3.3) are relativized to N:

T ⊢ ∀y,z∈N(A(a1, . . . , an, y) ∧ A(a1, . . . , an, z) → y = z) for all a1, . . . , an ∈ N.

Theorem (Gödel-Rosser). Assume that T is an axiomatized consistent theory with defined natural numbers, and that there is a formula L(x, y) – written x < y – such that

T ⊢ ∀x∈N(x < n → x = 0 ∨ · · · ∨ x = n − 1),
T ⊢ ∀x∈N(x = 0 ∨ · · · ∨ x = n ∨ n < x).

Assume also that every elementary function is representable in T. Then one can find a closed formula A such that neither A nor ¬A is provable in T.

Proof. As for the Gödel-Rosser Theorem above; just relativize all quantifiers to N. □

3.5. Representability

We show in this section that already very simple theories have the property that all recursive functions are representable in them.

3.5.1. Weak arithmetical theories.

Theorem (Weak arithmetical theories). Let L be an elementarily presented language with 0, S, = in L and T a consistent theory with defined natural numbers containing the equality axioms Eq_L and the stability axiom ∀x,y∈N(¬¬x = y → x = y). Assume that there is a formula L(x, y) – written x < y – such that

(3.6) T ⊢ Sa ≠ 0 for all a ∈ N,
(3.7) T ⊢ Sa = Sb → a = b for all a, b ∈ N,
(3.8) the functions + and · are representable in T,
(3.9) T ⊢ ∀x∈N(x ≮ 0),
(3.10) T ⊢ ∀x∈N(x < Sb → x < b ∨ x = b) for all b ∈ N,
(3.11) T ⊢ ∀x∈N(x < b ∨ x = b ∨ b < x) for all b ∈ N.

Then T fulfills the assumptions of the Gödel-Rosser Theorem relativized to N, i.e.,

(3.12) T ⊢ ∀x∈N(x < a → x = 0 ∨ · · · ∨ x = a − 1) for all a ∈ N,
(3.13) T ⊢ ∀x∈N(x = 0 ∨ · · · ∨ x = a ∨ a < x) for all a ∈ N,

and every recursive function is representable in T.


Proof. (3.12) can be proved easily by induction on a. The base case follows from (3.9), and the step from the induction hypothesis and (3.10). (3.13) immediately follows from the trichotomy law (3.11), using (3.12).

For the representability of recursive functions, first note that the formulas x = y and x < y actually do represent in T the equality and the less-than relations, respectively. From (3.6) and (3.7) we can see immediately that T ⊢ a ≠ b when a ≠ b. Assume a ≮ b. We show T ⊢ a ≮ b by induction on b. T ⊢ a ≮ 0 follows from (3.9). In the step we have a ≮ b + 1, hence a ≮ b and a ≠ b, hence by induction hypothesis and the representability (above) of the equality relation, T ⊢ a ≮ b and T ⊢ a ≠ b, hence by (3.10) T ⊢ a ≮ Sb. Now assume a < b. Then T ⊢ a ≠ b and T ⊢ b ≮ a, hence by (3.11) T ⊢ a < b.

We now show by induction on the definition of μ-recursive functions that every recursive function is representable in T. Recall (from 3.3.1) that the second condition (3.2) in the definition of representability of a function automatically follows from the other two (and hence need not be checked further). This is because T ⊢ a ≠ b for a ≠ b.

The initial functions constant 0, successor and projection (onto the i-th coordinate) are trivially represented by the formulas 0 = y, Sx = y and xi = y respectively. Addition and multiplication are represented in T by assumption. Recall that the one remaining initial function of μ-recursiveness is −·, but this is definable from the characteristic function of < by a −· b = μi(b + i ≥ a) = μi(c_<(b + i, a) = 0). We now show that the characteristic function of < is representable in T. (It will then follow that −· is representable, once we have shown that the representable functions are closed under μ.) So define

A := (x1 < x2 ∧ y = 1) ∨ (x1 ≮ x2 ∧ y = 0).

Assume a1 < a2. Then T ⊢ a1 < a2, hence T ⊢ A(a1, a2, 1). Now assume a1 ≮ a2. Then T ⊢ a1 ≮ a2, hence T ⊢ A(a1, a2, 0). Furthermore notice that ∀y,z∈N(A(a1, a2, y) ∧ A(a1, a2, z) → y = z) already follows logically from the equality axioms (by cases on a1 < a2, using stability of equality).

For the composition case, suppose f is defined from h, g1, . . . , gm by

f(~a) = h(g1(~a), . . . , gm(~a)).

By induction hypothesis we already have representing formulas A_gi(~x, yi) and A_h(~y, z). As representing formula for f we take

A_f := ∃~y∈N(A_g1(~x, y1) ∧ · · · ∧ A_gm(~x, ym) ∧ A_h(~y, z)).

Assume f(~a) = c. Then there are b1, . . . , bm such that T ⊢ A_gi(~a, bi) for each i, and T ⊢ A_h(~b, c), so by logic T ⊢ A_f(~a, c). It remains to show uniqueness, T ⊢ ∀z1,z2∈N(A_f(~a, z1) ∧ A_f(~a, z2) → z1 = z2). But this follows by logic from the induction hypothesis for the gi, which gives

T ⊢ ∀y1i,y2i∈N(A_gi(~a, y1i) ∧ A_gi(~a, y2i) → y1i = y2i = gi(~a)),

and the induction hypothesis for h, which gives

T ⊢ ∀z1,z2∈N(A_h(~b, z1) ∧ A_h(~b, z2) → z1 = z2) with bi = gi(~a).

For the μ case, suppose f is defined from g (taken here to be binary for notational convenience) by f(a) = μi(g(i, a) = 0), assuming ∀a∃i(g(i, a) = 0). By induction hypothesis we have a formula A_g(y, x, z) representing g. In this case we represent f by the formula

A_f(x, y) := Ny ∧ A_g(y, x, 0) ∧ ∀v∈N(v < y → ∃u∈N;u≠0 A_g(v, x, u)).

We first show the representability condition (3.1), that is T ⊢ A_f(a, b) when f(a) = b. Because of the form of A_f this follows from the assumed representability of g together with T ⊢ ∀v∈N(v < b → v = 0 ∨ · · · ∨ v = b − 1).

We now tackle the uniqueness condition (3.3). Given a, let b := f(a) (thus g(b, a) = 0 and b is the least such). It suffices to show

T ⊢ ∀y∈N(A_f(a, y) → y = b).

We prove T ⊢ ∀y∈N(y < b → ¬A_f(a, y)) and T ⊢ ∀y∈N(b < y → ¬A_f(a, y)), and then appeal to the trichotomy law and stability of equality.

We first show T ⊢ ∀y∈N(y < b → ¬A_f(a, y)). Now since, for any i < b, T ⊢ ¬A_g(i, a, 0) by the assumed representability of g, we obtain immediately T ⊢ ¬A_f(a, i). Hence because of T ⊢ ∀y∈N(y < b → y = 0 ∨ · · · ∨ y = b − 1) the claim follows.

Secondly, T ⊢ ∀y∈N(b < y → ¬A_f(a, y)) follows almost immediately from T ⊢ ∀y∈N(b < y → A_f(a, y) → ∃u∈N;u≠0 A_g(b, a, u)) and the uniqueness for g, T ⊢ ∀u∈N;u≠0(A_g(b, a, u) → u = 0). This completes the proof. □

3.5.2. Robinson’s theory Q. We conclude this section by consider-ing a special and particularly simple arithmetical theory due originally toRobinson (1950). Let L1 be the language given by 0, S, +, · and =, andlet Q be the theory determined by the axioms EqL1

, stability of equality¬¬x = y → x = y and

Sx 6= 0,(3.14)

Sx = Sy → x = y,(3.15)

x+ 0 = x,(3.16)

x+ Sy = S(x+ y),(3.17)

x · 0 = 0,(3.18)

x · Sy = x · y + x,(3.19)

∃z(x+ Sz = y) ∨ x = y ∨ ∃z(y + Sz = x).(3.20)
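Axioms (3.16)–(3.19) are precisely the recursion equations by which sums and products of numerals are computed, and they can be read as a tiny program. In the sketch below (our own toy representation, not the book's) a numeral is the nested tuple ('0',) or ('S', n):

```python
# The recursion equations (3.16)-(3.19) as a program on numerals 0, S0, SS0, ...
ZERO = ('0',)

def num(n):
    """The numeral of the natural number n."""
    return ZERO if n == 0 else ('S', num(n - 1))

def add(x, y):
    if y == ZERO:                  # x + 0 = x            (3.16)
        return x
    return ('S', add(x, y[1]))     # x + Sy = S(x + y)    (3.17)

def mul(x, y):
    if y == ZERO:                  # x * 0 = 0            (3.18)
        return ZERO
    return add(mul(x, y[1]), x)    # x * Sy = x * y + x   (3.19)
```

Evaluating mul(num(2), num(3)) unfolds (3.19) three times and then (3.17) repeatedly: exactly the computation that Q proves correct instance by instance.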

Theorem (Robinson’s Q). Every consistent theory T ⊇ Q fulfills theassumptions of the Godel-Rosser Theorem w.r.t. the definition L(x, y) :=∃z(x + Sz = y) of the <-relation. In particular, every recursive function isrepresentable in T .

Proof. We show that T satisfies the conditions of the previous theorem.For (3.6) and (3.7) this is clear. For (3.8) we can take x+y = z and x ·y = zas representing formulas. For (3.9) we have to show ¬∃z(x + Sz = 0); thisfollows from (3.17) and (3.14). For the proof of (3.10) we need the auxiliaryproposition

(3.21) x = 0 ∨ ∃y(x = 0 + Sy),

which will be attended to below. Assume x+Sz = Sb, hence also S(x+z) =Sb and therefore x + z = b. We must show ∃y′(x + Sy′ = b) ∨ x = b. Butthis follows from (3.21) for z. In case z = 0 we obtain x = b, and in case

Page 120: Schwichtenberg & Wainer- Proofs and Computations

110 3. GODEL’S THEOREMS

∃y(z = 0 + Sy) we have ∃y′(x + Sy′ = b), since 0 + Sy = S(0 + y). Thus(3.10) is proved. (3.11) follows immediately from (3.20). For the proof of(3.21) we use (3.20) with y = 0. It clearly suffices to exclude the first case∃z(x+ Sz = 0). But this means S(x+ z) = 0, contradicting (3.14).

Corollary (Essential undecidability of Q). Every consistent theory T ⊇ Q in an elementarily presented language L(T) is non-recursive.

Proof. This follows from the theorem above and the Undecidability Theorem in 3.4.1. □

Corollary (Undecidability of logic). The set of formulas derivable in the classical fragment of minimal logic is non-recursive.

Proof. Otherwise Q would be recursive, because a formula A is derivable in Q if and only if the implication B → A is derivable, where B is the conjunction of the finitely many axioms and equality axioms of Q. □

Remark. Note that it suffices that the underlying language contains one binary relation symbol (for =), one constant symbol (for 0), one unary function symbol (for S) and two binary function symbols (for + and ·). The study of decidable fragments of first order logic is one of the oldest research areas of Mathematical Logic. For more information see Börger et al. (1997).

3.5.3. Σ1-formulas. Reading the above proof of representability, onecan see that the representing formulas used are of a restricted form, havingno unbounded universal quantifiers and therefore defining Σ0

1-relations. Thiswill be of crucial importance for our proof of Godel’s Second IncompletenessTheorem to follow, but in addition we need to make a syntactically precisedefinition of the class of formulas involved, more specific and apparentlymore restrictive than the notion of Σ1-formula used earlier. However, asproved in the corollary below, we can still represent all recursive functionseven in the weak theory Q by means of Σ1-formulas in this more restrictivesense. Consequently provable Σ1-ness will be the same whichever definitionwe take.

Definition. For the remainder of this chapter, the Σ1-formulas of the language L1 will be those generated inductively by the following clauses:

(a) Only atomic formulas of the restricted forms x = y, x ≠ y, 0 = x, Sx = y, x + y = z and x · y = z are allowed as Σ1-formulas.
(b) If A and B are Σ1-formulas, then so are A ∧ B and A ∨ B.
(c) If A is a Σ1-formula, then so is ∀x<y A, which is an abbreviation for ∀x(∃z(x + Sz = y) → A).
(d) If A is a Σ1-formula, then so is ∃xA.
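These clauses amount to a straightforward recursive syntax check. The following Python sketch (the tuple encoding of formulas and all names here are our own, purely illustrative, not the book's) tests membership in the restricted Σ1 class:

```python
# Formulas as nested tuples, e.g. ("exists", "x", ("and", A, B)).
# Atom tags below stand for the restricted shapes of clause (a):
# x = y, x ≠ y, 0 = x, Sx = y, x + y = z, x · y = z.
ATOMS = {"eq", "neq", "zero", "succ", "add", "mul"}

def is_sigma1(f):
    tag = f[0]
    if tag in ATOMS:                  # clause (a): restricted atoms only
        return True
    if tag in ("and", "or"):          # clause (b)
        return is_sigma1(f[1]) and is_sigma1(f[2])
    if tag == "ball":                 # clause (c): bounded universal ∀x<y A
        return is_sigma1(f[2])        # f = ("ball", ("x", "y"), A)
    if tag == "exists":               # clause (d): unbounded ∃x A
        return is_sigma1(f[2])        # f = ("exists", "x", A)
    return False                      # unbounded ∀, →, ¬ are excluded

# Example: ∃z(x + Sz = y), the formula abbreviating x < y
lt = ("exists", "z", ("and", ("succ", "z", "w"), ("add", "x", "w", "y")))
print(is_sigma1(lt))                                  # True
print(is_sigma1(("forall", "x", ("eq", "x", "x"))))   # False
```

Note that negation and implication do not appear in the grammar, so the check simply rejects any tag outside the four clauses.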

Corollary. Every recursive function is representable in Q by a Σ1-formula in the language L1.

Proof. This can be seen immediately by inspecting the proof of the theorem above on weak arithmetical theories. Only notice that because of the equality axioms ∃z(x + Sz = y) is equivalent to ∃z∃w(Sz = w ∧ x + w = y) and A(0) is equivalent to ∃x(0 = x ∧ A(x)).


3.6. Unprovability of Consistency

We have seen in the theorem of Gödel-Rosser how, for every axiomatized consistent theory T satisfying certain weak assumptions, we can construct an undecidable sentence A meaning “For every proof of me there is a shorter proof of my negation”. Because A is unprovable, it is clearly true.

Gödel's Second Incompleteness Theorem provides a particularly interesting alternative to A, namely a formula ConT expressing the consistency of T . Again it turns out to be unprovable and therefore true. We shall prove this theorem in a sharpened form due to Löb.

3.6.1. Formalized Σ1-completeness. We prove an auxiliary proposition, expressing the completeness of Q with respect to Σ1-formulas.

Lemma. Let A(x1, . . . , xn) be a Σ1-formula in the language L1 determined by 0, S, +, · and =. Assume that N1 |= A(a1, . . . , an), where N1 is the standard model of L1. Then Q ⊢ A(a1, . . . , an).

Proof. By induction on the Σ1-formulas of the language L1. For atomic formulas, the cases have been dealt with either in the earlier parts of the proof of the theorem above on weak arithmetical theories, or (for x + y = z and x · y = z) they follow from the recursion equations (3.16)–(3.19).

Cases A ∧ B, A ∨ B. The claim follows immediately from the induction hypothesis.

Case ∀x<yA(x, y, z1, . . . , zn); for simplicity assume n = 1. Suppose N1 |= (∀x<yA)(b, c). Then also N1 |= A(i, b, c) for each i < b and hence by induction hypothesis Q ⊢ A(i, b, c). Now by the theorem above on Robinson's Q

Q ⊢ ∀x<b(x = 0 ∨ · · · ∨ x = b − 1),

hence Q ⊢ (∀x<yA)(b, c).

Case ∃xA(x, y1, . . . , yn); for simplicity take n = 1. Assume N1 |= (∃xA)(b). Then N1 |= A(a, b) for some a ∈ N, hence by induction hypothesis Q ⊢ A(a, b) and therefore Q ⊢ (∃xA)(b).

Lemma (Formalized Σ1-Completeness). In an appropriate theory T of arithmetic with induction, we can formally prove for any Σ1-formula A

A(~x )→ ∃pPrfT (p, pA(~x )q).

Here PrfT (p, z) is a suitable Σ1-formula which represents in Robinson's Q the recursive relation “a is the Gödel number of a proof in T of the formula with Gödel number b”. Also pA(x)q is a term which represents, in Q, the numerical function mapping a number a to the Gödel number of A(a).

Proof. We have not been precise about the theory T in which this result is to be formalized, but we shall content ourselves at this stage with merely pointing out, as we proceed, the basic properties that are required. Essentially T will be an extension of Q, together with induction formalized by the axiom schema

B(0) ∧ ∀x(B(x)→ B(Sx))→ ∀xB(x)


and it will be assumed that T has sufficiently many basic functions available to deal with the construction of appropriate Gödel numbers.

The proof proceeds by induction on the build-up of the Σ1-formula A(~x ). We consider three atomic cases, leaving the others to the reader.

Suppose A(x) is the formula 0 = x. We show T ⊢ 0 = x → ∃pPrfT (p, p0 = xq), by induction on x. The base case merely requires the construction of a numeral representing the Gödel number of the axiom 0 = 0, and the induction step is trivial because T ⊢ Sx ≠ 0.

Secondly suppose A is the formula x + y = z. We show T ⊢ ∀z(x + y = z → ∃pPrfT (p, px + y = zq)) by induction on y. If y = 0, the assumption gives x = z and one requires only the Gödel number for the axiom ∀x(x + 0 = x) which, when applied to the Gödel number of the x-th numeral, gives ∃pPrfT (p, px + 0 = zq). If y is a successor Su, then the assumption gives z = Sv where x + u = v, so by the induction hypothesis we already have a p such that PrfT (p, px + u = vq). Applying the successor to both sides, one then easily obtains from p a p′ such that PrfT (p′, px + y = zq).

Thirdly suppose A is the formula x ≠ y. We show T ⊢ ∀y(x ≠ y → ∃pPrfT (p, px ≠ yq)) by induction on x. The base case x = 0 requires a subinduction on y. If y = 0, then the claim is trivial (by ex-falso). If y = Su, we have to produce a Gödel number p such that PrfT (p, p0 ≠ Suq), but this is just an axiom. Now consider the step case x = Sv. Again we need an auxiliary induction on y. Its base case is dealt with exactly as before, and when y = Su it uses the induction hypothesis for v ≠ u together with the injectivity of the successor.

The cases where A is built up by conjunction or disjunction are rather trivial. One only requires, for example in the conjunction case, a function which combines the Gödel numbers of the proofs of the separate conjuncts into a single Gödel number of a proof of the conjunction A itself.

Now consider the case ∃yA(y, x) (with just one parameter x for simplicity). By the induction hypothesis we already have T ⊢ A(y, x) → ∃pPrfT (p, pA(y, x)q). But any Gödel number p such that PrfT (p, pA(y, x)q) can easily be transformed (by formally applying the ∃-rule) into a Gödel number p′ such that PrfT (p′, p∃yA(y, x)q). Therefore we obtain, as required, T ⊢ ∃yA(y, x) → ∃p′PrfT (p′, p∃yA(y, x)q).

Finally suppose the Σ1-formula is of the form ∀u<yA(u, x). We show

∀u<yA(u, x)→ ∃pPrfT (p, p∀u<yA(u, x)q).

By the induction hypothesis

T ⊢ A(u, x) → ∃pPrfT (p, pA(u, x)q)

so by logic

T ⊢ ∀u<yA(u, x) → ∀u<y∃pPrfT (p, pA(u, x)q).

The required result now follows immediately from the auxiliary lemma:

T ⊢ ∀u<y∃pPrfT (p, pA(u, x)q) → ∃qPrfT (q, p∀u<yA(u, x)q).

It remains only to prove this, which we do by induction on y (inside T ). In case y = 0 a proof of u < 0 → A is trivial, by ex-falso, so the required Gödel number q is easily constructed. For the step case y = Sz the assumption gives ∀u<z∃pPrfT (p, pA(u, x)q), from which follows ∃qPrfT (q, p∀u<zA(u, x)q) by the induction hypothesis. Also ∃p′PrfT (p′, pA(z, x)q). Now we only have to combine


p′ and q to obtain (by means of an appropriate “simple” function) a Gödel number q′ so that PrfT (q′, p∀u<yA(u, x)q).

3.6.2. Derivability conditions. Let T be an axiomatized consistent theory with T ⊇ Q, and possessing “enough” induction to formalize Σ1-completeness as we have just done. Define, from the associated formula PrfT , the following L1-formulas:

ThmT (x) := ∃yPrfT (y, x),

ConT := ¬∃yPrfT (y, p⊥q).

Then ThmT (x) defines in N1 the set of formulas provable in T , and we have N1 |= ConT if and only if T is consistent. Now consider the following two derivability conditions for T , due to Hilbert and Bernays (1970):

(3.22) T ⊢ A → ThmT (pAq), for A a closed Σ1-formula of the language L1,

(3.23) T ⊢ ThmT (pA → Bq) → ThmT (pAq) → ThmT (pBq).

(3.22) is just a special case of formalized Σ1-completeness for closed formulas, and (3.23) requires only that the theory T has a term that constructs, from the Gödel number of a proof of A → B and the Gödel number of a proof of A, the Gödel number of a proof of B, and furthermore this fact must be provable in T .

Theorem (Gödel's Second Incompleteness Theorem). Let T be an axiomatized consistent extension of Q, satisfying the derivability conditions (3.22) and (3.23). Then T ⊬ ConT .

Proof. Let C := ⊥ in Löb's Theorem below, which is a generalization of Gödel's original result.

Theorem (Löb). Let T be an axiomatized consistent extension of Q satisfying the derivability conditions (3.22) and (3.23). Then for any closed L1-formula C, if T ⊢ ThmT (pCq) → C, then already T ⊢ C.

Proof. Assume T ⊢ ThmT (pCq) → C. Choose A by the Fixed Point Lemma such that

(3.24) Q ⊢ A ↔ (ThmT (pAq) → C).

We must show T ⊢ C. First we show T ⊢ ThmT (pAq) → C. For shorthand, let B be the Σ1-formula ThmT (pAq). We obtain

T ⊢ A → B → C by (3.24)

T ⊢ ThmT (pA → B → Cq) by Σ1-completeness

T ⊢ B → ThmT (pB → Cq) by (3.23)

T ⊢ B → ThmT (pBq) → ThmT (pCq) again by (3.23)

T ⊢ B → ThmT (pBq) by (3.22)

T ⊢ B → ThmT (pCq) by the last two lines.

Therefore from the assumption T ⊢ ThmT (pCq) → C we immediately obtain T ⊢ ThmT (pAq) → C.

Hence T ⊢ A by (3.24), and then T ⊢ ThmT (pAq) by Σ1-completeness. But T ⊢ ThmT (pAq) → C as we have just shown, therefore T ⊢ C.


Remark. It follows that if T is any axiomatized consistent extension of Q satisfying the derivability conditions (3.22) and (3.23), then the reflection schema

ThmT (pCq) → C for closed L1-formulas C

is not derivable in T . For by Löb's Theorem, it cannot be derivable when C is underivable.

By adding to Q the induction schema for all formulas we obtain Peano Arithmetic PA, which is the most natural example of a theory T to which the results above apply. However, various weaker fragments of PA, obtained by restricting the classes of induction formulas, would serve equally well as examples of such T .

3.7. Notes

The fundamental paper on incompleteness is Gödel (1931). This paper already contains the β-function crucially needed for the representation theorem; the fixed point lemma is used implicitly. Gödel's first incompleteness theorem uses the formula “I am not provable”, a fixed point of ¬ThmT (x). To prove independence of this proposition from the underlying theory T one needs ω-consistency of T (which is automatically fulfilled if T is a subtheory of the theory of the standard model). Rosser (1936) found the sharpening presented here, using the formula “For every proof of me there is a shorter proof of my negation”. Undefinability of the notion of truth was proved originally by Tarski (1939), and undecidability of predicate logic is a result of Church (1936). The arithmetical theory Q is due to Robinson (1950).

There is also much more work on general reflection principles, which we have only touched on in the simplest case. One must mention here Smorynski (1991) and Feferman (1960).


Part 2

Provable Recursion in Classical Systems


CHAPTER 4

The Provably Recursive Functions of Arithmetic

This chapter develops the classification theory of the provably recursive functions of arithmetic. The topic has a long history tracing back to Kreisel (1951, 1952) who, in setting out his “no-counter-example” interpretation, gave the first explicit characterization of the functions “computable in” arithmetic, as those definable by recursions over standard well-orderings of the natural numbers with order-types less than ε0. Such a characterization was perhaps not so surprising in light of the earlier, groundbreaking work of Gentzen (1936), showing that these well-orderings are just the ones over which one can prove transfinite induction in arithmetic, and thereby prove the totality of functions defined by recursions over them. Subsequent work of the present authors (1970, 1971, 1972), extending previous results of Grzegorczyk (1953) and Robbin (1965), then provided other complexity characterizations in terms of natural, simply-defined hierarchies of so-called “fast growing” bounding functions. What was surprising was the deep connection later discovered, first by Ketonen and Solovay (1981), between these bounding functions and a variety of combinatorial results related to the “modified” Finite Ramsey Theorem of Paris and Harrington (1977). It is through this connection that one gains immediate access to a range of mathematically meaningful independence results for arithmetic and stronger theories. Thus, classifying the provably recursive functions of a theory not only gives a measure of its computational power, it also serves to delimit its mathematical power in providing natural examples of true mathematical statements it cannot prove. The devil lies in the detail, however, and that's what we present here.

The main ingredients of the chapter are: (i) Parsons' (1966) oft-quoted but seldom fully-exposited refinement of Kreisel's result, characterizing the functions provably recursive in fragments of arithmetic with restricted induction-complexity; (ii) their corresponding classifications in terms of the fast growing hierarchy; and (iii) applications to two of the best-known independence results: Goodstein's Theorem and the modified Finite Ramsey Theorem. Whereas Kreisel's original proof (that the provably recursive functions are “ordinal-recursive” at levels below ε0) was based on Ackermann's (1940) analysis of the epsilon-substitution method for arithmetic, our principal method will be that first developed by Schütte (1951), namely cut-elimination in infinitary logics with ordinal bounds. A wide variety of other treatments of these, and related, topics is to be found in the literature, some along similar lines to those presented here, some using quite different model-theoretic ideas, and some applying to stronger theories than just arithmetic (as we shall do later – for once the basic classification theory is established, there is no reason to stop at ε0). See for example (chronologically)


Tait (1961, 1968), Löb and Wainer (1970), Wainer (1970, 1972), Schwichtenberg (1971, 1975, 1977), Parsons (1972), Mints (1973), Zemke (1977), Paris (1980), Kirby and Paris (1982), Rose (1984), Sieg (1985, 1991), Buchholz and Wainer (1987), Buchholz (1980), Girard (1987), Takeuti (1987), Hájek and Pudlák (1993), Feferman (1992), Rathjen (1992, 1999), Sommer (1992, 1995), Tucker and Zucker (1992), Ratajczyk (1993), Buchholz et al. (1994), Buss (1994, 1998), Friedman and Sheard (1995), Weiermann (1996, 1999), Avigad and Sommer (1997), Fairtlough and Wainer (1998), Troelstra and Schwichtenberg (2000).

Recall, from the previous chapter, that a function f : Nk → N is provably Σ1, or provably recursive, in an arithmetical theory T if there is a Σ1-formula F (~x, y) (i.e., one obtained by prefixing finitely many unbounded existential quantifiers to a ∆0(exp) formula) such that

• f(~n ) = m if and only if F (~n,m) is true (in the standard model),
• T ⊢ ∃y F (~x, y),
• T ⊢ F (~x, y) ∧ F (~x, y′) → y = y′.

The theories we shall be concerned with in this chapter are PA (Peano Arithmetic) and its inductive fragments IΣn. We take, as our formalization of PA, I∆0(exp) together with all induction axioms

A(0) ∧ ∀a(A(a)→ A(a+ 1))→ A(t)

for arbitrary formulas A and (substitutible) terms t. IΣn has the same base-theory I∆0(exp), but the induction axioms are restricted to formulas A of the form Σi or Πi with i ≤ n, defined for the purposes of this chapter as follows:

Definition. Σ1-formulas have already been defined. A Π1 formula is the dual or (classically) negation of a Σ1-formula. For n > 1, a Σn formula is one formed by prefixing just one existential quantifier to a Πn−1 formula, and a Πn formula is one formed by prefixing just one universal quantifier to a Σn−1 formula. Thus only in the cases Σ1 and Π1 do strings of like quantifiers occur. In all other cases, strings of like quantifiers are assumed to have been contracted into one such, using the pairing functions π, π1, π2 which are available in I∆0(exp). This is no real restriction, but merely a matter of convenience for later results.
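The text does not fix one particular pairing; a standard choice definable in I∆0(exp) is Cantor's pairing function. The following Python sketch of π, π1, π2 is under that assumption (ours, for illustration only):

```python
def pi(x, y):
    """Cantor pairing: a bijection N x N -> N."""
    return (x + y) * (x + y + 1) // 2 + y

def unpair(z):
    """Inverse of pi: recover (x, y) from z by locating the diagonal."""
    w = 0
    while (w + 1) * (w + 2) // 2 <= z:   # largest w with w(w+1)/2 <= z
        w += 1
    t = w * (w + 1) // 2
    y = z - t
    x = w - y
    return x, y

def pi1(z): return unpair(z)[0]
def pi2(z): return unpair(z)[1]

# sanity check: unpair really inverts pi on an initial segment of N x N
assert all(unpair(pi(x, y)) == (x, y) for x in range(30) for y in range(30))
```

Contracting a quantifier block ∃y1∃y2 A(y1, y2) then amounts to ∃z A(π1(z), π2(z)), exactly as the definition describes.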

It doesn't matter whether one restricts to Σn or Πn induction formulas since, in the presence of the subtraction function, induction on a Πn formula A is reducible to induction on its Σn dual ¬A, and vice-versa. For if one replaces A(a) by ¬A(t ∸ a) in the induction axiom, and then contraposes, one obtains

A(t ∸ t) ∧ ∀a(A(t ∸ (a + 1)) → A(t ∸ a)) → A(t ∸ 0)

from which follows the induction axiom for A(a) itself, since t ∸ t = 0, t ∸ 0 = t, and t ∸ a = (t ∸ (a + 1)) + 1 if t ∸ a ≠ 0.

Historically of course, Peano's Axioms only include definitions of zero, successor, addition and multiplication, whereas the base-theory we have chosen includes predecessor, modified subtraction and exponentiation as well. We do this because I∆0(exp) is both a natural and convenient theory to have available from the start. However these extra functions can all be provably


Σ1-defined in IΣ1 from the “pure” Peano Axioms, using the Chinese Remainder Theorem, so we are not actually increasing the strength of any of the theories here by including them. Furthermore the results in this chapter would not at all be affected by adding to the base-theory any other primitive recursively defined functions one wishes.

4.1. Primitive Recursion and IΣ1

One of the most fundamental results about provable recursiveness, due originally to Parsons (1966), is the fact that the provably recursive functions of IΣ1 are exactly the primitive recursive functions. The proof is very similar to the one in the last chapter, characterizing the elementary functions as those provably recursive in I∆0(exp), but the extra power of induction on unbounded existentially quantified formulas now allows us to prove that every primitive recursion terminates.

Lemma. Every primitive recursive function is provably recursive in IΣ1.

Proof. We must show how to assign, to each primitive recursive definition of a function f , a Σ1-formula F (~x, y) ↔ ∃zC(~x, y, z) such that

(1) f(~n ) = m if and only if F (~n,m) is true (in the standard model),
(2) T ⊢ ∃y F (~x, y),
(3) T ⊢ F (~x, y) ∧ F (~x, y′) → y = y′.

In each case, C(~x, y, z) will be a ∆0(exp) formula constructed using the sequence coding machinery already shown to be definable (by bounded formulas) in I∆0(exp). It expresses that z is a uniquely determined sequence-number coding the computation of f(~x ) = y, and containing the output value y as its final component, so that y = π2(z). Condition 1 will hold automatically because of the definition of C, and condition 3 will be satisfied because of the uniqueness of z. We consider in turn each of the five definitional schemes by which the function f may be introduced:

First suppose f is the constant-zero function f(x) = 0. Then we take C(x, y, z) to be the formula y = 0 ∧ z = 〈0〉. Conditions 1, 2 and 3 are then immediately satisfied.

Similarly, if f is the successor function f(x) = x + 1 we take C(x, y, z) to be the formula y = x + 1 ∧ z = 〈x + 1〉. Again, the conditions hold trivially.

Similarly, if f is a projection function f(~x ) = xi we take C(~x, y, z) to be the formula y = xi ∧ z = 〈xi〉.

Now suppose f is defined by substitution from previously generated primitive recursive functions f0, f1, . . . , fk thus:

f(~x ) = f0(f1(~x ), . . . , fk(~x )).

For typographical ease, and without any real loss of generality, we shall fix k = 2. So assume inductively that f0, f1, f2 have already been shown to be provably recursive, with associated ∆0(exp) formulas C0, C1, C2 coding their computations. For the function f itself, define C(~x, y, z) to be the conjunction of the formulas lh(z) = 4, C1(~x, π2((z)1), (z)1), C2(~x, π2((z)2), (z)2), C0(π2((z)1), π2((z)2), y, (z)0), and (z)3 = y.

Then condition 1 holds because f(~n ) = m if and only if there are numbers m1, m2 such that f1(~n ) = m1, f2(~n ) = m2 and f0(m1,m2) = m; and


these hold if and only if there are numbers k1, k2, k0 such that C1(~n,m1, k1), C2(~n,m2, k2) and C0(m1,m2,m, k0) are all true; and these hold if and only if C(~n,m, 〈k0, k1, k2,m〉) is true. Thus f(~n ) = m if and only if F (~n,m) ≡ ∃zC(~n,m, z) is true.

Condition 2 holds as well, since from C1(~x, y1, z1), C2(~x, y2, z2) and C0(y1, y2, y, z0) we can immediately derive C(~x, y, 〈z0, z1, z2, y〉) in I∆0(exp). So from ∃y∃zC1(~x, y, z), ∃y∃zC2(~x, y, z) and ∀x1∀x2∃y∃zC0(x1, x2, y, z) we obtain a proof of ∃y F (~x, y) ≡ ∃y∃zC(~x, y, z) as required.

Condition 3 holds because, from the corresponding property for each of C0, C1 and C2, we can easily derive C(~x, y, z) ∧ C(~x, y′, z′) → y = y′ ∧ z = z′.

Finally suppose f is defined from f0 and f1 by primitive recursion:

f(~v, 0) = f0(~v) and f(~v, x+ 1) = f1(~v, x, f(~v, x))

where f0 and f1 are already assumed to be provably recursive with associated ∆0(exp) formulas C0 and C1. Define C(~v, x, y, z) to be the conjunction of the formulas C0(~v, π2((z)0), (z)0), ∀i<x C1(~v, i, π2((z)i), π2((z)i+1), (z)i+1), (z)x+1 = y, π2((z)x) = y and lh(z) = x + 2.

Then condition 1 holds because f(~l, n) = m if and only if there is a sequence number k = 〈k0, . . . , kn,m〉 such that k0 codes the computation of f(~l, 0) with value π2(k0), and for each i < n, ki+1 codes the computation of f(~l, i + 1) = f1(~l, i, π2(ki)) with value π2(ki+1), and π2(kn) = m. This is equivalent to saying F (~l, n,m) ↔ ∃z C(~l, n,m, z) is true.

For condition 2 note that in I∆0 we can prove

C0(~v, y, z)→ C(~v, 0, y, 〈z, y〉)

and C(~v, x, y, z) ∧ C1(~v, x, y, y′, z′) → C(~v, x + 1, y′, t)

for a suitable term t which removes the end-component y of z, replaces it by z′, and then adds the final value-component y′. Specifically t = π(π(π1(z), z′), y′). Hence from ∃y∃z C0(~v, y, z) we obtain ∃y∃z C(~v, 0, y, z), and also from ∀y∃y′∃z′ C1(~v, x, y, y′, z′) we can derive

∃y∃zC(~v, x, y, z)→ ∃y∃zC(~v, x+ 1, y, z).

By the assumed provable recursiveness of f0 and f1, we therefore can prove outright ∃y F (~v, 0, y) and ∃y F (~v, x, y) → ∃y F (~v, x + 1, y). Then Σ1 induction allows us to derive ∃y F (~v, x, y) immediately.

To show that condition 3 holds we argue informally in I∆0(exp). Assume C(~v, x, y, z) and C(~v, x, y′, z′). Then z and z′ are sequence numbers of the same length x + 2. Furthermore we have C0(~v, π2((z)0), (z)0) and C0(~v, π2((z′)0), (z′)0), so by the assumed uniqueness condition for C0 we have (z)0 = (z′)0. Similarly we have ∀i<x C1(~v, i, π2((z)i), π2((z)i+1), (z)i+1), and the same with z replaced by z′. So if (z)i = (z′)i we can deduce (z)i+1 = (z′)i+1 using the assumed uniqueness condition for C1. Therefore by ∆0(exp) induction we obtain ∀i≤x ((z)i = (z′)i). The final conjuncts in C give (z)x+1 = π2((z)x) = y and the same with z replaced by z′ and y replaced by y′. But since (z)x = (z′)x this means y = y′ and, since all their components are equal, z = z′. Hence we have F (~v, x, y) ∧ F (~v, x, y′) → y = y′. This completes the proof.
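The computation sequences z of the primitive recursion case can be made concrete. The following Python sketch (names and the use of lists in place of sequence numbers are our own) evaluates a primitive recursion while building the trace whose components mirror (z)0, . . . , (z)x+1:

```python
def compute_with_trace(f0, f1, v, x):
    """Evaluate f(v, x) where f(v, 0) = f0(v) and f(v, n+1) = f1(v, n, f(v, n)),
    returning (value, trace).  trace[i] is the value f(v, i) for i <= x, and the
    output is repeated as the last component, mirroring (z)_{x+1} = y."""
    trace = [f0(v)]                       # the computation of f(v, 0)
    for i in range(x):
        trace.append(f1(v, i, trace[-1])) # f(v, i+1) from f(v, i)
    value = trace[-1]
    return value, trace + [value]         # final value-component appended

# factorial as a primitive recursion: f(_, 0) = 1, f(_, n+1) = (n+1) * f(_, n)
val, z = compute_with_trace(lambda v: 1,
                            lambda v, n, prev: (n + 1) * prev,
                            None, 5)
print(val)   # 120
print(z)     # [1, 1, 2, 6, 24, 120, 120]
```

The length of z is x + 2, and its last two components both carry the output, exactly as the conjuncts (z)x+1 = y and π2((z)x) = y require.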

Page 131: Schwichtenberg & Wainer- Proofs and Computations

4.1. PRIMITIVE RECURSION AND IΣ1 121

Definition. A closed Σ1-formula ∃~z B(~z ), with B ∈ ∆0(exp), is said to be “true at m”, and we write m |= ∃~z B(~z ), if there are numbers ~m = m1, . . . , ml, all less than m, such that B(~m ) is true (in the standard model). A finite set Γ of closed Σ1-formulas is “true at m”, written m |= Γ, if at least one of them is true at m.

If Γ(x1, . . . , xk) is a finite set of Σ1 formulas all of whose free variables occur among x1, . . . , xk, and if f : Nk → N, then we write f |= Γ to mean that for all numerical assignments ~n = n1, . . . , nk to the variables ~x = x1, . . . , xk we have f(~n ) |= Γ(~n ).
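For a decidable matrix B, “true at m” is just a bounded search for witnesses below m. A Python sketch (all names here are ours, for illustration):

```python
from itertools import product

def true_at(m, B, arity):
    """m |= exists z1...zl B(z1,...,zl): some witnesses, all below m, make B true."""
    return any(B(*ws) for ws in product(range(m), repeat=arity))

# example: exists z1 exists z2 (z1 + z2 = 7 and z1 * z2 = 12); witnesses 3, 4
B = lambda z1, z2: z1 + z2 == 7 and z1 * z2 == 12
print(true_at(4, B, 2))   # False: both witnesses must lie strictly below m
print(true_at(5, B, 2))   # True: 3, 4 < 5
```

Persistence is visible here: once `true_at(m, B, arity)` holds, it holds for every larger m, since the witness range only grows.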

Note (Persistence). For sets Γ of closed Σ1-formulas, if m |= Γ and m < m′ then m′ |= Γ. Similarly for sets Γ(~x ) of Σ1-formulas with free variables, if f |= Γ and f(~n ) ≤ f′(~n ) for all ~n ∈ Nk then f′ |= Γ.

Lemma (Σ1-induction). If Γ(~x ) is a finite set of Σ1-formulas (whose disjunction is) provable in IΣ1 then there is a primitive recursive function f , strictly increasing in each of its variables, such that f |= Γ.

Proof. It is convenient to use a Tait-style formalization of IΣ1, just like the one used for I∆0(exp) in the last chapter, except that the induction rule

Γ, A(0)        Γ, ¬A(y), A(y + 1)
─────────────────────────────────
            Γ, A(t)

with y not free in Γ and t any term, now applies to any Σ1-formula A.

Note that if Γ is provable in this system then it has a proof in which all the non-atomic cut formulas are induction formulas (in this case Σ1). For if Γ is classically derivable from non-logical axioms A1, . . . , As then there is a cut-free proof in (Tait-style) logic of ¬A1, ∆, Γ where ∆ = ¬A2, . . . , ¬As. Then if A1 is an induction axiom on a formula F we have a cut-free proof in logic of

F (0) ∧ ∀y(¬F (y) ∨ F (y + 1)) ∧ ¬F (t), ∆, Γ

and hence, by inversion, cut-free proofs of F (0), ∆, Γ and ¬F (y), F (y + 1), ∆, Γ and ¬F (t), ∆, Γ. From the first two of these we obtain F (t), ∆, Γ by the induction rule above, and then from the third we obtain ∆, Γ by a cut on the formula F (t). Similarly we can detach ¬A2, . . . , ¬As in turn, to yield finally a proof of Γ which only uses cuts on (Σ1) induction formulas or on atoms arising from other non-logical axioms. Such proofs are said to be “free-cut” free.

Choosing such a proof for Γ(~x ), we proceed by induction on its height, showing at each new proof-step how to define the required primitive recursive function f satisfying f |= Γ.

If Γ(~x ) is an axiom then for all ~n, Γ(~n ) contains a true atom. Therefore f |= Γ for any f , so choose f(~n ) = n1 + · · · + nk in order to make it strictly increasing.

If Γ, B0 ∨ B1 arises by an application of the ∨-rule from Γ, B0, B1 then (because of our definition of Σ1-formula) B0 and B1 must both be ∆0(exp) formulas. Thus by our definition of “true at”, any function f satisfying f |= Γ, B0, B1 must also satisfy f |= Γ, B0 ∨ B1.

Only a slightly more complicated argument applies to the dual case where Γ, B0 ∧ B1 arises by an application of the ∧-rule from the premises


Γ, B0 and Γ, B1. For if f0(~n ) |= Γ(~n ), B0(~n ) and f1(~n ) |= Γ(~n ), B1(~n ) for all ~n, then it is easy to see (by persistence) that f |= Γ, B0 ∧ B1 where f(~n ) = f0(~n ) + f1(~n ).

If Γ, ∀yB(y) arises from Γ, B(y) by the ∀-rule (y not free in Γ) then since all formulas are Σ1, ∀yB(y) must be ∆0(exp) and so B(y) must be of the form y ≮ t ∨ B′(y) for some (elementary, or even primitive recursive) term t. Now assume that f0 |= Γ, y ≮ t ∨ B′(y) for some increasing primitive recursive function f0. Then for all assignments ~n to the free variables ~x, and all assignments k to the variable y,

f0(~n, k) |= Γ(~n ), k ≮ t(~n ), B′(~n, k).

Therefore by defining f(~n ) = Σk<g(~n ) f0(~n, k), where g is some increasing elementary (or primitive recursive) function bounding the values of term t, we easily see that either f(~n ) |= Γ(~n ) or else B′(~n, k) is true for every k < t(~n ). Hence f |= Γ, ∀yB(y) as required, and clearly f is primitive recursive.

Now suppose Γ, ∃yA(y) arises from Γ, A(t) by the ∃-rule, where A is Σ1. Then by the induction hypothesis there is a primitive recursive f0 such that for all ~n,

f0(~n ) |= Γ(~n ), A(t(~n ), ~n ).

Then either f0(~n ) |= Γ(~n ) or else f0(~n ) bounds true witnesses for all the existential quantifiers already in A(t(~n ), ~n ). Therefore by again choosing an elementary bounding function g for the term t, and defining f(~n ) = f0(~n ) + g(~n ), we see that either f(~n ) |= Γ(~n ) or f(~n ) |= ∃yA(y, ~n ) for all ~n.

If Γ comes about by the cut rule with Σ1 cut formula C ≡ ∃~zB(~z ) then the two premises are Γ, ∀~z ¬B(~z ) and Γ, ∃~z B(~z ). The universal quantifiers in the first premise can be inverted (without increasing proof-height) to give Γ, ¬B(~z ), and since B is ∆0(exp) the induction hypothesis can be applied to this to give a primitive recursive f0 such that for all numerical assignments ~n to the (implicit) variables ~x and all assignments ~m to the new free variables ~z,

f0(~n, ~m) |= Γ(~n ), ¬B(~n, ~m).

Applying the induction hypothesis to the second premise gives a primitive recursive f1 such that for all ~n, either f1(~n ) |= Γ(~n ) or else there are fixed witnesses ~m < f1(~n ) such that B(~n, ~m ) is true. Therefore if we define f by substitution from f0 and f1 thus:

f(~n ) = f0(~n, f1(~n ), . . . , f1(~n ))

then f will be primitive recursive, greater than or equal to f1, and strictly increasing since both f0 and f1 are. Furthermore f |= Γ. For otherwise there would be a tuple ~n such that Γ(~n ) is not true at f(~n ) and hence, by persistence, not true at f1(~n ). So B(~n, ~m ) is true for certain numbers ~m < f1(~n ). But then f0(~n, ~m ) < f(~n ) and so, again by persistence, Γ(~n ) cannot be true at f0(~n, ~m ). This means B(~n, ~m ) is false, by the above, and so we have a contradiction.

Finally suppose Γ(~x ), A(~x, t) arises by an application of the induction rule on the Σ1 induction formula A(~x, y) ≡ ∃~z B(~x, y, ~z ). The premises


are Γ(~x ), A(~x, 0) and Γ(~x ), ¬A(~x, y), A(~x, y + 1). By inverting the universal quantifiers over ~z in ¬A(~x, y), the second premise becomes Γ(~x ), ¬B(~x, y, ~z ), A(~x, y + 1), which is now a set of Σ1 formulas, and the height of its proof is not increased. Thus we can apply the induction hypothesis to each of the premises to obtain increasing primitive recursive functions f0 and f1 such that for all ~n, all k and all ~m,

f0(~n ) |= Γ(~n ), A(~n, 0),

f1(~n, k, ~m ) |= Γ(~n ), ¬B(~n, k, ~m ), A(~n, k + 1).

Now define f by primitive recursion from f0 and f1 as follows:

f(~n, 0) = f0(~n ) and f(~n, k + 1) = f1(~n, k, f(~n, k), . . . , f(~n, k)).

Then for all ~n and all k, f(~n, k) |= Γ(~n ), A(~n, k). This is shown by induction on k. The base case is immediate by the definition of f(~n, 0). The induction step is much like the cut case above. Assume that f(~n, k) |= Γ(~n ), A(~n, k). If Γ(~n ) is not true at f(~n, k + 1) then by persistence it is not true at f(~n, k), and so f(~n, k) |= A(~n, k). Therefore there are numbers ~m < f(~n, k) such that B(~n, k, ~m ) is true. Hence f1(~n, k, ~m ) |= Γ(~n ), A(~n, k + 1), and since f1(~n, k, ~m ) ≤ f(~n, k + 1) we have, by persistence, f(~n, k + 1) |= Γ(~n ), A(~n, k + 1) as required.

It only remains to substitute, for the final argument k in f , an increasing elementary (or primitive recursive) function g which bounds the values of term t, so that with f′(~n ) = f(~n, g(~n )) we have f(~n, t(~n )) |= Γ(~n ), A(~n, t(~n )) for all ~n, and hence f′ |= Γ(~x ), A(~x, t) by persistence. This completes the proof.

Theorem. The provably recursive functions of IΣ1 are exactly the primitive recursive functions.

Proof. We have already shown that every primitive recursive function is provably recursive in IΣ1. For the converse, suppose g : Nk → N is Σ1-defined by the formula F (~x, y) ≡ ∃zC(~x, y, z) with C ∈ ∆0(exp), and IΣ1 ⊢ ∃y F (~x, y). Then by the lemma above, there is a primitive recursive function f such that for all ~n ∈ Nk,

f(~n ) |= ∃y∃z C(~n, y, z).

This means that for every ~n there is an m < f(~n ) and a k < f(~n ) such that C(~n,m, k) is true, and that this (unique) m must be the value of g(~n ). We can therefore define g primitive recursively from f as follows

g(~n ) = ( µm<h(~n ) C(~n, (m)0, (m)1) )0, where h(~n ) = 〈f(~n ), f(~n )〉. This completes the proof.
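This final bounded minimization can be illustrated with a toy instance. In the Python sketch below, C, f and the trivial “computation codes” are our own stand-ins for the objects in the proof, not the book's; the search runs directly over value-witness pairs below the bound instead of over coded pairs:

```python
def extract(f, C, n_tuple):
    """Recover g(n) as the first value y < f(n) admitting a witness z < f(n)
    with C(n, y, z) -- a bounded mu-search, as in the proof's definition of g."""
    bound = f(*n_tuple)
    for y in range(bound):
        for z in range(bound):
            if C(n_tuple, y, z):
                return y
    return None  # cannot happen when f genuinely bounds value and witness

# toy instance: C(n, y, z) says "z codes a computation of y = n0 + n1";
# here we take z = y itself as the (trivial) computation code, and
# f(n0, n1) = n0 + n1 + 1 as a primitive recursive bound.
C = lambda n, y, z: y == n[0] + n[1] and z == y
f = lambda n0, n1: n0 + n1 + 1
print(extract(f, C, (3, 4)))   # 7
```

The point of the construction survives the simplification: g is obtained from f by bounded search alone, so g is primitive recursive whenever f is.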

4.2. ε0-Recursion in Peano Arithmetic

We now set about showing that the provably recursive functions of Peano Arithmetic are exactly the ε0-recursive functions, i.e., those definable from the primitive recursive functions by substitutions and (arbitrarily nested) recursions over “standard” well orderings of the natural numbers with order-types less than the ordinal

ε0 = sup{ω, ωω, ωωω, . . .}.


As preliminaries, we must first develop some of the basic theory of these ordinals, and their standard codings as well-orderings on N. Then we define the hierarchies of fast-growing bounding functions naturally associated with them. These will provide an important complexity characterization through which we can more easily obtain the main result.

4.2.1. Ordinals below ε0. Throughout the rest of this chapter, α, β, γ, δ, . . . will denote ordinals less than ε0. Every such ordinal is either 0 or can be represented uniquely in so-called Cantor Normal Form thus:

α = ω^γ1 · c1 + ω^γ2 · c2 + · · · + ω^γk · ck

where γk < · · · < γ2 < γ1 < α and the coefficients c1, c2, . . . , ck are arbitrary positive integers. If γk = 0 then α is a successor ordinal, written Succ(α), and its immediate predecessor α − 1 has the same representation but with ck reduced to ck − 1. Otherwise α is a limit ordinal, written Lim(α), and it has infinitely many possible "fundamental sequences", i.e., increasing sequences of smaller ordinals whose supremum is α. However we shall pick out one particular fundamental sequence ⟨α(n)⟩ for each such limit ordinal α, as follows: first write α as δ + ω^γ where δ = ω^γ1 · c1 + · · · + ω^γk · (ck − 1) and γ = γk. Assume inductively that when γ is a limit, its fundamental sequence ⟨γ(n)⟩ has already been specified. Then define, for each n ∈ N,

α(n) =
    δ + ω^(γ−1) · (n + 1)   if Succ(γ),
    δ + ω^(γ(n))            if Lim(γ).

Clearly ⟨α(n)⟩ is an increasing sequence of ordinals with supremum α.
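These fundamental sequences are easy to compute on a concrete representation. In the sketch below (our illustration, not the book's coding) an ordinal below ε0 is a Python list of (exponent, coefficient) pairs with strictly descending exponents, the exponents being ordinals of the same shape, and 0 being the empty list.

```python
# Ordinals below epsilon_0 in Cantor Normal Form:
# 0 is [];  the number n is [([], n)];  omega is [(ONE, 1)].

def is_succ(a):
    # alpha is a successor iff its last exponent gamma_k is 0.
    return a != [] and a[-1][0] == []

def pred(a):
    # Immediate predecessor of a successor: reduce c_k by one.
    (g, c), delta = a[-1], a[:-1]
    return delta if c == 1 else delta + [([], c - 1)]

def fund(a, n):
    # alpha(n) for a limit alpha, written as delta + omega^gamma.
    (g, c), delta = a[-1], a[:-1]
    if c > 1:
        delta = delta + [(g, c - 1)]
    if is_succ(g):
        return delta + [(pred(g), n + 1)]   # delta + omega^(gamma-1)*(n+1)
    return delta + [(fund(g, n), 1)]        # delta + omega^(gamma(n))

ONE = [([], 1)]
OMEGA = [(ONE, 1)]
```

For example fund(OMEGA, n) is [([], n + 1)], i.e., the natural number n + 1, matching ω(n) = n + 1.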

Definition. With each α < ε0 and each natural number n, associate a finite set of ordinals α[n] as follows:

α[n] =
    ∅                      if α = 0,
    (α − 1)[n] ∪ {α − 1}   if Succ(α),
    α(n)[n]                if Lim(α).

Lemma. For each α = δ + ω^γ and all n,

α[n] = δ[n] ∪ { δ + ω^γ1 · c1 + · · · + ω^γk · ck | ∀i (γi ∈ γ[n] ∧ ci ≤ n) }.

Proof. By induction on γ. If γ = 0 then γ[n] is empty and so the right hand side is just δ[n] ∪ {δ}, which is the same as α[n] = (δ + 1)[n] according to the definition above.

If γ is a limit then γ[n] = γ(n)[n], so the set on the right hand side is the same as the one with γ(n)[n] instead of γ[n]. By the induction hypothesis applied to α(n) = δ + ω^(γ(n)), this set equals α(n)[n], which is just α[n] again by definition.

Now suppose γ is a successor. Then α is a limit and α[n] = α(n)[n] where α(n) = δ + ω^(γ−1) · (n + 1). This we can write as α(n) = α(n − 1) + ω^(γ−1) where, in case n = 0, α(−1) = δ. By the induction hypothesis for γ − 1, the set α[n] is therefore equal to

α(n−1)[n] ∪ { α(n−1) + ω^γ1 · c1 + · · · + ω^γk · ck | ∀i (γi ∈ (γ−1)[n] ∧ ci ≤ n) }

and similarly for each of α(n−1)[n], α(n−2)[n], . . . , α(1)[n]. Since for each m ≤ n, α(m − 1) = δ + ω^(γ−1) · m, this last set is the same as

δ[n] ∪ { δ + ω^(γ−1) · m + ω^γ1 · c1 + · · · + ω^γk · ck | m ≤ n ∧ ∀i (γi ∈ (γ−1)[n] ∧ ci ≤ n) }

and this is the set required because γ[n] = (γ−1)[n] ∪ {γ−1}. This completes the proof.

Corollary. (a) For every limit ordinal α < ε0 and every n, α(n) ∈ α[n + 1]. (b) If β ∈ γ[n] then ω^β ∈ ω^γ[n], provided n ≠ 0.

Definition. The maximum coefficient of β = ω^β1 · b1 + · · · + ω^βl · bl is defined inductively to be the maximum of all the bi and all the maximum coefficients of the exponents βi.

Lemma. If β < α and the maximum coefficient of β is ≤ n, then β ∈ α[n].

Proof. By induction on α. Let α = δ + ω^γ. If β < δ, then β ∈ δ[n] by the induction hypothesis and δ[n] ⊆ α[n] by the lemma. Otherwise β = δ + ω^β1 · b1 + · · · + ω^βk · bk with α > γ > β1 > · · · > βk and bi ≤ n. By the induction hypothesis βi ∈ γ[n]. Hence β ∈ α[n] by the lemma.

Definition. Let Gα(n) denote the cardinality of the finite set α[n]. Then immediately from the definition of α[n] we have

Gα(n) =
    0                 if α = 0,
    G_(α−1)(n) + 1    if Succ(α),
    G_(α(n))(n)       if Lim(α).

The hierarchy of functions Gα is called the “slow growing” hierarchy.

Lemma. If α = δ + ω^γ then for all n,

Gα(n) = Gδ(n) + (n + 1)^(Gγ(n)).

Therefore for each α < ε0, Gα(n) is the elementary function which results by substituting n + 1 for every occurrence of ω in the Cantor Normal Form of α.

Proof. By induction on γ. If γ = 0 then α = δ + 1, so Gα(n) = Gδ(n) + 1 = Gδ(n) + (n + 1)^0 as required. If γ is a successor then α is a limit and α(n) = δ + ω^(γ−1) · (n + 1), so by n + 1 applications of the induction hypothesis for γ − 1 we have Gα(n) = G_(α(n))(n) = Gδ(n) + (n + 1)^(G_(γ−1)(n)) · (n + 1) = Gδ(n) + (n + 1)^(Gγ(n)), since G_(γ−1)(n) + 1 = Gγ(n). Finally, if γ is a limit then α(n) = δ + ω^(γ(n)), so applying the induction hypothesis to γ(n) we have Gα(n) = G_(α(n))(n) = Gδ(n) + (n + 1)^(G_(γ(n))(n)), which immediately gives the desired result since G_(γ(n))(n) = Gγ(n) by definition.
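This lemma gives a one-line implementation of the slow-growing hierarchy. The sketch below is ours; it uses a nested-list Cantor Normal Form (a list of (exponent, coefficient) pairs, 0 being the empty list) and simply substitutes n + 1 for every ω.

```python
# Slow-growing hierarchy: G_alpha(n) results from the Cantor Normal Form
# of alpha by replacing every omega with n + 1.

def G(a, n):
    # a is a list of (exponent, coefficient) pairs; 0 is [].
    return sum(c * (n + 1) ** G(g, n) for (g, c) in a)

ONE = [([], 1)]
OMEGA = [(ONE, 1)]
```

So G(OMEGA, n) = n + 1, and G of ω^ω at n is (n + 1)^(n + 1), in line with the lemma.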

Definition (Coding ordinals). Encode each ordinal β = ω^β1 · b1 + ω^β2 · b2 + · · · + ω^βl · bl by the sequence number ⌜β⌝ constructed recursively as follows:

⌜β⌝ = ⟨⟨⌜β1⌝, b1⟩, ⟨⌜β2⌝, b2⟩, . . . , ⟨⌜βl⌝, bl⟩⟩.

The ordinal 0 is coded by the empty sequence number, which is 0. Note that ⌜β⌝ is numerically greater than the maximum coefficient of β, and greater than the codes ⌜βi⌝ of all its exponents, and their exponents etcetera.
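One concrete way to realize such a coding (our choice, not necessarily the book's pairing) uses Cantor pairing, building a sequence number by repeatedly tagging items onto the end; adding 1 after each pairing keeps every non-empty sequence number positive and numerically above all its components.

```python
# Illustrative coding of ordinals (nested-list Cantor Normal Form, 0 = [])
# as sequence numbers, with Cantor pairing as a stand-in.

def pair(x, y):
    return (x + y) * (x + y + 1) // 2 + y

def put(k, i):
    # Tag item i onto the end of sequence number k; the result exceeds
    # both k and i, so a code dominates its coefficients and exponent-codes.
    return pair(k, i) + 1

def code(b):
    # <<code(beta_1), b_1>, ..., <code(beta_l), b_l>>; empty sequence is 0.
    k = 0
    for (g, c) in b:
        k = put(k, put(put(0, code(g)), c))
    return k
```

With this choice code([]) == 0, and the monotonicity of pair yields the numerical dominance properties noted above.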

Lemma. (a) There is an elementary function h(m, n) such that, with m = ⌜β⌝,

h(⌜β⌝, n) =
    0          if β = 0,
    ⌜β − 1⌝    if Succ(β),
    ⌜β(n)⌝     if Lim(β).

(b) For each fixed α < ε0 there is an elementary well-ordering ≺α ⊂ N² such that for all b, c ∈ N, b ≺α c if and only if b = ⌜β⌝ and c = ⌜γ⌝ for some β < γ < α.

Proof. (a) Thinking of m as a code ⌜β⌝, define h(m, n) as follows. First set h(0, n) = 0. Then if m is a non-zero sequence number, see if its final (rightmost) component π2(m) is a pair ⟨m′, n′⟩. If so, and m′ = 0 but n′ ≠ 0, then β is a successor and the code of its predecessor, h(m, n), is then defined to be the new sequence number obtained by reducing n′ by one (or removing this final component altogether if n′ = 1). Otherwise if π2(m) = ⟨m′, n′⟩ where m′ and n′ are both positive, then β is a limit of the form δ + ω^γ · n′ where m′ = ⌜γ⌝. Now let k be the code of δ + ω^γ · (n′ − 1), obtained by reducing n′ by one inside m (or if n′ = 1, deleting the final component from m). Set k aside for the moment. At the "right-hand end" of β we have a spare ω^γ which, in order to produce β(n), must be reduced to ω^(γ−1) · (n + 1) if Succ(γ), or to ω^(γ(n)) if Lim(γ). Therefore the required code h(m, n) of β(n) will in this case be obtained by tagging onto the end of the sequence number k one extra pair coding this additional term. But if we assume inductively that h(m′, n) has already been defined for m′ < m, then this additional component must be either ⟨h(m′, n), n + 1⟩ if Succ(γ), or ⟨h(m′, n), 1⟩ if Lim(γ).

This defines h(m, n), once we agree to set its value to zero in all extraneous cases where m is not a sequence number of the right form. However the definition so far given is a primitive recursion (depending on previous values for smaller m's). To make it elementary we need to check that h(m, n) is also elementarily bounded, for then h is defined by "limited recursion" from elementary functions, and we know that the result will then be an elementary function. Now when m codes a successor then clearly h(m, n) < m. In the limit case, h(m, n) is obtained from the sequence number k (numerically smaller than m) by adding one new pair on the end. Recall that an extra item i is tagged onto the end of a sequence number k by the function π(k, i), which is quadratic in k and i. If the item added is the pair ⟨h(m′, n), n + 1⟩ where Succ(γ), then h(m′, n) < m and so h(m, n) is numerically bounded by some fixed polynomial in m and n. In the other case, however, all we can say immediately is that h(m, n) is numerically less than some fixed polynomial of m and h(m′, n). But since m′ codes an exponent in the Cantor Normal Form coded by m, this second polynomial cannot be iterated more than d times, where d is the "exponential height" of the normal form. Therefore h(m, n) is bounded by some d-times iterated polynomial of m + n. Since d < m it is therefore bounded by the elementary function 2^(2^(c(m+n))) for some constant c. Thus h(m, n) is defined by limited recursion, so it is elementary.

(b) Fix α < ε0 and let d be the exponential height of its Cantor Normal Form. We use the function h just defined in part (a), and note that if we only
apply it to codes for ordinals below α, they will all have exponential height ≤ d, and so with this restriction we can consider h as being bounded by some fixed polynomial of its two arguments. Define g(0, n) = ⌜α⌝ and g(i + 1, n) = h(g(i, n), n), and notice that g is therefore bounded by an i-times iterated polynomial, so g is defined by an elementarily limited recursion from h, and hence is itself elementary.

Now define b ≺α c if and only if c ≠ 0 and there are i and j such that 0 < i < j ≤ Gα(max(b, c) + 1) and g(i, max(b, c)) = c and g(j, max(b, c)) = b. Since the functions g and Gα are elementary, and since the quantifiers are bounded, the relation ≺α is elementary. Furthermore, by the properties of h it is clear that if i < j then g(i, max(b, c)) codes an ordinal greater than g(j, max(b, c)) (provided the first is not zero). Hence if b ≺α c then b = ⌜β⌝ and c = ⌜γ⌝ for some β < γ < α.

We must show the converse, so suppose b = ⌜β⌝ and c = ⌜γ⌝ where β < γ < α. Then since the code of an ordinal is greater than its maximum coefficient, we have β ∈ α[max(b, c)] and γ ∈ α[max(b, c)]. This means that the sequence starting with α, and at each stage descending from a δ to either δ − 1 if Succ(δ) or δ(max(b, c)) if Lim(δ), must pass through first γ and later β. In terms of codes it means that there are an i and a j such that 0 < i < j and g(i, max(b, c)) = c and g(j, max(b, c)) = b. Thus b ≺α c holds if we can show that j ≤ Gα(max(b, c) + 1). In the descending sequence just described, only the successor stages actually contribute an element δ − 1 to α[max(b, c)]. At the limit stages, δ(max(b, c)) does not get put in. However, although δ(n) does not belong to δ[n], it does belong to δ[n + 1]. Therefore all the ordinals in the descending sequence lie in α[max(b, c) + 1]. So j can be no bigger than the cardinality of this set, which is Gα(max(b, c) + 1). This completes the proof.

Thus the principles of transfinite induction and transfinite recursion over initial segments of the ordinals below ε0 can all be expressed in the language of elementary recursive arithmetic.

4.2.2. The fast growing hierarchy and ε0-recursion.

Definition. The "Hardy Hierarchy" ⟨Hα⟩α<ε0 is defined by recursion on α thus (cf. Hardy (1904)):

Hα(n) =
    n                 if α = 0,
    H_(α−1)(n + 1)    if Succ(α),
    H_(α(n))(n)       if Lim(α).

The "Fast Growing Hierarchy" ⟨Fα⟩α<ε0 is defined by recursion on α thus:

Fα(n) =
    n + 1               if α = 0,
    F_(α−1)^(n+1)(n)    if Succ(α),
    F_(α(n))(n)         if Lim(α),

where F_(α−1)^(n+1)(n) is the (n + 1)-times iterate of F_(α−1) on n.

Note. The Hα and Fα functions could equally well be defined purely number-theoretically, by working over the well-orderings ≺α instead of directly over the ordinals themselves. Thus they are ε0-recursive functions.
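Both hierarchies can be run directly, for very small ordinals and arguments, on a concrete representation; the sketch below and its representation (nested lists of (exponent, coefficient) pairs, 0 = []) are our illustrative choices, not the book's codings.

```python
# Hardy and fast-growing hierarchies on nested-list Cantor Normal Forms.

def is_succ(a):
    return a != [] and a[-1][0] == []

def pred(a):
    (g, c), delta = a[-1], a[:-1]
    return delta if c == 1 else delta + [([], c - 1)]

def fund(a, n):
    # alpha(n) for a limit alpha = delta + omega^gamma.
    (g, c), delta = a[-1], a[:-1]
    if c > 1:
        delta = delta + [(g, c - 1)]
    return delta + [(pred(g), n + 1)] if is_succ(g) else delta + [(fund(g, n), 1)]

def H(a, n):
    if a == []:
        return n
    return H(pred(a), n + 1) if is_succ(a) else H(fund(a, n), n)

def F(a, n):
    if a == []:
        return n + 1
    if is_succ(a):
        m = n
        for _ in range(n + 1):       # (n+1)-fold iterate of F_(alpha-1)
            m = F(pred(a), m)
        return m
    return F(fund(a, n), n)

ONE = [([], 1)]
TWO = [([], 2)]
OMEGA = [(ONE, 1)]
```

For instance F(ONE, n) = 2n + 1 and H(OMEGA, n) agrees with it, a small instance of the identity of H at ω^α with Fα proved in the next lemma.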

Page 138: Schwichtenberg & Wainer- Proofs and Computations

128 4. THE PROVABLY RECURSIVE FUNCTIONS OF ARITHMETIC

Lemma. For all α, β and all n,
(a) H_(α+β)(n) = Hα(Hβ(n)),
(b) H_(ω^α)(n) = Fα(n).

Proof. The first part is proven by induction on β, the unstated assumption being that the Cantor Normal Form of α + β is just the result of concatenating their two separate Cantor Normal Forms, so that (α + β)(n) = α + β(n). This of course requires that the leading exponent in the normal form of β is not greater than the final exponent in the normal form of α. We shall always make this assumption when writing α + β.

If β = 0 the equation holds trivially because H0 is the identity function. If Succ(β) then by the definition of the Hardy functions and the induction hypothesis for β − 1,

H_(α+β)(n) = H_(α+(β−1))(n + 1) = Hα(H_(β−1)(n + 1)) = Hα(Hβ(n)).

If Lim(β) then by the induction hypothesis for β(n),

H_(α+β)(n) = H_(α+β(n))(n) = Hα(H_(β(n))(n)) = Hα(Hβ(n)).

The second part is proved by induction on α. If α = 0 then H_(ω^0)(n) = H1(n) = n + 1 = F0(n). If Succ(α) then by the limit case of the definition of H, the induction hypothesis, and the first part above,

H_(ω^α)(n) = H_(ω^(α−1)·(n+1))(n) = H_(ω^(α−1))^(n+1)(n) = F_(α−1)^(n+1)(n) = Fα(n).

If Lim(α) then the equation follows immediately by the induction hypothesis for α(n). This completes the proof.

Lemma. For each α < ε0, Hα is strictly increasing and Hβ(n) < Hα(n) whenever β ∈ α[n]. The same holds for Fα, with the slight restriction that n ≠ 0, for when n = 0 we have Fα(0) = 1 for all α.

Proof. By induction on α. The case α = 0 is trivial since H0 is the identity function and 0[n] is empty. If Succ(α) then Hα is H_(α−1) composed with the successor function, so it is strictly increasing by the induction hypothesis. Furthermore if β ∈ α[n] then either β ∈ (α − 1)[n] or β = α − 1, so, again by the induction hypothesis, Hβ(n) ≤ H_(α−1)(n) < H_(α−1)(n + 1) = Hα(n). If Lim(α) then Hα(n) = H_(α(n))(n) < H_(α(n))(n + 1) by the induction hypothesis. But as noted previously, α(n) ∈ α[n + 1] = α(n + 1)[n + 1], so by applying the induction hypothesis to α(n + 1) we have H_(α(n))(n + 1) < H_(α(n+1))(n + 1) = Hα(n + 1). Thus Hα(n) < Hα(n + 1). Furthermore if β ∈ α[n] then β ∈ α(n)[n], so Hβ(n) < H_(α(n))(n) = Hα(n) straightaway by the induction hypothesis for α(n).

The same holds for Fα = H_(ω^α) provided we restrict to n ≠ 0, since if β ∈ α[n] we then have ω^β ∈ ω^α[n]. This completes the proof.

Lemma. If β ∈ α[n] then F_(β+1)(m) ≤ Fα(m) for all m ≥ n.

Proof. By induction on α, the zero case being trivial. If α is a successor then either β ∈ (α − 1)[n], in which case the result follows straight from the induction hypothesis, or β = α − 1, in which case it is immediate. If α is a limit then we have β ∈ α(n)[n] and hence, by the induction hypothesis, F_(β+1)(m) ≤ F_(α(n))(m). But F_(α(n))(m) ≤ Fα(m) either by definition of F in case m = n, or by the last lemma when m > n, since then α(n) ∈ α[m].

Definition (α-recursion).
(a) An α-recursion is a function-definition of the following form, defining f : N^(k+1) → N from given functions g0, g1, . . . , gs by two clauses (in the second, n ≠ 0):

f(0, ~m) = g0(~m)
f(n, ~m) = T(g1, . . . , gs, f≺n, n, ~m)

where T(g1, . . . , gs, f≺n, n, ~m) is a fixed term built up from the number-variables n, ~m by applications of the functions g1, . . . , gs and the function f≺n given by

f≺n(n′, ~m) =
    f(n′, ~m)    if n′ ≺α n,
    0            otherwise.

It is always assumed, when doing α-recursion, that α ≠ 0.
(b) An unnested α-recursion is one of the special form:

f(0, ~m) = g0(~m)

f(n, ~m) = g1(n, ~m, f(g2(n, ~m), . . . , gk+2(n, ~m)))

with just one recursive call on f, where g2(n, ~m) ≺α n for all n and all ~m.

(c) Let ε0(0) = ω and ε0(i + 1) = ω^(ε0(i)). Then for each fixed i, a function is said to be ε0(i)-recursive if it can be defined from primitive recursive functions by successive substitutions and α-recursions with α < ε0(i). It is unnested ε0(i)-recursive if all the α-recursions used in its definition are unnested. It is ε0-recursive if it is ε0(i)-recursive for some (any) i.

Note. The ε0(0)-recursive functions are just the primitive recursive ones, since if α < ω then α-recursion is just a finitely-iterated substitution. So the definition of ε0(0)-recursion simply amounts to the closure of the primitive recursive functions under substitution, which of course does not enlarge the primitive recursive class.

Lemma (Bounds for α-recursion). Suppose f is defined from g1, . . . , gs by an α-recursion:

f(0, ~m) = g0(~m)

f(n, ~m) = T (g1, . . . , gs, f≺n, n, ~m)

where for each i ≤ s, gi(~a) < Fβ(k + max ~a) for all numerical arguments ~a. (The β and k are arbitrary constants, but it is assumed that the last exponent in the Cantor Normal Form of β is ≥ the first exponent in the normal form of α, so that β + α is automatically in Cantor Normal Form.) Then there is a constant d such that for all n, ~m,

f(n, ~m) < F_(β+α)(k + 2d + max(n, ~m)).

Proof. The constant d will be the depth of nesting of the term T, where variables have depth of nesting 0 and each compositional term g(T1, . . . , Tl) has depth of nesting one greater than the maximum depth of nesting of the subterms Tj.

First suppose n lies in the field of the well-ordering ≺α. Then n = ⌜γ⌝ for some γ < α. We claim by induction on γ that

f(n, ~m) < F_(β+γ+1)(k + 2d + max(n, ~m)).

This holds immediately when n = 0, because g0(~m) < Fβ(k + max ~m) and Fβ is strictly increasing and bounded by F_(β+1). So suppose n ≠ 0 and assume the claim for all n′ = ⌜δ⌝ where δ < γ.

Let T′ be any subterm of T(g1, . . . , gs, f≺n, n, ~m) with depth of nesting d′, built up by application of one of the functions g1, . . . , gs or f≺n to subterms T1, . . . , Tl. Now assume (for a sub-induction on d′) that each of these Tj's has numerical value vj less than F_(β+γ)^(2(d′−1))(k + 2d + max(n, ~m)). If T′ is obtained by application of one of the functions gi then its numerical value will be

gi(v1, . . . , vl) < Fβ(k + F_(β+γ)^(2(d′−1))(k + 2d + max(n, ~m)))
                 < F_(β+γ)^(2d′)(k + 2d + max(n, ~m))

since if k < u then Fβ(k + u) < Fβ(2u) < Fβ^2(u) provided β ≠ 0. On the other hand, if T′ is obtained by application of the function f≺n, its value will be f(v1, . . . , vl) if v1 ≺α n, or 0 otherwise. Suppose v1 = ⌜δ⌝ where δ < γ. Then by the induction hypothesis,

f(v1, . . . , vl) < F_(β+δ+1)(k + 2d + max ~v) ≤ F_(β+γ)(k + 2d + max ~v)

because v1 is greater than the maximum coefficient of δ, so δ ∈ γ[v1], so β + δ ∈ (β + γ)[v1], and hence F_(β+δ+1) is bounded by F_(β+γ) on arguments ≥ v1. Therefore, inserting the assumed bounds for the vj, we have

f(v1, . . . , vl) < F_(β+γ)(k + 2d + F_(β+γ)^(2(d′−1))(k + 2d + max(n, ~m)))

and then by the same argument as before,

f(v1, . . . , vl) < F_(β+γ)^(2d′)(k + 2d + max(n, ~m)).

We have now shown that the value of every subterm of T with depth of nesting d′ is less than F_(β+γ)^(2d′)(k + 2d + max(n, ~m)). Applying this to T itself with depth of nesting d we thus obtain

f(n, ~m) < F_(β+γ)^(2d)(k + 2d + max(n, ~m)) < F_(β+γ+1)(k + 2d + max(n, ~m))

as required. This proves the claim.

To derive the result of the lemma is now easy. If n = ⌜γ⌝ lies in the field of ≺α then β + γ ∈ (β + α)[n] and so

f(n, ~m) < F_(β+γ+1)(k + 2d + max(n, ~m)) ≤ F_(β+α)(k + 2d + max(n, ~m)).

If n does not lie in the field of ≺α then the function f≺n is the constant zero function, and so in evaluating f(n, ~m) by the term T only applications of the gi-functions come into play. Therefore a much simpler version of the above argument gives the desired

f(n, ~m) < Fβ^(2d)(k + 2d + max(n, ~m)) < F_(β+α)(k + 2d + max(n, ~m))

since α ≠ 0. This completes the proof.

Theorem. For each i, a function is ε0(i)-recursive if and only if it is register-machine computable in a number of steps bounded by Fα for some α < ε0(i).

Proof. For the "if" part, recall that for every register-machine computable function g there is an elementary function U such that for all arguments ~m, if s(~m) bounds the number of steps needed to compute g(~m) then g(~m) = U(~m, s(~m)). Thus if g is computable in a number of steps bounded by Fα, this means that g can be defined from Fα by the substitution

g(~m) = U(~m,Fα(max ~m)).

Hence g will be ε0(i)-recursive if Fα is. We therefore need to show that if α < ε0(i) then Fα is ε0(i)-recursive. This is clearly true when i = 0, since then α is finite, and the finite levels of the F hierarchy are all primitive recursive, and therefore ε0(0)-recursive. Suppose then that i > 0, and that α = ω^γ1 · c1 + · · · + ω^γk · ck is less than ε0(i). Adding one to each exponent, and inserting a successor term at the end, produces the ordinal β = α′ + n where α′ is the limit ω^(γ1+1) · c1 + · · · + ω^(γk+1) · ck. Since i > 0 it is still the case that β < ε0(i). Obviously, from the code for α, here denoted a, we can elementarily compute the code for α′, denoted a′, and then b = π(a′, ⟨0, n⟩) will be the code for β. Conversely from such a b we can elementarily decode a′ and hence a, and also the n. Choosing a large enough δ < ε0(i) so that β < δ, we can now define a function f(b, m) by δ-recursion, with the property that when b is the code for β = α′ + n, then f(b, m) = Fα^n(m). To explicate matters we shall expose the components from which b is constructed by writing b = (a, n). Then the recursion defining f(b, m) = f((a, n), m) has the following form, using the elementary function h(a, n) defined earlier, which gives the code for α − 1 if Succ(α), or α(n) if Lim(α):

f((a, n), m) =
    m + n                          if a = 0 or n = 0,
    f((h(a, m), m + 1), m)         if Succ(a) and n = 1,
    f((h(a, m), 1), m)             if Lim(a) and n = 1,
    f((a, 1), f((a, n − 1), m))    if n > 1,
    0                              otherwise.

Clearly then, f is ε0(i)-recursive, and Fα(m) = f((a, 1), m) where a is the code for α, so Fα is ε0(i)-recursive for every α < ε0(i).
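On the nested-list representation of ordinals used earlier (an illustrative stand-in for the actual sequence-number codes, with h realized as "predecessor or fundamental-sequence element"), this recursion can be transcribed almost literally; the "otherwise" clause for ill-formed codes is unnecessary here because every list in the representation is well formed.

```python
# Sketch of the recursion computing f((a, n), m) = F_alpha^n(m), with
# nested-list Cantor Normal Forms standing in for ordinal codes.

def is_succ(a):
    return a != [] and a[-1][0] == []

def pred(a):
    (g, c), delta = a[-1], a[:-1]
    return delta if c == 1 else delta + [([], c - 1)]

def fund(a, n):
    (g, c), delta = a[-1], a[:-1]
    if c > 1:
        delta = delta + [(g, c - 1)]
    return delta + [(pred(g), n + 1)] if is_succ(g) else delta + [(fund(g, n), 1)]

def h(a, n):
    # "Code" of alpha - 1 if Succ(alpha), of alpha(n) if Lim(alpha).
    return pred(a) if is_succ(a) else fund(a, n)

def f(b, m):
    a, n = b
    if a == [] or n == 0:
        return m + n
    if n == 1:
        return f((h(a, m), m + 1), m) if is_succ(a) else f((h(a, m), 1), m)
    return f((a, 1), f((a, n - 1), m))

ONE = [([], 1)]
TWO = [([], 2)]
```

Then f((a, 1), m) computes Fα(m) when a represents α: f((ONE, 1), 3) = 7 = F1(3) and f((TWO, 1), 3) = 63 = F2(3).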

For the "only if" part note first that the number of steps needed to compute a compositional term g(T1, . . . , Tl) is the sum of the numbers of steps needed to compute all the subterms Tj, plus the number of steps needed to compute g(v1, . . . , vl) where vj is the value of Tj. Furthermore, in a register-machine computation, these values vj are bounded by the number of computation steps plus the maximum input. This means that we can compute a bound on the computation-steps for any such term, and we can do it elementarily from given bounds for the input data. Now suppose f(n, ~m) = T(g1, . . . , gs, f≺n, n, ~m) is any recursion-step of an α-recursion. Then if we are given bounding functions on the numbers of steps to compute each of the gi's, and we assume inductively that we already have a bound on the number of steps to compute f(n′, −) whenever n′ ≺α n, it follows that we can elementarily estimate a bound on the number of steps to compute f(n, ~m). In other words, for any function defined by an α-recursion from given functions ~g, a bounding function (on the number of steps needed to compute f) is also definable by α-recursion from given bounding functions
for the g's. Exactly the same thing holds for primitive recursions. But in the preceding lemma we showed that as we successively define functions by α-recursions, with α < ε0(i), their values are bounded by functions F_(β+α) where also β < ε0(i). But ε0(i) is closed under addition, so β + α < ε0(i). Hence every ε0(i)-recursive function is register-machine computable in a number of steps bounded by some Fγ where γ < ε0(i). This completes the proof.

The following reduction of nested to unnested recursion is due to Tait (1961); see also Fairtlough and Wainer (1992).

Corollary. For each i, a function is ε0(i)-recursive if and only if it is unnested ε0(i + 1)-recursive.

Proof. By the Theorem, every ε0(i)-recursive function is computable in "time" bounded by Fα = H_(ω^α) where α < ε0(i). It is therefore primitive recursively definable from H_(ω^α). But H_(ω^α) is defined by an unnested ω^α-recursion, and clearly ω^α < ε0(i + 1). Hence arbitrarily nested ε0(i)-recursions are reducible to unnested ε0(i + 1)-recursions.

Conversely, suppose f is defined from given functions g0, g1, . . . , gk+2 by an unnested α-recursion where α < ε0(i + 1):

f(0, ~m) = g0(~m)

f(n, ~m) = g1(n, ~m, f(g2(n, ~m), . . . , gk+2(n, ~m)))

with g2(n, ~m) ≺α n for all n and ~m. Then the number of recursion steps needed to compute f(n, ~m) is f′(n, ~m) where

f ′(0, ~m) = 0

f ′(n, ~m) = 1 + f ′(g2(n, ~m), . . . , gk+2(n, ~m))

and f is then primitive recursively definable from g2, . . . , gk+2 and any bound for f′. Now assume that the given functions gj are all primitive recursively definable from, and bounded by, Hβ where β < ε0(i + 1). Then a similar, but easier, argument to that used in proving the lemma above providing bounds for α-recursion shows that f′(n, ~m) is bounded by H_(β·γ) where n = ⌜γ⌝. This is simply because

H_(β·(γ+1))(x) = H_(β·γ+β)(x) = H_(β·γ)(Hβ(x)).

Therefore f is primitive recursively definable from Hβ and H_(β·α). Clearly, since β, α < ε0(i + 1) we may choose β = ω^(β′) and α = ω^(α′) for appropriate β′, α′ < ε0(i). Then Hβ = F_(β′) and H_(β·α) = F_(β′+α′) where of course β′ + α′ < ε0(i). Therefore f is ε0(i)-recursive.

4.2.3. Provable recursiveness of Hα and Fα. We now prove that for every α < ε0(i), with i > 0, the function Fα is provably recursive in the theory IΣi+1.

Since all of the machinery we have developed for coding ordinals below ε0 is elementary, we can safely assume that it is available to us in an appropriate conservative extension of I∆0(exp), and can in fact be defined (with all relevant properties proven) in I∆0(exp) itself. In particular we shall again make use of the function h such that, if a codes a successor ordinal α then h(a, n) codes α − 1, and if a codes a limit ordinal α then h(a, n) codes α(n).

Note that we can decide whether a codes a successor ordinal (Succ(a)) or a limit ordinal (Lim(a)) by asking whether h(a, 0) = h(a, 1) or not. It is easiest to develop first the provable recursiveness of the Hardy functions Hα, since they have a simpler, unnested recursive definition. The fast growing functions are then easily obtained by the equation Fα = H_(ω^α).

Definition. Let H(a, x, y, z) denote the following ∆0(exp) formula:

(z)_0 = ⟨0, y⟩ ∧ π2(z) = ⟨a, x⟩ ∧
∀i<lh(z) (lh((z)_i) = 2 ∧ (i > 0 → (z)_(i,0) > 0)) ∧
∀0<i<lh(z) (Succ((z)_(i,0)) → (z)_(i−1,0) = h((z)_(i,0), (z)_(i,1)) ∧ (z)_(i−1,1) = (z)_(i,1) + 1) ∧
∀0<i<lh(z) (Lim((z)_(i,0)) → (z)_(i−1,0) = h((z)_(i,0), (z)_(i,1)) ∧ (z)_(i−1,1) = (z)_(i,1)).

Lemma (Definability of Hα). Hα(n) = m if and only if ∃z H(α, n, m, z) is true. Furthermore, for each α < ε0 we can prove in IΣ1,

∃zH(α, x, y, z) ∧ ∃zH(α, x, y′, z)→ y = y′.

Proof. The meaning of the formula ∃z H(α, n, m, z) is that there is a finite sequence of pairs ⟨αi, ni⟩, beginning with ⟨0, m⟩ and ending with ⟨α, n⟩, such that at each i > 0, if Succ(αi) then α_(i−1) = αi − 1 and n_(i−1) = ni + 1, and if Lim(αi) then α_(i−1) = αi(ni) and n_(i−1) = ni. Thus by induction up along the sequence, and using the original definition of Hα, we easily see that for each i > 0, H_(αi)(ni) = m, and thus at the end, Hα(n) = m. Conversely, if Hα(n) = m then there must exist such a computation-sequence, and this proves the first part of the lemma.

For the second part notice that, by induction on the length of the computation-sequence s, we can prove, for each n, m, m′, s, s′, that

H(α, n,m, s)→ H(α, n,m′, s′)→ s = s′ ∧m = m′.

This proof can be formalized directly in I∆0(exp) to give

H(α, x, y, z)→ H(α, x, y′, z′)→ z = z′ ∧ y = y′

and hence

∃z H(α, x, y, z) → ∃z H(α, x, y′, z) → y = y′.
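The witnessing sequence z can be generated by simply running the Hardy recursion and recording the pairs ⟨αi, ni⟩. The sketch below is ours, again over nested-list Cantor Normal Forms rather than actual sequence numbers.

```python
# Build the computation sequence for H_alpha(n): a list of pairs
# (alpha_i, n_i) beginning with (0, m) and ending with (alpha, n),
# following the successor/limit clauses of the formula H(a, x, y, z).

def is_succ(a):
    return a != [] and a[-1][0] == []

def pred(a):
    (g, c), delta = a[-1], a[:-1]
    return delta if c == 1 else delta + [([], c - 1)]

def fund(a, n):
    (g, c), delta = a[-1], a[:-1]
    if c > 1:
        delta = delta + [(g, c - 1)]
    return delta + [(pred(g), n + 1)] if is_succ(g) else delta + [(fund(g, n), 1)]

def trace(a, n):
    seq = [(a, n)]
    while a != []:
        if is_succ(a):
            a, n = pred(a), n + 1   # alpha_(i-1) = alpha_i - 1, n_(i-1) = n_i + 1
        else:
            a = fund(a, n)          # alpha_(i-1) = alpha_i(n_i), n_i unchanged
        seq.append((a, n))
    seq.reverse()                   # now begins with (0, m), ends with (alpha, n)
    return seq

ONE = [([], 1)]
OMEGA = [(ONE, 1)]
```

The value Hα(n) is the second component of the first pair: trace(OMEGA, 5) begins with ([], 11), since H of ω at 5 is 11.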

Remark. Thus in order for Hα to be provably recursive it remains only to prove (in the required theory) ∃y∃z H(α, x, y, z).

Lemma. In I∆0(exp) we can prove

∃z H(ω^a, x, y, z) → ∃z H(ω^a · c, y, w, z) → ∃z H(ω^a · (c + 1), x, w, z)

where ω^a is the elementary term ⟨⟨a, 1⟩⟩ which constructs, from the code a of an ordinal α, the code for the ordinal ω^α, and b · 0 = 0, b · (z + 1) = b · z ⊕ b, with ⊕ the elementary function which computes the code of α + β from the codes of α and β.

Proof. By assumption we have sequences s, s′ satisfying H(ω^a, n, m, s) and H(ω^a · c, m, k, s′). Add ω^a · c (in the sense of ⊕) to the first component of each pair in s. Then the last pair in s′ and the first pair in s become identical. By concatenating the two – taking this double pair only once – construct an elementary term t(s, s′) satisfying H(ω^a · (c + 1), n, k, t). We can then prove

H(ω^a, x, y, z) → H(ω^a · c, y, w, z′) → H(ω^a · (c + 1), x, w, t)

in a conservative extension of I∆0(exp), and hence in I∆0(exp) derive

∃z H(ω^a, x, y, z) → ∃z H(ω^a · c, y, w, z) → ∃z H(ω^a · (c + 1), x, w, z).

Lemma. Let H(a) be the Π2 formula ∀x∃y∃z H(a, x, y, z). Then with Π2-induction we can prove the following:
(a) H(ω^0).
(b) Succ(a) → H(ω^(h(a,0))) → H(ω^a).
(c) Lim(a) → ∀x H(ω^(h(a,x))) → H(ω^a).

Proof. The term t0 = ⟨⟨0, x + 1⟩, ⟨1, x⟩⟩ witnesses H(ω^0, x, x + 1, t0) in I∆0(exp), so H(ω^0) is immediate.

With the aid of the lemma just proven we can derive

H(ω^(h(a,0))) → H(ω^(h(a,0)) · c) → H(ω^(h(a,0)) · (c + 1)).

Therefore by Π2-induction we obtain

H(ω^(h(a,0))) → H(ω^(h(a,0)) · (x + 1))

and then

H(ω^(h(a,0))) → ∃y∃z H(ω^(h(a,0)) · (x + 1), x, y, z).

But there is an elementary term t1 with the property

Succ(a) → H(ω^(h(a,0)) · (x + 1), x, y, z) → H(ω^a, x, y, t1)

since t1 only needs to tag onto the end of the sequence z the new pair ⟨ω^a, x⟩; thus t1 = π(z, ⟨ω^a, x⟩). Hence by the quantifier rules,

Succ(a) → H(ω^(h(a,0))) → H(ω^a).

The final case is now straightforward, since the term t1 just constructed also gives

Lim(a) → H(ω^(h(a,x)), x, y, z) → H(ω^a, x, y, t1)

and so by quantifier rules again,

Lim(a) → ∀x H(ω^(h(a,x))) → H(ω^a).

Definition (Structural Transfinite Induction). The structural progressiveness of a formula A(a) is expressed by SProg_a A, which is the conjunction of the formulas A(0), ∀a(Succ(a) → A(h(a, 0)) → A(a)), and ∀a(Lim(a) → ∀x A(h(a, x)) → A(a)). The principle of structural transfinite induction up to an ordinal α is then the following axiom-schema, for all formulas A:

SProg_a A → ∀a ≺ α A(a)

where a ≺ α means a lies in the field of the well-ordering ≺α, in other words a = 0 ∨ 0 ≺α a.

Note. The last lemma shows that the Π2 formula H(ω^a) is structurally progressive, and that this is provable with Π2-induction.

We now make use of a famous result of Gentzen (1936), which says that transfinite induction is provable in arithmetic up to any α < ε0. For later use we prove this fact in a slightly more general form, where one can recur to all points strictly below the present one, and need not refer to distinguished fundamental sequences.

Definition (Transfinite Induction). The (general) progressiveness of a formula A(a) is

Prog_a A := ∀a(∀b≺a A(b) → A(a)).

The principle of transfinite induction up to an ordinal α is the schema

Prog_a A → ∀a ≺ α A(a)

where again a ≺ α means a lies in the field of the well-ordering ≺α.

Lemma. Structural transfinite induction up to α is derivable from transfinite induction up to α.

Proof. Let A be an arbitrary formula and assume SProg_a A; we must show ∀a ≺ α A(a). Using transfinite induction for the formula a ≺ α → A(a) it suffices to prove

∀a(∀b≺a; b≺α A(b) → a ≺ α → A(a))

which is equivalent to

∀a≺α (∀b≺a A(b) → A(a)).

This is easily proved from SProg_a A, using the properties of the h function, and distinguishing the cases a = 0, Succ(a) and Lim(a).

Remark. Induction over an arbitrary well-founded set is an easy consequence. Comparisons are made by means of a "measure function" µ, into an initial segment of the ordinals. The principle of "general induction" up to an ordinal α is

Progµ_x A(x) → ∀x; µx≺α A(x)

where Progµ_x A(x) expresses "µ-progressiveness" w.r.t. the measure function µ and the ordering ≺ := ≺α:

Progµ_x A(x) := ∀a(∀y; µy≺a A(y) → ∀x; µx=a A(x)).

We claim that general induction up to an ordinal α is provable from transfinite induction up to α.

Proof. Assume Progµ_x A(x); we must show ∀x; µx≺α A(x). Consider

B(a) := ∀x; µx=a A(x).

It suffices to prove ∀a≺α B(a), which is ∀a≺α ∀x; µx=a A(x). By transfinite induction it suffices to prove Prog_a B, which is

∀a(∀b≺a ∀y; µy=b A(y) → ∀x; µx=a A(x)).

But this follows from the assumption Progµ_x A(x), since ∀b≺a ∀y; µy=b A(y) implies ∀y; µy≺a A(y).

We now come to Gentzen's theorem. In the proof we will need some properties of ≺ which can all be proved in I∆0(exp): irreflexivity and transitivity for ≺, and also – following Schütte –

(4.1)  a ≺ 0 → A,
(4.2)  c ≺ b ⊕ ω^0 → (c ≺ b → A) → (c = b → A) → A,
(4.3)  a ⊕ 0 = a,
(4.4)  a ⊕ (b ⊕ c) = (a ⊕ b) ⊕ c,
(4.5)  0 ⊕ a = a,
(4.6)  ω^a · 0 = 0,
(4.7)  ω^a · (x + 1) = ω^a · x ⊕ ω^a,
(4.8)  a ≠ 0 → c ≺ b ⊕ ω^a → c ≺ b ⊕ ω^(e(a,b,c)) · m(a, b, c),
(4.9)  a ≠ 0 → c ≺ b ⊕ ω^a → e(a, b, c) ≺ a,

where e and m denote the appropriate function constants and A is any formula. (The reader should check that e, m can be taken to be elementary.)

Theorem (Gentzen (1936)). For every Π2 formula F and each i > 0 we can prove in IΣi+1 the principle of transfinite induction up to α for all α < ε0(i).

Proof. Starting with any Πj formula A(a), we construct the formula

A+(a) := ∀b(∀c≺b A(c) → ∀c ≺ b⊕ω^a A(c))

where, as mentioned above, ⊕ is the elementary addition function on ordinal-codes, thus: ⌜α⌝ ⊕ ⌜γ⌝ = ⌜α + γ⌝. Note that since A is Πj, A+ is a Πj+1 formula. The crucial point is that

IΣj ⊢ Prog_a A(a) → Prog_a A+(a).

So assume Prog_a A(a), that is, ∀a(∀b≺a A(b) → A(a)), and

(4.10) ∀b≺aA+(b)

We have to show A+(a). So assume further

(4.11) ∀c≺bA(c)

and c ≺ b ⊕ ω^a. We have to show A(c).

If a = 0, then c ≺ b ⊕ ω^0. By (4.2) it suffices to derive A(c) from c ≺ b as well as from c = b. If c ≺ b, then A(c) follows from (4.11), and if c = b, then A(c) follows from (4.11) and Prog_a A.

If a ≠ 0, from c ≺ b ⊕ ω^a we obtain c ≺ b ⊕ ω^{e(a,b,c)} · m(a,b,c) by (4.8) and e(a,b,c) ≺ a by (4.9). From (4.10) we obtain A+(e(a,b,c)). By the definition of A+(x) we get

∀u≺b⊕ω^{e(a,b,c)}·x A(u) → ∀u≺(b⊕ω^{e(a,b,c)}·x)⊕ω^{e(a,b,c)} A(u)

and hence, using (4.4) and (4.7)

∀u≺b⊕ω^{e(a,b,c)}·x A(u) → ∀u≺b⊕ω^{e(a,b,c)}·(x+1) A(u).

Also from (4.11) and (4.6), (4.3) we obtain

∀u≺b⊕ω^{e(a,b,c)}·0 A(u).

Using an appropriate instance of the induction schema we can conclude

∀u≺b⊕ω^{e(a,b,c)}·m(a,b,c) A(u)

and hence A(c).

Now fix i > 0 and (throughout the rest of this proof) let ≺ denote the well-ordering ≺ε0(i). Given any formula F(v) of level 2, define A(a) to be the formula ∀v≺a F(v). Then A is also of level 2, and furthermore it is easy to see that Prog_v F(v) → Prog_a A(a) is derivable in I∆0(exp). Therefore, by iterating the above procedure i times starting with j = 2, we obtain successively the formulas A+, A++, ..., A^(i), where A^(i) has level i + 2 and

Zi+1 ⊢ Prog_v F(v) → Prog_u A^(i)(u).

Now fix any α < ε0(i) and choose k so that α ≤ ε0(i)(k). By applying k + 1 times the progressiveness of A^(i)(u), one obtains A^(i)(k + 1) without need of any further induction, since k is fixed. Therefore

Zi+1 ⊢ Prog_v F(v) → A^(i)(k + 1).

But by instantiating the outermost universally quantified variable of A^(i) to zero we have A^(i)(k + 1) → A^(i−1)(ω^{k+1}). Again instantiating to zero the outermost universally quantified variable in A^(i−1), we similarly obtain A^(i−1)(ω^{k+1}) → A^(i−2)(ω^{ω^{k+1}}). Continuing in this way, and noting that ε0(i)(k) consists of an exponential stack of i ω's with k + 1 on the top, we finally get down (after i steps) to

Zi+1 ⊢ Prog_v F(v) → A(ε0(i)(k)).

Since A(ε0(i)(k)) is just ∀v≺ε0(i)(k) F(v), we have therefore proved, in Zi+1, transfinite induction for F up to ε0(i)(k), and hence up to the given α.

Theorem. For each i and every α < ε0(i), the fast growing function Fα is provably recursive in IΣi+1.

Proof. If i = 0 then α is finite and Fα is therefore primitive recursive, so it is provably recursive in IΣ1.

Now suppose i > 0. Since Fα = H_{ω^α}, we need only show, for every α < ε0(i), that H_{ω^α} is provably recursive in IΣi+1. But a lemma above shows that its defining Π2 formula H(ω^a) is provably progressive in IΣ2, and therefore by Gentzen's result,

IΣi+1 ⊢ ∀a≺α H(ω^a).

One further application of progressiveness then gives

IΣi+1 ⊢ H(ω^α)

which, together with the definability of Hα proved above, completes the provable Σ1-definability of H_{ω^α} in IΣi+1.
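The identity Fα = H_{ω^α} used in this proof can be spot-checked for small finite α. The sketch below is an illustration (not from the text): it implements both hierarchies for ordinals below ω^ω, assuming the fundamental sequence assignment (γ + ω^e · c)(n) = γ + ω^e · (c − 1) + ω^{e−1} · (n + 1) underlying this chapter's [n]-notation.

```python
def H(a, n):
    """Hardy hierarchy H_a(n) for a < ω^ω.

    a is a Cantor normal form: a list of (exponent, coefficient) pairs
    with exponents strictly decreasing; [] denotes the ordinal 0.
    H_0(n) = n, H_{b+1}(n) = H_b(n+1), H_λ(n) = H_{λ(n)}(n).
    """
    if not a:
        return n
    e, c = a[-1]                      # least significant CNF term
    rest = a[:-1]
    if e == 0:                        # successor ordinal: peel off 1
        tail = [(0, c - 1)] if c > 1 else []
        return H(rest + tail, n + 1)
    # limit ordinal: (γ + ω^e·c)(n) = γ + ω^e·(c-1) + ω^(e-1)·(n+1)
    tail = [(e, c - 1)] if c > 1 else []
    return H(rest + tail + [(e - 1, n + 1)], n)

def F(k, n):
    """Fast-growing hierarchy at finite levels:
    F_0(n) = n + 1, F_{k+1}(n) = (n+1)-fold iterate of F_k at n."""
    if k == 0:
        return n + 1
    for _ in range(n + 1):
        n = F(k - 1, n)
    return n

# F_k(n) = H_{ω^k}(n) for small finite k:
assert all(H([(k, 1)], n) == F(k, n) for k in range(3) for n in range(4))
```

For instance F2(3) = H_{ω^2}(3) = 63; larger inputs quickly become infeasible, as the growth rates discussed in this chapter predict.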

Corollary. Any ε0(i)-recursive function is provably recursive in IΣi+1.

Proof. We have seen already that each ε0(i)-recursive function is register-machine computable in a number of steps bounded by some Fα with α < ε0(i). Consequently, each such function is primitive recursively, and in fact elementarily, definable from an Fα which itself is provably recursive in IΣi+1. But primitive recursions only need Σ1-inductions to prove them defined (see 4.1). Thus in IΣi+1 we can prove the Σ1-definability of all ε0(i)-recursive functions.


4.3. Ordinal Bounds for Provable Recursion in PA

For the converse of the above result we perform an ordinal analysis of PA-proofs in a system which allows higher levels of induction to be reduced, via cut elimination, to Σ1-inductions. The cost of such reductions is a successive exponential increase in the ordinals involved, but in the end, by a generalization of Parsons' theorem on primitive recursion, this enables us to read off fast growing bounding functions for provable recursion.

It would be naive to try to carry through cut elimination directly on PA-proofs, since the inductions would get in the way. Instead, following Schütte (1951), the trick is to unravel the inductions by means of the ω-rule: from the infinite sequence of premises {A(n) | n ∈ N} derive ∀xA(x). The disadvantage is that this embeds PA into a "semi-formal" system with an infinite rule, so proofs will now be well-founded trees with ordinals measuring their heights. The advantage is that this system admits cut elimination, and furthermore it bears a close relationship with the fast growing hierarchy, as we shall see.

4.3.1. The infinitary system n : N ⊢^α Γ. We shall inductively generate, according to the rules below, an infinitary system of (classical) one-sided sequents

n : N ⊢^α Γ

in Tait-style (i.e., with negation of compound formulas defined by de Morgan's laws) where:

(i) n : N is a new kind of atomic formula, declaring a bound on numerical "inputs" from which terms appearing in Γ are computed according to the N-rules and axioms.

(ii) Γ is any finite set of closed formulas, either of the form m : N, or else formulas in the language of arithmetic based on =, 0, S, P, +, −̇, ·, exp2, possibly with the addition of any number of further primitive-recursively-defined function symbols. Recall that Γ, A denotes the set Γ ∪ {A}, etc.

(iii) Ordinals α, β, γ < ε0 denote bounds on the heights of derivations, assigned in a carefully controlled way due originally to Buchholz. Essentially, the condition is that if a sequent with bound α is derived from a premise with bound β, then β ∈ α[n], where n is the declared input bound.

(iv) Any occurrence of a number n in a formula should of course be read as its corresponding numeral, but we need not introduce explicit notation for this since the intention will be clear in context.

The first axiom and rule are "computation rules" for N, and the rest are just formalised versions of the truth definition, with Cut added.

(N1): For arbitrary α,

n : N ⊢^α Γ, m : N   provided m ≤ n + 1.

(N2): For β, β′ ∈ α[n],

n : N ⊢^β n′ : N     n′ : N ⊢^β′ Γ
------------------------------------
n : N ⊢^α Γ

Page 149: Schwichtenberg & Wainer- Proofs and Computations

4.3. ORDINAL BOUNDS FOR PROVABLE RECURSION IN PA 139

(Ax): If Γ contains a true atom (i.e., an equation or inequation between closed terms) then for arbitrary α,

n : N ⊢^α Γ

(∨): For β ∈ α[n],

n : N ⊢^β Γ, A, B
------------------
n : N ⊢^α Γ, A ∨ B

(∧): For β, β′ ∈ α[n]

n : N ⊢^β Γ, A     n : N ⊢^β′ Γ, B
------------------------------------
n : N ⊢^α Γ, A ∧ B

(∃): For β, β′ ∈ α[n],

n : N ⊢^β m : N     n : N ⊢^β′ Γ, A(m)
----------------------------------------
n : N ⊢^α Γ, ∃xA(x)

(∀): Provided βi ∈ α[max(n, i)] for every i,

max(n, i) : N ⊢^{βi} Γ, A(i)   for every i ∈ N
------------------------------------------------
n : N ⊢^α Γ, ∀xA(x)

(Cut): For β, β′ ∈ α[n],

n : N ⊢^β Γ, C     n : N ⊢^β′ Γ, ¬C
-------------------------------------
n : N ⊢^α Γ

(C is called the “cut formula”).

Definition. The functions Bα are defined by the recursion:

B0(n) = n + 1,   B_{α+1}(n) = Bα(Bα(n)),   B_λ(n) = B_{λ(n)}(n)

where λ denotes any limit ordinal with assigned fundamental sequence λ(n).

Note. Since, at successor stages, Bα is just composed with itself once, an easy comparison with the fast growing Fα shows that Bα(n) ≤ Fα(n) for all n > 0. It is also easy to see that for each positive integer k, B_{ω·k}(n) is the 2^{n+1}-times iterate of B_{ω·(k−1)} on n. Thus another comparison with the definition of Fk shows that Fk(n) ≤ B_{ω·k}(n) for all n. Thus every primitive recursive function is bounded by a B_{ω·k} for some k. Furthermore, just as for Hα and Fα, Bα is strictly increasing, and Bβ(n) < Bα(n) whenever β ∈ α[n]. The next two lemmas show that these functions Bα are intimately related with the infinitary system we have just set up.
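For ordinals below ω², written α = ω·k + l, the recursion for Bα can be run directly. The sketch below is illustrative only, assuming the fundamental sequence (ω·k)(n) = ω·(k−1) + (n+1) implicit in the note above; it makes the closed forms B_l(n) = n + 2^l and B_ω(n) = n + 2^{n+1} easy to verify.

```python
def B(k, l, n):
    """B_{ω·k+l}(n) for ordinals below ω^2,
    assuming the fundamental sequence (ω·k)(n) = ω·(k-1) + (n+1)."""
    if k == 0 and l == 0:
        return n + 1                        # B_0(n) = n + 1
    if l > 0:
        return B(k, l - 1, B(k, l - 1, n))  # B_{α+1}(n) = B_α(B_α(n))
    return B(k - 1, n + 1, n)               # B_λ(n) = B_{λ(n)}(n)

# Below ω the hierarchy is just B_l(n) = n + 2^l:
assert B(0, 3, 2) == 2 + 2**3
# B_ω(n) = B_{n+1}(n) = n + 2^(n+1), the 2^(n+1)-fold iterate of B_0 on n:
assert B(1, 0, 2) == 2 + 2**3
assert B(1, 0, 3) == 3 + 2**4
```

Already B_{ω·2} grows far beyond what can be evaluated concretely, in line with the comparison to the primitive recursive functions made in the note.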

Lemma. m ≤ Bα(n) if and only if n : N ⊢^α m : N is derivable by the N1 and N2 rules only.

Proof. For the "if" part, note that the proviso on the axiom N1 is that m ≤ n + 1, and therefore m ≤ Bα(n) is automatic. Secondly, if n : N ⊢^α m : N arises by the N2 rule from premises n : N ⊢^β n′ : N and n′ : N ⊢^{β′} m : N, where β, β′ ∈ α[n], then, assuming inductively that m ≤ B_{β′}(n′) and n′ ≤ Bβ(n), we have m ≤ B_{β′}(Bβ(n)) and hence m ≤ Bα(n).

For the "only if" part proceed by induction on α, assuming m ≤ Bα(n). If α = 0 then m ≤ n + 1 and so n : N ⊢^α m : N by N1. If α = β + 1 then m ≤ Bβ(n′) where n′ = Bβ(n), so by the induction hypothesis, n : N ⊢^β n′ : N and n′ : N ⊢^β m : N. Hence n : N ⊢^α m : N by N2, since β ∈ α[n]. Finally, if α is a limit then m ≤ B_{α(n)}(n) and so n : N ⊢^{α(n)} m : N by the induction hypothesis. But since α[n] = α(n)[n], the ordinal bounds β on the premises of this last derivation also lie in α[n], which means that n : N ⊢^α m : N as required.

Definition. A sequent n : N ⊢^α Γ is said to be term controlled if every closed term occurring in Γ has numerical value bounded by Bα(n). An infinitary derivation is then term controlled if every one of its sequents is term controlled.

Note. For a derivation to be term controlled it is sufficient that each axiom is term controlled, since in any rule, the closed terms occurring in the conclusion must already occur in a premise (in the case of the ∀ rule, the premise i = 0). Thus if α is the ordinal bound on the conclusion, every such closed term is bounded by a Bβ(n) for some β ∈ α[n] and hence is bounded by Bα(n) as required.

Lemma (Bounding Lemma). Let Γ be a set of Σ1-formulas or atoms of the form m : N. If n : N ⊢^α Γ has a term controlled derivation in which all cut formulas are Σ1, then Γ is true at B_{α+1}(n). Here, the definition of "true at" is extended to include atoms m : N by saying that m : N is true at k if m < k.

Proof. By induction over α according to the generation of the sequent n : N ⊢^α Γ, which we shall denote by S.

(Axioms) If S is either a logical axiom or of the form N1, then Γ contains either a true atomic equation or inequation, or else an atom m : N where m < n + 2, so Γ is automatically true at B_{α+1}(n).

(N2) If S arises by the N2 rule from premises n : N ⊢^β n′ : N and n′ : N ⊢^{β′} Γ, where β, β′ ∈ α[n], then by the induction hypothesis, Γ is true at B_{β′+1}(n′), where n′ < B_{β+1}(n). Therefore by persistence, Γ is true at B_{β′+1}(B_{β+1}(n)), which is less than or equal to Bα(Bα(n)) = B_{α+1}(n). So by persistence again, Γ is true at B_{α+1}(n).

(∨, ∧) Because of our definition of Σ1-formulas, the ∨ and ∧ rules only apply to bounded (∆0(exp)) formulas, so the result is immediate in these cases (by persistence and the fact that the rules preserve truth).

(∀) Similarly, the only way in which the ∀ rule can be applied is in a bounded context, where Γ = Γ′, ∀x(x ≮ t ∨ A(x)), t is a closed term, and A(x) a bounded formula. Suppose then that S arises by the ∀ rule from premises max(n, i) : N ⊢^{βi} Γ′, i ≮ t ∨ A(i), where βi ∈ α[max(n, i)] for every i. Since the derivation is term controlled, we know that (the numerical value of) t is less than or equal to Bα(n). Therefore by the induction hypothesis and persistence again: for every i < t, the set Γ′, A(i) is true at B_{βi+1}(Bα(n)). But βi ∈ α[Bα(n)] and so B_{βi+1}(Bα(n)) ≤ Bα(Bα(n)) = B_{α+1}(n). Hence Γ is true at B_{α+1}(n), using persistence once more.

(∃) If Γ contains a Σ1-formula ∃xA(x) and S arises by the ∃ rule from premises n : N ⊢^β m : N and n : N ⊢^{β′} Γ, A(m), then by the induction hypothesis, Γ, A(m) is true at B_{β′+1}(n), where m < B_{β+1}(n). Therefore, by the definition of "true at", Γ is true at whichever is the greater of B_{β+1}(n) and B_{β′+1}(n). But since β, β′ ∈ α[n], both of these are less than B_{α+1}(n), so Γ is again true at B_{α+1}(n).

(Cut) Finally suppose S comes about by a cut on the Σ1 formula C ≡ ∃~x D(~x) with D bounded. Then the premises are n : N ⊢^β Γ, C and n : N ⊢^{β′} Γ, ¬C, with ordinal bounds β, β′ ∈ α[n] respectively. By the induction hypothesis applied to the first premise, we have numbers ~m < B_{β+1}(n) such that Γ, D(~m) is true at B_{β+1}(n). By inverting the universal quantifiers in ¬C ≡ ∀~x¬D(~x), the second premise gives max(n, ~m) : N ⊢^{β′} Γ, ¬D(~m). Then by the induction hypothesis (since Γ, ¬D(~m) is now a set of Σ1-formulas) we have Γ, ¬D(~m) true at B_{β′+1}(max(n, ~m)), which is less than B_{β′+1}(B_{β+1}(n)), which is less than or equal to B_{α+1}(n). Therefore (by persistence) Γ must be true at B_{α+1}(n), for otherwise both D(~m) and ¬D(~m) would be true, and this cannot be.

4.3.2. Embedding of PA. The Bounding Lemma above becomes applicable to PA if we can embed it into the infinitary system and then (as done in the next sub-section) reduce all the cuts to Σ1 form. This is standard proof-theoretic procedure. First comes a simple technical lemma which will be needed frequently.

Lemma (Weakening). If n : N ⊢^α Γ and n ≤ n′ and Γ ⊆ Γ′ and α[m] ⊆ α′[m] for every m ≥ n′, then n′ : N ⊢^{α′} Γ′. Furthermore, if the given derivation of n : N ⊢^α Γ is term controlled then so will be the derivation of n′ : N ⊢^{α′} Γ′, provided of course that all the closed terms occurring in Γ′ are bounded by B_{α′}(n′).

Proof. Proceed by induction on α. Note first that if n : N ⊢^α Γ is an axiom then Γ, and hence also Γ′, contains either a true atom or a declaration m : N where m ≤ n + 1. Thus n′ : N ⊢^{α′} Γ′ is an axiom also.

(N2) If n : N ⊢^α Γ arises by the N2 rule from premises n : N ⊢^β m : N and m : N ⊢^{β′} Γ, where β, β′ ∈ α[n], then, by applying the induction hypothesis to each of these, n can be increased to n′ in the first, and Γ can be increased to Γ′ in the second. But then since α[n] ⊆ α[n′] ⊆ α′[n′], the rule N2 can be re-applied to yield the desired n′ : N ⊢^{α′} Γ′.

(∃) If n : N ⊢^α Γ arises by the ∃ rule from premises n : N ⊢^β m : N and n : N ⊢^{β′} Γ, A(m), where ∃xA(x) ∈ Γ and β, β′ ∈ α[n], then, by applying the induction hypothesis to each premise, n can be increased to n′ and Γ increased to Γ′. The ∃ rule can then be re-applied to yield the desired n′ : N ⊢^{α′} Γ′, since as above, β, β′ ∈ α′[n′].

(∀) Suppose n : N ⊢^α Γ arises by the ∀ rule from premises

max(n, i) : N ⊢^{βi} Γ, A(i)

where ∀xA(x) ∈ Γ and βi ∈ α[max(n, i)] for every i. Then, by applying the induction hypothesis to each of these premises, n can be increased to n′ and Γ increased to Γ′. The ∀ rule can then be re-applied to yield the desired n′ : N ⊢^{α′} Γ′, since for each i, βi ∈ α[max(n′, i)] ⊆ α′[max(n′, i)].

The remaining rules ∨, ∧ and Cut are handled easily by increasing n to n′ and Γ to Γ′ in the premises, and then re-applying the rule.

Theorem (Embedding). Suppose PA ⊢ Γ(x1, . . . , xk), where x1, . . . , xk are all the free variables occurring in Γ. Then there is a fixed number d such that, for all numerical instantiations n1, n2, . . . , nk of the free variables, we have a term controlled derivation of

max(n1, n2, . . . , nk) : N ⊢^{ω·d} Γ(n1, n2, . . . , nk).

Furthermore, the (non-atomic) cut formulas occurring in this derivation are just the induction formulas which occur in the original PA proof.

Proof. We work with a Tait-style formalisation of PA in which the induction axioms are replaced by corresponding rules:

Γ, A(0)     Γ, ¬A(z), A(z + 1)
--------------------------------
Γ, A(t)

with z not free in Γ and t any term. As in the proof of the Σ1-induction lemma in 4.1, we may suppose that the given PA-proof of Γ(~x) has been reduced to "free-cut" free form, wherein the only non-atomic cut formulas are the induction formulas. We simply have to transform each step of this PA-proof into an appropriate, term controlled infinitary derivation.

(Axioms) If Γ(~x) is an axiom of PA then, with ~n = n1, n2, . . . , nk substituted for the variables ~x = x1, x2, . . . , xk, there must occur a true atom in Γ(~n). Thus we automatically have a derivation of max ~n : N ⊢^α Γ(~n) for arbitrary α. However we must choose α appropriately so that, for all ~n, this sequent is term controlled. To do this, simply note that, since PA only has primitive-recursively-defined function constants, every one of the (finitely many) terms t(~x) appearing in Γ(~x) is primitive recursive, and therefore there is a number d such that for all ~n, B_{ω·d}(max ~n) bounds the value of every such t(~n). So choose α = ω · d.

(∨, ∧, Cut) If Γ(~x) arises by a ∨, ∧ or cut rule from premises Γ0(~x) and Γ1(~x) then, inductively, we can assume that we already have infinitary derivations of max ~n : N ⊢^{ω·d0} Γ0(~n) and max ~n : N ⊢^{ω·d1} Γ1(~n), where d0 and d1 are independent of ~n. So choose d = max(d0, d1) + 1 and note that ω·d0 and ω·d1 both belong to ω·d[max ~n]. Then by re-applying the corresponding infinitary rule, we obtain max ~n : N ⊢^{ω·d} Γ(~n) as required, and this derivation will again be term controlled provided the premises were.

(∀) Suppose Γ(~x) arises by an application of the ∀ rule from the premise Γ0(~x), A(~x, z), where Γ = Γ0, ∀zA(~x, z). Assume that we already have a d0 such that for all ~n and all m, there is a term controlled derivation of max(~n, m) : N ⊢^{ω·d0} Γ0(~n), A(~n, m). Then with d = d0 + 1 we have ω·d0 ∈ ω·d[max(~n, m)], and so an application of the infinitary ∀ rule immediately gives max ~n : N ⊢^{ω·d} Γ(~n). This is also term controlled, because any closed term appearing in Γ(~n) must appear in Γ0(~n), A(~n, 0) and so is already bounded by B_{ω·d0}(max ~n).

(∃) Suppose Γ(~x) arises by an application of the ∃ rule from the premise Γ0(~x), A(~x, t(~x)), where Γ = Γ0, ∃zA(~x, z). If the witnessing term t contains any other variables besides x1, . . . , xk, we can assume they have been substituted by zero. Thus by the induction we have, for every ~n, a term controlled derivation of max ~n : N ⊢^{ω·d0} Γ0(~n), A(~n, t(~n)) for some fixed d0 independent of ~n. Now it is easy to see, by checking through the rules, that any occurrences of the term t(~n) may be replaced by (the numeral for) its value, say m. Furthermore, because the derivation is term controlled, m ≤ B_{ω·d0}(max ~n) and hence max ~n : N ⊢^{ω·d0} m : N. Therefore by the infinitary ∃ rule we immediately obtain max ~n : N ⊢^{ω·d} Γ0(~n), ∃zA(~n, z), where d = d0 + 1, and this derivation is again term controlled.

(Induction) Finally, suppose Γ(~x) = Γ0(~x), A(~x, t(~x)) arises by the induction rule from premises Γ0(~x), A(~x, 0) and Γ0(~x), ¬A(~x, z), A(~x, z + 1). Assume inductively that we have d0 and d1 and, for all ~n and all i, term controlled derivations of

max ~n : N ⊢^{ω·d0} Γ0(~n), A(~n, 0)

max(~n, i) : N ⊢^{ω·d1} Γ0(~n), ¬A(~n, i), A(~n, i + 1).

Now let d2 be any number ≥ max(d0, d1) and such that B_{ω·d2} bounds every subterm of t(~x) (again there is such a d2 because every subterm of t defines a primitive recursive function of its variables). Then for all ~n, if m is the numerical value of the term t(~n), we have a term controlled derivation of

max(~n, m) : N ⊢^{ω·(d2+1)} Γ0(~n), A(~n, m).

For, in the case m = 0 this follows immediately from the first premise above by weakening the ordinal bound; and if m > 0 then, by successive cuts on A(~n, i) for i = 0, 1, . . . , m − 1, with weakenings where necessary, we obtain first a term controlled derivation of

max(~n, m) : N ⊢^{ω·d2+m} Γ0(~n), A(~n, m)

and then, since m ∈ ω[max(~n, m)], another weakening provides the desired ordinal bound ω · (d2 + 1).

Since, by our choice of d2, max(~n, m) ≤ B_{ω·d2}(max ~n), we also have

max ~n : N ⊢^{ω·d2} max(~n, m) : N

and so, combining this with the sequent just derived, the N2 rule gives

max ~n : N ⊢^{ω·(d2+2)} Γ0(~n), A(~n, m).

It therefore only remains to replace the numeral m by the term t(~n), whose value it is. But it is easy to check, by induction over the logical structure of the formula A, that provided d2 is in addition chosen to be at least twice the height of the "formation tree" of A, then for all ~n there is a cut-free derivation of

max ~n : N ⊢^{ω·d2} Γ0(~n), ¬A(~n, m), A(~n, t(~n)).

Therefore, fixing d2 accordingly and setting d = d2 + 3, a final cut on the formula A(~n, m) yields the desired term controlled derivation, for all ~n, of

max ~n : N ⊢^{ω·d} Γ0(~n), A(~n, t(~n)).

This completes the induction case, and hence the proof, noting that the only non-atomic cuts introduced are on induction formulas.

4.3.3. Cut elimination. Once a PA proof is embedded in the infinitary system, we need to reduce the cut complexity before the Bounding Lemma becomes applicable. As we shall see, this entails an iterated exponential increase in the original ordinal bound. Thus ε0, the first exponentially-closed ordinal after ω, is a measure of the proof-theoretic complexity of PA.


Lemma (∀-Inversion). If n : N ⊢^α Γ, ∀aA(a), then for every m we have max(n, m) : N ⊢^α Γ, A(m). Furthermore, if the given derivation is term controlled, so is the resulting one.

Proof. We proceed by induction on α. Note first that if the sequent n : N ⊢^α Γ, ∀aA(a) is an axiom then so is n : N ⊢^α Γ, and then the desired result follows immediately by weakening.

Suppose n : N ⊢^α Γ, ∀aA(a) is the consequence of a ∀ rule with ∀aA(a) the "main formula" proven. Then the premises are, for each i,

max(n, i) : N ⊢^{βi} Γ, A(i), ∀aA(a)

where βi ∈ α[max(n, i)]. So by applying the induction hypothesis to the case i = m, one immediately obtains max(n, m) : N ⊢^{βm} Γ, A(m). Weakening then allows the ordinal bound βm to be increased to α.

In all other cases the formula ∀aA(a) is a "side formula" occurring in the premise(s) of the final rule applied. So by the induction hypothesis, ∀aA(a) can be replaced by A(m) and n by max(n, m). The result then follows by re-applying that final rule.

Note that each transformation preserves term control.

Definition. We insert a subscript “Σr” on the proof-gate thus:

n : N ⊢^α_{Σr} Γ

to signify that, in the infinitary derivation, all cut formulas are of the form Σi or Πi where i ≤ r.

Lemma (Cut Reduction). Let n : N ⊢^α_{Σr} Γ, C and n : N ⊢^γ_{Σr} Γ′, ¬C, where r ≥ 1 and C is a Σr+1 formula. Suppose also that α[n′] ⊆ γ[n′] for all n′ ≥ n. Then

n : N ⊢^{γ+α}_{Σr} Γ, Γ′.

Furthermore, if the given derivations are term controlled, so is the resultingone.

Proof. We proceed by induction on α according to the derivation of n : N ⊢^α_{Σr} Γ, C. If this is an axiom then C, being non-atomic, can be deleted and it's still an axiom, and so is n : N ⊢^{γ+α}_{Σr} Γ, Γ′. Furthermore this sequent is term controlled if the given ones are, since B_{γ+α}(n) is greater than or equal to Bγ(n) and Bα(n).

Now suppose C is the "main formula" proven in the final rule of the derivation. Since C ≡ ∃xD(x) with D a Πr formula, this final rule is an ∃ rule with premises n : N ⊢^{β0}_{Σr} m : N and n : N ⊢^{β1}_{Σr} Γ, D(m), C, where β0, β1 ∈ α[n] ⊆ γ[n]. By the induction hypothesis we then have

n : N ⊢^{γ+β1}_{Σr} Γ, D(m), Γ′. (∗)

Since ¬C ≡ ∀x¬D(x), we can apply ∀-inversion to the given derivation of n : N ⊢^γ_{Σr} Γ′, ¬C to obtain max(n, m) : N ⊢^γ_{Σr} Γ′, ¬D(m), as inversion does not affect the cut formulas. Hence by the N2 rule, using n : N ⊢^{β0}_{Σr} m : N and a weakening,

n : N ⊢^{γ+β1}_{Σr} Γ, ¬D(m), Γ′. (∗∗)


Then from (∗) and (∗∗) a cut on D(m) gives the desired result:

n : N ⊢^{γ+α}_{Σr} Γ, Γ′.

Notice, however, that (∗∗) requires β1 to be nonzero, so that γ ∈ γ + β1[n]. If, on the other hand, β1 = 0, then either n : N ⊢^{β1}_{Σr} Γ is an axiom or else D(m) is a true atom, in which case ¬D(m) may be deleted from max(n, m) : N ⊢^γ_{Σr} Γ′, ¬D(m) and then, by N2, n : N ⊢^{γ+α}_{Σr} Γ′. Whichever is the case, the desired result follows immediately by weakening.

Finally suppose otherwise, i.e., C is a "side formula" in the final rule of the derivation of n : N ⊢^α_{Σr} Γ, C. Then by applying the induction hypothesis to the premise(s), C gets replaced by Γ′ and the ordinal bounds β are replaced by γ + β. Re-application of that final rule then yields n : N ⊢^{γ+α}_{Σr} Γ, Γ′ as required.

It is clear, at each step, that the new derivations introduced are term controlled provided that the assumed ones are.

Theorem (Cut Elimination). If n : N ⊢^α_{Σr+1} Γ with r ≥ 1 and n ≥ 1, then

n : N ⊢^{ω^α}_{Σr} Γ.

Furthermore, if the given derivation is term controlled so is the resulting one.

Proof. Proceeding by induction on α, first suppose n : N ⊢^α_{Σr+1} Γ comes about by a cut on a Σr+1 or Πr+1 formula C. Then the premises are n : N ⊢^{β0}_{Σr+1} Γ, C and n : N ⊢^{β1}_{Σr+1} Γ, ¬C, where β0, β1 ∈ α[n]. By an appropriate weakening we may increase whichever is the smaller of β0, β1, so that both ordinal bounds become β = max(β0, β1). Applying the induction hypothesis we obtain

n : N ⊢^{ω^β}_{Σr} Γ, C   and   n : N ⊢^{ω^β}_{Σr} Γ, ¬C.

Then since one of C, ¬C is Σr+1, the above Cut Reduction Lemma with α = γ = ω^β yields

n : N ⊢^{ω^β·2}_{Σr} Γ.

But β ∈ α[n] and so ω^β·2[m] ⊆ ω^α[m] for every m ≥ n. Therefore by weakening, n : N ⊢^{ω^α}_{Σr} Γ.

Now suppose n : N ⊢^α_{Σr+1} Γ arises by any rule (or axiom) other than a cut on a Σr+1 or Πr+1 formula. First apply the induction hypothesis to the premises (if any), thus reducing r + 1 to r and increasing ordinal bounds β to ω^β, and then re-apply that final rule to obtain n : N ⊢^{ω^α}_{Σr} Γ, noting that if β ∈ α[n] then ω^β ∈ ω^α[n] provided n ≥ 1. Note again that the resulting derivation is term controlled if the original one is.

Theorem (Preliminary Cut Elimination). If n : N ⊢^{ω·d+c}_{Σr+1} Γ with r ≥ 1 and n ≥ 1, then

n : N ⊢^{ω^d·2^{c+1}}_{Σr} Γ

and this derivation is term controlled if the first one is.


Proof. This is just a special case of the main Cut Elimination Theorem above, where α < ω². Essentially the same steps are applied, but with a few extra technicalities.

Suppose n : N ⊢^{ω·d+c}_{Σr+1} Γ arises by a cut on a Σr+1 formula C. By weakening we may assume that the premises are n : N ⊢^β_{Σr+1} Γ, C and n : N ⊢^β_{Σr+1} Γ, ¬C, both with the same ordinal bound β ∈ ω·d + c[n]. Thus β = ω·k + l where either k = d and l < c, or k < d and l ≤ n. The induction hypothesis then gives n : N ⊢^γ_{Σr} Γ, C and n : N ⊢^γ_{Σr} Γ, ¬C, where γ = ω^k·2^{l+1}. The Cut Reduction Lemma then gives n : N ⊢^{γ·2}_{Σr} Γ. If k = d and l < c then γ·2[m] ⊆ ω^d·2^{c+1}[m] for all m ≥ n, and so the desired result follows immediately by weakening. On the other hand, if k < d and l ≤ n then, setting n′ = 2^{n+2}, we have γ·2[m] ⊆ ω^d[m] for all m ≥ n′. Thus again by weakening, n′ : N ⊢^{ω^d}_{Σr} Γ. But B_{ω+1}(n) ≥ n′, so n : N ⊢^{ω+1}_{Σr} n′ : N. Therefore by the N2 rule we again have

n : N ⊢^{ω^d·2^{c+1}}_{Σr} Γ

as required, since ω + 1 and ω^d both belong to ω^d·2^{c+1}[n] when n ≥ 1.

If n : N ⊢^{ω·d+c}_{Σr+1}

Γ comes about by the ∀-rule, then the premises are, for each i, max(n, i) : N ⊢^{βi}_{Σr+1} Γ, A(i), where βi = ω·k + l with either k = d and l < c, or k < d and l ≤ max(n, i). Applying the induction hypothesis to each one gives max(n, i) : N ⊢^{γi}_{Σr} Γ, A(i), where γi = ω^k·2^{l+1}. If k = d and l < c then γi ∈ ω^d·2^{c+1}[max(n, i)]. If k < d and l ≤ max(n, i) then with n′ = 2^{n+1} and i′ = 2^{i+1} we obtain first, by weakening, max(n′, i′) : N ⊢^{ω^d}_{Σr} Γ, A(i), and second, max(n, i) : N ⊢^ω_{Σr} max(n′, i′) : N, because B_ω(max(n, i)) ≥ max(n′, i′). Therefore by N2, max(n, i) : N ⊢^{ω^d+1}_{Σr} Γ, A(i). Thus for each i we have, in either case, an ordinal δi ∈ ω^d·2^{c+1}[max(n, i)] such that

max(n, i) : N ⊢^{δi}_{Σr} Γ, A(i).

The desired result then follows by re-applying the ∀-rule.

Finally suppose n : N ⊢^{ω·d+c}_{Σr+1} Γ arises by any other rule or axiom. Then the premises (if any) are of the form n : N ⊢^β_{Σr+1} Γ′ or, in the case of N2 (with a weakening to make the ordinal bounds the same), m : N ⊢^β_{Σr+1} Γ and n : N ⊢^β_{Σr+1} m : N. In each case β ∈ ω·d + c[n] and so β = ω·k + l, where either k = d and l < c, or k < d and l ≤ n. The induction hypothesis then transforms each such premise, reducing r + 1 to r and increasing β to ω^k·2^{l+1}. If k = d and l < c then ω^k·2^{l+1} belongs to ω^d·2^{c+1}[n]. If k < d and l ≤ n then, just as before, we can use N2 and weakening to increase the bound ω^k·2^{l+1} to ω^d + 1, which again belongs to ω^d·2^{c+1}[n] since n is assumed to be ≥ 1. Thus whichever is the case, each premise of the rule applied now has r instead of r + 1 and an ordinal bound γ belonging to ω^d·2^{c+1}[n]. Re-application of that final rule (or axiom) then immediately gives

n : N ⊢^{ω^d·2^{c+1}}_{Σr} Γ

as required, and each step preserves term control.


4.3.4. The classification theorem.

Theorem. For each i the following are equivalent:
(a) f is provably recursive in IΣi+1;
(b) f is elementarily definable from Fα = H_{ω^α} for some α < ε0(i);
(c) f is computable in Fα-bounded time, for some α < ε0(i);
(d) f is ε0(i)-recursive.

Proof. The Theorem in 4.2.2 characterizing the ε0(i)-recursive functions gives the equivalence of (c) and (d), and its proof also shows their equivalence with (b). The implication from (d) to (a) was a corollary in 4.2.3. It therefore only remains to prove that (a) implies (b).

Suppose that f : N^k → N is provably recursive in IΣi+1. Then there is a Σ1-formula F(~x, y) such that for all ~n and m, f(~n) = m if and only if F(~n, m) is true, and such that

IΣi+1 ⊢ ∃yF(~x, y).

In the case i = 0 we have already proved that f is primitive recursive and hence ε0(0)-recursive, so henceforth assume i > 0. By the Embedding Theorem there is a fixed number d and, for all instantiations ~n of the variables ~x, a term controlled derivation of

max ~n : N ⊢^{ω·d}_{Σi+1} ∃yF(~n, y).

Let n = max ~n if max ~n > 0, and n = 1 if max ~n = 0. Then by the Preliminary Cut Elimination Theorem with c = 0,

n : N ⊢^{ω^d·2}_{Σi} ∃yF(~n, y)

and by weakening, since ω^d·2[m] ⊆ ω^{d+1}[m] for all m ≥ n,

n : N ⊢^{ω^{d+1}}_{Σi} ∃yF(~n, y).

Now, if i > 1, apply the ordinary Cut Elimination Theorem i − 1 times, bringing the cuts down to the Σ1 level and simultaneously increasing the ordinal bound ω^{d+1} by i − 1 iterated exponentiations to the base ω. This produces

n : N ⊢^α_{Σ1} ∃yF(~n, y)

with ordinal bound α < ε0(i) (recalling that, as defined earlier, ε0(i) consists of an exponential stack of i + 1 ω's). Since this last derivation is still term controlled, we can next apply the Bounding Lemma to conclude that ∃yF(~n, y) is true at B_{α+1}(n), which is less than or equal to F_{α+1}(n). This means that for all ~n, F_{α+1}(n) bounds the value m of f(~n) and bounds witnesses for all the existential quantifiers in the prefix of the Σ1 defining formula F(~n, m). Thus, relative to F_{α+1}, the defining formula is bounded and therefore elementarily decidable, and f can be defined from it by a bounded least-number operator. That is, f is elementarily definable from F_{α+1}.

Corollary. Every function provably recursive in IΣi+1 is bounded by an Fα = H_{ω^α} for some α < ε0(i). Hence H_{ε0(i+1)} is not provably recursive in IΣi+1, for otherwise it would dominate itself.


4.4. Independence Results for PA

If the Hardy hierarchy is extended to ε0 itself by the definition

H_{ε0}(n) = H_{ε0(n)}(n)

then clearly (by what we have already done) the provable recursiveness of H_{ε0} is a consequence of transfinite induction up to ε0. However this function is obviously not provably recursive in PA, for if it were we would have an α < ε0 such that H_{ε0}(n) ≤ Hα(n) for all n, contradicting the fact that α ∈ ε0[m] for some m and hence Hα(m) < H_{ε0}(m). Thus, although transfinite induction up to any fixed ordinal below ε0 is provable in PA, transfinite induction all the way up to ε0 itself is not. This is Gentzen's result, that ε0 is the least upper bound of the "provable ordinals" of PA. Together with the Gödel incompleteness phenomena, it forms the basis of all logical independence results for PA and related theories. The question that remained until the later 1970's was whether there might be other independence results of a more natural and clear mathematical character, i.e., genuine mathematical statements formalizable in the language of arithmetic which, though true, are not provable in PA. A variety of such results have emerged since the first, and most famous, one of Paris and Harrington (1977), which is treated below. But we shall begin with a much simpler one due also to Paris and his student Kirby. The methods we employ (due respectively to Cichon (1983) and Ketonen and Solovay (1981)) are, however, quite different to their original non-standard model-theoretic ones, and in each case there emerges a deep connection with the Hardy hierarchy.

4.4.1. Goodstein sequences. Choose any two positive numbers a and x, and write a in base-(x+ 1) normal form thus:

a = (x+ 1)a1 ·m1 + (x+ 1)a2 ·m2 + · · ·+ (x+ 1)ak ·mk

where 1 ≤ m1,m2, . . . ,mk ≤ x and a1 > a2 > · · · > ak. Then write each exponent ai in base-(x+1) normal form, and each of their exponents, et cetera, until all exponents are ≤ x. The expression finally obtained is called the complete base-(x+ 1) form of a.

Definition. Let g(a, x) be the number which results by first writing a − 1 in complete base-(x + 1) form, and then increasing the base from (x+ 1) to (x+ 2), leaving all the coefficients mi ≤ x fixed.

Definition. The Goodstein sequence on (a, x) is then the sequence of numbers {ai}i≥x generated by iteration of the operation g thus: ax = a and ax+j+1 = g(ax+j , x+ j).

For example, the Goodstein sequence on (16, 1) begins a1 = 16, a2 = 112, a3 = 1284, a4 = 18753, a5 = 326594, etc.
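Goodstein sequences are easy to compute. The following sketch (our illustration, not the book's; the helper name `bump` is ours) rewrites a number in complete base-b form while replacing the base b by b + 1, and reproduces the example values above.

```python
def bump(a, b):
    # Write a in complete base-b form and replace every occurrence of the
    # base b by b + 1; coefficients (digits < b) are left fixed, and the
    # exponents are bumped recursively, as in the complete base form.
    if a == 0:
        return 0
    total, e = 0, 0
    while a > 0:
        a, c = divmod(a, b)
        if c:
            total += c * (b + 1) ** bump(e, b)
        e += 1
    return total

def goodstein(a, x, steps):
    # First `steps` terms of the Goodstein sequence on (a, x):
    # a_x = a and a_{x+j+1} = g(a_{x+j}, x+j), where g(a, x) = bump(a-1, x+1).
    seq = [a]
    for _ in range(steps - 1):
        a = bump(a - 1, x + 1)
        x += 1
        seq.append(a)
    return seq

print(goodstein(16, 1, 5))  # [16, 112, 1284, 18753, 326594]
```

The sequence on (3, 1) already terminates after a few steps: it runs 3, 3, 2, 1, 0.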

Definition. Given a number a written in complete base-(x+ 1) form, let ord(a, x) be the ordinal in Cantor normal form obtained by replacing the base (x+ 1) throughout by ω.

Definition. For α > 0 define the x-predecessor of α to be Px(α) = the maximum element of α[x].


Lemma. ord(a− 1, x) = Px(ord(a, x)).

Proof. The proof is by induction on a. If a = 1 then ord(a − 1, x) = 0 = Px(1). Suppose then, that a > 1, and let the complete base-(x+1) form of a be

a = (x+ 1)a1 ·m1 + (x+ 1)a2 ·m2 + · · ·+ (x+ 1)ak ·mk.

If ak = 0 then ord(a, x) is a successor and ord(a − 1, x) = ord(a, x) − 1 = Px(ord(a, x)). If ak > 0 let

b = (x+ 1)a1 ·m1 + (x+ 1)a2 ·m2 + · · ·+ (x+ 1)ak · (mk − 1).

Then in complete base-(x+ 1) we have

a− 1 = b+ (x+ 1)ak−1 · x+ (x+ 1)ak−2 · x+ · · ·+ (x+ 1)0 · x.

Let α = ord(a, x), β = ord(b, x) and αk = ord(ak, x). Then α = β + ωαk and by the induction hypothesis we have

ord(a− 1, x) = β + ωPx(αk) · x+ ωPx²(αk) · x+ ωPx³(αk) · x+ · · ·+ x

where Px(αk), Px²(αk), Px³(αk), . . . , 0 are all the elements of αk[x] in descending order. Therefore ord(a− 1, x) is the maximum element of (β + ωαk)[x]. But this set is just α[x], so the proof is complete.

Lemma. Let {ai}i≥x be the Goodstein sequence on (a, x). Then for each j > 0,

ord(ax+j , x+ j) = Px+j−1Px+j−2 · · ·Px+1Px(ord(a, x)).

Proof. By induction on j. The basis j = 1 follows immediately from the last lemma since, by the definitions, ord(ax+1, x+ 1) = ord(ax − 1, x) = ord(a− 1, x). Similarly for the step from j to j + 1:

ord(ax+j+1, x+ j + 1) = ord(ax+j − 1, x+ j) = Px+j(ord(ax+j , x+ j))

and the result then follows immediately by the induction hypothesis.

Since the ordinals associated with the stages of a Goodstein sequence decrease, it follows that every Goodstein sequence must eventually terminate at 0. This was established by Goodstein himself many years ago. However the following result, due to Cichon (1983), brings out a surprisingly close connection with the Hardy hierarchy.

Theorem. Every Goodstein sequence terminates. Let {ai}i≥x be the Goodstein sequence on (a, x). Then there is an m such that am = 0. Furthermore the least such m is given by m = Hord(a,x)(x).

Proof. Since ord(ax+j+1, x+ j+ 1) = Px+j(ord(ax+j , x+ j)) it follows straightaway by well-foundedness that there must be a first stage k at which ord(ax+k, x + k) = 0 and hence ax+k = 0. Letting m = x + k we therefore have, by the last lemma,

m = µy>x (Py−1Py−2 · · ·Px+1Px(ord(a, x)) = 0).

But it is very easy to check by induction on α > 0 that for all x,

Hα(x) = µy>x (Py−1Py−2 · · ·Px+1Px(α) = 0)

since Px(1) = 0, Px(α + 1) = α and Px(α) = Px(α(x)). Hence m = Hord(a,x)(x).
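The identity Hα(x) = µy>x (Py−1 · · ·Px(α) = 0) can be checked mechanically for small ordinals. The sketch below (an illustration, not the book's code) codes an ordinal below ε0 as the nested tuple of its Cantor-normal-form exponents, and assumes the usual fundamental sequences (β + ωγ+1)(x) = β + ωγ · x and (β + ωλ)(x) = β + ωλ(x); the identity holds whichever fundamental sequences are chosen, since both sides unfold the same chain of predecessors.

```python
# An ordinal below epsilon_0 is the tuple of its CNF exponents in
# non-increasing order (coefficients as repetitions); () codes 0.
ZERO, ONE = (), ((),)
OMEGA = (ONE,)                 # omega = omega^1
OMEGA_SQ = (ONE + ONE,)        # omega^2 (exponent 2 = 1 + 1)
OMEGA_OMEGA = (OMEGA,)         # omega^omega

def is_succ(a):
    return bool(a) and a[-1] == ZERO

def fund(a, x):
    # assumed fundamental sequence a(x) of a limit ordinal a
    head, e = a[:-1], a[-1]            # e: the smallest exponent, e != 0
    if is_succ(e):                     # ... + w^(e'+1) goes to ... + w^e' * x
        return head + (e[:-1],) * x
    return head + (fund(e, x),)        # ... + w^(limit): recurse into e

def P(a, x):
    # x-predecessor: P_x(a + 1) = a and P_x(lam) = P_x(lam(x))
    return a[:-1] if is_succ(a) else P(fund(a, x), x)

def H(a, x):
    # Hardy: H_0(x) = x, H_(a+1)(x) = H_a(x+1), H_lam(x) = H_(lam(x))(x)
    while a:
        if is_succ(a):
            a, x = a[:-1], x + 1
        else:
            a = fund(a, x)
    return x

def mu(a, x):
    # the least y > x such that P_(y-1) P_(y-2) ... P_x(a) = 0
    y = x
    while a:
        a, y = P(a, y), y + 1
    return y

for a in (ONE, OMEGA, OMEGA + ONE, OMEGA_SQ):
    for x in (1, 2, 3):
        assert H(a, x) == mu(a, x)
assert H(OMEGA_OMEGA, 2) == mu(OMEGA_OMEGA, 2) == 8
```

With these (assumed) sequences, for instance, H of ω at 3 is 6 and H of ω^ω at 2 is already 8; the values grow explosively as the ordinal or the argument increases.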


The theorem of Kirby and Paris (1982) now follows immediately:

Corollary. The statement "every Goodstein sequence terminates" is formalizable in the language of PA and, though true, is not provable in PA.

Proof. The Goodstein sequence {ai}i≥x on (a, x) is generated by iteration of the function g, which is clearly primitive recursive. Therefore ai is a primitive recursive function of a, x, i, and hence there is a Σ1-formula which (provably) defines it in PA. Thus the fact that every Goodstein sequence terminates, i.e., ∀a>0∀x>0∃y>x(ay = 0), is expressible in PA. It cannot be proved in PA however, for otherwise the function Hord(a,x)(x) = the least y > x such that ay = 0 would be provably recursive. But this is impossible because, by substituting for a the primitive recursive function e(x) consisting of an iterated exponential stack of (x + 1)'s with stack-height (x + 1), one obtains ord(e(x), x) = ε0(x). Hence Hε0(x) = Hord(e(x),x)(x) would be provably recursive also; a contradiction.

4.4.2. The Modified Finite Ramsey theorem. Ramsey's Theorem for infinite sets (1930) says that for every positive integer n, each finite partitioning (or "colouring") of the n-element subsets of an infinite set X has an infinite homogeneous (or "monochromatic") subset Y ⊂ X, meaning all n-element subsets of Y have the same colour (lie in the same partition). Ramsey also proved a version for finite sets: the Finite Ramsey Theorem states that given any positive integers n, k, l with n < k, there is an m so large that every partitioning of the n-element subsets of m = {0, 1, . . . ,m − 1} into l (disjoint) classes has a homogeneous subset Y ⊂ m with cardinality at least k. This is usually written:

∀n,k,l ∃m (m→ (k)nl )

where, letting m[n] denote the collection of all n-element subsets of m, m→ (k)nl means that for every function (colouring) c : m[n] → l there is a subset Y ⊂ m of cardinality at least k which is homogeneous for c, i.e., c is constant on the n-element subsets of Y .
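For very small parameters the arrow relation can be decided by brute force over all colourings. The sketch below (illustrative; the helper names are ours) confirms the classical fact that 6 → (3)22 holds while 5 → (3)22 fails, i.e. that the Ramsey number R(3, 3) is 6.

```python
from itertools import combinations, product

def arrow(m, k, n, l):
    # decide m -> (k)^n_l by checking every colouring c : m^[n] -> l
    subs = list(combinations(range(m), n))
    index = {s: i for i, s in enumerate(subs)}
    # for each candidate homogeneous set Y, the indices of its n-subsets
    ksets = [[index[s] for s in combinations(Y, n)]
             for Y in combinations(range(m), k)]
    for colouring in product(range(l), repeat=len(subs)):
        if not any(len({colouring[i] for i in Y}) == 1 for Y in ksets):
            return False    # this colouring has no homogeneous k-set
    return True

print(arrow(6, 3, 2, 2), arrow(5, 3, 2, 2))  # True False
```

The exhaustive search over l^|m[n]| colourings makes the superexponential character of the problem plain even at these tiny parameters.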

Whereas, by Jockusch (1972), the Infinite Ramsey Theorem (with n varying) is not arithmetically expressible (even when restricted to recursive partitions), the Finite Ramsey Theorem clearly is. For by standard coding, the relation m → (k)nl is easily seen to be elementary recursive and so expressible as a ∆0(exp)-formula. The statement therefore asserts the existence of a recursive function which computes the least such m from n, k, l. This function is known to have superexponential growth-rate, so it is primitive recursive but not elementary. Thus the Finite Ramsey Theorem is independent of I∆0(exp) but provable in IΣ1.

The Modified Finite Ramsey Theorem of Paris and Harrington (1977) is also expressible as a Π02-formula, but it is now independent of Peano Arithmetic. Their modification is to replace the requirement that the finite homogeneous set Y has cardinality at least k, by the requirement that Y is "large" in the sense that its cardinality is at least as big as its smallest element, i.e., |Y | ≥ minY . (Thus {5, 7, 8, 9, 10} is large but {6, 7, 80, 900, 1010} is not.) We can now (if we wish, and it's simpler to do so) dispense with


the parameter k and state the modified version as:

∀n,l ∃m (m→ (large)nl )

where m → (large)nl means that every colouring c : m[n] → l has a large homogeneous set Y ⊂ m, it being assumed always that Y must have at least n+ 1 elements in order to avoid the trivial case Y = m = n.

That the Modified Finite Ramsey Theorem is indeed true follows easily from the Infinite Ramsey Theorem. For assume, toward a contradiction, that it is false. Then there are fixed n and l such that for every m there is a colouring cm : m[n] → l with no large homogeneous set. Define a "diagonal" colouring on all (n+ 1)-element subsets of N by:

d({x0, x1, . . . , xn−1, xn}) = cxn({x0, x1, . . . , xn−1})

where x0, x1, . . . , xn−1, xn are written in increasing order. Then by the Infinite Ramsey Theorem, d has an infinite homogeneous set Y ⊂ N. We can therefore select from Y an increasing sequence y0, y1, . . . , yy0 with y0 ≥ n + 1. Now let m = yy0 and choose Y0 = {y0, y1, . . . , yy0−1}. Then Y0 is a large subset of m and is homogeneous for cm, since cm({x0, . . . , xn−1}) = d({x0, . . . , xn−1,m}) is constant on all {x0, . . . , xn−1} ∈ Y0[n]. This is the desired contradiction.

Paris and Harrington’s original proof that

PA ⊬ ∀n∀l ∃m (m→ (large)nl )

is essentially model-theoretic. Later, Ketonen and Solovay (1981) gave a refined, purely combinatorial analysis of the rate of growth of the Paris-Harrington function

PH(n, l) = µm (m→ (large)nl )

showing that for sufficiently large n,

Fε0(n− 3) ≤ PH(n, 8) ≤ Fε0(n− 2).

The lower bound immediately gives the independence result, since it says that PH(n, 8) eventually dominates every provably recursive function of PA. The basic ingredients of the Ketonen-Solovay method for the lower bound are set out concisely in the book "Ramsey Theory" by Graham, Rothschild and Spencer (1990), where a somewhat weaker result is presented. However it is not difficult to adapt their treatment so as to obtain a fairly short proof that, for a suitable elementary function l(n),

Hε0(n) ≤ PH(n+ 1, l(n)).

Though it does not give the refined bounds of Ketonen-Solovay, this is enough for the independence result.

The proof has two parts. First, define certain colourings on finite sets of ordinals below ε0, for which we can prove that all of their homogeneous sets must be "relatively small". Then, as in the foregoing result on Goodstein sequences, use the Hardy functions to associate numbers x between n and Hε0(n) with ordinals PxPx−1 . . . Pn(ε0). By this correspondence one obtains colourings on (n + 1)-element subsets of Hε0(n) which have no large homogeneous sets. Hence PH must grow at least as fast as Hε0 .


Definition. Given Cantor Normal Forms α = ωα1 · a1 + · · ·+ ωαr · ar and β = ωβ1 · b1 + · · ·+ ωβs · bs with α > β, let D(α, β) denote the first (i.e. greatest) exponent αi at which they differ. Thus ωα1 · a1 + · · ·+ ωαi−1 · ai−1 = ωβ1 · b1 + · · ·+ ωβi−1 · bi−1 and ωαi · ai > ωβi · bi + · · ·+ ωβs · bs.

Definition. For each n ≥ 2 the function Cn from the (n + 1)-element subsets of ε0(n−1) into 2^n − 1 is given by the following induction. The definition of Cn(α0, α1, . . . , αn) requires that the ordinals are listed in descending order; whenever we need to emphasise this we write Cn(α0, α1, . . . , αn)> instead. Note that if α, β < ε0(n− 1) then D(α, β) < ε0(n− 2).

C2(α0, α1, α2)> =
  0 if D(α0, α1) > D(α1, α2)
  1 if D(α0, α1) < D(α1, α2)
  2 if D(α0, α1) = D(α1, α2)

and for each n > 2,

Cn(α0, . . . , αn)> =
  2 · Cn−1(δ0, . . . , δn−1) if D(α0, α1) > D(α1, α2)
  2 · Cn−1(δn−1, . . . , δ0) + 1 if D(α0, α1) < D(α1, α2)
  2^n − 2 if D(α0, α1) = D(α1, α2)

where δi = D(αi, αi+1) for each i < n.

Lemma. If S = {γ0, γ1, . . . , γr}> is homogeneous for Cn then, letting max(γ0) denote the maximum coefficient of γ0 and k(n) = 1 + 2 + · · ·+ (n− 1) + 2, we have |S| < max(γ0) + k(n).

Proof. Proceed by induction on n ≥ 2.

For the base-case we have ε0(1) = ωω and C2 : (ωω)[3] → 3. Since S is a subset of ωω the values of D(γi, γi+1), for i < r, are integers. Let γ0, the greatest member of S, have Cantor Normal Form:

γ0 = ωm · cm + ωm−1 · cm−1 + · · ·+ ω2 · c2 + ω · c1 + c0

where some of cm−1, . . . , c1, c0 may be zero, but cm > 0. Then for each i < r, D(γi, γi+1) ≤ m ≤ max(γ0) (note that m ≤ max(γ0), since the exponent m = ω0 · m contributes the coefficient m when γ0 is written in complete form). Now if C2 has constant value 0 or 1 on S[3] then all D(γi, γi+1), for i < r, are distinct, and since we have r distinct numbers ≤ max(γ0) it follows that |S| = r + 1 < max(γ0) + 3 as required. If, on the other hand, C2 has constant value 2 on S[3] then all the D(γi, γi+1) are equal, say to j. But then the Cantor Normal Form of each γi contains a term ωj · ci,j where 0 ≤ cr,j < cr−1,j < · · · < c0,j = cj ≤ max(γ0). In this case we have r + 1 distinct numbers ≤ max(γ0) and hence, again, |S| = r + 1 < max(γ0) + 3.

For the induction step assume n > 2. Assume also that r ≥ k(n), for otherwise the desired result |S| < max(γ0) + k(n) is automatic.

First, suppose Cn is constant on S[n+1] with even value < 2^n − 2. Note that the final (n+ 1)-tuple of S is (γr−n, γr−n+1, γr−n+2, . . . , γr)>. Therefore, by the first case in the definition of Cn,

D(γ0, γ1) > D(γ1, γ2) > · · · > D(γr−n+1, γr−n+2)

and this set is homogeneous for Cn−1 (the condition r ≥ k(n) ensures that it has more than n elements). Consequently, by the induction hypothesis,


r − n+ 2 < max(D(γ0, γ1)) + k(n− 1) and therefore, since D(γ0, γ1) occurs as an exponent in the Cantor Normal Form of γ0,

|S| = r + 1 < max(D(γ0, γ1)) + k(n− 1) + (n− 1) ≤ max(γ0) + k(n)

as required.

Second, suppose Cn is constant on S[n+1] with odd value. Then by the definition of Cn we have

D(γr−n+1, γr−n+2) > D(γr−n, γr−n+1) > · · · > D(γ0, γ1)

and this set is homogeneous for Cn−1. So by applying the induction hypothesis, r − n+ 2 < max(D(γr−n+1, γr−n+2)) + k(n− 1) and hence

|S| = r + 1 < max(D(γr−n+1, γr−n+2)) + k(n).

Now in this case, since D(γ1, γ2) > D(γ0, γ1) it follows that the initial segments of the Cantor Normal Forms of γ0 and γ1 are identical down to and including the term with exponent D(γ1, γ2). Therefore D(γ1, γ2) = D(γ0, γ2). Similarly D(γ2, γ3) = D(γ1, γ3) = D(γ0, γ3) and by repeating this argument one obtains eventually D(γr−n+1, γr−n+2) = D(γ0, γr−n+2). Thus D(γr−n+1, γr−n+2) is one of the exponents in the Cantor Normal Form of γ0, so its maximum coefficient is bounded by max(γ0) and, again, |S| < max(γ0) + k(n).

Finally suppose Cn is constant on S[n+1] with value 2^n − 2. In this case all the D(γi, γi+1) are equal, say to δ, for i < r − n+ 2. Let di be the coefficient of ωδ in the Cantor Normal Form of γi. Then d0 > d1 > · · · > dr−n+1 > 0 and so r − n+ 1 < d0 ≤ max(γ0). Therefore |S| = r + 1 < max(γ0) + k(n) and this completes the proof.
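The base case (n = 2) of the lemma can be tested exhaustively on a small pool of ordinals below ωω. In the sketch below (an illustration, not the book's code) an ordinal is coded as its descending list of (exponent, coefficient) pairs, max(γ) is read off hereditarily (so an exponent m counts as a coefficient, m being ω0 · m in complete form), and every C2-homogeneous subset of the pool is checked against the bound |S| < max(γ0) + 3.

```python
from itertools import combinations

def D(a, b):
    # greatest exponent at which a and b differ, assuming a > b;
    # a, b are tuples of (exponent, coefficient) pairs, exponents descending
    i = 0
    while i < len(b) and a[i] == b[i]:
        i += 1
    return a[i][0]

def C2(a0, a1, a2):   # arguments in descending order a0 > a1 > a2
    d01, d12 = D(a0, a1), D(a1, a2)
    return 0 if d01 > d12 else (1 if d01 < d12 else 2)

def maxc(a):
    # hereditary maximum coefficient: below omega^omega an exponent m
    # is itself the coefficient of omega^0 in its complete form
    return max(max(e, c) for e, c in a)

# pool: the 26 nonzero ordinals w^2*c2 + w*c1 + c0 with coefficients <= 2,
# sorted descending (tuple order on the term lists matches the ordinal order)
pool = sorted((tuple(t for t in ((2, c2), (1, c1), (0, c0)) if t[1])
               for c2 in range(3) for c1 in range(3) for c0 in range(3)),
              reverse=True)[:-1]

homog = 0
for size in range(3, 6):
    for S in combinations(pool, size):             # S inherits descending order
        if len({C2(*t) for t in combinations(S, 3)}) == 1:
            homog += 1
            assert len(S) < maxc(S[0]) + 3         # the bound of the lemma
assert homog > 0
```

Every C2-homogeneous subset of this pool does satisfy the bound, and in particular no homogeneous 5-element set exists, since all coefficients in the pool are at most 2.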

Lemma. For each n ≥ 2 let l(n) = 2k(n) + 2^n − 1. Then there is a colouring cn : Hε0(n−1)(k(n))[n+1] → l(n) which has no large homogeneous sets.

Proof. Fix n ≥ 2 and let k = k(n). Recall that

Hε0(n−1)(k) = µy>k(Py−1Py−2 · · ·Pk(ε0(n− 1)) = 0).

As i increases from k up to Hε0(n−1)(k) − 1, the associated sequence of ordinals αi = PiPi−1 · · ·Pk(ε0(n−1)) strictly decreases to 0. Therefore, from the above colouring Cn on sets of ordinals below ε0(n− 1), we can define a colouring dn on the (n+1)-subsets of {2k, 2k+1, . . . ,Hε0(n−1)(k)− 1} thus:

dn(x0, x1, . . . , xn)< = Cn(αx0−k, αx1−k, . . . , αxn−k)>.

Clearly, every homogeneous set {y0, y1, . . . , yr}< for dn corresponds to a homogeneous set {αy0−k, αy1−k, . . . , αyr−k}> for Cn, and by the previous lemma it has fewer than max(αy0−k) + k elements. Now the maximum coefficient of any Pi(β) is no greater than the maximum of i and max(β), so max(αy0−k) ≤ y0 − k. Therefore every homogeneous set {y0, y1, . . . , yr}< for dn has fewer than y0 elements.

From dn construct cn : Hε0(n−1)(k)[n+1] → l(n) as follows:

cn(x0, x1, . . . , xn)< =
  dn(x0, x1, . . . , xn) if x0 ≥ 2k
  x0 + 2^n − 1 if x0 < 2k


Suppose {y0, y1, . . . , yr}< is homogeneous for cn with colour ≥ 2^n − 1. Then by the second clause, y0 + 2^n − 1 = cn(y0, y1, . . . , yn) = cn(y1, y2, . . . , yn+1) = y1 + 2^n − 1 and hence y0 = y1, which is impossible. Therefore any homogeneous set for cn has least element y0 ≥ 2k and, by the first clause, it must be homogeneous for dn also. Thus it has fewer than y0 elements, and hence this colouring cn has no large homogeneous sets.

Theorem (Paris-Harrington 1977). The Modified Finite Ramsey Theorem ∀n∀l ∃m (m→ (large)nl ) is true but not provable in PA.

Proof. Suppose, toward a contradiction, that ∀n∀l ∃m (m→ (large)nl ) were provable in PA. Then the function

PH(n, l) = µm(m→ (large)nl )

would be provably recursive in PA, and so also would be the function f(n) = PH(n + 2, l(n + 1)). For each n, f(n) is so big that every colouring on f(n)[n+2] with l(n+1) colours has a large homogeneous set. The last lemma, with n replaced by n + 1, gives a colouring cn+1 : Hε0(n)(k(n + 1))[n+2] → l(n+1) with no large homogeneous sets. Therefore f(n) > Hε0(n)(k(n+1)), for otherwise cn+1, restricted to f(n)[n+2], would have a large homogeneous set. Since Hε0(n) is increasing, Hε0(n)(k(n+1)) > Hε0(n)(n) = Hε0(n). Hence f(n) > Hε0(n) for all n, and since Hε0 eventually dominates all provably recursive functions of PA it follows that f cannot be provably recursive. This is the contradiction.

4.5. Notes

To be written.


CHAPTER 5

Accessible Recursive Functions

As we shall see in section one below, the class of all recursive functions fails to possess a natural hierarchical structure, generated inductively from "within". On the other hand, many proof-theoretically significant subrecursive classes do. This chapter attempts to measure the limits of predicative generation in this context, by classifying and characterizing those (predictably terminating) recursive functions which can be successively defined according to an autonomy principle of the form: allow recursions only over well-orderings which have already been "coded" at previous levels. The question is: how can a recursion code a well-ordering? The answer lies in Girard's theory of dilators, but it is reworked here in an entirely different and much simplified framework specific to our subrecursive purposes. The "accessible" recursive functions thus generated turn out to be those provably recursive in the theory ID<ω of finitely iterated inductive definitions, or equivalently in the second order theory of Π11-Comprehension.

5.1. The Subrecursive Stumblingblock

An obvious goal would be to find, once and for all, a natural transfinite hierarchy classification of all the recursive functions which clearly reflects their computational and termination complexity. There is one for the total type-two recursive functionals, as we saw in 2.8. So why isn't there one for the type-one recursive functions as well? The reason is that the termination statement for a type-two recursive functional is a well-foundedness condition – i.e., a statement that a certain recursive ordinal exists – whereas the termination statements for recursive functions are merely arithmetical and have nothing apparently to do with ordinals. This is all somewhat vague and meaningless, but there are some basic negative results of general recursion theory which help explain it more precisely.

Firstly, it is simply not possible to classify recursive functions in terms of the order-types of termination orderings, since every recursive function has a simply definable (e.g. ∆0 or elementary) termination ordering of length ω. This result goes back to Myhill (1953), Routledge (1953) and Liu (1960).

Theorem. For every recursive function ϕe there is an elementary recursive well-ordering <e of order-type ω in which the rank of any point (n, 0) is a bound on the number of steps needed to compute ϕe(n). Thus ϕe is definable by an easy recursion over <e.

Proof. Define the well-ordering <e ⊆ N×N by: (n, s) <e (n′, s′) if and only if either (i) n < n′ or (ii) n = n′, s > s′ and ϕe(n) is undefined at step s′. Then the well-foundedness of <e is just a restatement of the assumption that the computation of ϕe(n) terminates for every n. Furthermore the rank


or height of the point (n, 0) is just the rank of (n− 1, 0) (if n > 0) plus the number of steps needed to compute ϕe(n). Using the notation of Kleene's Normal Form Theorem in chapter 2, we can define the rank r of any point in the well-ordering quite simply by:

r(n, s) =
  0 if n = 0 ∧ T (e, 0, s)
  r(n, s+ 1) + 1 if ¬T (e, n, s)
  r(n− 1, 0) + 1 if n > 0 ∧ T (e, n, s)

and then for each n we have ϕe(n) = U(e, n, r(n, 0)).
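The construction can be made concrete with a toy machine. In the sketch below, steps, T and U are stand-ins we supply (not the book's Kleene predicates): ϕ(n) is imagined to halt after steps(n) steps, r is the rank function just defined, and ϕ(n) = U(n, r(n, 0)) is recovered, since r(n, 0) bounds the halting step.

```python
def steps(n):            # assumed step-count of the toy computation of phi(n)
    return n * n + 1

def T(n, s):             # "the computation of phi(n) has halted by step s"
    return s >= steps(n)

def U(n, s):             # output read off a halted computation; here phi(n) = 2n
    return 2 * n

def r(n, s):
    # rank of the point (n, s) in the well-ordering <_e of order-type omega,
    # following the threefold case distinction in the text
    if n == 0 and T(0, s):
        return 0
    if not T(n, s):
        return r(n, s + 1) + 1
    return r(n - 1, 0) + 1

for n in range(5):
    assert r(n, 0) >= steps(n)       # the rank bounds the number of steps
    assert U(n, r(n, 0)) == 2 * n    # phi(n) recovered at a stage past halting
```

The ordering has order-type ω even though ϕ may be arbitrarily complex — which is exactly the point of the theorem: order-types alone carry no subrecursive information.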

This result tells us that subrecursive hierarchies must inevitably be "notation-dependent". They must depend upon given well-orderings, not just on their order-types. So what is a subrecursive hierarchy?

Definition. By a subrecursive hierarchy we mean a triple (C,P,≺) where ≺ is a recursively enumerable relation, P is a linearly and hence well-ordered initial segment of the accessible part of ≺ and, uniformly to each a ∈ P , C assigns an effectively generated class C(a) of recursive functions so that C(a′) ⊆ C(a) whenever a′ ≺ a. Furthermore we require that there are elementary relations Lim, Succ and Zero which decide for each a ∈ P whether a represents a limit ordinal, a successor or zero, and an elementary function pred which computes the immediate predecessor of a if it happens to represent a successor. We also assume that pred(a) < a whenever Succ(a) holds.

For example, the classes C(a) could be the functions elementary in Fa where F is some version of the fast-growing hierarchy, but what gives the hierarchy its power is the size and structure of its well-ordering (P,≺). There is a universal system of notations for such well-orderings, called Kleene's O, and it will be convenient, now and for later, to develop its basic properties. Our somewhat modified version will however be denoted W.

It is important to note that the set of numbers W is just a constructivized version of the set of countable tree ordinals Ω developed earlier.

Definition. (i) The set W of "constructive ordinal notations" is the smallest set closed under the following inductive rule:

a ∈ W ⇐= a = 0 ∨ ∃b∈W(a = 2b+ 1) ∨ ∃e(∀n([e](n) ∈ W) ∧ a = 2e)

where [e] denotes the e-th elementary function in some standard primitive recursive enumeration.

(ii) For each a ∈ W its “rank” is the ordinal |a| given by:

|0| = 0 ; |2b+ 1| = |b|+ 1 ; |2e| = supn(|[e](n)|+ 1).

These ordinals are called the "constructive" or "recursive" ordinals. Their least upper bound is denoted ωCK1.

(iii) The recursively enumerable relation ≺W defined inductively by:

a′ ≺W a ⇐= ∃b(a = 2b+ 1 ∧ a′ ≼W b) ∨ ∃e∃n(a = 2e ∧ a′ ≼W [e](n))

partially orders and is well-founded on W. In fact W is the accessible part of ≺W .

Page 167: Schwichtenberg & Wainer- Proofs and Computations

5.1. THE SUBRECURSIVE STUMBLINGBLOCK 157

(iv) A path in W is any subset P ⊆ W which is linearly (and hence well-) ordered by ≺W and contains with each a ∈ P all its ≺W -predecessors. If it contains a notation for every recursive ordinal then it is called a path through W.

Theorem. If P is a path in W then the well-ordering (P,≺W) satisfies the conditions of the definition of a subrecursive hierarchy. Conversely every well-ordering (P,≺) satisfying those conditions is isomorphic to a path in W.

Proof. It is clear from the last set of definitions that if P is a path in W then (P,≺W) is a well-ordering satisfying the conditions of the definition of a subrecursive hierarchy. For the converse, let (P,≺) be any well-ordering satisfying those conditions. As ≺ is a recursively enumerable relation, there is an elementary recursive function pr of two variables such that for every number a the function n 7→ pr(a, n) enumerates {a′ | a′ ≺ a} provided it is non-empty.

We now define an elementary recursive function w such that for every a ∈ P we have w(a) ∈ W and |w(a)| is the ordinal represented by a in the well-ordering (P,≺).

w(a) =
  0 if Zero(a)
  2 · w(pred(a)) + 1 if Succ(a)
  2 · e(a) if Lim(a)

where e(a) is an elementary index such that for every n,

[e(a)](n) = w(pr(a, n)).

Clearly e(a) is computed by a standard index-construction using as parameters a given index for the function pr and an assumed one for w itself. Since pred(a) < a the definition of w is thus a course-of-values primitive recursion, and it is bounded by some fixed elementary function depending on the chosen method of indexing. Thus w is definable elementarily from its own index as a parameter, and the second recursion theorem justifies this principle of definition.

It is obvious by induction that if a ∈ P then w(a) ∈ W and |w(a)| is the ordinal represented by a in the well-ordering (P,≺). Thus {w(a) | a ∈ P} is a path in W isomorphic with the given (P,≺). Note that if w(a) ∈ W then although a may not be in P it certainly will lie in the accessible part of ≺.

Theorem. W is a complete Π11 set.

Proof. Since W is the intersection of all sets satisfying a positive arithmetical closure condition, it is Π11. Furthermore if

S = {n | ∀g∃s T (e, n, g(s))}

is any Π11 subset of N then as in 2.8, the Kleene-Brouwer ordering of non-past-secured sequence numbers gives, uniformly to each n, an (elementary) recursive linear ordering (Pn,≺n) having 〈n〉 as its top element, which is well-founded if and only if n ∈ S, and on which it is possible (elementarily) to distinguish limits from successors, and compute the predecessor of any


successor point. Since in this case membership in each Pn is decidable, the function w of the above proof is easily modified (adding n as a new parameter) so that w(n, a) ∈ W if and only if a belongs to the accessible part of ≺n. Therefore with a = 〈n〉, the top element of Pn, we get the reduction:

n ∈ S ⇐⇒ w(n, 〈n〉) ∈ W.

Therefore S is "many-one reducible" to W and hence W is Π11 complete.

Thus every subrecursive hierarchy can be represented in the form (C,P ) where P is some path in W, the underlying relation ≺W now being the same in each case. The level of definability of P then serves as a rough measure of the logical complexity of the given hierarchy. To say that the hierarchy (C,P ) is "inductively generated" is therefore to say that the path P is Π11. Since one can easily manufacture such hierarchies of arbitrary recursive ordinal-lengths, the question of the existence of inductively generated hierarchies which are "complete" in the sense that they capture all recursive functions and extend through all recursive ordinal-levels, becomes the question: is there a subrecursive hierarchy (C,P ) where P is a Π11 path through W? The "stumblingblock" is that the answer is "No".

Theorem. There is no subrecursive hierarchy (C,P ) such that P is a Π11 path through W and ⋃{C(a) | a ∈ P} contains all recursive functions.

Proof. Suppose there were such a hierarchy. Then since at each level a ∈ P , the class ⋃{C(a′) | a′ ≼W a} is a recursively enumerable set of recursive functions, there must always be, at any level a ∈ P , a new recursive function which has not yet appeared. This enables us to define W as follows:

a ∈ W ↔ ∃e(ϕe is total ∧ ∀c(c ∈ P ∧ ϕe ∈ C(c)→ a ∈ W ∧ |a| < |c|))

Now for c ∈ P there is a uniform Σ11 definition of the condition a ∈ W ∧ |a| < |c| since it is equivalent to saying there is an order-preserving function from {d | d ≼W a} into {d | d ≺W c}. Also, notice that the Π11 condition c ∈ P occurs negatively. Since all other components of the right hand side are arithmetical, the above yields a Σ11 definition of W. This is impossible since W is a complete Π11 set; if it were also Σ11 then every Π11 set would be Σ11 and conversely.

The classic paper of Feferman (1962) was the first to provide a detailed technical investigation of the general theory of subrecursive hierarchies. Many fundamental results are proved there, of which the above is just one relatively simple but important example. It is also shown that there are subrecursive hierarchies (C,P ) which contain all recursive functions, but where the path P is arithmetically definable and very short (e.g. of length ω3). These pathological hierarchies are not generated "from below" either, since they are constructed out of an assumed enumeration of indices for all total recursive functions. The classification problem for all recursive functions thus seems intractable. On the other hand, as we have already seen and shall see further, there are hierarchies for "naturally occurring" r.e. subclasses such as the ones provably recursive in arithmetical theories.


5.2. Accessible Recursive Functions

Before one accepts a computable function as being recursive, a proof of totality is required. This will generally be an induction over the tree of sub-computations unravelled from the defining algorithm, since termination corresponds to well-foundedness along the computation-branches. If the tree is well-founded, the strength of the induction principle over its Kleene-Brouwer well-ordering thus serves as a measure of the proof-theoretical complexity of the given function.

The aim of this chapter is to isolate and characterize those recursive functions which may be termed "predicatively accessible" or "predictably terminating" according to the following hierarchy principle: one is allowed to generate a function at a new level only if it is provably recursive over a well-ordering already coded in a previous level, i.e., only if one has already constructed a method to prove its termination.

This begs the question: what should it mean for a well-ordering to be "coded in a previous level"? Certainly it is not enough merely to require that the characteristic function of its ordering relation should have been generated at an earlier stage, since by the Myhill-Routledge observation in the last section, the resulting hierarchy would then collapse in the sense that all recursive functions would appear immediately once the elementary relations had been produced. In order to avoid this circularity, a more delicate notion of "code" for well-orderings is needed, but one which is still finitary in that it should be determined by number theoretic functions only. The crucial idea is the one underpinning Girard's Π12-Logic (1981), and this section can be viewed as a reconstruction of some of the main results there. However our approach is quite different and, since the concern is with only those parts of the general framework specific to subrecursive hierarchies, it can be developed in (we hope) a simpler and more basic context. The slogan is: code well-orderings by number theoretic functors whose direct limits are (isomorphic copies of) the well-orderings themselves. This functorial connection is easily explained.

A well-ordering is an "intensional ordinal". If the ordinal is countable then the additional intensional component should amount to a particular choice of enumeration of its elements. Thus, by a presentation of a countable ordinal α we shall mean a chosen sequence of finite subsets of it, denoted α[n], n ∈ N, such that

∀n(α[n] ⊆ α[n+ 1]) and ∀β<α∃n(β ∈ α[n]).

It will later be convenient to require also that for all β

β+1 ∈ α[n] =⇒ β ∈ α[n] and β ∈ α[n] =⇒ β+1 ∈ α[n+1] if β+1 < α.

Note that a presentation of α immediately induces a sub-presentation for each β < α by β[n] := α[n] ∩ β, and consequently a system of "rank functions" given by

G(β, n) := card β[n]

for β ≤ α, so that if β belongs to α[n] then it is the G(β, n)-th element in ascending order. Thus G(γ, n) < G(β, n) whenever γ ∈ β[n]. This system G, called the "slow-growing hierarchy" on the given presentation, determines


a functor

G(α) : N0 → N

where N0 is the category 0 → 1 → 2 → 3 → . . . in which there is only one arrow (the identity function) imn : m → n if m ≤ n, and where N is the category of natural numbers in which the morphisms between m and n are all strictly increasing maps from {0, 1, . . . ,m − 1} to {0, 1, . . . , n − 1}. The definition of G(α) is straightforward. On numbers we take G(α)(n) = card α[n] and on arrows we take G(α)(imn) to be the map p : G(α,m) → G(α, n) such that if k < G(α,m) and the k-th element of α[m] in ascending order is β, then p(k) = G(β, n).

It is easy to check that if instead we view G(α) as a functor from N0 into the larger category of all linear orderings, with the order-preserving maps as morphisms, then G(α) has a direct limit which will be a well-ordered structure isomorphic to the presentation of α we started with. We shall therefore write

(α, [ ]) = Lim→ G(α)

or more loosely, when the presentation is understood,

α = Lim→ G(α).

G(α) will be taken as the canonical functorial code of the given presentation.

Note that, given two presentations (α, [ ]) and (α′, [ ]′), the existence of a natural transformation from G(α) to G(α′) is equivalent to the existence of an order-preserving map ν from α into α′ such that for every n, ν takes α[n] into α′[n]′. Thus although the notion of "natural well-ordering" or "natural presentation" remains unclear (see Feferman (1996) for a discussion of this bothersome problem), there is nevertheless a "natural" partial ordering of them.

We can now begin to describe what is meant by an accessible recursive function. Firstly we need to develop recursion within a robust hierarchical framework, one which closely reflects provable termination on the one hand, and complexity on the other. That is, if a function is provably recursive over a well-ordering of order-type α then the chosen hierarchy should provide a complexity bound for it, at or near level α. The "fast-growing hierarchy" has this property as we have already seen, and the version B turns out to be a particularly convenient form to work with.

Definition. Given a presentation of α, define for each β ≤ α the function B_β : N → N as follows:

B_0(n) = n + 1 and B_β(n) = B_γ(B_γ(n)) if β ≠ 0,

where γ is the maximum element of β[n].
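On the standard presentation of ω (where ω[n] = {0, …, n−1} and k[n] = {0, …, k−1} for finite k), these two equations can be run directly. The encoding below is our own illustrative sketch:

```python
# Ordinals up to omega: integers 0,1,2,... and the symbol "omega".
# beta[n] = {0,...,beta-1} for finite beta; omega[n] = {0,...,n-1}.

def max_pred(beta, n):
    """The maximum element gamma of beta[n]."""
    if beta == "omega":
        return n - 1
    return beta - 1

def B(beta, n):
    # B_0(n) = n+1;  B_beta(n) = B_gamma(B_gamma(n)), gamma = max of beta[n]
    if beta == 0:
        return n + 1
    gamma = max_pred(beta, n)
    return B(gamma, B(gamma, n))

# B_k(n) = n + 2^k for finite k, and B_omega(n) = n + 2^n.
assert B(2, 5) == 5 + 4
assert B("omega", 3) == 3 + 2 ** 3   # = 11
```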

Theorem. For a suitably large class of ordinal presentations α, the function B_α naturally extends to a functor on N. This functor is, in the sense described earlier, a canonical code for a (larger) ordinal presentation α^+. Thus B_α = G(α^+) and hence α^+ = Lim→ B_α.

Definition. The accessible part of the fast-growing hierarchy is defined to be (B_α)_{α<τ} where τ = sup τ_i and the presentations τ_i are generated as follows:

τ_0 = ω and τ_{i+1} = Lim→ B_{τ_i} = τ_i^+.

The accessible recursive functions are those computable within B_α-bounded time or space, for any α < τ (or those Kalmar-elementary in the B_α's, α < τ).

Theorem. τ is a presentation of the proof-theoretic ordinal of the theory Π^1_1-CA0. The accessible recursive functions are therefore the provably recursive functions of this theory.

The main effort of this section will lie in computing the operation α ↦ α^+ and establishing the functorial identity B_α = G(α^+). The following section will characterize the ordinals τ_i and their limit τ proof-theoretically. In fact τ_{i+2} will turn out to be the ordinal of the theory ID_i of an i-times iterated inductive definition. In order to compute these and other moderately large recursive ordinals we shall need to make uniform recursive definitions of systems of "fast-growing" operations on ordinal presentations. It will therefore be convenient to develop a more explicitly computational theory of ordinal presentations within a uniform inductive framework. This is where "structured tree ordinals" come into play, but now we shall need to generalize them to all finite number classes, the idea being that large ordinals in one number class can be presented in terms of a fast-growing hierarchy indexed by ordinal presentations in the next number class.

Note. It will be intuitively clear that the ordinals computed are indeed recursive ordinals, having notations in the set W. However throughout this section we shall suppress all the recursion-theoretic machinery of W to do with coding limit ordinals by recursive indices etcetera, and concentrate purely on their abstract structure as unrestricted tree ordinals in Ω, wherein arbitrary sequences are allowed and not just recursive (or elementary) ones. Later we shall be forced to code them up as ordinal notations, and it will be fairly obvious how this should be done. However it all adds a further level of technical and syntactical complexity that we don't need to be bothered with at present. Things are complicated enough without at the same time having to worry about recursion indices. So for the time being let us agree to work over the classical Ω instead of the constructive W, and appeal to Church's Thesis whenever we want to claim that a tree ordinal is recursive.

5.2.1. Structured tree ordinals. The sets Ω_0 ⊆ Ω_1 ⊆ Ω_2 ⊆ … of finite, countable and higher-level tree ordinals (hereafter denoted by lower-case greek letters) are generated by the following iterated inductive definition:

α ∈ Ω_k ⇐ α = 0 ∨ ∃β ∈ Ω_k(α = β + 1) ∨ ∃i<k(α : Ω_i → Ω_k)

where β + 1 denotes β ∪ {β}, and if α : Ω_i → Ω_k we call it a limit and often write, more suggestively, α = sup_{Ω_i} α_ξ, the subscript ξ denoting evaluation of the function at ξ. We often use λ to denote such limits. The subtree partial ordering ≺ on Ω_k is the transitive closure of β ≺ β + 1 and α_ξ ≺ α for each ξ. The identity function on Ω_i will be denoted ω_i, so that ω_i = sup_{Ω_i} ξ ∈ Ω_k whenever i < k.

Definition. For each α ∈ Ω_k and ~γ ∈ Ω_{k−1} × Ω_{k−2} × ⋯ × Ω_0 define the finite linearly-ordered set α[~γ] of ≺-predecessors of α by induction as follows:

0[~γ] = ∅ ;  (α + 1)[~γ] = α[~γ] ∪ {α} ;  (sup_{Ω_i} α_ξ)[~γ] = α_{γ_i}[~γ].
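For the countable case k = 1, where ~γ is a single number n, this definition is directly executable. The following encoding of Ω_1 by Python objects is our own sketch:

```python
from dataclasses import dataclass
from typing import Callable

class Tree:
    """Countable tree ordinals: zero, successors, limits (functions N -> Omega_1)."""

@dataclass(frozen=True)
class Zero(Tree):
    pass

@dataclass(frozen=True)
class Suc(Tree):
    pred: Tree

@dataclass(frozen=True)
class Lim(Tree):
    seq: Callable[[int], Tree]

def bracket(alpha: Tree, n: int) -> list:
    """alpha[n]: 0[n] = {}, (a+1)[n] = a[n] u {a}, (sup a_z)[n] = a_n[n]."""
    if isinstance(alpha, Zero):
        return []
    if isinstance(alpha, Suc):
        return bracket(alpha.pred, n) + [alpha.pred]
    return bracket(alpha.seq(n), n)

def finite(m: int) -> Tree:
    t: Tree = Zero()
    for _ in range(m):
        t = Suc(t)
    return t

omega = Lim(finite)   # omega = sup_z z

# omega[4] = {0, 1, 2, 3}
assert bracket(omega, 4) == [finite(m) for m in range(4)]
```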

Definition. The subset Ω^S_k of structured tree ordinals at level k is defined by induction on k. If each Ω^S_i has already been defined for i < k, let ≺^S ⊆ Ω_k × Ω_k be the transitive closure of β ≺^S β + 1 and α_ξ ≺^S α for every ξ ∈ Ω^S_i, in the case where α : Ω_i → Ω_k. Then Ω^S_k consists of those α ∈ Ω_k such that for every λ ≼^S α with λ = sup_{Ω_i} λ_ξ, the following condition holds:

∀~γ ∈ Ω^S_{k−1} × Ω^S_{k−2} × ⋯ × Ω^S_0  ∀ξ ∈ ω_i[~γ]  (λ_ξ ∈ λ[~γ]).

Remark. The structuredness condition above ensures that "fundamental sequences" mesh together appropriately. In particular, since Ω^S_0 = Ω_0 = N and since ω_0[x] = {0, 1, 2, …, x−1} for each x ∈ Ω_0, the condition for countable tree ordinals λ = sup_{Ω_0} λ_z simply amounts to

∀x ∀z<x (λ_z ∈ λ_x[x]).

Note also that if α ∈ Ω^S_k and β ≺^S α then β ∈ Ω^S_k, that ω_0, ω_1, …, ω_{k−1} are structured at level k, and that Ω^S_i ⊆ Ω^S_k whenever i < k.

Theorem. If α ∈ Ω^S_1 then

∀x(α[x] ⊆ α[x+1]) and ∀β ≺ α ∃x(β ∈ α[x]).

Proof. By induction over the generation of countable tree ordinals α. The α = 0 and α = β + 1 cases are trivial. Suppose α = sup α_z where z ranges over Ω_0 = N, and assume inductively that the result holds for each α_z individually. Since α is structured, we have for each x ∈ N, α_x ∈ α[x+1] and hence α_x[x+1] ⊆ α[x+1]. Thus α[x] = α_x[x] ⊆ α_x[x+1] ⊆ α[x+1]. For the second part, if β ≺ α then β ≼ α_z for some z. So by the induction hypothesis β ∈ α_z[x] ∪ {α_z} for all sufficiently large x. Therefore choosing x > z we have α_z ∈ α[x] by structuredness, hence α_z[x] ∪ {α_z} ⊆ α[x], and hence β ∈ α[x].

Theorem (Structure). If α is a countable structured tree ordinal then {β | β ≺ α} is well-ordered by ≺, and if β + 1 ≺ α then we have for all n, β+1 ∈ α[n] =⇒ β ∈ α[n] and β ∈ α[n] =⇒ β+1 ∈ α[n+1]. Therefore by associating to each β ≼ α its set-theoretic ordinal |β| = sup{ |γ|+1 | γ ≺ β }, it is clear that α determines a presentation of the countable ordinal |α| given by |α|[n] = { |β| | β ∈ α[n] }.

Proof. By the lemma above, if β ≺ α and γ ≺ α then for some large enough x, β and γ both lie in α[x], and so β ≺ γ or γ ≺ β or β = γ. Hence ≺ well-orders {β | β ≺ α}. The rest is quite straightforward.

Thus Ω^S_1 provides a convenient structure over which ordinal presentations can be computed. The reason for introducing higher-level tree ordinals Ω^S_k is that they will enable us to name large elements of Ω^S_1 in a uniform way, by higher-level versions of the fast-growing hierarchy.

Definition. The ϕ-hierarchy at level k is the function

ϕ^(k) : Ω_{k+1} × Ω_k → Ω_k

defined by the following recursion over α ∈ Ω_{k+1}:

ϕ^(k)(0, β) := β + 1,
ϕ^(k)(α+1, β) := ϕ^(k)(α, ϕ^(k)(α, β)),
ϕ^(k)(sup_{Ω_i} α_ξ, β) := sup_{Ω_i} ϕ^(k)(α_ξ, β) if i < k,
ϕ^(k)(sup_{Ω_k} α_ξ, β) := ϕ^(k)(α_β, β).

When the context is clear, the superscript (k) will be suppressed. Also ϕ(α, β) will sometimes be written ϕ_α(β).

Note. At level k = 0 we have ϕ^(0)_α = B_α where B_α is the fast-growing function defined in the introduction according to the presentation determined by α ∈ Ω^S_1 as in the Structure Theorem. This is because α[n] = α_n[n] if α is a limit, and the maximum element of (α+1)[n] is α.
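The identity ϕ^(0)_α = B_α can be checked numerically on small tree ordinals. The nested-tuple encoding below is our own sketch; at level 0 the four defining equations collapse to three, since the i < k clause is vacuous:

```python
# Countable tree ordinals as nested data (our own toy encoding):
# ("zero",) is 0, ("suc", a) is a+1, ("lim", f) is sup_z f(z).

ZERO = ("zero",)

def suc(a):
    return ("suc", a)

def phi0(alpha, n):
    """phi^(0)(alpha, n): n+1 at zero, double application at successors,
    diagonalisation at limits."""
    if alpha == ZERO:
        return n + 1
    if alpha[0] == "suc":
        return phi0(alpha[1], phi0(alpha[1], n))
    return phi0(alpha[1](n), n)

def finite(m):
    a = ZERO
    for _ in range(m):
        a = suc(a)
    return a

omega = ("lim", finite)

# phi^(0)_omega(n) = B_omega(n) = n + 2^n on the induced presentation.
assert phi0(omega, 3) == 11
assert [phi0(finite(k), 1) for k in range(4)] == [2, 3, 5, 9]
```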

Lemma (Properties of ϕ). For any level k, all α, α′ ∈ Ω_{k+1}, all β ∈ Ω_k, and all ~γ ∈ Ω_{k−1} × ⋯ × Ω_0,

α′ ∈ α[β, ~γ] =⇒ ϕ(α′, β) ∈ ϕ(α, β)[~γ].

Proof. By induction on α ∈ Ω_{k+1}. The implication holds vacuously if α = 0 since 0[β, ~γ] is empty. For the successor step α to α+1, if α′ ∈ (α+1)[β, ~γ] then α′ ∈ α[β, ~γ] ∪ {α}, so by the induction hypothesis, ϕ(α′, β) ∈ ϕ(α, β)[~γ] ∪ {ϕ(α, β)}. But δ ∈ ϕ(α, δ)[~γ] for any δ, so putting δ = ϕ(α, β) gives ϕ(α′, β) ∈ ϕ(α, ϕ(α, β))[~γ] = ϕ(α+1, β)[~γ] as required. Now suppose α = sup_{Ω_i} α_ξ. If i < k then from the definitions, α[β, ~γ] = α_{γ_i}[β, ~γ] and ϕ(α, β)[~γ] = ϕ(α_{γ_i}, β)[~γ], so the result follows immediately from the induction hypothesis for α_{γ_i} ≺ α. The final case i = k follows in exactly the same way, but with γ_i replaced by β.

Theorem. ϕ preserves structuredness. For any level k, if α ∈ Ω^S_{k+1} and β ∈ Ω^S_k then ϕ(α, β) ∈ Ω^S_k.

Proof. By induction on α ∈ Ω^S_{k+1}. The zero and successor cases are immediate, and so is the limit case α = sup_{Ω_k} α_ξ, since if β ∈ Ω^S_k then by definition α_β ∈ Ω^S_{k+1} and hence ϕ(α, β) = ϕ(α_β, β) ∈ Ω^S_k. Suppose then that α = sup_{Ω_i} α_ξ where i < k. Then for every ξ ∈ Ω^S_i we have α_ξ ∈ Ω^S_{k+1} and so ϕ(α_ξ, β) ∈ Ω^S_k by the induction hypothesis. It remains only to check the structuredness condition for λ = ϕ(α, β) = sup_{Ω_i} ϕ(α_ξ, β). Assume ~γ ∈ Ω^S_{k−1} × ⋯ × Ω^S_0 and ξ ∈ ω_i[~γ]. Then by the structuredness of α we have α_ξ ∈ α[β, ~γ], because ξ ∈ ω_i[~γ] implies ξ ∈ ω_i[β, ~γ] when i < k. Therefore by the last lemma, ϕ(α_ξ, β) ∈ ϕ(α, β)[~γ]. So ϕ(α, β) ∈ Ω^S_k.

Corollary. Define, for each positive integer k,

τ_k = ϕ^(1)(ϕ^(2)(… ϕ^(k)(ω_k, ω_{k−1}) …, ω_1), ω_0)

and set τ_0 = ω_0. Thus τ_1 = ϕ^(1)(ω_1, ω_0), τ_2 = ϕ^(1)(ϕ^(2)(ω_2, ω_1), ω_0), etcetera. Then each τ_k ∈ Ω^S_1, and since

ω_{k−1} ∈ ϕ^(k)(ω_k, ω_{k−1})[ω_{k−2}, …, ω_0, k]

we have τ_{k−1} ∈ τ_k[k] by repeated application of the last lemma. Therefore τ = sup τ_k is also structured.

Our notion of structuredness is closely related to the work of Schmidt (1976) on "step-down" relations and "built-up" systems of fundamental sequences. See also Kadota (1993) for an earlier alternative treatment of τ.

5.2.2. Collapsing properties of G. For the time being we set structuredness to one side and review some of the "arithmetical" properties of the slow-growing G-function developed by Wainer (1989). These will be fundamental to what follows later. Recall that G(α, n) measures the size of α[n]. Since we are now working with tree ordinals α and the presentations determined by them according to the Structure Theorem, it is clear that G can be defined by recursion over α ∈ Ω_1 as follows:

G(0, n) = 0,  G(α+1, n) = G(α, n) + 1,  G(sup α_z, n) = G(α_n, n).

Note that the parameter n does not change in this recursion, so what we are actually defining is a function G_n : Ω_1 → Ω_0 for each fixed n, where G_n(α) := G(α, n). We need to lift this to higher levels.
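The three equations for G run directly on a nested-data encoding of countable tree ordinals (zero / successor / limit; our own illustrative sketch):

```python
# ("zero",) is 0, ("suc", a) is a+1, ("lim", f) is sup_z f(z).
ZERO = ("zero",)

def suc(a):
    return ("suc", a)

def G(alpha, n):
    """G(0,n)=0, G(a+1,n)=G(a,n)+1, G(sup a_z, n)=G(a_n, n); n stays fixed."""
    if alpha == ZERO:
        return 0
    if alpha[0] == "suc":
        return G(alpha[1], n) + 1
    return G(alpha[1](n), n)

def add_fin(a, m):
    for _ in range(m):
        a = suc(a)
    return a

def plus_omega(a):
    """a + omega = sup_z (a + z)."""
    return ("lim", lambda z: add_fin(a, z))

omega = plus_omega(ZERO)
omega2 = plus_omega(omega)   # omega * 2

assert G(omega, 5) == 5
assert G(omega2, 5) == 10    # G(omega*2, n) = 2n
```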

Definition. Fix n ∈ N and, by induction on k, define the functions G_n : Ω_{k+1} → Ω_k and L_n : Ω_k → Ω_{k+1} as follows:

G_n(0) = 0                                        L_n(0) = 0
G_n(α+1) = G_n(α) + 1                             L_n(β+1) = L_n(β) + 1
G_n(sup_{Ω_0} α_z) = G_n(α_n)
G_n(sup_{Ω_{i+1}} α_ξ) = sup_{Ω_i} G_n(α_{L_n ζ})     L_n(sup_{Ω_i} β_ζ) = sup_{Ω_{i+1}} L_n(β_{G_n ξ})

Lemma. For all β ∈ Ω_k we have G_n(L_n(β)) = β. Hence for every positive k, G_n(ω_k) = ω_{k−1}.

Proof. By induction on β ∈ Ω_k. The zero and successor cases are immediate, and if β = sup_{Ω_i} β_ζ then, assuming we have already proved G_n(L_n(ζ)) = ζ for every ζ ∈ Ω_i (i < k), we can simply unravel the above definitions to obtain

G_n(L_n(β)) = G_n(sup_{Ω_{i+1}} L_n(β_{G_n ξ})) = sup_{Ω_i} G_n(L_n(β_{G_n L_n ζ})) = sup_{Ω_i} β_ζ = β.

Hence G_n ∘ L_n is the identity, and since ω_k = sup_{Ω_k} ξ we have

G_n(ω_k) = sup_{Ω_{k−1}} G_n(L_n ζ) = sup_{Ω_{k−1}} ζ = ω_{k−1}.

Note that this Lemma holds for every fixed n.

Definition. For each fixed n define the subset Ω^G_k(n) of Ω_k as follows, by induction on k. Set Ω^G_0(n) = Ω_0 and assume Ω^G_i(n) defined for each i < k. Take ≺^G_n ⊆ Ω_k × Ω_k to be the transitive closure of β ≺^G_n β+1 and α_ξ ≺^G_n α for every ξ ∈ Ω^G_i(n), if α : Ω_i → Ω_k. Then Ω^G_k(n) consists of those α ∈ Ω_k such that for every λ = sup_{Ω_i} λ_ξ ≼^G_n α the following condition holds:

∀ξ ∈ Ω^G_i(n) ( G_n(λ_ξ) = G_n(λ_{L_n G_n ξ}) ).

Call this the "G-condition". Note that Ω^G_0(n) = Ω_0 and Ω^G_1(n) = Ω_1 for every n, since ξ = L_n G_n ξ if ξ ∈ Ω_0.


Lemma. For each fixed n ∈ N,
(a) If λ = sup_{Ω_{i+1}} λ_ξ ∈ Ω^G_k(n) then G_n(λ_ξ) = G_n(λ)_{G_n(ξ)} for every ξ ∈ Ω^G_{i+1}(n).
(b) ω_0, ω_1, …, ω_{k−1} ∈ Ω^G_k(n).
(c) L_n : Ω_k → Ω^G_{k+1}(n).

Proof. First, if λ = sup_{Ω_{i+1}} λ_ξ ∈ Ω^G_k(n) then G_n(λ) = sup_{Ω_i} G_n(λ_{L_n ζ}), so if ξ ∈ Ω^G_{i+1}(n) we can put ζ = G_n(ξ) to obtain

G_n(λ)_{G_n(ξ)} = G_n(λ_{L_n G_n ξ}) = G_n(λ_ξ)

by the G-condition.

Second, note that ω_0 ∈ Ω_1 = Ω^G_1(n) ⊆ Ω^G_k(n) if k > 0. If k > i+1 and λ ≺^G_n ω_{i+1} then λ ∈ Ω^G_{i+1}(n), so it satisfies the G-condition. If λ = ω_{i+1} then it is just the identity function on Ω_{i+1} and so the G-condition amounts to G_n(ξ) = G_n(L_n G_n ξ), which holds because G_n ∘ L_n is the identity. Hence ω_{i+1} ∈ Ω^G_k(n).

Third, we show L_n(β) ∈ Ω^G_{k+1}(n) for every β ∈ Ω_k, by induction on β. The zero and successor cases are immediate. If β = sup_{Ω_i} β_ζ then L_n(β) = sup_{Ω_{i+1}} L_n(β_{G_n ξ}) and L_n(β_{G_n ξ}) ∈ Ω^G_{k+1}(n) by the induction hypothesis. For L_n(β) ∈ Ω^G_{k+1}(n), it remains to check the G-condition:

G_n(L_n(β_{G_n ξ})) = G_n(L_n(β_{G_n L_n G_n ξ})).

Again, this holds because G_n ∘ L_n is the identity.

Theorem. For each fixed n ∈ N and every k, if α ∈ Ω^G_{k+2}(n) and β ∈ Ω^G_{k+1}(n) then
(a) ϕ^(k+1)(α, β) ∈ Ω^G_{k+1}(n)
(b) G_n(ϕ^(k+1)(α, β)) = ϕ^(k)(G_n(α), G_n(β)).

Proof. By induction on α ∈ Ω^G_{k+2}(n). The zero and successor cases are straightforward, and so is the case where α = sup_{Ω_{k+1}} α_ξ, because then we have: (1) ϕ^(k+1)(α, β) = ϕ^(k+1)(α_β, β) ∈ Ω^G_{k+1}(n) by the induction hypothesis, and (2) G_n(α_β) = G_n(α)_{G_n(β)} by the last lemma, so by the induction hypothesis and the definition of the ϕ-functions,

G_n(ϕ^(k+1)(α_β, β)) = ϕ^(k)(G_n(α)_{G_n(β)}, G_n(β)) = ϕ^(k)(G_n(α), G_n(β)).

Now suppose α = sup_{Ω_i} α_ξ with i ≤ k. Then for (1) we have

ϕ^(k+1)(α, β) = sup_{Ω_i} ϕ^(k+1)(α_ξ, β)

and for each ξ ∈ Ω^G_i(n), ϕ^(k+1)(α_ξ, β) ∈ Ω^G_{k+1}(n) by the induction hypothesis. Furthermore, α_{L_n G_n ξ} ∈ Ω^G_{k+2}(n) because L_n takes Ω_i into Ω^G_{i+1}(n), and G_n(α_ξ) = G_n(α_{L_n G_n ξ}). So by the induction hypothesis G_n(ϕ^(k+1)(α_ξ, β)) and G_n(ϕ^(k+1)(α_{L_n G_n ξ}, β)) are identical. Thus ϕ^(k+1)(α, β) ∈ Ω^G_{k+1}(n). For part (2), if i = 0 then G_n(α) = G_n(α_n) and by the induction hypothesis,

G_n(ϕ^(k+1)(α, β)) = G_n(ϕ^(k+1)(α_n, β)) = ϕ^(k)(G_n(α_n), G_n(β)) = ϕ^(k)(G_n(α), G_n(β)).

If i > 0 then for every ζ ∈ Ω_{i−1} we have L_n ζ ∈ Ω^G_i(n), so α_{L_n ζ} ∈ Ω^G_{k+2}(n) and G_n(α_{L_n ζ}) = G_n(α)_ζ. Therefore, using the induction hypothesis once more,

G_n(ϕ^(k+1)(α, β)) = sup_{Ω_{i−1}} G_n(ϕ^(k+1)(α_{L_n ζ}, β))
                   = sup_{Ω_{i−1}} ϕ^(k)(G_n(α_{L_n ζ}), G_n(β))
                   = sup_{Ω_{i−1}} ϕ^(k)(G_n(α)_ζ, G_n(β))
                   = ϕ^(k)(G_n(α), G_n(β)).

This completes the proof.

Corollary. Recalling the definition

τ_k = ϕ^(1)(ϕ^(2)(… ϕ^(k)(ω_k, ω_{k−1}) …, ω_1), ω_0)

and the fact that ϕ^(0) = B, we have for each k > 0

G(τ_k, n) = B_{τ_{k−1}}(n) for all n ∈ N.
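For k = 1 the Corollary asserts G(τ_1, n) = B_{ω_0}(n) = n + 2^n. Unfolding the ϕ-equations gives τ_1 = ϕ^(1)(ω_1, ω_0) = sup_n(ω_0 + 2^n); taking that unfolding as given, the identity can be checked numerically on a toy encoding (our own sketch):

```python
# Countable tree ordinals as nested data: ("zero",), ("suc", a), ("lim", f).
ZERO = ("zero",)

def suc(a):
    return ("suc", a)

def add_fin(a, m):
    for _ in range(m):
        a = suc(a)
    return a

def finite(m):
    return add_fin(ZERO, m)

omega = ("lim", finite)

# tau_1 = phi^(1)(omega_1, omega_0) unfolds to sup_n (omega + 2^n).
tau1 = ("lim", lambda n: add_fin(omega, 2 ** n))

def G(alpha, n):
    if alpha == ZERO:
        return 0
    if alpha[0] == "suc":
        return G(alpha[1], n) + 1
    return G(alpha[1](n), n)

def B(alpha, n):
    if alpha == ZERO:
        return n + 1
    if alpha[0] == "suc":
        return B(alpha[1], B(alpha[1], n))
    return B(alpha[1](n), n)

# G(tau_1, n) = B_omega(n) = n + 2^n
assert all(G(tau1, n) == B(omega, n) == n + 2 ** n for n in range(6))
```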

Our next task is to extend this to a functorial identity. The following simple lemma plays a crucial role.

Lemma (Bijectivity). Fix n and k. If α ∈ Ω^G_k(n) and γ_i ∈ Ω^G_i(n) for i < k then G_n bijectively collapses α[γ_{k−1}, …, γ_1, n] onto

G_n(α)[G_n(γ_{k−1}), …, G_n(γ_1)].

Proof. By an easy induction on α, noting that G_n(α+1) = G_n(α) + 1 and G_n(α_γ) = G_n(α)_{G_n(γ)} for γ ∈ Ω^G_i(n).

The functors G, B and ϕ. Henceforth we shall restrict attention to those tree ordinals which simultaneously are structured and possess the G-collapsing properties above.

Definition. O_k := Ω^S_k ∩ ⋂_{n∈N} Ω^G_k(n).

Thus O_0 = N, O_1 = Ω^S_1, and if α = sup_{Ω_i} α_ξ ∈ O_k then α_ξ ∈ O_k whenever ξ ∈ O_i. The preceding subsections give ω_i ∈ O_k for every i < k and ϕ(α, β) ∈ O_k whenever α ∈ O_{k+1} and β ∈ O_k, so we can build lots of elements of O_k using the ϕ functions. The importance of the O_k's is that when restricted to them, the ϕ functions can be made into functors.

Definition. Set O_<k = O_{k−1} × O_{k−2} × ⋯ × O_0 for each k > 0. Make O_<k into a category by choosing as morphisms

σ : (γ_{k−1}, …, γ_0) → (γ′_{k−1}, …, γ′_0)

all ≺-preserving maps from ω_{k−1}[γ_{k−1}, …, γ_0] into ω_{k−1}[γ′_{k−1}, …, γ′_0]. Note that ω_{k−1}[γ_{k−1}, …, γ_0] is the same as γ_{k−1}[γ_{k−2}, …, γ_0].

Definition. The functor G : O_<k+1 → O_<k is given by:
• G(γ_k, …, γ_1, n) = (G_n(γ_k), …, G_n(γ_1))
• If σ : (γ_k, …, γ_1, n) → (γ′_k, …, γ′_1, m) then G(σ) = G_m ∘ σ ∘ G_n^{−1}, where G_n^{−1} is the inverse of the bijection G_n : ω_k[γ_k, …, γ_1, n] → ω_{k−1}[G_n(γ_k), …, G_n(γ_1)] given by the Bijectivity Lemma.

Note. Each α ∈ O_1 can be made into a functor 𝛂 from N0 = 0 → 1 → 2 → 3 → … into O_<2 by defining 𝛂(n) = (α, n) and 𝛂(i_nm) to be the identity embedding of α[n] into α[m]. Then G ∘ 𝛂 is exactly the functor G(α) : N0 → N defined in the introduction.

Before defining ϕ_α as a functor we need a "normal form lemma".

Lemma (Normal Form). For all α ∈ Ω_{k+1}, β ∈ Ω_k and ~γ ∈ Ω_{k−1} × ⋯ × Ω_0: if δ ∈ ϕ^(k)(α, β)[~γ] then either δ ∈ β[~γ] ∪ {β} or else δ is expressible uniquely in the "normal form"

δ = ϕ(α_r, ϕ(α_{r−1}, … ϕ(α_1, ϕ(α_0, β)) … ))

where α_0 ∈ α[β, ~γ] and α_{i+1} ∈ α_i[ϕ(α_i, … ϕ(α_0, β) … ), ~γ]. Furthermore, if α, β and ~γ are structured then for each i < r,

α_{i+1} ∈ α[ϕ(α_i, … ϕ(α_0, β) … ), ~γ].

Proof. By induction on α. If α = 0 then ϕ(α, β)[~γ] = β[~γ] ∪ {β}. If α is a limit then ϕ(α, β)[~γ] = ϕ(α_ξ, β)[~γ] where ξ = β or γ_{k−1} or … or γ_0, so the result follows immediately. For the successor step α to α+1, note that ϕ(α+1, β)[~γ] = ϕ(α, β_1)[~γ] where β_1 = ϕ(α, β). Therefore if δ ∈ ϕ(α+1, β)[~γ] then by the induction hypothesis for α, either δ ∈ β_1[~γ], in which case the unique normal form is as stated, or β_1 ≼ δ, in which case the normal form is as stated but with β replaced by β_1 = ϕ(α, β).

If, furthermore, α, β and ~γ are structured, then by induction on i = 0, 1, 2, …, r−1 we show

α_{i+1} ∈ α[ϕ(α_i, … ϕ(α_0, β) … ), ~γ].

Firstly, we have α[β, ~γ] ⊆ α[ϕ(α_0, β), ~γ] by a simple induction on α. The only non-trivial case is where α : Ω_k → Ω_{k+1}, so α[β, ~γ] = α_β[β, ~γ] ⊆ α_β[ϕ(α_0, β), ~γ]. But β ∈ ϕ(α_0, β)[~γ] = ω_k[ϕ(α_0, β), ~γ], so by structuredness α_β ∈ α[ϕ(α_0, β), ~γ] and hence α_β[ϕ(α_0, β), ~γ] ⊆ α[ϕ(α_0, β), ~γ]. Thus for the base-case i = 0, α_0 ∈ α[β, ~γ] ⊆ α[ϕ(α_0, β), ~γ] and hence α_1 ∈ α_0[ϕ(α_0, β), ~γ] ⊆ α[ϕ(α_0, β), ~γ]. And for the induction step i to i+1, replacing β by ϕ(α_i, … ϕ(α_0, β) … ) and α_0 by α_{i+1} gives

α_{i+2} ∈ α_{i+1}[ϕ(α_{i+1}, … ϕ(α_0, β) … ), ~γ] ⊆ α[ϕ(α_{i+1}, … ϕ(α_0, β) … ), ~γ]

as required.

Note. Conversely, if δ has the normal form above then δ ∈ ϕ^(k)(α, β)[~γ], by repeated application of the lemma on properties of ϕ in 5.2.1.

Definition (Functorial Definition of ϕ). The functor ϕ^(k)_α : O_<k+1 → O_<k+1 is defined as follows. First, assume α ∈ O_{k+1} has already been made into a functor 𝛂 : O_<k+1 → O_<k+2 such that 𝛂(γ_k, …, γ_0) = (α, γ_k, …, γ_0) and 𝛂(i_{γγ′}) = i_{𝛂(γ)𝛂(γ′)}, where i_{γγ′} denotes the identity embedding of

ω_k[γ_k, …, γ_0] = γ_k[γ_{k−1}, …, γ_0]

as a subset of γ′_k[γ′_{k−1}, …, γ′_0] when it exists. Note that this amounts to a monotonicity condition: if σ is a subfunction of σ′ then 𝛂(σ) will be a subfunction of 𝛂(σ′). Girard called such functors "flowers". We can now define ϕ^(k)_α as a functor on O_<k+1; the superscript (k) will be omitted.
(i) ϕ_α(γ_k, γ_{k−1}, …, γ_0) = (ϕ_α(γ_k), γ_{k−1}, …, γ_0)
(ii) If σ : (γ_k, …, γ_0) → (γ′_k, …, γ′_0) then ϕ_α(σ) : ϕ_α(γ_k)[γ_{k−1}, …, γ_0] → ϕ_α(γ′_k)[γ′_{k−1}, …, γ′_0] is the map δ ↦ δ′ built up inductively on δ ∈ ϕ_α(γ_k)[~γ] according to its normal form as in the Normal Form Lemma:

• if δ ∈ γ_k[γ_{k−1}, …, γ_0] then δ′ = σ(δ)
• if δ = γ_k then δ′ = γ′_k
• if δ = ϕ_{α_r} ∘ ⋯ ∘ ϕ_{α_1} ∘ ϕ_{α_0}(γ_k) then set δ′ = ϕ_{α′_r} ∘ ⋯ ∘ ϕ_{α′_1} ∘ ϕ_{α′_0}(γ′_k), where for each i = 0, 1, 2, …, r, α′_i = 𝛂(σ_i)(α_i), with σ_i the previously determined subfunction taking ξ ∈ ϕ_{α_{i−1}} ∘ ⋯ ∘ ϕ_{α_0}(γ_k)[~γ] to ξ′ ∈ ϕ_{α′_{i−1}} ∘ ⋯ ∘ ϕ_{α′_0}(γ′_k)[~γ′].

Note that σ_i is a subfunction of σ_{i+1} and so 𝛂(σ_i) is a subfunction of 𝛂(σ_{i+1}). This means that α_{i+1} occurs below α_i in the domain of 𝛂(σ_{i+1}), and hence α′_{i+1} occurs below α′_i in α[ϕ_{α′_i} ∘ ⋯ ∘ ϕ_{α′_0}(γ′_k), ~γ′]. Thus α′_{i+1} ∈ α′_i[ϕ_{α′_i} ∘ ⋯ ∘ ϕ_{α′_0}(γ′_k), ~γ′] for each i. So δ′ ∈ ϕ_α(γ′_k)[~γ′] as required, by the above note. This completes the definition.

Note. A careful reading of the preceding definition should convince the reader that the maps ϕ^(k)_α(σ) do in fact constitute a functor, that is: ϕ_α(id_γ) = id_{ϕ_α(γ)} and ϕ_α(σ ∘ σ′) = ϕ_α(σ) ∘ ϕ_α(σ′). This depends, of course, on the assumed functoriality of 𝛂. Furthermore, ϕ_α also satisfies the "flower" property, in the sense that if γ_k = γ′_k and σ is the identity function from γ_k[γ_{k−1}, …, γ_0] into γ_k[γ′_{k−1}, …, γ′_0] then ϕ_α(σ) is the identity function from ϕ_α(γ_k)[~γ] into ϕ_α(γ_k)[~γ′]. Again, this depends on the assumption that 𝛂 is a "flower".

Theorem (Commutation). Fix k > 0 and suppose α ∈ O_{k+1} satisfies the assumptions of the functorial definition of ϕ. Suppose also that there is a β ∈ O_k such that G_n(α) = β for every n, and β determines a functor 𝛃 : O_<k → O_<k+1 satisfying G ∘ 𝛂 = 𝛃 ∘ G. Then

G ∘ ϕ^(k)_α = ϕ^(k−1)_β ∘ G.

Proof. Firstly, if (γ_k, …, γ_1, n) ∈ O_<k+1 then by Theorem 5.2.2 and the definition of the functor G,

G ∘ ϕ^(k)_α(γ_k, …, γ_1, n) = G(ϕ^(k)_α(γ_k), γ_{k−1}, …, γ_1, n)
  = (G_n ϕ^(k)_α(γ_k), G_n(γ_{k−1}), …, G_n(γ_1))
  = (ϕ^(k−1)_β(G_n γ_k), G_n(γ_{k−1}), …, G_n(γ_1))
  = ϕ^(k−1)_β(G_n(γ_k), G_n(γ_{k−1}), …, G_n(γ_1))
  = ϕ^(k−1)_β ∘ G(γ_k, …, γ_1, n).

Secondly, if σ : (γ_k, …, γ_1, n) → (γ′_k, …, γ′_1, m) in O_<k+1 then, using the notation of the functorial definition of ϕ and again the definition of the functor G,

G ∘ ϕ^(k)_α(σ) : ϕ^(k−1)_β ∘ G(γ_k, …, γ_1, n) → ϕ^(k−1)_β ∘ G(γ′_k, …, γ′_1, m)

is the map sending G_n(δ) to G_m(δ′) whenever δ ↦ δ′ under ϕ^(k)_α(σ). Therefore in order to prove

G ∘ ϕ^(k)_α(σ) = ϕ^(k−1)_β ∘ G(σ)

we have to check that for every δ ∈ ϕ_α(γ_k)[γ_{k−1}, …, γ_1, n],

G_m(δ′) = ϕ^(k−1)_β(G(σ))(G_n(δ)).

Recall that by the Bijectivity Lemma, G_n always collapses γ_k[γ_{k−1}, …, γ_1, n] bijectively onto G_n(γ_k)[G_n(γ_{k−1}), …, G_n(γ_1)]. Now according to the definition of ϕ^(k)_α(σ) in the functorial definition of ϕ, there are three cases to consider:

(i) If δ ∈ γ_k[γ_{k−1}, …, γ_1, n] then δ′ = σ(δ) and so

G_m(δ′) = G_m ∘ σ(δ) = G(σ)(G_n(δ)) = ϕ^(k−1)_β(G(σ))(G_n(δ))

because in this case we have G_n(δ) ∈ G_n(γ_k)[G_n(γ_{k−1}), …, G_n(γ_1)].
(ii) If δ = γ_k then δ′ = γ′_k, so G_n(δ) = G_n(γ_k) and in this case

G_m(δ′) = G_m(γ′_k) = ϕ^(k−1)_β(G(σ))(G_n(δ)).

(iii) If δ = ϕ^(k)_{α_r} ∘ ⋯ ∘ ϕ^(k)_{α_0}(γ_k) is in ϕ^(k)_α(γ_k)[γ_{k−1}, …, γ_1, n], where each α_{i+1} occurs below α_i in α[ϕ^(k)_{α_i} ∘ ⋯ ∘ ϕ^(k)_{α_0}(γ_k), ~γ], then G_n(δ) = ϕ^(k−1)_{β_r} ∘ ⋯ ∘ ϕ^(k−1)_{β_0}(G_n(γ_k)) with β_i = G_n(α_i) for each i ≤ r. The collapsing property of G_n in the Bijectivity Lemma ensures that β_{i+1} occurs below β_i in β[ϕ^(k−1)_{β_i} ∘ ⋯ ∘ ϕ^(k−1)_{β_0}(G_n γ_k), G_n ~γ] and that

G_n(δ) ∈ ϕ^(k−1)_β(G_n γ_k)[G_n(γ_{k−1}), …, G_n(γ_1)].

Furthermore, every element of ϕ^(k−1)_β(G_n γ_k)[G_n(γ_{k−1}), …, G_n(γ_1)] occurs as such a G_n(δ). In this case, we have δ′ = ϕ^(k)_{α′_r} ∘ ⋯ ∘ ϕ^(k)_{α′_0}(γ′_k) where α′_i = 𝛂(σ_i)(α_i), and σ_i is the previously generated subfunction taking ξ ∈ ϕ^(k)_{α_{i−1}} ∘ ⋯ ∘ ϕ^(k)_{α_0}(γ_k)[~γ] to ξ′ ∈ ϕ^(k)_{α′_{i−1}} ∘ ⋯ ∘ ϕ^(k)_{α′_0}(γ′_k)[~γ′]. Therefore G_m(δ′) = ϕ^(k−1)_{β′_r} ∘ ⋯ ∘ ϕ^(k−1)_{β′_0}(G_m γ′_k) where β′_i = G_m(α′_i) for each i ≤ r. But since G ∘ 𝛂 = 𝛃 ∘ G, it follows that β′_i = G ∘ 𝛂(σ_i)(β_i) = 𝛃 ∘ G(σ_i)(β_i), and G(σ_i) is the subfunction taking G_n(ξ) ∈ ϕ^(k−1)_{β_{i−1}} ∘ ⋯ ∘ ϕ^(k−1)_{β_0}(G_n γ_k)[G_n ~γ] to G_m(ξ′) ∈ ϕ^(k−1)_{β′_{i−1}} ∘ ⋯ ∘ ϕ^(k−1)_{β′_0}(G_m γ′_k)[G_m ~γ′]. All of this means that G_m(δ′) = ϕ^(k−1)_β(G(σ))(G_n(δ)), according to the definition of ϕ^(k−1)_β(G(σ)).

Theorem. Again recall the definition

τ_k = ϕ^(1)(ϕ^(2)(… ϕ^(k)(ω_k, ω_{k−1}) …, ω_1), ω_0).

As before, write B = ϕ^(0) and, assuming α ∈ O_1 has been made into a functor 𝛂 : O_<1 → O_<2, write G(α) = G ∘ 𝛂. Then we have the functorial identity:

G(τ_k) = B_{τ_{k−1}}.

Hence for each k > 0,

Lim→ B_{τ_{k−1}} = τ_k.

Proof. Fix k > 0 and for each i = 1, …, k set

α_i = ϕ^(i)(ϕ^(i+1)(… ϕ^(k)(ω_k, ω_{k−1}) …, ω_i), ω_{i−1}).

Then from the previous subsections we have α_i ∈ O_i. Make ω_i into a functor 𝛚_i : O_<i+1 → O_<i+2 by setting

• 𝛚_i(γ_i, …, γ_0) = (ω_i, γ_i, …, γ_0)
• 𝛚_i(σ) = σ whenever σ : (γ_i, …, γ_0) → (γ′_i, …, γ′_0).

This makes good sense because ω_i[γ_i, …, γ_0] is the same set of tree ordinals as γ_i[γ_{i−1}, …, γ_0]. Since 𝛚_i is just the identity functor it automatically has the "flower" property. Therefore starting with 𝛚_k and repeatedly applying the functorial definition of ϕ and its accompanying note, we can make each α_i into a functor 𝛂_i : O_<i → O_<i+1, again with the flower property, by setting

𝛂_k = ϕ^(k)_{ω_k} ∘ 𝛚_{k−1}

and for each i = k−1, …, 1 in turn,

𝛂_i = ϕ^(i)_{α_{i+1}} ∘ 𝛚_{i−1}.

Exactly the same thing can be done with

β_i = ϕ^(i−1)(ϕ^(i)(… ϕ^(k−1)(ω_{k−1}, ω_{k−2}) …, ω_{i−1}), ω_{i−2})

for 1 < i ≤ k, and we claim that for each such i,

G ∘ 𝛂_i = 𝛃_i ∘ G.

The proof is by downward induction on i = k, k−1, …, 2. If i = k then, since G_n(ω_k) = ω_{k−1} for every n and G ∘ 𝛚_k = 𝛚_{k−1} ∘ G, we can apply the Commutation Theorem to get

G ∘ 𝛂_k = ϕ^(k−1)_{ω_{k−1}} ∘ G ∘ 𝛚_{k−1} = ϕ^(k−1)_{ω_{k−1}} ∘ 𝛚_{k−2} ∘ G = 𝛃_k ∘ G.

The induction step from i+1 to i is similar. First note that by Theorem 5.2.2, G_n(α_{i+1}) = β_{i+1} for every n, and G ∘ 𝛂_{i+1} = 𝛃_{i+1} ∘ G by the induction hypothesis. So the Commutation Theorem applies again, giving

G ∘ 𝛂_i = ϕ^(i−1)_{β_{i+1}} ∘ G ∘ 𝛚_{i−1} = ϕ^(i−1)_{β_{i+1}} ∘ 𝛚_{i−2} ∘ G = 𝛃_i ∘ G.

This proves the claim, and the theorem follows from G ∘ 𝛂_2 = 𝛃_2 ∘ G by one more application of the Commutation Theorem:

G(τ_k) = G ∘ 𝛂_1 = ϕ^(0)_{β_2} ∘ G ∘ 𝛚_0 = B_{τ_{k−1}}

since α_1 = τ_k, β_2 = τ_{k−1}, ϕ^(0) = B and G ∘ 𝛚_0 = id_N.

Remark. Girard's dilators are functors on the category of (set-theoretic) ordinals which commute with direct limits and with pull-backs. Commutation with direct limits provides number-theoretic representation systems for countable ordinals named by the dilator, and commutation with pull-backs ensures uniqueness of representation. Although our context is quite different, the ϕ-functors above are nevertheless dilator-like, since the Commutation Theorem essentially expresses preservation of "limits" under G, and the Normal Form Lemma gives uniqueness of representation with respect to ϕ.

Example. B_{ω_0} is the following functor on O_<1:
(i) B_{ω_0}(n) = n + 2^n
(ii) B_{ω_0}(σ : n → m) is the map taking

n + 2^{n_0} + 2^{n_1} + ⋯ + 2^{n_r} ↦ m + 2^{σ(n_0)} + 2^{σ(n_1)} + ⋯ + 2^{σ(n_r)}.

Thus

Lim→ B_{ω_0} = τ_1 = ϕ^(1)(ω_1, ω_0) = ϕ^(1)(ω_0, ω_0) = ω_0 + 2^{ω_0}

and B_{ω_0} constructs a presentation of the (disappointingly small) ordinal ω·2. Do not be deceived however. After this point the B-functors really get moving. At the next step in the τ-sequence we have

Lim→ B_{τ_1} = τ_2 = ϕ^(1)(ϕ^(2)(ω_2, ω_1), ω_0)
             = ϕ^(1)(ϕ^(2)(ω_1, ω_1), ω_0)
             = ϕ^(1)(ϕ^(2)(ω_0, ω_1), ω_0)
             = ϕ^(1)(ω_1 + 2^{ω_0}, ω_0)
             = sup_n ϕ^(1)(ω_1 + 2^n, ω_0)
             = sup_n ϕ^(1)_{ω_1} ∘ ⋯ ∘ ϕ^(1)_{ω_1}(ω_0)

where, in the last line, there are 2^{2^n} iterations of ϕ^(1)_{ω_1} for each n. But for each β ∈ Ω_1 we have ϕ^(1)_{ω_1}(β) = β + 2^β. So |τ_2| is the limit of iterated exponentials, starting with ω_0. In other words, τ_2 is a presentation of epsilon-zero. Then τ_3 is a presentation of the Howard ordinal, as we shall see in the next section.
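The arrow part of the functor B_{ω_0} in the example above can be written out concretely. In this sketch (our own reading of the binary-expansion formula) a morphism σ : n → m is a strictly increasing Python function on {0, …, n−1}:

```python
def B_omega0_obj(n):
    return n + 2 ** n

def B_omega0_arrow(sigma, n, m):
    """Image of k < n + 2^n: below n apply sigma; at or above n, write
    k - n in binary with exponents < n and push each exponent through sigma."""
    def image(k):
        if k < n:
            return sigma(k)
        exps = [i for i in range(n) if (k - n) >> i & 1]
        return m + sum(2 ** sigma(i) for i in exps)
    return image

# sigma : 2 -> 4 with sigma(0) = 1, sigma(1) = 3.
sigma = {0: 1, 1: 3}.get
f = B_omega0_arrow(sigma, 2, 4)
values = [f(k) for k in range(B_omega0_obj(2))]

# f is strictly increasing from {0,...,5} into {0,...,19}.
assert values == sorted(set(values))
assert all(0 <= v < B_omega0_obj(4) for v in values)
```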

5.2.3. The accessible recursive functions. The accessible part of the fast-growing hierarchy is generated from ω_0 by iteration of the principle: given α, first form B_α and then take its direct limit to obtain the next ordinal α^+. Note that the equation

α^+ = Lim→ B_α

is really only an isomorphism of presentations. However, this is enough to ensure that the B-functions are uniquely determined. So by the theorem above, we may take ω_0^+ = τ_1, ω_0^{++} = τ_2, ω_0^{+++} = τ_3, etcetera. Thus to sum up where we are so far:

Theorem. The accessible recursive functions are exactly the functions Kalmar-elementary in the functions B_α, α ≺ τ, where τ = sup τ_i. Similarly they are exactly those which are elementary in the functions G_α, α ≺ τ. τ is the first point in this scale at which the elementary closures of {B_α | α ≺ τ} and {G_α | α ≺ τ} are equal.

Our next task is to characterize the accessible recursive functions:

Theorem. The accessible recursive functions are exactly the functions provably recursive in the theory Π^1_1-CA0.

5.3. Proof Theoretic Characterizations of Accessibility

We first characterize the accessible recursive functions as those provably recursive in the (first order) theory ID_<ω of finitely iterated inductive definitions. Later we will see that these, in turn, are the same as the functions provably recursive in Π^1_1-CA0.

Since the systems of ϕ functions, and the number-theoretic functions B_α indexed by them, are all defined by (admittedly somewhat abstract) recursion equations, they are, at least in an intuitive sense, recursive. It should therefore be possible (and it is, as we now show) to develop them, instead, in a more formal recursive setting, where the sets of tree ordinals Ω_k are replaced by sets W_k of Kleene-style ordinal notations, and the uncountable "regular cardinals" ω_k are replaced by their "recursively regular" analogues, the "admissibles" ω^CK_k. After all, the ω_k's are only used to index certain strong kinds of diagonalization, so if we know that it is only necessary to diagonalize over recursive sequences, the ω^CK_k's should do just as well. The end result will be that we can formalize the definition of each W_k in a first order arithmetical theory ID_k of k-times iterated inductive definitions, then develop recursive analogues of the functions ϕ^(i), i ≤ k, within it, and hence prove the recursiveness of B_α for at least α < τ_k (and in fact, α < τ_{k+2}). Thus every accessible recursive function, being elementary recursive in B_α for some α < τ, will be provably recursive in ID_<ω = ⋃_k ID_k. The converse will be proven in subsequent subsections, using ordinal analysis methods due to Buchholz (1987), in particular his Ω-rules.

5.3.1. Finitely iterated inductive definitions. We can generate a recursive analogue W_k of the set of tree ordinals Ω_k by starting with 0 as a notation for the ordinal zero, choosing 〈0, b〉 as a notation for the successor of b, and choosing 〈i+1, e〉 as a notation for the limit of the sequence e(x) taken over x ∈ W_i. Thus W_k is obtained by k successive (iterated) inductive definitions thus: a ∈ W_k if a = 0, or a = 〈0, b〉 for some b ∈ W_k, or a = 〈i+1, e〉 for some i < k where e(x) ∈ W_k for all x ∈ W_i. We can formalize these constructions of W_1, …, W_i, …, W_k in a sequence of first order arithmetical theories ID_i(W), i ≤ k, as follows: first, for each k and any formula A with a distinguished free variable, let F_k(A, a) be the positive-in-A formula

a = 0 ∨ ∃b(a = 〈0, b〉 ∧ A(b)) ∨ ∃e(a = 〈1, e〉 ∧ ∀x∃y(e(x) = y ∧ A(y)))
    ∨ ⋁_{1≤i<k} ∃e(a = 〈i+1, e〉 ∧ ∀x(W_i(x) → ∃y(e(x) = y ∧ A(y))))

where e(x) = y abbreviates ∃z(T(e, x, z) ∧ U(e, x, z) = y).
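In a toy model where the recursive indices e are replaced by Python closures (so the Kleene T and U predicates disappear), the notation system and a slow-growing collapse of W_1-notations back to numbers look like this (our own sketch):

```python
# Notations: 0, ("suc", b) for <0,b>, and ("lim", 1, e) for <1,e>,
# where e is a Python closure standing in for a recursive index.

def suc(b):
    return ("suc", b)

def num(m):
    """Notation for the finite ordinal m (an element of W_0)."""
    a = 0
    for _ in range(m):
        a = suc(a)
    return a

def val(a):
    """Set-theoretic value of a W_0 notation."""
    v = 0
    while a != 0:
        a = a[1]
        v += 1
    return v

def collapse(a, x):
    """Slow-growing collapse of a W_1 notation at argument x."""
    if a == 0:
        return 0
    if a[0] == "suc":
        return collapse(a[1], x) + 1
    return collapse(a[2](num(x)), x)   # <1,e> collapses to e(x)

omega_notation = ("lim", 1, lambda b: b)   # limit of the identity sequence

assert val(num(4)) == 4
assert collapse(omega_notation, 5) == 5
```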

Definition. ID_0(W) is just Peano Arithmetic, and for each k > 0, ID_k(W) is the theory in the language of PA expanded by new predicates W_1, …, W_k, having for each i = 1, …, k the inductive closure axioms

∀a(F_i(W_i, a) → W_i(a))

and the least-fixed-point axiom schemes

∀a(F_i(A, a) → A(a)) → ∀a(W_i(a) → A(a))

where A is any formula in the language of ID_k(W). ID_<ω(W) is then the union of the ID_k(W)'s.


5.3. PROOF THEORETIC CHARACTERIZATIONS OF ACCESSIBILITY 173

Note that the i-th least-fixed-point axiom applied to the formula A := Wk(a) gives

∀a(Fi(Wk, a) → Wk(a)) → ∀a(Wi(a) → Wk(a)),

from which ∀a(Wi(a) → Wk(a)) follows, since ∀a(Fi(Wk, a) → Fk(Wk, a)) is immediate by the definition of Fk, and then ∀a(Fi(Wk, a) → Wk(a)) holds by the inductive closure axiom for Wk.

These theories were first studied by Kreisel (1963), Feferman (1970) and Friedman (1970). A comprehensive treatment of this fundamental area, as it stood at the time, is given in Buchholz et al. (1981).

As an illustration of what can be done in these theories, let f(k) : Wk+1 × Wk → Wk be the partial recursive function which mimics ϕ(k) on ordinal notations. Thus f(k) is defined by the recursion theorem to satisfy

f(k)_0(b) := 〈0, b〉,
f(k)_〈0,a〉(b) := f(k)_a(f(k)_a(b)),
f(k)_〈i+1,e〉(b) := 〈i+1, d〉 where d(x) = f(k)_{e(x)}(b), if i < k,
f(k)_〈k+1,e〉(b) := f(k)_{e(b)}(b)

where, as is done here, we shall often write the first argument of the binary f(k) as a subscript. It is easy to check that if ωk is replaced by ωCK_k, if α ∈ Ωk+1 is then replaced by a notation a ∈ Wk+1, and if β ∈ Ωk is replaced by a notation b ∈ Wk, then ϕ(k)_α(β) ∈ Ωk gets replaced by f(k)_a(b) ∈ Wk. In particular then, the countable ϕ(1)_α(β) is a recursive ordinal.

Furthermore we can actually prove f(k) : Wk+1 × Wk → Wk in IDk+1(W ). For let A(a) be the formula

∀b(Wk(b) → ∃c(f(k)_a(b) = c ∧ Wk(c))).

Then the recursive definition of f(k) together with the inductive closure axiom for Wk enable one easily to prove ∀a(Fk+1(A, a) → A(a)). The least-fixed-point axiom for Wk+1 then gives ∀a(Wk+1(a) → A(a)). Thus, provably in IDk+1(W ), we have f(k) : Wk+1 × Wk → Wk.
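The recursion equations for f(k) can be mirrored in a small sketch, representing notations as nested tuples (0 for zero, (0, b) for successors, (t, e) with t ≥ 1 for limits, the sequence e being modelled by a Python function rather than a recursive index). This is an illustration only, not the book's definition via the recursion theorem on indices.

```python
# Hedged sketch of the recursion equations for f^(k) on tuple notations.

def f(k, a, b):
    if a == 0:
        return (0, b)                    # f_0(b) = <0, b>
    tag, e = a
    if tag == 0:
        return f(k, e, f(k, e, b))       # f_<0,a>(b) = f_a(f_a(b))
    if tag <= k:                         # a = <i+1, e> with i < k: pointwise
        return (tag, lambda x: f(k, e(x), b))
    return f(k, e(b), b)                 # a = <k+1, e>: diagonalize at b
```

For example, with e0 the identity, the diagonal clause collapses f(1, 〈2, e0〉, b) to f(1, b, b), exactly as in the derivation of tk below.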

Now, starting from ωCK_i with notation 〈i+1, e0〉 ∈ Wi+1, where e0 is an index for the identity function, we can immediately deduce the existence in W1 of a notation

tk = f(1)(f(2)(. . . f(k)(〈k+1, e0〉, 〈k, e0〉) . . . , 〈2, e0〉), 〈1, e0〉)

for the (tree) ordinal τk.

Now let CB be a computation formula for the function B, so that for any notation a ∈ W1 for the recursive ordinal α, ∃y,z CB(a, x, y, z) is a Σ1-definition of Bα. By the same argument that we have just applied above, and writing Bα(x)↓ for the formula ∃y,z CB(a, x, y, z), we can prove ∀a(W1(a) → ∀x Bα(x)↓). Thus Bα is provably recursive in IDk(W ) for any ordinal α which, provably in IDk(W ), has a notation in W1.

Suppose α ≺ τk. We check that α itself has a notation in W1, provably in IDk(W ). Firstly, the relation a ≺ b is the restriction of a Σ1 relation to W1. This is because a ≺ b if and only if there is a sequence of pairs 〈bi, xi〉 such that b0 = b and the last bl = a and, for each i < l, bi+1 = e(xi) if bi = 〈1, e〉 and bi+1 = c if bi = 〈0, c〉. Now let W≺1(b) be the formula ∀a≺b W1(a), and notice that ∀b(F1(W≺1, b) → W≺1(b)) is easily checked in IDk(W ). Therefore by the least-fixed-point axiom for W1 we have ∀b(W1(b) → W≺1(b)). Hence if α ≺ τk has notation a ∈ W1 then, since we can prove W1(tk), it follows that we can prove W1(a) (using the fact that the true Σ1 statement a ≺ tk is provable in arithmetic). We have shown that every α ≺ τk has a recursive ordinal notation, provably in IDk(W ). Therefore, by the last paragraph, Bα is provably recursive in IDk(W ). This proves
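The Σ1 character of a ≺ b comes from the finite witnessing walk just described. A hedged sketch, representing notations as nested tuples (0 zero, (0, c) successor, (1, e) limit with e a Python function); the helper names are our own:

```python
# Sketch: a ≺ b iff some finite choice sequence xs drives b down to a,
# stepping from <0,c> to its predecessor c, or from <1,e> to a member e(x).

def step(b, x):
    tag, body = b
    return body if tag == 0 else body(x)

def witnesses(a, b, xs):
    """True if the walk b = b_0, b_1, ..., guided by xs, reaches a."""
    for x in xs:
        if b == a:
            return True
        if b == 0:
            return False
        b = step(b, x)
    return b == a
```

The existential quantifier over such sequences (coded as numbers) is what makes the relation Σ1; restricting b to W1 makes the walk well-founded.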

Theorem. Every accessible recursive function is provably recursive in ID<ω(W ).

We can refine this further:

Theorem. Each α ≺ τk+2 has a notation a for which IDk(W ) ⊢ W1(a). Therefore every function in the elementary (or primitive recursive) closure of {Bα | α ≺ τk+2} is provably recursive in IDk(W ).

Proof. The second part follows immediately from the first since, as above, Bα is provably recursive whenever α is provably a recursive ordinal, and since provably recursive functions will always be closed under primitive recursion in the presence of Σ1 induction.

For the first part, suppose α ≺ τk+2. Then α has a recursive ordinal notation a ≺ tk+2 = 〈1, ek+2〉 and so, for some fixed n, a ≺ ek+2(n). If we can show that W1(ek+2(n)) is provable in IDk(W ) then, by the earlier remarks, IDk(W ) ⊢ W1(a).

Now by unravelling the definition of tk+2 according to the recursion equations for f(1), . . . , f(k+2), it is not difficult to check that

ek+2(n) = f(1)(f(2)(. . . f(k)(f(k+1)_〈k+2,e0〉^m(〈k+1, e0〉), 〈k, e0〉) . . . , 〈2, e0〉), 〈1, e0〉)

with m = 2^{2^n} iterates of f(k+1)_〈k+2,e0〉. (Recall that 〈k, e0〉 is the chosen notation for ωCK_{k−1} in Wk.) It therefore will be enough to prove in IDk(W ) that

f(k)(f(k+1)_〈k+2,e0〉^m(〈k+1, e0〉), 〈k, e0〉) ∈ Wk,

for then W1(ek+2(n)) follows immediately. (Note that we could prove it easily in IDk+1(W ), but the point is that one only needs IDk(W ).)

The following is a lifting to IDk(W ) of Gentzen's original argument showing that transfinite induction up to any ordinal below ε0 is provable in PA. First let Ai be the formula generated by:

A0(d) ≡ ∀b(Wk(b) → ∃a(f(k)_d(b) = a ∧ Wk(a)))
Ai+1(d) ≡ ∀c(Ai(c) → ∃a(f(k+1)_d(c) = a ∧ Ai(a))).

Then in IDk(W ) it is easy to check, from the definitions of f(k) and f(k+1), that for every i, ⊢ Fk(Ai, d) → Ai(d) and hence ⊢ ∀d(Wk(d) → Ai(d)). Furthermore if d is a limit notation of the form 〈k+1, e〉 then, again for each i, ⊢ ∀b(Wk(b) → Ai(e(b))) → Ai(d). In particular therefore, ⊢ Ai(〈k+1, e0〉) for every i.


Now by a downward meta-induction on j = m+1, m, . . . , 1 we show, still in IDk(W ), that ⊢ Ai(cj) for every i ≤ j, where cj denotes the (m+1−j)-th iterate of f(k+1)_〈k+2,e0〉 starting on 〈k+1, e0〉.

The j = m+1 case simply states Ai(〈k+1, e0〉), shown above. For the induction step assume the result holds for j > 1 and let i ≤ j−1. Then we have Ai+1(cj) and Ai(cj) and hence

⊢ ∃a(f(k+1)_{cj}(cj) = a ∧ Ai(a)).

But f(k+1)_{cj}(cj) = f(k+1)_〈k+2,e0〉(cj) = cj−1, and so Ai(cj−1).

This completes the induction, and putting j = 1, i = 0 we immediately obtain

IDk(W ) ⊢ ∀b(Wk(b) → ∃a(f(k)_{c1}(b) = a ∧ Wk(a))).

Therefore with b = 〈k, e0〉,

IDk(W ) ⊢ Wk(f(k)(f(k+1)_〈k+2,e0〉^m(〈k+1, e0〉), 〈k, e0〉))

as required.

5.3.2. The infinitary system IDk(W )∞. We now set up an infinitary system suitable for the analysis of IDk(W ), in particular cut elimination and "collapsing" results from which bounds can be computed. The crucial new component is a version of Buchholz's Ω-rule (Buchholz, 1987; Buchholz et al., 1981), a major technical innovation in the analysis of larger systems of finitely and transfinitely iterated inductive definitions. We have the basics already from the chapter on Peano Arithmetic, but now sequents will be of the form

γk : ΩSk, γk−1 : ΩSk−1, . . . , γ1 : ΩS1, n : N ⊢^α Γ

which we shall immediately abbreviate to ~γ, n ⊢^α Γ. The particular sequence ~ω := ωk−1, ωk−2, . . . , ω1, ω0, where ωi = sup_{ξ∈Ωi} ξ, will be of special significance. The ordinal bound α will be in ΩSk+1. We stress that throughout this and the following subsections all tree ordinals will be structured as in 5.2.1. Note that we could equally well replace each Ωi by its recursive analogue Wi and each ωi by ωCK_i, as in the previous section. The results below would work in just the same way.

As before, the system is in Tait style, Γ being a set of closed formulas in the language of IDk(W ), written in negation normal form. The rules are as follows.

(N1): For arbitrary α and ~γ,

~γ, n ⊢^α Γ, m : N    provided m ≤ n + 1.

(N2): For β0, β1 ∈ α[~γ, n],

~γ, n ⊢^{β0} n′ : N    ~γ, n′ ⊢^{β1} Γ
──────────────────────────────────────
~γ, n ⊢^α Γ

(Ax): If Γ contains a true atom (i.e., an equation or inequation between closed terms) then for arbitrary α,

~γ, n ⊢^α Γ.

(∨): For β ∈ α[~γ, n] and i = 0, 1,

~γ, n ⊢^β Γ, Ai
──────────────────────
~γ, n ⊢^α Γ, A0 ∨ A1

(∧): For β0, β1 ∈ α[~γ, n],

~γ, n ⊢^{β0} Γ, A0    ~γ, n ⊢^{β1} Γ, A1
────────────────────────────────────────
~γ, n ⊢^α Γ, A0 ∧ A1

(∃): For β1 ∈ α[~γ, n] and β0 ∈ β1[~γ, n],

~γ, n ⊢^{β0} m : N    ~γ, n ⊢^{β1} Γ, A(m)
──────────────────────────────────────────
~γ, n ⊢^α Γ, ∃xA(x)

(∀): Provided βi ∈ α[~γ, max(n, i)] for every i,

~γ, max(n, i) ⊢^{βi} Γ, A(i)  for every i ∈ N
─────────────────────────────────────────────
~γ, n ⊢^α Γ, ∀xA(x)

(Cut): For β0, β1 ∈ α[~γ, n], with C the "cut formula",

~γ, n ⊢^{β0} Γ, C    ~γ, n ⊢^{β1} Γ, ¬C
───────────────────────────────────────
~γ, n ⊢^α Γ

(Wi-Ax): For arbitrary α and ~γ and 1 ≤ i ≤ k,

~γ, n ⊢^α Γ, Wi(m), ¬Wi(m)    provided m ≤ n.

(Wi): For β1 ∈ α[~γ, n], β0 ∈ β1[~γ, n] and 1 ≤ i ≤ k,

~γ, n ⊢^{β0} m : N    ~γ, n ⊢^{β1} Γ, Fi(Wi, m)
───────────────────────────────────────────────
~γ, n ⊢^α Γ, Wi(m)

(Ωi): For β0, β1 ∈ α[~γ, n] and 1 ≤ i ≤ k,

~γ, n ⊢^{β0} Γ0, Wi(m)    ~γ, n; Wi(m) ↦^{β1} Γ1
────────────────────────────────────────────────
~γ, n ⊢^α Γ0, Γ1

where ~γ, n; Wi(m) ↦^{β1} Γ1 means that whenever ~ω, l ⊢^δ_0 ∆, Wi(m), where ∆ is a set of positive-in-Wi formulas, δ ∈ ΩSi and γi ∈ δ[~ω, l], then ~γ[i ↦ δ], max(n, l) ⊢^{β1} ∆, Γ1. Here ⊢^δ_0 denotes a derivation without cuts, and ~γ[i ↦ δ] denotes the sequence ~γ with γi replaced by δ.

As before, we indicate that all cut formulas in a derivation are of "size" ≤ r by writing ~γ, n ⊢^α_r Γ. It is not of special importance how size is defined, except that a subformula is of smaller size than a formula, and atomic formulas m : N have size 0 and all other atomic formulas have size 1. We first need to extend slightly our notation to do with sets of structured tree ordinals.

Definition. For ~γ, n and ~δ, n′ in ΩSk × · · · × ΩS1 × N we write ~γ, n E ~δ, n′ to mean that n ≤ n′ and for all i ≤ k either γi = δi or γi ∈ δi[~δ, n′].

Lemma. If α ∈ ΩSk+1 and ~γ, n E ~δ, n′ then α[~γ, n] ⊆ α[~δ, n′].


Proof. By induction on α. If α = 0 or α is a successor the result follows trivially from the definition of α[~γ, n]. In the case where α = sup_{Ωi} αη with i > 0, we have α[~γ, n] = αγi[~γ, n] ⊆ αγi[~δ, n′] by induction hypothesis. Either γi = δi or γi ∈ δi[~δ, n′] = ωi[~δ, n′], since ~γ, n E ~δ, n′. By the definition of structuredness we then have αγi ∈ α[~δ, n′] and hence immediately αγi[~δ, n′] ⊆ α[~δ, n′]. Therefore α[~γ, n] ⊆ α[~δ, n′] as required. The case i = 0 is similar.

Lemma. If γi ∈ γ′i[~ω, n] and ~ω, n E ~γ, n then ~γ, n E ~γ[i ↦ γ′i], n. Hence, by the lemma above, α[~γ, n] ⊆ α[~γ[i ↦ γ′i], n] for any α ∈ ΩSk+1.

Proof. To show ~γ, n E ~γ[i ↦ γ′i], n we merely check the definition. The only point at which the sequence ~γ, n differs from ~γ[i ↦ γ′i], n is at index i. But then we are given that γi belongs to γ′i[~ω, n], and γ′i[~ω, n] ⊆ γ′i[~γ, n] by the last lemma. But γ′i[~γ, n] is the same as γ′i[~γ[i ↦ γ′i], n] and therefore γi ∈ γ′i[~γ[i ↦ γ′i], n] as required.

Lemma (Weakening). Suppose ~γ, n ⊢^α Γ in IDk(W )∞.
(a) If Γ ⊆ Γ′ and α[~γ, n′] ⊆ α′[~γ, n′] for all n′ ≥ n then ~γ, n ⊢^{α0+α′} Γ′ for any α0.
(b) If n ≤ n′ then ~γ, n′ ⊢^α Γ.
(c) If ~ω, n E ~γ, n and γi ∈ γ′i[~ω, n] then ~γ[i ↦ γ′i], n ⊢^α Γ.

Proof. (a). By induction on α with cases according to the last rule applied in deriving ~γ, n ⊢^α Γ. In all cases except the (∀) and (Ωi) rules the result follows by first applying the induction hypothesis to the premises and then re-applying the final rule. This final rule becomes applicable because if β ∈ α[~γ, n] then β ∈ α′[~γ, n] by assumption, and consequently α0 + β ∈ (α0 + α′)[~γ, n]. Case (∀). Suppose ∀xA(x) ∈ Γ and for each i we have ~γ, max(n, i) ⊢^{βi} Γ, A(i) for some βi ∈ α[~γ, max(n, i)]. By the induction hypothesis applied to each of these premises we obtain ~γ, max(n, i) ⊢^{α0+βi} Γ′, A(i). Also α0 + βi ∈ (α0 + α′)[~γ, max(n, i)] because βi ∈ α[~γ, max(n, i)] ⊆ α′[~γ, max(n, i)] by assumption. Now we can re-apply the (∀) rule to obtain the required ~γ, n ⊢^{α0+α′} Γ′.

Case (Ωi). Let Γ = Γ0, Γ1 where ~γ, n ⊢^{β0} Γ0, Wi(m) and ~γ, n; Wi(m) ↦^{β1} Γ1. It is easy to see that, in this case, we can apply the induction hypothesis straightaway to each of the premises, to obtain ~γ, n ⊢^{α0+β0} Γ′0, Wi(m) and ~γ, n; Wi(m) ↦^{α0+β1} Γ′1. Then by re-applying the (Ωi) rule one obtains, as before, the desired ~γ, n ⊢^{α0+α′} Γ′0, Γ′1.

(b). Again by induction on α with cases according to the last rule applied in deriving ~γ, n ⊢^α Γ. In all cases one applies the induction hypothesis to the premises, increasing n to n′ in the declaration, and then one re-applies the rule, noticing that if β ∈ α[~γ, n] then β ∈ α[~γ, n′], since always α[~γ, n] ⊆ α[~γ, n′] by structuredness.

(c). Again by induction on α with cases according to the last rule applied in deriving ~γ, n ⊢^α Γ. We treat the (Ωj) rules separately. In all other cases one applies the induction hypothesis to the premises, increasing γi up to γ′i in the declaration, and then one re-applies the same rule, noticing that if β ∈ α[~γ, n] then β ∈ α[~γ[i ↦ γ′i], n] by the second lemma above. Here we make use of the two assumptions.

For the (Ωj) rules where j ≠ i the argument is similar. In these cases the premises will be ~γ, n ⊢^{β0} Γ0, Wj(m) and ~γ, n; Wj(m) ↦^{β1} Γ1, and the conclusion is ~γ, n ⊢^α Γ0, Γ1. We can straightforwardly apply the induction hypothesis to the first premise. For the second premise assume ~ω, l ⊢^δ_0 ∆, Wj(m) where γj ∈ δ[~ω, l]. Then ~γ[j ↦ δ], max(n, l) ⊢^{β1} ∆, Γ1 and the induction hypothesis can be applied to this also, so as to change γi to γ′i. We have then shown ~γ[i ↦ γ′i], n; Wj(m) ↦^{β1} Γ1. The (Ωj) rule can now be re-applied as before to yield the required ~γ[i ↦ γ′i], n ⊢^α Γ0, Γ1.

Finally, the (Ωi) rule has ~γ, n ⊢^{β0} Γ0, Wi(m) and ~γ, n; Wi(m) ↦^{β1} Γ1 as its premises, and the conclusion is ~γ, n ⊢^α Γ0, Γ1. Again the induction hypothesis can be applied immediately to the first premise, so as to increase γi to γ′i in the declaration. For the second premise assume ~ω, l ⊢^δ_0 ∆, Wi(m) where γ′i ∈ δ[~ω, l]. By assumption we know that γi ∈ γ′i[~ω, n]. Hence γi ∈ γ′i[~ω, max(n, l)] ⊆ δ[~ω, max(n, l)]. Therefore, with l′ := max(n, l), we have ~ω, l′ ⊢^δ_0 ∆, Wi(m) (by part (b)) and γi ∈ δ[~ω, l′]. We can now apply ~γ, n; Wi(m) ↦^{β1} Γ1 to conclude ~γ[i ↦ δ], max(n, l) ⊢^{β1} ∆, Γ1, which is the same as ~γ[i ↦ γ′i][i ↦ δ], max(n, l) ⊢^{β1} ∆, Γ1. We have now shown that ~γ[i ↦ γ′i], n ⊢^{β0} Γ0, Wi(m) and ~γ[i ↦ γ′i], n; Wi(m) ↦^{β1} Γ1. Re-applying the (Ωi) rule then gives ~γ[i ↦ γ′i], n ⊢^α Γ0, Γ1.

Lemma. Suppose ~γ, n ⊢^δ_0 Γ where δ ∈ ΩSi and ~ω, n E ~γ, n, and all occurrences of Wi in Γ are positive. Let A(a) be an arbitrary formula of IDk(W ). Then there are fixed d and r such that in IDk(W )∞ we can prove

~γ, max(n, d) ⊢^{ωk+δ}_r ¬∀a(Fi(A, a) → A(a)), Γ∗

where Γ∗ results from Γ by replacing some, but not necessarily all, occurrences of Wi by A.

Proof. By induction according to the last rule applied in deriving ~γ, n ⊢^δ_0 Γ. Notice that if this is an axiom then it cannot be a (Wi)-axiom, because all occurrences of Wi are positive. Hence Γ∗ is (essentially) the same axiom. The result in this case then follows because any ordinal bound can be assigned to an axiom.

For any rule other than the (Wi) and (Ωi) rules the result follows straightforwardly by applying the induction hypothesis to the premises and then re-applying that rule.

In the case of a (Wi) rule suppose ~γ, n ⊢^δ_0 Γ, Wi(m) comes from the premises ~γ, n ⊢^{β0}_0 m : N and ~γ, n ⊢^{β1}_0 Γ, Fi(Wi, m), where β1 ∈ δ[~γ, n] and β0 ∈ β1[~γ, n]. By the induction hypothesis

~γ, max(n, d) ⊢^{ωk+β1}_r Γ∗, ¬∀a(Fi(A, a) → A(a)), Fi(A, m).

By logic we can prove Fi(A, m) ∧ ¬A(m), ¬Fi(A, m), A(m) for any formula A. The proof can be translated directly into a proof in IDk(W )∞ of finite height. We choose d to be this height, and r to be the size of the formula Fi(A, m). Since d[~γ, n′] ⊆ ωk[~γ, n′] for all n′ ≥ max(n, d), we can apply part (a) of the Weakening Lemma to give

~γ, max(n, d) ⊢^{ωk}_0 Γ∗, Fi(A, m) ∧ ¬A(m), ¬Fi(A, m), A(m).


By weakening the other premise ~γ, n ⊢^{β0}_0 m : N to ~γ, max(n, d) ⊢^{ωk+β0}_0 m : N and then applying the (∃) rule,

~γ, max(n, d) ⊢^{ωk+β1}_0 Γ∗, ¬∀a(Fi(A, a) → A(a)), ¬Fi(A, m), A(m).

A cut on the formula Fi(A, m), of size r, immediately gives

~γ, max(n, d) ⊢^{ωk+δ}_r Γ∗, ¬∀a(Fi(A, a) → A(a)), A(m)

which is the required sequent in this case.

In the case of an (Ωi) rule, suppose Γ = Γ0, Γ1 and the premises are ~γ, n ⊢^{β0}_0 Γ0, Wi(m) and ~γ, n; Wi(m) ↦^{β1}_0 Γ1, where β0, β1 ∈ δ[~γ, n]. By applying the induction hypothesis to each of these premises we easily obtain

~γ, max(n, d) ⊢^{ωk+β0}_0 ¬∀a(Fi(A, a) → A(a)), Γ∗0, Wi(m),
~γ, max(n, d); Wi(m) ↦^{ωk+β1}_r ¬∀a(Fi(A, a) → A(a)), Γ∗1.

We can then re-apply the (Ωi) rule to obtain

~γ, max(n, d) ⊢^{ωk+δ}_r ¬∀a(Fi(A, a) → A(a)), Γ∗0, Γ∗1

since ωk + β0, ωk + β1 ∈ (ωk + δ)[~γ, n]. This concludes the proof.

Theorem (Embedding). Suppose IDk(W ) ⊢ Γ(~x). Then there are fixed numbers d and r, determined by this derivation, such that for all ~n = n1, n2, . . . , if max(~n, d) < n then ~ω, n ⊢^{ωk·2+ω0}_r Γ(~n) in IDk(W )∞.

Proof. By induction on the height h of the given derivation of Γ(~x) in IDk(W ) we show ~ω, n ⊢^{ωk·2+h}_r Γ(~n) for sufficiently large n. The number r is an upper bound on the cut rank and the induction complexity of the IDk(W ) derivation. This proof will now simply be an extension of the corresponding embedding of Peano Arithmetic into its infinitary system. Since the logic and the rules are essentially the same, we need only consider the new axioms built into IDk(W ). Firstly, for any axiom Γ, Wi(x), ¬Wi(x) we immediately have by (Wi-Ax) ~ω, n ⊢^{ωk·2}_r Γ(~n), Wi(m), ¬Wi(m) for any m ≤ n.

For the inductive closure axioms note that

~ω, n ⊢^{d′}_0 Γ(~n), ¬Fi(Wi, m), Fi(Wi, m)

where d′ depends only on the size of the formula Fi(Wi, m). Furthermore, for m ≤ n we have ~ω, n ⊢^0_0 m : N by (N1). Thus by the (Wi) rule, ~ω, n ⊢^{d′+1}_0 Γ(~n), ¬Fi(Wi, m), Wi(m). Hence by two applications of the (∨) rule followed by the (∀) rule we obtain ~ω, n ⊢^{d′+4}_0 Γ(~n), ∀a(Fi(Wi, a) → Wi(a)). Therefore, provided we choose d ≥ d′ + 4, we have d′ + 4 ∈ ωk[~ω, n], so by weakening ~ω, n ⊢^{ωk·2}_0 Γ(~n), ∀a(Fi(Wi, a) → Wi(a)).

For the least-fixed-point axioms for Wi we apply the last lemma. From ~ω, l ⊢^δ_0 ∆, Wi(m), assuming ωi−1 ∈ δ[~ω, l], we obtain, for appropriate d and r depending on the size of A,

~ω, max(l, d) ⊢^{ωk+δ}_r ∆, ¬∀a(Fi(A, a) → A(a)), A(m).

By part (c) of the Weakening Lemma we obtain immediately

~ω[i ↦ δ], max(l, d) ⊢^{ωk+δ}_r ∆, ¬∀a(Fi(A, a) → A(a)), A(m).


But ωk·2[~ω[i ↦ δ], n′] = (ωk + δ)[~ω[i ↦ δ], n′], and so by part (a) of the Weakening Lemma,

~ω[i ↦ δ], max(l, d) ⊢^{ωk·2}_r ∆, ¬∀a(Fi(A, a) → A(a)), A(m)

and we can increase the declaration max(l, d) to max(n, m) by part (b) of Weakening. We have therefore shown

~ω, max(n, m); Wi(m) ↦^{ωk·2}_r ¬∀a(Fi(A, a) → A(a)), A(m).

Now we use the (Ωi) rule to combine this with the axiom ~ω, max(n, m) ⊢^0 Γ(~n), Wi(m), ¬Wi(m) so as to derive

~ω, max(n, m) ⊢^{ωk·2+1}_r Γ(~n), ¬Wi(m), ¬∀a(Fi(A, a) → A(a)), A(m).

Hence by two (∨) rules followed by the (∀) rule,

~ω, n ⊢^{ωk·2+4}_r Γ(~n), ¬∀a(Fi(A, a) → A(a)), ∀a(Wi(a) → A(a)).

Hence the least-fixed-point axiom for Wi follows.

The ordinary induction axioms can be treated as for PA and would yield a bound ω0 + 4, which could be weakened to ωk·2. Since the logical rules of IDk(W ) are easily transferred to "finite step" rules in the infinitary calculus, it now follows that any derivation in IDk(W ) transforms into an infinitary one ~ω, n ⊢^{ωk·2+h}_r Γ(~n).

5.3.3. Ordinal analysis of IDk. In this subsection we compute ordinal bounds for the Σ1 theorems, and hence the provably recursive functions, of IDk(W ). The methods are cut elimination and collapsing à la Buchholz (1987). The Ω rules used here are a variation on his original invention, but tailored to a step-by-step collapsing process. It should be remarked that the development here is similar to (though somewhat more complex than) the PhD thesis of Williams (2004), which analyses finitely iterated inductive definitions based on a weak arithmetic with a pointwise induction scheme.

Lemma (Inversion). In IDk(W )∞:
(a) If ~γ, n ⊢^α_r Γ, A0 ∧ A1, then ~γ, n ⊢^α_r Γ, Ai for each i = 0, 1.
(b) If ~γ, n ⊢^α_r Γ, ∀xA(x), then ~γ, max(n, m) ⊢^α_r Γ, A(m).

Proof. The parts are very similar, so we shall only do part (b). Furthermore, the fundamental ideas for this are already dealt with in the ordinal analysis for Peano Arithmetic.

We proceed by induction on α. Note first that if the sequent ~γ, n ⊢^α_r Γ, ∀xA(x) is an axiom of IDk(W )∞ then so is ~γ, n ⊢^α_r Γ, and then the desired result follows immediately by weakening.

Suppose ~γ, n ⊢^α_r Γ, ∀xA(x) is the consequence of a (∀) rule with ∀xA(x) the "main formula" proven. Then the premises are, for each m,

~γ, max(n, m) ⊢^{βm}_r Γ, A(m), ∀xA(x)

where βm ∈ α[~γ, max(n, m)]. So by applying the induction hypothesis one immediately obtains ~γ, max(n, m) ⊢^{βm}_r Γ, A(m). Weakening then allows the ordinal bound βm to be increased to α.

In all other cases the formula ∀xA(x) is a "side formula" occurring in the premise(s) of the final rule applied. So by the induction hypothesis, ∀xA(x) can be replaced by A(m) and n by max(n, m). The result then follows by re-applying that final rule.

Lemma (Cut Reduction). Suppose ~γ, n ⊢^α_r Γ, C and ~γ, n ⊢^γ_r Γ′, ¬C in IDk(W )∞, where C is a formula of size r+1 and of shape C0 ∨ C1 or ∃xC0(x) or Wi(m) or a false atom. Then

~γ, n ⊢^{γ+α}_r Γ, Γ′.

Proof. By induction on α with cases according to the last rule applied in deriving ~γ, n ⊢^α_r Γ, C.

If it is an axiom then either Γ is already an axiom or else C is Wi(m) and Γ contains ¬C. In this case we can weaken ~γ, n ⊢^γ_r Γ′, ¬C to obtain ~γ, n ⊢^{γ+α}_r Γ′, Γ as required.

If it arises by any rule in which C is a side formula then the induction hypothesis applied to the premises replaces C by Γ′ and adds γ to the left of the ordinal bound. But if β ∈ α[~γ, n] then γ + β ∈ (γ + α)[~γ, n], so by re-applying the final rule one again obtains ~γ, n ⊢^{γ+α}_r Γ, Γ′. This applies to all of the rules, including the (Wi) rule and the (Ωi) rule.

Finally suppose C is the "main formula" proven in the final rule of the derivation. There are two cases:

If C is C0 ∨ C1 then the premise is ~γ, n ⊢^β_r Γ, Ci, C with β ∈ α[~γ, n]. By the induction hypothesis we therefore have ~γ, n ⊢^{γ+β}_r Γ, Ci, Γ′. By inverting ~γ, n ⊢^γ_r Γ′, ¬C, where ¬C is ¬C0 ∧ ¬C1, we obtain ~γ, n ⊢^γ_r Γ′, ¬Ci, Γ by weakening. Now we can apply a cut on Ci (which has size ≤ r) to produce ~γ, n ⊢^{γ+α}_r Γ, Γ′.

If C is ∃xC0(x) the premises are ~γ, n ⊢^{β0}_r m : N and ~γ, n ⊢^{β1}_r Γ, C0(m), C where β1 ∈ α[~γ, n] and β0 ∈ β1[~γ, n]. By the induction hypothesis we therefore have ~γ, n ⊢^{γ+β1}_r Γ, C0(m), Γ′. Now by inverting ~γ, n ⊢^γ_r Γ′, ¬C, where ¬C is ∀x¬C0(x), we get ~γ, max(n, m) ⊢^γ_r Γ′, ¬C0(m), Γ by weakening. Observe that the first premise ~γ, n ⊢^{β0}_r m : N can be weakened to the ordinal bound γ + β0 and from this, by the (N2) rule, we obtain ~γ, n ⊢^{γ+β1}_r Γ′, ¬C0(m), Γ. Now we can apply a cut on C0(m) (which has size ≤ r) to produce ~γ, n ⊢^{γ+α}_r Γ, Γ′.

Theorem (Cut Elimination). In IDk(W )∞, if ~γ, n ⊢^α_{r+1} Γ then we have ~γ, n ⊢^{2^α}_r Γ, and by repeating this, ~γ, n ⊢^{α∗}_0 Γ where α∗ = 2_{r+1}(α).
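The bound α∗ = 2_{r+1}(α) is the (r+1)-times iterated base-2 exponential: each round of cut elimination exponentiates the ordinal bound once. On natural numbers (an illustration only; in the theorem it is applied to tree ordinals) the bookkeeping looks as follows.

```python
# Illustration: the iterated exponential 2_r(n) bounding cut elimination.

def iter_exp(r, n):
    """2_r(n): apply n -> 2**n exactly r times."""
    for _ in range(r):
        n = 2 ** n
    return n
```

So eliminating three ranks of cuts from a derivation of height n costs a tower 2^{2^{2^n}}, which is why cut rank enters the final ordinal bound in the next theorem.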

Proof. By induction on α.

If ~γ, n ⊢^α_{r+1} Γ arises by any rule other than a cut of rank r+1, simply apply the induction hypothesis to the premises and then re-apply this final rule, using the fact that β ∈ α[~γ, n] implies 2^β ∈ 2^α[~γ, n].

If, on the other hand, it arises by a cut of rank r+1, then the premises will be ~γ, n ⊢^{β0}_{r+1} Γ, C and ~γ, n ⊢^{β1}_{r+1} Γ, ¬C where β0, β1 ∈ α[~γ, n] and C has size r+1. By weakening if necessary we may assume β0 = β1. Applying the induction hypothesis to these one then obtains ~γ, n ⊢^{2^{β0}}_r Γ, C and ~γ, n ⊢^{2^{β0}}_r Γ, ¬C. Cut Reduction then gives ~γ, n ⊢^{2^{β0+1}}_r Γ and since β0 ∈ α[~γ, n], 2^{β0} ∈ 2^α[~γ, n] and therefore 2^{β0+1}[~γ, n] ⊆ 2^α[~γ, n]. Weakening then gives ~γ, n ⊢^{2^α}_r Γ as required.


Theorem (Collapsing). If ~ω, n ⊢^α_0 Γ in IDk(W )∞, where Γ is a set of positive-in-Wk formulas, then

~ω, n ⊢^{ϕ(α,ωk−1)}_0 Γ

by a derivation in which there are no (Ωk) rules. Here ϕ denotes the function ϕ(k) : ΩSk+1 × ΩSk → ΩSk defined at the beginning of this chapter.

Proof. By induction on α, as usual, with cases according to the last rule applied in deriving ~γ, n ⊢^α_0 Γ. In the case of axioms there is nothing to do, because the ordinal bound can be chosen arbitrarily. In all rules except (Ω) the process is the same: apply the induction hypothesis to the premises, and then re-apply the final rule. For instance, if the final rule is a (∀) where ∀xA(x) ∈ Γ, then the premises are ~γ, max(n, i) ⊢^{βi} Γ, A(i) with βi ∈ α[~γ, max(n, i)]. By the induction hypothesis we therefore have a derivation of ~γ, max(n, i) ⊢^{ϕ(βi,ωk−1)} Γ, A(i) in which there are no (Ωk) rules. Since βi ∈ α[~γ, max(n, i)], an earlier calculation gives ϕ(βi, ωk−1) ∈ ϕ(α, ωk−1)[~γ, max(n, i)]. Re-applying the (∀) rule gives ~γ, n ⊢^{ϕ(α,ωk−1)}_0 Γ.

Now suppose ~γ, n ⊢^α_0 Γ0, Γ1 comes about by an application of an (Ωi) rule where i < k. Applying the induction hypothesis to the first premise ~γ, n ⊢^{β0}_0 Γ0, Wi(m) gives immediately the derivation ~γ, n ⊢^{ϕ(β0,γk)}_0 Γ0, Wi(m) in which there are no (Ωk) rules. In the case of the second premise ~γ, n; Wi(m) ↦^{β1}_0 Γ1, assume ~ω, l ⊢^δ_0 ∆, Wi(m) where ωi−1 ∈ δ[~ω, l]. Then we have ~γ[i ↦ δ], max(n, l) ⊢^{β1}_0 ∆, Γ1 and we can apply the induction hypothesis to this to obtain

~γ[i ↦ δ], max(n, l) ⊢^{ϕ(β1,γk)}_0 ∆, Γ1

with no (Ωk) rules. This proves ~γ, n; Wi(m) ↦^{ϕ(β1,γk)}_0 Γ1. We can therefore re-apply this (Ωi) rule, using ϕ(β0, γk), ϕ(β1, γk) ∈ ϕ(α, γk)[~γ, n], to obtain ~γ, n ⊢^{ϕ(α,γk)}_0 Γ0, Γ1, again with no (Ωk) rules.

Finally suppose ~γ, n ⊢^α_0 Γ0, Γ1 comes about by an application of an (Ωk) rule. By a simple weakening of one of the premises we may safely assume that they both have the same ordinal bound β ∈ α[~γ, n]. Applying the induction hypothesis to the first premise ~γ, n ⊢^β_0 Γ0, Wk(m) we obtain a derivation ~γ, n ⊢^{ϕ(β,γk)}_0 Γ0, Wk(m) as before, without (Ωk) rules. Since Γ0 is a set of positive-in-Wk formulas we can apply the second premise ~γ, n; Wk(m) ↦^β_0 Γ1 with δ = ϕ(β, γk), ∆ = Γ0 and l = n to obtain δ, ωk−2, . . . , ω0, n ⊢^β_0 Γ0, Γ1. Now we can apply the induction hypothesis to get δ, ωk−2, . . . , ω0, n ⊢^{ϕ(β,δ)}_0 Γ0, Γ1 without (Ωk) rules, and since the ordinal bound is now in ΩSk the δ at the front is redundant and may be replaced by anything, in particular γk. Thus ~γ, n ⊢^{ϕ(β,δ)}_0 Γ0, Γ1 where ϕ(β, δ) = ϕ(β+1, γk). But since β ∈ α[~γ, n], we have ϕ(β+1, γk)[~γ, n] ⊆ ϕ(α, γk)[~γ, n], and so a final weakening gives ~γ, n ⊢^{ϕ(α,γk)}_0 Γ0, Γ1. Notice that we have eliminated this application of the (Ωk) rule.

Theorem. If IDk(W ) ⊢ Γ(~x) where Γ is a set of purely arithmetical (or even positive-in-W1) formulas, then there are fixed numbers d, h and r, and a fixed (countable) α ≺ τk+2, such that if n > max(~n, d, h, r) then in ID0(W )∞ + (W1) we can derive

n ⊢^α_0 Γ(~n).

Recall that τk+2 = ϕ(1)(ϕ(2)(. . . ϕ(k+1)(ϕ(k+2)(ωk+2, ωk+1), ωk), . . . , ω1), ω0).

Proof. The Embedding Theorem shows that if Γ(~x) is provable in IDk(W ) then there are fixed numbers d, h and r, determined by this proof, such that if n > max(~n, d) then

~ω, n ⊢^{ωk+ωk+h}_r Γ(~n).

We can weaken this result slightly by noting that whenever we have an infinitary derivation ~ω, n ⊢^{ωk+β}_r Γ, the ordinal bound β can be replaced by 2^β, thus: ~ω, n ⊢^{ωk+2^β}_r Γ. This is easily shown by induction on β, for one only needs to check the ordinal assignment conditions: if ωk + γ ∈ (ωk + β)[~ω, n] then γ ∈ β[~ω, n], hence 2^γ ∈ 2^β[~ω, n] and finally ωk + 2^γ ∈ (ωk + 2^β)[~ω, n]. Therefore

~ω, n ⊢^{ωk+2^{ωk+h}}_r Γ(~n).

Now (ωk + 2^{ωk}·(1 + 2^{2^{ωk}}))[~ω, n] = (ωk + 2^{ωk}·(1 + 2^{2^n}))[~ω, n]. Such calculations are easily checked: the rightmost ωk diagonalizes to ωk−1, then to ωk−2, and so on down to ω0 and then to n. Thus if n > h we see that ωk + 2^{ωk+h} belongs to (ωk + 2^{ωk}·(1 + 2^{2^{ωk}}))[~ω, n]. We can now use part (a) of the Weakening Lemma to increase the ordinal bound ωk + 2^{ωk+h} to ωk + 2^{ωk}·(1 + 2^{2^{ωk}}), which is the same as ϕ(k+1)(ωk+1, ϕ(k+1)(ωk+1, ωk)). This is because for β ∈ ΩSk+1 the definition of ϕ(k+1) gives ϕ(k+1)(ωk+1, β) = ϕ(k+1)(β, β) = β + 2^β. Thus

~ω, n ⊢^{ϕ(k+1)(ωk+1, ϕ(k+1)(ωk+1, ωk))}_r Γ(~n).
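The identification of ωk + 2^{ωk}·(1 + 2^{2^{ωk}}) with the doubly applied ϕ(k+1) can be checked in two lines directly from the equation ϕ(k+1)(ωk+1, β) = β + 2^β:

```latex
\varphi^{(k+1)}(\omega_{k+1},\omega_k) = \omega_k + 2^{\omega_k},
\qquad\text{hence}\qquad
\varphi^{(k+1)}\bigl(\omega_{k+1},\,\omega_k + 2^{\omega_k}\bigr)
 = \omega_k + 2^{\omega_k} + 2^{\,\omega_k + 2^{\omega_k}}
 = \omega_k + 2^{\omega_k}\bigl(1 + 2^{2^{\omega_k}}\bigr),
```

using 2^{ωk+2^{ωk}} = 2^{ωk} · 2^{2^{ωk}}.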

By the Cut Elimination Theorem, any derivation with cut rank r and ordinal bound β ∈ ΩSk+1 can be transformed into a derivation with cut rank r−1 and ordinal bound 2^β. But this could be weakened to β + 2^β = ϕ(k+1)(ωk+1, β). Therefore we can successively reduce the cut rank by iterating the function ψ(·) := ϕ(k+1)(ωk+1, ·) to obtain

~ω, n ⊢^{ψ^{r+2}(ωk)}_0 Γ(~n).

By repeated Collapsing,

~ω, n ⊢^α_0 Γ(~n)

where α = ϕ(1)(ϕ(2)(. . . ϕ(k)(ψ^{r+2}(ωk), ωk−1) . . . , ω1), ω0). Since this is now a cut-free derivation with a countable ordinal bound, it has neither (Wi) rules for i > 1 nor any (Ωi) rules, and the ~ω prefix is redundant. Hence in ID0(W )∞ + (W1) we have

n ⊢^α_0 Γ(~n).

It remains to check that α ≺ τk+2. Firstly, ϕ(k+1)(ϕ(k+2)(ωk+2, ωk+1), ωk) = ϕ(k+1)(ωk+1 + 2^{ωk+1}, ωk) = ϕ(k+1)(ωk+1 + 2^{ωk}, ωk). Furthermore we have ϕ(k+1)(ωk+1 + 2^{ωk}, ωk)[~ω, n] = ϕ(k+1)(ωk+1 + 2^n, ωk)[~ω, n] = ψ^{2^{2^n}}(ωk)[~ω, n], and this set contains ψ^{r+2}(ωk) as long as 2^{2^n} > r + 2. This is because β ∈ ψ(β)[~ω, n] for all β ∈ ΩSk+1. Hence

ψ^{r+2}(ωk) ∈ ϕ(k+1)(ϕ(k+2)(ωk+2, ωk+1), ωk)[~ω, n].


Now recall that if γ ∈ β[~ω, n] then ϕ(k)(γ, ωk−1) ∈ ϕ(k)(β, ωk−1)[~ω, n]. So

ϕ(k)(ψ^{r+2}(ωk), ωk−1) ∈ ϕ(k)(ϕ(k+1)(ϕ(k+2)(ωk+2, ωk+1), ωk), ωk−1)[ωk−2, . . . , ω0, n].

Repeating this process at levels k−1, k−2, . . . , 1 we thus obtain

α = ϕ(1)(ϕ(2)(. . . ϕ(k)(ψ^{r+2}(ωk), ωk−1) . . . , ω1), ω0)
  ∈ ϕ(1)(ϕ(2)(. . . ϕ(k)(ϕ(k+1)(ϕ(k+2)(ωk+2, ωk+1), ωk), ωk−1) . . . , ω1), ω0)[n]
  = τk+2[n].

We have checked, in the course of the above, that this holds provided n is large enough, for example n > max(d, h, r). Thus α ∈ τk+2[max(d, h, r)] and hence α ≺ τk+2.
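Purely as a numerical shadow of this bookkeeping (ordinals replaced by natural numbers; names our own): ψ(β) = β + 2^β mirrors ϕ(k+1)(ωk+1, β), each cut-rank reduction costs one application of ψ, and two further applications are already spent reaching the starting bound, giving ψ^{r+2} in total.

```python
# Illustration: the collapsing-side cost function psi and its (r+2)-fold iterate.

def psi(b):
    return b + 2 ** b          # numerical analogue of beta + 2^beta

def cut_free_bound(r, b):
    """Bound after removing r cut ranks, starting from psi^2 of the base."""
    for _ in range(r + 2):
        b = psi(b)
    return b
```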

Lemma. If Γ is a finite set of Σ1-formulas such that n ⊢^α_0 Γ(~n), then at least one of them, say ∃~y A(~n, ~y), is true with witnesses ~m < Bα(n + 1).

Proof. Proceed by induction on α with cases according to the last rule applied. If Γ is an axiom, then it contains a true atom, which is a trivial Σ1-formula requiring no witnesses, and so we are done. If the last rule applied is anything other than an (∃) rule then its principal formula is bounded. Either this is true, in which case it is again a true Σ1-formula requiring no witnesses and we are done, or else one of the premises is of the form n ⊢^β_0 Γ′(~n), C where C is a false subformula of that principal formula. Applying the induction hypothesis to this premise we see that Γ′(~n) ⊆ Γ(~n) contains a true Σ1-formula with witnesses less than Bβ(n+1) < Bα(n+1). Finally suppose the last rule applied is an (∃) rule with conclusion Γ(~n), ∃yD(y, ~n) and premises n ⊢^β_0 m : N and n ⊢^β_0 Γ(~n), D(m, ~n), where D is again Σ1. By the induction hypothesis either Γ(~n) already contains a true Σ1-formula with witnesses less than Bβ(n+1), or else D(m, ~n) is a true Σ1-formula with witnesses less than Bβ(n+1) and the new witness m for ∃y is also less than Bβ(n+1). Since Bβ(n+1) is less than Bα(n+1) we are done.

Corollary. If IDk(W ) ⊢ ∃~y A(~x, ~y) with A bounded, then there is an α ≺ τk+2 and a number d such that for any ~n there are ~m < Bα(max(~n, d)+1) such that A(~n, ~m) holds.

Proof. By the theorem we have n ⊢^α_0 ∃~y A(~n, ~y) if n > max(~n, d). Applying the lemma we immediately have true witnesses ~m < Bα(max(~n, d)+1) such that A(~n, ~m) holds.

5.3.4. Accessible = provably recursive in ID<ω. From the foregoing and 5.3.1 we immediately have

Theorem. The provably recursive functions of IDk(W ) are exactly those elementary, or primitive, recursive in {Bα | α < τk+2}. Hence the accessible recursive functions are exactly those provably recursive in ID<ω(W ).


5.3.5. Provable ordinals of IDk(W ). By a “provable ordinal” of the theory IDk(W ) we mean one for which there is a recursive ordinal notation a such that IDk(W ) ` W1(a). This is equivalent to proving transfinite induction up to a in the form

∀b(F1(A, b)→ A(b))→ ∀b≺a A(b).

Thus in case k = 0, although W1 is not a predicate symbol of PA, it nevertheless makes perfectly good sense to refer to the ordinals below ε0 = |τ2| as the provable ordinals of ID0(W ), since we know already that they are the ones for which PA proves transfinite induction.

Theorem. For each k, the provable ordinals of IDk(W ) are exactly those less than |τk+2|.

Proof. Any ordinal less than |τk+2| is represented by a tree ordinal α ≺ τk+2 and, by 5.3.1, this has a notation a for which IDk(W ) ` W1(a). Thus every ordinal below |τk+2| is a provable one of IDk(W ).

Conversely, if IDk(W ) ` W1(a), then by 5.3.3 one can derive a `α0 W1(a) in ID0(W )∞ + (W1) for some fixed α ≺ τk+2. Therefore a `β0 F1(W1, a) for some β ≺ α. By inverting this, and deleting existential side formulas which contain false atomic conjuncts, one easily sees that either a = 0, or a = 〈0, b〉 and a `γ0 W1(b) for some γ ≺ β, or else a = 〈1, e〉 and for every n, max(a, n) `γ0 W1(e(n)), again where γ ≺ β. Thus one can prove by induction on α that if n `α0 W1(a) then |a| ≤ |α|.

5.4. ID<ω and Π11-CA0

Π11-CA0 is the second order (classical) theory whose language extends that of PA by the addition of new variables X,Y, . . . and f, g, . . . denoting sets of numbers and (respectively) number–theoretic functions. Of course, the set variables may be eliminated in favour of function variables only, for example replacing ∃XA(X) by ∃fA({x | f(x) = 0}) and t ∈ {x | f(x) = 0} by f(t) = 0. For later convenience we shall consider this done, but for the time being we continue to use the set notation as an abbreviation.

The axioms of Π11-CA0 are those of PA together with the single induction axiom (not the schema – this is what the “-CA0” stands for)

∀X(0 ∈ X ∧ ∀x(x ∈ X → x+ 1 ∈ X)→ ∀x(x ∈ X))

and the comprehension schema,

∃X∀x(x ∈ X ↔ C(x))

restricted to Π11 formulas C which do not contain X free but may have first and second order parameters. Recall that Π11 formulas are those of the form ∀fA(f) with A arithmetical (i.e., containing no second order quantifiers). The comprehension principle gives sets, but we do not yet have a principle guaranteeing the existence of functions whose graphs are definable. To this end we need to add the so-called graph principle:

∀~x∃!z(~x, z) ∈ X → ∃h∀~x(~x, h(~x )) ∈ X.


As an example of how this is used we show, briefly, that the following version of the axiom of choice:

∀x∃fA(x, f)→ ∃h∀xA(x, hx)

is provable in Π11-CA0, for arithmetical A and where hx(y) := h(x, y). First, by standard quantifier manipulation, one reduces ∀x∃fA(x, f) to the form ∀x∃f∀yR(x, f(y)). Now let Q(x, s) be defined as follows

Q(x, s) := ∀i≤lh(s)((s)i = µm∀z≥i∃t(R(x, t) ∧ lh(t) ≥ z ∧ t w (si) ∗ 〈m〉))

where s, t denote sequence numbers and t w s means that t extends s. Then the formula ∃s(lh(s) = y ∧ Q(x, s) ∧ (s)y = z) arithmetically defines a single-valued relation X(x, y, z) which is the graph of a binary function h(x, y). The graph principle ensures the existence of h and for each x, hx picks out the “leftmost” infinite branch through R. That is, ∀x,y R(x, hx(y)) and hence ∃h∀xA(x, hx).
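The sequence-number apparatus used here, and relied on repeatedly below (lh, the projections (s)i, concatenation ∗, the extension relation, and the course-of-values coding), can be modelled concretely. A minimal sketch, using Python tuples in place of an arithmetized Gödel coding; the function names are ours, chosen for illustration only:

```python
# Sequence numbers modelled as tuples (an illustrative stand-in for the
# arithmetized coding used in the text).

def lh(s):
    """Length of the coded sequence."""
    return len(s)

def proj(s, i):
    """(s)_i : the i-th entry of the coded sequence."""
    return s[i]

def cat(s, t):
    """s * t : concatenation of two coded sequences."""
    return s + t

def extends_seq(t, s):
    """The extension relation: t is an end extension of s."""
    return len(t) >= len(s) and t[:len(s)] == s

def coded(f, x):
    """Course-of-values coding <f(0), ..., f(x)> of f up to x."""
    return tuple(f(i) for i in range(x + 1))

assert extends_seq(cat((1, 2), (3,)), (1, 2))
assert proj(coded(lambda i: i * i, 3), 2) == 4
assert lh(coded(lambda i: i, 4)) == 5
```

Nothing in the arguments below depends on the particular coding, only on these operations being primitive recursive.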

The foundational importance of Π11-CA0 is that it is strong enough to formalise and develop large parts of core mathematics – up to, for example, the Cantor-Bendixson Theorem, Ulm’s Theorem and many other fundamental results. The reader is recommended to consult the major work of Simpson (1999) where Π11-CA0 features at the “top” of a hierarchy of five particularly significant subsystems of second order arithmetic (RCA0, WKL0, ACA0, ATR0 and Π11-CA0) each of which captures and (in reverse) characterises deep mathematical principles in terms of the levels of comprehension allowed.

Feferman (1970) proved that various Π11–systems, in particular Π11-CA0, can be reduced to theories of iterated inductive definitions, and we follow here the treatment reviewed in chapter 1 of Buchholz et al. (1981), to show that Π11-CA0 and ID<ω(W ) have the same first order theorems (and thus the same provably recursive functions). However a certain amount of additional care must be taken in the reduction of Π11-CA0 to the finitely iterated ID system, because our Wi’s are defined in terms of unrelativised partial recursive sequencing at limits, and this means that the usual Π11 normal form cannot be reduced directly to a Wi set without further manipulation into a special normal form due to Richter (1965).

First, as a straightforward illustration of the power of Π11-CA0, we show that ID<ω(W ) is easily embedded into it. One only needs to prove the existence of sets X1, . . . Xk, . . . satisfying the inductive closure and least–fixed–point axioms for the operator forms F1, . . . Fk, . . . respectively. For each k let Fk(X1, . . . Xk−1, Y, a) be the formula obtained from the operator form Fk(A, a) by replacing each occurrence of Wi by the set variable Xi and the formula A by Y . Let Ck(X1, . . . , Xk−1, Z) be the formula

∀z(z ∈ Z ↔ ∀Y (∀a(Fk(X1, . . . , Xk−1, Y, a)→ a ∈ Y )→ z ∈ Y ))

expressing that Z is the intersection of all sets Y which are inductively closed under Fk with respect to the set parameters X1, . . . , Xk−1. Since this is a Π11 condition we have by Π11–comprehension,

∃ZCk(X1, . . . , Xk−1, Z) .


By its very definition, such a Z automatically satisfies the least-fixed-point schema for Fk(X1, . . . , Xk−1, Y ) as an operator on (arithmetically defined sets) Y (which exist because of Π11–comprehension). Furthermore the inductive closure axiom holds because the formula Fk(X1, . . . , Xk−1, Y, a) is positive (and thus, as an operator, monotone) in Y , for we have

z 6∈ Z → ∃Y (∀a(Fk(X1, . . . , Xk−1, Y, a)→ a ∈ Y ) ∧ z 6∈ Y )

→ ∃Y (¬Fk(X1, . . . , Xk−1, Y, z) ∧ Z ⊆ Y )

→ ¬Fk(X1, . . . , Xk−1, Z, z) .

Hence if X1, . . . , Xk−1 are the sets W1, . . . ,Wk−1 then Z is the set Wk. The provable formula

∃X1∃X2 . . .∃Xk(C1(X1) ∧ C2(X1, X2) ∧ · · · ∧ Ck(X1, . . . , Xk−1, Xk))

therefore establishes the existence of W1, . . . ,Wk in Π11-CA0.

Conversely we need to show that, for first order arithmetical sentences, Π11-CA0 is conservative over ID<ω(W ). This will require a (many–one recursive) reduction of any Π11 form to one of the Wi sets, and a suitable interpretation of the second order theory Π11-CA0 inside ID<ω(W ). We leave aside the interpretation for the time being, and concentrate first on Richter’s Π11 reduction without bothering explicitly about its formalisation.

By standard quantifier manipulations, using Kleene’s Normal Form for partial recursion and the usual overbar to denote the course-of-values function f̄(x) = 〈f(0), . . . , f(x)〉, any Π11 formula C with set parameter X and number variable a can be brought to the form

C(a,X) ≡ ∀f∃xR(a, f(x), g1(x), g2(x))

where R is some “primitive recursive” formula (having only bounded quantifiers) and g1, g2 are the strictly increasing functions enumerating X and its complement (denoted here) X ′. Let h encode the three functions f, g1, g2 by h(x) = 〈f(x), g1(x), g2(x)〉 so that h0(x) = (h(x))0 = f(x), h1(x) = g1(x) and h2(x) = g2(x). Then the negation of the above form is equivalent to

∃h∀x(h(x) ∈ N ×X ×X ′ ∧ ¬R1(a, h(x)))

where ¬R1(a, h(x)) expresses the conjunction of h1(x− 1) < h1(x) ∧ h2(x− 1) < h2(x) and ∃y≤x(h1(y) = x ∨ h2(y) = x) and ¬R(a, h0(x), h1(x), h2(x)). Negating once again, one sees that the original Π11 form is equivalent to

C(a,X) ≡ ∀h∃x(h(x) 6∈ N ×X ×X ′ ∨R1(a, h(x))) .

We refer to this as the “Richter Normal Form” on N,X,X ′.

Now, to take this a stage further, suppose that X were expressible in Richter Normal Form from parameter Y thus:

x ∈ X ↔ ∀f∃y(f(y) 6∈ Y ∨R2(x, f(y))) .

Then, putting the two forms together, we have:

C(a,X)↔ ∀h∃x∀f∃y(h1(x) 6∈ X∨f(y) 6∈ Y ∨R1(a, h(x))∨R2(h2(x), f(y))) .

Replacing ∃x∀fP (x, f) by the (classically) equivalent ∀g∃xP (x, g(〈x, .〉)), the right hand side now becomes:

∀h∀g∃x∃y(h1(x) 6∈ X ∨ g(〈x, y〉) 6∈ Y ∨R1(a, h(x)) ∨R3(h2(x), g(〈x, y〉)))


where R3 is a suitably modified version of R2. The negation of this says that there are functions h, g such that for all x, y,

h1(x) ∈ X ∧ g(〈x, y〉) ∈ Y ∧ ¬R1(a, h(x)) ∧ ¬R3(h2(x), g(〈x, y〉)) .

By combining the functions h = 〈h0, h1, h2〉 and g into a new function f(〈x, y〉) = 〈h0(x), h1(x), h2(x), g(〈x, y〉)〉, and adding a new primitive recursive clause ¬R4(f(〈x, y〉)) expressing, for each i = 0, 1, 2, that fi = hi is independent of y, i.e., fi(〈x, y〉) = fi(〈x, 0〉), one sees that, for all x, y, this last line is equivalent to the conjunction of f1(〈x, y〉) ∈ X ∧ f3(〈x, y〉) ∈ Y and ¬R1(a, 〈f0, f1, f2〉(〈x, y〉)) and ¬R3(f2(〈x, y〉), f3(〈x, y〉)) and ¬R4(f(〈x, y〉)). Negating back again, and contracting the pair x, y into a single z (= 〈x, y〉), one obtains

C(a,X)↔ ∀f∃z(f(z) 6∈ N ×X ×N × Y ∨R5(a, f(z)))

where R5 is a disjunction of suitably modified versions of R1, R3, R4. It is then a simple matter to combine the two occurrences of N into one, so that C is now expressed in Richter Normal Form on the parameters N,X, Y .

Thus, if C is Π11 in X and X is expressible in Richter Normal Form on parameter(s) Y , then C is expressible in Richter Normal Form on parameters N,X, Y . Now one sees that since W1, being Π11 in no set parameters, is therefore expressible in Richter Normal Form on parameter N , any set Π11 in W1 is expressible in Richter Normal Form on parameters N,W1, and since this includes W2, any set Π11 in W2 is expressible in Richter Normal Form on parameters N,W1,W2 (one can always combine multiple occurrences of N in the parameter list). Iterating this, it follows that any set Π11 in Wk is expressible in Richter Normal Form on the parameters N,W1, . . . ,Wk.

Lemma. If S is Π11 in Wk then it is many–one reducible to Wk+1. That is, there is a recursive function g such that a ∈ S ↔ g(a) ∈ Wk+1.

Proof. If S is Π11 in Wk then, by the above, there is a primitive recursive relation R such that for all a,

a ∈ S ↔ ∀f∃z(f(z) ∈ N ×W1 × · · · ×Wk → R(a, f(z)))

and we may assume that R(a, s) implies R(a, s′) for any extension s′ of s by simply replacing R(a, s) by ∃t⊆sR(a, t) if necessary. For notational simplicity only, we carry the proof through for k = 1, the general case being entirely similar. The function g is given by g(a) = g(a, 〈〉) where, for arbitrary sequence numbers s, the binary g(a, s) is defined (from its own index) by the recursion theorem thus:

g(a, s) =
  0                                  if R(a, s),
  〈1, Λi〈2, Λj g(a, s ∗ 〈〈i, j〉〉)〉〉   otherwise.

Here, the Kleene notation Λit denotes any index of t regarded as a recursive function of i only, and s ∗ s′ denotes the new sequence number obtained by concatenating s with s′. We must show

g(a, s) ∈W2 ↔ ∀f∃z(f(z) ∈ N ×W1 → R(a, s ∗ f(z)))

so that the required result follows immediately by putting s = the empty sequence 〈〉.
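The clauses defining g can be illustrated operationally. This is only a sketch under simplifying assumptions: Kleene indices (the Λ notation) are replaced by Python closures, branching over W1 is approximated by branching over all naturals, and well-foundedness is probed only to a finite depth and width:

```python
# Illustrative sketch only: the official definition of g uses Kleene
# indices via the recursion theorem; here recursive trees are modelled
# by closures.  R is a decidable predicate on (a, s).

LEAF = ('leaf',)          # the tree notation 0: a well-founded leaf

def g(R, a, s=()):
    """Build the tree g(a, s): a leaf once R(a, s) holds, otherwise a
    branching node over all pairs (i, j), as in the second clause."""
    if R(a, s):
        return LEAF
    # ('sup', e): e stands in for <1, Lambda-i <2, Lambda-j ...>>.
    return ('sup', lambda i, j: g(R, a, s + ((i, j),)))

def well_founded(t, depth):
    """Probe well-foundedness of t along branches of length <= depth,
    inspecting only finitely many children (an approximation)."""
    if t[0] == 'leaf':
        return True
    if depth == 0:
        return False
    return all(well_founded(t[1](i, j), depth - 1)
               for i in range(3) for j in range(3))

# Example: R holds as soon as s has length 2, so every branch of
# g(R, a) reaches a leaf after exactly two steps.
R = lambda a, s: len(s) >= 2
assert well_founded(g(R, 0), depth=2)
assert not well_founded(g(R, 0), depth=1)
```

The point of the real construction, of course, is that membership of g(a, s) in W2 tracks the quantifier form displayed above; the sketch only shows the shape of the tree being built.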


For the left–to–right implication we use (informally) the least-fixed-point property of W2, by applying it to

A(b) ≡ ∀c≼b∀s(c = g(a, s)→ ∀f∃z(f(z) ∈ N ×W1 → R(a, s ∗ f(z))))

with the “sub–tree” partial ordering on W2. We show that ∀b(F2(A, b)→ A(b)), from which one gets the required left–to–right implication by putting b = g(a, s). Note that this is an abuse of the language of ID2(W ) because A is not even a first–order formula. However it will be when we come to formalize this argument subsequently. So assume F2(A, b). This means that A(c) holds for every c ≺ b. If b = 〈0, c〉 or 〈2, e〉 then A(b) is automatic because b is not a value of g. If b = 0 and b = g(a, s) then R(a, s) holds and we again have A(b). Finally suppose b = 〈1, e〉 and b = g(a, s). Then for each i, e(i) = 〈2, ei〉 ≺ b where ei(j) = g(a, s ∗ 〈〈i, j〉〉) for every j ∈ W1. Hence for every n, considered as a pair n = 〈i, j〉, if j ∈ W1 we have g(a, s ∗ 〈n〉) ≺ b and therefore A(g(a, s ∗ 〈n〉)), thus ∀f∃z(f(z) ∈ N ×W1 → R(a, s ∗ 〈n〉 ∗ f(z))). Since n is arbitrary, ∀f∃z(f(z) ∈ N ×W1 → R(a, s ∗ f(z))) and again we have A(b).

For the right–to–left implication use the inductive closure property of Wk+1. Suppose g(a, s) 6∈ W2. Then g(a, s) is defined by its second clause and so there is an i and a j ∈ W1 such that g(a, s ∗ 〈〈i, j〉〉) 6∈ W2. Let n0 be the least such pair 〈i, j〉 so that g(a, s ∗ 〈n0〉) 6∈ W2. Then let n1 be the least such pair so that g(a, s ∗ 〈n0, n1〉) 6∈ W2. Clearly this process can be repeated ad infinitum to obtain a function f(z) = nz such that for all z, ¬R(a, s ∗ f(z)). This completes the proof.

Note. The function f just defined is recursive in W2. More generally, the proof that any set Π11 in Wk is many–one reducible to Wk+1 needs only refer to functions recursive in Wk+1.

Theorem. Any Π11-CA0 proof of a first order arithmetical sentence can be interpreted in some IDk+1(W ) by restricting the function variables to range over those recursive in Wk+1. Thus Π11-CA0 and ID<ω(W ) prove the same arithmetical formulas.

Proof. Suppose the given Π11-CA0 proof uses k + 1 instances of Π11-Comprehension, defining sets X0, . . . , Xk. Imagine them ordered in such a way that the definition of each Xi uses only parameters from the list X0, . . . , Xi−1. Then by induction on i one sees, by the lemma, that each Xi is many–one reducible to Wi+1 and thence to Wk+1, and the proof of this refers only to functions recursive in Wk+1. Therefore by interpreting all function variables in the original proof as ranging over functions recursive in Wk+1, replacing ∃fC(f) by ∃e(∀x∃y(eWk+1(x) = y) ∧ C(eWk+1)) etc., every second order formula is translated into a first order formula of IDk+1(W ), first order formulas remaining unchanged. Thus the second order quantifier rules become first order ones provable in IDk+1(W ), the graph principle becomes provable too, and the induction axiom of Π11-CA0 becomes provable by the usual first order schema. Furthermore, under this interpretation, all of the second order quantifier manipulations used in the above reduction of Π11 forms to Richter Normal Forms are provably correct (because of standard recursion–theoretic uniformities). The proof of the lemma then translates into a proof in IDk+1(W ), and consequently each application of Π11-Comprehension becomes a theorem. If the end–formula of the given Π11-CA0 proof is first order arithmetical, it therefore remains provable in IDk+1(W ).

5.5. An Independence Result – Kruskal’s Theorem

Kruskal’s Theorem states that every infinite sequence 〈Ti〉 of finite trees has an i < j such that Ti is embeddable in Tj . By “finite tree” is meant a rooted (finite) partial ordering in which the nodes below any given one are totally ordered. An embedding of Ti into Tj is then just a one-to-one function from the nodes of Ti to nodes of Tj preserving infs (greatest lower bounds).
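This notion of embedding for (unlabelled) finite trees can be checked by brute force on small examples. A sketch, representing a tree by its parent list so that the inf of two nodes is their nearest common ancestor; the representation and function names are our own:

```python
from itertools import permutations

# A finite rooted tree given by a parent list: parent[v] is the parent
# of v, and parent[root] = -1.  Since the nodes below any node are
# totally ordered, inf(a, b) is the nearest common ancestor.

def ancestors(parent, v):
    chain = [v]
    while parent[v] != -1:
        v = parent[v]
        chain.append(v)
    return chain                     # v, parent(v), ..., root

def inf(parent, a, b):
    """Greatest lower bound = nearest common ancestor."""
    anc = set(ancestors(parent, a))
    for u in ancestors(parent, b):
        if u in anc:
            return u
    raise ValueError('nodes of a rooted tree share the root')

def embeds(ti, tj):
    """Brute force: is there a one-to-one, inf-preserving map Ti -> Tj?"""
    n, m = len(ti), len(tj)
    for image in permutations(range(m), n):
        if all(image[inf(ti, a, b)] == inf(tj, image[a], image[b])
               for a in range(n) for b in range(n)):
            return True
    return False

chain3 = [-1, 0, 1]       # a path of three nodes
star   = [-1, 0, 0]       # a root with two incomparable children
assert embeds([-1, 0], chain3)    # a 2-chain embeds into a 3-chain
assert not embeds(star, chain3)   # incomparable nodes need a branching
```

Preserving infs is what rules the second example out: the images of the two incomparable children would need a common ancestor distinct from both, which a chain cannot supply.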

Friedman showed this theorem to be independent of the theory ATR0 (see Simpson (1999)) and went on to develop a significant extension of it which is independent of Π11-CA0. The Extended Kruskal Theorem concerns finite trees in which the nodes carry labels from a fixed finite list 0, 1, 2, . . . , k. By a more delicate argument, he proved that for any k, every infinite sequence 〈Ti〉 of finite ≤ k-labelled trees has an embedding Ti → Tj where i < j. However the notion of embedding is now more complex. Ti → Tj means that there is an embedding f in the former sense, but which also preserves labels (i.e., the label of a node is the same as that of its image under f) and satisfies the gap condition which states: if node x comes immediately below node y in Ti, and if z is an intermediate node strictly between f(x) and f(y) in Tj , then the label of z must be ≥ the label of f(y).

Both of these statements are Π11, expressed by a universal set/function quantifier, but Friedman showed that they can be miniaturised to a Π02 form which (i) now falls within the realm of “finitary combinatorics”, expressible in the language of first-order arithmetic, but (ii) still reflects the proof-theoretic strength of the original results.

Friedman’s Miniaturized Kruskal Theorem for labelled trees runs as follows: For any number c and fixed k there is a number Kk(c) so large that for every sequence 〈Ti〉 of finite ≤ k-labelled trees of length Kk(c), and where each Ti is bounded in size by ‖Ti‖ ≤ c · 2^i, there is an embedding Ti → Tj with i < j. (In fact Friedman showed that 2^i may be replaced by i + 1 without affecting the result’s strength.)

That the miniaturized version is a consequence of the full theorem follows from König’s Lemma, for suppose the miniaturized version fails. Then there is a c such that for every l there is a sequence of size-bounded, ≤ k-labelled finite trees, of length l, which is “bad” (i.e., contains no embedding Ti → Tj with i < j). Arrange these bad sequences into a big tree, each node of which is itself a finite labelled tree. Because of the size-bound, each level of this big tree is finite. However it has infinitely many levels, so by König’s Lemma there is an infinite branch. This infinite branch is then an infinite bad sequence, contradicting the full theorem.


In this section we prove that the Miniaturized Kruskal Theorem for labelled trees, and hence the full Kruskal Theorem, are independent of Π11-CA0. The proof consists in showing that the computation sequence for each Gτk(n) is bad.

5.5.1. ϕ-terms, trees and i-sequences. Henceforth we shall regard the ϕ-functions as function symbols and use them, together with the constants 0, ωj , to build terms. Each such term will of course denote a (structured) tree ordinal, but it is important to lay stress, in this section, upon these terms rather than the tree ordinals which they denote.

Definition. An i-term, for i > 0, is either ωi−1 or else of the form

ϕ(i)α(β) (alternatively written ϕ(i)(α, β))

where β is an i-term and α is a j-term with j ≤ i + 1. (0-terms are just numerals n built from 0 by repeated applications of the successor ϕ(0), which has no subscript.) Note that each i-term may be viewed as a finite labelled tree whose root has label i, whose left hand subtree is the tree α and whose right hand subtree is the tree β. We often indicate the level i of a term γ by writing γi.

Notation. For each i-term γi and i−1-term ξi−1 (assuming i > 1) we denote the term ϕ(i−1)γ(ξ) by simply γ(ξ). If i = 1 then ξ is not present.

Thus, with association to the left,

τk = ϕ(1)(ϕ(2)(. . . ϕ(k)(ωk, ωk−1) . . . , ω1), ω0)

is denoted ωk(ωk−1)(ωk−2) . . . (ω0), and a typical i-term then would be

ν(ξik)(ξik−1) . . . (ξi1)(ξi)

where ν (the “indicator”) is either 0 or an ωj .

Definition (The Computation Sequence). The computation sequence starting with τk and fixed input n is the sequence of 1-terms and numerals generated sequentially according to the computation rules for the ϕ-functions, as follows:

γ = ν(ξik)(ξik−1) . . . (ξi1)(ξ1)

reduces in one step to

δ =
  ξik(ϕ(ik−1)ξik(ξik−1)) . . . (ξi1)(ξ1)   if ν = 0,
  ξij(ξik)(ξik−1) . . . (ξi1)(ξ1)           if ν = ωij and ij < ij+1, . . . , ik,
  n(ξik)(ξik−1) . . . (ξi1)(ξ1)             if ν = ω0.

If γ = ω0 it reduces to n, then to n− 1 etc. until it reaches 0 and stops. We henceforth omit the overbar from numerals.

Lemma. The computation sequence starting with τk and n is finite.

Proof. The proof goes by induction on k noting, as basis, that when k = 0 the computation sequence for τ0 = ω0 runs simply: ω0, n, n− 1, . . . , 0. For the induction step, suppose k > 0.

Let level i of the computation sequence be what remains after stripping away, from each term γ(ξi−1)(ξi−2) . . . (ξ1), the outermost arguments (ξi−1)(ξi−2) . . . (ξ1), thus leaving γ alone. Notice that a gap will occur when γ = 0 and ξi−1 = ωi−2, because the term 0(ωi−2)(ξi−2) reduces in one step to ωi−2(ϕωi−2(ξi−2)), which has no level i component, but this then reduces to ωi−2(ξi−2)(ϕωi−2(ξi−2)) and the ωi−2 now reappears at level i (and the ξi−2 at level i − 1). Thus level i will contain j-terms for j ≤ i and furthermore there will be many repetitions. However we only need show that each level i sequence is finite. This we do by a sub-induction downward from i = k+1 to i = 1. Then with i = 1 we have the entire computation sequence from τk, and hence the result.

At level k + 1 the reduction steps produce the following sequence: ωk, ωk−1, . . . , ω0, n, n−1, . . . , 0, at which point the k-term ϕ0ϕ1 . . . ϕn−1(ωk−1) will have been generated at level k. So the level k+1 sequence will continue: 0, 1, 0, 0, 2, 1, 0, 0, 1, 0, 0, 3, . . . , n− 1, n− 2, n− 3, . . . , 1, 0, 0, 1, 0, 0, . . . , n − 2, . . . , ending with 0 after 2^(n+1) − 1 steps (the number of nodes in the binary tree of height n+ 1). After this there is a gap, with ωk−1 remaining at level k, which gets replaced successively by ωk−2, ωk−3, . . . , ω0, and then n = ϕ0ϕ0 . . . ϕ0(0). No further terms have subscripts at level k + 1, so level k + 1 is finite.

Now assume inductively that level i+1 is finite. After it has finished, no further term at level i can have a level i+1 subscript, so the first such term must be an ωj for some j ≤ i−1, and in fact j 6= i−1 because as level i develops, ωi−1 will occur in a context ωi−1(ϕ(i−1)α(β)) and then the next reduction would produce α in level i+1. Suppose the first such term is ωi−2. This will occur in the computation sequence in a context ωi−2(ξi−1)(ξi−2) . . . (ξ1). But this reduces to ξi−2(ξi−1)(ξi−2) . . . (ξ1) so clearly ξi−2 cannot be any term except ωi−3, for otherwise its subscript would keep level i + 1 alive. Similarly ξi−3 = ωi−4, . . . , ξ1 = ω0. We claim also that ξi−1 = ωi−2, for assume otherwise. Then ξi−1 = ϕ(i−1)α(β) where, because of the reduction rules, α must reduce to ωi−2. Then ωi−2(ξ)(ωi−3) . . . (ω1) reduces eventually to 0(ξ)(ωi−3) . . . (ω1) and then to α(β)(ϕξ(ωi−3)) . . . (ω1) and thence to ωi−2(β′)(ϕξ(ωi−3)) . . . (ω1) and finally to ξ(ωi−3)(β′)(ϕξ(ωi−3)) . . . (ω1). The point is that ξ now appears in level i+1, contrary to the assumption that level i+1 has finished already. What we have shown here is that after level i+1 is finished, all remaining terms at level i come below ωi−2(ωi−2)(ωi−3) . . . (ω1), which is the first term in the computation sequence for τi−1. Since i−1 < k the overall induction hypothesis allows us to assume that this remaining part of level i (after level i+ 1 is finished) is finite. This ends the proof.

Lemma. The length of the computation sequence is greater than the number of successor ordinals encountered in the reduction process, i.e., greater than the cardinality of the set of tree-ordinals τk[n], which by definition is exactly Gτk(n).

Definition (The i-sequences). Let γ occur in level i of the computation sequence from τk and n. Then the i-sequence from that occurrence of γ consists of all succeeding level i terms as far as the next zero. Write γ →i δ to indicate that γ precedes (or is) δ in the same i-sequence. Note that there is just one 1-sequence – the computation sequence itself.


Lemma. Fix τk and n. Then each i-sequence is non-repeating and non-increasing with respect to the tree-ordinals denoted.

Proof. The result trivially holds for τ0, so assume inductively that k > 0 and the result holds for all τk′ with k′ < k. Within this, proceed by a downward induction from i = k + 1 to i = 1, noting from the above proof that all k + 1-sequences strictly descend with respect to their tree-ordinals. Suppose then, that i < k + 1 and all i + 1-sequences are non-repeating and non-increasing. Choose an i-sequence starting with γ. If it lies within the computation sequence from some smaller τk′ there is nothing to do. Otherwise suppose γ is a j-term and use a sub-induction on the ordinal of γ. If γ = 0 it is its own i-sequence and there is nothing to do. If γ = ωj−1 then γ reduces immediately to a j−1-term and the sub-induction hypothesis gives the required result straight away. On the other hand, γ might be ϕ(j)α(β). Then γ →i ϕ0(δ) →i δ where δ = ϕ0 . . . ϕα2ϕα1(β) and the ϕ subscripts α →i+1 . . . α1 →i+1 . . . α2 →i+1 . . . 0 form an i + 1-sequence which, by induction hypothesis, neither repeats nor increases in ordinal. Therefore the first part of the i-sequence from γ down to ϕ0(δ) neither repeats nor increases, and the rest, from δ down to 0, is non-repeating and ordinally non-increasing because of the sub-induction hypothesis, since the ordinal of δ is smaller than that of γ. This completes the proof.

Definition. γ →+ δ means that, as labelled trees, γ → δ (i.e., γ is embeddable in δ, preserving labels, infs and satisfying the gap condition) and furthermore, if γ is a j′-term, the embedding does not completely embed γ inside any j-subterm of δ where j < j′.

Lemma. Fix τk and n. Then for each i with 1 ≤ i ≤ k + 1 and every term δ, if γ →i δ and γ →+ δ then γ and δ are identical.

Proof. By induction on i from k + 1 down to 1, and within that an induction over the term or tree δ, and within that a subinduction over γ.

For the basis i = k + 1, level k + 1 is described above, and it is clear that, in each of its k + 1-sequences, no term can be →+ embedded in any successor.

Now suppose 1 ≤ i ≤ k and assume the result for i + 1. We proceed by induction on the i-term δ. If δ = ωj or 0 and γ →+ δ the only possibility is γ is δ. Suppose then, that δ is of the form ϕ(j)α(β). Then γ cannot be ωj′ for any j′ > j because γ →+ δ, and it cannot be ωj′ with j′ ≤ j because none of its successors in the i-sequence could then be j-terms. Thus γ is also of the form ϕ(j′)α′(β′). By γ →+ δ we have j′ ≤ j and by γ →i δ we have j′ ≥ j, so j′ = j. Also, we cannot have β′ →i δ for otherwise, by the gap condition, γ →+ δ implies β′ →+ δ, so by the sub-induction hypothesis β′ and δ would be identical, and then γ would contain δ as a proper subterm, contradicting γ → δ.

The situation then, is this: γ = ϕ(j)α′(β′), δ = ϕ(j)α(β), γ →i δ and γ →+ δ. Furthermore β must be of the form ϕ(j)αr . . . ϕ(j)α2ϕ(j)α1(β′) where α′ → . . . α1 → . . . α2 → . . . αr → . . . α is (the initial part of) an i + 1-sequence from α′. This is because any occurrence of zero immediately gets stripped away, leaving what remains before it.


Now there are four possible ways in which γ can embed in δ, only two of which actually happen.

Case 1. γ →+ β. Then γ →i δ →i β belong to the same i-sequence, so by the induction hypothesis γ is then identical to β. Therefore the ordinal denoted by γ is strictly less than the ordinal of δ. But this is impossible because i-sequences are non-increasing.

Case 2. γ →+ α. Then let η denote the smallest j-subterm of α such that γ →+ η. This occurrence of η in the subscript α of δ must be created anew as the i-sequence proceeds from γ to δ. The only way this can happen is that at some intervening stage a ϕ(j)α′′(β′′) occurs, where the indicator ν of α′′ is ωj . The next stage replaces ν by β′′ and then β′′ reduces to a j-subterm of α which contains η. Call this j-subterm η′. But this reduction from β′′ to η′, although it occurs inside level i+1, must also occur at level i, and within the same i-sequence. (Note that in the course of the i + 1-sequence from α′, no zero arises until the end, and so levels above i remain unchanged.) Hence γ →i ϕ(j)α′′(β′′) →i β′′ and β′′ →i η′. Also γ →+ η′. Thus by the induction hypothesis, η′ being a proper subterm of δ, we have γ identical to η′, and since the i-sequence is ordinally non-increasing this means that the ordinal of γ is not greater than the ordinal of β′′. This is impossible however, because γ →i ϕ(j)α′′(β′′) and so γ is ordinally greater than β′′.

Case 3. γ →+ δ where the embedding takes the root of γ to the root of δ and α′ → β and β′ → α. By the gap condition, since β′ and β are j-terms, α must be either a j-term or a j + 1-term and α′ a j′-term with j′ ≤ j. But since α′ comes before α in the same i + 1-sequence, α′ cannot be a j′-term and α a j-term where j′ < j. Therefore both α′ and α are j-terms. Now the only way in which α′ could arise in level i+1 is by means of an earlier diagonalization at level i: ϕ(j)ωj(ξ) →i ϕ(j)ξ(ξ) followed by further reductions to ϕ(j)α′(β′) where β′ = ϕ(j)αr . . . ϕ(j)α1(ξ) and ξ →i+1 α′. However, just as in case 2, this level i+1 reduction between j-terms takes place also at level i and consequently β′ →i ξ →i α′ →i α. Because of the gap condition, β′ → α implies β′ →+ α, and so by the induction hypothesis, β′ and α are identical. But this means that α′ and α are identical at level i + 1, and so γ and δ are identical at level i.

Case 4. γ →+ δ where the embedding takes the root of γ to the root of δ and α′ → α and β′ → β. Since α′ →i+1 α and, again by the gap condition, α′ →+ α, it follows from the induction hypothesis that α′ and α are identical. Therefore γ and δ are identical too and this completes the proof.

Lemma. The r-th member of the computation sequence from τk and n is bounded in size by ck(n) · 2^r where ck(n) is max(2k + 1, n).

Proof. At each step of the computation sequence, the worst that happens as a result of one step reduction is that part of the tree or term gets copied, or n gets inserted for ω0. Thus the reduct is at most twice the size of the previous term or tree, or else greater by n. It remains only to note that the size of the starting tree τk is 2k + 1.


Theorem. The computation sequence from τk and n is a bad sequence, and therefore its length is bounded by the Kruskal function Kk(ck(n)). Hence Gτk(n) < Kk(ck(n)).

Proof. By the three preceding lemmas. The second one is applied with i = 1, noting that if γ and δ are 1-terms then γ → δ automatically implies γ →+ δ since 1-terms never get inserted inside numerals. Thus if γ precedes δ in the computation sequence and γ → δ they must be identical, contradicting the first of the lemmas which says that there can be no repetitions.

Corollary. Neither Kruskal’s theorem for labelled trees, nor its miniaturized version, is provable in Π11-CA0.

Proof. If the miniaturized version were provable then K and hence Kn(cn(n)) would be provably recursive in Π11-CA0 and therefore majorized by Gτ (n) = Gτn(n), contradicting the Theorem.

5.6. Notes

This chapter is almost entirely based on Wainer (1999).


Part 3

Constructive Logic and Complexity


CHAPTER 6

Computability in Higher Types

In this chapter we will develop a somewhat more general view of computability theory, where not only numbers and functions appear as arguments, but also functionals of any finite type.

6.1. Abstract Computability via Information Systems

There are two principles on which our notion of computability will be based: finite support and monotonicity, both of which have already been used (at the lowest type level) in section 2.4.

It is a fundamental property of computation that evaluation must be finite. So in any evaluation of Φ(ϕ) the argument ϕ can be called upon only finitely many times, and hence the value – if defined – must be determined by some finite subfunction of ϕ. This is the principle of finite support (cf. section 2.4).

Let us carry this discussion somewhat further and look at the situation one type higher up. Let H be a partial functional of type three, mapping type two functionals Φ to natural numbers. Suppose Φ is given and H(Φ) evaluates to a defined value. Again, evaluation must be finite. Hence the argument Φ can only be called on finitely many functions ϕ. Furthermore each such ϕ must be presented to Φ in a finite form (explicitly say, as a set of ordered pairs). In other words, H, and also any type two argument Φ supplied to it, must satisfy the finite support principle, and this must continue to apply as we move up through the types.

To describe this principle more precisely, we need to introduce the notion of a "finite approximation" Φ0 of a functional Φ. By this we mean a finite set X of pairs (ϕ0, n) such that (i) ϕ0 is a finite function, (ii) Φ(ϕ0) is defined with value n, and (iii) if (ϕ0, n) and (ϕ′0, n′) belong to X where ϕ0 and ϕ′0 are "consistent", then n = n′. The essential idea here is that Φ should be viewed as the union of all its finite approximations. Using this notion of a finite approximation we can now formulate the

Principle of Finite Support. If H(Φ) is defined with value n, then there is a finite approximation Φ0 of Φ such that H(Φ0) is defined with value n.

The monotonicity principle formalizes the simple idea that once H(Φ) is evaluated, then the same value will be obtained no matter how the argument Φ is extended. This requires the notion of "extension". Φ′ extends Φ if for any piece of data (ϕ0, n) in Φ there exists another (ϕ′0, n) in Φ′ such that ϕ0 extends ϕ′0 (note the contravariance!). The second basic principle is then



Monotonicity Principle. If H(Φ) is defined with value n and Φ′ extends Φ, then also H(Φ′) is defined with value n.

An immediate consequence of finite support and monotonicity is that the behaviour of any functional is indeed determined by its set of finite approximations. For if Φ, Φ′ have the same finite approximations and H(Φ) is defined with value n, then by finite support, H(Φ0) is defined with value n for some finite approximation Φ0, and then by monotonicity H(Φ′) is defined with value n. Thus H(Φ) = H(Φ′), for all H.

This observation now allows us to formulate a notion of abstract computability:

Effectivity Principle. An object is computable just in case its set of finite approximations is (primitive) recursively enumerable (or equivalently, Σ⁰₁-definable).

This is an "externally induced" notion of computability, and it is of definite interest to ask whether one can find an "internal" notion of computability coinciding with it. This will be done by means of a fixed point operator introduced into this framework by Platek, and the result we shall eventually prove is due to Plotkin (1978).

The general theory of computability concerns partial functions and partial operations on them. However we are primarily interested in total objects, so once the theory of partial objects is developed, we can look for ways to extract the total ones. In the last section of this chapter Kreisel's Density Theorem (that the total functionals are dense in the space of all partial continuous functionals) and the associated Effective Choice Principle are presented.

The organization of the remaining sections is as follows. First we give an abstract, axiomatic formulation of the above principles, in terms of the so-called information systems of Scott (1982). From these we define the notion of a continuous functional of arbitrary finite type over N. Plotkin's theorem will then characterize the computable ones as those generated by certain natural schemata, just as µ-recursion or least fixed points generate the partial recursive functions.

6.1.1. Information systems. The basic idea of information systems is to provide an axiomatic setting to describe approximations of abstract objects (like functions or functionals) by concrete, finite ones. We do not attempt to analyze the notion of "concreteness" or finiteness here, but rather take an arbitrary countable set A of "bits of data" or "tokens" as a basic notion to be explained axiomatically. In order to use such data to build approximations of abstract objects, we need a notion of "consistency", which determines when the elements of a finite set of tokens are consistent with each other. We also need an "entailment relation" between consistent sets X of data and single tokens a, which intuitively expresses the fact that the information contained in X is sufficient to compute the bit of information a. The axioms below are a minor modification of Scott's (1982), due to Larsen and Winskel (1991).


Definition. An information system is a structure (A, Con, ⊢) where A is a countable set (the tokens), Con is a nonempty set of finite subsets of A (the consistent sets) and ⊢ is a subset of Con × A (the entailment relation), which satisfy

U ⊆ V ∈ Con → U ∈ Con,
{a} ∈ Con,
U ⊢ a → U ∪ {a} ∈ Con,
a ∈ U ∈ Con → U ⊢ a,
U, V ∈ Con → ∀a∈V (U ⊢ a) → V ⊢ b → U ⊢ b.

The elements U of Con are called formal neighborhoods. We use U, V, W to denote finite sets, write U ⊢ V to mean ∀a∈V (U ⊢ a), and say that a, b are consistent to mean {a, b} ∈ Con. Clearly

U ⊢ V → U ∪ V ∈ Con,
U1 ⊇ U ⊢ V ⊇ V1 → U1 ⊢ V1,
U ⊢ V ⊢ W → U ⊢ W.
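The axioms can be checked mechanically on a tiny concrete instance. Below is a minimal Python sketch; the three-element carrier {0, 1, 2} and the names `con` and `entails` are illustrative assumptions, not from the text (the system coded here is the flat one defined further below):

```python
from itertools import chain, combinations

A = [0, 1, 2]  # tokens of a small flat information system (assumed carrier)

def con(u):
    """Consistent sets of the flat system: the empty set and singletons."""
    return len(u) <= 1

def entails(u, a):
    """U |- a iff U is consistent and a is a member of U."""
    return con(u) and a in u

subsets = [frozenset(s)
           for s in chain.from_iterable(combinations(A, r) for r in range(4))]

for v in subsets:
    for u in subsets:
        if con(v) and u <= v:
            assert con(u)                 # axiom 1: subsets stay consistent
for a in A:
    assert con(frozenset({a}))            # axiom 2: singletons are consistent
for u in subsets:
    for a in A:
        if entails(u, a):
            assert con(u | {a})           # axiom 3: entailed tokens can be added
        if con(u) and a in u:
            assert entails(u, a)          # axiom 4: members are entailed
for u in subsets:
    for v in subsets:
        if con(u) and con(v) and all(entails(u, a) for a in v):
            for b in A:
                if entails(v, b):
                    assert entails(u, b)  # axiom 5: transitivity
```

Running the block silently verifies all five axioms for this instance.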

Definition. The ideals (also called objects) of an information system A = (A, Con, ⊢) are defined to be those subsets x of A which satisfy

U ⊆ x → U ∈ Con (x is consistent),
x ⊇ U ⊢ a → a ∈ x (x is deductively closed).

The set of all ideals of A is denoted by |A|.

Example. Any countable set A can be turned into a flat information system by letting the set of tokens be A and

Con := {∅} ∪ { {a} | a ∈ A } and U ⊢ a := a ∈ U.

For A = N we have the following picture of the Con-sets.

[Picture: the Con-sets of the flat system over N — the singletons {0}, {1}, {2}, . . . , pairwise inconsistent, sitting above ∅.]

In this case the ideals are just the elements of Con. A rather important example is the following, which concerns approximations of functions from a countable set A into a countable set B. The tokens are the pairs (a, b) with a ∈ A and b ∈ B, and

Con := { {(ai, bi) | i < k} | ∀i,j<k(ai = aj → bi = bj) },
U ⊢ (a, b) := (a, b) ∈ U.

It is not difficult to verify that this defines an information system whose ideals are (the graphs of) all partial functions from A to B.
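The consistency condition of this example is just single-valuedness, so it is easy to animate. A minimal sketch (the names `fun_con` and `fun_entails` are introduced here for illustration):

```python
def fun_con(u):
    """U is consistent iff it is single-valued, i.e. the graph of a
    finite partial function."""
    seen = {}
    for (a, b) in u:
        if a in seen and seen[a] != b:
            return False
        seen[a] = b
    return True

def fun_entails(u, token):
    """U |- (a, b) iff (a, b) is already a member of U."""
    return fun_con(u) and token in u

# {(0,'x'), (1,'y')} approximates a partial function; {(0,'x'), (0,'y')}
# is inconsistent because it maps 0 to two different values.
assert fun_con({(0, 'x'), (1, 'y')})
assert not fun_con({(0, 'x'), (0, 'y')})
assert fun_entails({(0, 'x')}, (0, 'x'))
```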

Yet another example is provided by any fixed partial functional Φ. A token should now be a pair (ϕ0, n) where ϕ0 is a finite function and Φ(ϕ0) is defined with value n. Thus if we take Con to be the set of all finite sets of tokens and for U := {(ϕi, ni) | i = 1 . . . k} define U ⊢ (ϕ0, n) if and only if ϕ0 extends some ϕi, then this structure becomes an information system. The ideals in this case are all sets x of tokens with the property that whenever (ϕ0, n) belongs to x, then also all (ϕ′0, n) with ϕ′0 extending ϕ0 belong to x.

6.1.2. Domains with countable basis.

Definition. (D, ⊑, ⊥) is a complete partial ordering (cpo for short), if ⊑ is a partial ordering (i.e., reflexive, transitive and antisymmetric) on D with least element ⊥, and moreover every directed subset S ⊆ D has a supremum ⊔S in D. Here S ⊆ D is called directed if for any x, y ∈ S there is a z ∈ S such that x ⊑ z and y ⊑ z.

Lemma. Let A = (A, Con, ⊢) be an information system. Then (|A|, ⊆, ∅) is a complete partial ordering with supremum operator ⋃.

Proof. Exercise.

Definition. Let (D, ⊑, ⊥) be a complete partial ordering. An element x ∈ D is called compact, if for every directed subset S ⊆ D with x ⊑ ⊔S there is a z ∈ S such that x ⊑ z. The set of all compact elements of D is called the basis of D; it is denoted by Dc.

Lemma. Let A = (A, Con, ⊢) be an information system. The compact elements of the complete partial ordering (|A|, ⊆, ∅) can be represented in the form

|A|c = { Ū | U ∈ Con },

where Ū := { a ∈ A | U ⊢ a } is the deductive closure of U.

Proof. Let z ∈ |A| be compact. We must show z = Ū for some U ∈ Con. The family { Ū | U ⊆ z } is directed (because for U, V ⊆ z an upper bound of Ū and V̄ is given by U ∪ V), and we have z ⊆ ⋃U⊆z Ū. Since z is compact, we have z ⊆ Ū for some U ⊆ z. Now z is deductively closed as well, hence Ū ⊆ z.

Conversely, let U ∈ Con. We must show Ū ∈ |A|c. Clearly Ū ∈ |A|. It remains to show that Ū is compact. So let S ⊆ |A| be a directed subset satisfying Ū ⊆ ⋃S. With U = {a1, . . . , an} we have ai ∈ zi ∈ S. Since S is directed, there is a z ∈ S with z1, . . . , zn ⊆ z, hence U ⊆ z and therefore Ū ⊆ z.
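For a finite token universe the deductive closure Ū, and hence each compact element, can be computed by a direct scan. A sketch under assumed names (`closure` takes the token universe and the two defining predicates as parameters; in the flat system closure adds nothing):

```python
def closure(tokens, con, entails, u):
    """The deductive closure of U: all tokens a with U |- a.
    Only sensible when `tokens` is a finite list."""
    assert con(u)
    return frozenset(a for a in tokens if entails(u, a))

# flat example: U |- a iff a is in U, so the closure of U is U itself
tokens = [0, 1, 2]
con = lambda u: len(u) <= 1
entails = lambda u, a: con(u) and a in u

assert closure(tokens, con, entails, frozenset({1})) == frozenset({1})
assert closure(tokens, con, entails, frozenset()) == frozenset()
```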

Definition. A complete partial ordering (D, ⊑, ⊥) is algebraic, if every x ∈ D is the supremum of its compact approximations:

x = ⊔{ u ∈ Dc | u ⊑ x }.

Lemma. Let A = (A, Con, ⊢) be an information system. Then (|A|, ⊆, ∅) is algebraic.

Proof. Assume x ∈ |A|. Clearly x = ⋃{ Ū | U ⊆ x }.

Definition. A complete partial ordering (D, ⊑, ⊥) is bounded complete (or consistently complete), if every bounded subset S of D has a least upper bound ⊔S in D. It is a domain (or Scott-Ersov domain), if it is algebraic and bounded complete.


Now we can prove that the ideals of an information system form a domain with a countable basis.

Theorem. For every information system A = (A, Con, ⊢) the structure (|A|, ⊆, ∅) is a domain, whose set of compact elements can be represented as |A|c = { Ū | U ∈ Con }.

Proof. We already noticed that (|A|, ⊆, ∅) is a complete partial ordering and algebraic. If S ⊆ |A| is bounded, then ⋃S is its least upper bound. Hence (|A|, ⊆, ∅) is bounded complete. The characterization of the compact elements has been proved above.

Remark. The converse is true as well: one can show easily that every domain with countable basis can be represented in the way just described, as the set of all ideals of an appropriate information system.

6.1.3. Function spaces. We now define the "function space" A → B between two information systems A and B.

Definition. Let A = (A, ConA, ⊢A) and B = (B, ConB, ⊢B) be information systems. Define A → B = (C, Con, ⊢) by

C := ConA × B,
{(Ui, bi) | i ∈ I} ∈ Con := ∀J⊆I(⋃j∈J Uj ∈ ConA → {bj | j ∈ J} ∈ ConB).

For the definition of the entailment relation ⊢ it is helpful to first define the notion of an application of W := {(Ui, bi) | i ∈ I} ∈ Con to U ∈ ConA:

{(Ui, bi) | i ∈ I} U := {bi | U ⊢A Ui}.

From the definition of Con we know that this set is in ConB. Now define W ⊢ (U, b) by WU ⊢B b.

Clearly application is monotone in the second argument, in the sense that U ⊢A U′ implies (WU′ ⊆ WU, hence also) WU ⊢B WU′. In fact, application is also monotone in the first argument, i.e.,

W ⊢ W′ implies WU ⊢B W′U.

To see this let W = {(Ui, bi) | i ∈ I} and W′ = {(U′j, b′j) | j ∈ J}. By definition W′U = {b′j | U ⊢A U′j}. Now fix j such that U ⊢A U′j; we must show WU ⊢B b′j. By assumption W ⊢ (U′j, b′j), hence WU′j ⊢B b′j, hence {bi | U′j ⊢A Ui} ⊢B b′j. It follows that

WU = {bi | U ⊢A Ui} by definition
⊇ {bi | U′j ⊢A Ui} because of U ⊢A U′j
⊢B b′j as shown above.

Lemma. If A and B are information systems, then so is A → B defined as above.

Proof. Let A = (A, ConA, ⊢A) and B = (B, ConB, ⊢B). The first, second and fourth property of the definition are clearly satisfied. For the third, suppose

{(U1, b1), . . . , (Un, bn)} ⊢ (U, b), i.e., {bj | U ⊢A Uj} ⊢B b.

We have to show that {(U1, b1), . . . , (Un, bn), (U, b)} ∈ Con. So let I ⊆ {1, . . . , n} and suppose

U ∪ ⋃i∈I Ui ∈ ConA.

We must show that {b} ∪ {bi | i ∈ I} ∈ ConB. Let J ⊆ {1, . . . , n} consist of those j with U ⊢A Uj. Then also

U ∪ ⋃i∈I Ui ∪ ⋃j∈J Uj ∈ ConA.

Since

⋃i∈I Ui ∪ ⋃j∈J Uj ∈ ConA,

from the consistency of {(U1, b1), . . . , (Un, bn)} we can conclude that

{bi | i ∈ I} ∪ {bj | j ∈ J} ∈ ConB.

But {bj | j ∈ J} ⊢B b by assumption. Hence

{bi | i ∈ I} ∪ {bj | j ∈ J} ∪ {b} ∈ ConB.

For the final property, suppose

W ⊢ {(U1, b1), . . . , (Un, bn)} and {(U1, b1), . . . , (Un, bn)} ⊢ (U, b).

We have to show W ⊢ (U, b), i.e., WU ⊢B b. We obtain

WU ⊢B ⋃{WUi | U ⊢A Ui} (monotonicity in the second argument)
⊢B {bi | U ⊢A Ui} because of W ⊢ (Ui, bi), i.e., WUi ⊢B bi
⊢B b by {(U1, b1), . . . , (Un, bn)} ⊢ (U, b).

This completes the proof.

We shall now give two alternative characterizations of the function space: firstly as "approximable maps", and secondly as continuous maps w.r.t. the so-called Scott topology.

The basic idea for approximable maps is the desire to study "information respecting" maps from A into B. Such a map is given by a relation r between ConA and B, where r(U, b) intuitively means that whenever we are given the information U ∈ ConA, then we know that at least the token b appears in the value.

Definition. Let A = (A, ConA, ⊢A) and B = (B, ConB, ⊢B) be information systems. A relation r ⊆ ConA × B is an approximable map if it satisfies
(a) If r(U, b1), . . . , r(U, bn), then {b1, . . . , bn} ∈ ConB;
(b) If r(U, b1), . . . , r(U, bn) and {b1, . . . , bn} ⊢B b, then r(U, b);
(c) If r(U′, b) and U ⊢A U′, then r(U, b).
We write r : A → B to mean that r is an approximable map from A to B.

Theorem. Let A and B be information systems. Then the ideals of A → B are exactly the approximable maps from A to B.


Proof. Let A = (A, ConA, ⊢A) and B = (B, ConB, ⊢B). If r ∈ |A → B| then r ⊆ ConA × B is consistent and deductively closed. We have to show that r satisfies the axioms for approximable maps.

(a). Let r(U, b1), . . . , r(U, bn). We must show that {b1, . . . , bn} ∈ ConB. But this clearly follows from the consistency of r.

(b). Let r(U, b1), . . . , r(U, bn) and {b1, . . . , bn} ⊢B b. We must show that r(U, b). But

{(U, b1), . . . , (U, bn)} ⊢ (U, b)

by the definition of the entailment relation ⊢ in A → B, hence r(U, b) by the deductive closure of r.

(c). Let U ⊢A U′ and r(U′, b). We must show that r(U, b). But

{(U′, b)} ⊢ (U, b)

since {(U′, b)}U = {b} (which follows from U ⊢A U′), hence again r(U, b) by the deductive closure of r.

For the other direction suppose that r : A → B is an approximable map. We must show that r ∈ |A → B|.

Consistency of r. Suppose r(U1, b1), . . . , r(Un, bn) and U = ⋃{Ui | i ∈ I} ∈ ConA for some I ⊆ {1, . . . , n}. We must show that {bi | i ∈ I} ∈ ConB. Now from r(Ui, bi) and U ⊢A Ui we obtain r(U, bi) by axiom (c) for all i ∈ I, and hence {bi | i ∈ I} ∈ ConB by axiom (a).

Deductive closure of r. Suppose r(U1, b1), . . . , r(Un, bn) and

W := {(U1, b1), . . . , (Un, bn)} ⊢ (U, b).

We must show r(U, b). By definition of ⊢ for A → B we have WU ⊢B b, which is {bi | U ⊢A Ui} ⊢B b. Further by our assumption r(Ui, bi) we know r(U, bi) by axiom (c) for all i with U ⊢A Ui. Hence r(U, b) by axiom (b).

Definition. Suppose A = (A, Con, ⊢) is an information system and U ∈ Con. Define OU ⊆ |A| by

OU := { x ∈ |A| | U ⊆ x }.

Note that, since the ideals x ∈ |A| are deductively closed, x ∈ OU implies Ū ⊆ x.

Lemma. The system of all OU with U ∈ Con forms the basis of a topology on |A|, called the Scott topology.

Proof. Suppose U, V ∈ Con and x ∈ OU ∩ OV. We have to find W ∈ Con such that x ∈ OW ⊆ OU ∩ OV. Choose W = U ∪ V.

Lemma. Let A be an information system and O ⊆ |A|. Then the following are equivalent.
(a) O is open in the Scott topology.
(b) O satisfies
 (i) If x ∈ O and x ⊆ y, then y ∈ O (Alexandrov condition).
 (ii) If x ∈ O then Ū ∈ O for some U ⊆ x (Scott condition).
(c) O = ⋃{ OU | Ū ∈ O }.

Hence open sets O may be seen as those determined by a (possibly infinite) system of finitely observable properties, namely all U such that Ū ∈ O.


Proof. (a) → (b). If O is open, then O is the union of some OU's, U ∈ Con. Since each OU is upwards closed, so is O; this proves the Alexandrov condition. For the Scott condition assume x ∈ O. Then x ∈ OU ⊆ O for some U ∈ Con. Note that Ū ∈ OU, hence Ū ∈ O, and U ⊆ x since x ∈ OU.

(b) → (c). Assume that O ⊆ |A| satisfies the Alexandrov and Scott conditions. Let x ∈ O. By the Scott condition, Ū ∈ O for some U ⊆ x, so x ∈ OU for this U. Conversely, let x ∈ OU for some U with Ū ∈ O. Then Ū ⊆ x. Now x ∈ O follows from Ū ∈ O by the Alexandrov condition.

(c) → (a). The OU's are the basic open sets of the Scott topology.
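On a finite example both conditions can be checked by brute force. A sketch using the flat system over the assumed carrier {0, 1}, whose ideals are the empty ideal and the two singletons (`basic_open` is a name introduced here):

```python
# Ideals of the flat information system over {0, 1} (illustrative choice).
ideals = [frozenset(), frozenset({0}), frozenset({1})]

def basic_open(u):
    """O_U := { x ideal | U ⊆ x }."""
    return [x for x in ideals if u <= x]

u = frozenset({0})
o = basic_open(u)

# Alexandrov condition: O_U is upwards closed among the ideals
assert all(y in o for x in o for y in ideals if x <= y)

# Scott condition: every x in O_U contains U, and the closure of U
# (= U itself in the flat case) is again a member of O_U
assert all(u <= x for x in o)
assert u in o
```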

We now give some simple characterizations of the continuous functions f : |A| → |B|. Call f monotone if x ⊆ y implies f(x) ⊆ f(y).

Lemma. Let A and B be information systems and f : |A| → |B|. Then the following are equivalent.
(a) f is continuous w.r.t. the Scott topology.
(b) f is monotone and satisfies the "Principle of Finite Support" PFS: If b ∈ f(x), then b ∈ f(Ū) for some U ⊆ x.
(c) f is monotone and commutes with directed unions: for every directed D ⊆ |A|

f(⋃x∈D x) = ⋃x∈D f(x).

Note that in (c) the set { f(x) | x ∈ D } is directed by monotonicity of f, hence its union is indeed an ideal in |B|. Note also that from PFS it follows immediately that if V ⊆ f(x), then V ⊆ f(Ū) for some U ⊆ x.

Hence continuous maps f : |A| → |B| are those that can be completely described from the point of view of finite approximations of the abstract objects x ∈ |A| and f(x) ∈ |B|: Whenever we are given a finite approximation V to the value f(x), then there is a finite approximation U to the argument x such that already f(Ū) contains the information in V; note that by monotonicity f(Ū) ⊆ f(x).

Proof. (a) → (b). Let f be continuous. Then for any basic open set OV ⊆ |B| (so V ∈ ConB) the set f⁻¹[OV] = { x | V ⊆ f(x) } is open in |A|. To prove monotonicity assume x ⊆ y; we must show f(x) ⊆ f(y). So let b ∈ f(x), i.e., {b} ⊆ f(x). The open set f⁻¹[O{b}] = { z | {b} ⊆ f(z) } satisfies the Alexandrov condition, so from x ⊆ y we can infer {b} ⊆ f(y), i.e., b ∈ f(y). To prove PFS assume b ∈ f(x). The open set { z | {b} ⊆ f(z) } satisfies the Scott condition, so for some U ⊆ x we have {b} ⊆ f(Ū).

(b) → (a). Assume that f satisfies monotonicity and PFS. We must show that f is continuous, i.e., that for any fixed V ∈ ConB the set f⁻¹[OV] = { x | V ⊆ f(x) } is open. We prove

{ x | V ⊆ f(x) } = ⋃{ OU | U ∈ ConA and V ⊆ f(Ū) }.

Let V ⊆ f(x). Then by PFS V ⊆ f(Ū) for some U ∈ ConA such that U ⊆ x, and U ⊆ x implies x ∈ OU. Conversely, let x ∈ OU for some U ∈ ConA such that V ⊆ f(Ū). Then Ū ⊆ x, hence V ⊆ f(x) by monotonicity.

For (b) ↔ (c) assume that f is monotone. Let f satisfy PFS, and let D ⊆ |A| be directed. f(⋃x∈D x) ⊇ ⋃x∈D f(x) follows from monotonicity. For the reverse inclusion let b ∈ f(⋃x∈D x). Then by PFS b ∈ f(Ū) for some U ⊆ ⋃x∈D x. From the directedness and the fact that U is finite we obtain U ⊆ z for some z ∈ D. From b ∈ f(Ū) and monotonicity infer b ∈ f(z). Conversely, let f commute with directed unions, and assume b ∈ f(x). Then

b ∈ f(x) = f(⋃U⊆x Ū) = ⋃U⊆x f(Ū),

hence b ∈ f(Ū) for some U ⊆ x.

Clearly the identity and constant functions are continuous, and so is the composition g ∘ f of continuous functions f : |A| → |B| and g : |B| → |C|.

Theorem. Let A and B = (B, ConB, ⊢B) be information systems. Then the ideals of A → B are in a natural bijective correspondence with the continuous functions from |A| to |B|, as follows.
(a) With any approximable map r : A → B we can associate a continuous function |r| : |A| → |B| by

|r|(z) := { b ∈ B | r(U, b) for some U ⊆ z }.

We call |r|(z) the application of r to z.
(b) Conversely, with any continuous function f : |A| → |B| we can associate an approximable map f̂ : A → B by

f̂(U, b) := (b ∈ f(Ū)).

These assignments are inverse to each other, i.e., f = |f̂| and r = (|r|)̂.
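The application |r|(z) of part (a) is immediate to compute when r and z are finite. A sketch (the encoding of r as a finite set of pairs (U, b), with U a frozenset of A-tokens, is an assumed convention):

```python
def ideal_apply(r, z):
    """|r|(z) := {b | r(U, b) for some U ⊆ z}."""
    return frozenset(b for (u, b) in r if u <= z)

# illustrative map: token 0 yields 'zero', token 1 yields 'one'
r = {(frozenset({0}), 'zero'), (frozenset({1}), 'one')}

assert ideal_apply(r, frozenset({0})) == {'zero'}
assert ideal_apply(r, frozenset({0, 1})) == {'zero', 'one'}
assert ideal_apply(r, frozenset()) == frozenset()
```

Monotonicity and PFS are visible in the code: enlarging z can only enlarge the set comprehension, and each b is already produced by the finite U that witnesses it.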

Proof. Let r be an ideal of A → B; then by the theorem just proved r is an approximable map. We first show that |r| is well-defined. So let z ∈ |A|.

|r|(z) is consistent: let b1, . . . , bn ∈ |r|(z). Then there are U1, . . . , Un ⊆ z such that r(Ui, bi). Hence U := U1 ∪ · · · ∪ Un ⊆ z and r(U, bi) by axiom (c) of approximable maps. Now from axiom (a) we can conclude that {b1, . . . , bn} ∈ ConB.

|r|(z) is deductively closed: let b1, . . . , bn ∈ |r|(z) and {b1, . . . , bn} ⊢B b. We must show b ∈ |r|(z). As before we find U ⊆ z such that r(U, bi). Now from axiom (b) we can conclude r(U, b) and hence b ∈ |r|(z).

To prove continuity of |r| let V ∈ ConB; we must show that |r|⁻¹[OV] is open. Now for every z ∈ |A|

z ∈ |r|⁻¹[OV] ↔ |r|(z) ∈ OV
↔ { b ∈ B | r(U, b) for some U ⊆ z } ∈ OV
↔ V ⊆ { b ∈ B | r(U, b) for some U ⊆ z }
↔ ∀b∈V ∃U (U ⊆ z ∧ r(U, b))
↔ ∀b∈V ∃U (z ∈ OU ∧ r(U, b))
↔ z ∈ ⋂b∈V ⋃{ OU | r(U, b) }.

Since V is finite this implies that |r|⁻¹[OV] is open.


Now let f : |A| → |B| be continuous. It is easy to verify that f̂ is indeed an approximable map. Furthermore

b ∈ |f̂|(z) ↔ f̂(U, b) for some U ⊆ z
↔ b ∈ f(Ū) for some U ⊆ z
↔ b ∈ f(z) by monotonicity and PFS.

Finally, for any approximable map r : A → B we have

r(U, b) ↔ ∃V⊆Ū r(V, b) by axiom (c) for approximable maps
↔ b ∈ |r|(Ū)
↔ (|r|)̂(U, b),

so r = (|r|)̂.

From now on we will usually write r(z) for |r|(z), and similarly f(U, b) for f̂(U, b). It should always be clear from the context where the mods and hats should be inserted.

6.1.4. Algebras and types. We now consider concrete information systems, our basis for continuous functionals.

Types will be built from base types by the formation of function types, ρ → σ. As domains for the base types we choose non-flat and possibly infinitary free algebras, given by their constructors. The main reason for taking non-flat base domains is that we want the constructors to be injective and with disjoint ranges. This generally is not the case for flat domains.

Definition (Algebras and types). Let ξ, ~α be distinct type variables; the αl are called type parameters. We inductively define type forms ρ, σ, τ ∈ Ty(~α), constructor type forms κ ∈ KTξ(~α) and algebra forms ι ∈ Alg(~α); all these are called strictly positive in ~α. In case ~α is empty we abbreviate Ty(~α) by Ty and call its elements types rather than type forms; similarly for the other notions.

αl ∈ Ty(~α);
if ι ∈ Alg(~α), then ι ∈ Ty(~α);
if ρ ∈ Ty and σ ∈ Ty(~α), then ρ → σ ∈ Ty(~α);
if κ0, . . . , κk−1 ∈ KTξ(~α), then µξ(κ0, . . . , κk−1) ∈ Alg(~α) (k ≥ 1);
if ~ρ ∈ Ty(~α) and ~σ0, . . . , ~σn−1 ∈ Ty, then ~ρ → (~σν → ξ)ν<n → ξ ∈ KTξ(~α) (n ≥ 0).

We use ι for algebra forms and ρ, σ, τ for type forms. ~ρ → σ means ρ0 → · · · → ρn−1 → σ, associated to the right. For ~ρ → (~σν → ξ)ν<n → ξ ∈ KTξ(~α) call ~ρ the parameter argument types and the ~σν → ξ recursive argument types. To avoid empty types, we require that there is a nullary constructor type, i.e., one without recursive argument types.

Here are some examples of algebras.

U := µξ ξ (unit),
B := µξ(ξ, ξ) (booleans),
N := µξ(ξ, ξ → ξ) (natural numbers, unary),
P := µξ(ξ, ξ → ξ, ξ → ξ) (positive numbers, binary),
D := µξ(ξ, ξ → ξ → ξ) (binary trees, or derivations),
O := µξ(ξ, ξ → ξ, (N → ξ) → ξ) (ordinals),
T0 := N, Tn+1 := µξ(ξ, (Tn → ξ) → ξ) (trees).

Important examples of algebra forms are

L(α) := µξ(ξ, α → ξ → ξ) (lists),
α × β := µξ(α → β → ξ) (product),
α + β := µξ(α → ξ, β → ξ) (sum).

Remark (Substitution for type parameters). Let ρ ∈ Ty(~α); we write ρ(~α) for ρ to indicate its dependence on the type parameters ~α. We can substitute types ~σ for ~α, to obtain ρ(~σ). Examples are L(B), the type of lists of booleans, and N × N, the type of pairs of natural numbers.

Note that often there are many equivalent ways to define a particular type. For instance, we could take U + U to be the type of booleans, L(U) to be the type of natural numbers, and L(B) to be the type of positive binary numbers.

For every constructor type κi we provide a (typed) constructor symbol Ci. In some cases they have standard names, for instance

tt^B, ff^B for the two constructors of the type B of booleans,
0^N, S^(N→N) for the type N of (unary) natural numbers,
1^P, S0^(P→P), S1^(P→P) for the type P of (binary) positive numbers,
nil^(L(ρ)), cons^(ρ→L(ρ)→L(ρ)) for the type L(ρ) of lists,
(inl_ρσ)^(ρ→ρ+σ), (inr_ρσ)^(σ→ρ+σ) for the sum type ρ + σ.

We denote the constructors of the type D of derivations by 0^D (axiom) and C^(D→D→D) (rule).

One can extend the definition of algebras and types to simultaneously defined algebras: just replace ξ by a list ~ξ = ξ0, . . . , ξN−1 of type variables and change the algebra introduction rule to: if κ0, . . . , κk−1 ∈ KT~ξ(~α), then (µ~ξ(κ0, . . . , κk−1))j ∈ Alg(~α) (k ≥ 1).

The definition of a "nullary" constructor type is a little more delicate here. We require that for every ξj (j < N) there is a κij with final value type ξj, each of whose recursive argument types has a final value type ξjν with jν < j. — Examples of simultaneously defined algebras are

(Ev, Od) := µξ,ζ(ξ, ζ → ξ, ξ → ζ) (even and odd numbers),
(Ts(ρ), T(ρ)) := µξ,ζ(ξ, ζ → ξ → ξ, ρ → ζ, ξ → ζ) (tree lists and trees).

T(ρ) defines finitely branching trees, and Ts(ρ) finite lists of such trees; the trees carry objects of a type ρ at their leaves. The constructor symbols and their types are

Empty^(Ts(ρ)), Tcons^(T(ρ)→Ts(ρ)→Ts(ρ)), Leaf^(ρ→T(ρ)), Branch^(Ts(ρ)→T(ρ)).

However, for simplicity we often consider non-simultaneous algebras only.

An algebra is finitary if all its constructor types (i) only have finitary algebras as parameter argument types, and (ii) have recursive argument types of the form ξ only (so the ~σν in the general definition are all empty). Structure-finitary algebras are defined similarly, but without conditions on parameter argument types. In the examples above U, B, N, P and D are all finitary, but O and Tn+1 are not. L(ρ), ρ × σ and ρ + σ are structure-finitary, and finitary if their parameter types are. An argument position in a type is called finitary if it is occupied by a finitary algebra.

An algebra is explicit if all its constructor types have parameter argument types only (i.e., no recursive argument types). In the examples above U, B, ρ × σ and ρ + σ are explicit, but N, P, L(ρ), D, O and Tn+1 are not.

We will also need the notion of the level of a type, which is defined by

lev(ι) := 0, lev(ρ → σ) := max{lev(σ), 1 + lev(ρ)}.

Base types are types of level 0, and a higher type has level at least 1.
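The level function is a one-line structural recursion. A minimal sketch (the encoding of types as `'iota'` or `('->', rho, sigma)` is an assumption made for illustration):

```python
def lev(t):
    """lev(iota) = 0, lev(rho -> sigma) = max(lev(sigma), 1 + lev(rho))."""
    if t == 'iota':
        return 0
    _, rho, sigma = t  # t = ('->', rho, sigma)
    return max(lev(sigma), 1 + lev(rho))

arrow = lambda r, s: ('->', r, s)

assert lev('iota') == 0                                 # a base type
assert lev(arrow('iota', 'iota')) == 1                  # e.g. N -> N
assert lev(arrow(arrow('iota', 'iota'), 'iota')) == 2   # e.g. (N -> N) -> N
```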

6.1.5. Partial continuous functionals. For every type ρ we define the information system Cρ = (Cρ, Conρ, ⊢ρ). The ideals x ∈ |Cρ| are the partial continuous functionals of type ρ. Since we will have Cρ→σ = Cρ → Cσ, the partial continuous functionals of type ρ → σ will be the continuous functions from |Cρ| to |Cσ| w.r.t. the Scott topology. It will not be possible to define Cρ by recursion on the type ρ, since we allow algebras with constructors having function arguments (like O and Sup). Instead, we shall use recursion on the "depth" of the notions involved, defined below.

Definition (Information system of a type ρ). We simultaneously define Cι, Cρ→σ, Conι and Conρ→σ.
(a) The tokens a ∈ Cι are the type correct constructor expressions Ca*1 . . . a*n, where a*i is an extended token, i.e., a token or the special symbol ∗ which carries no information.
(b) The tokens in Cρ→σ are the pairs (U, b) with U ∈ Conρ and b ∈ Cσ.
(c) A finite set U of tokens in Cι is consistent (i.e., ∈ Conι) if all its elements start with the same constructor C, say of arity τ1 → . . . → τn → ι, and all Ui ∈ Conτi for i = 1, . . . , n, where Ui := {a*1i, . . . , a*mi} for U = {C~a*1, . . . , C~a*m} (with U ∪ {∗} := U).
(d) {(Ui, bi) | i ∈ I} ∈ Conρ→σ is defined to mean ∀J⊆I(⋃j∈J Uj ∈ Conρ → {bj | j ∈ J} ∈ Conσ).

Building on this definition, we define U ⊢ρ a (for U ∈ Conρ, a ∈ Cρ) and application WU of W = {(Ui, bi) | i ∈ I} ∈ Conρ→σ to U ∈ Conρ.
(e) {C~a*1, . . . , C~a*m} ⊢ι C′~a* is defined to mean C = C′, m ≥ 1 and Ui ⊢ a*i, with Ui as in (c) above (and U ⊢ ∗ taken to be true).
(f) {(Ui, bi) | i ∈ I} ⊢ρ→σ (U, b) is defined to mean WU ⊢σ b, with W := {(Ui, bi) | i ∈ I}.
(g) Application WU of W = {(Ui, bi) | i ∈ I} ∈ Conρ→σ to U ∈ Conρ is defined to be {bi | U ⊢ρ Ui}; recall that U ⊢ V abbreviates ∀a∈V (U ⊢ a).


[Figure 1. Tokens and entailment for N: the tokens are 0, S∗, S0, S(S∗), S(S0), S(S(S∗)), S(S(S0)), . . . , arranged so that entailment runs along the paths between them.]

If we define the depth of the syntactic expressions involved by

dp(Ca*1 . . . a*n) := 1 + max{dp(a*i) | i = 1, . . . , n}, dp(∗) := 0,
dp((U, b)) := max{1 + dp(U), 1 + dp(b)},
dp({ai | i ∈ I}) := max{1 + dp(ai) | i ∈ I},
dp(U ⊢ a) := max{1 + dp(U), 1 + dp(a)},
dp(WU) := max{1 + dp(W), 1 + dp(U)},

these are definitions by recursion on the depth.

It is easy to see that (Cρ, Conρ, ⊢ρ) is an information system. Observe that all the notions involved are computable: a ∈ Cρ, U ∈ Conρ, U ⊢ a and WU.
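The token-side depth function is easy to write out. A sketch under an assumed tuple encoding of constructor tokens (`(name, arg1, ..., argn)`; for a nullary constructor the max over the empty set is taken to be 0, an assumption about the boundary case):

```python
STAR = '*'

def dp(a):
    """dp(*) = 0 and dp(C a*_1 ... a*_n) = 1 + max{dp(a*_i) | i}."""
    if a == STAR:
        return 0
    _name, *args = a
    return 1 + max((dp(x) for x in args), default=0)

assert dp(STAR) == 0
assert dp(('0',)) == 1                      # nullary constructor token
assert dp(('S', ('S', STAR))) == 2          # S(S*)
assert dp(('S', ('S', ('0',)))) == 3        # S(S0)
```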

Definition (Partial continuous functionals). For every type ρ let Cρ be the information system (Cρ, Conρ, ⊢ρ). The set |Cρ| of ideals in Cρ is the set of partial continuous functionals of type ρ. A partial continuous functional x ∈ |Cρ| is computable if it is recursively enumerable when viewed as a set of tokens.

Notice that Cρ→σ = Cρ → Cσ as defined generally for information systems.

For example, the tokens for the algebra N are shown in Figure 1. A token a entails another one b if and only if there is a path from a (up) to b (down). As another (more typical) example, consider the algebra D of derivations with a nullary constructor 0 and a binary C. Then {C0∗, C∗0} is consistent, and {C0∗, C∗0} ⊢ C00.
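Since the constructors of N are at most unary, entailment from a consistent set reduces to entailment between single tokens (the general definition above works with sets U). A sketch of that special case, under the assumed tuple encoding of tokens:

```python
STAR = '*'
S = lambda t: ('S', t)  # tokens of N: '0', '*' or ('S', t)

def entails_tok(a, b):
    """Single-token entailment for N: a |- b iff b arises from a by
    pruning subterms down to * (so * is entailed by everything)."""
    if b == STAR:
        return True
    if a == '0':
        return b == '0'
    if isinstance(a, tuple) and isinstance(b, tuple):
        return entails_tok(a[1], b[1])
    return False

assert entails_tok(S(S('0')), S(STAR))     # S(S0) |- S*
assert entails_tok(S(S('0')), S(S('0')))   # entailment is reflexive
assert not entails_tok(S(STAR), S('0'))    # * carries no information
```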

6.1.6. Constructors as continuous functions. Let ι be an algebra. Every constructor C generates the following ideal in the function space:

rC := { (~U, C~a*) | ~U ⊢ ~a* }.

Here (~U, a) abbreviates (U1, (U2, . . . (Un, a) . . . )). According to the general definition of a continuous function associated to an ideal in a function space the continuous map |rC| satisfies

|rC|(~x) = { C~a* | ∃~U⊆~x(~U ⊢ ~a*) }.

An immediate consequence is that the (continuous maps corresponding to) constructors are injective and their ranges are disjoint, which is what we wanted to achieve by associating non-flat rather than flat information systems with algebras.


Lemma (Constructors are injective and have disjoint ranges). Let ι be an algebra and C be a constructor of ι. Then

|rC|(~x) ⊆ |rC|(~y) ↔ ~x ⊆ ~y.

If C1, C2 are distinct constructors of ι then

|rC1|(~x) ∩ |rC2|(~y) = ∅.

Proof. Immediate from the definitions.

Remark. Notice that neither property holds for flat information systems, since for them, by monotonicity, constructors need to be strict (i.e., if one argument is the empty ideal, then the value is as well). But then we have

|rC|(∅, y) = ∅ = |rC|(x, ∅),
|rC1|(∅) = ∅ = |rC2|(∅),

where in the first case we have one binary and, in the second, two unary constructors.

Lemma (Ideals of base type). Every non-empty ideal in the information system associated to an algebra has the form |rC|(~x) with a constructor C and ideals ~x.

Proof. Let z be a non-empty ideal and Ca*0b*0 ∈ z, where for simplicity we assume that C is a binary constructor. Let x := { a | Ca∗ ∈ z } and y := { b | C∗b ∈ z }; clearly x, y are ideals. We claim that z = |rC|(x, y). For ⊇ consider Ca*b* with a* ∈ x ∪ {∗} and b* ∈ y ∪ {∗}. Then by definition {Ca*∗, C∗b*} ⊆ z, hence Ca*b* ∈ z by deductive closure. Conversely, notice that an arbitrary element of z must have the form Ca*b*, because of consistency. Then {Ca*∗, C∗b*} ⊆ z again by deductive closure. Hence a* ∈ x ∪ {∗} and b* ∈ y ∪ {∗}, and therefore Ca*b* ∈ |rC|(x, y).

It is in this proof that we need entailment to be a relation between finite sets of tokens and single tokens, not just a binary relation between tokens. Information systems with the latter property are called atomic.

The information systems Cρ enjoy the pleasant property of coherence, which amounts to the possibility to locate inconsistencies in two-element sets of data objects. Generally, an information system A = (A, Con, ⊢) is coherent if it satisfies: U ⊆ A is consistent if and only if all of its two-element subsets are. Clearly all Cι are coherent, and moreover we have

Lemma. Let A and B be information systems. If B is coherent, then so is A → B.

Proof. Let A = (A, ConA, ⊢A) and B = (B, ConB, ⊢B) be information systems, and consider {(U1, b1), ..., (Un, bn)} ⊆ ConA × B. Assume

∀ 1≤i<j≤n ({(Ui, bi), (Uj, bj)} ∈ Con).

We have to show {(U1, b1), ..., (Un, bn)} ∈ Con. Let I ⊆ {1, ..., n} and ⋃_{i∈I} Ui ∈ ConA. We must show { bi | i ∈ I } ∈ ConB. Now since B is coherent by assumption, it suffices to show that {bi, bj} ∈ ConB for all i, j ∈ I. So let i, j ∈ I. By assumption we have Ui ∪ Uj ∈ ConA, and hence {bi, bj} ∈ ConB.

6.1. ABSTRACT COMPUTABILITY VIA INFORMATION SYSTEMS 213

Corollary. The information systems Cρ are all coherent.

6.1.7. Total and cototal ideals in a finitary algebra. In the information system Cι associated with an algebra ι, the "total" and "cototal" ideals are of special interest. Here we give an explicit definition for finitary algebras. For general algebras totality can be defined inductively and cototality coinductively (cf. 7.1.6).

Recall that a token in ι is a constructor tree P possibly containing the special symbol ∗. Because of the possibility of parameter arguments we need to distinguish between "structure-" and "fully" total and cototal ideals. For the definition it is easiest to refer to a constructor tree P(∗) with a distinguished occurrence of ∗. This occurrence is called non-parametric if the path from it to the root does not pass through a parameter argument of a constructor. For a constructor tree P(∗), an arbitrary P(C~∗) is called a one-step extension of P(∗), written P(C~∗) ≻1 P(∗).

Definition. Let ι be an algebra, and Cι its associated information system. An ideal x ∈ |Cι| is cototal if every constructor tree P(∗) ∈ x has a ≻1-predecessor P(C~∗) ∈ x; it is called total if it is cototal and the relation ≻1 on x is well-founded. It is called structure-cototal (structure-total) if the same holds with ≻1 defined w.r.t. P(∗) with a non-parametric distinguished occurrence of ∗.

If there are no parameter arguments, we shall simply speak of total and cototal ideals. For example, for the algebra N every total ideal is the deductive closure of a token S(S...(S0)...), and the set of all tokens S(S...(S∗)...) is a cototal ideal. For the algebra D of derivations the total ideals can be viewed as the finite derivations, and the cototal ones as the finite or infinite "locally correct" derivations of Mints (1978); arbitrary ideals can be viewed as "partial" or "incomplete" derivations, with "holes".

We consider two examples of algebras whose cototal ideals are of interest. The first one concerns the algebra I := μξ(ξ, ξ → ξ, ξ → ξ, ξ → ξ) of standard rational intervals, whose constructors we name I (for the initial interval [−1, 1]) and C−1, C0, C1 (for the left, middle and right part of the argument interval, of half its length). For example, C−1 I, C0 I and C1 I should be viewed as the intervals [−1, 0], [−1/2, 1/2] and [0, 1]. Every total ideal then can be seen as a standard interval

I_{i·2^{−k}, k} := [i/2^k − 1/2^k, i/2^k + 1/2^k]   for −2^k < i < 2^k.

However, the cototal ideals include {C^n_{−1} I | n ≥ 0}, which can be seen as a "stream" representation of the real −1, and also {I} ∪ {C^n_1 C_{−1} I | n ≥ 0} and {I} ∪ {C^n_{−1} C_1 I | n ≥ 0}, which both represent the real 0. Generally, the cototal ideals give us all reals in [−1, 1], in the well-known (non-unique) stream representation using "signed digits" from {−1, 0, 1}.
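The correspondence between finite signed-digit prefixes and intervals can be sketched in Python (a decoding of our own devising, not from the text): a prefix d1, ..., dk denotes the interval [s − 2^{−k}, s + 2^{−k}] with midpoint s = Σ di·2^{−i}.

```python
from fractions import Fraction

def interval(digits):
    # Interval denoted by a finite signed-digit prefix: start with I = [-1, 1];
    # each digit d halves the width and shifts the midpoint by d * width.
    mid, width = Fraction(0), Fraction(1)
    for d in digits:
        width /= 2
        mid += d * width
    return (mid - width, mid + width)
```

For instance, interval([-1]) is [−1, 0] (the token C−1 I), and longer all-(−1) prefixes shrink towards the real −1.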

The second example concerns the simultaneously defined algebras

(W, R) := μ_{ξ,ζ}(ξ, ζ → ξ, ξ → ζ, ξ → ζ, ξ → ζ, ζ → ζ → ζ → ζ).

The constructors with their type and intended meaning are

W0 : W               stop,
W : R → W            quit writing and go into read mode,


Rd : W → R           quit reading and write d (d ∈ {−1, 0, 1}),
R : R → R → R → R    read the next digit and stay in read mode.

Consider a well-founded "read tree", i.e., a constructor tree built from R (ternary) with Rd at its leaves. The digit d at a leaf means that, after reading all input digits on the path leading to the leaf, the output d is written. Notice that the tree may consist of a single leaf Rd, which means that, without any input, d is written as output. Let R_{d1}, ..., R_{dn} be all leaves of such a well-founded tree. At a leaf R_{di} we continue with W (indicating that we now write di), and continue with another such well-founded read tree, and carry on. The result is a "W-cototal R-total" ideal, which can be viewed as a representation of a uniformly continuous real function f : I → I. For example, let P := R(R1 W P, R0 W P, R−1 W P). Then P represents the function f(x) := −x, and R0 W P represents the function f(x) := −x/2.
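A small Python sketch may help to see how such an ideal acts on signed-digit streams. We encode a depth-one read tree (enough for P above) as a dictionary mapping each input digit to the digit written and a thunk for the next tree; the encoding and names are ours, and we assume the three R-branches correspond to input digits −1, 0, 1.

```python
def negate():
    # P := R(R1 W P, R0 W P, R-1 W P): on reading digit d, write -d and repeat.
    return {-1: (1, negate), 0: (0, negate), 1: (-1, negate)}

def run(tree, digits):
    # Interpret a (depth-one) read/write tree on a stream of signed digits.
    for d in digits:
        out, cont = tree[d]
        yield out          # write phase
        tree = cont()      # back to read mode with the next tree
```

run(negate(), stream) negates the stream digitwise, i.e., computes f(x) = −x on representations.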

6.2. Denotational and Operational Semantics

For every type ρ, we have defined what a partial continuous functional of type ρ is: an ideal consisting of tokens at this type. These tokens, or rather the formal neighborhoods formed from them, are syntactic in nature; they are reminiscent of Kreisel's "formal neighborhoods" (Kreisel, 1959; Martin-Lof, 1983; Coquand and Spiwack, 2006). However – in contrast to Martin-Lof (1983) – we do not have to deal separately with a notion of consistency for formal neighborhoods: this concept is built into information systems.

Let us now turn our attention to a formal (functional programming) language, in the style of Plotkin's PCF (1977), and see how we can provide a denotational semantics (that is, a "meaning") for the terms of this language. A closed term M of type ρ will denote a partial continuous functional of this type, that is, a consistent and deductively closed set of tokens of type ρ. We will define this set inductively.

It will turn out that these sets are recursively enumerable. In this sense every closed term M of type ρ denotes a computable partial continuous functional of type ρ. However, it is not a good idea to define a computable functional in this way, by providing a recursive enumeration of its tokens. We rather want to be able to use recursion equations for such definitions. Therefore we extend the term language by constants D defined by certain "computation rules", as in (Berger et al., 2003; Berger, 2005a). Our semantics will cover these as well. The resulting term system can be seen as a common extension of Godel's T (1958) and Plotkin's PCF; we call it T+. There are some natural questions one can ask for such a term language:
(a) Preservation of values under conversion (as in (Martin-Lof, 1983, First Theorem)). Here we need to include applications of computation rules.
(b) An adequacy theorem (cf. (Plotkin, 1977, Theorem 3.1) or (Martin-Lof, 1983, Second Theorem)), which in our setting says that whenever a closed term has a proper token in the ideal it denotes, then it evaluates to a constructor term entailing this token.

Property (a) will be proved in 6.2.7, and (b) in 6.2.8.

6.2.1. Structural recursion operators and Godel's T. We begin with a discussion of particularly important examples of such constants D,


the (structural) higher type recursion operators R^{~ι,~τ}_j introduced by Godel (1958). They are used to construct mappings from ιj to τj by recursion on the structure of ~ι. In order to define the type of the recursion operators w.r.t. ~ι = μ~ξ(κ0, ..., κ_{k−1}) and result types ~τ, we first define for each constructor type

κ = ~ρ → (~σν → ξ_{jν})_{ν<n} → ξj ∈ KT_~ξ

the step type

δ := ~ρ → (~σν → ι_{jν})_{ν<n} → (~σν → τ_{jν})_{ν<n} → τj.

The j-th simultaneous recursion operator R^{~ι,~τ}_j then has type

ιj → δ0 → ... → δ_{k−1} → τj,

where k is the total number of constructors. The recursion argument is of type ιj. In the step type δ above, the ~ρ are parameter types, (~σν → ι_{jν})_{ν<n} are the types of the predecessor components in the recursion argument, and (~σν → τ_{jν})_{ν<n} are the types of the previously defined values. We will often omit the upper indices ~ι, ~τ when they are clear from the context. In case of a non-simultaneous free algebra we write R^τ_ι for R^{ι,τ}_1. — An example of a simultaneous recursion on tree lists and trees will be given below.

For some common algebras listed in 6.1.4 we spell out the types of their recursion operators:

R^τ_B : B → τ → τ → τ,
R^τ_N : N → τ → (N → τ → τ) → τ,
R^τ_P : P → τ → (P → τ → τ) → (P → τ → τ) → τ,
R^τ_{L(ρ)} : L(ρ) → τ → (ρ → L(ρ) → τ → τ) → τ,
R^τ_{ρ+σ} : ρ + σ → (ρ → τ) → (σ → τ) → τ,
R^τ_{ρ×σ} : ρ × σ → (ρ → σ → τ) → τ.
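As an illustration, R^τ_N can be sketched in Python, with naturals encoded as Peano terms; the tuple encoding and the names rec_N, to_peano are ours, not the book's.

```python
def rec_N(n, base, step):
    # R^tau_N n base step: value base for 0, step(pred, previous value) for S pred.
    if n == ('0',):
        return base
    _, pred = n
    return step(pred, rec_N(pred, base, step))

def to_peano(k):
    # embed a Python int as a Peano term ('S', ('S', ... ('0',) ...))
    n = ('0',)
    for _ in range(k):
        n = ('S', n)
    return n

def add(m, n):
    # addition by recursion on the first argument: 0 + n = n, (S m) + n = S(m + n)
    return rec_N(m, n, lambda pred, prev: ('S', prev))
```

Note that the step function receives both the predecessor and the previously defined value, exactly as in the type N → τ → (N → τ → τ) → τ.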

Definition. Terms of Godel's T are inductively defined from typed variables x^ρ and constants for constructors C^~ι_i and recursion operators R^{~ι,~τ}_j by abstraction λ_{x^ρ} M^σ and application M^{ρ→σ} N^ρ.

6.2.2. Conversion. To define the conversion relation for the structural recursion operators, it will be helpful to use the following notation. Let ~ι = μ~ξ ~κ,

κi = ρ0 → ... → ρ_{m−1} → (~σ0 → ξ_{j0}) → ... → (~σ_{n−1} → ξ_{j_{n−1}}) → ξj ∈ KT_~ξ,

and consider C^~ι_i ~N. We write ~N^P = N^P_0, ..., N^P_{m−1} for the parameter arguments N^{ρ0}_0, ..., N^{ρ_{m−1}}_{m−1}, and ~N^R = N^R_0, ..., N^R_{n−1} for the recursive arguments N^{~σ0→ι_{j0}}_m, ..., N^{~σ_{n−1}→ι_{j_{n−1}}}_{m+n−1}, and n^R for the number n of recursive arguments. We define a conversion relation ↦_ρ between terms of type ρ by

(6.1) (λ_x M(x))N ↦ M(N),
(6.2) λ_x(Mx) ↦ M   if x ∉ FV(M) (M not an abstraction),
(6.3) Rj(C^~ι_i ~N) ~M ↦ Mi ~N ((R_{j0} · ~M) ∘ N^R_0) ... ((R_{j_{n−1}} · ~M) ∘ N^R_{n−1}).


Here we have written Rj · ~M for λ_{x^{ιj}} (R^{~ι,~τ}_j x^{ιj} ~M). The rule (6.1) is called β-conversion, and (6.2) η-conversion; their left hand sides are called β-redexes or η-redexes, respectively. The left hand side of (6.3) is called an R-redex; it is a special case of a redex associated with a constant D defined by "computation rules" (cf. 6.2.4), and hence also called a D-redex.

Let us look at some examples of what can be defined in Godel's T. We define the canonical inhabitant ε^ρ of a type ρ ∈ Ty:

ε^{ιj} := C^~ι_j ε^~ρ (λ_{~x1} ε^{ι_{j1}}) ... (λ_{~xn} ε^{ι_{jn}}),   ε^{ρ→σ} := λ_x ε^σ.

The projections of a pair to its components can be defined easily:

M0 := R^ρ_{ρ×σ} M^{ρ×σ} (λ_{x^ρ, y^σ} x^ρ),   M1 := R^σ_{ρ×σ} M^{ρ×σ} (λ_{x^ρ, y^σ} y^σ).

The append-function ∗ for lists is defined recursively as follows. We write x :: l as shorthand for cons(x, l):

nil ∗ l2 := l2,   (x :: l1) ∗ l2 := x :: (l1 ∗ l2).

It can be defined as the term

l1 ∗ l2 := R^{L(α)→L(α)}_{L(α)} l1 (λ_{l2} l2) (λ_{x, l1, p, l2} (x :: (p l2))) l2.

Using the append function ∗ we can define list reversal Rev by

Rev(nil) := nil,   Rev(x :: l) := Rev(l) ∗ (x :: nil).

The corresponding term is

Rev(l) := R^{L(α)}_{L(α)} l nil (λ_{x, l1, p} (p ∗ (x :: nil))).
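The list recursion operator and the two definitions above can be sketched in Python, with lists encoded as nested tuples (our encoding; rec_L plays the role of R_{L(α)}):

```python
NIL = ('nil',)

def rec_L(l, base, step):
    # R_L l base step: base for nil, step(head, tail, previous value) for cons.
    if l == NIL:
        return base
    _, x, rest = l
    return step(x, rest, rec_L(rest, base, step))

def append(l1, l2):
    # nil * l2 = l2,  (x :: l1) * l2 = x :: (l1 * l2)
    return rec_L(l1, l2, lambda x, rest, prev: ('cons', x, prev))

def rev(l):
    # Rev(nil) = nil,  Rev(x :: l) = Rev(l) * (x :: nil)
    return rec_L(l, NIL, lambda x, rest, prev: append(prev, ('cons', x, NIL)))
```

Both functions use only the recursor, mirroring their definability as T-terms.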

Assume we want to define by simultaneous recursion two functions on N, say even, odd : N → B. We want

even(0) := tt,   odd(0) := ff,
even(Sn) := odd(n),   odd(Sn) := even(n).

This can be achieved by using pair types: we recursively define the single function evenodd : N → B × B. The step types are

δ0 = B × B,   δ1 = N → B × B → B × B,

and we can define evenodd m := R^{B×B}_N m ⟨tt, ff⟩ (λ_{n,p} ⟨p1, p0⟩).
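In Python the pairing trick looks as follows (a sketch with naturals as ints and the recursion unrolled into a loop; the names are ours):

```python
def evenodd(m):
    # base <tt, ff> = (even(0), odd(0)); the step swaps the pair,
    # so after m steps we hold (even(m), odd(m)).
    p = (True, False)
    for _ in range(m):
        p = (p[1], p[0])
    return p

def even(m): return evenodd(m)[0]
def odd(m):  return evenodd(m)[1]
```

A single recursion on N thus computes both functions at once.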

Another example concerns the algebras (Ts(N), T(N)) simultaneously defined in 6.1.4 (we write them without the argument N here), whose constructors C^{(Ts,T)}_i for i ∈ {0, ..., 3} are

Empty^Ts,   Tcons^{T→Ts→Ts},   Leaf^{N→T},   Branch^{Ts→T}.

Recall that the elements of the algebra T (i.e., T(N)) are just the finitely branching trees, which carry natural numbers on their leaves.

Let us compute the types of the recursion operators w.r.t. the result types τ0, τ1, i.e., of R^{(Ts,T),(τ0,τ1)}_Ts and R^{(Ts,T),(τ0,τ1)}_T, or shortly R_Ts and R_T. The step types are

δ0 := τ0,
δ1 := Ts → T → τ0 → τ1 → τ0,
δ2 := N → τ1,
δ3 := Ts → τ0 → τ1.

Hence the types of the recursion operators are

R_Ts : Ts → δ0 → δ1 → δ2 → δ3 → τ0,
R_T : T → δ0 → δ1 → δ2 → δ3 → τ1.

As a concrete example we recursively define addition ⊕ : Ts → T → Ts and + : T → T → T. The recursion equations to be satisfied are

⊕ Empty = λ_a Empty,
⊕ (Tcons b bs) = λ_a (Tcons (+ b a) (⊕ bs a)),
+ (Leaf n) = λ_a a,
+ (Branch bs) = λ_a (Branch (⊕ bs a)).

We define ⊕ and + by means of the recursion operators R_Ts and R_T with result types τ0 := T → Ts and τ1 := T → T. The step terms are

M0 := λ_a Empty,
M1 := λ_{bs, b, f, g, a} (Tcons (g^{τ1} a) (f^{τ0} a)),
M2 := λ_{n, a} a,
M3 := λ_{bs, f, a} (Branch (f^{τ0} a)).

Then

bs ⊕ a := R_Ts bs ~M a,   b + a := R_T b ~M a.

We finally introduce some special cases of structural recursion and also a generalization; both will be important later on.

Simplified simultaneous recursion. In a recursion on simultaneously defined algebras one may need to recur on some of those algebras only. Then we can simplify the type of the recursion operator accordingly, by
(i) omitting all step types δ^{~ι,~τ}_i with irrelevant value type τj, and
(ii) simplifying the remaining step types by omitting from the recursive argument types (~σν → τ_{jν})_{ν<n} and also from their algebra-duplicates (~σν → ι_{jν})_{ν<n} all those with irrelevant τ_{jν}.
In the (Ts,T)-example, if we only want to recur on Ts, then the step types are

δ0 := τ0,   δ1 := Ts → τ0 → τ0.

Hence the type of the simplified recursion operator is

R_Ts : Ts → δ0 → δ1 → τ0.

An example is the recursive definition of the length of a Ts. The recursion equations are

lh(Empty) = 0,   lh(Tcons b bs) = lh(bs) + 1.

The step terms are

M0 := 0,   M1 := λ_{bs, p}(p + 1).

Cases. There is an important variant of recursion, where no recursive calls occur. This variant is called the cases operator; it distinguishes cases according to the outer constructor form. Here all step types have the form

δ^{~ι,~τ}_i := ~ρ → (~σν → ι_{jν})_{ν<n} → τj.

The intended meaning of the cases operator is given by the conversion rule

(6.4) Cj(C^~ι_i ~N) ~M ↦ Mi ~N.


Notice that only those step terms are used whose value type is the present τj; this is due to the fact that there are no recursive calls. Therefore the type of the cases operator is

C^~ι_{ιj→τj} : ιj → δ_{i0} → ... → δ_{i_{q−1}} → τj,

where δ_{i0}, ..., δ_{i_{q−1}} consists of all δi with value type τj. We write C^{τj}_{ιj} or even Cj for C^~ι_{ιj→τj}.

The simplest example (for type B) is if-then-else. Another example is

CτN : N→ τ → (N→ τ)→ τ.

It can be used to define the predecessor function on N, i.e., P0 := 0 and P(Sn) := n, by the term

Pm := C^N_N m 0 (λ_n n).

In the (Ts,T)-example we have

Cτ0Ts : Ts→ τ0 → (T→ Ts→ τ0)→ τ0.

When computing the value of a cases term, we do not want to (eagerly) evaluate all arguments, but rather compute the test argument first and, depending on the result, (lazily) evaluate at most one of the other arguments. This phenomenon is well known in functional languages; for instance, in Scheme the if-construct is called a special form (as opposed to an operator). Therefore instead of taking the cases operator applied to a full list of arguments, one rather uses a case-construct to build this term; it differs from the former only in that it employs lazy evaluation. Hence the predecessor function is written in the form [case m of 0 | λ_n n]. If there are exactly two cases, we also write λ_m[if m then 0 else λ_n n] instead.
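The lazy behaviour of the case-construct can be mimicked in Python by passing the branches as functions, so that at most one of them is evaluated after inspecting the test argument (a sketch in our Peano tuple encoding):

```python
def cases_N(m, zero_branch, succ_branch):
    # Evaluate the test m first, then exactly one branch, lazily via thunks.
    if m == ('0',):
        return zero_branch()
    _, pred = m
    return succ_branch(pred)

def pred(m):
    # predecessor, written [case m of 0 | lambda n . n]
    return cases_N(m, lambda: ('0',), lambda n: n)
```

Only the branch selected by the outer constructor is ever run, as with a Scheme special form.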

General recursion. In practice it often happens that one needs to recur on an argument which is not an immediate component of the present constructor object; this is not allowed in structural recursion. Of course, in order to ensure that the recursion terminates we have to assume that the recurrence is w.r.t. a given well-founded set; for simplicity we restrict ourselves to the algebra N. However, we do allow that the recurrence is with respect to a measure function μ, with values in N. The operator F of general recursion then is defined by

(6.5) F μ x G = G x (λ_y [if μy < μx then F μ y G else ε]),

where ε denotes a canonical inhabitant of the range. We leave it as an exercise to prove that F is definable from an appropriate structural recursion operator.
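A Python sketch of the operator F: the recursive call is permitted only when the measure μ strictly decreases, otherwise the canonical element ε is returned. The names general_rec, eps and the gcd example are ours.

```python
def general_rec(mu, x, G, eps=None):
    # F mu x G = G x (lambda y: F mu y G if mu(y) < mu(x) else eps)
    def rec(y):
        return general_rec(mu, y, G, eps) if mu(y) < mu(x) else eps
    return G(x, rec)

def gcd(a, b):
    # gcd by general recursion with measure mu((a, b)) = b, which the
    # Euclidean step (b, a mod b) strictly decreases.
    return general_rec(lambda p: p[1], (a, b),
                       lambda p, rec: p[0] if p[1] == 0 else rec((p[1], p[0] % p[1])))
```

The guard mu(y) < mu(x) makes termination evident, at the price of returning ε on calls that violate the measure.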

6.2.3. Corecursion. We will show in 6.3 that an arbitrary "reduction sequence" beginning with a term in Godel's T terminates. For this to hold it is essential that the constants allowed in T are restricted to constructors C and recursion operators R. A consequence will be that every closed term of a base type denotes a total ideal. The conversion rules for R (cf. 6.2.2) work from the leaves towards the root, and terminate because total ideals are well-founded. If however we deal with cototal ideals (infinitary derivations, for example), then a similar operator is available to define functions with cototal ideals as values, namely "corecursion". For simplicity we restrict


ourselves to finitary algebras, and only consider the non-simultaneous case. The corecursion operator coR^τ_ι is used to construct a mapping from τ to ι by corecursion on the structure of ι. We define the (single) step type by

δ := τ → Σ (Π ~ρ × τ × ... × τ),

with summation over all constructors of ι. Here ~ρ are the types of the parameter arguments of the i-th constructor, followed by as many τ's as there are recursive arguments. The corecursion operator coR^τ_ι then has type τ → δ → ι.

We list the types of the corecursion operators for some algebras:

coR^τ_B : τ → (τ → U + U) → B,
coR^τ_N : τ → (τ → U + τ) → N,
coR^τ_P : τ → (τ → U + τ + τ) → P,
coR^τ_D : τ → (τ → U + τ × τ) → D,
coR^τ_{L(ρ)} : τ → (τ → U + ρ × τ) → L(ρ),
coR^τ_I : τ → (τ → U + τ + τ + τ) → I.

The conversion relation for each of these is defined by

coR^τ_B N M ↦ [case MN of tt | ff],
coR^τ_N N M ↦ [case MN of 0 | λ_n (S(coR^τ_N n M))],
coR^τ_P N M ↦ [case MN of 1 | λ_n (S0(coR^τ_P n M)) | λ_n (S1(coR^τ_P n M))],
coR^τ_D N M ↦ [case MN of 0^D | λ_{x,y} (C^{D→D→D}(coR^τ_D x M, coR^τ_D y M))],
coR^τ_{L(ρ)} N M ↦ [case MN of nil | λ_{z,x} (z :: coR^τ_{L(ρ)} x M)],
coR^τ_I N M ↦ [case MN of I | λ_x (C−1(coR^τ_I x M)) | λ_x (C0(coR^τ_I x M)) | λ_x (C1(coR^τ_I x M))].
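Operationally, corecursion is an "unfold": the step function maps a seed to either a stop signal (the inl case) or an emitted constructor together with new seeds (the inr cases). A Python generator gives a sketch for stream-like algebras; the encoding (None for inl, a pair for inr) is ours.

```python
def corec(seed, step):
    # step(seed) is None (stop, giving a finite/total result) or a pair
    # (emitted value, next seed); a step that never stops builds a cototal stream.
    while True:
        res = step(seed)
        if res is None:
            return
        out, seed = res
        yield out
```

For instance, corec(3, lambda n: None if n == 0 else (n, n - 1)) yields 3, 2, 1 and stops, while a step that always returns a pair produces an infinite stream, consumed lazily.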

As an example of a function defined by corecursion consider the transformation of an "abstract" real in the interval [−1, 1] into a stream representation using signed digits from {−1, 0, 1}. Assume that we work in an abstract (axiomatic) theory of reals, having an unspecified type ρ, and that we have a type σ for rationals as well. Assume that the abstract theory provides us with a function g : ρ → σ → σ → B comparing a real x with a proper rational interval p < q:

g(x, p, q) = tt → x ≤ q,
g(x, p, q) = ff → p ≤ x.

From g we define a function h : ρ → U + ρ + ρ + ρ by

h(x) := inl(inl(inr(2x + 1)))   if g(x, −1/2, 0) = tt,
h(x) := inl(inr(2x))            if g(x, −1/2, 0) = ff and g(x, 0, 1/2) = tt,
h(x) := inr(2x − 1)             if g(x, 0, 1/2) = ff

(type summation + is taken to be left associative). h is definable by a closed term M in Godel's T. Then the desired function f : ρ → I transforming an abstract real x into a cototal ideal (i.e., a stream) in I can be defined by

f(x) := coR^ρ_I x M.
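For exact rationals the step h can be written directly in Python, with the comparison g replaced by the ordinary order on ℚ; this is a sketch of the idea, not of the abstract axiomatic theory, and the cut-off points inside the overlap regions are our arbitrary choice (as g permits).

```python
from fractions import Fraction

def h(x):
    # digit -1 when x is left, 0 in the middle, 1 on the right; the next
    # seed rescales the remainder back into [-1, 1]: x = (d + x') / 2.
    if x <= Fraction(-1, 2):
        return (-1, 2 * x + 1)
    if x < Fraction(1, 2):
        return (0, 2 * x)
    return (1, 2 * x - 1)

def signed_digits(x, k):
    # first k digits of a stream representing x in [-1, 1]
    out = []
    for _ in range(k):
        d, x = h(x)
        out.append(d)
    return out
```

Each step maintains the invariant x ∈ [−1, 1], so the unfolding never stops: the result is a genuinely cototal stream, of which we inspect finite prefixes.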


6.2.4. A common extension T+ of Godel's T and Plotkin's PCF. Terms of T+ are built from (typed) variables and (typed) constants (constructors C or defined constants D, see below) by (type-correct) application and abstraction:

M, N ::= x^ρ | C^ρ | D^ρ | (λ_{x^ρ} M^σ)^{ρ→σ} | (M^{ρ→σ} N^ρ)^σ.

Definition (Computation rule). Every defined constant D comes with a system of computation rules, consisting of finitely many equations

(6.6) D ~Pi(~yi) = Mi   (i = 1, ..., n)

with free variables of ~Pi(~yi) and Mi among ~yi, where the arguments on the left hand side must be "constructor patterns", i.e., lists of applicative terms built from constructors and distinct variables. To ensure consistency of the defining equations, we require that for i ≠ j either ~Pi and ~Pj are non-unifiable (i.e., there is no substitution which identifies them), or else for the most general unifier ξ of ~Pi and ~Pj we have Mi ξ = Mj ξ. Notice that the substitution ξ assigns to the variables ~yi in Mi constructor patterns ~Rk(~z) (k = i, j). A further requirement on a system of computation rules D ~Pi(~yi) = Mi is that the lengths of all ~Pi(~yi) are the same; this number is called the arity of D, denoted by ar(D). The left hand side of (6.6) is called a D-redex.

More formally, constructor patterns are defined inductively by (we write ~P(~x) to indicate all variables in ~P):
(a) x is a constructor pattern.
(b) The empty list ⟨⟩ is a constructor pattern.
(c) If ~P(~x) and Q(~y) are constructor patterns whose variables ~x and ~y are disjoint, then (~P, Q)(~x, ~y) is a constructor pattern.
(d) If C is a constructor and ~P a constructor pattern, then so is C~P.

Examples of constants D defined by computation rules are abundant. The defining equations in 6.2.1 can all be seen as computation rules, for
(i) the append-function ∗,
(ii) list reversal Rev,
(iii) the simultaneously defined functions even, odd : N → B and
(iv) the two simultaneously defined functions ⊕ : Ts → T → Ts and + : T → T → T.
Moreover, the structural recursion operators themselves can be viewed as defined by computation rules, which in this case are called conversion rules; cf. 6.2.2.

The boolean connectives andb, impb and orb are defined by

tt andb y = y,      ff impb y = tt,     tt orb y = tt,
x andb tt = x,      tt impb y = y,      x orb tt = tt,
ff andb y = ff,     x impb tt = tt,     ff orb y = y,
x andb ff = ff,                         x orb ff = x.

Notice that when two such rules overlap, their right hand sides are equal under any unifier of the left hand sides.
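Implemented with first-match pattern matching, the overlap condition guarantees that the order in which the rules are tried does not matter, e.g. for andb (a sketch with booleans as the strings 'tt'/'ff'; the extra rules matter for partial arguments, which this total sketch does not model):

```python
def andb(x, y):
    # The four rules for andb; e.g. ('tt', 'tt') matches the first two
    # rules, whose right hand sides agree on that common instance.
    if x == 'tt':
        return y            # tt andb y = y
    if y == 'tt':
        return x            # x andb tt = x
    return 'ff'             # ff andb y = ff and x andb ff = ff
```

Reordering the three clauses yields the same function, exactly because overlapping rules have equal right hand sides under any unifier.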


Generally, for finitary algebras ι we define the boolean-valued function Eι : ι → B (existence, corresponding to total ideals in finitary algebras) and for structure-finitary algebras SEι : ι → B (structural existence, corresponding to structure-total ideals) by

E_{ιj}(Ci ~x) = E_{~ρ} ~x^P andb ∧∧_{ν<n} E_{ι_{jν}} x^R_{m+ν},   SE_{ιj}(Ci ~x) = ∧∧_{ν<n} SE_{ι_{jν}}(x^R_{m+ν})

(recall the notation from 6.2.2 for parameter and recursive arguments of a constructor). Examples are

E_N 0 = tt,   E_N(Sn) = E_N n,
SE_{L(α)}(nil) = tt,   SE_{L(α)}(x :: l) = SE_{L(α)} l.

Decidable equality =ι : ι → ι → B for a finitary algebra ι is defined by

(Ci ~x =ι Cj ~y) = ff   if i ≠ j,
(Ci ~x =ι Ci ~y) = (~x^P =_{~ρ} ~y^P andb ∧∧_{ν<n} (x^R_{m+ν} =_{ι_{jν}} y^R_{m+ν})).

For example,

(0 =N 0) = tt,   (Sm =N 0) = ff,
(0 =N Sn) = ff,   (Sm =N Sn) = (m =N n).
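On constructor trees (encoded as tuples, as in our earlier sketches, with every node a tagged tuple) decidable equality is the obvious structural comparison:

```python
def eq(s, t):
    # Distinct head constructors give ff; equal heads compare the
    # component trees recursively, mirroring the rules above.
    if s[0] != t[0] or len(s) != len(t):
        return False
    return all(eq(a, b) for a, b in zip(s[1:], t[1:]))
```

For instance, on Peano numerals this computes =N exactly as specified.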

The predecessor functions introduced in 6.2.1 by means of the cases-operator C can also be viewed as defined constants:

P0 = 0,   P(Sn) = n.

Another example is the destructor function, disassembling a constructor-built argument into its parts. For the type T1 := μξ(ξ, (N → ξ) → ξ) it is

D_{T1} : T1 → U + (N → T1),

defined by the computation rules

D_{T1} 0 = inl(u),   D_{T1}(Sup(f)) = inr(f).

Generally, the type of the destructor function for ι := μξ(κ0, ..., κ_{k−1}) with κi = ~ρi → (~σ_{iν} → ξ)_{ν<ni} → ξ is

Dι : ι → Σ_{i<k} (Π ~ρi × Π_{ν<ni} (~σ_{iν} → ι)).

6.2.5. Confluence. The conversion rules of 6.2.2 together with the computation rules of the defined constants D generate a "reduction" relation → between terms of T+. We show that the reflexive and transitive closure →∗ of → is "confluent", i.e., any two reduction sequences starting from the same term can be continued so as to lead to the same term. The proof uses a method due to W.W. Tait and P. Martin-Lof (cf. Barendregt (1984, 3.2)). The idea is to use a "parallel" reduction relation →p, which intuitively has the following meaning. Mark some β- or D-redexes in a given term. Then convert all of them, working in parallel from the leaves to the root. Notice that redexes newly generated by this process are not converted. Confluence of the relation →p can be easily proved using the notion of a "complete development" M∗ of a term M due to Takahashi (1995), and confluence of →p immediately implies the confluence of →∗.


Recall the definition of the conversion relation ↦ in 6.2.2. We extend it with the computation rules of T+: for every such rule D~Pi(~yi) = Mi(~yi) we have the conversion D~Pi(~Ni) ↦ Mi(~Ni). The one step reduction relation → between terms in T+ is defined as follows: M → N if N is obtained from M by replacing a subterm M′ in M by N′, where M′ ↦ N′. The reduction relations →+ and →∗ are the transitive and the reflexive transitive closure of →, respectively.

Definition. A binary relation R has the diamond property if from xRy1 and xRy2 we can infer the existence of a z such that y1Rz and y2Rz. We call R confluent if its reflexive and transitive closure has the diamond property.

Lemma. Every binary relation R with the diamond property is confluent.

Proof. We write xR^n y if there is a sequence x = x0 R x1 R x2 R ... R xn = y. By induction on n + m one proves that from xR^n y1 and xR^m y2 we can infer the existence of a z such that y1 R^m z and y2 R^n z.

Definition. Parallel reduction →p is defined inductively by the rules

(6.7) x →p x,   C →p C,   D →p D;

(6.8) if M →p M′, then λ_x M →p λ_x M′;

(6.9) if M →p M′ and N →p N′, then MN →p M′N′;

(6.10) if M(x) →p M′(x) and N →p N′, then (λ_x M(x))N →p M′(N′);

(6.11) if ~N →p ~N′, then D~P(~N) →p M(~N′), for D~P(~y) = M(~y) a computation rule.

Lemma (Substitutivity of →p). If M(~x) →p M′(~x) and ~K →p ~K′, then M(~K) →p M′(~K′).

Proof. By induction on M →p M′. All cases are easy with the possible exception of (6.10) and (6.11).

Case (6.10). Consider the rule instance: from M(y, ~x) →p M′(y, ~x) and N(~x) →p N′(~x) infer

(λ_y M(y, ~x))N(~x) →p M′(N′(~x), ~x).

Assume ~K →p ~K′. By induction hypothesis M(y, ~K) →p M′(y, ~K′) and N(~K) →p N′(~K′). Then an application of (6.10) gives

(λ_y M(y, ~K))N(~K) →p M′(N′(~K′), ~K′),

since we can assume y ∉ FV(~K).

Case (6.11). Consider the rule instance: from ~N(~x) →p ~N′(~x) infer

D~P(~N(~x)) →p M(~N′(~x)),

with D~P(~y) = M(~y) a computation rule. Assume ~K →p ~K′. By induction hypothesis ~N(~K) →p ~N′(~K′). Then an application of (6.11) gives

D~P(~N(~K)) →p M(~N′(~K′)).

Here we have made use of our assumption that all free variables in D~P(~y) = M(~y) are among ~y.

Definition (Complete expansion M∗ of M).

x∗ := x,
C∗ := C for constructors C,
D∗ := D if ar(D) > 0, or if ar(D) = 0 and D has no rules,
(λ_x M)∗ := λ_x M∗,
(MN)∗ := M∗N∗ if MN is neither a β- nor a D-redex,
((λ_x M(x))N)∗ := M∗(N∗),
(D~P(~N))∗ := M(~N∗) for D~P(~y) = M(~y) a computation rule.

To see that M∗ is well-defined assume D~P1(~N1) = D~P2(~N2), where D~Pi(~yi) = Mi(~yi) (i = 1, 2) are computation rules. We must show M1(~N1∗) = M2(~N2∗). By our conditions on computation rules there is a most general unifier ξ of ~P1(~y1) and ~P2(~y2) such that M1(~y1 ξ) = M2(~y2 ξ). Notice that ~yi ξ is a constructor pattern; without loss of generality we can assume that both ~y1 ξ and ~y2 ξ are parts of the same constructor pattern ~K(~x). Then we can write ~Ni = (~yi ξ)(~K), where the substitution of ~K is for ~x. Hence

Mi(~Ni∗) = Mi((~yi ξ)(~K∗)) = Mi(~yi ξ)(~K∗),

and therefore M1(~N1∗) = M2(~N2∗).

The crucial property of the complete expansion M∗ of M is that the result of an arbitrary parallel reduction of M can be further reduced, by one more parallel step, to M∗.

Lemma. M →p M′ implies M′ →p M∗.

Proof. By induction on M, distinguishing cases on M →p M′. The initial cases (6.7) are easy.

Case (6.8). Then the last rule was: from M →p M′ infer λ_x M →p λ_x M′. By induction hypothesis M′ →p M∗. Then another application of (6.8) yields λ_x M′ →p λ_x M∗ = (λ_x M)∗.

Case (6.9). We distinguish cases on M.

Subcase MN, which is neither a β- nor a D-redex. Then (MN)∗ = M∗N∗, and the last rule was: from M →p M′ and N →p N′ infer MN →p M′N′. By induction hypothesis M′ →p M∗ and N′ →p N∗. Another application of (6.9) yields M′N′ →p M∗N∗.

Subcase (λ_x M(x))N. Then ((λ_x M(x))N)∗ = M∗(N∗), and the last rule was: from λ_x M(x) →p λ_x M′(x) and N →p N′ infer (λ_x M(x))N →p (λ_x M′(x))N′. Then we also have M(x) →p M′(x). By induction hypothesis M′(x) →p M∗(x) and N′ →p N∗. Therefore (λ_x M′(x))N′ →p M∗(N∗), which was to be shown.


Subcase D~P(~M), where D~P(~y) = K(~y) is a computation rule. Then (D~P(~M))∗ = K(~M∗). The last rule derived D~P(~M) →p N for some N. Since this rule was (6.9) we have N = D~N and ~P(~M) →p ~N. But ~P(~y) is a constructor pattern, hence ~N = ~P(~M′) with ~M →p ~M′. By induction hypothesis ~M′ →p ~M∗. Therefore N = D~P(~M′) →p K(~M∗) = (D~P(~M))∗.

Case (6.10). Then the last rule was: from M(x) →p M′(x) and N →p N′ infer (λ_x M(x))N →p M′(N′). We must show M′(N′) →p ((λ_x M(x))N)∗ (= M∗(N∗)). But this follows from the induction hypotheses M′(x) →p M∗(x) and N′ →p N∗ by the substitutivity of →p.

Case (6.11). Then the last rule was: from ~N →p ~N′ infer D~P(~N) →p M(~N′), for D~P(~y) = M(~y) a computation rule. We must show M(~N′) →p (D~P(~N))∗ (= M(~N∗)). Again this follows from the induction hypothesis ~N′ →p ~N∗ by the substitutivity of →p.

Corollary. →∗ is confluent.

Proof. The reflexive closure of → is contained in →p, hence →∗ is the reflexive and transitive closure of →p. Since →p has the diamond property by the previous lemma, an earlier lemma implies that →∗ is confluent.

6.2.6. Ideals as denotation of terms. How can we use computation rules to define an ideal z in a function space? The general idea is to inductively define the set of tokens (U, b) that make up z. It is convenient to define the value [[λ_~x M]], where M is a term with free variables among ~x. Since this value is a token set, we can define inductively the relation (~U, b) ∈ [[λ_~x M]].

For a constructor pattern ~P(~x) and a list ~V of the same length and types as ~x we define a list ~P(~V) of formal neighborhoods of the same length and types as ~P(~x), by induction on ~P(~x). x(V) is the singleton list V, and for ⟨⟩ we take the empty list. (~P, Q)(~V, ~W) is covered by the induction hypothesis. Finally

(C~P)(~V) := { C~b∗ | b∗_i ∈ Pi(~Vi) if Pi(~Vi) ≠ ∅, and b∗_i = ∗ otherwise }.

We use the following notation: (~U, b) means (U1, ... (Un, b) ...), and (~U, V) ⊆ [[λ_~x M]] means (~U, b) ∈ [[λ_~x M]] for all (finitely many) b ∈ V.

Definition (Inductive, of (~U, b) ∈ [[λ_~x M]]).

(V) If Ui ⊢ b, then (~U, b) ∈ [[λ_~x xi]].

(A) If (~U, V, c) ∈ [[λ_~x M]] and (~U, V) ⊆ [[λ_~x N]], then (~U, c) ∈ [[λ_~x (MN)]].

For every constructor C and defined constant D we have

(C) If ~V ⊢ ~b∗, then (~U, ~V, C~b∗) ∈ [[λ_~x C]].

(D) If (~U, ~V, b) ∈ [[λ_{~x,~y} M]] and ~W ⊢ ~P(~V), then (~U, ~W, b) ∈ [[λ_~x D]],


with one such rule (D) for every computation rule D~P (~y ) = M .

The height of a derivation of (~U, b) ∈ [[λ_~x M]] is defined as usual, by adding 1 at each rule. We define its D-height similarly, where only rules (D) count.

We begin with some simple consequences of this definition. The following transformations preserve D-height:

(6.12) if ~V ⊢ ~U and (~U, b) ∈ [[λ_~x M]], then (~V, b) ∈ [[λ_~x M]];
(6.13) (~U, V, b) ∈ [[λ_{~x,y} M]] ↔ (~U, b) ∈ [[λ_~x M]], if y ∉ FV(M);
(6.14) (~U, V, b) ∈ [[λ_{~x,y} (My)]] ↔ (~U, V, b) ∈ [[λ_~x M]], if y ∉ FV(M);
(6.15) (~U, ~V, b) ∈ [[λ_{~x,~y} (M(~P(~y)))]] ↔ (~U, ~P(~V), b) ∈ [[λ_{~x,~z} (M(~z))]].

Proof. (6.12) and (6.13) are both proved by easy inductions on the respective derivations.

(6.14). Assume (~U, V, b) ∈ [[λ_{~x,y}(My)]]. By (A) we then have a W such that (~U, V, W) ⊆ [[λ_{~x,y} y]] (i.e., V ⊢ W) and (~U, V, W, b) ∈ [[λ_{~x,y} M]]. By (6.12) from the latter we obtain (~U, V, V, b) ∈ [[λ_{~x,y} M]]. Now since y ∉ FV(M), (6.13) yields (~U, V, b) ∈ [[λ_~x M]], as required. Conversely, assume (~U, V, b) ∈ [[λ_~x M]]. Since y ∉ FV(M), (6.13) yields (~U, V, V, b) ∈ [[λ_{~x,y} M]]. Clearly we have (~U, V, V) ⊆ [[λ_{~x,y} y]]. Hence by (A) (~U, V, b) ∈ [[λ_{~x,y}(My)]], as required. Notice that the D-height did not change in these transformations.

(6.15). By induction on ~P , with a side induction on M . We distinguishcases on M . The cases xi, C and D are follow immediately from (6.13). Incase MN the following are equivalent by induction hypothesis:

(~U, ~V , b) ∈ [[λ~x,~y ((MN)(~P (~y )))]]

∃W ((~U, ~V ,W ) ⊆ [[λ~x,~y (N(~P (~y )))]] ∧ (~U, ~V ,W, b) ∈ [[λ~x,~y (M(~P (~y )))]])

∃W ((~U, ~P (~V ),W ) ⊆ [[λ~x,~y (N(~z ))]] ∧ (~U, ~P (~V ),W, b) ∈ [[λ~x,~y (M(~z ))]])

(~U, ~P(~V), b) ∈ [[λ~x,~z((MN)(~z))]].

The final case is where M is zi. Then we have to show

(~U, ~V, b) ∈ [[λ~x,~y(P(~y))]] ↔ P(~V) ` b.

We now distinguish cases on P(~y). If P(~y) is yj, then both sides are equivalent to Vj ` b. In case P(~y) is (C~Q)(~y) the following are equivalent, using the induction hypothesis for ~Q(~y):

(~U, ~V , b) ∈ [[λ~x,~y((C ~Q)(~y ))]]

(~U, ~V , b) ∈ [[λ~x,~y(C ~Q(~y ))]]

(~U, ~Q(~V ), b) ∈ [[λ~x,~u(C~u )]]

(~U, ~Q(~V ), b) ∈ [[λ~xC]] by (6.14)

∃~b∗(b = C~b∗ ∧ ~Q(~V ) ` ~b∗)

C~Q(~V) ` b.

This concludes the proof.


Let ∼ denote the equivalence relation on formal neighborhoods generated by entailment, i.e., U ∼ V means (U ` V) ∧ (V ` U).

(6.16) If ~U ` ~P (~V ), then there are ~W such that ~U ∼ ~P ( ~W ) and ~W ` ~V .

Proof. By induction on ~P. The cases x and 〈〉 are clear, and in case ~P, Q we can apply the induction hypothesis. It remains to treat the case C~P(~x). Since U ` C~P(~V) there is a ~b∗0 such that C~b∗0 ∈ U. Let

Ui := { a | ∃~a∗(C~a∗ ∈ U ∧ a = a∗i) }.

For the constructor pattern C~x consider C~U . By definition

C~U = { C~a∗ | a∗i ∈ Ui if Ui ≠ ∅, and a∗i = ∗ otherwise }.

We first show U ∼ C~U. Assume C~a∗ ∈ C~U. For each i, if Ui ≠ ∅, then there is an ~a∗i such that C~a∗i ∈ U and a∗ii = a∗i, and if Ui = ∅ then a∗i = ∗. Hence

U ⊇ { C~a∗i | Ui ≠ ∅ } ∪ { C~b∗0 } ` C~a∗.

Conversely assume C~a∗ ∈ U. We define C~b∗ ∈ C~U by b∗i := a∗i if a∗i ≠ ∗, b∗i := ∗ if Ui = ∅, and otherwise (i.e., if a∗i = ∗ and Ui ≠ ∅) take an arbitrary b∗i ∈ Ui. Clearly C~b∗ ` C~a∗.

By definition ~U ` ~P(~V). Hence by induction hypothesis there are ~W such that ~U ∼ ~P(~W) and ~W ` ~V. Therefore U ∼ C~U ∼ C~P(~W).

Lemma (Unification). If ~P1(~V1) ∼ · · · ∼ ~Pn(~Vn), then ~P1, . . . , ~Pn are unifiable with a most general unifier ξ and there exists ~W such that

(~P1ξ)(~W) = · · · = (~Pnξ)(~W) ∼ ~P1(~V1) ∼ · · · ∼ ~Pn(~Vn).

Proof. Assume ~P1(~V1) ∼ · · · ∼ ~Pn(~Vn). Then ~P1(~V1), . . . , ~Pn(~Vn) are componentwise consistent and hence ~P1, . . . , ~Pn are unifiable with a most general unifier ξ. We now proceed by induction on ~P1, . . . , ~Pn. If they are either all empty or all variables the claim is trivial. In the case (~P1, P1), . . . , (~Pn, Pn) it follows from the linearity condition on variables that a most general unifier of (~P1, P1), . . . , (~Pn, Pn) is the union of most general unifiers of ~P1, . . . , ~Pn and of P1, . . . , Pn. Hence the induction hypothesis applies. In the case C~P1, . . . , C~Pn the assumption C~P1(~V1) ∼ · · · ∼ C~Pn(~Vn) implies ~P1(~V1) ∼ · · · ∼ ~Pn(~Vn) and hence we can apply the induction hypothesis. The remaining case is when some are variables and the other ones are of the form C~Pi, say x, C~P2, . . . , C~Pn. By assumption

V1 ∼ C~P2(~V2) ∼ · · · ∼ C~Pn(~Vn).

By induction hypothesis we obtain the required ~W such that

(~P2ξ)(~W) = · · · = (~Pnξ)(~W) ∼ ~P2(~V2) ∼ · · · ∼ ~Pn(~Vn).

Lemma (Consistency). [[λ~xM ]] is consistent.

Proof. Let (~Ui, bi) ∈ [[λ~xM]] for i = 1, 2. By coherence (cf. the corollary at the end of 6.1.5) it suffices to prove that (~U1, b1) and (~U2, b2) are consistent. We shall prove this by induction on the maximum of the D-heights and a side induction on the maximum of the heights.


Case (V). Let (~U1, b1), (~U2, b2) ∈ [[λ~xxi]], and assume that ~U1 and ~U2 are componentwise consistent. Then U1i ` b1 and U2i ` b2. Since U1i ∪ U2i is consistent, b1 and b2 must be consistent as well.

Case (C). For i = 1, 2 we have

  ~Vi ` ~b∗i
  ---------------------------- (C)
  (~Ui, ~Vi, C~b∗i) ∈ [[λ~xC]].

Assume ~U1, ~V1 and ~U2, ~V2 are componentwise consistent. The consistency of C~b∗1 and C~b∗2 follows from ~Vi ` ~b∗i and the consistency of ~V1 and ~V2.

Case (A). For i = 1, 2 we have

  (~Ui, Vi, ci) ∈ [[λ~xM]]    (~Ui, Vi) ⊆ [[λ~xN]]
  ------------------------------------------------ (A)
  (~Ui, ci) ∈ [[λ~x(MN)]].

Assume ~U1 and ~U2 are componentwise consistent. By the side induction hypothesis for the right premises V1 ∪ V2 is consistent. Hence by the side induction hypothesis for the left premises c1 and c2 are consistent.

Case (D). For i = 1, 2 we have

  (~Ui, ~Vi, bi) ∈ [[λ~x,~yi Mi(~yi)]]    ~Wi ` ~Pi(~Vi)
  ----------------------------------------------------- (D)
  (~Ui, ~Wi, bi) ∈ [[λ~xD]]

for computation rules D~Pi(~yi) = Mi(~yi). Assume ~U1, ~W1 and ~U2, ~W2 are componentwise consistent; we must show that b1 and b2 are consistent. Since ~W1 ∪ ~W2 ` ~Pi(~Vi) for i = 1, 2, by (6.16) there are ~V′1, ~V′2 such that ~V′i ` ~Vi and ~W1 ∪ ~W2 ∼ ~Pi(~V′i). Then by the Unification Lemma there are ~W such that (~P1ξ)(~W) ∼ ~Pi(~V′i) ` ~Pi(~Vi) for i = 1, 2, where ξ is the most general unifier of ~P1 and ~P2. But then also

(~yiξ)(~W) ` ~Vi,

and hence by (6.12) we have

(~Ui, (~yiξ)(~W), bi) ∈ [[λ~x,~yi Mi(~yi)]]

with lesser D-height. Now (6.15) gives

(~Ui, ~W, bi) ∈ [[λ~x,~z Mi(~yi)ξ]]

without increasing the D-height. Notice that M1(~y1)ξ = M2(~y2)ξ by our condition on computation rules. Hence the induction hypothesis applied to (~U1, ~W, b1), (~U2, ~W, b2) ∈ [[λ~x,~z M1(~y1)ξ]] implies the consistency of b1 and b2, as required.

Lemma (Deductive Closure). [[λ~xM]] is deductively closed, i.e., if W ⊆ [[λ~xM]] and W ` (~V, c), then (~V, c) ∈ [[λ~xM]].

Proof. By induction on the maximum of the D-heights and a side induction on the maximum of the heights of W ⊆ [[λ~xM]]. We distinguish cases on the last rule of these derivations (which is determined by M).

Case (V). For all (~U, b) ∈ W we have

  Ui ` b
  --------------------- (V)
  (~U, b) ∈ [[λ~xxi]].


We must show Vi ` c. By assumption W ` (~V, c), hence W~V ` c. It suffices to prove Vi ` W~V. Let b ∈ W~V; we show Vi ` b. There are ~U such that ~V ` ~U and (~U, b) ∈ W. But then by the above Ui ` b, hence Vi ` Ui ` b.

Case (A). Let W = {(~U1, b1), . . . , (~Un, bn)}. For each (~Ui, bi) ∈ W there is Ui such that

  (~Ui, Ui, bi) ∈ [[λ~xM]]    (~Ui, Ui) ⊆ [[λ~xN]]
  ------------------------------------------------ (A)
  (~Ui, bi) ∈ [[λ~x(MN)]].

Define U := ⋃{ Ui | ~V ` ~Ui }. We first show that U is consistent. Let a, b ∈ U. There are i, j such that a ∈ Ui, b ∈ Uj and ~V ` ~Ui, ~Uj. Then ~Ui and ~Uj are consistent, hence by the consistency of [[λ~xN]] proved above a and b are consistent as well.

Next we show (~V, U) ⊆ [[λ~xN]]. Let a ∈ U; we show (~V, a) ∈ [[λ~xN]]. Fix i such that a ∈ Ui and ~V ` ~Ui, and let Wi := { (~Ui, b) | b ∈ Ui } ⊆ [[λ~xN]]. Since by the side induction hypothesis [[λ~xN]] is deductively closed it suffices to prove Wi ` (~V, a), i.e., { b | b ∈ Ui ∧ ~V ` ~Ui } ` a. But the latter set equals Ui, and a ∈ Ui.

Finally we show (~V, U, c) ∈ [[λ~xM]]. Let

W′ := {(~U1, U1, b1), . . . , (~Un, Un, bn)} ⊆ [[λ~xM]].

By side induction hypothesis it suffices to prove that W′ ` (~V, U, c), i.e., { bi | ~V ` ~Ui ∧ U ` Ui } ` c. But by definition of U the latter set equals { bi | ~V ` ~Ui }, which in turn entails c because by assumption W ` (~V, c).

Now we can use (A) to infer (~V, c) ∈ [[λ~x(MN)]], as required.

Case (C). Assume W ⊆ [[λ~xC]]. Then W consists of (~U, ~U′, C~b∗) such that ~U′ ` ~b∗. Assume further W ` (~V, ~V′, c). Then

{ C~b∗ | ∃~U,~U′((~U, ~U′, C~b∗) ∈ W ∧ ~V ` ~U ∧ ~V′ ` ~U′) } ` c.

By definition of entailment c has the form C~c∗ such that

Wi := { b | ∃~U,~U′,~b∗(b = b∗i ∧ (~U, ~U′, C~b∗) ∈ W ∧ ~V ` ~U ∧ ~V′ ` ~U′) } ` c∗i.

We must show (~V, ~V′, C~c∗) ∈ [[λ~xC]], i.e., ~V′ ` ~c∗. It suffices to show V′i ` Wi, for every i. Let b ∈ Wi. Then there are ~U, ~U′, ~b∗ such that b = b∗i, (~U, ~U′, C~b∗) ∈ W and ~V′ ` ~U′. Hence V′i ` U′i ` b∗i = b.

Case (D). Let W = {(~U1, ~U′′1, b1), . . . , (~Un, ~U′′n, bn)}. For every i there is an ~U′i such that

  (~Ui, ~U′i, bi) ∈ [[λ~x,~yi Mi(~yi)]]    ~U′′i ` ~Pi(~U′i)
  --------------------------------------------------------- (D)
  (~Ui, ~U′′i, bi) ∈ [[λ~xD]]

for D~Pi(~yi) = Mi(~yi) a computation rule. Assume W ` (~V, ~V′′, c). We must prove (~V, ~V′′, c) ∈ [[λ~xD]]. Let

I := { i | 1 ≤ i ≤ n ∧ ~V ` ~Ui ∧ ~V′′ ` ~U′′i }.

Then { bi | i ∈ I } ` c, hence I ≠ ∅. For i ∈ I we have ~V′′ ` ~U′′i ` ~Pi(~U′i), hence by (6.16) there are ~V′i such that ~V′′ ∼ ~Pi(~V′i) and ~V′i ` ~U′i. In particular for i, j ∈ I

~V′′ ∼ ~Pi(~V′i) ∼ ~Pj(~V′j).

To simplify notation assume I = {1, . . . , m}. Hence by the Unification Lemma ~P1, . . . , ~Pm are unifiable with a most general unifier ξ and there exists ~W such that

(~P1ξ)(~W) = · · · = (~Pmξ)(~W) ∼ ~P1(~V′1) ∼ · · · ∼ ~Pm(~V′m).

Let i, j ∈ I. Then by the conditions on computation rules Miξ = Mjξ. Also (~yiξ)(~W) ` ~V′i ` ~U′i. Therefore by (6.12)

(~V, (~yiξ)(~W), bi) ∈ [[λ~x,~yi Mi(~yi)]]

and hence by (6.15)

(~V, ~W, bi) ∈ [[λ~x,~z Mi(~yiξ)]].

But Mi(~yiξ) = Miξ = M1ξ = M1(~y1ξ) and hence for all i ∈ I

(~V, ~W, bi) ∈ [[λ~x,~z M1(~y1ξ)]].

Therefore X := { (~V, ~W, bi) | i ∈ I } ⊆ [[λ~x,~z M1(~y1ξ)]]. Since { bi | i ∈ I } ` c, we have X ` (~V, ~W, c) and hence the induction hypothesis implies (~V, ~W, c) ∈ [[λ~x,~z M1(~y1ξ)]]. Using (6.15) again we obtain (~V, (~y1ξ)(~W), c) ∈ [[λ~x,~y1 M1(~y1)]]. Since ~V′′ ∼ ~P1(~V′1) ∼ ~P1((~y1ξ)(~W)) we obtain (~V, ~V′′, c) ∈ [[λ~xD]], by (D).

Corollary. [[λ~xM ]] is an ideal.

6.2.7. Preservation of values. We now prove that our definition above of the denotation of a term is reasonable in the sense that it is not changed by an application of the standard (β- and η-) conversions or a computation rule. For the β-conversion part of this proof it is helpful to first introduce a more standard notation, which involves variable environments.

Definition. Assume that all free variables in M are among ~x. Let [[M]]~U~x := { b | (~U, b) ∈ [[λ~xM]] } and [[M]]~u~x := ⋃~U⊆~u [[M]]~U~x.

We have a useful monotonicity property, which follows from the deductive closure of [[λ~xM]].

Lemma. (a) If ~V ` ~U, b ` c and b ∈ [[M]]~U~x, then c ∈ [[M]]~V~x.
(b) If ~v ⊇ ~u, b ` c and b ∈ [[M]]~u~x, then c ∈ [[M]]~v~x.

Proof. (a). By the deductive closure of [[λ~xM]], ~V ` ~U, b ` c and (~U, b) ∈ [[λ~xM]] together imply (~V, c) ∈ [[λ~xM]]. (b) follows from (a).

Lemma. (a) [[xi]]~u~x = ui.
(b) [[λyM]]~u~x = { (V, b) | b ∈ [[M]]~u,V~x,y }.
(c) [[MN]]~u~x = [[M]]~u~x [[N]]~u~x.

Proof. (b). It suffices to prove this with ~U for ~u. But (V, b) ∈ [[λyM]]~U~x and b ∈ [[M]]~U,V~x,y are both equivalent to (~U, V, b) ∈ [[λ~x,yM]].


(c).

c ∈ [[M]]~u~x [[N]]~u~x ↔ ∃V⊆[[N]]~u~x ((V, c) ∈ [[M]]~u~x)
  ↔ ∃V⊆[[N]]~u~x ∃~U⊆~u ((V, c) ∈ [[M]]~U~x)
  ↔ ∃~U1⊆~u ∃V⊆[[N]]~U1~x ∃~U⊆~u ((V, c) ∈ [[M]]~U~x)
  ↔(∗) ∃~U⊆~u ∃V⊆[[N]]~U~x ((V, c) ∈ [[M]]~U~x)
  ↔ ∃~U⊆~u ∃V ((~U, V) ⊆ [[λ~xN]] ∧ (~U, V, c) ∈ [[λ~xM]])
  ↔ ∃~U⊆~u ((~U, c) ∈ [[λ~xMN]])   by (A)
  ↔ ∃~U⊆~u (c ∈ [[MN]]~U~x)
  ↔ c ∈ [[MN]]~u~x.

Here is the proof of the equivalence marked (∗). The upwards direction is obvious. For the downwards direction we use monotonicity. Assume ~U1 ⊆ ~u, V ⊆ [[N]]~U1~x, ~U ⊆ ~u and (V, c) ∈ [[M]]~U~x. Let ~U2 := ~U1 ∪ ~U ⊆ ~u. Then by monotonicity V ⊆ [[N]]~U2~x and (V, c) ∈ [[M]]~U2~x.

Corollary. [[λyM ]]~u~xv = [[M ]]~u,v~x,y.

Proof.

b ∈ [[λyM ]]~u~xv ↔ ∃V⊆v((V, b) ∈ [[λyM ]]~u~x)

↔ ∃V⊆v(b ∈ [[M ]]~u,V~x,y ) by the lemma

↔ b ∈ [[M ]]~u,v~x,y.

Lemma (Substitution). [[M(z)]]~u,[[N ]]~u~x~x,z = [[M(N)]]~u~x.

Proof. By induction on M, and cases on the form of M. Case λyM. For readability we leave out ~x and ~u.

[[λyM(z)]][[N]]z = { (V, b) | b ∈ [[M(z)]][[N]],Vz,y }
               = { (V, b) | b ∈ [[M(N)]]Vy }   by induction hypothesis
               = [[λyM(N)]]                    by a lemma above
               = [[(λyM)(N)]].

The other cases are easy.

Lemma (Preservation of values, β). [[(λyM(y))N ]]~u~x = [[M(N)]]~u~x.

Proof. Again we leave out ~x, ~u. By the last two lemmata and the corollary, [[(λyM(y))N]] = [[λyM(y)]][[N]] = [[M(y)]][[N]]y = [[M(N)]].

Lemma (Preservation of values, η). [[λyMy]]~u~x = [[M ]]~u~x if y /∈ FV(M).


Proof. We leave out ~x and ~u.

(V, b) ∈ [[λyMy]] ↔ b ∈ [[My]]Vy
  ↔ b ∈ [[M]]V
  ↔ ∃U⊆V ((U, b) ∈ [[M]])
  ↔ (V, b) ∈ [[M]],

where in the last step we have used monotonicity.

We can now prove preservation of values under computation rules:

Lemma. For every computation rule D~P(~y) = M of a defined constant D, [[λ~y(D~P(~y))]] = [[λ~yM]].

Proof. The following are equivalent:

(~V , b) ∈ [[λ~y(D~P (~y ))]]

(~P (~V ), b) ∈ [[λ~z(D~z )]] by (6.15)

(~P (~V ), b) ∈ [[D]] by (6.14)

(~V , b) ∈ [[λ~yM ]],

where the last equivalence can be seen as follows. If (~P(~V), b) ∈ [[D]], then there is a ~U such that (~U, b) ∈ [[λ~yM]] and ~P(~V) ` ~P(~U). Hence ~V ` ~U and therefore (~V, b) ∈ [[λ~yM]] by (6.12). The converse is immediate by (D).

6.2.8. Operational semantics; adequacy. The adequacy theorem of Plotkin (1977, Theorem 3.1) says that whenever the value of a closed term M is a numeral, then M head-reduces to this numeral. So in this sense the (denotational) semantics is (computationally) “adequate”. Plotkin’s proof is by induction on the types, and uses a computability predicate. We prove an adequacy theorem in our setting, for arbitrary computation rules.

Operational semantics. Recall that a token of an algebra ι is a constructor tree whose outermost constructor is for ι.

Definition. For closed terms M we inductively define M →1 N (“M head-reduces to N”) and M ∈ Nf (“M is in normal form”).

(λxM(x))N~K →1 M(N)~K,

D~P(~N)~K →1 M(~N)~K   for D~P(~y) = M(~y) a computation rule,

  ~M →1 ~N
  -------------
  C~M →1 C~N

where C~M is of base type; ~M →1 ~N means that Mi →1 Ni for at least one i, and for all i either Mi →1 Ni or Mi = Ni ∈ Nf. In the final rule we assume that ~M has length ar(D), but is not an instance of ~P(~y) such that D has a computation rule of this constructor pattern:

  ~M →1 ~N
  -------------------
  D~M~K →1 D~N~K.

If none of the rules applies, then M ∈ Nf.
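The head-reduction relation just defined is deterministic, so it can simply be iterated in search of a normal form. As a small illustration, here is our own Python rendering of the β-fragment only (the term representation is an assumption, and capture-avoiding substitution is simplified by assuming distinct bound names):

```python
# Minimal sketch of deterministic head reduction, beta-fragment only.
# Terms: ('var', name) | ('lam', name, body) | ('app', fun, arg)

def subst(term, name, repl):
    """Naive substitution; we assume bound names are pairwise distinct."""
    tag = term[0]
    if tag == 'var':
        return repl if term[1] == name else term
    if tag == 'lam':
        return term if term[1] == name else ('lam', term[1], subst(term[2], name, repl))
    return ('app', subst(term[1], name, repl), subst(term[2], name, repl))

def head_step(term):
    """One step (lambda x M) N K... ->1 M[x:=N] K..., or None if none applies."""
    if term[0] == 'app':
        fun, arg = term[1], term[2]
        if fun[0] == 'lam':                  # top-level beta-redex
            return subst(fun[2], fun[1], arg)
        inner = head_step(fun)               # reduce the head under trailing arguments ~K
        if inner is not None:
            return ('app', inner, arg)
    return None                              # no rule applies

def head_reduce(term, fuel=1000):
    """Iterate ->1; fuel bounds the search, since a normal form need not exist."""
    while fuel > 0:
        nxt = head_step(term)
        if nxt is None:
            return term
        term, fuel = nxt, fuel - 1
    raise RuntimeError("no head-normal form reached")
```

For example, `head_reduce` sends `(λx.λy.x) a b` to `a` in two steps, and the uniqueness of the reduct in each step reflects the determinism noted below.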


Clearly for every term M there is at most one M′ such that M →1 M′. Let →∗ denote the reflexive transitive closure of →1.

We define an “operational interpretation” (Martin-Löf, 1983) of formal neighborhoods U. To this end we define a notion M ∈ [a], for M closed.

Definition (M ∈ [a]). The definition is by induction on the type, and uses a subordinate inductive definition in the base type case. Let M ∈ [U] mean ∀a∈U(M ∈ [a]).

M ∈ [C~a∗] := ∃~N∈[~a∗](M →∗ C~N),
M ∈ [(~U, b)] := ∀~N∈[~U](M~N ∈ [b]).

Remark. Notice that the first clause of the definition generalizes to M ∈ [P(~a∗)] → ∃~N∈[~a∗](M →∗ P(~N)). But this implies

(6.17) M ∈ [P(~V)] → ∃~N∈[~V](M →∗ P(~N)),

which can be seen as follows. To simplify notation assume ~V is V. Let b ∈ V; recall that V is finite. Then M ∈ [P(b)], and hence there is Nb ∈ [b] such that M →∗ P(Nb). All these Nb’s must have a common reduct N, and by the next lemma N ∈ [V].

We prove some easy but useful properties of the relation M ∈ [a]. The first one says that [a] is closed under backward and forward reduction steps.

Lemma. (a) M− →1 M → M ∈ [a] → M− ∈ [a].
(b) M →1 M+ → M ∈ [a] → M+ ∈ [a].

Proof. (a). By induction on M ∈ [a]. Case M ∈ [C~a∗]. Then M →∗ C~N for some ~N ∈ [~a∗]. From M− →1 M we obtain M− →∗ C~N. Hence M− ∈ [C~a∗]. Case M ∈ [(~U, b)]. Assume M− →1 M. We must show M− ∈ [(~U, b)]. Let ~N ∈ [~U]; we must show M−~N ∈ [b]. By assumption we have M~N ∈ [b]. Because of M− →1 M at an arrow type and the trailing ~K in the rules for →1 at arrow types we also have M−~N →1 M~N. By induction hypothesis M−~N ∈ [b].

(b). By induction on M ∈ [a]. Case M ∈ [C~a∗]. Then M →∗ C~N for some ~N ∈ [~a∗]. Subcase M = C~N. Then M+ = C~N+ with ~N →1 ~N+. By induction hypothesis ~N+ ∈ [~a∗]. Hence M+ = C~N+ ∈ [C~a∗] by definition. Subcase M →1 M+ →∗ C~N. Then M+ ∈ [C~a∗], again by definition. Case M ∈ [(~U, b)]. Assume M →1 M+. We must show M+ ∈ [(~U, b)]. Let ~N ∈ [~U]; we must show M+~N ∈ [b]. By assumption we have M~N ∈ [b]. Because of M →1 M+ we obtain M~N →1 M+~N, as above. By induction hypothesis M+~N ∈ [b].

The next lemma allows us to decrease the information in U.

Lemma. M ∈ [U ]→ U ` b→M ∈ [b].

Proof. By induction on the type, and a side induction on M ∈ [U].

Case U = { C~a∗i | i ∈ I } and M ∈ [C~a∗i] for all i ∈ I. Then M →∗ C~Ni for some ~Ni ∈ [~a∗i]. Since →1 is deterministic (i.e., the reduct is unique if it exists), there is a common reduct C~N of all C~Ni, and by the previous lemma ~N ∈ [~a∗i]. Since U = { C~a∗i | i ∈ I } ` b, the token b is of the form C~b∗ with { a∗ij | i ∈ I } ` b∗j. Hence ~N ∈ [~b∗] by induction hypothesis, and therefore M ∈ [C~b∗].

Case U = { (~Ui, bi) | i ∈ I } and M ∈ [(~Ui, bi)] for all i ∈ I. Assume U ` (~V, c), i.e., { bi | ~V ` ~Ui } ` c. We must show M ∈ [(~V, c)]. Let ~N ∈ [~V]; we must show M~N ∈ [c]. From ~V ` ~Ui we obtain ~N ∈ [~Ui] by induction hypothesis. Now M ∈ [(~Ui, bi)] yields M~N ∈ [bi]. From { bi | ~V ` ~Ui } ` c we obtain M~N ∈ [c], again by induction hypothesis.

Theorem (Adequacy). For closed terms M

a ∈ [[M ]]→M ∈ [a].

Proof. We show for arbitrary terms M with free variables among ~x

(~U, b) ∈ [[λ~xM ]]→ λ~xM ∈ [(~U, b)],

by induction on the rules defining (~U, b) ∈ [[λ~xM]], and cases on the form of M.

Case xi.

  Ui ` b
  --------------------- (V)
  (~U, b) ∈ [[λ~xxi]].

We must show λ~xxi ∈ [(~U, b)], i.e., ∀~N∈[~U]((λ~xxi)~N ∈ [b]). Let ~N ∈ [~U]. It suffices to show Ni ∈ [b]. But this follows from Ni ∈ [Ui] and Ui ` b.

Case MN.

  (~U, V) ⊆ [[λ~xN(~x)]]    (~U, V, c) ∈ [[λ~xM(~x)]]
  -------------------------------------------------- (A)
  (~U, c) ∈ [[λ~x(M(~x)N(~x))]].

We must show ∀~K∈[~U](M(~K)N(~K) ∈ [c]). Let ~K ∈ [~U]. By induction hypothesis, for all b ∈ V we have λ~xN(~x) ∈ [(~U, b)] and hence N(~K) ∈ [b]. This means N(~K) ∈ [V]. Also, by induction hypothesis we have λ~xM(~x) ∈ [(~U, V, c)]. Therefore (λ~xM(~x))~K N(~K) ∈ [c] and hence M(~K)N(~K) ∈ [c].

Case C.

  ~V ` ~b∗
  ---------------------------- (C)
  (~U, ~V, C~b∗) ∈ [[λ~xC]].

We must show λ~xC ∈ [(~U, ~V, C~b∗)], i.e., ∀~K∈[~U]∀~L∈[~V](C~L ∈ [C~b∗]). Let ~L ∈ [~V]. Since ~V ` ~b∗ we have ~L ∈ [~b∗]. Hence C~L ∈ [C~b∗] by definition.

Case D.

  (~U, ~V, b) ∈ [[λ~x,~yM]]    ~W ` ~P(~V)
  ---------------------------------------- (D)
  (~U, ~W, b) ∈ [[λ~xD]],

with D~P(~y) = M(~y) a computation rule. To simplify notation assume that ~x, ~U are empty. We must show D ∈ [(~W, b)]. Assume ~L ∈ [~W]; we must show D~L ∈ [b]. Since ~W ` ~P(~V) we have ~L ∈ [~P(~V)]. By (6.17) there are ~N ∈ [~V] such that ~L →∗ ~P(~N). Hence D~L →∗ D~P(~N) →1 M(~N). Because of (~V, b) ∈ [[λ~yM]] by induction hypothesis λ~yM ∈ [(~V, b)], i.e., ∀~N∈[~V]((λ~yM)~N ∈ [b]). Hence (λ~yM)~N ∈ [b] for the ~N above, and by the next-to-last lemma D~L ∈ [b].

6.3. Normalization

In the adequacy theorem we have seen that whenever a closed term denotes a numeral, then a particular reduction method – head reduction – terminates and the result will be this numeral. However, in general we cannot expect that reducing an arbitrary term will terminate. Our quite general computation rules exclude this; in fact, the definition Y f = f(Y f) of the least-fixed-point operator easily leads to an example of non-termination. Moreover, we should not expect anything else, since our terms denote partial functionals.

Now suppose we want to concentrate on total functionals. This can be achieved if one gives up general computation rules and restricts attention to the (structural) higher type recursion operators introduced by Gödel (1958). In his system T Gödel considers terms built from (typed) variables and (typed) constants for the constructors and recursion operators by (type-correct) application and abstraction. For the recursion operators one can formulate their natural conversion rules, which have the form of computation rules. We will prove in this section that not only head reduction will terminate for such terms, but also arbitrary reduction sequences. The proof will be given by a “predicative” method (that is, “from below”, without quantifying over all predicates or sets).

In the final subsection we address the question whether the normal form of a term can be computed by evaluating the term in an appropriate model. This indeed can be done, but of course the value obtained must be “reified” to a term, which turns out to be the long normal form. For simplicity we restrict attention to λ-terms without defined higher order constants, given by their computation rules; however, the method works for the general case as well. In fact, the question arose when implementing normalization in Minlog. Since the underlying programming language is Scheme – a member of the Lisp family with a built-in efficient evaluation – it was tempting to use exactly this evaluation mechanism to compute normal forms. This is done in Minlog.

6.3.1. Strong normalization. We consider terms in Gödel’s T. Recall the definition of the conversion relation 7→ in 6.2.2. We define the one step reduction relation → between terms in T as follows. M → N if N is obtained from M by replacing a subterm M′ in M by N′, where M′ 7→ N′. The reduction relations →+ and →∗ are the transitive and the reflexive transitive closure of →, respectively. For ~M = M1, . . . , Mn we write ~M → ~M′ if Mi → M′i for some i ∈ {1, . . . , n} and Mj = M′j for all i ≠ j ∈ {1, . . . , n}. A term M is normal (or in normal form) if there is no term N such that M → N.

Clearly normal closed terms are of the form C~ιi ~N .

Definition. The set SN of strongly normalizing terms is inductivelydefined by

∀N ;M→N (N ∈ SN)→M ∈ SN.


Note that with M clearly every subterm of M is strongly normalizing as well.

Definition. We define strong computability predicates SCρ by inductionon ρ.

Case ιj = (µ~ξ ~κ)j . Then M ∈ SCιj if

(6.18) ∀N;M→N(N ∈ SC), and

(6.19) M = C~ιi~N → ~NP ∈ SC ∧ ⋀p=1..nR ∀~K∈SC(NRp~K ∈ SCιjp).

Case ρ→ σ.

SCρ→σ := { M | ∀N∈SCρ(MN ∈ SCσ) }.

The reference to ~NP ∈ SC and ~K ∈ SC in (6.19) is legal, because the types ~ρ, ~σi of ~N, ~K must have been generated before ιj. Note also that by (6.19) C~ιi~N ∈ SC implies ~N ∈ SC.

We now set up a sequence of lemmata leading to a proof that every term is strongly normalizing.

Lemma (Closure of SC under reduction). If M ∈ SCρ and M → M ′,then M ′ ∈ SCρ.

Proof. Induction on ρ. Case ι. By (6.18). Case ρ → σ. Assume M ∈ SCρ→σ and M → M′; we must show M′ ∈ SC. So let N ∈ SCρ; we must show M′N ∈ SCσ. But this follows from MN → M′N and MN ∈ SCσ by induction hypothesis on σ.

Lemma (Closure of SC under variable application).

∀ ~M∈SN( ~M ∈ SC→ (x ~M)ι ∈ SC).

Proof. Induction on ~M ∈ SN. Assume ~M ∈ SN and ~M ∈ SC; we must show (x~M)ι ∈ SC. So assume x~M → N; we must show N ∈ SC. Now by the form of the conversion rules N must be of the form x~M′ with ~M → ~M′. But ~M′ ∈ SC by closure of SC under reduction, hence x~M′ ∈ SC by induction hypothesis for ~M′.

Lemma. (a) SCρ ⊆ SN.
(b) x ∈ SCρ.

Proof. By simultaneous induction on ρ. Case ιj = (µ~ξ~κ)j. (a). We show that M ∈ SCιj implies M ∈ SN by (side) induction on M ∈ SCιj. So assume M ∈ SCιj; we must show M ∈ SN. But for every N with M → N we have N ∈ SC by (6.18), hence N ∈ SN by the side induction hypothesis. (b). x ∈ SCιj holds trivially.

Case ρ → σ. (a). Assume M ∈ SCρ→σ; we must show M ∈ SN. By induction hypothesis (b) for ρ we have x ∈ SCρ, hence Mx ∈ SCσ, hence Mx ∈ SN by induction hypothesis (a) for σ. But Mx ∈ SN clearly implies M ∈ SN. (b). Let ~M ∈ SC~ρ with ρ1 = ρ; we must show x~M ∈ SCι. But this follows from the closure of SC under variable application, using induction hypothesis (a) for ~ρ.

It follows that each constructor is strongly computable:


Corollary. ~N ∈ SC→ C~ιi ~N ∈ SC, i.e., C~ιi ∈ SC.

Proof. First show ∀~N∈SN(~N ∈ SC → C~ιi~N ∈ SC) by induction on ~N ∈ SN as we proved closure of SC under variable application, and then use SCρ ⊆ SN.

Lemma. ∀M,N, ~N∈SN(M(N) ~N ∈ SCι → (λxM(x))N ~N ∈ SCι).

Proof. By induction on M, N, ~N ∈ SN. Let M, N, ~N ∈ SN and assume that M(N)~N ∈ SC; we must show (λxM(x))N~N ∈ SC. Assume (λxM(x))N~N → K; we must show K ∈ SC. Case K = (λxM′(x))N′~N′ with M, N, ~N → M′, N′, ~N′. Then M(N)~N →∗ M′(N′)~N′, hence by (6.18) from our assumption M(N)~N ∈ SC we can infer M′(N′)~N′ ∈ SC, therefore (λxM′(x))N′~N′ ∈ SC by induction hypothesis. Case K = M(N)~N. Then K ∈ SC by assumption.

By induction on ρ (using SCρ ⊆ SN) it follows that this property holds for arbitrary types ρ as well:

(6.20) ∀M,N, ~N∈SN(M(N) ~N ∈ SCρ → (λxM(x))N ~N ∈ SCρ).

Lemma. ∀N∈SCιj ∀ ~M,~L∈SN( ~M, ~L ∈ SC→ RjN ~M~L ∈ SCι).

Proof. By main induction on N ∈ SCιj, and side induction on ~M, ~L ∈ SN. Assume

RjN~M~L → L.

We must show L ∈ SC.

Case 1. RjN~M′~L′ ∈ SC by the side induction hypothesis.

Case 2. RjN′~M~L ∈ SC by the main induction hypothesis.

Case 3. N = C~ιi~N and

L = Mi~N((Rj · ~M)NR1) . . . ((Rj · ~M)NRn)~L.

~M, ~L ∈ SC by assumption. ~N ∈ SC follows from N = C~ιi~N ∈ SC by (6.19). Note that for all recursive arguments NRp of N and all strongly computable ~K by (6.19) we have the induction hypothesis for NRp~K available. It remains to show (Rj · ~M)NRp = λ~xpRj(NRp~xp)~M ∈ SC. So let ~K, ~Q ∈ SC be given. We must show (λ~xpRj(NRp~xp)~M)~K~Q ∈ SC. By induction hypothesis for NRp~K we have Rj(NRp~K)~M~Q ∈ SC, since ~K, ~Q ∈ SN because of SCρ ⊆ SN. Now (6.20) yields the claim.

So in particular Rj ∈ SC.

Definition. A substitution ξ is strongly computable if ξ(x) ∈ SC for all variables x. A term M is strongly computable under substitution if Mξ ∈ SC for all strongly computable substitutions ξ.

Theorem. Every term in Gödel’s T is strongly computable under substitution.


Proof. Induction on the term M. Case x. xξ ∈ SC, since ξ is strongly computable. The cases C~ιi and Rj have been treated above. Case MN. By induction hypothesis Mξ, Nξ ∈ SC, hence (MN)ξ = (Mξ)(Nξ) ∈ SC. Case λxM. Let ξ be a strongly computable substitution; we must show (λxM)ξ = λx(Mξxx) ∈ SC. So let N ∈ SC; we must show (λx(Mξxx))N ∈ SC. By induction hypothesis MξNx ∈ SC, hence (λx(Mξxx))N ∈ SC by (6.20).

It follows that every term in Gödel’s T is strongly normalizing.

6.3.2. Normalization by evaluation. A basic question is how to practically normalize a term, say in a system like Minlog. There are many ways to do this; however, one wants to compute the normal form in a rational and efficient way. We show here – as an aside – that this can be done simply by evaluating the term itself, but in an appropriate model. Of course the value obtained must then be “reified” to a term, which turns out to be the long normal form.

Recall the notion of a simple type: ι, ρ → σ; we may also include ρ × σ. The set Λ of terms is defined by xσ, (λxρMσ)ρ→σ, (Mρ→σNρ)σ. Let Λρ denote the set of all terms of type ρ. We consider the set of terms in long normal form (i.e., normal w.r.t. β-reduction and η-expansion): (xM1 . . . Mn)ι, λxM. Abbreviate M1 . . . Mn by ~M, and xM1 . . . Mn by x~M. By nf(M) we denote the long normal form of M, i.e., the unique term in long normal form βη-equal to M.

Our goal is to define a normalization function that
(i) first evaluates a term M in a suitable (denotational) model to some object, say a, and then
(ii) converts a back into a term which is the long normal form of M.

We take terms of base type as base type objects, and all functions as possible function type objects:

[[ι]] := Λι,    [[ρ → σ]] := [[σ]][[ρ]] (the full function space).

It is crucial that all terms (of base type) are present, not just the closed ones. Next we need an assignment ↑ lifting a variable to an object, and a function ↓ giving us a normal term from an object. They should meet the following condition, to be called “correctness of normalization by evaluation”:

↓([[M ]]↑) = nf(M),

where [[Mρ]]↑ ∈ [[ρ]] denotes the value of M under the assignment ↑.

Two such functions ↓ and ↑ can be defined simultaneously, by induction on the type. It is convenient to define ↑ on all terms (not just on variables). Define ↓ρ : [[ρ]] → Λρ and ↑ρ : Λρ → [[ρ]] (called reify and reflect) by

↓ι(M) := M, ↑ι(M) := M,

↓ρ→σ(a) := λx(↓σ(a(↑ρ(x)))) (x “new”), ↑ρ→σ(M)(a) := ↑σ(M↓ρ(a)).

That x be “new” is not a problem for an implementation, where we have an operational understanding and may use something like gensym, but it is for a mathematical model. We therefore refine our model by considering term families.


Since normalization by evaluation needs to create bound variables when “reifying” abstract objects of higher type, it is useful to follow de Bruijn’s (1972) style of representing bound variables in terms. This is done here – as in Berger and Schwichtenberg (1991); Filinski (1999) – by means of term families. A term family is a parametrized version of a given term M. The idea is that the term family of M at index k reproduces M with bound variables renamed starting at k. For example, for

M := λu,v(cλx(vx)λy,z(zu))

the associated term family M∞ at index 3 yields

M∞(3) := λx3,x4(cλx5(x4x5)λx5,x6(x6x3)).

We denote terms by M, N, K, . . . , and term families by r, s, t, . . . . To every term Mρ we assign a term family M∞ : N → Λρ by

x∞(k) := x,

(λyM)∞(k) := λxk(M [y := xk]∞(k + 1)),

(MN)∞(k) := M∞(k)N∞(k).

The application of a term family r : N → Λρ→σ to s : N → Λρ is the family rs : N → Λσ defined by (rs)(k) := r(k)s(k). Hence e.g. (MN)∞ = M∞N∞. Let k > FV(M) mean that k is greater than all i such that xρi ∈ FV(M) for some type ρ.
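The defining equations for M∞ translate directly into a small program. The following Python sketch is our own rendering (the tuple term representation is an assumption); it computes M∞(k) and reproduces the example above:

```python
# Sketch of term families: family(M, k) reproduces M with bound variables
# renamed x_k, x_{k+1}, ... following the (lam y M)^infty(k) clause.
# Terms: ('var', name) | ('lam', name, body) | ('app', fun, arg)

def rename(term, old, new):
    """Replace free occurrences of the variable `old` by the variable `new`."""
    tag = term[0]
    if tag == 'var':
        return ('var', new) if term[1] == old else term
    if tag == 'lam':
        return term if term[1] == old else ('lam', term[1], rename(term[2], old, new))
    return ('app', rename(term[1], old, new), rename(term[2], old, new))

def family(term, k):
    """M^infty(k):  x^infty(k) = x;
    (lam y M)^infty(k) = lam x_k (M[y := x_k]^infty(k+1));
    (M N)^infty(k) = M^infty(k) N^infty(k)."""
    tag = term[0]
    if tag == 'var':
        return term
    if tag == 'lam':
        xk = f'x{k}'
        return ('lam', xk, family(rename(term[2], term[1], xk), k + 1))
    return ('app', family(term[1], k), family(term[2], k))
```

Applied at index 3 to the term M = λu,v(c λx(vx) λy,z(zu)) from the example, `family` yields λx3,x4(c λx5(x4x5) λx5,x6(x6x3)), as stated in the text.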

Lemma. (a) If M =α N, then M∞ = N∞.
(b) If k > FV(M), then M∞(k) =α M.

Proof. (a) Induction on the height |M| of M. Only the case where M and N are abstractions is critical. So assume λyρM =α λzρN. Then M[y := P] =α N[z := P] for all terms Pρ. In particular M[y := xk] =α N[z := xk] for arbitrary k ∈ N. Hence M[y := xk]∞(k + 1) = N[z := xk]∞(k + 1), by induction hypothesis. Therefore

(λyM)∞(k) = λxk(M [y := xk]∞(k + 1))

= λxk(N [z := xk]∞(k + 1))

= (λzN)∞(k).

(b) Induction on |M|. We only consider the case λyM. The assumption k > FV(λyM) implies xk ∉ FV(λyM) and hence λyM =α λxk(M[y := xk]). Furthermore k + 1 > FV(M[y := xk]), and hence M[y := xk]∞(k + 1) =α M[y := xk], by induction hypothesis. Therefore

(λyM)∞(k) = λxk(M[y := xk]∞(k + 1)) =α λxk(M[y := xk]) =α λyM.

Let ext(r) := r(k), where k is the least number greater than all i such that some variable of the form xρi occurs (free or bound) in r(0).

Lemma. ext(M∞) =α M.

Proof. ext(M∞) = M∞(k) for the least k > i for all i such that xρi occurs (free or bound) in M∞(0), hence k > FV(M). Now use part (b) of the lemma above.


We now aim at proving correctness of normalization by evaluation. First we refine our model by allowing term families:

[[ι]] := ΛNι , [[ρ→ σ]] := [[σ]][[ρ]] (full function spaces).

For every type ρ we define two functions

↓ρ : [[ρ]]→ (N→ Λρ) (“reify”), ↑ρ : (N→ Λρ)→ [[ρ]] (“reflect”),

simultaneously, by induction on ρ:

↓ι(r) := r,    ↑ι(r) := r,

↓ρ→σ(a)(k) := λxρk(↓σ(a(↑ρ(x∞k)))(k + 1)),    ↑ρ→σ(r)(b) := ↑σ(r↓ρ(b)).

Then, for ai ∈ [[ρi]],

(6.21) ↑~ρ→σ(r)(a1, . . . , an) = ↑σ(r ↓ρ1(a1) . . . ↓ρn(an)).

Theorem (Correctness of normalization by evaluation). For terms M in long normal form we have

↓([[M]]↑) = M∞,

where [[M]]↑ denotes the value of M in the environment given by ↑.

Proof. Induction on the height of M. Case λyρMσ.

↓([[λyM]]↑)(k) = λxk(↓([[λyM]]↑(↑(x∞k)))(k + 1))
             = λxk(↓([[M[y := xk]]]↑)(k + 1))
             = λxk(M[y := xk]∞(k + 1))   by induction hypothesis
             = (λyM)∞(k).

Case (x ~M)ι. By (6.21) and the induction hypothesis we obtain [[x ~M ]]↑ =↑(x∞)([[ ~M ]]↑) = x∞↓([[ ~M ]]↑) = x∞ ~M∞ =

(x ~M

)∞.
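The reify/reflect scheme above is the core of normalization by evaluation. As a rough, untyped Python sketch of ours (the names reflect, reify, evaluate, nbe and the tuple term representation are illustrative assumptions, not the book's notation), terms are evaluated into a semantic domain and read back with a fresh-variable counter k, much as in the definition of M∞:

```python
# Terms: ('var', x) | ('lam', x, body) | ('app', f, a)

def reflect(fam):
    # Turn a term family fam : k -> term into a semantic value; applying
    # the value just extends the syntactic spine (the "reflect" direction).
    def neutral(arg):
        return reflect(lambda k: ('app', fam(k), reify(arg)(k)))
    neutral.family = fam            # tag reflected neutrals
    return neutral

def reify(val):
    # Turn a semantic value back into a term family (the "reify" direction).
    def fam(k):
        if hasattr(val, 'family'):  # a reflected (neutral) value
            return val.family(k)
        x = 'x%d' % k               # fresh variable, as in M[y := xk]
        body = reify(val(reflect(lambda _k: ('var', x))))
        return ('lam', x, body(k + 1))
    return fam

def evaluate(term, env):
    tag = term[0]
    if tag == 'var':
        return env[term[1]]
    if tag == 'lam':
        _, x, body = term
        return lambda v: evaluate(body, {**env, x: v})
    _, f, a = term                  # application
    return evaluate(f, env)(evaluate(a, env))

def nbe(term):
    # Normalize a closed term: evaluate, then reify at counter 0.
    return reify(evaluate(term, {}))(0)

I = ('lam', 'y', ('var', 'y'))
print(nbe(('app', ('lam', 'x', ('var', 'x')), I)))  # ('lam', 'x0', ('var', 'x0'))
```

The fresh variables x0, x1, ... are chosen by the counter exactly as the xk in the proof above, so α-equivalent inputs yield identical normal forms.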

6.4. Computable Functionals

We now study our abstract notion of computability in more detail. The essential tool will be recursion, and in the proof of the Kleene Recursion Theorem in 2.4.3 we have already seen that solutions to recursive definitions can be obtained as least fixed points of certain higher type operators. This approach can be carried over to recursion in a higher order setting by means of least-fixed-point operators Yρ of type (ρ → ρ) → ρ defined by the computation rule

Yρf = f(Yρf).

[[Y]] has the property that (W, b) ∈ [[Y]] implies W^n∅ ⊢ b for some n. We will prove this fact in 6.4.1, from the inductive definition of [[Y]].
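The computation rule Yρf = f(Yρf) can be imitated in any language with first-class functions. Here is a small Python sketch of ours (not part of the book's formal system); the eta-expansion delays the unfolding, so the recursion terminates under call-by-value evaluation:

```python
def Y(f):
    # Y f = f (Y f); the lambda delays unfolding so that only the
    # approximations actually demanded are computed.
    return lambda *args: f(Y(f))(*args)

# A least-fixed-point definition of the factorial functional:
fact = Y(lambda rec: lambda n: 1 if n == 0 else n * rec(n - 1))
print(fact(5))  # 120
```

Each call unfolds Y f one step, mirroring how W^n∅ approximates the least fixed point from below.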

We need to consider some further continuous functionals, the parallel conditional pcond of type B → N → N → N and a continuous approximation ∃ of type (N → B) → B to the existential quantifier. The main result of this section is that a continuous functional is computable if and only if it is "recursive in pcond and ∃", where the latter notion (defined below) refers to the fixed point operators.

The denotation of the constants pcond and ∃ will be defined in a "point-free" way, i.e., not referring to "points" or ideals but rather to their (finite) approximations. This is done by adding some rules to the inductive definition of (~U, a) ∈ [[λ~xM]] in 6.2.6.

It will be necessary to refer to a given enumeration (an)n∈N of the tokens in our underlying information system AN. Let Sn0 denote the token of AN as well as the ideal it generates. We shall make use of an "administrative" function valmax : N → N → N such that valmax(x, Sn0) compares the ideal x ∈ |AN| with the ideal generated by the n-th token an:

valmax(x, Sn0) = x if an ∈ x, and an otherwise.

We introduce valmax in a point-free way as a constant, by adding appropriate rules.
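As a toy model of ours (not the book's point-free, rule-based definition), one can picture valmax on ideals represented as finite sets of tokens drawn from a fixed enumeration:

```python
TOKENS = ['a0', 'a1', 'a2']  # a toy enumeration a_0, a_1, ... of tokens

def valmax(x, n):
    # valmax(x, S^n 0) = x if a_n is in x, and the ideal generated
    # by a_n otherwise.
    return x if TOKENS[n] in x else {TOKENS[n]}

print(valmax({'a0'}, 2))  # {'a2'}
```

In the book the same behaviour is generated by rules on finite approximations rather than by this direct case distinction.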

6.4.1. Fixed point operators. Recall that the fixed point operators Yρ were defined by the computation rule Yρf = f(Yρf).

Proposition. ( ~W, W, b) ∈ [[λ~xY]] implies that W^n∅ ⊢ b, where n is the D-height of the given derivation.

Proof. According to the rules in 6.2.6, every derivation of ( ~W, W′, b) ∈ [[λ~xY]] must have the following form. The last inference (D), with side condition W′ ⊢ W, concludes ( ~W, W′, b) ∈ [[λ~xY]] from ( ~W, W, b) ∈ [[λ~x,f (f(Y f))]]. The latter is inferred from W ⊢ (V, b) and ( ~W, W, V, b) ∈ [[λ~x,f f]] together with premises ( ~W, W, bi) ∈ [[λ~x,f (Y f)]] for the bi ∈ V; and each of these comes from ( ~W, W, Wi, bi) ∈ [[λ~x,f Y]] together with W ⊢ (Vij, bij) and ( ~W, W, Vij, bij) ∈ [[λ~x,f f]], where V := { bi | i ∈ I } and Wi := { (Vij, bij) | j ∈ Ii }.

The proof is by induction on the D-height. We have ( ~W, W, Wi, bi) ∈ [[λ~x,f Y]], W ⊢ Wi and W ⊢ (V, b). By induction hypothesis Wi^{ni}∅ ⊢ bi. Monotonicity of application implies Wi^{ni}∅ ⊆ W^{ni}∅. Because of W^n∅ ⊆ W^{n+1}∅ (proved by induction on n, using monotonicity) we obtain W^n∅ ⊢ bi with n := max ni. Recall that W ⊢ (V, b) was defined to mean WV ⊢ b. Hence W^n∅ ⊢ V, which by monotonicity of application implies W(W^n∅) ⊢ b, as required.

6.4.2. Rules for pcond, ∃ and valmax. For pcond we have the rules

(P1)  from U ⊢ tt and V ⊢ a infer (~U, U, V, W, a) ∈ [[λ~x pcond]],
(P2)  from U ⊢ ff and W ⊢ a infer (~U, U, V, W, a) ∈ [[λ~x pcond]],
(P3)  from V ⊢ a and W ⊢ a infer (~U, U, V, W, a) ∈ [[λ~x pcond]],

and for ∃

(E1)  from U ⊢ (∅, ff) infer (~U, U, ff) ∈ [[λ~x ∃]],
(E2)  from U ⊢ (S∗, tt) infer (~U, U, tt) ∈ [[λ~x ∃]],
(E3)  from U ⊢ (0, tt) infer (~U, U, tt) ∈ [[λ~x ∃]].

The rules for valmax are

(M1)  from U ⊢ an and U ⊢ a infer (~U, U, Sn0, a) ∈ [[λ~x valmax]],
(M2)  from an ⊢ a infer (~U, U, Sn0, a) ∈ [[λ~x valmax]].


One can check easily that the lemmata proved in 6.2.6 and 6.2.7 continue to hold for the extended set of rules. Moreover one can prove easily that pcond, ∃ and valmax denote the intended (continuous) functionals:

Lemma (Properties of pcond, ∃ and valmax).

[[pcond]](z, x, y) = x if z = [[tt]];  y if z = [[ff]];  x if x = y.

[[∃]](x) = [[ff]] if (∅, ff) ∈ x;  [[tt]] if (S∗, tt) ∈ x or (0, tt) ∈ x.

[[valmax]](x, y) = x if Sn0 ∈ y and an ∈ x;  an if Sn0 ∈ y and an ∉ x.

Note that an n with Sn0 ∈ y is uniquely determined if it exists. Note also that for an algebra ι with at most unary constructors any two consistent ideals x, y ∈ |Cι| are comparable, i.e., x ⊆ y or y ⊆ x. (A counterexample for an algebra with a binary constructor C and a nullary 0 is C∗0 and C0∗: they are consistent, but incomparable.) Hence, if the token an is consistent with the ideal x, then

(6.22) [[valmax]](x, Sn0) = an ∪ x.

This will be needed below. From pcond we can explicitly define the parallel or of type B → B → B by ∨(p, q) := pcond(p, tt, q). Then

[[∨]](x, y) = [[tt]] if x = [[tt]] or y = [[tt]];  [[ff]] if x = y = [[ff]].
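As a rough illustration of ours, the behaviour described by the lemma can be mimicked on a flat three-valued domain in Python, with None standing in for the undefined ideal ∅:

```python
def pcond(z, x, y):
    # Parallel conditional: defined when the test is, or when both
    # branches agree (the parallel clause (P3) above).
    if z is True:
        return x
    if z is False:
        return y
    return x if (x == y and x is not None) else None

def por(p, q):
    # Parallel or: por(p, q) = pcond(p, tt, q).
    return pcond(p, True, q)

print(por(None, True), por(True, None), por(None, False))
# True True None
```

Note that por(None, True) is True: the parallel or yields tt as soon as one argument is tt, even when the other is undefined, which no sequential definition can do.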

6.4.3. Plotkin's definability theorem.

Definition. A partial continuous functional Φ of type ρ1 → · · · → ρp → N is said to be recursive in pcond and ∃ if it can be defined explicitly by a term involving the constructors 0, S and the constants predecessor, the fixed point operators Yρ, valmax, pcond and ∃.

Theorem (Plotkin). A partial continuous functional is computable if and only if it is recursive in pcond and ∃.

Proof. The fact that the constants are defined by the rules above implies that the ideals they denote are recursively enumerable. Hence every functional recursive in pcond and ∃ is computable. For the converse let Φ be computable of type ρ1 → · · · → ρp → N. Then Φ is a primitive recursively enumerable set of tokens

Φ = { (U^1_{f1 n}, . . . , U^p_{fp n}, a_{g n}) | n ∈ N }

where for each type ρj, (U^j_i)_{i∈N} is an enumeration of Conρj, and f1, . . . , fp and g are fixed primitive recursive functions. Henceforth we will drop the superscripts from the U's.


Let ~ϕ = ϕ1, . . . , ϕp be arbitrary continuous functionals of types ρ1, . . . , ρp respectively. We show that Φ is definable by the equation

Φ~ϕ = Y w~ϕ 0

with w~ϕ of type (N → N) → N → N given by

w~ϕ ψ n := pcond(inconsρ1(ϕ1, f1 n) ∨ · · · ∨ inconsρp(ϕp, fp n), ψ(n+1), valmax(ψ(n+1), g n)).

Here the inconsρi's of type ρi → N → B are continuous functionals such that

incons(ϕ, n) = tt if ϕ ∪ Un is inconsistent;  ff if ϕ ⊇ Un;  ∅ otherwise.

We will prove in the lemma below that there are such functionals recursive in pcond and ∃; their definition will involve the functional ∃.

For notational simplicity we assume p = 1 in the argument to follow, and write w for wϕ. We first prove that

∀n(w^{k+1}∅ n ⊢ a → ∃ n≤l≤n+k (ϕ ⊇ U_{fl} ∧ a_{gl} ⊢ a)).

The proof is by induction on k. For the base case assume w∅n ⊢ a, i.e.,

pcond(incons(ϕ, fn), ∅, valmax(∅, gn)) ⊢ a.

Then clearly ϕ ⊇ U_{fn} and a_{gn} ⊢ a. For the step k ↦ k+1 we have

w^{k+2}∅ n ⊢ a,
w(w^{k+1}∅) n ⊢ a,
pcond(incons(ϕ, fn), w^{k+1}∅(n+1), valmax(w^{k+1}∅(n+1), gn)) ⊢ a.

Then either w^{k+1}∅(n+1) ⊢ a or else ϕ ⊇ U_{fn} and a_{gn} ⊢ a, and hence the claim follows from the induction hypothesis.

Now Φϕ ⊇ Y w0 follows easily. Assume a ∈ Y w0. Then w^{k+1}∅ 0 ⊢ a for some k, by the proposition in 6.4.1. Therefore there is an l with 0 ≤ l ≤ k such that ϕ ⊇ U_{fl} and a_{gl} ⊢ a. But this implies a ∈ Φϕ.

For the converse assume a ∈ Φϕ. Then by definition of Φ there must be an n such that ϕ ⊇ U_{fn} and a_{gn} ⊢ a; pick the minimal such n. We show

a ∈ w^{k+1}∅(n − k)   for k ≤ n.

The proof is by induction on k. For the base case k = 0, because of ϕ ⊇ U_{fn} we have incons(ϕ, fn) = ff and hence wψn = valmax(ψ(n+1), gn) ⊇ a_{gn} ⊢ a. For the step k ↦ k+1, by definition of w (:= wϕ)

v′ := w^{k+2}∅(n − k − 1)
= w(w^{k+1}∅)(n − k − 1)
= pcond(incons(ϕ, f(n−k−1)), v, valmax(v, g(n−k−1)))

with v := w^{k+1}∅(n − k). By induction hypothesis a ∈ v. We show a ∈ v′. By definition of incons we can assume that ϕ and U_{f(n−k−1)} are consistent. Consider Φϕ′ with ϕ′ := ϕ ∪ U_{f(n−k−1)}. Then both a and a_{g(n−k−1)} are in Φϕ′, and hence they are consistent. Since our underlying algebra AN has at most unary constructors it follows that a and a_{g(n−k−1)} are comparable. In case a_{g(n−k−1)} ⊢ a we have v′ = valmax(v, g(n−k−1)) ⊇ a_{g(n−k−1)} ⊢ a, and in case a ⊢ a_{g(n−k−1)}, because of v ∋ a ⊢ a_{g(n−k−1)} we have v′ = valmax(v, g(n−k−1)) = v ∋ a.

Now the converse inclusion Φϕ ⊆ Y wϕ 0 can be seen easily. Assume a ∈ Φϕ. The claim just proved for k := n gives a ∈ w_ϕ^{n+1}∅ 0, and this implies a ∈ Y wϕ 0.

Lemma. There are functionals enρ of type N → N → ρ and inconsρ of type ρ → N → B, both recursive in pcond and ∃, such that
(a) en(m) enumerates all finitely generated extensions of Um, thus

en(m, ∅) = Um,
en(m, n) = Un   if Un ⊇ Um.

(b)

incons(ϕ, n) = tt if ϕ ∪ Un is inconsistent;  ff if ϕ ⊇ Un.

Proof. By induction on ρ.

(a). We first prove that there is a functional enρ recursive in pcond and ∃ with the properties stated. For its definition we need to look in more detail into the definition of the sets Um of type ρ.

For any type ρ, fix an enumeration (U^ρ_n)_{n∈N} of Conρ such that U0 = ∅ and the following relations are primitive recursive:

Un ⊆ Um,
Un ∪ Um ∈ Conρ,
U^{ρ→σ}_n U^ρ_m = U^σ_k,
Un ∪ Um = Uk   (with k = 0 if Un ∪ Um ∉ Conρ).

We also assume an enumeration (b^ρ_i)_{i∈N} of the set of tokens of type ρ.

Note that any primitive recursive function f can be regarded as a continuous functional of type N → · · · → N → N, if we identify it with its strict extension. It is easy to see that any primitive recursive function can be represented in this way by a term involving 0, successor, predecessor, the least-fixed-point operator YN→N and the cases-operator C. For instance, addition can be written as

m + n = YN→N(λϕλx[if x = 0 then m else ϕ(x − 1) + 1 fi]) n.
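The displayed fixed-point definition of addition can be transcribed directly into Python; this is an illustrative sketch of ours, with the eta-expansion in Y delaying the unfolding under call-by-value evaluation:

```python
def Y(f):
    # Y f = f (Y f), eta-expanded for call-by-value evaluation
    return lambda x: f(Y(f))(x)

def add(m, n):
    # m + n = Y(lambda phi: lambda x: if x = 0 then m else phi(x - 1) + 1)(n)
    return Y(lambda phi: lambda x: m if x == 0 else phi(x - 1) + 1)(n)

print(add(3, 4))  # 7
```

The same pattern (one Y per equation, with the cases-operator rendered as a conditional) covers any primitive recursive definition.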

Let ρ = ρ1 → · · · → ρp → N and j, k and h be primitive recursive functions such that

Um = { (U_{j(m,1,l)}, . . . , U_{j(m,p,l)}, a_{k(m,l)}) | l < hm }.

enρ will be defined from an auxiliary functional Ψ of type ρ1 → · · · → ρp → N → N → N given by

Ψ(~ϕ, m, d, 0) := d,
Ψ(~ϕ, m, d, l+1) := pcond(pl, Ψ(~ϕ, m, d, l), valmax(Ψ(~ϕ, m, d, l), k(m, l)))


where pl denotes inconsρ1(ϕ1, j(m,1,l)) ∨ · · · ∨ inconsρp(ϕp, j(m,p,l)). Hence

pl = tt if ϕi ∪ U_{j(m,i,l)} is inconsistent for some i = 1 . . . p;  ff if ϕi ⊇ U_{j(m,i,l)} for all i = 1 . . . p;  ∅ otherwise.

Let

U^0_m := ∅,    U^{l+1}_m := U^l_m ∪ { (U_{j(m,1,l)}, . . . , U_{j(m,p,l)}, a_{k(m,l)}) }.

We first show that

(6.23) Ψ(~ϕ, m, ∅, l) = U^l_m(~ϕ).

This is proved by induction on l. For l = 0 both sides are ∅. In the step l → l+1 we distinguish three cases according to the possible values tt, ff and ∅ of pl.

Case pl = tt. By definition of Ψ, the induction hypothesis and the fact that pl = tt implies that ϕi ∪ U_{j(m,i,l)} is inconsistent for some i = 1 . . . p, we obtain

Ψ(~ϕ, m, ∅, l+1) = Ψ(~ϕ, m, ∅, l) = U^l_m(~ϕ) = U^{l+1}_m(~ϕ).

Case pl = ff. Then ϕi ⊇ U_{j(m,i,l)} for all i = 1 . . . p. Now the consistency of U^{l+1}_m implies that U^l_m(~ϕ) ∪ a_{k(m,l)} is consistent and therefore by (6.22)

valmax(U^l_m(~ϕ), k(m, l)) = a_{k(m,l)} ∪ U^l_m(~ϕ) = U^{l+1}_m(~ϕ).

Hence the claim, by definition of Ψ and the induction hypothesis.

Case pl = ∅. Then by definition of Ψ and the (rule-based) definition of valmax

Ψ(~ϕ, m, ∅, l+1) = Ψ(~ϕ, m, ∅, l)

(both ideals consist of the same tokens). Moreover U^{l+1}_m(~ϕ) = U^l_m(~ϕ), by definition of pl. This completes the proof of (6.23).

Next we show

(6.24) Ψ(~ϕ, m, d, l) = d   for d ⊇ Um(~ϕ).

The proof is by induction on l. For l = 0 we have Ψ(~ϕ, m, d, 0) = d by definition. In the step l → l+1 we again distinguish cases according to the possible values of pl. In case pl = tt we know that ϕi ∪ U_{j(m,i,l)} is inconsistent for some i = 1 . . . p, hence we have Ψ(~ϕ, m, d, l+1) = Ψ(~ϕ, m, d, l) = d by induction hypothesis. In case pl = ff we know a_{k(m,l)} ∈ Um(~ϕ) ⊆ d. Hence the claim follows from the induction hypothesis and the property (6.22) of valmax. In case pl = ∅ we have Ψ(~ϕ, m, d, l+1) = Ψ(~ϕ, m, d, l) by definition of Ψ and the definition of valmax, and the claim follows from the induction hypothesis. This completes the proof of (6.24).

We can now proceed with the proof of (a). Let

Φ(~ϕ, m, d) := Ψ(~ϕ, m, d, hm),
en(m, n, ~ϕ) := Φ(~ϕ, m, Φ(~ϕ, n, ∅)).

Recall that Φ(~ϕ, n, ∅) = Un(~ϕ) by (6.23). The first property of en is now obvious, since

en(m, ∅, ~ϕ) = Φ(~ϕ, m, Φ(~ϕ, ∅, ∅)) = Φ(~ϕ, m, ∅) = Um(~ϕ).


For the second property let Un ⊇ Um and ~ϕ be given, and d := Un(~ϕ). Then by definition en(m, n, ~ϕ) = Φ(~ϕ, m, d), and Φ(~ϕ, m, d) = d follows from (6.24).

(b). Let ρ = σ → τ and f, g be primitive recursive functions such that the i-th token at type ρ is a^ρ_i = (U^σ_{fi}, a^τ_{gi}). We will define inconsρ from similar functionals [ic]ρ of type ρ → N → B with the property

[ic](ϕ, i) = tt if ϕ ∪ ai is inconsistent;  ff if ϕ ⊇ ai.

To see that there are such [ic]'s recursive in pcond and ∃ observe that the following are equivalent:

[ic]ρ(ϕ, i) = tt,
ϕ ∪ ai is inconsistent,
ϕ ∪ (U_{fi}, a_{gi}) is inconsistent,
∃n(Un ⊇ U_{fi} and ϕ(Un) ∪ a_{gi} is inconsistent),
∃n(ϕ(enσ(fi, n)) ∪ a_{gi} is inconsistent),
∃n([ic]τ(ϕ(enσ(fi, n)), gi) = tt),

and also

[ic]ρ(ϕ, i) = ff,
ai ∈ ϕ,
(U_{fi}, a_{gi}) ∈ ϕ,
a_{gi} ∈ ϕ(U_{fi}),
[ic]τ(ϕ(enσ(fi, ∅)), gi) = ff.

Hence we can define

[ic]ρ(ϕ, i) := ∃(λn [ic]τ(ϕ(enσ(fi, n)), gi)).

We still have to define inconsρ from [ic]ρ. Let

[ic]∗(ϕ, n, 0) := ff,
[ic]∗(ϕ, n, l+1) := [ic]∗(ϕ, n, l) ∨ [ic](ϕ, j(n, l)),

where j(n, l) is defined by U^ρ_n = { a_{j(n,l)} | l < hn }. It is now easy to see that inconsρ with the properties required above can be defined by

inconsρ(ϕ, n) := [ic]∗ρ(ϕ, n, hn).

Note that one needs the coherence of Conρ here. Note also that we do need the parallel or in the definition of [ic]∗.

6.5. Total Functionals

We now single out the total continuous functionals from the partial ones. Our main goal will be the density theorem, which says that every finite functional can be extended to a total one.


6.5.1. Total and structure-total ideals. The total and structure-total ideals in the information system Cι of a finitary algebra ι have been defined in 6.1.7. We now extend this definition to arbitrary types.

Definition. The total ideals of type ρ are defined inductively.
(a) Case ι. For an algebra ι, the total ideals x are those of the form C~z with C a constructor of ι and ~z total (C denotes the continuous function |rC|).
(b) Case ρ → σ. An ideal r of type ρ → σ is total if and only if for all total z of type ρ, the result |r|(z) of applying r to z is total.

The structure-total ideals are defined similarly; the difference is that in case ι the ideals at parameter positions of C need not be total. We write x ∈ Gρ to mean that x is a total ideal of type ρ.

Remark. Note that in the arrow case of the definition of totality, we have made use of the universal quantifier "for all total z of type ρ" with an implication in its kernel. So using the concept of a total computable functional to explain the meaning of the logical connectives, as it is done in the Brouwer-Heyting-Kolmogorov interpretation (see 7.1.1), is in this sense circular.

6.5.2. Equality for total functionals.

Definition. An equivalence ∼ρ between total ideals x1, x2 ∈ Gρ is defined inductively.
(a) Case ι. For an algebra ι, two total ideals x1, x2 are equivalent if both are of the form C~zi with the same constructor C of ι, and we have z1j ∼τ z2j for all j.
(b) Case ρ → σ. For f, g ∈ Gρ→σ define f ∼ρ→σ g by ∀x∈Gρ(fx ∼σ gx).

Clearly ∼ρ is an equivalence relation. Similarly, one can define an equivalence relation ≈ρ between structure-total ideals x1, x2.

We obviously want to know that ∼ρ (and similarly ≈ρ) is compatible with application; we only treat ∼ρ here. The nontrivial part of this argument is to show that x ∼ρ y implies fx ∼σ fy. First we need some lemmata. Recall that our partial continuous functionals are ideals (i.e., certain sets of tokens) in the information systems Cρ.

Lemma. If f ∈ Gρ, g ∈ |Cρ| and f ⊆ g, then g ∈ Gρ.

Proof. By induction on ρ. For base types ι the claim easily follows from the induction hypothesis. Case ρ → σ: Assume f ∈ Gρ→σ and f ⊆ g. We must show g ∈ Gρ→σ. So let x ∈ Gρ. We have to show gx ∈ Gσ. But gx ⊇ fx ∈ Gσ, so the claim follows by induction hypothesis.

Lemma. (f1 ∩ f2)x = f1x ∩ f2x, for f1, f2 ∈ |Cρ→σ| and x ∈ |Cρ|.

Proof. By the definition of |r|,

|f1 ∩ f2|x = { b ∈ Cσ | ∃U⊆x((U, b) ∈ f1 ∩ f2) }
= { b ∈ Cσ | ∃U1⊆x((U1, b) ∈ f1) } ∩ { b ∈ Cσ | ∃U2⊆x((U2, b) ∈ f2) }
= |f1|x ∩ |f2|x.

The part ⊆ of the middle equality is obvious. For ⊇, let Ui ⊆ x with (Ui, b) ∈ fi be given. Choose U := U1 ∪ U2. Then clearly (U, b) ∈ fi (as (Ui, b) ⊢ (U, b) and fi is deductively closed).

Lemma. f ∼ρ g if and only if f ∩ g ∈ Gρ, for f, g ∈ Gρ.

Proof. By induction on ρ. For base types ι the claim easily follows from the induction hypothesis. Case ρ → σ:

f ∼ρ→σ g ↔ ∀x∈Gρ(fx ∼σ gx)
↔ ∀x∈Gρ(fx ∩ gx ∈ Gσ)   by induction hypothesis
↔ ∀x∈Gρ((f ∩ g)x ∈ Gσ)   by the last lemma
↔ f ∩ g ∈ Gρ→σ.

This completes the proof.

Theorem. x ∼ρ y implies fx ∼σ fy, for x, y ∈ Gρ and f ∈ Gρ→σ.

Proof. Since x ∼ρ y we have x ∩ y ∈ Gρ by the previous lemma. Now fx, fy ⊇ f(x ∩ y) and hence fx ∩ fy ∈ Gσ. But this implies fx ∼σ fy, again by the previous lemma.

6.5.3. Dense and separating sets. We prove the density theorem, which says that every finitely generated functional (i.e., every U with U ∈ Conρ) can be extended to a total one. Notice that we need to know here that the base types have nullary constructors, as required in 6.1.4. Otherwise, density might fail for the trivial reason that there are no total ideals at all (e.g., in µξ(ξ → ξ)).

Definition. A type ρ is called dense if

∀U∈Conρ ∃x∈Gρ (U ⊆ x),

and separating if

∀U,V∈Conρ (U ∪ V ∉ Conρ → ∃x∈Gρ→B((U, tt) ∈ x ∧ (V, ff) ∈ x)).

We prove that every type ρ is both dense and separating. This extended claim is needed for the inductive argument.

Define the depth dp(a∗) of an extended token a∗ by

dp(Ca∗1 . . . a∗n) := 1 + max{ dp(a∗i) | i = 1, . . . , n },    dp(∗) := 0,

and the depth dp(U) of a formal neighborhood U by

dp({ ai | i ∈ I }) := max{ 1 + dp(ai) | i ∈ I }.

Remark. Let U ∈ Conι be non-empty. Then every token in U starts with the same constructor C. Let Ui consist of all tokens at the i-th argument position of some token in U. Then C~U ⊢ U (and also U ⊢ C~U), and dp(Ui) < dp(U). (Recall that

C~U := { C~a∗ | a∗i ∈ Ui if Ui ≠ ∅, and a∗i = ∗ otherwise }

was defined in 6.2.6, in the proof of (6.16).)


Theorem (Density). For every U ∈ Conρ,
(a) ∃x∈Gρ(U ⊆ x) and
(b) ∀V∈Conρ(U ∪ V ∉ Conρ → ∃x∈Gρ→B((U, tt) ∈ x ∧ (V, ff) ∈ x)).
Moreover, the required x ∈ G can be chosen to be Σ⁰₁-definable in both cases.

Proof. The proof is by induction on the depth dp(U) of U, using a case distinction on the form of the type ρ.

Case ι. For U = ∅ both claims are easy. Notice that for (a) we need that every base type has a total ideal. Now assume that U ∈ Conι is non-empty. Define Ui from U as in the remark above; then C~U ⊢ U.

(a). By induction hypothesis (a) there are ~x ∈ G such that Ui ⊆ xi. Then for x := |rC|~x ∈ Gι we have U ⊆ x, since C~U ⊆ x and C~U ⊢ U.

(b). Assume U ∪ V ∉ Con. We need z ∈ Gι→B such that (U, tt), (V, ff) ∈ z. Define Vi from V as in the remark above; then C′~V ⊢ V. If C = C′, we have Ui ∪ Vi ∉ Con for some i. The induction hypothesis (b) for Ui yields z′ ∈ Gρi→B such that (Ui, tt), (Vi, ff) ∈ z′. Define p ∈ Gι→ρi by the computation rules p(C~x) = xi and p(C′′~y) = y for every constructor C′′ ≠ C, with a fixed y ∈ Gρi. Let z := z′ ∘ p. Then z ∈ Gι→B, and (U, tt) ∈ z because of C~U ⊢ U, (C~U, Ui) ⊆ p and (Ui, tt) ∈ z′; similarly (V, ff) ∈ z. If C ≠ C′, define z ∈ Gι→B by z(C~x) = tt and z(C′′~y) = ff for all constructors C′′ ≠ C. Then clearly (U, tt), (V, ff) ∈ z.

Case ρ → σ. (b). Let W1, W2 ∈ Conρ→σ and assume W1 ∪ W2 ∉ Conρ→σ. Then there are (Ui, ai) ∈ Wi (i = 1, 2) with U1 ∪ U2 ∈ Conρ but a1 ⊬ a2. Because of dp(U1 ∪ U2) < max{ dp(W1), dp(W2) }, by induction hypothesis (a) we have x ∈ Gρ such that U1 ∪ U2 ⊆ x. By induction hypothesis (b) we have v ∈ Gσ→B such that ({a1}, tt), ({a2}, ff) ∈ v.

We need z ∈ G(ρ→σ)→B such that (W1, tt), (W2, ff) ∈ z. It suffices to have ((U1, a1), tt), ((U2, a2), ff) ∈ z. Define z by zy := v(yx) (with v, x fixed as above). Then clearly z ∈ G(ρ→σ)→B. We show ((U1, a1), tt) ∈ z. For the proof we refer to the rules in 6.2.6. ((U1, a1), tt) ∈ z must have been inferred by rule (D) from

(W, tt) ∈ [[λy(v(yx))]]   and   (U1, a1) ⊢ W

for some W. Let W := {(U1, a1)}, which makes the right hand side trivially true. The left hand side must have been inferred by rule (A) from

((U1, a1), V, tt) ∈ [[λy v]]   and   ((U1, a1), V) ⊆ [[λy(yx)]]

for some V. Let V := {a1}. Then by definition of v the left hand side holds. The right hand side must have been inferred by rule (A) from

((U1, a1), U, a1) ∈ [[λy y]]   and   ((U1, a1), U) ⊆ [[λy x]]

for some U. Let U := U1. Then by definition of x the right hand side is true, and because of (U1, a1) ⊢ (U, a1) the left hand side holds as well. ((U2, a2), ff) ∈ z can be seen similarly.

(a). Fix W = { (Ui, ai) | i ∈ I } ∈ Conρ→σ with I := {0, . . . , n−1}. Consider i < j such that ai ⊬ aj. Then Ui ∪ Uj ∉ Conρ. By induction hypothesis (b) there are zij ∈ Gρ→B such that (Ui, tt), (Uj, ff) ∈ zij. Define for every U ∈ Conρ a set IU of indices i ∈ I such that "U behaves as Ui with respect to the zij". More precisely, let

IU := { k ∈ I | ∀i<k(ai ⊬ ak → (U, tt) ∈ zik) ∧ ∀j>k(ak ⊬ aj → (U, ff) ∈ zkj) }.

Notice that k ∈ I_{Uk}. We first show

VU := { ak | k ∈ IU } ∈ Conσ.

It suffices to prove that ai ⊢ aj for all i < j in IU. Since ai ⊢ aj is decidable we can argue indirectly. So let i < j and assume ai ⊬ aj. Then (U, tt), (U, ff) ∈ zij and hence zij would be inconsistent. This contradiction proves ai ⊢ aj and hence VU ∈ Conσ.

By induction hypothesis (a) we can find y_{VU} ∈ Gσ such that VU ⊆ y_{VU}. Let f ⊆ Conρ × Cσ consist of all (U, a) such that

(a ∈ y_{VU} ∧ ∀i,j(i < j → ai ⊬ aj → (U, tt) ∈ zij ∨ (U, ff) ∈ zij)) ∨ VU ⊢ a,

which is a Σ⁰₁-formula. We will show that f ∈ Gρ→σ and W ⊆ f.

For W ⊆ f we show (Ui, ai) ∈ f for all i ∈ I. But this holds, since i ∈ I_{Ui}, hence ai ∈ V_{Ui}.

We now show f ∈ |Cρ→σ|. To prove this we verify the defining properties of approximable maps (cf. 6.1.3). We first show that (U, a) ∈ f and (U, b) ∈ f implies a ⊢ b. Since the conclusion is decidable, we can argue by cases on the (Σ⁰₁-)condition

(6.25) ∀i,j(i < j → ai ⊬ aj → (U, tt) ∈ zij ∨ (U, ff) ∈ zij)

in the definition of f. If it holds, the claim follows from the consistency of y_{VU}. If not, it follows from general properties of information systems. Next we show that (U, b1), . . . , (U, bn) ∈ f and b1, . . . , bn ⊢ b implies (U, b) ∈ f. We argue by ∨⁻. If the left hand side holds for one bk we have (6.25). Since {b1, . . . , bn} ⊆ y_{VU} we have b ∈ y_{VU} by deductive closure. Hence (U, b) ∈ f. If the right hand side holds for all bk, then VU ⊢ b1, . . . , bn ⊢ b and therefore (U, b) ∈ f as well. Finally we show that (U, a) ∈ f and U′ ⊢ U implies (U′, a) ∈ f. We again argue by ∨⁻. Assume (6.25) and a ∈ y_{VU}. Because of U′ ⊢ U, (6.25) holds for U′ as well. We must show a ∈ y_{VU′}. We have IU = IU′, hence VU = VU′, hence y_{VU} = y_{VU′}. Now assume VU ⊢ a. Because of U′ ⊢ U we have IU ⊆ IU′, hence VU ⊆ VU′, hence VU′ ⊢ a and a ∈ y_{VU′}.

This concludes the proof that f is an approximable map. It remains to prove f ∈ Gρ→σ. So let x ∈ Gρ. We must show fx ∈ Gσ, i.e.,

{ a ∈ Cσ | ∃U⊆x((U, a) ∈ f) } ∈ Gσ.

Recall zij ∈ Gρ→B for all i < j with ai ⊬ aj. Hence tt ∈ zij x or ff ∈ zij x for all such i, j, and therefore we have Uij ⊆ x with (Uij, tt) ∈ zij or (Uij, ff) ∈ zij. Hence (6.25) holds with U := ⋃ Uij. Therefore (U, a) ∈ f for all a ∈ y_{VU}, i.e., y_{VU} ⊆ fx and hence fx ∈ Gσ, by the first lemma in 6.5.2.

An easy consequence of the density theorem is a further characterization of the equivalence between total ideals.

Corollary. x ∼ρ y if and only if x ∪ y is consistent, for x, y ∈ Gρ.


Proof. We use induction on ρ, and only treat the case ρ → σ. Let f, g ∈ Gρ→σ. First assume f ∼ρ→σ g. Let (U, a) ∈ f and (V, b) ∈ g and assume U ∪ V ∈ Con. We must show a ⊢ b. By the density theorem there is an x ∈ Gρ with U ∪ V ⊆ x. Hence a ∈ f(x) and b ∈ g(x). By induction hypothesis f(x) ∪ g(x) is consistent, and therefore a ⊢ b. Next assume that f ∪ g is consistent, and let x ∈ Gρ. We must show f(x) ∼σ g(x). By induction hypothesis it suffices to prove that f(x) ∪ g(x) is consistent. Let a ∈ f(x) and b ∈ g(x). Then there are U, V ⊆ x such that (U, a) ∈ f and (V, b) ∈ g. Since U ∪ V ∈ Con and f ∪ g is consistent it follows that a ⊢ b.

As a further application of the density theorem we prove a choice principle for total continuous functionals.

Theorem (Choice principle for total functionals). There is an ideal Γ ∈ |C(ρ→σ→B)→ρ→σ| such that for every F ∈ Gρ→σ→B satisfying

∀x∈Gρ ∃y∈Gσ (F(x, y) = tt)

we have Γ(F) ∈ Gρ→σ and

∀x∈Gρ (F(x, Γ(F, x)) = tt).

Proof. Let V0, V1, V2, . . . be an enumeration of Conσ. By the density theorem we can find yn ∈ Gσ such that Vn ⊆ yn. Define a relation Γ ⊆ Conρ→σ→B × Cρ→σ by

Γ := { (W, U, a) | ∃m(W(U, ym) = tt ∧ a ∈ ym ∧ ∀i<m(W(U, yi) = ff)) }.

We first show that Γ is an approximable map. To prove this we have to verify the clauses of the definition of approximable maps.

(a). (W, U1, a1), (W, U2, a2) ∈ Γ imply (U1, a1) ⊢ (U2, a2). Assume the premise and U := U1 ∪ U2 ∈ Conρ. We show a1 ⊢ a2. The numbers mi in the definition of (W, Ui, ai) ∈ Γ are the same, = m say. Hence a1, a2 ∈ ym, and the claim follows from the consistency of ym.

(b). (W′, U, a) ∈ Γ, W ⊢ W′ and (U, a) ⊢ (V, b) implies (W, V, b) ∈ Γ. Then V ⊢ U and a ⊢ b. The claim follows from the definition of Γ, using the deductive closure of ym. The m from (W′, U, a) ∈ Γ can be used for (W, V, b) ∈ Γ.

We finally show that for all F ∈ Gρ→σ→B satisfying

∀x∈Gρ ∃y∈Gσ (F(x, y) = tt)

and all x ∈ Gρ we have Γ(F, x) ∈ Gσ and F(x, Γ(F, x)) = tt. So let F and x with these properties be given. By assumption there is a y ∈ Gσ such that F(x, y) = tt. Hence by the definition of application there is a Vn ∈ Conσ such that F(x, Vn) = tt. Since Vn ⊆ yn we also have F(x, yn) = tt. Clearly we may assume here that n is minimal with this property, i.e., that

F(x, y0) = ff, . . . , F(x, yn−1) = ff.

We show that Γ(F, x) ⊇ yn; this suffices because every superset of a total ideal is total. Recall that

Γ(F) = { (U, a) ∈ Conρ × Cσ | ∃W⊆F((W, U, a) ∈ Γ) }


and

Γ(F, x) = { a ∈ Cσ | ∃U⊆x((U, a) ∈ Γ(F)) }
= { a ∈ Cσ | ∃U⊆x ∃W⊆F((W, U, a) ∈ Γ) }.

Let a ∈ yn. By the choice of n we get U ⊆ x and W ⊆ F such that

∀i<n(W(U, yi) = ff)   and   W(U, yn) = tt.

Therefore (W, U, a) ∈ Γ and hence a ∈ Γ(F, x).

It is easy to see from the proof that the functional Γ is in fact Σ⁰₁-definable. This "effective" choice principle generalizes the simple fact that whenever we know the truth of ∀x∈N ∃y∈N Rxy with Rxy decidable, then given x we can just search for a y such that Rxy holds; the truth of ∀x∈N ∃y∈N Rxy guarantees termination of the search.
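The simple fact referred to here can be sketched as an unbounded search in Python (our illustration; R is any decidable relation assumed to satisfy ∀x∃y Rxy):

```python
from itertools import count

def choice(R):
    # Return a choice function for R by unbounded search; the assumed
    # truth of "for all x there is a y with R(x, y)" guarantees that
    # the search terminates.
    return lambda x: next(y for y in count() if R(x, y))

f = choice(lambda x, y: y * y >= x)  # a sample decidable R with the property
print(f(10))  # 4, the least y with y*y >= 10
```

The functional Γ of the theorem plays the same role at higher types, searching through the enumeration V0, V1, V2, ... instead of through N.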

6.6. Notes

The development of constructive theories of computable functionals of finite type began with Godel's (1958). There the emphasis was on particular computable functionals, the structural (or primitive) recursive ones. In contrast to what was done later by Kreisel, Kleene, Scott and Ershov, the domains for these functionals were not constructed explicitly, but rather considered as described axiomatically by the theory.

Denotational semantics for PCF-like languages is well-developed, and usually (as in Plotkin's (1977)) done in a domain-theoretic setting. The study of the semantics of non-overlapping higher type recursion equations, called here computation rules, has been initiated in Berger et al. (2003), again in a domain-theoretic setting. Berger (2005b) introduced a "strict" variant of this domain-theoretic semantics, and used it to prove strong normalization of extensions of Godel's T by different versions of bar recursion. Information systems have been conceived by Scott (1982), as an intuitive approach to domains for denotational semantics. Coherent information systems have been introduced by Plotkin (1978, p.210). Taking up Kreisel's (1959) idea of neighborhood systems, Martin-Lof developed in unpublished notes (1983) a domain-theoretic interpretation of his type theory. The intersection type discipline of Barendregt, Coppo, and Dezani-Ciancaglini (1983) can be seen as a different style of presenting the idea of a neighborhood system. The desire to have a more general framework for these ideas has led Martin-Lof, Sambin and others to develop formal topology; cf. Coquand, Sambin, Smith, and Valentini (2003).

The first proof of an adequacy theorem (not under this name) is due to Plotkin (1977, Theorem 3.1); Plotkin's proof is by induction on the types, and uses a computability predicate. A similar result in a type-theoretic setting is in Martin-Lof's notes (1983, Second Theorem). Adequacy theorems have been proved in many contexts, by Abramsky (1991); Amadio and Curien (1998); Barendregt et al. (1983); Martin-Lof (1983). Coquand and Spiwack (2006), building on the work of Martin-Lof (1983) and Berger (2005b), observed that the adequacy result even holds for untyped languages, hence also for dependently typed ones.


The problem of proving strong normalization for extensions of typed λ-calculi by higher order rewrite rules has been studied extensively in the literature: Tait (1971); Girard (1971); Troelstra (1973); Blanqui et al. (1999); Abel and Altenkirch (2000); Berger (2005b). Most of these proofs use impredicative methods (e.g., by reducing the problem to strong normalization of second order propositional logic, called system F by Girard (1971)). Our definition of the strong computability predicates and also the proof are related to Zucker's (1973) proof of strong normalization of his term system for recursion on the first three number or tree classes. However, Zucker uses a combinatory term system and defines strong computability for closed terms only. Following some ideas in an unpublished note of Berger, Benl (in his diploma thesis (1998)) transferred this proof to terms in simply typed λ-calculus, possibly involving free variables. Here it is adapted to the present context. Normalization by evaluation has been introduced in Berger and Schwichtenberg (1991), and extended to constants defined by computation rules in Berger et al. (2003).

In 6.4.3 we have proved that every computable functional Φ is recursive in pcond and ∃. If in addition one requires that Φ is total, then in fact the "parallel" computation involved in pcond and ∃ can be avoided. This has been conjectured by Berger (1993a) and proved by Normann (2000). For a good survey of these and related results we refer the reader to Normann (2006).

The density theorem was first stated by Kreisel (1959). Proofs of various versions of it have been given by Ershov (1972), Berger (1993a), Stoltenberg-Hansen et al. (1994), Schwichtenberg (1996) and Kristiansen and Normann (1997). The proof given here extends these to the case where the base domains are not just the flat domain of natural numbers, but non-flat and possibly parametrized free algebras.


CHAPTER 7

Extracting Computational Content from Proofs

The treatment of our subject, proof and computation, would be incomplete if we could not address the issue of extracting computational content from formalized proofs. The first author has over many years developed a machine-implemented proof assistant, Minlog, within which this can be done and where, unlike in many other similar systems, the extracted content lies within the logic itself. Many non-trivial examples have been developed, illustrating both the breadth and the depth of Minlog, and some of them will be seen in what follows. Here we shall develop the theoretical underpinnings of this system. It will be a Theory of Computable Functionals (TCF), a self-generating system built from scratch and based on minimal logic, whose intended model consists of the computable functions on partial continuous objects, as treated in the previous chapter. The main tool will be (iterated) inductive definitions of predicates and their elimination (or least-fixed-point) axioms. Its computational strength will be roughly that of ID<ω, but it will be more adaptable and computationally applicable.

After developing the system TCF, we shall concentrate on delicate questions to do with finding computational content in both constructive and classical existence proofs. We discuss three "proof interpretations" which achieve this task: realizability for constructive existence proofs and, for classical proofs, the refined A-translation and Godel's Dialectica interpretation. After presenting these concepts and proving the crucial Soundness Theorem for each of them, we address the question of how to implement such proof interpretations. However we do not give a description of Minlog itself, but prefer to present the methods and their implementation by means of worked examples. For references to the Minlog system see Schwichtenberg (1993; 2006b) and http://www.minlog-system.de.

7.1. Theory of Computable Functionals

7.1.1. Brouwer-Heyting-Kolmogorov and Gödel. The Brouwer-Heyting-Kolmogorov interpretation (BHK-interpretation for short) of intuitionistic (and minimal) logic explains what it means to prove a logically compound statement in terms of what it means to prove its components; the explanations use the notions of construction and constructive proof as unexplained primitive notions. For prime formulas the notion of proof is supposed to be given. The clauses of the BHK-interpretation are:

(i) p proves A ∧ B if and only if p is a pair 〈p0, p1〉 and p0 proves A, p1 proves B;
(ii) p proves A → B if and only if p is a construction transforming any proof q of A into a proof p(q) of B;
(iii) ⊥ is a proposition without proof;
(iv) p proves ∀x∈D A(x) if and only if p is a construction such that for all d ∈ D, p(d) proves A(d);
(v) p proves ∃x∈D A(x) if and only if p is of the form 〈d, q〉 with d an element of D, and q a proof of A(d).

The problem with the BHK-interpretation clearly is its reliance on the unexplained notions of construction and constructive proof. Gödel was concerned with this problem for more than 30 years. In 1941 he gave a lecture at Yale University with the title "In what sense is intuitionistic logic constructive?". According to Kreisel, Gödel "wanted to establish that intuitionistic proofs of existential theorems provide explicit realizers" (Gödel, 1990, p. 219). Gödel published his "Dialectica interpretation" in (1958), and revised this work over and over again; its state in 1972 has been published in (Gödel, 1990). Troelstra, in his introductory note to the latter two papers, writes in (Gödel, 1990, pp. 220–221):

  Gödel argues that, since the finitistic methods considered are not sufficient to carry out Hilbert's program, one has to admit at least some abstract notions in a consistency proof; ... However, Gödel did not want to go as far as admitting Heyting's abstract notion of constructive proof; hence he tried to replace the notion of constructive proof by something more definite, less abstract (that is, more nearly finitistic), his principal candidate being a notion of "computable functional of finite type" which is to be accepted as sufficiently well understood to justify the axioms and rules of his system T, an essentially logic-free theory of functionals of finite type.

We intend to utilize the notion of a computable functional of finite type as an ideal in an information system, as explained in the previous chapter. However, Gödel noted in (1990) that his proof interpretation is largely independent of a precise definition of computable functional; one only needs to know that certain basic functionals are computable (including primitive recursion operators in finite types), and that they are closed under composition. Building on Gödel (1958), we assign to every formula A a new one ∃x A1(x) with A1(x) ∃-free. Then from a derivation of A we want to extract a "realizing term" r such that A1(r). Of course its meaning should in some sense be related to the meaning of the original formula A. However, Gödel explicitly states in (1958, p. 286) that his Dialectica interpretation is not the one intended by the BHK-interpretation.
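The BHK clauses can be read directly as a specification of proof objects as programs. The following Python sketch is our own illustration, not part of the book's formal apparatus: proofs of conjunctions are pairs, proofs of implications are functions, and proofs of existential statements are witness-proof pairs; the string used below stands in for an unspecified prime-formula proof.

```python
# Illustrative sketch (not the book's formalism): BHK proofs as programs.

def pair_proof(p0, p1):
    """A proof of A ∧ B: the pair 〈p0, p1〉 of clause (i)."""
    return (p0, p1)

def imp_proof(construction):
    """A proof of A → B: a construction on proofs of A, clause (ii)."""
    return construction

# Example: a proof of A ∧ B → B ∧ A, uniformly in A and B.
swap = imp_proof(lambda p: (p[1], p[0]))

# Example for clause (v): a proof of ∃n Even(n) is 〈witness, proof〉.
exists_even = (4, "evidence that 4 is even")  # the string is a placeholder
```

Applying `swap` to a pair of proofs exchanges its components, exactly as clause (ii) composed with clause (i) prescribes.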

7.1.2. Formulas and predicates. When we want to make propositions about computable functionals and their domains of partial continuous functionals, it is perfectly natural to take, as initial propositions, ones formed inductively or coinductively. However, for simplicity we postpone the treatment of coinductive definitions to 7.1.7 and deal with inductive definitions only until then. For example, in the algebra N we can inductively define totality by the clauses

T0, ∀n(Tn→ T (Sn)).


Its least-fixed-point scheme is

∀n(Tn→ A(0)→ ∀n(Tn→ A(n)→ A(Sn))→ A(n)).

It expresses that every "competitor" {n | A(n)} satisfying the same clauses contains T. This is the usual induction schema for natural numbers, which clearly only holds for "total" numbers (i.e., total ideals in the information system for N). Notice that we have used a "strengthened" form of the "step formula", namely ∀n(Tn → A(n) → A(Sn)) rather than ∀n(A(n) → A(Sn)). In applications of the least-fixed-point axiom this simplifies the proof of the "induction step", since we have the additional hypothesis T(n) available. Totality for an arbitrary algebra can be defined similarly. Consider for example the non-finitary algebra O (cf. 6.1.4), with constructors 0, successor S of type O → O and supremum Sup of type (N → O) → O. Its clauses are

TO0,  ∀x(TOx → TO(Sx)),  ∀f(∀n∈T TO(fn) → TO(Sup(f))),

and its least-fixed-point scheme is

∀x(TOx → A(0)
  → ∀x(TOx → A(x) → A(Sx))
  → ∀f(∀n∈T(TO(fn) ∧ A(fn)) → A(Sup(f)))
  → A(x)).
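Read computationally, these least-fixed-point schemes are structural recursion over the way an ideal was generated. The following Python sketch is our own illustration: numerals and Brouwer ordinals as tagged tuples, with Sup carrying a genuine function from numbers to ordinals, and the elimination axioms rendered as recursors.

```python
# Sketch (our own): least-fixed-point schemes as structural recursion.
ZERO = ('0',)
def S(n): return ('S', n)

def ind(n, base, step):
    """Elimination for T_N; step gets the predecessor and the prior value,
    matching the strengthened step formula ∀n(Tn → A(n) → A(Sn))."""
    if n[0] == '0':
        return base
    return step(n[1], ind(n[1], base, step))

# Brouwer ordinals O: constructors 0, S : O → O, Sup : (N → O) → O.
def Sup(f): return ('Sup', f)

def oelim(x, base, step, lim):
    """Elimination for T_O; lim receives n ↦ recursive value at f(n)."""
    if x[0] == '0':
        return base
    if x[0] == 'S':
        return step(oelim(x[1], base, step, lim))
    return lim(lambda n: oelim(x[1](n), base, step, lim))

def embed(k):
    """The numeral k as a finite ordinal S(...S(0)...)."""
    o = ZERO
    for _ in range(k):
        o = ('S', o)
    return o

omega = Sup(embed)  # the first infinite Brouwer ordinal
```

For instance, `ind` over S(S(S(0))) with base 0 and step +1 recovers 3, and `oelim` on `omega` with lim g := g(5) evaluates the fifth branch.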

Generally, an inductively defined predicate I is given by k clauses, which are of the form

∀~x( ~Ai → (∀~yiν( ~Biν → I~siν))ν<ni → I~ti) (i < k).

Our formulas will be defined by the operations of implication A → B and universal quantification ∀xρ A from inductively defined predicates µX ~K, where X is a "predicate variable", and the Ki are "clauses". Every predicate has an arity, which is a possibly empty list of types.

Definition (Predicates and formulas). Let X, ~Y be distinct predicate variables; the Yl are called predicate parameters. We inductively define formula forms A, B, C, D ∈ F(~Y), predicate forms P, Q, I, J ∈ Preds(~Y) and clause forms K ∈ ClX(~Y); all these are called strictly positive in ~Y. In case ~Y is empty we abbreviate F(~Y) by F and call its elements formulas; similarly for the other notions. (However, for brevity we often say "formula" etc. when it is clear from the context that parameters may occur.)

Yl~r ∈ F(~Y);
if A ∈ F and B ∈ F(~Y), then A → B ∈ F(~Y);
if A ∈ F(~Y), then ∀xA ∈ F(~Y);
if C ∈ F(~Y), then {~x | C} ∈ Preds(~Y);
if P ∈ Preds(~Y), then P~r ∈ F(~Y);
if K0, . . . , Kk−1 ∈ ClX(~Y) (k ≥ 1), then µX(K0, . . . , Kk−1) ∈ Preds(~Y);
if ~A ∈ F(~Y) and ~B0, . . . , ~Bn−1 ∈ F (n ≥ 0), then
  ∀~x(~A → (∀~yν(~Bν → X~sν))ν<n → X~t) ∈ ClX(~Y).


Here ~A → B means A0 → · · · → An−1 → B, associated to the right. For a clause ∀~x(~A → (∀~yν(~Bν → X~sν))ν<n → X~t) ∈ ClX(~Y) we call ~A parameter premises and ∀~yν(~Bν → X~sν) recursive premises. We require that in µX(K0, . . . , Kk−1) the clause K0 is "nullary", without recursive premises. The terms ~r are those introduced in section 6.2, i.e., typed terms built from constants by abstraction and application, and (importantly) those with a common reduct are identified.

A predicate of the form {~x | C} is called a comprehension term. We identify {~x | C(~x)}~r with C(~r). The letter I will be used for predicates of the form µX(K0, . . . , Kk−1); they are called inductively defined predicates.

Remark (Substitution for predicate parameters). Let A ∈ F(~Y); we write A(~Y) for A to indicate its dependence on the predicate parameters ~Y. Similarly we write I(~Y) for I if I ∈ Preds(~Y). We can substitute predicates ~P for ~Y, to obtain A(~P) and I(~P), respectively.

An inductively defined predicate is finitary if its clauses have recursive premises of the form X~s only (so the ~yν and ~Bν in the general definition are all empty).

Definition (Theory of Computable Functionals TCF). TCF is the system in minimal logic for → and ∀, whose formulas are those in F above, and whose axioms are the following. For each inductively defined predicate, there are "closure" or introduction axioms, together with a "least-fixed-point" or elimination axiom. In more detail, consider an inductively defined predicate I := µX(K0, . . . , Kk−1). For each of the k clauses we have an introduction axiom, as follows. Let the i-th clause for I be

Ki(X) := ∀~x( ~A→ (∀~yν ( ~Bν → X~sν))ν<n → X~t ).

Then the corresponding introduction axiom is Ki(I), that is,

(7.1) ∀~x( ~A→ (∀~yν ( ~Bν → I~sν))ν<n → I~t ).

The elimination axiom is

(7.2) ∀~x(I~x→ (Ki(I, P ))i<k → P~x ),

where

Ki(I, P) := ∀~x(~A → (∀~yν(~Bν → I~sν))ν<n →
  (∀~yν(~Bν → P~sν))ν<n → P~t).

We label each introduction axiom Ki(I) by I+i and the elimination axiom by I−.

7.1.3. Equalities. A word of warning is in order here: we need to distinguish four separate, but closely related equalities.

(i) Firstly, defined function constants D are introduced by computation rules, written l = r, but intended as left-to-right rewrites.
(ii) Secondly, we have Leibniz equality Eq, inductively defined below.
(iii) Thirdly, pointwise equality between partial continuous functionals will be defined inductively as well.
(iv) Fourthly, if l and r have a finitary algebra as their type, l = r can be read as a boolean term, where = is the decidable equality defined in 6.2.4 as a boolean-valued binary function.

Leibniz equality. We define Leibniz equality by

Eq(ρ) := µX(∀xX(xρ, xρ)).

The introduction axiom is

∀x Eq(xρ, xρ)

and the elimination axiom

∀x,y(Eq(x, y)→ ∀xPxx→ Pxy),

where Eq(x, y) abbreviates Eq(ρ)(xρ, yρ).

Lemma (Compatibility of Eq). ∀x,y(Eq(x, y)→ A(x)→ A(y)).

Proof. Use the elimination axiom with Pxy := (A(x)→ A(y)).

Using compatibility of Eq one easily proves symmetry and transitivity. Define falsity by F := Eq(ff, tt). Then we have

Theorem (Ex-Falso-Quodlibet). F→ A.

Proof. We first show that F → Eq(xρ, yρ). To see this, one first obtains Eq([if ff then x else y], [if ff then x else y]) from the introduction axiom, since [if ff then x else y] is an allowed term, and then from Eq(ff, tt) one gets Eq([if tt then x else y], [if ff then x else y]) by compatibility. Hence Eq(xρ, yρ).

The claim can now be proved by induction on A ∈ F. Case I~s. Let Ki be the nullary clause, with final conclusion I~t. By induction hypothesis from F we can derive all parameter premises. Hence I~t. From F we also obtain Eq(si, ti), by the remark above. Hence I~s by compatibility. The cases A → B and ∀xA are obvious.

A crucial use of the equality predicate Eq is that it allows us to lift a boolean term rB to a formula, using atom(rB) := Eq(rB, tt). This opens up a convenient way to deal with equality on finitary algebras. The computation rules ensure that for instance the boolean term Sr =N Ss or, more precisely, =N(Sr, Ss), is identified with r =N s. We can now turn this boolean term into the formula Eq(Sr =N Ss, tt), which again is abbreviated by Sr =N Ss, but this time with the understanding that it is a formula. Then (importantly) the two formulas Sr =N Ss and r =N s are identified because the latter is a reduct of the first. Consequently there is no need to prove the implication Sr =N Ss → r =N s explicitly.
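The identification of Sr =N Ss with r =N s rests only on the computation rules for the boolean-valued function =N. A hypothetical Python rendering of such rules, with numerals as tagged tuples (our own sketch, not Minlog's term rewriting):

```python
# Sketch: computation rules for the decidable equality =_N on numerals.
ZERO = ('0',)
def S(n): return ('S', n)

def eqN(r, s):
    # Rules: =N(0,0) = tt;  =N(Sr,0) = =N(0,Ss) = ff;  =N(Sr,Ss) = =N(r,s).
    if r[0] == '0' and s[0] == '0':
        return True
    if r[0] == 'S' and s[0] == 'S':
        return eqN(r[1], s[1])  # the rule identifying Sr =N Ss with r =N s
    return False
```

The last recursive call is precisely the rewrite step by which Sr =N Ss reduces to r =N s, so no implication between the two needs to be proved.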

Pointwise equality =ρ. For every constructor Ci of an algebra ι we have an introduction axiom

∀~y,~z(~yP =~ρ ~zP → (∀~xν(yRm+ν ~xν =ι zRm+ν ~xν))ν<n → Ci~y =ι Ci~z).

For an arrow type ρ → σ the introduction axiom is explicit, in the sense that it has no recursive premise:

∀x1,x2(∀y(x1y =σ x2y)→ x1 =ρ→σ x2).


For example, =N is inductively defined by

0 =N 0,

∀n1,n2(n1 =N n2 → Sn1 =N Sn2),

and the elimination axiom is

∀n1,n2(n1 =N n2 → P00 →
  ∀n1,n2(n1 =N n2 → Pn1n2 → P(Sn1, Sn2)) →
  Pn1n2).

An example with the non-finitary algebra T1 (cf. 6.1.4) is:

0 =T1 0,

∀f1,f2(∀n(f1n =T1 f2n)→ Sup(f1) =T1 Sup(f2)),

and the elimination axiom is

∀x1,x2(x1 =T1 x2 → P00 →
  ∀f1,f2(∀n(f1n =T1 f2n) → ∀nP(f1n, f2n) → P(Sup(f1), Sup(f2))) →
  Px1x2).

The main purpose of pointwise equality is that it allows us to formulate the extensionality axiom: we express the extensionality of our intended model by stipulating that pointwise equality is equivalent to Leibniz equality.

Axiom (Extensionality). ∀x1,x2(x1 =ρ x2 ↔ Eq(x1, x2)).

We write E-LID when the extensionality axioms are present.

7.1.4. Existence, conjunction and disjunction. Important examples of inductively defined predicates are the existential quantifier and conjunction. Similar definitions have been considered by Martin-Löf (1971).

Existential quantifier.

Ex(Y ) := µX(∀x(Y xρ → X)).

The introduction axiom is

∀x(A→ ∃xA),

where ∃xA abbreviates Ex({xρ | A}), and the elimination axiom is

∃xA→ ∀x(A→ P )→ P.

Conjunction. We define

And(Y, Z) := µX(Y → Z → X).

The introduction axiom is

A→ B → A ∧B

where A ∧ B abbreviates And({ | A}, { | B}), and the elimination axiom is

A ∧B → (A→ B → P )→ P.


Disjunction. We define

Or(Y, Z) := µX(Y → X,Z → X).

The introduction axioms are

A → A ∨ B,  B → A ∨ B,

where A ∨ B abbreviates Or({ | A}, { | B}), and the elimination axiom is

A ∨B → (A→ P )→ (B → P )→ P.

Remark. Alternatively, disjunction A ∨ B could be defined by the formula ∃p((p → A) ∧ (¬p → B)) with p a boolean variable. However, for an analysis of the computational content of coinductively defined predicates it is better to define it inductively.
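Computationally, these inductive definitions behave like ordinary datatypes, and their elimination axioms like case analysis. A minimal Python sketch (our own rendering; names such as `ex_elim` are ours):

```python
# Sketch: ∃, ∧, ∨ as tagged data; elimination axioms as case analysis.
def ex_intro(witness, proof):  return ('Ex', witness, proof)
def and_intro(p, q):           return ('And', p, q)
def or_introl(p):              return ('Inl', p)
def or_intror(q):              return ('Inr', q)

def ex_elim(e, f):
    """∃xA → ∀x(A → P) → P: apply f to the witness and its proof."""
    return f(e[1], e[2])

def or_elim(o, f, g):
    """A ∨ B → (A → P) → (B → P) → P: case distinction on the tag."""
    return f(o[1]) if o[0] == 'Inl' else g(o[1])
```

For instance, `ex_elim` applied to the introduction of a witness simply hands that witness on, which is the computational heart of realizability for ∃.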

7.1.5. Further examples. We give some more familiar examples of inductively defined predicates.

The even numbers. The introduction axioms are

Even(0), ∀n(Even(n)→ Even(S(Sn)))

and the elimination axiom is

∀n(Even(n)→ P0→ ∀n(Even(n)→ Pn→ P (S(Sn)))→ Pn).

Transitive closure. Let ≺ be a binary relation. The transitive closure of ≺ is inductively defined as follows. The introduction axioms are

∀x,y(x ≺ y → TC(x, y)),

∀x,y,z(x ≺ y → TC(y, z)→ TC(x, z))

and the elimination axiom is

∀x,y(TC(x, y) → ∀x,y(x ≺ y → Pxy) →
  ∀x,y,z(x ≺ y → TC(y, z) → Pyz → Pxz) →
  Pxy).

Accessible part. Let ≺ again be a binary relation. The accessible part of ≺ is inductively defined as follows. The introduction axioms are

∀x(F→ Acc(x)),

∀x(∀y≺xAcc(y)→ Acc(x)),

and the elimination axiom is

∀x(Acc(x) → ∀x(F → Px) →
  ∀x(∀y≺x Acc(y) → ∀y≺x Py → Px) →
  Px).
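A derivation built from such clauses is a finite object, and the elimination axiom recurses over it. As an illustration for the Even clauses above (a sketch of ours, not the book's notation), the eliminator can compute the witness n/2 from a derivation of Even(n):

```python
# Sketch: derivations of Even(n) as data; elimination = rule induction.
EVEN0 = ('Even0',)                   # the axiom Even(0)
def even2(d): return ('Even2', d)    # the clause Even(n) → Even(S(Sn))

def even_elim(d, base, step):
    """Rule induction: P0 and ∀n(Pn → P(S(Sn))) yield P along d."""
    if d[0] == 'Even0':
        return base
    return step(even_elim(d[1], base, step))

d6 = even2(even2(even2(EVEN0)))      # a derivation of Even(6)
half = even_elim(d6, 0, lambda h: h + 1)   # extract the witness 6/2
```

This anticipates the idea of 7.2 that the computational content of a proof of Even(n) is essentially the number n/2.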

7.1.6. Totality and induction. In 6.1.7 we have defined what the total and structure-total ideals of a finitary algebra are. We now inductively define corresponding predicates Gι and Tι; this inductive definition works for arbitrary algebras ι. The least-fixed-point axiom for Tι will provide us with the induction axiom.

Let us first look at some examples. We already have stated the clauses defining totality for the algebra N:

TN0, ∀n(TNn→ TN(Sn)).


The least-fixed-point axiom is

∀n(TNn→ P0→ ∀n(TNn→ Pn→ P (Sn))→ Pn).

Clearly the partial continuous functionals with TN interpreted as the total ideals for N provide a model of TCF extended by these axioms.

For the algebra D of derivations totality is inductively defined by the clauses

TD0D, ∀x,y(TDx→ TDy → TD(CD→D→Dxy)),

with least-fixed-point axiom

∀x(TDx → P0D →
  ∀x,y(TDx → TDy → Px → Py → P(CD→D→D xy)) →
  Px).

Again, the partial continuous functionals with TD interpreted as the total ideals for D (i.e., the finite derivations) provide a model.

As an example of a finitary algebra with parameters consider L(ρ). The clauses defining its (full) totality predicate GL(ρ) are

GL(ρ)(nil), ∀x,l(Gρx→ GL(ρ)l→ GL(ρ)(x :: l)),

where Gρ is assumed to be defined already; x :: l is shorthand for cons(x, l). In contrast, the clauses for the predicate T expressing structure-totality are

TL(ρ)(nil), ∀x,l(TL(ρ)l→ TL(ρ)(x :: l)),

with no assumptions on x.

Generally, for arbitrary types ρ we inductively define predicates Gρ of totality and Tρ of structure-totality, by induction on ρ. This definition is relative to an assignment of predicate variables Gα, Tα of arity (α) to type variables α.

Definition. In case ι ∈ Alg(~α) we have ι = µξ(κ0, . . . , κk−1), with κi = ~ρ → (~σν → ξ)ν<n → ξ. Then Gι := µX(K0, . . . , Kk−1), with

Ki := ∀~x((∀~yν (G~σν~yν → X(xν~yν)))ν<n → X(Ci~x )).

Similarly, Tι := µX(K′0, . . . , K′k−1), with

K ′i := ∀~x((∀~yν (T~σν~yν → X(xRν ~yν)))ν<nR → X(Ci~x )).

For arrow types the definition is explicit, that is, the clauses have no recursive premises but parameter premises only.

Gρ→σ := µX∀f (∀x(Gρx→ Gσ(fx))→ Xf),

Tρ→σ := µX∀f (∀x(Tρx→ Tσ(fx))→ Xf).

This concludes the definition.

In the case of an algebra ι the introduction axioms for Tι are

(∀~yν (T~σν~yν → Tι(xRν ~yν)))ν<n → Tι(Ci~x )

and the elimination axiom is

Tιx→ K0(Tι, P )→ · · · → Kk−1(Tι, P )→ Px,


where

Ki(Tι, P) := ∀~x((∀~yν(T~σν ~yν → Tι(xRν ~yν)))ν<n →
  (∀~yν(T~σν ~yν → P(xRν ~yν)))ν<n → P(Ci~x)).

In the arrow type case, the introduction and elimination axioms are

∀x(Tρx → Tσ(fx)) → Tρ→σ f,
Tρ→σ f → ∀x(Tρx → Tσ(fx)).

(The "official" axiom Tρ→σ f → (∀x(Tρx → Tσ(fx)) → P) → P is clearly equivalent to the one stated.) Abbreviating ∀x(Tx → A) by ∀x∈T A allows a shorter formulation of these axioms:

(∀~yν∈T~σνTι(xRν ~yν))ν<n → Tι(Ci~x ),

∀x∈Tι(K0(Tι, P )→ · · · → Kk−1(Tι, P )→ Px),

∀x∈TρTσ(fx)→ Tρ→σf,

∀f∈Tρ→σ,x∈Tρ Tσ(fx),

where

Ki(Tι, P) := ∀~x∈T~ρ((∀~yν∈T~σν P(xRν ~yν))ν<n → P(Ci~x)).

Hence the elimination axiom T−ι is the induction axiom, and the Ki(Tι, P) are its step formulas. We write Indx,Pι or Indx,P for T−ι, and omit the indices x, P when they are clear from the context. Examples are

Indp,P : ∀p∈T (P tt→ P ff → PpB),

Indn,P : ∀n∈T (P0→ ∀n∈T (Pn→ P (Sn))→ PnN),

Indl,P : ∀l∈T (P (nil)→ ∀x∀l∈T (Pl→ P (x :: l))→ PlL(ρ)),

Indz,P : ∀z∈T (∀x,y∈TP 〈xρ, yσ〉 → Pzρ×σ),

where x :: l is shorthand for cons(x, l) and 〈x, y〉 for ×+xy.

All this can be done similarly for the Gρ. A difference occurs for algebras with parameters only: for example, list induction then is

∀l∈G(P (nil)→ ∀x,l∈G(Pl→ P (x :: l))→ PlL(ρ)).

Parallel to general recursion, one can also consider general induction, which allows recurrence to all points "strictly below" the present one. For applications it is best to make the necessary comparisons w.r.t. a "measure function" µ. Then it suffices to use an initial segment of the ordinals instead of a well-founded set. For simplicity we here restrict ourselves to the segment given by ω, so the ordering we refer to is just the standard <-relation on the natural numbers. The principle of general induction then is

(7.3) ∀µ,x∈T(Progµx Px → Px)

where Progµx Px expresses "progressiveness" w.r.t. the measure function µ and the ordering <:

Progµx Px := ∀x∈T(∀y∈T;µy<µx Py → Px).

It is easy to see that in our special case of the <-relation we can prove (7.3) from structural induction. However, it will be convenient to use general induction as a primitive axiom.
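Read computationally, (7.3) licenses definitions by recursion in which every recursive call is at an argument of strictly smaller µ-value. A Python sketch of this recursion scheme (our own illustration; the assert enforces the progressiveness side condition dynamically):

```python
# Sketch: general induction w.r.t. a measure µ as a recursion scheme.
def gen_rec(mu, prog):
    """prog(x, rec) may call rec only at y with mu(y) < mu(x)."""
    def rec(x):
        def guarded(y):
            assert mu(y) < mu(x), "recurrence must go strictly below"
            return rec(y)
        return prog(x, guarded)
    return rec

# Example: binary length of n, by recurrence at n // 2 (measure: n itself).
binlen = gen_rec(lambda n: n,
                 lambda n, rec: 0 if n == 0 else 1 + rec(n // 2))
```

The recurrence at n // 2 is not a structural predecessor, which is exactly the extra convenience general induction buys over structural induction.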


7.1.7. Coinductive definitions. We now extend TCF by allowing coinductive definitions as well as inductive ones. For instance, in the algebra N we can coinductively define cototality by the clause

coTNn→ Eq(n, 0) ∨ ∃m(Eq(n,Sm) ∧ coTNm).

Its greatest-fixed-point axiom is

Pn→ ∀n(Pn→ Eq(n, 0) ∨ ∃m(Eq(n,Sm) ∧ Pm))→ coTNn.

It expresses that every "competitor" P satisfying the same clause is a subset of coTN. The partial continuous functionals with coTN interpreted as the cototal ideals for N provide a model of TCF extended by these axioms. The greatest-fixed-point axiom is called the coinduction axiom for natural numbers.

Similarly, for the algebra D of derivations cototality is coinductively defined by the clause

coTDx→ Eq(x, 0D) ∨ ∃y,z(Eq(x,CD→D→Dyz) ∧ coTDy ∧ coTDz).

Its greatest-fixed-point axiom is

Px→ ∀x(Px→ Eq(x, 0D)∨ ∃y,z(Eq(x,CD→D→Dyz)∧Py ∧Pz))→ coTDx.

The partial continuous functionals with coTD interpreted as the cototal ideals for D (i.e., the finite or infinite locally correct derivations) provide a model.

For the algebra I of standard rational intervals cototality is defined by

coTI x → Eq(x, I) ∨ ∃y(Eq(x, C−1 y) ∧ coTI y) ∨
  ∃y(Eq(x, C0 y) ∧ coTI y) ∨
  ∃y(Eq(x, C1 y) ∧ coTI y).

A model is provided by the set of all finite or infinite streams of signed digits from {−1, 0, 1}, i.e., the well-known (non-unique) stream representation of real numbers.
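The cototal ideals of I are exactly such signed-digit streams, and the greatest-fixed-point axiom, read computationally, is coiteration: from any "competitor" one unfolds a stream. A Python sketch using generators (our own illustration; the digit-selection thresholds are one possible choice):

```python
# Sketch: cototal ideals of I as (possibly infinite) signed-digit streams.
# Invariant: if x ∈ [-1,1] gets digit d, then x' = 2x - d is again in
# [-1,1] and x = (d + x')/2, so the stream d0 d1 d2 ... determines x.
from itertools import islice

def digits(x):
    """Coiteration: unfold a signed-digit stream for x in [-1, 1]."""
    while True:
        if x <= -0.25:
            d = -1
        elif x < 0.25:
            d = 0
        else:
            d = 1
        yield d
        x = 2 * x - d            # the "tail" real, again in [-1, 1]

def approx(ds):
    """Rebuild a rational approximation from finitely many digits."""
    v = 0.0
    for d in reversed(list(ds)):
        v = (d + v) / 2
    return v
```

Taking the first k digits yields an approximation within 2^−k, reflecting the non-unique but always convergent character of this representation.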

Generally, a coinductively defined predicate J is given by exactly one clause, which is of the form

∀~x(J~x → ∨∨i<k ∃~yi(∧∧ ~Ai ∧ ∧∧ν<ni ∀~yiν(~Biν → J~siν))).

More precisely, we must extend the definition of formulas and predicates in 7.1.2 by (co)clause forms K ∈ coClX(~Y), and need the additional rules: if K ∈ coClX(~Y), then νXK ∈ Preds(~Y); and if ~Ai ∈ F(~Y) and ~Bi0, . . . , ~Bi,ni−1 ∈ F for i < k, then

∀~x(X~x → ∨∨i<k ∃~yi(∧∧ ~Ai ∧ ∧∧ν<ni ∀~yiν(~Biν → X~siν))) ∈ coClX(~Y),

where we require k > 0 and n0 = 0. The letter J will be used for predicates of the form νXK, called coinductively defined. For each coinductively defined predicate, there is a closure axiom

J− : ∀~x(J~x → ∨∨i<k ∃~yi(∧∧ ~Ai ∧ ∧∧ν<ni ∀~yiν(~Biν → J~siν)))


and a greatest-fixed-point axiom

J+ : ∀~x(P~x → ∀~x(P~x → ∨∨i<k ∃~yi(∧∧ ~Ai ∧ ∧∧ν<ni ∀~yiν(~Biν → P~siν))) → J~x).

Notice that the proof of the Ex-Falso-Quodlibet theorem in 7.1.3 can be easily extended by a case J~s with J coinductively defined: use the greatest-fixed-point axiom for J with P~x := F. Since k > 0 and n0 = 0 it suffices to prove F → ∃~y0 ∧∧ ~A0. But this follows from the induction hypothesis.

A coinductively defined predicate is finitary if its clause has the form ∀~x(J~x → ∨∨i<k ∃~yi(∧∧ ~Ai ∧ J~si)) (so the ~yiν and ~Biν in the general definition are all empty). We will often restrict to finitary coinductively defined predicates only.

The most important coinductively defined predicates for us will be those of cototality and structure-cototality; we have seen some examples above. Generally, for a finitary algebra ι cototality and structure-cototality are coinductively defined by

coGι x → ∨∨i<k ∃~yi(Eq(x, Ci~yi) ∧ coGι~yi),
coTι x → ∨∨i<k ∃~yPi,~yRi(Eq(x, Ci~yPi ~yRi) ∧ coTι~yRi).

Finally we consider simultaneous inductive/coinductive definitions of predicates. An example where this comes up is the formalization of an abstract theory of (uniformly) continuous real functions f : I → I where I := [−1, 1] (cf. 6.1.7); "continuous" is to mean "uniformly continuous" here. Let Cf abbreviate the formula expressing that f is a continuous real function, and Ip,k := [p − 2−k, p + 2−k]. Assume we can prove in the abstract theory

(7.4) Cf → ∀k∃l Bl,k f,  with Bl,k f := ∀p∃q(f[Ip,l] ⊆ Iq,k).

The converse is true as well: every real function f satisfying ∀k∃l Bl,k f is (uniformly) continuous.

For d ∈ {−1, 0, 1} let Id be defined by I−1 := [−1, 0], I0 := [−1/2, 1/2] and I1 := [0, 1]. We define continuous real functions ind, outd such that ind[I] = Id and outd[Id] = I by

ind(x) := (d + x)/2,  outd(x) := 2x − d.
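These two affine maps can be checked directly; a small Python sketch of ours:

```python
# Sketch: the averaging maps in_d and out_d on I = [-1, 1].
def in_d(d):
    return lambda x: (d + x) / 2      # maps I onto the subinterval I_d

def out_d(d):
    return lambda x: 2 * x - d        # maps I_d back onto I

# out_d inverts in_d (exactly, since halves are exact binary floats):
for d in (-1, 0, 1):
    for x in (-1.0, -0.5, 0.0, 0.5, 1.0):
        assert out_d(d)(in_d(d)(x)) == x
```

For instance, in_{−1} sends the right endpoint 1 of I to 0, the right endpoint of I_{−1} = [−1, 0].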

Clearly both functions are inverse to each other.

We give an inductive definition of a predicate IY depending on a parameter Y by

(7.5) f[I] ⊆ Id → Y(outd ◦ f) → IY f  (d ∈ {−1, 0, 1}),
(7.6) IY(f ◦ in−1) → IY(f ◦ in0) → IY(f ◦ in1) → IY f.

The corresponding least-fixed-point axiom is

(7.7) IY f → (∀f(f[I] ⊆ Id → Y(outd ◦ f) → Pf))d∈{−1,0,1} →
  ∀f((IY(f ◦ ind))d∈{−1,0,1} → (P(f ◦ ind))d∈{−1,0,1} → Pf) →
  Pf.


Using IY we give a "simultaneous inductive/coinductive" definition of a predicate J by

(7.8) Jf → Eq(f, id) ∨ IJ f.

The corresponding greatest-fixed-point axiom is

(7.9) Qf → ∀f(Qf → Eq(f, id) ∨ IQ f) → Jf.

We now restrict attention to continuous functions f on the interval I satisfying f[I] ⊆ I. Define

B′l,kf := ∀p∃q(f [Ip,l ∩ I] ⊆ Iq,k).

Lemma. (a) B′l,k(outd ◦ f) → B′l,k+1 f.
(b) Assume B′ld,k+1(f ◦ ind) for all d ∈ {−1, 0, 1}. Then B′l,k+1 f with l := 1 + maxd∈{−1,0,1} ld.

Proof. (a) Let p and x be given such that

max{−1, p − 2−l} ≤ x ≤ min{p + 2−l, 1}.

We need q′ such that

q′ − 2−(k+1) ≤ f(x) ≤ q′ + 2−(k+1).

By assumption we have q such that

q − 2−k ≤ 2f(x) − d ≤ q + 2−k.

Let q′ := (q + d)/2.

(b) Let p and x be given such that

max{−1, p − 2−l} ≤ x ≤ min{p + 2−l, 1}.

Then

max{−2, 2p − 2−max ld} ≤ 2x ≤ min{2p + 2−max ld, 2}.

By choosing d ∈ {−1, 0, 1} appropriately we can ensure that −1 ≤ 2x − d ≤ 1. Hence

max{−1, 2p − d − 2−ld} ≤ 2x − d ≤ min{2p − d + 2−ld, 1}.

The assumption B′ld,k+1(f ◦ ind) for 2p − d yields q such that

q − 2−(k+1) ≤ f(ind(2x − d)) ≤ q + 2−(k+1).

But ind(2x − d) = x.

Proposition. (a) ∀f(Cf → f[I] ⊆ I → Jf).
(b) ∀f(Jf → f[I] ⊆ I → ∀k∃l B′l,k f).

Proof. (a) Assume Cf. We use (7.9) with Q := {f | Cf ∧ f[I] ⊆ I}. Let f be arbitrary; it suffices to show Qf → IQf. Assume Qf, i.e., Cf and f[I] ⊆ I. By (7.4) we have an l such that Bl,2f. We prove ∀l,f(Bl,2f → Cf → f[I] ⊆ I → IQf) by induction on l. Base, l = 0. B0,2f implies that there is a rational q such that f[I0,0] ⊆ Iq,2. Because of I0,0 = I, f[I] ⊆ I and the fact that there is a d such that Iq,2 ∩ I ⊆ Id we have f[I] ⊆ Id and hence (outd ◦ f)[I] ⊆ I. Then Q(outd ◦ f) since outd ◦ f is continuous. Hence IQf by (7.5). Step. Assume l > 0. Then Bl−1,2(f ◦ ind) because of Bl,2f, and clearly f ◦ ind is continuous and satisfies (f ◦ ind)[I] ⊆ I, for every d. By induction hypothesis IQ(f ◦ ind). Hence IQf by (7.6).

(b) We prove ∀k∀f(Jf → f[I] ⊆ I → ∃l B′l,k f) by induction on k. Base. Because of f[I] ⊆ I clearly B′0,0 f. Step, k ↦ k + 1. Assume Jf and f[I] ⊆ I. Then Eq(f, id) ∨ IJ f by (7.8). In case Eq(f, id) the claim is trivial, since then clearly B′k+1,k+1 f. We prove ∀f(IJ f → f[I] ⊆ I → ∃l B′l,k+1 f) using (7.7), that is, by a side induction on IJ f. Side induction base. Assume f[I] ⊆ Id and J(outd ◦ f). We must show f[I] ⊆ I → ∃l B′l,k+1 f. Because of f[I] ⊆ Id we have (outd ◦ f)[I] ⊆ I. The main induction hypothesis yields an l such that B′l,k(outd ◦ f), hence B′l,k+1 f by the lemma. Side induction step. Assume, as side induction hypothesis, (f ◦ ind)[I] ⊆ I → B′ld,k+1(f ◦ ind) for all d ∈ {−1, 0, 1}. We must show f[I] ⊆ I → ∃l B′l,k+1 f. Assume f[I] ⊆ I. Then clearly (f ◦ ind)[I] ⊆ I. Hence B′ld,k+1(f ◦ ind) for all d ∈ {−1, 0, 1}. By the lemma this implies B′l,k+1 f with l := 1 + maxd∈{−1,0,1} ld.

Our general form of simultaneous inductive/coinductive definitions of predicates is based on an inductively defined IY with a predicate parameter Y; this is needed to formulate the greatest-fixed-point axiom for the simultaneously defined J. More precisely, we coinductively define J by

∀~x(J~x → ∨∨i<k ∃~yi(∧∧ ~Ai ∧ ∧∧ν<ni ∀~yiν(~Biν → IJ~siν))).

Its greatest-fixed-point axiom then is

J+ : ∀~x(P~x → ∀~x(P~x → ∨∨i<k ∃~yi(∧∧ ~Ai ∧ ∧∧ν<ni ∀~yiν(~Biν → IP~siν))) → J~x).

The definition of formulas and predicates in 7.1.2 can easily be adapted, and the proof of the Ex-Falso-Quodlibet theorem in 7.1.3 extended. A simultaneous inductive/coinductive definition is finitary if both parts are.

7.2. Realizability Interpretation

We now come to the crucial step of inserting "computational content" into proofs, which can then be extracted. It consists in "decorating" our connectives → and ∀, or rather allowing "computational" variants →c and ∀c as well as non-computational ones →nc and ∀nc. This distinction (for the universal quantifier) is due to Berger (1993b, 2005a). The logical meaning of the connectives is not changed by the decoration. Since we inductively defined predicates by means of clauses built with → and ∀, we can now introduce computational variants of these predicates. This will give us the possibility to fine-tune the computational content of proofs.

For instance, the introduction and elimination axioms for the inductively defined totality predicate T in the algebra N will be decorated as follows. The clauses are

T0, ∀ncn (Tn→c T (Sn)),

and its elimination axiom is

∀ncn(Tn →c P0 →c ∀ncn(Tn →c Pn →c P(Sn)) →c Pn).


If Tr holds, then this fact must have been derived from the clauses, and hence we have a total ideal in an algebra built in correspondence with the clauses, which in this case is N again. The predicate T can be understood as the least set of witness-argument pairs satisfying the clauses; the witness being a total ideal.

7.2.1. Decorating → and ∀. We adapt the definition in 7.1.2 of predicates and formulas to newly introduced decorated connectives →c, ∀c and →nc, ∀nc. Let → denote either →c or →nc, and similarly ∀ either ∀c or ∀nc. Then the definition in 7.1.2 can be read as it stands.

We also need to adapt our definition of TCF to the decorated connectives →c, →nc and ∀c, ∀nc. The introduction and elimination rules for →c and ∀c remain as before, and also the elimination rules for →nc and ∀nc. However, the introduction rules for →nc and ∀nc must be restricted: the abstracted (assumption or object) variable must be "non-computational", in the following sense. Simultaneously with a derivation M we define the sets CV(M) and CA(M) of computational object and assumption variables of M, as follows. Let MA be a derivation. If A is c.i. then CV(MA) := CA(MA) := ∅. Otherwise

CV(cA) := ∅  (cA an axiom),
CV(uA) := ∅,
CV((λuAMB)A→cB) := CV((λuAMB)A→ncB) := CV(M),
CV((MA→cBNA)B) := CV(M) ∪ CV(N),
CV((MA→ncBNA)B) := CV(M),
CV((λxMA)∀cxA) := CV((λxMA)∀ncx A) := CV(M) \ {x},
CV((M∀cxA(x)r)A(r)) := CV(M) ∪ FV(r),
CV((M∀ncx A(x)r)A(r)) := CV(M),

and similarly

CA(cA) := ∅  (cA an axiom),
CA(uA) := ∅,
CA((λuAMB)A→cB) := CA((λuAMB)A→ncB) := CA(MA) \ {u},
CA((MA→cBNA)B) := CA(M) ∪ CA(N),
CA((MA→ncBNA)B) := CA(M),
CA((λxMA)∀cxA) := CA((λxMA)∀ncx A) := CA(M),
CA((M∀cxA(x)r)A(r)) := CA((M∀ncx A(x)r)A(r)) := CA(M).

The introduction rules for →nc and ∀nc then are

(i) If MB is a derivation and uA ∉ CA(M), then (λuAMB)A→ncB is a derivation.
(ii) If MA is a derivation, x is not free in any formula of a free assumption variable of M, and x ∉ CV(M), then (λxMA)∀ncx A is a derivation.


An alternative way to formulate these rules is simultaneously with the notion of the "extracted term" et(M) of a derivation M. This will be done in 7.2.4.

Formulas can be decorated in many different ways, and it is a natural question to ask when one such decoration A′ is "stronger" than another one A, in the sense that the former computationally implies the latter, i.e., ⊢ A′ →c A. We give a partial answer to this question in the proposition below.

We define a relation A′ ⊒ A (A′ is a computational strengthening of A) between c.r. formulas A′, A inductively. It is reflexive, transitive and satisfies

(A →nc B) ⊒ (A →c B),
(A →c B) ⊒ (A →nc B)  if A is c.i.,
(A → B′) ⊒ (A → B)  if B′ ⊒ B, with → ∈ {→c, →nc},
(A → B) ⊒ (A′ → B)  if A′ ⊒ A, with → ∈ {→c, →nc},
∀ncx A ⊒ ∀cx A,
∀x A′ ⊒ ∀x A  if A′ ⊒ A, with ∀ ∈ {∀c, ∀nc}.

Proposition. If A′ ⊒ A, then ⊢ A′ →c A.

Proof. We show that the relation “` A′ →c A” has the same closureproperties as “A′ w A”. For reflexivity and transitivity this is clear. For therest we give some sample derivations.

A→nc B u : AB (→c)+, u

A→c B

| assumedB′ →c B

A→nc B′ u : AB′

B (→nc)+, uA→nc B

where in the last derivation the final (→nc)+-application is correct since uis not a computational assumption variable in the premise derivation of B.

Similarly, from the premise A→nc B, an assumed derivation of A′ →c A and an assumption u : A′ we obtain A, then B, and finally A′ →nc B by (→nc)+, u, where for the same reason the final (→nc)+-application is correct.

Remark. In 7.2.5 we shall define decorated variants ∃d, ∃l, ∃r, ∧d, ∧l, ∧r of the existential quantifier and of conjunction. For formulas involving these the proposition continues to hold if the definition of computational strengthening is extended by

∃dxA w ∃lxA, ∃rxA,

∃xA′ w ∃xA if A′ w A, with ∃ ∈ {∃d,∃l,∃r},

(A ∧d B) w (A ∧l B), (A ∧r B),

(A′ ∧B) w (A ∧B) if A′ w A, with ∧ ∈ {∧d,∧l,∧r},

(A ∧B′) w (A ∧B) if B′ w B, with ∧ ∈ {∧d,∧l,∧r}.


7.2.2. Decorating inductive definitions. For the introduction and elimination axioms of computationally relevant (c.r.) inductively defined predicates I (which is the default case) we can now use arbitrary formulas; these axioms need to be carefully decorated. In particular, in all clauses the → after recursive premises must be →c. Generally, the introduction axioms (or clauses) are

(7.10) ∀~x( ~A→ (∀~yν ( ~Bν → I~sν))ν<n →c I~t ),

and the elimination (or least-fixed-point) axiom is

(7.11) ∀nc~x (I~x→c (Ki(I, P ))i<k →c P~x ),

where

K(I, P ) := ∀~x( ~A→ (∀~yν ( ~Bν → I~sν))ν<n →c (∀~yν ( ~Bν → P~sν))ν<n →c P~t ).

The decorated form of the general induction schema is

(7.12) ∀cµ∈T∀cx∈T (ProgµxPx→c Px)

with

ProgµxPx := ∀cx∈T (∀cy∈T (µy < µx→nc Py)→c Px).

We have made use here of totality predicates and the abbreviation ∀cx∈T A; both are introduced in 7.2.6 below.

The next thing to do is to view a formula A as a “computational problem”, as done by Kolmogorov (1932). Then what should be the solution to the problem posed by the formula I~r, where I is inductively defined? The obvious idea here is to take a “generation tree”, witnessing how the arguments ~r were put into I. For example, consider the clauses Even(0) and ∀ncn (Even(n)→c Even(S(Sn))). A generation tree for Even(6) should consist of a single branch with nodes Even(0), Even(2), Even(4) and Even(6).

When we want to define this concept of a generation tree in general, it seems natural to let the clauses of I determine the algebra to which such trees belong. Hence we will define ιI to be the type µξ(κ0, . . . , κk−1) generated from constructor types κi := τ(Ki), where Ki is the i-th clause of the inductive definition of I as µX(K0, . . . , Kk−1), and τ(Ki) is the type of the clause Ki, relative to τ(X~r ) := ξ.
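To illustrate, the generation trees for Even and the number each tree witnesses can be transcribed as follows (a Python sketch, not from the book; the class and function names are ours):

```python
# Sketch (not from the book): generation trees for the predicate Even.
# The clauses Even(0) and "Even(n) ->c Even(S(S n))" have constructor
# types xi and xi -> xi, so the algebra of witnesses is (a copy of) N.

class EvenZero:
    """Witnesses Even(0)."""

class EvenStep:
    """Witnesses Even(n+2) from a witness of Even(n)."""
    def __init__(self, prev):
        self.prev = prev

def witnessed(w):
    """The number whose evenness the generation tree w witnesses."""
    n = 0
    while isinstance(w, EvenStep):
        n += 2
        w = w.prev
    return n

# A generation tree for Even(6): nodes Even(0), Even(2), Even(4), Even(6).
even_six = EvenStep(EvenStep(EvenStep(EvenZero())))
```

Here the single branch EvenZero, EvenStep, EvenStep, EvenStep corresponds to the nodes Even(0), Even(2), Even(4), Even(6) of the text.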

More formally, along the inductive definition of formulas, predicates and clauses we will define

(i) the type τ(A) of a formula A, and in particular when A is computa-tionally relevant (c.r.);

(ii) the formula t realizes A, written t r A, for t a term of type τ(A).

This will require other subsidiary notions: for a (c.r.) inductively defined I, (i) its associated algebra ιI of witnesses or generation trees, and (ii) a witnessing predicate Ir of arity (ιI , ~ρ ), where ~ρ is the arity of I. All these notions are defined simultaneously.

We will also need to allow special computationally irrelevant (c.i.) inductively defined predicates. These are the following.

(i) For every I its witnessing predicate Ir. It is special in the sense that it just states the fact that we do have a realizer for I.


(ii) Leibniz equality Eq, and computationally irrelevant versions ∃nc and ∧nc of the existential quantifier and of conjunction. These are special in the sense that they are defined by just one clause, which contains →nc, ∀nc only and has no recursive premises.

Ir, Eq, ∃nc and ∧nc will be defined below.

7.2.3. The type of a formula, realizability and witnesses. We begin with the definition of the type τ(A) of a formula A, the type of a potential realizer of A. More precisely, τ(A) should be the type of the term (or “program”) to be extracted from a proof of A. Formally, we assign to every formula A an object τ(A) (a type or the “nulltype” symbol ◦). In case τ(A) = ◦, proofs of A have no computational content; such formulas A are called computationally irrelevant (c.i.) (or Harrop formulas); the other ones are called computationally relevant (c.r.). The definition can be conveniently written if we extend the use of ρ→ σ to the nulltype symbol ◦:

(ρ→ ◦) := ◦, (◦ → σ) := σ, (◦ → ◦) := ◦.

With this understanding of ρ→ σ we can simply write

Definition (Type τ(A) of a formula A).

τ(I~r ) := ιI if I is c.r., and τ(I~r ) := ◦ if I is c.i.,

τ(A→c B) := (τ(A)→ τ(B)), τ(A→nc B) := τ(B),

τ(∀cxρA) := (ρ→ τ(A)), τ(∀ncxρ A) := τ(A).
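The clauses above can be transcribed as a small recursive function (a sketch under our own tuple encoding of formulas; the names arrow, tau and algebra_of are not from the book, and None stands in for the nulltype symbol):

```python
# Sketch (not from the book): computing the type tau(A) of a formula,
# with None standing for the nulltype symbol.  Formulas are tuples:
# ("pred", I), ("->c", A, B), ("->nc", A, B), ("allc", rho, A), ("allnc", rho, A).

def arrow(rho, sigma):
    """Extend rho -> sigma to the nulltype: (rho -> o) = o, (o -> sigma) = sigma."""
    if sigma is None:
        return None
    if rho is None:
        return sigma
    return ("->", rho, sigma)

def tau(a, algebra_of=None):
    """Type of a potential realizer of formula a; None means a is c.i."""
    kind = a[0]
    if kind == "pred":            # tau(I r) = iota_I if I is c.r., nulltype if c.i.
        return (algebra_of or {}).get(a[1])
    if kind == "->c":
        return arrow(tau(a[1], algebra_of), tau(a[2], algebra_of))
    if kind == "->nc":
        return tau(a[2], algebra_of)
    if kind == "allc":
        return arrow(a[1], tau(a[2], algebra_of))
    if kind == "allnc":
        return tau(a[2], algebra_of)
    raise ValueError(kind)
```

For instance, with the totality predicate T for N assigned the algebra nat, the induction-like formula ∀cn^nat (Tn →c Tn) gets the type nat → (nat → nat) in this encoding.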

We now define realizability. It will be convenient to introduce a special “nullterm” symbol ε to be used as a “realizer” for c.i. formulas. We extend term application to the nullterm symbol by

εt := ε, tε := t, εε := ε.

The definition uses the witnessing predicate Ir associated with I, which is introduced below.

Definition (t realizes A). Let A be a formula and t either a term of type τ(A) if the latter is a type, or the nullterm symbol ε for c.i. A.

t r I~s := Irt~s,

t r (A→c B) := ∀ncx (x r A →nc tx r B),

t r (A→nc B) := ∀ncx (x r A →nc t r B),

t r (∀cxA) := ∀ncx (tx r A),

t r (∀ncx A) := ∀ncx (t r A).

In case A is c.i., ∀ncx (x r A →nc B(x)) means ε r A →nc B(ε). For the special c.i. inductively defined predicates realizability is defined by

ε r Irt~s := Irt~s,

ε r Eq(t, s) := Eq(t, s),

ε r ∃ncx A := ∃ncx,y(y r A),

ε r (A ∧nc B) := ∃ncx (x r A) ∧nc ∃ncy (y r B).


Note. Call two formulas A and A′ computationally equivalent if each of them computationally implies the other, and in addition the identity realizes each of the two derivations of A′ →c A and of A →c A′. It is an easy exercise to verify that for c.i. A, the formulas A →c B and A →nc B are computationally equivalent, and hence can be identified. In the sequel we shall simply write A → B for either of them. Similarly, for c.i. A the two formulas ∀cxA and ∀ncx A are c.i., and both ε r ∀cxA and ε r ∀ncx A are defined to be ∀ncx (ε r A). Hence they can be identified as well, and we shall simply write ∀xA for either of them. Since the formula t r A is c.i., under this convention the →,∀-cases in the definition of realizability can be written

t r (A→c B) := ∀x(x r A → tx r B),

t r (A→nc B) := ∀x(x r A → t r B),

t r (∀cxA) := ∀x(tx r A),

t r (∀ncx A) := ∀x(t r A).
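Under these conventions, the →, ∀-clauses of realizability can be sketched as a formula transformer (our own tuple encoding, not the book's; app, realizes and the atomic fallback are illustrative names):

```python
# Sketch (not from the book): the clauses "t r A" for ->c, ->nc, allc, allnc
# in the simplified form above.  Terms and formulas are nested tuples; the
# tuple ("eps",) plays the role of the nullterm.

def app(t, s):
    """Term application extended to the nullterm: eps t = eps, t eps = t."""
    if t == ("eps",):
        return ("eps",)
    if s == ("eps",):
        return t
    return ("app", t, s)

def realizes(t, a, fresh=0):
    kind = a[0]
    x = ("var", fresh)                       # fresh realizer variable
    if kind == "->c":                        # forall x (x r A -> t x r B)
        return ("all", x, ("->", realizes(x, a[1], fresh + 1),
                           realizes(app(t, x), a[2], fresh + 1)))
    if kind == "->nc":                       # forall x (x r A -> t r B)
        return ("all", x, ("->", realizes(x, a[1], fresh + 1),
                           realizes(t, a[2], fresh + 1)))
    if kind == "allc":                       # forall y (t y r A)
        return ("all", a[1], realizes(app(t, a[1]), a[2], fresh))
    if kind == "allnc":                      # forall y (t r A)
        return ("all", a[1], realizes(t, a[2], fresh))
    return ("r", t, a)                       # atomic case, e.g. I s ~> Ir t s
```

The nc-cases visibly discard the realizer of the premise or the object argument, mirroring the fact that their extracted terms carry no extra computational content.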

For every c.r. inductively defined predicate I = µX(K0, . . . , Kk−1) we define the algebra ιI of its generation trees or witnesses.

Definition (Algebra ιI of witnesses). Each clause generates a constructor type κi := τ(Ki), relative to τ(X~r ) := ξ. Then ιI := µξ~κ.

The witnessing predicate Ir of arity (ιI , ~ρ ) can now be defined as follows.

Definition (Witnessing predicate Ir). For every clause

K = ∀~x( ~A→ (∀~yν ( ~Bν → X~sν))ν<ni →c X~t )

of the original inductive definition of I we require the introduction axiom

(7.13) ∀~x,~u,~f (~u r ~A→ (∀~yν,~vν (~vν r ~Bν → Ir(fν~yν~vν , ~sν)))ν<n → Ir(C~x~u~f,~t ))

with the understanding that

(i) only those xj with a computational ∀cxj in K, and

(ii) only those ui with Ai c.r. and followed by →c in K

occur as arguments in C~x~u~f ; similarly for ~yν , ~vν and fν~yν~vν . Here C is the constructor of the algebra ιI generated from the constructor type τ(K).

Notice that in the clause K above →, ∀ can be either of →c or →nc and ∀c or ∀nc, depending on how the clause is formulated. However, in the introduction axiom (7.13) all displayed →, ∀ mean →nc, ∀nc, according to our convention in the note above.

The elimination axiom is

(7.14) ∀nc~x ∀cw(Irw~x→ (Kri (Ir, Q))i<k →c Qw~x )

with

Kri (Ir, Q) := ∀nc~x,~u,~f (~u r ~A→ (∀~yν,~vν (~vν r ~Bν → Ir(fν~yν~vν , ~sν)))ν<n → (∀c~yν,~vν (~vν r ~Bν → Q(fν~yν~vν , ~sν)))ν<n →c Q(Ci~x~u~f,~t )).


To understand this definition one needs to look at examples. Consider the totality predicate T for N inductively defined by the clauses

T0, ∀ncn (Tn→c T (Sn)).

More precisely T := µX(K0,K1) with K0 := X0, K1 := ∀ncn (Xn→c X(Sn)). These clauses have types κ0 := τ(K0) = τ(X0) = ξ and κ1 := τ(K1) = τ(∀ncn (Xn →c X(Sn))) = ξ → ξ. Therefore the algebra of witnesses is ιT := µξ(ξ, ξ → ξ), that is, N again. The witnessing predicate T r is defined by the clauses

T r00, ∀n,m(T rmn→ T r(Sm,Sn))

and it has as its elimination axiom

∀ncn ∀cm(T rmn→ Q(0, 0)→c ∀ncn,m(T rmn→ Qmn→c Q(Sm,Sn))→c Qmn).
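A witness m for Tn is just a numeral, and the term extracted from this elimination axiom is the recursion operator R for N. Its unfolding can be sketched as follows (rec_nat and the example are ours, not from the book):

```python
# Sketch (not from the book): the recursion operator R for N, the extracted
# term of the induction/elimination axiom for T.  R m w0 w1 turns a realizer
# w0 of P0 and a step realizer w1 into a realizer of Pm.

def rec_nat(m, w0, w1):
    """Unfolds R 0 w0 w1 = w0 and R (S n) w0 w1 = w1 n (R n w0 w1)."""
    acc = w0
    for n in range(m):
        acc = w1(n, acc)
    return acc

# Example: a realizer of "for every total n there is a list of length n".
print(rec_nat(3, [], lambda n, l: l + [n]))   # [0, 1, 2]
```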

As an example involving parameters consider the formula ∃dxA with a c.r. formula A, and view ∃dxA as inductively defined by the clause

∀cx(A→c ∃dxA).

More precisely, Exd(Y ) := µX(K0) with K0 := ∀cx(Y xρ →c X). Then ∃dxA abbreviates Exd({xρ | A}). The single clause has type κ0 := τ(K0) = τ(∀cx(Y xρ →c X)) = ρ → α → ξ. Therefore the algebra of witnesses is ι := ι∃dxA := µξ(ρ → α → ξ), that is, ρ × α. We write 〈x, u〉 for the values of the (only) constructor of ι, i.e., the pairing operator. The witnessing predicate (∃dxA)r is defined by the clause Kr0((∃dxA)r, {xρ | A}) :=

∀x,u(u r A→ (∃dxA)r〈x, u〉)

and its elimination axiom is

∀cw((∃dxA)rw → ∀ncx,u(u r A→ Q〈x, u〉)→c Qw).

Definition (Leibniz equality Eq and ∃nc, ∧nc). The introduction axioms are

∀ncx Eq(x, x), ∀ncx (A→nc ∃ncx A), A→nc B →nc A ∧nc B,

and the elimination axioms are

∀ncx,y(Eq(x, y)→ ∀ncx Pxx→c Pxy),

∃ncx A→ ∀ncx (A→nc P )→c P,

A ∧nc B → (A→nc B →nc P )→c P.

An important property of the realizing formulas t r A is that they are invariant, in the following sense:

Proposition. ε r (t r A) is the same formula as t r A.

Proof. By induction on the simultaneous inductive definition of formulas and predicates in 7.1.2.

Case t r I~s. By definition the formulas ε r (t r I~s ), ε r Irt~s, Irt~s and t r I~s are identical.

Case Irt~s. By definition ε r (ε r Irt~s) and ε r Irt~s are identical.


Case Eq(t, s). By definition ε r (ε r (Eq(t, s))) and ε r (Eq(t, s)) are identical.

Case ∃ncx A. The following formulas are identical.

ε r (ε r ∃ncx A)

ε r ∃ncx ∃ncy (y r A)

∃ncx (ε r ∃ncy (y r A))

∃ncx ∃ncy (ε r (y r A))

∃ncx ∃ncy (y r A) by induction hypothesis

ε r ∃ncx A.

Case A ∧nc B. The following formulas are identical.

ε r (ε r (A ∧nc B))

ε r (∃ncx (x r A) ∧nc ∃ncy (y r B))

ε r ∃ncx (x r A) ∧nc ε r ∃ncy (y r B)

∃ncx (ε r (x r A)) ∧nc ∃ncy (ε r (y r B))

∃ncx (x r A) ∧nc ∃ncy (y r B) by induction hypothesis

ε r (A ∧nc B).

Case A→c B. The following formulas are identical.

ε r (t r (A→c B))

ε r ∀x(x r A → tx r B)

∀x(ε r (x r A) → ε r (tx r B))

∀x(x r A → tx r B) by induction hypothesis

t r (A→c B).

Case A→nc B. The following formulas are identical.

ε r (t r (A→nc B))

ε r ∀x(x r A → t r B)

∀x(ε r (x r A) → ε r (t r B))

∀x(x r A → t r B) by induction hypothesis

t r (A→nc B).

Case ∀cxA. The following formulas are identical.

ε r (t r ∀cxA)

ε r ∀x(tx r A)

∀x(ε r (tx r A))

∀x(tx r A) by induction hypothesis

t r ∀cxA.

Case ∀ncx A. The following formulas are identical.

ε r (t r ∀ncx A)

ε r ∀x(t r A)


∀x(ε r (t r A))

∀x(t r A) by induction hypothesis

t r ∀ncx A.

This completes the proof.

7.2.4. Extracted terms. For a derivation M of a formula A we define its extracted term et(M), of type τ(A). This definition is relative to a fixed assignment of object variables to assumption variables: to every assumption variable uA for a formula A we assign an object variable xu of type τ(A).

Definition (Extracted term et(M) of a derivation M). For derivations MA with A c.i. let et(MA) := ε. Otherwise

et(uA) := xτ(A)u (xτ(A)u uniquely associated with uA),

et((λuAMB)A→cB) := λxτ(A)u et(M),

et((MA→cBNA)B) := et(M)et(N),

et((λxρMA)∀cxA) := λxρet(M),

et((M∀cxA(x)r)A(r)) := et(M)r,

et((λuAMB)A→ncB) := et((MA→ncBNA)B) := et((λxρMA)∀ncx A) := et((M∀ncx A(x)r)A(r)) := et(M).

Here λxτ(A)u et(M) means just et(M) if A is c.i.

It remains to define extracted terms for the axioms. Consider a (c.r.)

inductively defined predicate I. For its introduction axioms (7.1) and elimination axiom (7.2) define

et(I+i ) := Ci, et(I−) := R,

where both the constructor Ci and the recursion operator R refer to the algebra ιI associated with I.

Now consider the special computationally irrelevant inductively defined predicates. Since they are c.i., we only need to define extracted terms for their elimination axioms. For the witnessing predicate Ir we define et((Ir)−) := R (referring to the algebra ιI again), and for Leibniz equality Eq, the c.i. existential quantifier ∃ncx A and conjunction A ∧nc B we take identities of the appropriate type.

If derivations M are defined simultaneously with their extracted terms et(M), we can formulate the introduction rules for →nc and ∀nc by

(i) If MB is a derivation and xuA /∈ FV(et(M)), then (λuAMB)A→ncB is a derivation.

(ii) If MA is a derivation, x is not free in any formula of a free assumption variable of M and x /∈ FV(et(M)), then (λxMA)∀ncx A is a derivation.

7.2.5. Computational variants of some inductively defined predicates. We can now define variants of the inductively defined predicates in 7.1.4 and 7.1.5, which take computational aspects into account. For ∃ and ∧ we obtain ∃d, ∃l, ∃r, ∃nc, ∧d, ∧l, ∧r, ∧nc, with d for “double”, l for “left” and r for “right”. They are defined by their introduction and elimination axioms,


which involve both →c, ∀c and →nc, ∀nc. For ∃nc and ∧nc these have already been defined (in 7.2.3). For the remaining ones they are

∀cx(A→c ∃dxA),

∀cx(A→nc ∃lxA),

∀ncx (A→c ∃rxA),

∃dxA→c ∀cx(A→c P )→c P,

∃lxA→c ∀cx(A→nc P )→c P,

∃rxA→c ∀ncx (A→c P )→c P,

and similar for ∧:

A→c B →c A ∧d B,

A→c B →nc A ∧l B,

A→nc B →c A ∧r B,

A ∧d B →c (A→c B →c P )→c P,

A ∧l B →c (A→c B →nc P )→c P,

A ∧r B →c (A→nc B →c P )→c P.

Let ≺ be a binary relation. A computational variant of the inductively defined transitive closure of ≺ has introduction axioms

∀cx,y(x ≺ y →nc TC(x, y)),

∀cx,y∀ncz (x ≺ y →nc TC(y, z)→c TC(x, z)),

and the elimination axiom is according to (7.11)

∀ncx,y(TC(x, y)→c ∀cx,y(x ≺ y →nc Pxy)→c ∀cx,y∀ncz (x ≺ y →nc TC(y, z)→c Pyz →c Pxz)→c Pxy).

Consider the accessible part of a binary relation ≺. A computational variant is determined by the introduction axioms

∀cx(F→ Acc(x)),

∀ncx (∀cy≺xAcc(y)→c Acc(x)),

where ∀cy≺xA stands for ∀cy(y ≺ x→nc A). The elimination axiom is

∀ncx (Acc(x)→c ∀cx(F→ Px)→c ∀ncx (∀cy≺xAcc(y)→c ∀cy≺xPy →c Px)→c Px).

A computable variant of the derivability predicate from 7.1.5 has the introduction axioms

∀cx(A(x)→nc Der(x)),

∀ncx,y∀cz(B(x, y, z)→nc Der(x)→c Der(y)→c Der(z)).

The elimination axiom is according to (7.11)

∀ncx (Der(x)→c ∀cx(A(x)→nc Px)→c ∀ncx,y∀cz(B(x, y, z)→nc Der(x)→c Der(y)→c Px→c Py →c Pz)→c Px).


7.2.6. Computational variants of totality and induction. We now adapt the treatment of totality and induction in 7.1.6 to the decorated connectives →c, ∀c and →nc, ∀nc, giving computational variants of totality. Their elimination axiom provides us with a computational induction axiom, whose extracted term is the recursion operator of the corresponding algebra.

Recall that the definition of the totality predicates Tρ was relative to a given assignment α 7→ Tα of predicate variables to type variables. In the definition of Tι the clauses are decorated by

Ki := ∀c~xP ∀nc~xR ((∀nc~yν (T~σν~yν →c X(xRν ~yν)))ν<n →c X(Ci~x )),

and in the arrow type case the (explicit) clause is decorated by

Tρ→σ := µX ∀ncf (∀ncx (Tρx→c Tσ(fx))→c Xf).

Abbreviating ∀ncx (Tx →c A) by ∀cx∈T A allows a shorter formulation of the introduction axioms and elimination schemes:

∀ncf (∀cx∈Tρ Tσ(fx)→c Tρ→σf),

∀cf∈Tρ→σ,x∈Tρ Tσ(fx),

∀c~xP ∀nc~xR ((∀c~yν∈T~σν Tι(xRν ~yν))ν<n →c Tι(Ci~x )),

∀cx∈Tι (K0(Tι, P )→c . . .→c Kk−1(Tι, P )→c Px)

where

Ki(Tι, P ) := ∀c~xP ∀nc~xR ((∀c~yν∈T~σν Tι(xRν ~yν))ν<n →c (∀c~yν∈T~σν P (xRν ~yν))ν<n →c P (Ci~x )).

It is helpful to look at some examples. Let (Tι)+i denote the i-th introduction axiom for Tι.

(TN)+1 : ∀cn∈T T (Sn),

(TL(ρ))+1 : ∀cx∀cl∈T T (x :: l),

(Tρ×σ)+0 : ∀cx,y T 〈x, y〉.

The elimination axiom T−ι now is the computational induction axiom, and is denoted accordingly. Examples are

Indp,P : ∀cp∈T (P tt→c P ff →c PpB),

Indn,P : ∀cn∈T (P0→c ∀cn∈T (Pn→c P (Sn))→c PnN),

Indl,P : ∀cl∈T (P (nil)→c ∀cx∀cl∈T (Pl→c P (x :: l))→c PlL(ρ)),

Indz,P : ∀cz∈T (∀cx,yP 〈xρ, yσ〉 →c Pzρ×σ).

Notice that for the totality predicates Tρ the type τ(Tρr) is ρ, provided we extend the definition of τ(A) to the predicate variable Tα assigned to type variable α by stipulating τ(Tαr) := α. This can be proved easily, by induction on ρ. As a consequence, the types of the various computational induction schemes are, with τ := τ(A)

τ(Indp,A) = B→ τ → τ → τ,

τ(Indn,A) = N→ τ → (N→ τ → τ)→ τ,

τ(Indl,A) = L(ρ)→ τ → (ρ→ L(ρ)→ τ → τ)→ τ,

τ(Indx,A) = ρ+ σ → (ρ→ τ)→ (σ → τ)→ τ,


τ(Indz,A) = ρ× σ → (ρ→ σ → τ)→ τ.

These are the types of the corresponding recursion operators.

The type of general induction (7.12) is

(α→ N)→ α→ (α→ (α→ τ)→ τ)→ τ,

which is the type of the general recursion operator F defined in (6.5).

7.2.7. Soundness. We prove that every theorem in TCF + Ax has a realizer: the extracted term of its proof. Here Ax is an arbitrary set of computationally irrelevant formulas viewed as axioms. Since the extensionality axiom is c.i., E-LID is covered as well.

Theorem (Soundness). Let M be a derivation of A from assumptions ui : Ci (i < n). Then we can derive et(M) r A from assumptions xui r Ci (with xui := ε in case Ci is c.i.).

If not stated otherwise, all derivations are in TCF + Ax. The proof is by induction on M.

Proof for the logical rules. Case u : A. Then et(u) = xu.

Case (λuAMB)A→cB. We must find a derivation of

et(λuM) r (A→c B), which is ∀x(x r A → et(λuM)x r B).

Recall that et(λuM) = λxuet(M). Renaming x into xu, our goal is to find a derivation of

∀xu(xu r A → et(M) r B),

since we identify terms with the same β-normal form. But by induction hypothesis we have a derivation of et(M) r B from xu r A.

Case MA→cBNA. We must find a derivation of et(MN) r B. Recall et(MN) = et(M)et(N). By induction hypothesis we have derivations of

et(M) r (A→c B), which is ∀x(x r A → et(M)x r B),

and of et(N) r A. Hence the claim.

Case (λxMA)∀cxA. We must find a derivation of et(λxM) r ∀cxA. By definition et(λxM) = λxet(M). Hence we must derive

λxet(M) r ∀cxA, which is ∀x((λxet(M))x r A).

Since we identify terms with the same β-normal form, the claim follows from the induction hypothesis.

Case M∀cxA(x)t. We must find a derivation of et(Mt) r A(t). By definition et(Mt) = et(M)t, and by induction hypothesis we have a derivation of

et(M) r ∀cxA(x), which is ∀x(et(M)x r A(x)).

Hence the claim.

Case (λuAMB)A→ncB. We must find a derivation of et(M) r (A→nc B), i.e., of ∀y(y r A → et(M) r B). But this is immediate from the induction hypothesis.

Case MA→ncBNA. We must find a derivation of et(M) r B. By induction hypothesis we have derivations of

et(M) r (A→nc B), which is ∀y(y r A → et(M) r B),

and of et(N) r A. Hence the claim.

Case (λxMA)∀ncx A. We must find a derivation of et(λxM) r ∀ncx A. By definition et(λxM) = et(M). Hence we must derive

et(M) r ∀ncx A, which is ∀x(et(M) r A).

But this follows from the induction hypothesis.

Case M∀ncx A(x)t. We must find a derivation of et(Mt) r A(t). By definition et(Mt) = et(M), and by induction hypothesis we have a derivation of

et(M) r ∀ncx A(x), which is ∀x(et(M) r A(x)).

Hence the claim.

It remains to prove the soundness theorem for the axioms, i.e., that their extracted terms are realizers. Before doing anything general let us first look at an example. Totality for N has been inductively defined by the clauses

T0, ∀ncn (Tn→c T (Sn)).

Its elimination axiom is

∀ncn (Tn→c P0→c ∀ncn (Tn→c Pn→c P (Sn))→c Pn).

We show that their extracted terms 0, S and R are indeed realizers. For the proof recall from the examples in 7.2.3 that the witnessing predicate T r is defined by the clauses

T r00, ∀n,m(T rmn→ T r(Sm,Sn))

and it has as its elimination axiom

∀ncn ∀cm(T rmn→ Q00→c ∀ncn,m(T rmn→ Qmn→c Q(Sm,Sn))→c Qmn).

Lemma. (a) 0 r T0 and S r ∀ncn (Tn→c T (Sn)).

(b) R r ∀ncn (Tn→c P0→c ∀ncn (Tn→c Pn→c P (Sn))→c Pn).

Proof. (a). 0 r T0 is defined to be T r00. Moreover, by definition S r ∀ncn (Tn→c T (Sn)) unfolds into ∀n,m(T rmn→ T r(Sm,Sn)).

(b). Let n,m be given and assume m r Tn. Let further w0, w1 be such that w0 r P0 and w1 r ∀ncn (Tn→c Pn→c P (Sn)), i.e.,

∀n,m(T rmn→ ∀g(g r Pn→ w1mg r P (Sn))).

Our goal is

Rmw0w1 r Pn =: Qmn.

To this end we use the elimination axiom for T r above. Hence it suffices to prove its premises Q00 and ∀ncn,m(T rmn → Qmn →c Q(Sm,Sn)). By a conversion rule for R (cf. 6.2.2) the former is the same as w0 r P0, which we have. For the latter assume n,m and its premises. We show Q(Sm,Sn), i.e., R(Sm)w0w1 r P (Sn). By a conversion rule for R this is the same as

w1m(Rmw0w1) r P (Sn).

But with g := Rmw0w1 this follows from what we have.


Proof for the axioms. We first prove soundness for introduction and elimination axioms of c.r. inductively defined predicates, and show that the extracted terms defined above indeed are realizers. The proof uses the introduction axioms (7.13) and the elimination axiom (7.14) for Ir.

By the clauses (7.13) for Ir we clearly have Ci r I+i . For the elimination axiom we have to prove R r I−, that is,

R r ∀nc~x (I~x→c (Ki(I, P ))i<k →c P~x ).

Let ~x,w be given and assume w r I~x. Let further w0, . . . , wk−1 be such that wi r Ki(I, P ). For simplicity we assume that all universal quantifiers and implications in Ki are computational. Then wi r Ki(I, P ) is

(7.15) ∀~x,~u,~f,~g (~u r ~A→ (∀~yν,~vν (~vν r ~Bν → fν~yν~vν r I(~sν)))ν<n → (∀~yν,~vν (~vν r ~Bν → gν~yν~vν r P (~sν)))ν<n → wi~x~u~f~g r P (~t )).

Our goal is

Rw~w r P~x =: Qw~x.

We use the elimination axiom (7.14) for Ir with Q(w, ~x ), i.e.,

∀nc~x ∀cw(Irw~x→ (Kri (Ir, Q))i<k →c Qw~x ).

Hence it suffices to prove Kri (Ir, Q) for every constructor formula Ki, i.e.,

(7.16) ∀nc~x,~u,~f (~u r ~A→ (∀~yν,~vν (~vν r ~Bν → Ir(fν~yν~vν , ~sν)))ν<n → (∀c~yν,~vν (~vν r ~Bν → Q(fν~yν~vν , ~sν)))ν<n →c Q(Ci~x~u~f,~t )).

So assume ~x, ~u, ~f and the premises of (7.16). We show Q(Ci~x~u~f,~t ), i.e.,

R(Ci~x~u~f )~w r P (~t ).

By the conversion rules for R (cf. 6.2.2) this is the same as

wi~x~u~f(λ~yν ,~vνR(fν~yν~vν)~w)ν<n r P (~t ).

To this end we use (7.15) with ~x, ~u, ~f, (λ~yν,~vνR(fν~yν~vν)~w)ν<n. Its conclusion is what we want, and its premises follow from the premises of (7.16).

We still need to attend to the special c.i. inductively defined predicates Ir, Eq, ∃nc and ∧nc. For Ir we must show that ε realizes the introduction axiom (7.13) and R realizes the elimination axiom (7.14). The former follows from the very same axiom, using the invariance of realizing formulas (as proved in the proposition at the end of 7.2.3). For the latter we can argue similarly as for the proof of R r I− above. However, we carry this out, since the way the decorations work is rather delicate here.

We have to prove R r (Ir)−, that is,

R r ∀nc~x ∀cw(Irw~x→ (Kri (Ir, Q))i<k →c Qw~x )


with Kri (Ir, Q) as in (7.16). Let ~x,w be given and assume Irw~x. Let further w0, . . . , wk−1 be such that wi r Kri (Ir, Q), i.e.,

(7.17) ∀~x,~u,~f,~g (~u r ~A→ (∀~yν,~vν (~vν r ~Bν → Ir(fν~yν~vν , ~sν)))ν<n → (∀~yν,~vν (~vν r ~Bν → gν~yν~vν r Q(fν~yν~vν , ~sν)))ν<n → wi~x~u~f~g r Q(Ci~x~u~f,~t )).

Our goal is

Rw~w r Qw~x =: Q′w~x.

We use the elimination axiom (7.14) for Ir with Q′w~x, i.e.,

∀nc~x ∀cw(Irw~x→ (Kri (Ir, Q′))i<k →c Q′w~x ).

Hence it suffices to prove Kri (Ir, Q′) for every constructor formula Ki, i.e.,

(7.18) ∀nc~x,~u,~f (~u r ~A→ (∀~yν,~vν (~vν r ~Bν → Ir(fν~yν~vν , ~sν)))ν<n → (∀c~yν,~vν (~vν r ~Bν → Q′(fν~yν~vν , ~sν)))ν<n →c Q′(Ci~x~u~f,~t )).

So assume ~x, ~u, ~f and the premises of (7.18). We show Q′(Ci~x~u~f,~t ), i.e.,

R(Ci~x~u~f )~w r Q(Ci~x~u~f,~t ).

By the conversion rules for R this is the same as

wi~x~u~f(λ~yν ,~vνR(fν~yν~vν)~w)ν<n r Q(Ci~x~u~f,~t ).

To this end we use (7.17) with ~x, ~u, ~f, (λ~yν,~vνR(fν~yν~vν)~w)ν<n. Its conclusion is what we want, and its premises follow from the premises of (7.18).

It remains to consider the introduction and elimination axioms for Eq, ∃nc and ∧nc. We first prove that ε is a realizer for the introduction axioms. The following formulas are identical by definition, and the final one in each block is derivable:

ε r ∀ncx Eq(x, x)

∀x(ε r Eq(x, x))

∀xEq(x, x)

ε r ∀ncx (A→nc ∃ncx A)

∀x(ε r (A→nc ∃ncx A))

∀x,y(y r A→ ε r ∃ncx A)

∀x,y(y r A→ ∃ncx,y(y r A))

ε r (A→nc B →nc A ∧nc B)

∀x(x r A→ ∀y(y r B → ε r (A ∧nc B)))

∀x(x r A→ ∀y(y r B → ∃ncx (x r A) ∧nc ∃ncy (y r B)))

We now prove that the identity is a realizer for the elimination axioms. Again the formulas in each block are identical by definition, and the final one is derivable.

id r ∀ncx,y(Eq(x, y)→ ∀ncx Pxx→c Pxy)

∀x,y(ε r Eq(x, y)→ id r (∀ncx Pxx→c Pxy))

∀x,y(Eq(x, y)→ ∀z(z r ∀ncx Pxx→ z r Pxy))

∀x,y(Eq(x, y)→ ∀z(∀x(z r Pxx)→ z r Pxy))


id r (∃ncx A→nc ∀ncx (A→nc P )→c P )

ε r ∃ncx A→ id r (∀ncx (A→nc P )→c P )

∃ncx,y(y r A)→ ∀z(z r ∀ncx (A→nc P )→ z r P )

∃ncx,y(y r A)→ ∀z(∀x(z r (A→nc P ))→ z r P )

∃ncx,y(y r A)→ ∀z(∀x(∀y(y r A→ z r P ))→ z r P )

id r (A ∧nc B →nc (A→nc B →nc P )→c P )

ε r (A ∧nc B)→ id r ((A→nc B →nc P )→c P )

∃ncx (x r A) ∧nc ∃ncy (y r B)→ ∀z(z r (A→nc B →nc P )→ z r P )

∃ncx (x r A) ∧nc ∃ncy (y r B)→ ∀z(∀x(x r A→ ∀y(y r B → z r P ))→ z r P )

We finally show that general recursion provides a realizer for general induction. Recall that according to (7.12) general induction is the schema

∀cµ∈T∀cx∈T (ProgµxPx→c Px)

where ProgµxPx expresses “progressiveness” w.r.t. the measure function µ and the ordering <:

ProgµxPx := ∀cx∈T (∀cy∈T (µy < µx→nc Py)→c Px).

We need to show

F r ∀cµ,x∈T (ProgµxPx→c Px),

that is,

∀cµ,x∈T∀ncg (g r ∀cx∈T (∀cy∈T ;µy<µxPy →c Px)→ Fµxg r Px).

Fix µ, x, g and assume the premise, which unfolds into

(7.19) ∀ncx∈T,f (∀ncy∈T ;µy<µx(fy r Py)→nc gxf r Px).

We have to show Fµxg r Px. To this end we use an instance of general induction with the formula Fµxg r Px, that is,

∀cµ,x∈T (∀cx∈T (∀cy∈T ;µy<µx(Fµyg r Py)→c Fµxg r Px)→c Fµxg r Px).

It suffices to prove the premise. Assume ∀cy∈T ;µy<µx(Fµyg r Py) for a fixed x ∈ T. We must show Fµxg r Px. Recall that by definition (6.5)

Fµxg = gxf0 with f0 := λy[if µy < µx then Fµyg else ε].

Hence we can apply (7.19) to x, f0, and it remains to show

∀ncy∈T ;µy<µx(f0y r Py).

Fix y ∈ T with µy < µx. Then f0y = Fµyg, and by the last assumption we have Fµyg r Py.

7.2.8. An example: list reversal. We first give an informal existence proof for list reversal. Write vw for the result v ∗ w of appending the list w to the list v, vx for the result v ∗ x: of appending the one element list x: to the list v, and xv for the result x :: v of constructing a list by writing an element x in front of a list v, and omit the parentheses in R(v, w) for (typographically) simple arguments. Assume

(7.20) InitRev : R(nil, nil),

(7.21) GenRev : ∀v,w,x(Rvw → R(vx, xw)).


We view R as a predicate variable without computational content. The reader should not be confused: of course these formulas involving R do express how a computation of the reverted list should proceed. However, the predicate variable R itself is a placeholder for a c.i. formula.

A straightforward proof of ∀v∈T∃w∈TRvw proceeds as follows. We first prove a lemma ListInitLastNat stating that every non-empty list can be written in the form vx. Using it, ∀v∈T∃w∈TRvw can be proved by induction on the length of v. In the step case, our list is non-empty, and hence can be written in the form vx. Since v has smaller length, the induction hypothesis yields its reversal w. Then we can take xw.

Here is the term neterm (for “normalized extracted term”) extracted from a formalization of this proof, with variable names f for unary functions on lists and p for pairs of lists and numbers:

[x0]
 (Rec nat=>list nat=>list nat)x0([v2](Nil nat))
 ([x2,f3,v4]
   [if v4
     (Nil nat)
     ([x5,v6][let p7 (cListInitLastNat v6 x5)
               (right p7::f3 left p7)])])

where the square brackets in [x] are a notation for λ-abstraction λx. The term contains the constant cListInitLastNat denoting the content of the auxiliary proposition, and in the step the function defined recursively calls itself via f3. The underlying algorithm defines an auxiliary function g by

g(0, v) := nil,

g(n+ 1,nil) := nil,

g(n+ 1, xv) := let wy = xv in y :: g(n,w)

and gives the result by applying g to lh(v) and v. It clearly takes quadratic time. To run this algorithm one has to normalize (via “nt”) the term obtained by applying neterm to the length of a list and the list itself, and pretty print the result (via “pp”):

(animate "ListInitLastNat")
(animate "Id")
(pp (nt (mk-term-in-app-form
         neterm (pt "4") (pt "1::2::3::4:"))))

The returned value is the reverted list 4::3::2::1:. We have made use here of a mechanism to “animate” or “deanimate” lemmata, or more precisely the constants that denote their computational content. This method can be described generally as follows. Suppose a proof of a theorem uses a lemma. Then the proof term contains just the name of the lemma, say L. In the term extracted from this proof we want to preserve the structure of the original proof as much as possible, and hence we use a new constant cL at those places where the computational content of the lemma is needed. When we want to execute the program, we have to replace the constant cL corresponding to a lemma L by the extracted program of its proof. This can be achieved by adding computation rules for cL. We can be rather flexible


here and enable/block rewriting by using animate/deanimate as desired. To obtain the let expression in the term above, we have used implicitly the “identity lemma” Id : P → P; its realizer has the form λf,x(fx). If Id is not animated, the extracted term has the form cId(λxM)N, which is printed as [let x N M].

We shall later (in 7.5.2) come back to this example. It will turn out that the method of “refined A-translation” (treated in section 7.3) applied to a weak existence proof (of ∀v∈T ∃̃w∈T Rvw rather than ∀v∈T ∃w∈T Rvw) together with decoration will make it possible to extract the usual linear list reversal algorithm from a proof.
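For comparison, the quadratic algorithm defined by g above can be transcribed directly (a sketch, not Minlog output; init_last plays the role of cListInitLastNat, and all names are ours):

```python
# Sketch (not from the book): the quadratic list-reversal algorithm
# underlying the extracted term.  init_last splits a non-empty list,
# written in the form vx, into the initial segment v and the last element x.

def init_last(l):
    return l[:-1], l[-1]

def g(n, v):
    if n == 0 or not v:
        return []
    w, y = init_last(v)           # write the list in the form vx
    return [y] + g(n - 1, w)      # y :: g(n, w), as in the step case

def reverse(v):
    return g(len(v), v)           # apply g to lh(v) and v

print(reverse([1, 2, 3, 4]))      # [4, 3, 2, 1]
```

Each call to init_last traverses the list once, which is where the quadratic running time comes from; the linear algorithm mentioned above avoids this by accumulating the result.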

7.2.9. Computational content for coinductive definitions. We now extend the insertion of computational content to the axioms for coinductively defined predicates.

Consider for example the coinductive definition of cototality for the algebra N in 7.1.7. Taking computational content into account, it is decorated by

∀ncn (coTNn→c Eq(n, 0) ∨ ∃rm(Eq(n,Sm) ∧ coTNm)).

Its decorated greatest-fixed-point axiom is

∀ncn (Pn→c ∀ncn (Pn→c Eq(n, 0) ∨ ∃rm(Eq(n,Sm) ∧ Pm))→c coTNn).

If coTNr holds, then by the clause we have a cototal ideal in an algebra built in correspondence with the clause, which in this case again is (an isomorphic copy of) N. The predicate coTN can be understood as the greatest set of witness-argument pairs satisfying the clause, the witness being a cototal ideal.
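The clause suggests reading a witness for coTNn as a lazy, possibly infinite unfolding. A sketch with Python generators (step and unfold are our illustrative names, not from the book):

```python
# Sketch (not from the book): a witness for coTN n as a lazy unfolding.
# step(n) returns None in the Eq(n, 0) case, and the predecessor m in the
# Eq(n, S m) case; unfolding it yields the generation tree, traversed here
# as the stream of arguments n, m, ...

def unfold(step, n):
    while True:
        yield n
        m = step(n)
        if m is None:            # the Eq(n, 0) case: the tree is finite
            return
        n = m                    # the Eq(n, S m) case: continue with m

nodes = list(unfold(lambda n: n - 1 if n > 0 else None, 3))
print(nodes)                     # [3, 2, 1, 0]
```

With a step function that never returns None the stream is infinite, corresponding to a cototal but not total ideal.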

Let us also reconsider the example at the end of 6.2.3 concerning “abstract” reals, having an unspecified type ρ. Let Rx abbreviate “x is a real in [−1, 1]”, and assume that we have a type σ for rationals, and a predicate Q such that Qp means “p is a rational in [−1, 1]”. To formalize the argument, we assume that in the abstract theory we can prove that every real can be compared with a proper interval with rational endpoints:

(7.22) ∀^c_{x∈R; p,q∈Q} (p < q → x ≤ q ∨ p ≤ x).

We coinductively define a predicate J of arity (ρ) by the clause

(7.23) ∀^{nc}_x (Jx →^c Eq(x, 0) ∨ ∃^r_y (Eq(x, (y−1)/2) ∧ Jy) ∨ ∃^r_y (Eq(x, y/2) ∧ Jy) ∨ ∃^r_y (Eq(x, (y+1)/2) ∧ Jy)).

Notice that this clause has the same form as the definition of cototality coT_I for I in 7.1.7; in particular, its associated algebras (defined below) are the same. The only difference is that the arity of coT_I is (I), whereas the arity of J is (ρ), with ρ the unspecified type of reals. This makes it possible to extract computational content (w.r.t. a stream representation) from proofs


in an abstract theory of reals. — The greatest-fixed-point axiom for J is

(7.24) ∀^{nc}_x (Px →^c ∀^{nc}_x (Px →^c Eq(x, 0) ∨ ∃^r_y (Eq(x, (y−1)/2) ∧ Py) ∨ ∃^r_y (Eq(x, y/2) ∧ Py) ∨ ∃^r_y (Eq(x, (y+1)/2) ∧ Py)) →^c Jx).

The types of (7.23) and (7.24) are

ι → U + ι + ι + ι,   τ → (τ → U + τ + τ + τ) → ι,

respectively, with ι the algebra associated with this clause (which in fact is I), and τ := τ(Pr). Note that the former is the type of the destructor for ι, and the latter is the type of the corecursion operator ^{co}R^τ_ι.

We prove that Rx implies Jx, and that Jx implies that x can be approximated arbitrarily well by a rational. As one can expect from the types above, a realizability interpretation of these proofs will be computationally informative.

Let I_{p,k} := [p − 2^{−k}, p + 2^{−k}] and B_k x := ∃^l_q (x ∈ I_{q,k}), meaning that x can be approximated by a rational with accuracy 2^{−k}.

Proposition. (a) ∀^{nc}_x (Rx →^c Jx).
(b) ∀^{nc}_x (Jx →^c ∀^c_k B_k x).

Proof. (a). We use (7.24) with R for P. It suffices to prove

Rx → ∃^r_y (Eq(x, (y−1)/2) ∧ Ry) ∨ ∃^r_y (Eq(x, y/2) ∧ Ry) ∨ ∃^r_y (Eq(x, (y+1)/2) ∧ Ry).

Since x ∈ [−1, 1], by (7.22) either x ∈ [−1, 0] or x ∈ [−1/2, 1/2] or x ∈ [0, 1]. Let for example x ∈ [−1, 0]. Choose y := 2x + 1. Then y ∈ [−1, 1] and therefore Ry, and clearly Eq(x, (y−1)/2).

(b). We prove ∀^c_k ∀^{nc}_x (Jx →^c B_k x) by induction on k. Base, k = 0. Since x ∈ [−1, 1], we clearly have B_0 x. Step, k ↦ k + 1. Assume Jx. Then Eq(x, 0) or (for instance) ∃^r_y (Eq(x, (y−1)/2) ∧ Jy) by (7.23). In case Eq(x, 0) the claim is trivial, since B_{k+1} 0. Otherwise let a real y with Eq(x, (y−1)/2) and Jy be given. By induction hypothesis we have B_k y. Because of Eq(x, (y−1)/2) this implies B_{k+1} x (if y ∈ I_{q,k}, then x = (y−1)/2 ∈ I_{(q−1)/2, k+1}). □
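The computational content of this proposition can be illustrated concretely (a sketch of ours under simplifying assumptions: we take x rational, so that the comparison provided by (7.22) is decidable, and we omit the middle digit 0, which is not needed for the bound): part (a) corresponds to corecursively unfolding x ∈ [−1, 1] into a stream of signed digits d with x = (y + d)/2 for the next seed y, and part (b) to reading k digits, which yields a rational within 2^{−k} of x.

```python
from fractions import Fraction
from typing import Iterator

def digit_stream(x: Fraction) -> Iterator[int]:
    """Corecursively unfold x in [-1, 1] into signed digits d in {-1, 1}
    with x = (y + d)/2 for the next seed y, mirroring clause (7.23)."""
    while True:
        if x <= 0:              # x in [-1, 0]: emit -1, continue with y = 2x + 1
            d, x = -1, 2 * x + 1
        else:                   # x in (0, 1]: emit 1, continue with y = 2x - 1
            d, x = 1, 2 * x - 1
        yield d

def approximate(x: Fraction, k: int) -> Fraction:
    """Part (b): reading k digits gives q with |x - q| <= 2**-k, since
    x = d_1/2 + ... + d_k/2**k + y_k/2**k with y_k in [-1, 1]."""
    q, scale = Fraction(0), Fraction(1)
    digits = digit_stream(x)
    for _ in range(k):
        scale /= 2
        q += scale * next(digits)
    return q
```

The redundancy of the signed-digit representation (overlapping intervals) is what makes the digit choice computable from a mere comparison, exactly as the proper overlap of intervals in (7.22) avoids an undecidable test for equality of reals.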

The general development follows the one for inductively defined predicates rather closely. For simplicity we only consider the finitary case. Again by default, coinductively defined predicates are computationally relevant (c.r.), with the only exception of the witnessing predicates J^r defined below. The clause for a c.r. coinductively defined predicate is decorated by

∀^{nc}_{~x} (J~x →^c ∨∨_{i<k} ∃^r_{~y_i} (∧∧ ~A_i ∧ ∧∧_{ν<n_i} J ~s_{iν}))

where the conjunction after each A_{iν} is either ∧^d or ∧^r, and each conjunction between the J ~s_{iν} is ∧^d. Its greatest-fixed-point axiom is decorated by

∀^{nc}_{~x} (P~x →^c ∀^{nc}_{~x} (P~x →^c ∨∨_{i<k} ∃^r_{~y_i} (∧∧ ~A_i ∧ ∧∧_{ν<n_i} P ~s_{iν})) →^c J~x).


The definitions of the type τ(A) of a formula A and of the realizability relation t r A are extended by

τ(J~r) := ι_J if J is c.r. (c.i. predicates are assigned no type),
t r J~s := J^r t ~s,
ε r J^r t ~s := J^r t ~s.

The algebra ι_J of witnesses for a coinductively defined predicate J := νX K is defined as follows. Let

K = ∀^{nc}_{~x} (J~x →^c ∨∨_{i<k} ∃^r_{~y_i} (∧∧ ~A_i ∧ ∧∧_{ν<n_i} J ~s_{iν})).

Then ι_J has k constructors, the i-th one of type τ(A_{i m_1}) → ⋯ → τ(A_{i m_n}) → ι_J → ⋯ → ι_J with A_{i m_1}, …, A_{i m_n} those of ~A_i which are c.r. and followed by ∧^d (rather than ∧^r) in K, and n_i occurrences of ι_J.

The witnessing predicate J^r of arity (ι_J, ~ρ) is coinductively defined by

∀^{nc}_{~x} ∀^c_w (J^r w ~x →^{nc} ∨∨_{i<k} ∃^r_{~y_i} ∃^d_{~u_i} ∃^l_{~z_i} (Eq(w, C_i ~u_i ~z_i) ∧ ∧∧ ~u_i r ~A_i ∧ ∧∧_{ν<n_i} J^r z_{iν} ~s_{iν}))

with the understanding that only those u_{ij} occur with A_{ij} c.r. and followed by ∧^d in K.

For example, for cototality of N coinductively defined by the clause

∀^{nc}_n (coT_N n →^c Eq(n, 0) ∨ ∃^r_m (Eq(n, Sm) ∧ coT_N m))

the witnessing predicate coT^r_N has arity (N, N) and is defined by

∀^{nc}_n ∀^c_w (coT^r_N w n →^{nc} (Eq(w, 0) ∧ Eq(n, 0)) ∨ ∃^r_m ∃^l_z (Eq(w, Sz) ∧ Eq(n, Sm) ∧ coT^r_N z m)).

The realizing formula t r A continues to be invariant, since ε r (t r J~s) is identical to t r J~s. The extracted term of the clause of a coinductively defined predicate is the destructor of its associated algebra, and for its greatest-fixed-point axiom it is the corecursion operator of this algebra. The proof of the soundness theorem can easily be extended.

Finally we reconsider the example from 6.1.7 and 7.1.7 dealing with (uniformly) continuous real functions, taking computational content into account. We decorate (7.5) – (7.9) as follows.

(7.25) ∀^{nc}_f (f[I] ⊆ I_d → Y(out_d f) →^c I_Y f)   (d ∈ {−1, 0, 1}),
(7.26) ∀^{nc}_f (I_Y(f ∘ in_{−1}) →^c I_Y(f ∘ in_0) →^c I_Y(f ∘ in_1) →^c I_Y f).

The decorated version of its least-fixed-point axiom is

(7.27) ∀^{nc}_f (I_Y f →^c
       (∀^{nc}_f (f[I] ⊆ I_d → Y(out_d f) →^c Pf))_{d∈{−1,0,1}} →^c
       ∀^{nc}_f ((I_Y(f ∘ in_d))_{d∈{−1,0,1}} →^c (P(f ∘ in_d))_{d∈{−1,0,1}} →^c Pf) →^c
       Pf).

The simultaneous inductive/coinductive definition of J is decorated by

(7.28) ∀^{nc}_f (Jf →^c Eq(f, id) ∨ I_J f)


and its greatest-fixed-point axiom by

(7.29) ∀^{nc}_f (Qf →^c ∀^{nc}_f (Qf →^c Eq(f, id) ∨ I_Q f) →^c Jf).

The types of (7.25) – (7.29) are

α → R(α),
R(α) → R(α) → R(α) → R(α),
R(α) → (α → τ_P)³ → (R(α)³ → τ_P³ → τ_P) → τ_P,
W → U + R(W),
τ_Q → (τ_Q → U + R(τ_Q)) → W,

respectively, with α := τ(Y f), τ_P := τ(Pr) and τ_Q := τ(Qs). Substituting α by W and writing R for R(W) we obtain

W → R,
R → R → R → R,
R → (W → τ_P)³ → (R³ → τ_P³ → τ_P) → τ_P,
W → U + R,
τ_Q → (τ_Q → U + R(τ_Q)) → W.

These are the types of the first three constructors for R, the fourth constructor for R, the recursion operator R^{τ_P}_R, the destructor for W and the corecursion operator ^{co}R^{τ_Q}_W, respectively.

The general form of simultaneous inductive/coinductive definitions of predicates (in the finitary case) is decorated by

∀^{nc}_{~x} (J~x →^c ∨∨_{i<k} ∃^r_{~y_i} (∧∧ ~A_i ∧ ∧∧_{ν<n_i} I_J ~s_{iν}))

where the conjunction after each A_{iν} is either ∧^d or ∧^r, and each conjunction between the I_J ~s_{iν} is ∧^d. Its greatest-fixed-point axiom is decorated by

J⁺ : ∀^{nc}_{~x} (P~x →^c ∀^{nc}_{~x} (P~x →^c ∨∨_{i<k} ∃^r_{~y_i} (∧∧ ~A_i ∧ ∧∧_{ν<n_i} I_P ~s_{iν})) →^c J~x).

The algebra ι of witnesses has as constructors those of I_J, and in addition those caused by the (single) clause of J, as explained above. The witnessing predicate J^r of arity (ι, ~ρ) then needs J-cototal-I_J-total ideals as witnesses. However, we omit a further development of the general case here.

7.3. Refined A-Translation

In this section the connectives →, ∀ denote the computational versions →^c, ∀^c, unless stated otherwise.

We will concentrate on the question of classical versus constructive proofs. It is known that any proof of a specification of the form ∀x ∃̃y B with B quantifier-free and a weak (or “classical”) existential quantifier ∃̃ can be transformed into a proof of ∀x ∃y B, now with the constructive existential quantifier ∃. However, when it comes to extraction of a program from a proof obtained in this way, one easily ends up with a mess. Therefore, some refinements of the standard transformation are necessary. We shall study a refined method of extracting reasonable and sometimes unexpected


programs from classical proofs. It applies to proofs of formulas of the form ∀x ∃̃y B where B need not be quantifier-free, but only has to belong to the strictly larger class of goal formulas defined in 7.3.1. Furthermore we allow unproven lemmata D in the proof of ∀x ∃̃y B, where D is a definite formula (also defined in 7.3.1).

We now describe in more detail what this section is about. It is well known that from a derivation of a classical existential formula ∃̃y A := ∀y(A → ⊥) → ⊥ one generally cannot read off an instance. A simple example has been given by Kreisel: let R be a primitive recursive relation such that ∃z Rxz is undecidable. Clearly – even logically –

⊢ ∀x ∃̃y ∀z(Rxz → Rxy).

But there is no computable f satisfying

∀x∀z(Rxz → R(x, f(x))),

for then ∃z Rxz would be decidable: it would be true if and only if R(x, f(x)) holds.

However, it is well known that in case ∃̃y G with G quantifier-free one can read off an instance. Here is a simple idea of how to prove this: replace ⊥ anywhere in the proof by ∃yG. Then the end formula ∀y(G → ⊥) → ⊥ is turned into ∀y(G → ∃yG) → ∃yG, and since the premise is trivially provable, we have the claim.

Unfortunately, this simple argument is not quite correct. First, G may contain ⊥, and hence is changed under the substitution ⊥ ↦ ∃yG. Second, we may have used axioms or lemmata involving ⊥ (e.g., ⊥ → P), which need not be derivable after the substitution. But in spite of this, the simple idea can be turned into something useful.

Assume that the lemmata ~D and the goal formula G are such that we can derive

(7.30) ~D → D_i[⊥ := ∃yG],
(7.31) G[⊥ := ∃yG] → ∃yG.

Assume also that the substitution ⊥ ↦ ∃yG turns the axioms into instances of the same schema with different formulas, or else into derivable formulas. Then from our given derivation (in minimal logic) of ~D → ∀y(G → ⊥) → ⊥ we obtain

~D[⊥ := ∃yG]→ ∀y(G[⊥ := ∃yG]→ ∃yG)→ ∃yG.

Now (7.30) allows us to drop the substitution in ~D, and by (7.31) the second premise is derivable. Hence we obtain, as desired,

~D → ∃yG.

We shall identify classes of formulas – to be called definite and goal formulas – such that slight generalizations of (7.30) and (7.31) hold. This will be done in 7.3.1. In 7.3.2 we then prove our main theorem about extraction from classical proofs.

We end the section with some examples of our general machinery. From a classical proof of the existence of the Fibonacci numbers we extract in 7.3.4 a short and surprisingly efficient program, where λ-expressions rather than


pairs are passed. In 7.3.6 we consider unary functions f, g, h, s on the natural numbers, and a simple proof that for s not surjective, h ∘ s ∘ h cannot be the identity. It turns out that a rather unexpected program is extracted. In 7.3.5 we treat as a further example a classical proof of the well-foundedness of < on N. Finally in 7.3.7 we take up a suggestion of Bezem and Veldman (1993) and present a short classical proof of (the general form of) Dickson’s Lemma, as an interesting candidate for further study.

7.3.1. Definite and goal formulas. We simultaneously inductively define the classes D of definite formulas, G of goal formulas, R of relevant definite formulas and I of irrelevant goal formulas. Let D, G, R, I range over D, G, R, I, respectively, P over prime formulas distinct from ⊥, and D_0 over quantifier-free formulas in D. Then D, G, R and I are generated by the clauses

(a) R, P, I → D, ∀xD ∈ D.
(b) I, ⊥, R → G, D_0 → G ∈ G.
(c) ⊥, G → R, ∀xR ∈ R.
(d) P, D → I, ∀xI ∈ I.

Let A^F denote A[⊥ := F], and let ¬A, ¬_⊥A abbreviate A → F, A → ⊥, respectively.

Lemma. We have derivations from F → ⊥ and F → P of

(7.32) D^F → D,
(7.33) G → ¬_⊥¬_⊥G^F,
(7.34) ¬_⊥¬R^F → R,
(7.35) I → I^F.

Proof. We prove (7.32)–(7.35) simultaneously, by induction on formulas.

(7.32). Case ⊥. Then ⊥^F = F and the claim follows from our assumption F → ⊥. Case P. Obvious. Case ∀xD. By induction hypothesis (7.32) for D we have D^F → D, which clearly implies ∀x D^F → ∀x D.

Case R. We must show R^F → R. Assume R^F; from the further assumption ¬R^F, i.e., R^F → F, we obtain F and hence ⊥ by F → ⊥. Discharging ¬R^F yields ¬_⊥¬R^F, and R follows by the induction hypothesis (7.34).

Case I → D. We must show (I^F → D^F) → I → D. Assume I^F → D^F and I. The induction hypothesis (7.35) for I gives I^F, hence D^F, and then D by the induction hypothesis (7.32) for D.

(7.33). Case ⊥. Clear. Case P. Clear, since P^F is P. Case I. This is clear again, using the induction hypothesis (7.35).

Case R → G. We have to prove (R → G) → ¬_⊥¬_⊥(R^F → G^F). So assume R → G and ¬_⊥(R^F → G^F); we must derive ⊥. These assumptions yield, first, a derivation D_1[R → G, ¬_⊥(R^F → G^F)] of ¬_⊥R: given R we obtain G, hence ¬_⊥¬_⊥G^F by the induction hypothesis (7.33) for G; moreover ¬_⊥G^F holds, since G^F gives R^F → G^F and then ⊥ by ¬_⊥(R^F → G^F); together this yields ⊥. Second, the assumption ¬_⊥(R^F → G^F) yields a derivation D_2[¬_⊥(R^F → G^F)] of ¬_⊥¬R^F: assume ¬R^F; from R^F we get F and hence G^F (note that G^F is derivable from F, using our assumption F → P), so R^F → G^F, and ⊥ follows by ¬_⊥(R^F → G^F). Now the induction hypothesis (7.34) for R together with D_2 gives R, and D_1 then gives ⊥, as required.

Case D_0 → G. We have to prove (D_0 → G) → ¬_⊥¬_⊥(D_0^F → G^F). Let D_1[D_0 → G, ¬_⊥(D_0^F → G^F)] : ¬_⊥D_0 and D_2[¬_⊥(D_0^F → G^F)] : ¬_⊥¬D_0^F be as above. We use (D_0^F → ⊥) → (¬D_0^F → ⊥) → ⊥, i.e., case distinction on D_0^F. Hence it suffices to derive from D_0 → G and ¬_⊥(D_0^F → G^F) both ¬_⊥D_0^F and ¬_⊥¬D_0^F; recall that our goal is (D_0 → G) → ¬_⊥(D_0^F → G^F) → ⊥. The negative case is provided by D_2[¬_⊥(D_0^F → G^F)], and the positive case by D_1: assume D_0^F; the induction hypothesis (7.32) for D_0 gives D_0, and ¬_⊥D_0 then gives ⊥.

(7.34). Case ⊥. Clearly ¬_⊥¬F → ⊥ is derivable, since ¬F = F → F is derivable.


Case ∀xR. We must show ¬_⊥¬∀x R^F → ∀x R. Assume ¬_⊥¬∀x R^F and fix x. By the induction hypothesis (7.34) for R it suffices to derive ¬_⊥¬R^F. So assume ¬R^F. Then ¬∀x R^F, since ∀x R^F gives R^F and hence F; now the assumption ¬_⊥¬∀x R^F yields ⊥, as required.

Case G → R. We must show ¬_⊥¬(G^F → R^F) → G → R. Assume ¬_⊥¬(G^F → R^F) and G. By the induction hypothesis (7.34) for R it suffices to derive ¬_⊥¬R^F. So assume ¬R^F. The induction hypothesis (7.33) for G gives ¬_⊥¬_⊥G^F, so it suffices to derive ¬_⊥G^F. Assume G^F. Then ¬(G^F → R^F), since G^F → R^F applied to G^F gives R^F and hence F by ¬R^F; now the assumption ¬_⊥¬(G^F → R^F) yields ⊥.

(7.35). Case P. Clear. Case ∀xI. This is clear again, using the induction hypothesis (7.35) for I.

Case D → I. We must show (D → I) → D^F → I^F. Assume D → I and D^F. The induction hypothesis (7.32) for D gives D, hence I, and then I^F by the induction hypothesis (7.35) for I. □

Remark. Is D the largest class of formulas such that D^F → D is provable intuitionistically? This is not the case, as the following example shows. Let

S := ∀x(((Qx → F) → F) → Qx),
D := (∀x Qx → ⊥) → ⊥.

One can easily derive (S → D)^F → S → D, since S^F is S and a derivation of D^F → S → D can be found easily.

However, S → D ∉ D, since D ∉ D. This is because D is (i) neither in R nor (ii) of the form I → D_1. For (i), observe that if D were in R, then its premise ∀x Qx → ⊥ would be in G, hence ∀x Qx in R, which is not the case. For (ii), observe that ∀x Qx → ⊥ is not in I because ⊥ ∉ I.


It is an open problem to find a useful characterization of the class of formulas D such that D^F → D is provable intuitionistically.

Lemma. For goal formulas ~G = G_1, …, G_n we have a derivation from F → ⊥ of

(7.36) (~G^F → ⊥) → ~G → ⊥.

Proof. Assume F → ⊥. By (7.33),

G_i → (G_i^F → ⊥) → ⊥

for all i = 1, …, n. Now the assertion follows by minimal logic: assume ~G^F → ⊥ and ~G; we must show ⊥. By G_1 → (G_1^F → ⊥) → ⊥ it suffices to prove G_1^F → ⊥. Assume G_1^F. By G_2 → (G_2^F → ⊥) → ⊥ it suffices to prove G_2^F → ⊥. Assume G_2^F. Repeating this pattern, we finally have assumptions G_1^F, …, G_n^F available, and obtain ⊥ from ~G^F → ⊥. □

7.3.2. Extraction from weak existence proofs.

Theorem (Strong from weak existence proofs). Assume that for arbitrary formulas ~A, definite formulas ~D and goal formulas ~G we have a derivation M_∃̃ of

(7.37) ~A → ~D → ∀y(~G → ⊥) → ⊥.

Then from F → ⊥ and F → P for all prime formulas P in ~D, ~G we can derive

~D^F → ∀y(~G^F → ⊥) → ⊥.

In particular, substitution of the formula

∃y ~G^F := ∃y(G_1^F ∧ ⋯ ∧ G_n^F)

for ⊥ yields a derivation M_∃ from the F → P of

(7.38) ~A[⊥ := ∃y ~G^F] → ~D^F → ∃y ~G^F.

Proof. The first assertion follows from (7.32) (to infer ~D from ~D^F) and (7.36) (to infer ~G → ⊥ from ~G^F → ⊥). The second assertion is a simple consequence since ∀y(~G^F → ∃y ~G^F) and F → ∃y ~G^F are both derivable. □

We shall apply the method of realizability to extract computational content from the resulting strong existence proof M_∃. Recall that this proof essentially follows the given weak existence proof M_∃̃. The only difference is that proofs of (7.32) (to infer ~D from ~D^F) and (7.36) (to infer ~G → ⊥ from ~G^F → ⊥) have been inserted. Therefore the extracted term can be structured in a similar way, with one part determined solely by M_∃̃ and another part depending only on the definite formulas ~D and goal formulas ~G. – For simplicity let ~G consist of a single goal formula G.

To make the method work we need to assume that all prime formulas P appearing in ~D^F, G^F are c.i. and invariant (for instance, equalities).

Lemma. Let D be a definite and G a goal formula. Assume that all prime formulas P in D^F, G^F are c.i. and invariant.


(a) We have a term t_D such that

D^F → t_D r D

is derivable from ∀y(F → y r ⊥) and F → P.
(b) We have a term s_G such that

(G^F → v r ⊥) → w r G → s_G v w r ⊥

is derivable from ∀y(F → y r ⊥) and F → P.

Proof. The assumption implies that all formulas ~D^F, G^F are c.i. and invariant as well, by the definition of realizability.

(a). By (7.32) we have a derivation N_D of D^F → D from assumptions F → ⊥ and F → P. By the Soundness Theorem we can take t_D := et(N_D).

(b). By (7.33) we have a derivation H_G of (G^F → ⊥) → G → ⊥ from assumptions F → ⊥ and F → P. Observe that the following are equivalent:

et(H_G) r ((G^F → ⊥) → G → ⊥),
∀_{v,w}(v r (G^F → ⊥) → w r G → et(H_G) v w r ⊥),
∀_{v,w}((G^F → v r ⊥) → w r G → et(H_G) v w r ⊥).

Hence we can take s_G := et(H_G). □

Theorem (Extraction from weak existence proofs). Assume that for definite formulas ~D and a goal formula G(y) we have a derivation M_∃̃ of

~D → ∀y(G(y) → ⊥) → ⊥.

Assume that all prime formulas P in ~D^F, G^F(y) are c.i. and invariant. Let t_1, …, t_n and s be terms for D_1, …, D_n and G according to parts (a) and (b) of the lemma above. Then from the F → P we can derive

~D^F → G^F(et(M′_∃̃) t_1 … t_n s),

where M′_∃̃ is the result of substituting ⊥ by ∃y G^F(y) in M_∃̃.

Proof. By the Soundness Theorem we have

et(M_∃̃) r (~D → ∀y(G(y) → ⊥) → ⊥),

i.e.,

∀_{~u,x}(~u r ~D → x r ∀y(G(y) → ⊥) → et(M_∃̃) ~u x r ⊥),
∀_{~u,x}(~u r ~D → ∀_{y,w}(w r G(y) → x y w r ⊥) → et(M_∃̃) ~u x r ⊥).

Instantiating ~u, x by ~t, s, respectively, we obtain

~t r ~D → ∀_{y,w}(w r G(y) → s y w r ⊥) → et(M_∃̃) ~t s r ⊥.

Hence by part (a) of the lemma above we have a derivation of

~D^F → ∀_{y,w}(w r G(y) → s y w r ⊥) → et(M_∃̃) ~t s r ⊥

from ∀y(F → y r ⊥) and F → P. Substituting ⊥ by ∃y G^F(y) gives

~D^F → ∀_{y,w}((w r G(y))[⊥ := ∃y G^F(y)] → G^F(s y w)) → G^F(et(M′_∃̃) ~t s)

from F → P. Substituting ⊥ by ∃y G^F(y) in the formula derived in part (b) of the lemma above gives

(G^F(y) → G^F(v)) → (w r G(y))[⊥ := ∃y G^F(y)] → G^F(s v w)


from F → P. Instantiating this with v := y we obtain a derivation of

~D^F → G^F(et(M′_∃̃) ~t s)

from F → P, as required. □

Remark. The theorem can be generalized by allowing arbitrary formulas ~A as additional premises. Then the final conclusion needs additional premises ~A[⊥ := ∃y G^F(y)], and we must assume that we have terms ~r such that ~A[⊥ := ∃y G^F(y)] → (~r r ~A)[⊥ := ∃y G^F(y)] is derivable. Moreover, the et(M′_∃̃) in the final conclusion needs ~r as additional arguments.

Below we will give examples for this “refined” A-translation. However, let us first check the mechanism of working with definite and goal formulas on Kreisel’s “non-example” mentioned in the introduction. There we gave a trivial proof of a ∀∃̃-formula that cannot be realized by a computable function, and we want to make sure that our general result does not provide such a function either. The example amounts to a proof of

∀z(¬_⊥¬_⊥Rxz → Rxz) → ∀y((Rxy → ∀z Rxz) → ⊥) → ⊥.

Here Rxy → ∀z Rxz is a goal formula, but the premise ∀z(¬_⊥¬_⊥Rxz → Rxz) is not definite. Replacing R by ¬_⊥S (to get rid of the stability assumption) does not help, for then ¬_⊥Sxy → ∀z ¬_⊥Sxz is not a goal formula.

Note (Critical predicate symbols). To apply these results we have to know that our assumptions are definite formulas and our goal is given by goal formulas. For quantifier-free formulas this can always be achieved by inserting double negations in front of every atom (cf. the definitions of definite and goal formulas). This corresponds to the original (unrefined) so-called A-translation of Friedman (1978) and Dragalin (1979); see also Leivant (1985). However, in order to obtain reasonable programs which do not unnecessarily use higher types or case analysis we want to insert double negations at as few places as possible.

We describe a more economical general way to obtain definite and goal formulas. It consists in singling out some predicate symbols as being “critical”, and then double negating only the atoms formed with critical predicate symbols; call these critical atoms. Assume we have a proof of

∀_{~x_1}C_1 → ⋯ → ∀_{~x_n}C_n → ∀_{~y}(~B → ⊥) → ⊥

with ~C, ~B quantifier-free. Let

L := {C_1, …, C_n, ~B → ⊥}.

The set of L-critical predicate symbols is defined to be the smallest set satisfying

(i) ⊥ is critical.
(ii) If (~C_1 → R_1(~s_1)) → ⋯ → (~C_m → R_m(~s_m)) → R(~s) is a positive subformula of L, and if some R_i is L-critical, then R is L-critical.

Now if we double negate every L-critical atom different from ⊥ we clearly obtain definite assumptions ~C′ and goal formulas ~B′. Furthermore the proof


term of the given derivation can easily be transformed into a correct derivation of the translated formula from the translated assumptions (by inserting the obvious proofs of the translated axioms).

However, in particular cases we might be able to obtain definite and goal formulas with still fewer double negations: it may not be necessary to double negate every critical atom.

We now present some simple examples of how to apply this method. In all of them we will have a single goal formula G. However, before we do this we describe a useful method to suppress somewhat obvious proofs of totality in derivation terms.

7.3.3. Suppressing totality proofs. In a derivation involving induction we need to provide totality proofs in order to be able to use the induction axiom. For instance, when we want to apply an induction axiom ∀_{n∈T}A(n) to a term r, we must know T(r) to conclude A(r). However, in many cases such totality proofs are easy: suppose r is k + l, and we already know T(k) and T(l). Then – referring to a proof of T(+) which is done once and for all – we clearly know T(k + l). In order to suppress such trivial proofs, we mark the addition function + as total, and call a term syntactically total if it is built from total variables by total function constants. Then we allow an inference from ∀_{n∈T}A(n) to A(r), or (written as derivation term) M^{∀_{n∈T}A(n)} r, provided r is syntactically total. It is clear that – and how – this “derivation” can be expanded into a proper one. Since in the rest of the present section all variables will be restricted to total ones we shall write ∀_n A for ∀_{n∈T}A. We also write ι for N.

7.3.4. Example: Fibonacci numbers. Let α_n be the n-th Fibonacci number, i.e.,

α_0 := 0,   α_1 := 1,   α_n := α_{n−2} + α_{n−1} for n ≥ 2.

We give a weak existence proof for the Fibonacci numbers:

∀n ∃̃k Gnk,  i.e.,  ∀n(∀k(Gnk → ⊥) → ⊥),

from clauses expressing that G is the graph of the Fibonacci function:

v_0 : G00,   v_1 : G11,   v_2 : ∀_{n,k,l}(Gnk → G(n+1, l) → G(n+2, k+l)).

We view G as a predicate variable without computational content. Clearly the clause formulas are definite and Gnk is a goal formula. To construct a derivation, assume (n ∈ T and)

u : ∀k(Gnk → ⊥).

Our goal is ⊥. To this end we first prove a strengthened claim in order to get the induction through:

∀n B(n)  with  B(n) := ∀_{k,l}(Gnk → G(n+1, l) → ⊥) → ⊥.

This is proved by induction on n. The base case follows from the first two clauses. In the step case we can assume that we have k, l satisfying Gnk and G(n+1, l). We need k′, l′ such that G(n+1, k′) and G(n+2, l′). Using


the third clause simply take k′ := l and l′ := k + l. – To obtain our goal ⊥ from ∀n B, it suffices to prove its premise ∀_{k,l}(Gnk → G(n+1, l) → ⊥). So let k, l be given and assume u_1 : Gnk and u_2 : G(n+1, l). Then u applied to k and u_1 gives our goal ⊥.

The derivation term is

M_∃̃ = λu^{∀k(Gnk→⊥)} (Ind_{n,B} n M_base M_step (λ_{k,l} λu_1^{Gnk} λu_2^{G(n+1,l)} (u k u_1)))

where

Ind_{n,B(n)} : ∀n(B(0) → ∀n(B(n) → B(Sn)) → B(n)),
M_base = λw_0^{∀_{k,l}(G0k→G1l→⊥)} (w_0 0 1 v_0 v_1),
M_step = λ_n λw^{B} λw_1^{∀_{k,l}(G(n+1,k)→G(n+2,l)→⊥)} (w(λ_{k,l} λu_3^{Gnk} λu_4^{G(n+1,l)} (w_1 l (k+l) u_4 (v_2 n k l u_3 u_4)))).

Let M′ denote the result of substituting ⊥ by ∃k Gnk in M. Since neither the clauses nor the goal formula Gnk contain ⊥, the extracted term according to the theorem above is et(M′_∃̃)(λ_v v). The term et(M′_∃̃) can be computed from M_∃̃ as follows. For the object variable assigned to an assumption variable u we shall use the same name.

et(M′_∃̃) = λu^{ι→ι} (R^{(ι→ι→ι)→ι}_ι n et(M′_base) et(M′_step)) (λ_{k,l}(u k))

where

et(M′_base) = λw_0^{ι→ι→ι} (w_0 0 1),
et(M′_step) = λ_n λw^{(ι→ι→ι)→ι} λw_1^{ι→ι→ι} (w(λ_{k,l}(w_1 l (k+l)))).

The term extracted by Minlog from a formalization of this proof is almost literally the same:

[n0]
 (Rec nat=>(nat=>nat=>nat)=>nat)n0
 ([f1]f1 0 1)
 ([n1,p2,f3]p2([n4,n5]f3 n5(n4+n5)))
 ([n1,n2]n1)

with p a name for variables of type (nat=>nat=>nat)=>nat and f of type nat=>nat=>nat. The underlying algorithm defines an auxiliary functional H by

H(0, f) := f(0, 1),   H(n+1, f) := H(n, λ_{k,l} f(l, k+l))

and gives the result by applying H to the original number and the first projection λ_{k,l} k. This is a linear algorithm in tail recursive form. It is somewhat unexpected since it passes functions (rather than pairs, as one would ordinarily do), and hence uses functional programming in a proper way. This clearly is related to the use of classical logic, which by its use of double negations has a functional flavour.
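The functional H can be transcribed directly into Python (our transcription, with the tail recursion rendered as a loop); note how each step precomposes the continuation f with the Fibonacci step (k, l) ↦ (l, k + l):

```python
def fib(n: int) -> int:
    """Tail-recursive Fibonacci via the extracted functional H:
    H(0, f) = f(0, 1);  H(n+1, f) = H(n, lambda k, l: f(l, k + l))."""
    def H(m: int, f):
        while m > 0:  # tail recursion as a loop
            # capture the current f by value when building the new continuation
            f = (lambda g: (lambda k, l: g(l, k + l)))(f)
            m -= 1
        return f(0, 1)
    return H(n, lambda k, l: k)   # start with the first projection
```

Each loop iteration builds one closure, so the chain of continuations plays the role of the accumulator pair that an ordinary tail-recursive Fibonacci would carry explicitly.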

7.3.5. Example: well-foundedness of the natural numbers. An interesting phenomenon can occur when we extract a program from a classical proof which uses the minimum principle. Consider as a simple example the well-foundedness of < on the natural numbers, i.e.,

∀f^{ι→ι} ∃̃k (f_k ≤ f_{k+1}).


If one formalizes the classical proof “choose k such that f_k is minimal” and extracts a program one might expect that it computes a k such that f_k is minimal. But this is impossible! In fact the program computes the least k such that f_k ≤ f_{k+1} instead. This discrepancy between the classical proof and the extracted program can of course only show up if the solution is not uniquely determined.

We begin with a rather detailed exposition of the classical proof, since we need a complete formalization. Our goal is ∃̃k(f_k ≤ f_{k+1}), and the classical proof consists in using the minimum principle to choose a minimal element in the range of f. This suffices, for if we have such a minimal element, say n_0, then it must be of the form f_{k_0}, and by the choice of n_0 we have f_{k_0} ≤ f_k for every k, so in particular f_{k_0} ≤ f_{k_0+1}.

Next we need to prove the minimum principle

∃̃k Rk → ∃̃k(Rk ∧ ∀_{l<k}(Rl → ⊥))

from ordinary zero-successor-induction. The minimum principle is logically equivalent to

∀k(∀_{l<k}(Rl → ⊥) → Rk → ⊥) → ∀k(Rk → ⊥).

The premise, Prog, expresses the “progressiveness” of Rk → ⊥ w.r.t. <. We provide a proof by zero-successor-induction on n w.r.t. the formula

A(n) := ∀_{k<n}(Rk → ⊥).

Base. A(0) follows easily from the lemma

v_1 : ∀_{m<0}⊥.

Step. Let n be given and assume w_2 : A(n). To show A(n+1) let k be given and assume w_3 : k < n+1. We will derive Rk → ⊥ by using w_1 : Prog at k. Hence we have to prove

∀_{l<k}(Rl → ⊥).

So, let l be given and assume further w_4 : l < k. From w_4 and w_3 : k < n+1 we infer l < n (using an arithmetical lemma). Hence, by the induction hypothesis w_2 : A(n) at l we get Rl → ⊥.

Now a complete formalization is easy. We express m ≤ k by k < m → ⊥ and formalize a variant of the proof just given with ∀m(f_m ≠ k) (i.e., ∀m(f_m = k → ⊥)) instead of Rk → ⊥; this does not change much. The derivation term is

M_∃̃ := λv_1^{∀_{m<0}⊥} λu^{∀k((f_{k+1}<f_k→⊥)→⊥)} (M_cvind^{Prog→∀k∀m(f_m≠k)} M_prog (f_0) 0 L^{f_0=f_0})

where

M_cvind = λw_1^{Prog} λ_k (Ind_{n,B(n)} (k+1) M_base M_step k L^{k<k+1}),
M_base = λ_k λw_0^{k<0} λ_m λu_0^{f_m=k} (v_1 k w_0),
M_step = λ_n λw_2^{B(n)} λ_k λw_3^{k<n+1} (w_1 k (λ_l λw_4^{l<k} (w_2 l (L^{l<n}[w_4, w_3])))),
M_prog = λ_k λu_1^{∀_{l<k}∀m(f_m≠l)} λ_m λu_2^{f_m=k} (u m (λw_5^{f_{m+1}<f_m} (u_1 (f_{m+1}) L^{f_{m+1}<k}[w_5, u_2] (m+1) L^{f_{m+1}=f_{m+1}}))).

Here we have used the abbreviations

Prog = ∀k(∀_{l<k}∀m(f_m ≠ l) → ∀m(f_m ≠ k)),
B(n) = ∀_{k<n}∀m(f_m ≠ k),
Ind_{n,B(n)} = ∀n(B(0) → ∀n(B(n) → B(n+1)) → B(n)).

Let M′ denote the result of substituting ⊥ by ∃k(f_{k+1} < f_k → F) in M. The term et(M′_∃̃) can be computed from M_∃̃ as follows. For the object variable assigned to an assumption variable u we shall use the same name.

et(M′_∃̃) = λv_1^{ι→ι} λu^{ι→ι→ι} (et(M′_cvind) et(M′_prog) (f_0) 0)

where

et(M′_cvind) = λw_1^{ι→(ι→ι→ι)→ι→ι} λ_k (R^{ι→ι→ι}_ι (k+1) et(M′_base) et(M′_step) k),
et(M′_base) = λ_{k,m}(v_1 k),
et(M′_step) = λ_n λw_2^{ι→ι→ι} λ_k (w_1 k (λ_l (w_2 l))),
et(M′_prog) = λ_k λu_1^{ι→ι→ι} λ_m (u m (u_1 (f_{m+1}) (m+1))).

Note that k is not used in et(M′_prog); this is the reason why the optimization below is possible.

Recall that by the extraction theorem, the term extracted from the present proof has the form et(M′_∃̃) t_1 … t_n s, where t_1, …, t_n and s are terms for D_1, …, D_n and G according to parts (a) and (b) of the lemma above. In our case we have just one definite formula D = ∀_{k<0}⊥, and since we can derive

∀_{k<0}F → (λ_n 0) r ∀_{k<0}⊥

from ∀k(F → k r ⊥), we can take t := λ_n 0. The goal formula in our case is G := (f_{k+1} < f_k → ⊥). For this G we can derive directly

((f_{k+1} < f_k → F) → v r ⊥) → (f_{k+1} < f_k → w r ⊥) → s v w r ⊥

with

s := λ_{v,w}[if f_{k+1} < f_k then w else v].

Then the term extracted according to the theorem is

et(M′_∃̃) t s =_β et(M′_cvind)^{t,s} et(M′_prog)^{t,s} (f_0) 0

where ^{t,s} indicates substitution of t, s for v_1, u. Therefore

et(M′_cvind)^{t,s} =_{βη} λ_{w_1,k′}(R(k′+1)(λ_{k,m}0)(λ_{n,w_2,k}(w_1 k w_2)) k′),
et(M′_prog)^{t,s} =_β λ_{k,u_1,m}[if f_{m+1} < f_m then u_1 (f_{m+1}) (m+1) else m].

Hence we obtain as extracted term

et(M′_∃̃) t s =_β R(f_0 + 1) r_base r_step (f_0) 0

with

r_base := λ_{k,m}0,
r_step := λ_{n,w_2,k,m}[if f_{m+1} < f_m then w_2 (f_{m+1}) (m+1) else m].

Since the recursion argument is f_0 + 1, we can convert et(M′_∃̃) t s into

[if f_1 < f_0 then R(f_0) r_base r_step (f_1) 1 else 0].


To make this algorithm more readable we may define

h(0, k, m) = 0,
h(n+1, k, m) = [if f_{m+1} < f_m then h(n, f_{m+1}, m+1) else m]

and then write the result as h(f_0 + 1, f_0, 0), or (unfolded) as

[if f_1 < f_0 then h(f_0, f_1, 1) else 0].

The machine extracted term is (original output of Minlog, with renaming of variables done manually)

[f][if (f 1<f 0)
    ((Rec 2 nat=>nat nat=>nat nat=>nat=>nat=>nat)([n]n)f
     ([k,m]0)
     ([n,g,k,m][if (f(Succ m)<f m)
        (g(f(Succ m))(Succ m))
        m])
     (f 0)(f 1)1)
    0]

We can rewrite this as a Scheme program as follows.

(define (wf f) (wf-aux f (+ (f 0) 1) (f 0) 0))

(define (wf-aux f n k m)
  (if (= 0 n)
      0
      (if (< (f (+ m 1)) (f m))
          (wf-aux f (- n 1) (f (+ m 1)) (+ m 1))
          m)))

Note that k is not used here (this will always happen if induction is used in the form of the minimum principle only), and hence we may optimize our program to

(define (wf1 f) (wf1-aux f (+ (f 0) 1) 0))

(define (wf1-aux f n m)
  (if (= 0 n)
      0
      (if (< (f (+ m 1)) (f m))
          (wf1-aux f (- n 1) (+ m 1))
          m)))

Now it is immediate to see that the program computes the least k such that f_{k+1} < f_k → ⊥, i.e., the least k with f_k ≤ f_{k+1}, where f_0 + 1 only serves as an upper bound for the search.
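For comparison, the optimized program wf1 can be transcribed into Python (our transcription; the bound n = f(0) + 1 and the unused argument k are exactly as discussed above):

```python
from typing import Callable

def least_nondescent(f: Callable[[int], int]) -> int:
    """Search for the least m with f(m) <= f(m+1); f(0) + 1 serves only
    as an upper bound on the search, since f can strictly descend at
    most f(0) times."""
    n, m = f(0) + 1, 0
    while n != 0:
        if f(m + 1) < f(m):
            n, m = n - 1, m + 1
        else:
            return m
    return 0  # never reached for a genuine f : N -> N
```

For instance, for f(k) = max(0, 5 − 2k), with values 5, 3, 1, 0, 0, …, the least such m is 3.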

Remark. There is an alternative proof of the well-foundedness of the natural numbers, which uses the minimum principle with a measure function instead. Here we can take the function f itself as a measure function. We refrain from analysing this proof in the same detail as before, and only display the term extracted by Minlog, which in fact is slightly simpler.


[f][if (f 1<f 0)
     ((Rec 1 nat=>nat nat=>nat=>nat)f(f 0)([m]0)
      ([n,f1,m][if (f(Succ m)<f m) (f1(Succ m)) m])
      1)
   0]

7.3.6. Example: The hsh-theorem. Let f, g, h, s denote unary functions on the natural numbers. We show ∃n(h(s(h(n))) ≠ n) and extract an (unexpected) program from it.

Lemma (Surjectivity). g ∘ f surjective implies g surjective.

Lemma (Injectivity). g ∘ f injective implies f injective.

Lemma (Surjectivity-Injectivity). g ∘ f surjective and g injective imply f surjective.

Proof. Assume y is not in the range of f. Consider g(y). Since g ∘ f is surjective, there is an x with g(y) = g(f(x)). The injectivity of g implies y = f(x), a contradiction.

Theorem (hsh-Theorem). ∀n(s(n) ≠ 0) → ¬∀n(h(s(h(n))) = n).

Proof. Assume h ∘ s ∘ h is the identity. Then by the Injectivity Lemma h is injective. Hence by the Surjectivity-Injectivity Lemma s ∘ h is surjective, and therefore by the Surjectivity Lemma s is surjective, a contradiction.

From the Godel-Gentzen translation and the fact that we can systematically replace triple negations by single negations we obtain a derivation of

∀n(s(n) ≠ 0) → ∃n(h(s(h(n))) ≠ n).

Now since ∀n(s(n) ≠ 0) is a definite formula, this is in the form where our general theory applies. The extracted program is, somewhat unexpectedly,

[s,h][if (h(s(h(h 0)))=h 0)
       [if (h(s(h(s(h(h 0)))))=s(h(h 0)))
          0
          (s(h(h 0)))]
       (h 0)]

Let us see why this program indeed provides a counterexample to the assumption that h ∘ s ∘ h is the identity.

If h(s(h(h0))) ≠ h0, take h0. So assume h(s(h(h0))) = h0. If

h(s(h(s(h(h0))))) = s(h(h0)),

then also h(s(h0)) = s(h(h0)), so 0 is a counterexample, because the right hand side cannot be 0 (this was our assumption on s). So assume

h(s(h(s(h(h0))))) ≠ s(h(h0)).

Then s(h(h0)) is a counterexample.
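The case analysis above can be checked mechanically. Here is a Python rendering of the extracted term; the function name hsh_counterexample and the local names n0, n1 are ours, introduced only for illustration.

```python
def hsh_counterexample(s, h):
    """Given s with s(n) != 0 for all n, return some n with h(s(h(n))) != n."""
    n0 = h(0)
    if h(s(h(n0))) != n0:      # first test of the extracted term: take h0
        return n0
    n1 = s(h(n0))              # = s(h(h0))
    if h(s(h(n1))) == n1:      # then h(s(h(0))) = n1 != 0, so 0 works
        return 0
    return n1                  # otherwise s(h(h0)) itself is a counterexample
```

For instance, with s the successor and h the identity the program returns 0, and indeed h(s(h(0))) = 1 ≠ 0.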


7.3.7. Towards more interesting examples. Bezem and Veldman (1993) suggested Dickson's Lemma (1913) as an interesting case study for program extraction from classical proofs. It states that for k given infinite sequences f1, . . . , fk of natural numbers and a given number l there are indices i1, . . . , il such that every sequence fκ increases on i1, . . . , il, i.e., fκ(i1) ≤ · · · ≤ fκ(il) for κ = 1, . . . , k. Here is a short classical proof, using the minimum principle for undecidable sets.

Call a unary predicate (or set) Q ⊆ N unbounded if ∀x∃y>xQ(y).

Lemma. Let Q be unbounded and f a function from a superset of Q to N. Then the set Qf of left f-minima w.r.t. Q is unbounded; here

Qf (x) := Q(x) ∧ ∀y>x(Q(y)→ f(x) ≤ f(y)).

Proof. Let x be given. We must find y > x with Qf(y). The minimum principle for {y > x | Q(y)} with measure f yields

∃y>xQ(y)→ ∃y>x(Q(y) ∧ ∀z>x(Q(z)→ f(y) ≤ f(z))).

Since Q is assumed to be unbounded, the premise is true. We show that the y provided by the conclusion satisfies Qf(y), that is,

Q(y) ∧ ∀z>y(Q(z)→ f(y) ≤ f(z)).

Let z > y with Q(z). From y > x we obtain z > x. Hence f(y) ≤ f(z).

Let Q be unbounded and f0, f1, . . . be functions from a superset of Q to N. Then for every k there is an unbounded subset Qk of Q such that f0, . . . , fk−1 increase on Qk w.r.t. Q, that is, ∀x,y;x<y(Qk(x) → Q(y) → fi(x) ≤ fi(y)) for all i < k.

Lemma.

∀x∃y>xQ(y)→

∀k∃Qk⊆Q(∀x∃y>xQk(y) ∧ ∀i<k∀x,y;x<y(Qk(x)→ Q(y)→ fi(x) ≤ fi(y))).

Proof. By induction on k. Base. Let Q0 := Q. Step. Consider (Qk)fk. By induction hypothesis, f0, . . . , fk−1 increase on Qk w.r.t. Q, and therefore also on its subset (Qk)fk. Moreover, by construction also fk increases on (Qk)fk w.r.t. Q.

Corollary. For every k, l we have

∀f1,...,fk ∃i0,...,il ∧∧λ<l (iλ < iλ+1 ∧ ∧∧κ=1,...,k fκ(iλ) ≤ fκ(iλ+1)).

For k = 2 (i.e., two sequences) this example has been treated by Berger, Schwichtenberg, and Seisenberger (2001). However, it is interesting to look at the general case, since then the brute force search takes time O(n^k), and we can hope that the program extracted from the classical proof is better.
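To make the search space concrete, here is the brute-force algorithm in Python. This is a naive sketch of our own: the bound n on the indices is an assumption introduced for the search, whereas Dickson's Lemma guarantees existence without any bound.

```python
from itertools import combinations

def dickson_brute(fs, l, n):
    """Try all index tuples i_1 < ... < i_l below n and return one on which
    every sequence f in fs is non-decreasing, or None if none is found."""
    for idx in combinations(range(n), l):
        if all(f(i) <= f(j) for f in fs for i, j in zip(idx, idx[1:])):
            return idx
    return None
```

Already for modest n the number of candidate tuples C(n, l) is large, which is why a program extracted from the classical proof can be of interest.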

7.4. Godel’s Dialectica Interpretation

In his original functional interpretation of (1958), Godel assigned to every formula A a new one ∃~x∀~y AD(~x, ~y) with AD(~x, ~y) quantifier-free. Here ~x, ~y are lists of variables of finite types; the use of higher types is necessary even when the original formula A is first-order. He did this in such a way


that whenever a proof of A, say in Peano arithmetic, was given, one could produce closed terms ~r such that the quantifier-free formula AD(~r, ~y) is provable in his quantifier-free system T.

In (1958) Godel referred to a Hilbert-style proof calculus. However, since the realizers will be formed in a λ-calculus formulation of system T, Godel's interpretation becomes more perspicuous when it is done for a natural deduction calculus. The present exposition is based on such a setup. Then the need for contractions comes up in the (only) logical rule with two premises: modus ponens (or implication elimination →−). This makes it possible to give a relatively simple proof of the Soundness Theorem.

7.4.1. Positive and negative types. To determine the types of x and y, we assign to every formula A objects τ+(A), τ−(A) (a type or the "nulltype" symbol ε). τ+(A) is intended to be the type of a (Dialectica-)realizer to be extracted from a proof of A, and τ−(A) the type of a challenge for the claim that this term realizes A. We define

τ+(P~s) := ε,                      τ−(P~s) := ε,
τ+(∀xρ A) := ρ → τ+(A),            τ−(∀xρ A) := ρ × τ−(A),
τ+(∃xρ A) := ρ × τ+(A),            τ−(∃xρ A) := τ−(A),
τ+(A ∧ B) := τ+(A) × τ+(B),        τ−(A ∧ B) := τ−(A) × τ−(B),

and for implication

τ+(A → B) := (τ+(A) → τ+(B)) × (τ+(A) → τ−(B) → τ−(A)),
τ−(A → B) := τ+(A) × τ−(B).

Recall that (ρ → ε) := ε, (ε → σ) := σ, (ε → ε) := ε, and (ρ × ε) := ρ, (ε × σ) := σ, (ε × ε) := ε.

In case τ+(A) (τ−(A)) is ≠ ε we say that A has positive (negative) computational content. For formulas without positive or without negative content one can give an easy characterization, involving the well-known notion of positive or negative occurrences of quantifiers in a formula:

τ+(A) = ε ↔ A has no positive ∃ and no negative ∀,
τ−(A) = ε ↔ A has no positive ∀ and no negative ∃,
τ+(A) = τ−(A) = ε ↔ A is quantifier-free.

Examples. (a) For quantifier-free A0, B0,

τ+(∀xρ A0) = ε,              τ−(∀xρ A0) = ρ,
τ+(∃xρ A0) = ρ,              τ−(∃xρ A0) = ε,
τ+(∀xρ∃yσ A0) = (ρ → σ),     τ−(∀xρ∃yσ A0) = ρ.

(b) For arbitrary A,B, writing τ±A for τ±(A)

τ+(∀zρ(A→ B)) = ρ→ (τ+A→ τ+B)× (τ+A→ τ−B → τ−A),

τ+(∃zρA→ B) = (ρ× τ+A→ τ+B)× (ρ× τ+A→ τ−B → τ−A),

τ−(∀zρ(A→ B)) = ρ× (τ+A× τ−B),


τ−(∃zρA→ B) = (ρ× τ+A)× τ−B.

Later we will see many more examples.
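The type assignment τ± is directly executable. The following Python sketch uses an ad hoc encoding of formulas as tuples and None for the nulltype ε (the encoding and function names are ours, not part of the text); it reproduces example (a).

```python
def arrow(a, b):
    # (rho -> eps) = eps, (eps -> sigma) = sigma
    if b is None: return None
    if a is None: return b
    return ('->', a, b)

def times(a, b):
    # (rho x eps) = rho, (eps x sigma) = sigma
    if a is None: return b
    if b is None: return a
    return ('*', a, b)

def tau_pos(A):
    tag = A[0]
    if tag == 'prime': return None
    if tag == 'all': return arrow(A[1], tau_pos(A[2]))
    if tag == 'ex':  return times(A[1], tau_pos(A[2]))
    if tag == 'and': return times(tau_pos(A[1]), tau_pos(A[2]))
    if tag == 'imp': return times(arrow(tau_pos(A[1]), tau_pos(A[2])),
                                  arrow(tau_pos(A[1]),
                                        arrow(tau_neg(A[2]), tau_neg(A[1]))))

def tau_neg(A):
    tag = A[0]
    if tag == 'prime': return None
    if tag == 'all': return times(A[1], tau_neg(A[2]))
    if tag == 'ex':  return tau_neg(A[2])
    if tag == 'and': return times(tau_neg(A[1]), tau_neg(A[2]))
    if tag == 'imp': return times(tau_pos(A[1]), tau_neg(A[2]))
```

For A = ∀xρ∃yσA0 with A0 prime this yields tau_pos(A) = ('->', 'rho', 'sigma') and tau_neg(A) = 'rho', as in example (a).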

It is interesting to note that for an existential formula with a quantifier-free kernel the positive and negative types are the same, irrespective of the choice of the existential quantifier, constructive or classical (where ∃̃xA abbreviates ∀x(A → ⊥) → ⊥).

Lemma. τ±(∃̃xA0) = τ±(∃xA0) for A0 quantifier-free. In more detail,

(a) τ+(∃̃xρ A) = τ+(∃xρ A) = ρ × τ+(A) provided τ−(A) = ε,
(b) τ−(∃̃xρ A) = τ−(∃xρ A) = τ−(A) provided τ+(A) = ε.

Proof. For an arbitrary formula A we have

τ+(∀xρ(A→ ⊥)→ ⊥) = τ+(∀xρ(A→ ⊥))→ τ−(∀xρ(A→ ⊥))

= (ρ→ τ+(A→ ⊥))→ (ρ× τ−(A→ ⊥))

= (ρ→ τ+(A)→ τ−(A))→ (ρ× τ+(A)),

τ+(∃xρA) = ρ× τ+(A).

Both types are equal if τ−(A) = ε. Similarly

τ−(∀xρ(A→ ⊥)→ ⊥) = τ+(∀xρ(A→⊥)) = τ+(A→⊥) = τ+(A)→ τ−(A),

τ−(∃xρA) = τ−(A).

Both types are = τ−(A) if τ+(A) = ε.

7.4.2. Godel translation. For every formula A and terms r of type τ+(A) and s of type τ−(A) we define a new quantifier-free formula |A|^r_s by induction on A.

|P~s|^r_s := P~s,
|∀xA(x)|^r_s := |A(s0)|^{r(s0)}_{s1},
|∃xA(x)|^r_s := |A(r0)|^{r1}_s,
|A ∧ B|^r_s := |A|^{r0}_{s0} ∧ |B|^{r1}_{s1},
|A → B|^r_s := |A|^{s0}_{r1(s0)(s1)} → |B|^{r0(s0)}_{s1}.

The formula ∃x∀y |A|^x_y is called the Godel translation of A and is often denoted by A^D. Its quantifier-free kernel |A|^x_y is called the Godel kernel of A; it is denoted by A_D.

For readability we sometimes write terms of a pair type in pair form:

|∀zA|^f_{z,y} := |A|^{fz}_y,
|∃zA|^{z,x}_y := |A|^x_y,
|A ∧ B|^{x,z}_{y,u} := |A|^x_y ∧ |B|^z_u,
|A → B|^{f,g}_{x,u} := |A|^x_{gxu} → |B|^{fx}_u.

Examples. (a) For quantifier-free formulas A0, B0 with xρ ∉ FV(B0),

τ+(∀xρ A0 → B0) = τ−(∀xρ A0) = ρ,      τ−(∀xρ A0 → B0) = ε,
τ+(∃xρ(A0 → B0)) = ρ,                  τ−(∃xρ(A0 → B0)) = ε.

Then

|∀xρ A0 → B0|^x_ε = |∀xρ A0|^ε_x → |B0|^ε_ε = A0 → B0,
|∃xρ(A0 → B0)|^x_ε = A0 → B0.


(b) For A with τ+(A) = ε and z ∉ FV(A), and arbitrary B,

τ+(A → ∃zρB) = (ρ × τ+(B)) × (τ−(B) → τ−(A)),
τ+(∃zρ(A → B)) = ρ × (τ+(B) × (τ−(B) → τ−(A))),
τ−(A → ∃zρB) = τ−(B),
τ−(∃zρ(A → B)) = τ−(B).

Then

|A → ∃zρB|^{〈z,y〉,g}_v = |A|^ε_{gv} → |∃zρB|^{z,y}_v = |A|^ε_{gv} → |B|^y_v,
|∃zρ(A → B)|^{z,〈y,g〉}_v = |A → B|^{y,g}_v = |A|^ε_{gv} → |B|^y_v.

(c) For arbitrary A,

τ+(∀xρ∃yσ A(x, y)) = ρ → σ × τ+(A),
τ+(∃fρ→σ∀xρ A(x, fx)) = (ρ → σ) × (ρ → τ+(A)),
τ−(∀xρ∃yσ A(x, y)) = ρ × τ−(A),
τ−(∃fρ→σ∀xρ A(x, fx)) = ρ × τ−(A).

Then

|∀xρ∃yσ A(x, y)|^{λx〈fx,z〉}_{x,u} = |∃yσ A(x, y)|^{fx,z}_u = |A(x, fx)|^z_u,
|∃fρ→σ∀xρ A(x, fx)|^{f,λx z}_{x,u} = |∀xρ A(x, fx)|^{λx z}_{x,u} = |A(x, fx)|^z_u.

(d) For arbitrary A, writing τ±A for τ±(A),

τ+(∀zρ(A → ∃zρA)) = ρ → (τ+A → ρ × τ+A) × (τ+A → τ−A → τ−A),
τ−(∀zρ(A → ∃zρA)) = ρ × (τ+A × τ−A).

Then

|∀zρ(A → ∃zρA)|^{λz〈λx〈z,x〉, λx,w w〉}_{z,〈x,w〉} = |A → ∃zρA|^{λx〈z,x〉, λx,w w}_{x,w}
    = |A|^x_w → |∃zρA|^{z,x}_w
    = |A|^x_w → |A|^x_w.

7.4.3. Characterization. We consider the question when the Godel translation of a formula A is equivalent to the formula itself. This will only hold if we assume the (constructively doubtful) Markov principle (MP), for higher type variables and quantifier-free formulas A0, B0:

(∀xρ A0 → B0) → ∃xρ(A0 → B0)    (xρ ∉ FV(B0)).

We will also need the less problematic axiom of choice (AC)

∀xρ∃yσ A(x, y) → ∃fρ→σ∀xρ A(x, f(x))

and the independence of premise axiom (IP)

(A → ∃xρB) → ∃xρ(A → B)    (xρ ∉ FV(A), τ+(A) = ε).

Notice that (AC) expresses that we can only have continuous dependencies.

Theorem (Characterization).

AC + IP + MP ⊢ (A ↔ ∃x∀y |A|^x_y).


Proof. Induction on A; we only treat the implication case.

(A → B) ↔ (∃x∀y |A|^x_y → ∃v∀u |B|^v_u)    by induction hypothesis
        ↔ ∀x(∀y |A|^x_y → ∃v∀u |B|^v_u)
        ↔ ∀x∃v(∀y |A|^x_y → ∀u |B|^v_u)    by (IP)
        ↔ ∀x∃v∀u(∀y |A|^x_y → |B|^v_u)
        ↔ ∀x∃v∀u∃y(|A|^x_y → |B|^v_u)    by (MP)
        ↔ ∃f∀x∀u∃y(|A|^x_y → |B|^{fx}_u)    by (AC)
        ↔ ∃f,g∀x,u(|A|^x_{gxu} → |B|^{fx}_u)    by (AC)
        ↔ ∃f,g∀x,u |A → B|^{f,g}_{x,u},

where the last step is by definition.

Without the Markov principle one can still prove some relations between A and its Godel translation. This, however, requires conditions G+(A), G−(A) on A, defined inductively by

G±(P~s) := ⊤,
G+(A → B) := (τ−(A) = ε) ∧ G−(A) ∧ G+(B),
G−(A → B) := G+(A) ∧ G−(B),
G±(A ∧ B) := G±(A) ∧ G±(B),
G±(∀xA) := G±(A),    G±(∃xA) := G±(A).

Proposition.

(7.39)    AC ⊢ ∃x∀y |A|^x_y → A    if G−(A),
(7.40)    AC ⊢ A → ∃x∀y |A|^x_y    if G+(A).

Proof. Both directions are proved simultaneously, by induction on A.

Case ∀zA. (7.39). Assume G−(A).

∃f∀z,y |∀zA|^f_{z,y} → ∃f∀z,y |A|^{fz}_y    by definition
                     → ∀z∃x∀y |A|^x_y
                     → ∀zA    by induction hypothesis, using G−(A).

(7.40). Assume G+(A).

∀zA → ∀z∃x∀y |A|^x_y    by induction hypothesis, using G+(A)
    → ∃f∀z∀y |A|^{fz}_y    by (AC)
    → ∃f∀z,y |∀zA|^f_{z,y}    by definition.

Case A → B. (7.39). Assume G+(A) and G−(B).

∃f,g∀x,u |A → B|^{f,g}_{x,u} → ∃f,g∀x,u(|A|^x_{gxu} → |B|^{fx}_u)    by definition
    → ∃f∀x∀u∃y(|A|^x_y → |B|^{fx}_u)
    → ∀x∃v∀u∃y(|A|^x_y → |B|^v_u)
    → ∀x∃v∀u(∀y |A|^x_y → |B|^v_u)
    → ∀x∃v(∀y |A|^x_y → ∀u |B|^v_u)
    → ∀x(∀y |A|^x_y → ∃v∀u |B|^v_u)
    → (∃x∀y |A|^x_y → ∃v∀u |B|^v_u)
    → (A → B)    by induction hypothesis,

where in the final step we have used G+(A) and G−(B).

(7.40). Assume τ−(A) = ε, G−(A) and G+(B).

(A → B) → (∃x |A|^x_ε → ∃v∀u |B|^v_u)    by induction hypothesis
    → ∀x(|A|^x_ε → ∃v∀u |B|^v_u)
    → ∀x∃v∀u(|A|^x_ε → |B|^v_u)
    → ∃f∀x∀u(|A|^x_ε → |B|^{fx}_u)    by (AC)
    → ∃f∀x,u |A → B|^f_{x,u}    by definition.

Case ∃zA. (7.39). Assume G−(A).

∃z,x∀y |∃zA|^{z,x}_y → ∃z∃x∀y |A|^x_y    by definition
                     → ∃zA    by induction hypothesis, using G−(A).

(7.40). Assume G+(A).

∃zA → ∃z∃x∀y |A|^x_y    by induction hypothesis, using G+(A)
    → ∃z,x∀y |∃zA|^{z,x}_y    by definition.

7.4.4. Soundness. Let Heyting arithmetic HAω in all finite types be the fragment of TCF where (i) the only base types are N and B, and (ii) the only inductively defined predicates are totality, Leibniz equality Eq, the (proper) existential quantifier and conjunction. We prove soundness of the Dialectica interpretation for HAω + AC + IP + MP, for our natural deduction formulation of the underlying logic.

We first treat some axioms, and show that each of them has a "logical Dialectica realizer", that is, a term t such that ∀y |A|^t_y can be proved in HAω.

For (∃+) this was proved in example (d) of 7.4.2. The introduction axioms for totality and Eq, conjunction introduction (∧+) and elimination (∧−) all have obvious Dialectica realizers. The elimination axioms for totality (i.e., induction) and for existence are treated below, in their (equivalent) rule formulation. The elimination axiom for Eq can be dealt with similarly.

The axioms (MP), (IP) and (AC) all have the form C → D where τ+(C) ∼ τ+(D) and τ−(C) ∼ τ−(D), with ρ ∼ σ indicating that ρ and σ are canonically isomorphic. This has been verified for (MP), (IP) and (AC) in examples (a)–(c) of 7.4.2, respectively. Such canonical isomorphisms can be expressed by λ-terms

f+ : τ+(C) → τ+(D),        f− : τ−(C) → τ−(D),
g+ : τ+(D) → τ+(C),        g− : τ−(D) → τ−(C)

(they have been written explicitly in 7.4.2). It is easy to check that the Godel translations |C|^u_{g−v} and |D|^{f+u}_v are equal (modulo β-conversion). But then 〈f+, λu g−〉 is a Dialectica realizer for the axiom C → D, because

|C → D|^{f+, λu g−}_{u,v} = |C|^u_{g−v} → |D|^{f+u}_v.


Theorem (Soundness). Let M be a derivation

HAω + AC + IP + MP ⊢ A

from assumptions ui : Ci (i = 1, . . . , n). Let xi of type τ+(Ci) be variables for realizers of the assumptions, and y be a variable of type τ−(A) for a challenge of the goal. Then we can find terms et+(M) =: t of type τ+(A) with y ∉ FV(t) and et−i(M) =: ri of type τ−(Ci), and a derivation in HAω of |A|^t_y from assumptions ui : |Ci|^{xi}_{ri}.

Proof. Induction on M. We begin with the logical rules and leave the treatment of the remaining axioms – induction, cases and (∃−) – for the end.

Case u : A. Let x of type τ+(A) be a variable for a realizer of the assumption u. Define et+(u) := x and et−0(u) := y.

Case λuA MB. By induction hypothesis we have a derivation of |B|^t_z from u : |A|^x_r and ui : |Ci|^{xi}_{ri}, where u : |A|^x_r may be absent. Substitute y0 for x and y1 for z. By (→+) we obtain |A|^{y0}_{r[x,z:=y0,y1]} → |B|^{t[x:=y0]}_{y1}, which is (up to β-conversion)

|A → B|^{λx t, λx,z r}_y,    from u′i : |Ci|^{xi}_{ri[x,z:=y0,y1]}.

Here r is the canonical inhabitant of the type τ−(A) in case u : |A|^x_r is absent. Hence we can define the required terms by (assuming that uA is u1)

et+(λuM) := 〈λx et+(M), λx,z et−1(M)〉,
et−i(λuM) := et−i+1(M)[x, z := y0, y1].

Case MA→B NA. By induction hypothesis we have a derivation of

|A → B|^t_x = |A|^{x0}_{t1(x0)(x1)} → |B|^{t0(x0)}_{x1}    from |Ci|^{xi}_{pi}, |Ck|^{xk}_{pk},

and of

|A|^s_z    from |Cj|^{xj}_{qj}, |Ck|^{xk}_{qk}.

Substituting 〈s, y〉 for x in the first derivation and t1sy for z in the second derivation gives

|A|^s_{t1sy} → |B|^{t0s}_y    from |Ci|^{xi}_{p′i}, |Ck|^{xk}_{p′k},    and

|A|^s_{t1sy}    from |Cj|^{xj}_{q′j}, |Ck|^{xk}_{q′k}.

Now we contract |Ck|^{xk}_{p′k} and |Ck|^{xk}_{q′k}: since |Ck|^{xk}_w is quantifier-free, there is a boolean term rCk such that

(7.41)    |Ck|^{xk}_w ↔ rCk w = tt.

Hence with rk := [if rCk p′k then q′k else p′k] we can derive both |Ck|^{xk}_{p′k} and |Ck|^{xk}_{q′k} from |Ck|^{xk}_{rk}. The derivation proceeds by cases on the boolean term rCk p′k. If it is true, then rk converts into q′k, and we only need to derive |Ck|^{xk}_{p′k}. But this follows by substituting p′k for w in (7.41). If rCk p′k is false, then rk converts into p′k, and we only need to derive |Ck|^{xk}_{q′k} from |Ck|^{xk}_{p′k}. But the latter implies ff = tt (substitute again p′k for w in (7.41)) and therefore every quantifier-free formula, in particular |Ck|^{xk}_{q′k}.


Using (→−) we obtain

|B|^{t0s}_y    from |Ci|^{xi}_{p′i}, |Cj|^{xj}_{q′j}, |Ck|^{xk}_{rk}.

Let et+(MN) := t0s and et−i(MN) := p′i, et−j(MN) := q′j, et−k(MN) := rk.

Case λxMA(x). By induction hypothesis we have a derivation of |A(x)|^t_z from ui : |Ci|^{xi}_{ri}. Substitute y0 for x and y1 for z. We obtain |A(y0)|^{t[x:=y0]}_{y1}, which is (up to β-conversion)

|∀xA(x)|^{λx t}_y,    from u′i : |Ci|^{xi}_{ri[x,z:=y0,y1]}.

Hence we can define the required terms by

et+(λxM) := λx et+(M),
et−i(λxM) := et−i(M)[x, z := y0, y1].

Case M∀xA(x) s. By induction hypothesis we have a derivation of

|∀xA(x)|^t_z = |A(z0)|^{t(z0)}_{z1}    from |Ci|^{xi}_{ri}.

Substituting 〈s, y〉 for z gives

|A(s)|^{ts}_y    from |Ci|^{xi}_{ri[z:=〈s,y〉]}.

Let et+(Ms) := ts and et−i(Ms) := ri[z := 〈s, y〉].

Case Indn,A ~a a M0^{A(0)} M1^{∀n(A(n)→A(n+1))}; here we restrict ourselves to N. Note that we can assume that the induction axiom appears with sufficiently many arguments, so that it can be seen as an application of the induction rule. This can always be achieved by means of η-expansion. Let Ik be the set of all indices of assumption variables ui : Ci occurring free in the step derivation Mk; in the present case of induction over N we have k ∈ {0, 1}. By induction hypothesis we have derivations of

|∀n(A(n) → A(n+1))|^t_{n,f,y} =
|A(n) → A(n+1)|^{tn}_{f,y} =
|A(n)|^f_{tn1fy} → |A(n+1)|^{tn0f}_y    from (|Ci|^{xi}_{ri1(n,f,y)})i∈I1

and of

|A(0)|^{t0}_{x0}    from (|Ci|^{xi}_{ri0(x0)})i∈I0.

It suffices to construct terms (involving recursion operators) t, ri with free variables among ~x such that

(7.42)    ∀n,y((|Ci|^{xi}_{riny})i → |A(n)|^{tn}_y).

For then define et+(Indn,A ~a a M0M1) := ta and et−i(Indn,A ~a a M0M1) := riay. The recursion equations for t are

t0 = t0,    t(n+1) = tn0(tn).

For ri the recursion equations may involve a case distinction corresponding to the well-known need of contraction in the Dialectica interpretation. This happens for the k-th recursion equation if and only if (i) we are not in a base case of the induction, and (ii) i ∈ Ik, i.e., the i-th assumption variable ui : Ci occurs free in Mk. Therefore in the present case of induction over N


the recursion equation for ri needs a case distinction only if i ∈ I1 and we are in the successor case; then

ri(n+1)y = ri1(n, tn, y) =: s    if ¬|Ci|^{xi}_s,
ri(n+1)y = rin(tn1(tn)y)         otherwise.

For i ∉ I1 the second alternative suffices:

ri(n+1)y = rin(tn1(tn)y).

In the base case we can simply define ri0y = ri0(y). Now t, ri can be written explicitly with recursion operators:

tn = Rn t0 λn(tn0),

rin = Rn(λy ri0) λn,p,y [if rCi s then p(tn1(tn)y) else s]    if i ∈ I1,
rin = Rn(λy ri0) λn,p,y (p(tn1(tn)y))                         otherwise,

with s := ri1(n, tn, y), as above. It remains to prove (7.42). We only consider the successor case. Assume

(7.43)    |Ci|^{xi}_{ri(n+1)y}    for all i.

We must show |A(n+1)|^{t(n+1)}_y. To this end we prove

(7.44)    |Ci|^{xi}_{ri1(n,tn,y)}    for all i ∈ I1, and
(7.45)    ri(n+1)y = rin(tn1(tn)y)    for all i.

First assume i ∈ I1. Let s := ri1(n, tn, y). If ¬|Ci|^{xi}_s, then by definition ri(n+1)y = s, contradicting (7.43). Hence |Ci|^{xi}_s, which is (7.44). Then by definition (7.45) holds as well. Now assume i ∉ I1. Then (7.44) does not apply, and (7.45) holds by definition.

Recall the global induction hypothesis for the step derivation M1. Used with n, tn, y it gives

(|Ci|^{xi}_{ri1(n,tn,y)})i∈I1 → |A(n)|^{tn}_{tn1(tn)y} → |A(n+1)|^{tn0(tn)}_y.

Because of (7.44) it suffices to prove the middle premise. By induction hypothesis (7.42) with y := tn1(tn)y it suffices to prove |Ci|^{xi}_{rin(tn1(tn)y)} for all i. But this follows from (7.43) by (7.45).

Remark. It is interesting to note that (7.42) can also be proved by quantifier-free induction. To this end, define

s0zm := z,    s(l+1)zm := t(m ∸ l ∸ 1)1(t(m ∸ l ∸ 1))(slzm).

We fix z and prove by induction on n that

(7.46)    n ≤ m → (|Ci|^{xi}_{rin(s(m∸n)zm)})i → |A(n)|^{tn}_{s(m∸n)zm}.

Then (7.42) will follow with n := m. For the base case n = 0 we must show

(|Ci|^{xi}_{ri0(smzm)})i → |A(0)|^{t0}_{smzm}.

Recall that the global induction hypothesis for the base derivation gives with x0 := smzm

(|Ci|^{xi}_{ri0(smzm)})i∈I0 → |A(0)|^{t0}_{smzm}.


By definition of t and ri this is what we want. Now consider the successor case. Assume n+1 ≤ m. We write sl for slzm, and abbreviate s(m ∸ n ∸ 1) by y. Notice that for l+1 = m ∸ n by definition of s we have s(m ∸ n) = tn1(tn)y. With this notation the previous argument goes through literally:

Assume (7.43). We must show |A(n+1)|^{t(n+1)}_y. To this end we prove (7.44) and (7.45). First assume i ∈ I1. Let s := ri1(n, tn, y). If ¬|Ci|^{xi}_s, then by definition ri(n+1)y = s, contradicting (7.43). Hence |Ci|^{xi}_s, which is (7.44). Then by definition (7.45) holds as well. Now assume i ∉ I1. Then (7.44) does not apply, and (7.45) holds by definition.

Recall the global induction hypothesis for the step derivation M1. Used with n, tn, y it gives

(|Ci|^{xi}_{ri1(n,tn,y)})i∈I1 → |A(n)|^{tn}_{tn1(tn)y} → |A(n+1)|^{tn0(tn)}_y.

Because of (7.44) it suffices to prove the middle premise. By induction hypothesis (7.42) with y := tn1(tn)y it suffices to prove |Ci|^{xi}_{rin(tn1(tn)y)} for all i. But this follows from (7.43) by (7.45).

Case Cn,A a M0^{A(0)} M1^{∀nA(n+1)}. This can be dealt with similarly, but somewhat simpler. By induction hypothesis we have derivations of

|∀nA(n+1)|^t_{n,y} = |A(n+1)|^{tn}_y    from |Ci|^{xi}_{ri1(n,y)}

and of

|A(0)|^{t0}_y    from |Ci|^{xi}_{ri0(y)}.

Here i ranges over all assumption variables in Cn,A a M0M1 (if necessary choose canonical terms ri0 and ri1). It suffices to construct terms t, ri with free variables among ~x such that

(7.47)    ∀m,y((|Ci|^{xi}_{rimy})i → |A(m)|^{tm}_y).

For then we can define et+(Cn,A a M0M1) = ta and et−i(Cn,A a M0M1) = riay. The defining equations for t are

t0 = t0,    t(n+1) = tn

and for ri

ri0y = ri0,    ri(n+1)y = ri1(n, y) =: s.

t, ri can be written explicitly:

tm = [if m then t0 else λn(tn)],    rim = [if m then λy ri0(y) else λn,y s]

with s as above. It remains to prove (7.47). We only consider the successor case. Assume |Ci|^{xi}_{ri(n+1)y} for all i. We must show |A(n+1)|^{t(n+1)}_y. To see this, recall that the global induction hypothesis (for the step derivation) gives

(|Ci|^{xi}_s)i → |A(n+1)|^{tn}_y

and we are done.

Case ∃−x,A,B M∃xA N∀x(A→B). Again it is easiest to assume that the axiom appears with two proof arguments, for its two assumptions. Then it can be seen as an application of the existence elimination rule. We proceed similarly to the treatment of (→−) above:


By induction hypothesis we have a derivation of

|∀x(A(x) → B)|^t_x = |A(x0) → B|^{t(x0)}_{x1}
                   = |A(x0)|^{x10}_{t(x0)1(x10)(x11)} → |B|^{t(x0)0(x10)}_{x11}

from |Ci|^{xi}_{pi}, |Ck|^{xk}_{pk}, and of

|∃xA(x)|^s_z = |A(s0)|^{s1}_z    from |Cj|^{xj}_{qj}, |Ck|^{xk}_{qk}.

Substituting 〈s0, 〈s1, y〉〉 for x in the first derivation and t(s0)1(s1)y for z in the second derivation gives

|A(s0)|^{s1}_{t(s0)1(s1)y} → |B|^{t(s0)0(s1)}_y    from |Ci|^{xi}_{p′i}, |Ck|^{xk}_{p′k},    and

|A(s0)|^{s1}_{t(s0)1(s1)y}    from |Cj|^{xj}_{q′j}, |Ck|^{xk}_{q′k}.

Now we contract |Ck|^{xk}_{p′k} and |Ck|^{xk}_{q′k} as in case (→−) above; with rk := [if rCk p′k then q′k else p′k] we can derive both |Ck|^{xk}_{p′k} and |Ck|^{xk}_{q′k} from |Ck|^{xk}_{rk}.

Using (→−) we obtain

|B|^{t(s0)0(s1)}_y    from |Ci|^{xi}_{p′i}, |Cj|^{xj}_{q′j}, |Ck|^{xk}_{rk}.

So et+(∃−MN) := t(s0)0(s1) and

et−i(∃−MN) := p′i,    et−j(∃−MN) := q′j,    et−k(∃−MN) := rk.

7.4.5. A unified treatment of modified realizability and the Dialectica interpretation. Following Oliva (2006), we show that modified realizability can be treated in such a way that similarities with the Dialectica interpretation become visible. To this end, one needs to change the definitions of τ+(A) and τ−(A) and also of the Godel translation |A|^x_y in the implicational case, as follows.

τ+r(A → B) := τ+r(A) → τ+r(B),
τ−r(A → B) := τ+r(A) × τ−r(B),
||A → B||^f_{x,u} := ∀y ||A||^x_y → ||B||^{fx}_u.

Note that the (changed) Godel translation ||A||^x_y is not quantifier-free any more, but only ∃-free. Then the above definition of r r A can be expressed in terms of the (new) ||A||^x_y:

⊢ r r A ↔ ∀y ||A||^r_y.

This is proved by induction on A. For prime formulas the claim is obvious. Case A → B, with τ+r(A) ≠ ε, τ−r(A) ≠ ε.

r r (A → B) ↔ ∀x(x r A → rx r B)    by definition
            ↔ ∀x(∀y ||A||^x_y → ∀u ||B||^{rx}_u)    by induction hypothesis
            ↔ ∀x,u(∀y ||A||^x_y → ||B||^{rx}_u)
            = ∀x,u ||A → B||^r_{x,u}    by definition.

The other cases are similar (even easier).


7.4.6. Dialectica interpretation of general induction. Recall the general recursion operator introduced in (6.5) (in 6.2.1):

FµxG = Gx(λy [if µy < µx then FµyG else ε]),

where ε denotes a canonical inhabitant of the range. Using general induction one can prove that F is total:

Theorem. If µ, G and x are total, then so is FµxG.

Proof. Fix total functions µ and G. We apply general induction on x to show that FµxG is total, which we write as (FµxG)↓. By (7.3) it suffices to show that

∀y;µy<µx (FµyG)↓ → (FµxG)↓.

But this follows from (6.5), using the totality of µ, G and x.

Again, in our special case of the <-relation general recursion is easily definable from structural recursion; the details are spelled out in Schwichtenberg and Wainer (1995, pp. 399f). However, general recursion is preferable from an efficiency point of view.
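The computational behaviour of the general recursion operator can be sketched in Python (names F, mu, G, prev and the default eps are ours, for illustration): the recursive call is only made when the measure decreases, so evaluation terminates on total arguments.

```python
def F(mu, x, G, eps=None):
    """General recursion along the measure mu, following (6.5):
    F mu x G = G x (lambda y. if mu y < mu x then F mu y G else eps)."""
    def prev(y):
        # guarded recursive call: only available below the current measure
        return F(mu, y, G, eps) if mu(y) < mu(x) else eps
    return G(x, prev)
```

For instance, with the identity as measure, G(x, prev) = 1 if x == 0 else x * prev(x - 1) computes the factorial.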

For an implementation of the Dialectica interpretation it is advisable to replace axioms by rules whenever possible. In particular, more perspicuous realizers for proofs involving general induction can be obtained if the induction axiom appears with sufficiently many arguments, so that it can be seen as an application of the induction rule. Note that this can always be achieved by means of η-expansion.

Case GIndn,A ~a h k M^{Progh ∀nA(n)} : A(n). By induction hypothesis we can derive

|Progh ∀nA(n)|^t_{n,f,z} =
|∀n(∀m;hm<hn A(m) → A(n))|^t_{n,f,z} =
|∀m;hm<hn A(m) → A(n)|^{tn}_{f,z} =
|∀m;hm<hn A(m)|^f_{tn1fz} → |A(n)|^{tn0f}_z =
(h(tn1fz0) < hn → |A(tn1fz0)|^{f(tn1fz0)}_{tn1fz1}) → |A(n)|^{tn0f}_z    from |Ci|^{xi}_{ri(n,f,z)},

where i ranges over all assumption variables in GIndn,A ~a h k M (if necessary choose canonical terms ri). It suffices to construct terms (involving general recursion operators) t, ri with free variables among ~x such that

(7.48)    ∀n,z((|Ci|^{xi}_{rinz})i → |A(n)|^{tn}_z),

for then we can define et+(GIndn,A ~a h k M) = tk and et−i(GIndn,A ~a h k M) = rikz. The recursion equations for t and ri are

tn = tn0[t]<hn,

rinz = ri(n, [t]<hn, z) =: s    if ¬|Ci|^{xi}_s,
rinz = [ri]<hn(t′0)(t′1)        otherwise,

with the abbreviations

[r]<hn := λm [if hm < hn then rm else ε],    t′ := tn1[t]<hn z.


It remains to prove (7.48). For its proof we use general induction. Fix n. We can assume

(7.49)    ∀m;hm<hn ∀z((|Ci|^{xi}_{rimz})i → |A(m)|^{tm}_z).

Fix z and assume |Ci|^{xi}_{rinz} for all i. We must show |A(n)|^{tn}_z. If ¬|Ci|^{xi}_s for some i, then by definition rinz = s and we have |Ci|^{xi}_s, a contradiction. Hence |Ci|^{xi}_s for all i, and therefore rinz = [ri]<hn(t′0)(t′1). The induction hypothesis (7.49) with m := t′0 and z := t′1 gives

h(t′0) < hn → (|Ci|^{xi}_{ri(t′0)(t′1)})i → |A(t′0)|^{t(t′0)}_{t′1}.

Recall that the global induction hypothesis (for the derivation of progressiveness) gives with f := [t]<hn

(|Ci|^{xi}_s)i → (h(t′0) < hn → |A(t′0)|^{[t]<hn(t′0)}_{t′1}) → |A(n)|^{tn0[t]<hn}_z.

Since t(t′0) = [t]<hn(t′0) and rinz = [ri]<hn(t′0)(t′1) = ri(t′0)(t′1) we are done.

Notice that we can view this proof as an application of quantifier-free general induction, where the formula (|Ci|^{xi}_{rinz})i → |A(n)|^{tn}_z is proved w.r.t. the measure function h′nz := hn.

7.5. Optimal Decoration of Proofs

In this section we are interested in "fine-tuning" the computational content of proofs, by inserting decorations. Here is an example (due to Constable) of why this is of interest. Suppose that in a proof M of a formula C we have made use of a case distinction based on an auxiliary lemma stating a disjunction, say L : A ∨ B. Then the extract et(M) will contain the extract et(L) of the proof of the auxiliary lemma, which may be large. Now suppose further that in the proof M of C, the only computationally relevant use of the lemma was which one of the two alternatives holds true, A or B. We can express this fact by using a weakened form of the lemma instead: L′ : A ∨nc B. Since the extract et(L′) is a boolean, the extract of the modified proof has been "purified" in the sense that the (possibly large) extract et(L) has disappeared.

In 7.5.1 we consider the question of "optimal" decorations of proofs: suppose we are given an undecorated proof, and a decoration of its end formula. The task then is to find a decoration of the whole proof (including a further decoration of its end formula) in such a way that any other decoration "extends" this one. Here "extends" just means that some connectives have been changed into their more informative versions, disregarding polarities. We show that such an optimal decoration exists, and give an algorithm to construct it.

We then consider applications. In 7.5.2 we take up the example of list reversal used by Berger (2005a) to demonstrate that usage of ∀nc rather than ∀c can significantly reduce the complexity of extracted programs, in this case from quadratic to linear. The Minlog implementation of the decoration algorithm automatically finds the optimal decoration. A similar application


of decoration is treated in 7.5.3. It occurs when one derives double induction (recurring to two predecessors) in continuation passing style, i.e., not directly, but using as an intermediate assertion (proved by induction)

∀cn,m((Qn→c Q(Sn)→c Q(n+m))→c Q0→c Q1→c Q(n+m)).

After decoration, the formula becomes

∀cn∀ncm ((Qn→c Q(Sn)→c Q(n+m))→c Q0→c Q1→c Q(n+m)).

This is applied (as in Chiarabini (2009)) to obtain a continuation based tail recursive definition of the Fibonacci function, from a proof of its totality.
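The shape of such a continuation-based program can be sketched in Python (our transcription, not Minlog's literal output): the helper carries a continuation consuming two successive Fibonacci values, so the recursion is tail recursive.

```python
def fib(n):
    """Fibonacci via a continuation accumulating the two last values."""
    def go(n, k):
        # go(n, k) computes k(fib(n), fib(n+1))
        if n == 0:
            return k(0, 1)                       # base values fib(0), fib(1)
        return go(n - 1, lambda a, b: k(b, a + b))
    return go(n, lambda a, b: a)
```

Note that CPython does not eliminate tail calls, so this illustrates the structure of the extracted program rather than a constant-stack implementation.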

7.5.1. Decoration algorithm. We denote the sequent of a proof Mby Seq(M); it consists of its context and end formula.

The proof pattern P(M) of a proof M is the result of marking in c.r.formulas of M (i.e., those not above a c.i. formula) all occurrences of impli-cations and universal quantifiers as non-computational, except the “unin-stantiated” formulas of axioms and theorems. For instance, the inductionaxiom for N consists of the uninstantiated formula ∀cn(P0 →c ∀cn(Pn →c

P (Sn))→c PnN) with a unary predicate variable P and a predicate substi-tution P 7→ x | A(x) . Notice that a proof pattern in most cases is not acorrect proof, because at axioms formulas may not fit.

We say that a formula D extends C if D is obtained from C by changingsome (possibly zero) of its occurrences of non-computational implicationsand universal quantifiers into their computational variants →c and ∀c.

A proof N extends M if (i) N and M are the same up to variants ofimplications and universal quantifiers in their formulas, and (ii) every c.r.formula of M is extended by the corresponding one in N . Every proof Mwhose proof pattern P(M) is U is called a decoration of U .

Note. Notice that if a proof N extends another one M , then FV(et(N))is essentially (that is, up to extensions of assumption formulas) a supersetof FV(et(M)). This can be proven by induction on N .

In the sequel we assume that every axiom has the property that for everyextension of its formula we can find a further extension which is an instanceof an axiom, and which is the least one under all further extensions that areinstances of axioms. This property clearly holds for axioms whose uninstan-tiated formula only has the decorated →c and ∀c, for instance induction.However, in ∀cn(A(0) →c ∀cn(A(n) →c A(Sn)) →c A(nN)) the given exten-sion of the four A’s might be different. One needs to pick their “least upperbound” as further extension. To make this assumption true for the other(introduction and elimination) axioms we simply add all their extensions asaxioms, if necessary.

We will define a decoration algorithm, assigning to every proof pattern U and every extension of its sequent an "optimal" decoration M∞ of U, which further extends the given extension of its sequent.

Theorem. Under the assumption above, for every proof pattern U and every extension of its sequent Seq(U) we can find a decoration M∞ of U such that

(a) Seq(M∞) extends the given extension of Seq(U), and

(b) M∞ is optimal in the sense that any other decoration M of U whose sequent Seq(M) extends the given extension of Seq(U) has the property that M also extends M∞.

Proof. By induction on derivations. It suffices to consider derivations with a c.r. end formula. For axioms the validity of the claim was assumed, and for assumption variables it is clear.

Case (→nc)+. Consider the proof pattern

    Γ, u : A
       | U
       B
    ──────────── (→nc)+ u
     A →nc B

with a given extension ∆ ⇒ C →nc D or ∆ ⇒ C →c D of its sequent Γ ⇒ A →nc B. Applying the induction hypothesis for U with sequent ∆, C ⇒ D, one obtains a decoration M∞ of U whose sequent ∆1, C1 ⇒ D1 extends ∆, C ⇒ D. Now apply (→nc)+ in case the given extension is ∆ ⇒ C →nc D and xu ∉ FV(et(M∞)), and (→c)+ otherwise.

For (b) consider a decoration λuM of λuU whose sequent extends the given extended sequent ∆ ⇒ C →nc D or ∆ ⇒ C →c D. Clearly the sequent Seq(M) of its premise extends ∆, C ⇒ D. Then M extends M∞ by induction hypothesis for U. If λuM derives a non-computational implication then the given extended sequent must be of the form ∆ ⇒ C →nc D and xu ∉ FV(et(M)), hence xu ∉ FV(et(M∞)). But then by construction we have applied (→nc)+ to obtain λuM∞. Hence λuM extends λuM∞. If λuM does not derive a non-computational implication, the claim follows immediately.

Case (→nc)−. Consider a proof pattern

    Φ, Γ          Γ, Ψ
     | U           | V
    A →nc B        A
    ──────────────────── (→nc)−
            B

We are given an extension Π,∆,Σ ⇒ D of Φ,Γ,Ψ ⇒ B. Then we proceed in alternating steps, applying the induction hypothesis to U and V.

(1) The induction hypothesis for U for the extension Π,∆ ⇒ A →nc D of its sequent gives a decoration M1 of U whose sequent Π1,∆1 ⇒ C1 → D1 extends Π,∆ ⇒ A →nc D, where → means →nc or →c. This already suffices if A is c.i., since then the extension ∆1,Σ ⇒ C1 of V is a correct proof (recall that in c.i. parts of a proof decorations of implications and universal quantifiers can be ignored). If A is c.r.:

(2) The induction hypothesis for V for the extension ∆1,Σ ⇒ C1 of its sequent gives a decoration N2 of V whose sequent ∆2,Σ2 ⇒ C2 extends ∆1,Σ ⇒ C1.

(3) The induction hypothesis for U for the extension Π1,∆2 ⇒ C2 → D1 of its sequent gives a decoration M3 of U whose sequent Π3,∆3 ⇒ C3 → D3 extends Π1,∆2 ⇒ C2 → D1.

(4) The induction hypothesis for V for the extension ∆3,Σ2 ⇒ C3 of its sequent gives a decoration N4 of V whose sequent ∆4,Σ4 ⇒ C4 extends ∆3,Σ2 ⇒ C3.

This process is repeated until in V no further proper extension of ∆3 and C3 is returned. Such a situation will always be reached since there is a maximal extension, where all connectives are maximally decorated. But then we easily obtain (a): assume that in (4) we have ∆4 = ∆3 and C4 = C3. Then the decoration

    Π3, ∆3        ∆4, Σ4
     | M3          | N4
    C3 → D3        C4
    ──────────────────── →−
           D3

of UV derives a sequent Π3,∆3,Σ4 ⇒ D3 extending Π,∆,Σ ⇒ D.

For (b) we need to consider a decoration MN of UV whose sequent Seq(MN) extends the given extension Π,∆,Σ ⇒ D of Φ,Γ,Ψ ⇒ B. We must show that MN extends M3N4. To this end we go through the alternating steps again.

(1) Since the sequent Seq(M) extends Π,∆ ⇒ A →nc D, the induction hypothesis for U for the extension Π,∆ ⇒ A →nc D of its sequent ensures that M extends M1.

(2) Since then the sequent Seq(N) extends ∆1,Σ ⇒ C1, the induction hypothesis for V for the extension ∆1,Σ ⇒ C1 of its sequent ensures that N extends N2.

(3) Therefore Seq(M) extends the sequent Π1,∆2 ⇒ C2 → D1, and the induction hypothesis for U for the extension Π1,∆2 ⇒ C2 → D1 of U's sequent ensures that M extends M3.

(4) Therefore Seq(N) extends ∆3,Σ2 ⇒ C3, and the induction hypothesis for V for the extension ∆3,Σ2 ⇒ C3 of V's sequent ensures that N also extends N4.

But since ∆4 = ∆3 and C4 = C3 by assumption, MN extends the decoration M3N4 of UV constructed above.

Case (∀nc)+. Consider a proof pattern

    Γ
    | U
    A
    ──────── (∀nc)+
    ∀ncx A

with a given extension ∆ ⇒ ∀ncx C or ∆ ⇒ ∀cx C of its sequent. Applying the induction hypothesis for U with sequent ∆ ⇒ C, one obtains a decoration M∞ of U whose sequent ∆1 ⇒ C1 extends ∆ ⇒ C. Now apply (∀nc)+ in case the given extension is ∆ ⇒ ∀ncx C and x ∉ FV(et(M∞)), and (∀c)+ otherwise.

For (b) consider a decoration λxM of λxU whose sequent extends the given extended sequent ∆ ⇒ ∀ncx C or ∆ ⇒ ∀cx C. Clearly the sequent Seq(M) of its premise extends ∆ ⇒ C. Then M extends M∞ by induction hypothesis for U. If λxM derives a non-computational generalization, then the given extended sequent must be of the form ∆ ⇒ ∀ncx C and x ∉ FV(et(M)), hence x ∉ FV(et(M∞)) (by the remark above). But then by construction we have applied (∀nc)+ to obtain λxM∞. Hence λxM extends λxM∞. If λxM does not derive a non-computational generalization, the claim follows immediately.


Case (∀nc)−. Consider a proof pattern

    Γ
    | U
    ∀ncx A(x)    r
    ────────────── (∀nc)−
    A(r)

and let ∆ ⇒ C(r) be any extension of its sequent Γ ⇒ A(r). The induction hypothesis for U for the extension ∆ ⇒ ∀ncx C(x) produces a decoration M∞ of U whose sequent extends ∆ ⇒ ∀ncx C(x). Then apply (∀nc)− or (∀c)−, whichever is appropriate, to obtain the required M∞r.

For (b) consider a decoration Mr of Ur whose sequent Seq(Mr) extends the given extension ∆ ⇒ C(r) of Γ ⇒ A(r). Then M extends M∞ by induction hypothesis for U, and hence Mr extends M∞r.

7.5.2. List reversal, again. We first give an informal weak existence proof for list reversal. Recall that the weak (or "classical") existential quantifier is defined by

∃xA := ¬∀x¬A.

The proof is similar to the one given in 7.2.8. Again assuming (7.20) and (7.21) we prove

(7.50)    ∀v∃wRvw (:= ∀v(∀w(Rvw → ⊥) → ⊥)).

Fix v and assume u : ∀w¬Rvw; we need to derive a contradiction. To this end we prove that all initial segments of v are non-revertible, which contradicts (7.20). More precisely, from u and (7.21) we prove

∀v2A(v2) with A(v2) := ∀v1(v1v2 = v → ∀w¬Rv1w)

by induction on v2. For v2 = nil this follows from our initial assumption u. For the step case, assume v1(xv2) = v, fix w and assume further Rv1w. We must derive a contradiction. By (7.21) we conclude that R(v1x, xw). On the other hand, properties of the append function imply that (v1x)v2 = v. The induction hypothesis for v1x gives ∀w¬R(v1x,w). Taking xw for w leads to the desired contradiction.

We formalize this proof, to prepare it for decoration. The following lemmata will be used.

Compat: ∀P∀v1,v2(v1 = v2 → Pv1 → Pv2),

Symm: ∀v1,v2(v1 = v2 → v2 = v1),

Trans : ∀v1,v2,v3(v1 = v2 → v2 = v3 → v1 = v3),

L1 : ∀v(v = v nil),

L2 : ∀v1,x,v2((v1x)v2 = v1(xv2)).

The proof term is

M := λv λu^{∀w¬Rvw}(Ind_{v2,A(v2)} v v MBase MStep nil T_{nil v=v} nil InitRev)


with

MBase := λv1 λu1^{v1 nil=v}
           (Compat {v | ∀w¬Rvw} v v1 (Symm v1 v (Trans v1 (v1 nil) v (L1 v1) u1)) u),

MStep := λx,v2 λu0^{A(v2)} λv1 λu1^{v1(xv2)=v} λw λu2^{Rv1w}
           (u0 (v1x) (Trans ((v1x)v2) (v1(xv2)) v (L2 v1 x v2) u1) (xw) (GenRev v1 w x u2)).

We now have a proof M of ∀v∃wRvw from the clauses InitRev : D1 and GenRev : D2, with D1 := R(nil, nil) and D2 := ∀v,w,x(Rvw → R(vx, xw)). Using the refined A-translation (cf. section 7.3) we can replace ⊥ throughout by ∃wRvw. The end formula ∀v∃wRvw := ∀v¬∀w¬Rvw := ∀v(∀w(Rvw → ⊥) → ⊥) is turned into ∀v(∀w(Rvw → ∃wRvw) → ∃wRvw). Since its premise is an instance of existence introduction we obtain a derivation M∃ of ∀v∃wRvw. Moreover, in this case neither the Di nor any of the axioms used involves ⊥ in its uninstantiated formulas, and hence the correctness of the proof is not affected by the substitution. The term neterm extracted in Minlog from a formalization of the proof above is (after "animating" Compat)

    [v0]
     (Rec list nat=>list nat=>list nat=>list nat)
     v0
     ([v1,v2]v2)
     ([x1,v2,g3,v4,v5]g3(v4:+:x1:)(x1::v5))
     (Nil nat)
     (Nil nat)

with g a variable for binary functions on lists. In fact, the underlying algorithm defines an auxiliary function h by

h(nil, v2, v3) := v3,    h(xv1, v2, v3) := h(v1, v2x, xv3)

and gives the result by applying h to the original list and twice nil.

Notice that the second argument of h is not needed. However, its presence makes the algorithm quadratic rather than linear, because in each recursion step v2x is computed, and the list append function is defined by recursion on its first argument. We will be able to get rid of this superfluous second argument by decorating the proof. It will turn out that in the proof (by induction on v2) of the auxiliary formula A(v2) := ∀v1(v1v2 = v → ∀w¬Rv1w), the variable v1 is not used computationally. Hence, in the decorated version of the proof, we can use ∀ncv1.

Let us now apply the general method of decorating proofs to the example of list reversal. To this end, we present our proof in more detail, particularly by writing proof trees with formulas. The decoration algorithm then is applied to its proof pattern with the sequent consisting of the context R(nil, nil) and ∀ncv,w,x(Rvw →nc R(vx, xw)) and the end formula ∀ncv ∃lwRvw.

Rather than describing the algorithm step by step we only display the end result. Among the axioms used, the only ones in c.r. parts are Compat and list induction. They appear in the decorated proof in the form

Compat: ∀P∀ncv1,v2(v1 = v2 → Pv1 →c Pv2),

Ind: ∀cv2(A(nil) →c ∀cx,v2(A(v2) →c A(xv2)) →c A(v2))


    Compat {v | ∀cw¬∃Rvw} v v1            [u1 : v1 nil=v]
    : v=v1 → ∀cw¬∃Rvw →c ∀cw¬∃Rv1w              | N
                                               v=v1
    ────────────────────────────────────────────────
        ∀cw¬∃Rvw →c ∀cw¬∃Rv1w        ∃+ : ∀cw¬∃Rvw
    ────────────────────────────────────────────────
                    ∀cw¬∃Rv1w
        ───────────────────────────── (→nc)+ u1
        v1 nil = v → ∀cw¬∃Rv1w
        ─────────────────────────────
        ∀ncv1(v1 nil = v → ∀cw¬∃Rv1w)   (= A(nil))

    Figure 1. The decorated base derivation

                                          [u1 : v1(xv2)=v]
    [u0 : A(v2)]  v1x                           | N1
    : (v1x)v2=v → ∀cw¬∃R(v1x,w)             (v1x)v2=v
    ──────────────────────────────────────────────────
          ∀cw¬∃R(v1x,w)    xw               [u2 : Rv1w]
    ─────────────────────────────               | N2
          ¬∃R(v1x, xw)                      R(v1x, xw)
    ──────────────────────────────────────────────────
                      ∃lwRvw
         ────────────────────────────── (→nc)+ u2
                      ¬∃Rv1w
         ──────────────────────────────
                    ∀cw¬∃Rv1w
         ────────────────────────────── (→nc)+ u1
         v1(xv2)=v → ∀cw¬∃Rv1w
         ──────────────────────────────
         ∀ncv1(v1(xv2)=v → ∀cw¬∃Rv1w)   (= A(xv2))
         ────────────────────────────── (→c)+ u0
               A(v2) →c A(xv2)
         ──────────────────────────────
            ∀cx,v2(A(v2) →c A(xv2))

    Figure 2. The decorated step derivation

with A(v2) := ∀ncv1(v1v2 = v → ∀cw¬∃Rv1w) and ¬∃Rv1w := Rv1w → ∃lwRvw.

M∃Base is the derivation in Figure 1, where N is a derivation involving L1 with a free assumption u1 : v1 nil = v. M∃Step is the derivation in Figure 2, where N1 is a derivation involving L2 with free assumption u1 : v1(xv2) = v, and N2 is one involving GenRev with the free assumption u2 : Rv1w.

The extracted term neterm then is

    [v0]
     (Rec list nat=>list nat=>list nat)
     v0
     ([v1]v1)
     ([x1,v2,f3,v4]f3(x1::v4))
     (Nil nat)

with f a variable for unary functions on lists. To run this algorithm one has to normalize the term obtained by applying neterm to a list:

    (pp (nt (mk-term-in-app-form neterm (pt "1::2::3::4:"))))

The returned value is the reversed list 4::3::2::1:. This time, the underlying algorithm defines an auxiliary function g by

g(nil, w) := w,    g(x :: v, w) := g(v, x :: w)


and gives the result by applying g to the original list and nil. In conclusion, we have obtained (by machine extraction from an automated decoration of a weak existence proof) the standard linear algorithm for list reversal, with its use of an accumulator.
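For comparison, the two algorithms can be transcribed into Python (a transcription for illustration; the extracted terms themselves live in Minlog's term language):

```python
def rev_quadratic(v, v2, v3):
    # h(nil, v2, v3) = v3;  h(x::v1, v2, v3) = h(v1, v2 ++ [x], x::v3).
    # The second argument is never used in the result, but building it
    # costs an append per step, which makes the algorithm quadratic.
    if not v:
        return v3
    x, rest = v[0], v[1:]
    return rev_quadratic(rest, v2 + [x], [x] + v3)

def rev_linear(v, w):
    # g(nil, w) = w;  g(x::v, w) = g(v, x::w): the accumulator version
    if not v:
        return w
    return rev_linear(v[1:], [v[0]] + w)
```

Here rev_linear([1, 2, 3, 4], []) yields [4, 3, 2, 1], matching the Minlog run above.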

7.5.3. Passing continuations. A similar application of decoration occurs when one derives double induction

∀cn(Qn →c Q(Sn) →c Q(S(Sn))) →c ∀cn(Q0 →c Q1 →c Qn)

in continuation passing style, i.e., not directly, but using as an intermediate assertion (proved by induction)

∀cn,m((Qn →c Q(Sn) →c Q(n+m)) →c Q0 →c Q1 →c Q(n+m)).

After decoration, the formula becomes

∀cn∀ncm((Qn →c Q(Sn) →c Q(n+m)) →c Q0 →c Q1 →c Q(n+m)).

This can be applied to obtain a continuation-based tail-recursive definition of the Fibonacci function, from a proof of its totality. Let G be the graph of the Fibonacci function, defined by the clauses

G00,    G11,    ∀ncn,v,w(Gnv →nc G(Sn,w) →nc G(S(Sn), v + w)).

We view G as a predicate variable without computational content. From these assumptions one can easily derive

∀cn∃vGnv,

using double induction (proved in continuation passing style). The term extracted from this proof is

    [n0]
     (Rec nat=>nat=>(nat=>nat=>nat)=>nat=>nat=>nat)
     n0
     ([n1,k2]k2)
     ([n1,p2,n3,k4]p2(Succ n3)([n7,n8]k4 n8(n7+n8)))

applied to 0, ([n1,n2]n1), 0 and 1. An unclean aspect of this term is that the recursion operator has value type

    nat=>(nat=>nat=>nat)=>nat=>nat=>nat

rather than (nat=>nat=>nat)=>nat=>nat=>nat, which would correspond to an iteration. However, we can repair this by decoration. After (automatic) decoration of the proof, the extracted term becomes

    [n0]
     (Rec nat=>(nat=>nat=>nat)=>nat=>nat=>nat)
     n0
     ([k1]k1)
     ([n1,p2,k3]p2([n6,n7]k3 n7(n6+n7)))

applied to ([n1,n2]n1), 0 and 1. This indeed is iteration in continuation passing style.
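In Python (our transcription of the decorated term, not Minlog output), the iteration reads as follows: each step replaces the continuation k by (a, b) ↦ k(b, a + b), and the initial continuation ([n1,n2]n1) projects the first component of the pair started at 0, 1:

```python
def fib_iter(n, k):
    # iteration in continuation-passing style: wrap the continuation n times
    if n == 0:
        return k
    return fib_iter(n - 1, lambda a, b: k(b, a + b))

def fib(n):
    # apply the built-up continuation to the start values 0, 1
    return fib_iter(n, lambda a, b: a)(0, 1)
```

Each wrapping advances the pair (F(m), F(m+1)) one step, so the original continuation finally receives (F(n), F(n+1)) and projects out F(n).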


7.6. Application: Euclid’s Theorem

Yiannis Moschovakis suggested the following example of a classical existence proof with a quantifier-free kernel which does not obviously contain an algorithm: the gcd of two natural numbers a1 and a2 is a linear combination of the two. Here we treat this example as a case study for program extraction from classical proofs. We will apply both methods discussed above: the refined A-translation (7.3) and the Dialectica interpretation (7.4). It will turn out that in both cases we obtain reasonable extracted terms, which are in fact quite similar.

7.6.1. Informal proof. We spell out the usual informal proof, which uses the minimum principle. This is done in rather great detail, because for the application of the metamathematical methods of proof interpretation we need a full formalization.

Theorem. Assume 0 < a2. Then there are natural numbers k1, k2 such that 0 < |k1a1 − k2a2| and Rem(ai, |k1a1 − k2a2|) = 0 (i = 1, 2).

Proof. Assume 0 < a2. Let A(k1, k2) := (0 < |k1a1 − k2a2|). There are k1, k2 such that A(k1, k2): take k1 := 0 and k2 := 1. The minimum principle for A(k1, k2) with measure |k1a1 − k2a2| provides us with k1, k2 such that

(7.51)    A(k1, k2),

(7.52)    ∀l1,l2(|l1a1 − l2a2| < |k1a1 − k2a2| → A(l1, l2) → ⊥).

Assume

∀k1,k2(0 < |k1a1 − k2a2| → Rem(a1, |k1a1 − k2a2|) = 0 → Rem(a2, |k1a1 − k2a2|) = 0 → ⊥).

We must show ⊥. To this end we apply the assumption to k1, k2. Since 0 < |k1a1 − k2a2| by (7.51) it suffices to prove Rem(ai, |k1a1 − k2a2|) = 0 (i = 1, 2); for symmetry reasons we only consider i = 1. Abbreviate

q := Quot(a1, |k1a1 − k2a2|),    r := Rem(a1, |k1a1 − k2a2|).

Because of 0 < |k1a1 − k2a2| general properties of Quot and Rem ensure

a1 = q|k1a1 − k2a2| + r,    r < |k1a1 − k2a2|.

From this – using the Step lemma below – we obtain

r = |l1a1 − l2a2| < |k1a1 − k2a2|,  where l1 := Step(a1, a2, k1, k2, q) and l2 := qk2.

(7.52) applied to l1, l2 gives A(l1, l2) → ⊥ and hence 0 = |l1a1 − l2a2| = r.

Lemma (Step).

a1 = q · |k1a1 − k2a2| + r → r = |Step(a1, a2, k1, k2, q)a1 − qk2a2|.

Proof. Let

    Step(a1, a2, k1, k2, q) :=  qk1 − 1,  if k2a2 < k1a1 and 0 < q,
                                qk1 + 1,  otherwise.

Clearly the values are natural numbers. Assume 0 < q. In case k2a2 < k1a1 we have

a1 = q · (k1a1 − k2a2) + r,

hence

    r = (1 − qk1)a1 + qk2a2
      = −(qk1 − 1)a1 + qk2a2
      = −(Step(a1, a2, k1, k2, q)a1 − qk2a2)
      = |Step(a1, a2, k1, k2, q)a1 − qk2a2|,

the last step because r ≥ 0. In case k2a2 ≥ k1a1 we have

a1 = −q · (k1a1 − k2a2) + r,

hence

    r = (qk1 + 1)a1 − qk2a2 = |Step(a1, a2, k1, k2, q)a1 − qk2a2|.

For q = 0 we have Step(a1, a2, k1, k2, 0) = 1 and r = a1 = |1 · a1 − 0 · k2a2|, so the claim is correct.

7.6.2. Extracted terms. The refined A-translation when applied to a formalization of the proof above produces a term eta :=

    [n0,n1]
     [if (0=Rem n0 n1)
       (0@1)
       [if (0<Rem n0 n1)
         ((Rec nat=>nat=>nat=>nat@@nat)
          ([n2,n3]0@0)
          ([n2,f3,n4,n5]
           [if (0=Rem n1(Lin n0 n1(n4@n5)))
             [if (0=Rem n0(Lin n0 n1(n4@n5)))
               (n4@n5)
               [if (0<Rem n0(Lin n0 n1(n4@n5)))
                 (f3 (Step n0 n1(n4@n5)(Quot n0(Lin n0 n1(n4@n5))))
                     (Quot n0(Lin n0 n1(n4@n5))*n5))
                 (0@0)]]
             [if (0<Rem n1(Lin n0 n1(n4@n5)))
               (f3 (Quot n1(Lin n0 n1(n4@n5))*n4)
                   (Step n1 n0(n5@n4)(Quot n1(Lin n0 n1(n4@n5)))))
               (0@0)]])
          n1
          (Step n0 n1(0@1)(Quot n0 n1))
          (Quot n0 n1))
         (0@0)]]

The term extracted via the Dialectica interpretation from a formalization of this proof is etd :=

    [n0,n1]
     [let pf712
       ((Rec nat=>nat@@nat=>nat@@nat)
        ([p3]0@0)
        ([n3,pf4,p5]
         [if (0<Lin n0 n1 p5 impb
              Rem n0(Lin n0 n1 p5)=0 impb
              Rem n1(Lin n0 n1 p5)=0 impb False)
           (pf4 [let p6
                  (Step n0 n1 p5(Quot n0(Lin n0 n1 p5))@
                   Quot n0(Lin n0 n1 p5)*right p5)
                  [if (Lin n0 n1 p6<n3 impb 0<Lin n0 n1 p6 impb False)
                    (Quot n1(Lin n0 n1 p5)*left p5@
                     Step n1 n0(right p5@left p5)(Quot n1(Lin n0 n1 p5)))
                    p6]])
           p5])
        n1)
      [let p2
        [if (0<n1 impb Rem n0 n1=0 impb False)
          (pf712(Step n0 n1(0@1)(Quot n0 n1)@Quot n0 n1))
          (0@1)]
        [if (0<Lin n0 n1 p2 impb
             Rem n0(Lin n0 n1 p2)=0 impb
             Rem n1(Lin n0 n1 p2)=0 impb False)
          (pf712(0@[if (0<n1) 0 2]))
          p2]]]

Application of term-to-expr to etd as well as eta results in a Scheme expression which can be "evaluated", provided we have "defined" (in the sense of the underlying programming language) the functions |Step| and |Lin|:

    (define |Step|
      (lambda (a1)
        (lambda (a2)
          (lambda (p)
            (lambda (q)
              (if (and (< (* (cdr p) a2) (* (car p) a1)) (< 0 q))
                  (- (* q (car p)) 1)
                  (+ (* q (car p)) 1)))))))

    (define |Lin|
      (lambda (a1)
        (lambda (a2)
          (lambda (p)
            (abs (- (* (car p) a1) (* (cdr p) a2)))))))

The result for (((ev (term-to-expr etd)) 66) 27) is (16 . 39). Indeed |16 ∗ 66 − 39 ∗ 27| = 3, which is the greatest common divisor of 66 and 27. For (((ev (term-to-expr eta)) 66) 27) the result is (2 . 5), and again, |2 ∗ 66 − 5 ∗ 27| = 3.
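The descent step of the proof can be run directly as a loop; the following Python sketch is our rendering (lin and step mirror the Scheme definitions of |Lin| and |Step| above, and the loop applies the Step lemma to whichever remainder is still nonzero):

```python
def lin(a1, a2, k1, k2):
    # |k1*a1 - k2*a2|
    return abs(k1 * a1 - k2 * a2)

def step(a1, a2, k1, k2, q):
    # the Step function of the lemma above
    if k2 * a2 < k1 * a1 and 0 < q:
        return q * k1 - 1
    return q * k1 + 1

def euclid(a1, a2):
    # assumes 0 < a2; start with |0*a1 - 1*a2| = a2 > 0
    k1, k2 = 0, 1
    while True:
        d = lin(a1, a2, k1, k2)
        if a1 % d != 0:
            q = a1 // d
            # by the Step lemma the new value of lin is Rem(a1, d) < d
            k1, k2 = step(a1, a2, k1, k2, q), q * k2
        elif a2 % d != 0:
            q = a2 // d
            # symmetric step, with the roles of (a1, k1) and (a2, k2) swapped
            k1, k2 = q * k1, step(a2, a1, k2, k1, q)
        else:
            return k1, k2  # d divides both a1 and a2, and d = |k1*a1 - k2*a2|
```

Since the measure |k1·a1 − k2·a2| strictly decreases in each round, the loop terminates; on 66 and 27 it returns the pair (16, 39) computed by etd above.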

Remarks. As one sees from this example the recursion parameter n is not really used in the computation but just serves as a counter, or more precisely as an upper bound for the number of steps until both remainders are zero. This will always happen if the induction principle is used only in the form of the minimum principle (or, equivalently, <-induction), because then in the extracted terms of <-induction the step term has in its kernel no free occurrence of n.

If one removes n according to this remark it becomes clear that our gcd algorithm is similar to Euclid's. The only difference lies in the fact that we have kept a1, a2 fixed in our proof whereas Euclid changes a1 to a2 and a2 to Rem(a1, a2) provided Rem(a1, a2) > 0 (using the fact that this doesn't change the ideal).

7.7. Notes

Much of the material in the present chapter is due to Troelstra (1973). More information on the BHK-interpretation and its history may be found in (Troelstra and van Dalen, 1988, 1.3, 1.5.3).

The concept of a "non-computational" universal quantifier has been introduced by Berger (1993b), and later – in Berger (2005a) – been renamed into "uniform" universal quantifier. A somewhat related idea in the context of so-called pure type systems has been formulated in Miquel (2001). However, in his Gen rule used to introduce the non-computational quantifier Miquel is much more restrictive: the generalized variable is required not to occur at all in the given proof M, whereas Berger only requires that it is not a computational variable in M, which is expressed here by x ∉ FV(et(M)).

Section 7.3 is based on (Berger, Buchholz, and Schwichtenberg, 2002). It generalizes previously known results since B in ∀x∃yB need not be quantifier-free, but only has to belong to the strictly larger class of goal formulas (defined in 7.3.1). Furthermore we allow unproven lemmata D in the proof of ∀x∃yB, where D is a definite formula (defined in 7.3.1). Closely related classes of formulas have (independently) been introduced by Ishihara (2000).

Other interesting examples of program extraction from classical proofs have been studied by Murthy (1990), by Coquand's group (see e.g. Coquand and Persson (1999)) in a type-theoretic context and by Kohlenbach (1996) using a Dialectica interpretation.

There is also a different line of research aimed at giving an algorithmic interpretation to (specific instances of) the classical double negation rule. It essentially started with Griffin's observation (1990) that Felleisen's control operator C (Felleisen et al., 1987; Felleisen and Hieb, 1992) can be given the type of the stability schema ¬¬A → A. This initiated quite a bit of work aimed at extending the Curry-Howard correspondence to classical logic, e.g. by Barbanera and Berardi (1993), Constable and Murthy (1991), Krivine (1994) and Parigot (1992).

Klaus Weich originally proposed the functional algorithm computing the Fibonacci numbers. The example in 7.3.6 is due to Ulrich Berger. Monika Seisenberger – apart from being a coauthor of (Berger et al., 2001) – and Felix Joachimski have contributed a lot to the Minlog system, particularly to the implementation of the translation of classical proofs into constructive ones. We also benefitted from helpful comments by Peter Selinger and Matteo Slanina, who presented this material in a seminar in Stanford, in the fall of 2000.

The concept of critical predicate symbols in 7.3.1 is taken from (Berger, 1995) and (Berger and Schwichtenberg, 1995).

The history of natural deduction based treatments of the Dialectica interpretation is nicely described in Hernest's thesis (2006):

Natural deduction formulations of the Diller and Nahm (1974) variant of D-interpretation were provided by Diller's students Rath (1978) and Stein (1976). Only in the year 2001 Jørgensen provided a first Natural Deduction formulation of the original Godel's functional interpretation. In the Diller-Nahm setting all choices between the potential realizers of a contraction are postponed to the very end by collecting all candidates and making a single final global choice. In contrast, Jørgensen's formulation respects Godel's original treatment of contraction by immediate (local) choices. Jørgensen devises a so-called "Contraction Lemma" in order to handle (in the given Natural Deduction context) the discharging of more than one copy of an assumption in an Implication Introduction →+. If n + 1 undischarged occurrences of an assumption are to be cancelled in an →+ then Jørgensen uses his Contraction Lemma n times, shifting partial results n times back and forth over the "proof gate" ⊢. We find this not only inefficient from the applied program-extraction perspective but also inelegant since the soundness proof for the D-interpretation complicates unnecessarily w.r.t. contraction. We will instead use the n-selector If^n_τ for equalizing in one single (composed) step all LD-interpretations of the n + 1 undischarged occurrences . . . . The practical gain w.r.t. Jørgensen's solution is that the handling of contraction is directly moved from the proof level to the term level: back and forth shifting over ⊢ is no longer required when building the verifying proof.

In all these natural deduction formulations of the Dialectica interpretation open assumptions are viewed as formulas, and consequently the problem of contractions arises when an application of the implication introduction rule →+ discharges more than one assumption formula. However, it seems to be more in the spirit of the Curry-Howard correspondence (formulas correspond to types, and proofs to terms) to view assumptions as assumption variables. This is particularly important when – say in an implementation – one wants to assign object terms ("realizers", in Godel's T) to proof terms. To see the point, notice that a proof term M may have many occurrences of a free assumption variable uA. The associated realizer et(M) then needs to contain an object variable x_u^{τ(A)} uniquely associated with uA, again with many occurrences. To organize this in an appropriate way it seems mandatory to be able to refer to an assumption A by means of its "label" u.

Section 7.6 is based on Berger and Schwichtenberg (1996).


CHAPTER 8

Linear Two-Sorted Arithmetic

In this final chapter we focus much of the technical/logical work of previous chapters onto theories with limited (more feasible) computational strength. The initial motivation is the surprising result of Bellantoni and Cook (1992) characterizing the polynomial-time functions by the primitive recursion schemes, but with a judiciously placed semicolon first used by Simmons (1988), separating the variables into two kinds (or sorts). The first "normal" kind controls the length of recursions, and the second "safe" kind marks the places where substitutions are allowed. Various alternative names have arisen for the two sorts of variables, which will play a fundamental role throughout this chapter, thus "normal"/"input" and "safe"/"output"; we shall use the input-output terminology. The important distinction here is that input and output variables will not just be of base type, but may be of arbitrary higher type.

We begin by developing a basic version of arithmetic which incorporates this variable separation. This theory EA(;) will have elementary recursive strength (hence the prefix E) and sub-elementary (polynomially bounded) strength when restricted to its Σ1-inductive fragment. EA(;) is a first order theory which we use as a means to illustrate the underlying principles available in such two-sorted situations. Our aim however is to extend the Bellantoni and Cook variable separation to also incorporate higher types. This produces a theory A(;) extending EA(;) with higher type variables and quantifiers, having as its term system a two-sorted version T(;) of Godel's T. T(;) will thus give a functional interpretation for A(;), which has the same elementary computational strength, but is more expressive and applicable.

We then go a stage further in formulating a theory LA(;) all of whose provable recursions are polynomially bounded, not just those in the Σ1-inductive fragment; but to achieve this, an important additional aspect now comes into play. We need the logic to be linear (hence the prefix L) and the corresponding term system LT(;) to have a linearity restriction on higher type output variables in order to ensure that the computational content remains polynomial-time computable.

The following relationships will hold between the theories and their corresponding functional interpretations:

    Arithmetic      A(;)      LA(;)
    ──────────  =  ─────  =  ──────
    Godel's T       T(;)      LT(;)

The leading intuition is of course that one should use the Curry-Howard correspondence between terms in lambda-calculus and derivations in arithmetic. However, in the two-sorted versions we are about to develop, care must be taken to arrive at flexible and easy-to-use systems which can be understood in their own right.

The first recursion theoretic definition of polynomial-time computable functions was given by Cobham (1965) and much later Cook and Kapron (1990) proposed a notion of "basic feasible functional" of higher type, in their system PVω. One should also mention the work of Leivant and Marion (1993), which gave a "tiered" typed λ-calculus characterization of poly-time. However, Buss' (1985) Bounded Arithmetic gave the first proof-theoretic characterization of polynomial-time in terms of provable recursiveness, and then Leivant (1995a,b) characterized it (poly-time) in a "predicative" theory without explicit bounds on quantifiers. "Implicit complexity" (in theories without explicit bounds) subsequently became a topic in itself. Our development is based on EA(;) introduced in Ostrin and Wainer (2005), which reworks Leivant's results in a simpler context, and on the papers Bellantoni et al. (2000) and Schwichtenberg and Bellantoni (2002), where linearity was first introduced in the setting of Godel's T, in conjunction with the Bellantoni-Cook style of two-sorted recursion. However, the notion of linearity used here is very down-to-earth, meaning essentially "no contraction", and it should not be confused with Girard's Linear Logic and its (1998) "light" variant. Other related work is that of Bellantoni and Hofmann (2002), based on Hofmann's (1999) concept of "non-size-increasing" recursion. A quite different, and particularly simple, approach to proof theoretic characterizations of poly-time is Marion's (2001), where quantifiers are restricted to "actual terms".

8.1. Provable Recursion and Complexity in EA(;)

In this, and the following sections, we consider ways of characterising the elementary functions (and complexity-subclasses of them) by proof-theoretic systems which have a more immediate computational relevance than provable Σ1-definability in I∆0(exp) say. Thus we require new, alternative notions of "provable recursiveness", more directly related to recursion and computation than to logical definability. One such alternative approach, a very natural one due to, and developed extensively by, Leivant (1995a,b), is based on recursive definability in the equation calculus. EA(;) will have the same strength as Leivant's "two-sorted intrinsic theory" over N but is different in its conception, the emphasis being on syntactic simplicity. The axioms are arbitrary equational definitions of partial recursive functions, and we call a function f, introduced by a system of defining equations E, "provably recursive" if ∃a(f(x⃗) ≃ a) is derivable from those axioms E. Of course the logic has to be set up carefully so as to prevent proofs of ∃a(f(x⃗) ≃ a) when f is only partially defined. Furthermore, the induction rules must be sufficiently restrictive that only functions of finitely iterated exponential complexity are provably total. In contrast to I∆0(exp) however, the restriction will not be on the classes of induction formulas allowed, but on the kinds of variables allowed, as the genesis of the theory lies in the "normal-safe" recursion schemes of Bellantoni and Cook (1992). They show how the polynomial-time functions can be defined by an amazingly simple, two-sorted variant of the usual primitive recursion schemes, in which (essentially) one is only allowed to substitute for safe variables and do recursion over normal variables. So what if one imposes the same kind of variable separation on formal arithmetic? Then one obtains a theory with two kinds of number variables: "safe" or "output" variables which may be quantified over, and "normal" or "input" variables which control the lengths of inductions and only occur free! The analogies between this logically weak theory and classical arithmetic are quite striking.
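The variable separation can be illustrated by a small Python sketch (our rendering, with the normal argument written first and the safe one last): recursion may only be driven by a normal argument, and a recursively computed value may only be passed into a safe position.

```python
def add(n, b):
    # safe addition: recursion on the normal argument n; the safe
    # argument b is only substituted into, never recursed on
    return b if n == 0 else 1 + add(n - 1, b)

def mul(n, m):
    # multiplication with both arguments normal; the recursive value
    # mul(n-1, m) appears only in the safe position of add
    return 0 if n == 0 else add(m, mul(n - 1, m))
```

Since a recursive value occurs only safely, it can never drive a further recursion; in the Bellantoni-Cook setting this is what blocks the iterated recursions that produce exponential growth.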

The key notion is that of “definedness” of a term t, expressed by

t↓ := ∃a(t ' a)

and it is this definition which highlights the principal logical restrictionwhich must be applied to the ∃-introduction and (dually) ∀-elimination rulesof the theory described below. For if arbitrary terms t were allowed aswitnesses in ∃-introduction, then from the axiom t ' t we could immediatelydeduce ∃a(t ' a) and hence in particular, f(x)↓ for every partial recursivef ! This is clearly not what we want. Thus we make the restriction thatonly “basic” terms: variables or 0 or their successors or predecessors, maybe used as witnesses. This is not quite so restrictive as it first appears, sincefrom the equality axiom

t ' a→ A(t)→ A(a)

we can derive immediately

t↓ → A(t)→ ∃aA(a).

Thus a term may be used to witness an existential quantifier only when it has been proven to be defined. In particular, if f is introduced by a defining equation f(x) ' t then to prove f(x)↓ we first must prove (compute) t↓. Here we can begin to see that, provided we formulate the theory carefully enough, proofs in its Σ1 fragment will correspond to computations in the equation calculus, and bounds on proof-size will yield complexity measures.

8.1.1. The theory EA(;). There will be two kinds of variables: “input” (or “normal”) variables denoted n,m, . . . , and “output” (or “safe”) variables denoted a, b, c, . . . , both here intended as ranging over natural numbers. Output variables may be bound by quantifiers, but input variables will always be free. The basic terms are: variables of either kind, the constant 0, or the result of repeated application of the successor S or predecessor P. General terms are built up in the usual way from 0 and variables of either kind, by application of S, P and arbitrary function symbols f, g, h, . . . denoting partial recursive functions given by sets E of Herbrand-Godel-Kleene-style defining equations.

Atomic formulas will be equations t1 ' t2 between arbitrary terms, and formulas A,B, . . . are built from these by applying propositional connectives and quantifiers ∃a, ∀a over output variables a. The negation ¬A of a formula A will be defined as A → F, where (as before) F stands for “false”.

We shall work in minimal, rather than classical, logic. This is computationally more natural, and is not a restriction for us here, since (as has already been shown) a classical proof of f(n)↓ can be transformed, by the double-negation interpretation, into a proof in minimal logic of

(∃a((f(n) ' a → ⊥) → ⊥) → ⊥) → ⊥

and since minimal logic has no special rule for ⊥ we could replace it throughout by the formula f(n)↓ and hence obtain an outright proof of f(n)↓, since the premise of the above implication becomes provable.

It is not necessary to list the propositional rules. However, as stressed above, the quantifier rules need to be restricted to basic terms as witnesses. Thus the ∀− rule is

∀aA(a)    t
----------- ∀−
   A(t)

where t is a basic term, and thus from the ∃+ axiom one obtains A(t) → ∃aA(a), but again, only when t is basic.

Two further principles are needed, describing the data-type N, namely induction

A(0) → ∀a(A(a) → A(Sa)) → A(t)

where t is a basic term on an input variable, and cases

A(0)→ ∀aA(Sa)→ ∀aA(a).

Definition. Our notion of Σ1-formula will be restricted to those of the form ∃~aA(~a ) where A is a conjunction of atomic formulas. A typical example is f(~n )↓. Note that a conjunction of such Σ1-formulas is provably equivalent to a single Σ1-formula, by distributivity of ∃ over ∧.

Definition. A k-ary function f is provably recursive in EA(;) if it can be defined by a system E of equations such that, with input variables n1, . . . , nk,

E ` f(n1, . . . , nk)↓

where E denotes the set of universal closures (over output variables) of the defining equations in E.

8.1.2. Elementary functions are provably recursive. Let E be a system of defining equations containing the usual primitive recursions for addition and multiplication:

a+ 0 ' a, a+ Sb ' S(a+ b),

a · 0 ' 0, a · Sb ' (a · b) + a

and further equations of the forms

p0 ' S0, pi ' pi0 + pi1 , pi ' pi0 · b

defining a sequence pi : i = 0, 1, 2, . . . of polynomials in variables ~b = b1, . . . , bn. Henceforth we allow p(~b ) to stand for any one of the polynomials so generated (clearly all polynomials can be built up in this way).

Definition. The progressiveness of a formula A(a) with distinguished free variable a is expressed by the formula

ProgaA := A(0) ∧ ∀a(A(a)→ A(Sa))

thus the induction principle of EA(;) is equivalent to

ProgaA→ A(n).


The following lemmas derive extensions of this principle, first to any polynomial in ~n, then to any finitely iterated exponential. In the next subsection we shall see that this is the most that EA(;) can do.

Lemma. Let p(~b ) be any polynomial defined by a system of equations E as above. Then for every formula A(a) we have, with input variables substituted for the variables of p,

E, ProgaA ` A(p(~n ))

Proof. Proceed by induction over the build-up of the polynomial p according to the given equations E. We argue in an informal natural deduction style, deriving the succedent of a sequent from its antecedent.

If p is the constant 1 (that is, S0) then A(S0) follows immediately from A(0) and A(0) → A(S0), the latter arising from substitution of the defined, basic term 0 for the universally quantified variable a in ∀a(A(a) → A(Sa)).

Suppose p is p0 + p1 where, by the induction hypothesis, the result is assumed for each of p0 and p1 separately. First choose A(a) to be the formula a↓ and note that in this case ProgaA is provable. Then the induction hypothesis applied to p0 gives p0(~n )↓. Now again with an arbitrary formula A, we can easily derive

E, ProgaA, A(a) ` Progb(a+ b↓ ∧A(a+ b))

because if a + b is assumed to be defined, it can be substituted for the universally quantified a in ∀a(A(a) → A(Sa)) to yield A(a + b) → A(a + Sb). Therefore by the induction hypothesis applied to p1 we obtain

E, ProgaA, A(a) ` a+ p1(~n )↓ ∧A(a+ p1(~n ))

and hence

E, ProgaA ` ∀a(A(a) → A(a + p1(~n ))).

Finally, substituting the defined term p0(~n ) for a, and using the induction hypothesis on p0 to give A(p0(~n )), we get the desired result

E, ProgaA ` A(p0(~n ) + p1(~n )).

Suppose p is p1 · b where b is a fresh variable not occurring in p1. By the induction hypothesis applied to p1 we have as above, p1(~n )↓ and

E, ProgaA ` ∀a(A(a)→ A(a+ p1(~n )))

for any formula A. Also, from the defining equations E and since p1(~n )↓, we have p1(~n ) · 0 ' 0 and p1(~n ) · Sb ' (p1(~n ) · b) + p1(~n ). Therefore we can prove

E, ProgaA ` Progb(p1(~n ) · b↓ ∧A(p1(~n ) · b))

and an application of the EA(;)-induction principle on variable b gives, for any input variable n,

E, ProgaA ` p1(~n ) · n↓ ∧A(p1(~n ) · n)

and hence E, ProgaA ` A(p(~n )) as required.


Definition. Extend the system of equations E above by adding the new recursive definitions:

f1(a, 0) ' Sa, f1(a, Sb) ' f1(f1(a, b), b)

and for each k = 2, 3, . . . ,

fk(a, b1, . . . , bk) ' f1(a, fk−1(b1, . . . , bk))

so that f1(a, b) = a + 2^b and fk(a,~b ) = a + 2^{fk−1(~b )}. Finally define

2k(p(~n )) ' fk(0, . . . , 0, p(~n ))

for each polynomial p given by E, and similarly for exponential bases otherthan 2.
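A direct transcription of these equations into Python (ours, purely as a sanity check; the names f1, f and two_k are hypothetical) confirms the stated growth rates:

```python
# Sanity check (ours, not the book's) of the recursions above:
# f1(a, 0) = Sa,  f1(a, Sb) = f1(f1(a, b), b)  gives  f1(a, b) = a + 2^b,
# and fk(a, b1, ..., bk) = f1(a, f_{k-1}(b1, ..., bk)) iterates this.

def f1(a, b):
    if b == 0:
        return a + 1
    return f1(f1(a, b - 1), b - 1)   # the increment doubles each step

def f(k, a, *bs):                    # fk(a, b1, ..., bk)
    if k == 1:
        return f1(a, *bs)
    return f1(a, f(k - 1, *bs))      # b1 plays the role of 'a' below

def two_k(k, n):                     # 2_k(n) = fk(0, ..., 0, n)
    return f(k, *([0] * k), n)

assert f1(3, 4) == 3 + 2 ** 4
assert two_k(1, 3) == 2 ** 3
assert two_k(2, 3) == 2 ** (2 ** 3)
```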

Lemma. In EA(;) we can prove, for each k and any formula A(a),

E, ProgaA ` A(2k(p(~n ))).

Proof. First note that by a similar argument to one used in the previous lemma (and going back all the way to Gentzen) we can prove, for any formula A(a),

E, ProgaA ` Progb∀a(A(a)→ f1(a, b)↓ ∧A(f1(a, b)))

since the b := 0 case follows straight from ProgaA, and the induction step from b to Sb follows by appealing to the hypothesis twice: from A(a) we first obtain A(f1(a, b)) with f1(a, b)↓, and then (by substituting the defined f1(a, b) for the universally quantified variable a) from A(f1(a, b)) follows A(f1(a, Sb)) with f1(a, Sb)↓, using the defining equations for f1.

The result is now obtained straightforwardly by induction on k. Assuming E and ProgaA we derive

Progb∀a(A(a)→ f1(a, b)↓ ∧A(f1(a, b)))

and then by the previous lemma,

∀a(A(a)→ f1(a, p(~n ))↓ ∧A(f1(a, p(~n ))))

and then with a := 0 and using A(0) we have 21(p(~n ))↓ and A(21(p(~n ))), which is the case k = 1. For the step from k to k + 1 do the same, but instead of the previous lemma use the induction to replace p(~n ) by 2k(p(~n )).

Theorem. Every elementary (E3) function is provably recursive in the theory EA(;), and every sub-elementary (E2) function is provably recursive in the fragment which allows induction only on Σ1-formulas.

Proof. Any elementary function g(~n ) is computable by a register machine M (working in unary notation with basic instructions “successor”, “predecessor”, “transfer” and “jump”) within a number of steps bounded by 2k(p(~n )) for some fixed k and polynomial p. Let r1(c), r2(c), . . . , rn(c) be the values held in its registers at step c of the computation, and let i(c) be the number of the machine instruction to be performed next. Each of these functions depends also on the input parameters ~n, but we suppress mention of these for brevity. The state of the computation 〈i, r1, r2, . . . , rn〉 at step c + 1 is obtained from the state at step c by performing the atomic act dictated by the instruction i(c). Thus the values of i, r1, . . . , rn at step c + 1 can be defined from their values at step c by a simultaneous recursive definition involving only the successor S, predecessor P and definitions by cases C. So now, add these defining equations for i, r1, . . . , rn to the system E above, together with the equations for predecessor and cases:

P (0) ' 0,    P (Sa) ' a,
C(0, a, b) ' a,    C(Sd, a, b) ' b

and notice that the cases rule built into EA(;) ensures that we can prove ∀d,a,b C(d, a, b)↓. Since the passage from one step to the next involves only applications of C or basic terms, all of which are provably defined, it is easy to convince oneself that the Σ1-formula

∃~a (i(c) ' a0 ∧ r1(c) ' a1 ∧ · · · ∧ rn(c) ' an)

is provably progressive in variable c. Call this formula A(~n, c). Then by the second lemma above we can prove

E ` A(~n, 2k(p(~n )))

and hence, with the convention that the final output is the value of r1 when the computation terminates,

E ` r1(2k(p(~n )))↓.

Hence the function g given by g(~n ) ' r1(2k(p(~n ))) is provably recursive.

In just the same way, but using only the first lemma above, we see that any sub-elementary function (which, e.g. by Rodding (1968), is register machine computable in a number of steps bounded by just a polynomial of its inputs) is provably recursive in the Σ1-inductive fragment. This is because the proof of A(~n, p(~n )) by the first lemma only uses inductions on substitution instances of A, and here, A is Σ1.
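The simultaneous recursion described in the proof can be sketched as follows (our illustration, not from the text); the three-instruction program and the names step and run are hypothetical:

```python
# Sketch (ours): the state of a unary register machine evolves by a
# simultaneous recursion in the step counter c, using only successor,
# predecessor P and cases C.

def P(a):                 # predecessor
    return a - 1 if a > 0 else 0

def C(d, a, b):           # cases: C(0, a, b) = a, C(Sd, a, b) = b
    return a if d == 0 else b

# Hypothetical 3-instruction program computing r1 := r1 + r2:
#   0: if r2 = 0 jump to 2, else r2 := P(r2) and go to 1
#   1: r1 := S(r1); jump to 0
#   2: halt
def step(i, r1, r2):
    if i == 0:
        return C(r2, 2, 1), r1, C(r2, r2, P(r2))
    if i == 1:
        return 0, r1 + 1, r2
    return 2, r1, r2      # halted: state is fixed

def run(r1, r2, steps):   # unfold the recursion 'steps' times
    state = (0, r1, r2)
    for _ in range(steps):
        state = step(*state)
    return state

assert run(3, 4, 20) == (2, 7, 0)   # enough steps: r1 holds 3 + 4
```

Once the machine halts, the state is a fixed point of step, so any sufficiently large step bound (here the 2k(p(~n )) of the proof) yields the output.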

8.1.3. Provably recursive functions are elementary. Because the input variables of EA(;), once introduced by an induction, do not get universally quantified thereafter, they are never substituted by more complex terms (as happens in standard single-sorted theories like PA). This means that, for any fixed numerical assignment to the inputs, the inductions can be “unravelled” directly within the theory EA(;) itself, but the height of the resulting unravelled proof will depend linearly on the values of the numerical inputs. This theory then admits normalization with iterated exponential complexity. Therefore a proof of f(~n )↓ will be transformed into a normal proof of size elementary in ~n. This process is completely uniform in ~n. Hence an elementary complexity bound for the function f itself may be extracted, and f is therefore elementary.

For Σ1-proofs the argument is similar, but the simpler cut formulas that occur when one unravels inductions will now lead to polynomial bounds, because for fixed inputs n the height of the unravelled proof will in fact be logarithmic in n (since it is a binary branching tree) and so the size of the proof, and hence the computation of f(n), will be exponential in log n (more precisely 2^{d·log n} where d is the number of nested inductions) and thus polynomial in n. If one begins instead with binary, rather than unary, representations of numbers, then the complexity would be polynomial in log n. Thus, in unary style the provable functions of the Σ1-inductive fragment of EA(;) will be the “subelementary” or “linear-space” functions, and in binary style, the poly-time functions. We will return to this and give a more detailed proof later on.

8.1.4. Two-sorted arithmetic in higher types. The theory EA(;) provides a very basic setting in which more feasible computational notions may be developed and proven, but in order to build a more robust theory applicable to program development it would be natural to extend EA(;) to a theory A(;) incorporating variables in all finite types and a more elaborate and expressive term structure. The theory A(;) will be to EA(;) as HAω is to HA.

We shall work with two forms of arrow types, abstraction terms and quantifiers:

N → σ,   λnr,   ∀nA

as well as

ρ→ σ,   λar,   ∀aA

and a corresponding syntactic distinction between input and output (typed) variables. The intuition is that a formula ∀nA may be proved by induction, but a formula ∀aA may not, and similarly a function of type N → σ may be defined by recursion on its argument, but a function of type N→ σ may not.

The formulas of A(;) will be built from prime formulas by two forms of implication A → B and A → B and the two forms above of universal quantifiers. The existential quantifier, conjunction and disjunction will be defined inductively, as was done previously in 7.1.5.

The induction axiom is

Indn,A : ∀n(A(0)→ ∀a(A(a)→ A(Sa))→ A(n))

for all “safe” formulas A, i.e., all those not containing → or ∀n. In addition we have all the other usual axioms of arithmetic in finite types, as listed in 7.1, with the output arrow → and universal quantification ∀a over output variables only.

Though it is far more expressive, A(;) will have the same elementary recursive strength as EA(;). The underlying computational power of the theory is incorporated into its term system T(;), which we now develop.

We shall later restrict A(;) to a linear style logic LA(;) with a corresponding term system LT(;). The consequence of this will be that terms of arbitrary type will then be of polynomial complexity only, so the system will automatically yield polynomial-time program extraction. Complexity is of course an important consideration when extracting content from proofs, and the first author’s Minlog system has this capability, since both A(;) and LA(;) are incorporated into it.

8.2. A Two-Sorted Variant T(;) of Godel’s T

We define a two-sorted variant T(;) of Godel’s T, by lifting the approach of Simmons (1988) and Bellantoni and Cook (1992) to higher types. It is shown that the functions definable in T(;) are exactly the elementary functions. The proof is based on the observation that β-normalization of terms of rank ≤ k has elementary complexity. Generally, the two-sortedness restriction allows R to be unfolded in a controlled way, rather as inductions are allowed to be unravelled in EA(;).

8.2.1. Higher order terms with input/output restrictions. We shall work with two forms of arrow types and abstraction terms:

N → σ,   λnr

as well as

ρ→ σ,   λar

and a corresponding syntactic distinction between input and output (typed) variables. The intuition is that a function of type N → σ may recurse on its argument. On the other hand, a function of type ρ→ σ is not allowed to recurse on its base type argument.

Formally we proceed as follows. The types are

ρ, σ, τ ::= ι | N → σ | ρ→ σ

with a finitary base type ι. A type is called safe if it does not contain the input arrow →.

The constants are the constructors for all the finitary base types, containing output arrows only, and the recursion and cases operators. The typing of the recursion operators requires usage of both → and→ to ensure sufficient control over their unfoldings. In the present case of finitary base types the recursion operator w.r.t. ι = µα ~κ and result type τ is Rτι of type

ι → δ0 → . . .→ δk−1 → τ

where the step types δi are of the form ~ρ→ ~ι→ ~τ → τ , the ~ρ,~ι corresponding to the components of the object of type ι under consideration, and ~τ to the previously defined values. Recall that the first argument is the one that is recursed on and hence must be an input term, so the type starts with ι →. For example, the recursion operator RτN over (unary) natural numbers has type

N → τ → (N→ τ → τ)→ τ.

In general, however, we shall require simultaneous recursion operators as described in 6.2.1, but now the type of the jth component will be of the form ιj → δ0 → . . .→ δk−1 → τj .

The typing for the cases variant of recursion is less problematic and can be done with the output arrow → only. Recall that in the cases operator no recursive calls occur: one just distinguishes cases according to the outer constructor form. Thus the cases operator is Cτι of type

ι→ δ0 → . . .→ δk−1 → τ

where all step types δi now have the simpler form ~ρ→ ~ι→ τ . For example CτN has type

N→ τ → (N→ τ)→ τ.

Because of its more convenient typing we shall normally use the cases operator rather than the recursion operator for explicit base types.

Note, however, that both the recursion and the cases operators need to be restricted to safe value types τ . This restriction is necessary in the proof of the Normalization Theorem below (analogously to cut reduction of EA(;)-formulas which, as the reader will recall, only have quantification over output variables).

Terms are built from these constants and typed input and output variables by introduction and elimination rules for the two type forms N → σ and ρ→ σ, i.e.,

n | a | Cρ (constant) |
(λnrσ)N→σ | (rN→σsN)σ   (s an input term) |
(λaρrσ)ρ→σ | (rρ→σsρ)σ,

where a term s is called an input term if all its free variables are input variables.

A function f is said to be definable in T(;) if there is a closed term tf : N . . . N N (each arrow ∈ {→, →}) denoting this function. Notice that it is always desirable to have more output arrows → in the type of tf , because then there are fewer restrictions on its argument terms.

8.2.2. Examples. In EA(;), the functions of interest were provided by Herbrand-Godel-Kleene-style defining equations, which is appropriate for a first order theory. However, in the present setting of higher-order theories we have to prove the existence of such functions, and moreover we must decide which are input or output arguments. We will view input positions as a convenient way to control the size of intermediate computations, which is well known to be a crucial requirement for feasible definitions of functions. For ease of reading, we use n for input and a, b for output variables of type N, and p for general output variables.

Elementary functions. Addition can be defined by a term t+ of type N→ N → N. The recursion equations are

a+ 0 := a, a+ Sn := S(a+ n),

and the representing term is

t+ := λa,n.RNna(λ ,p.Sp).

The predecessor function P can be defined by a term tP of type N → N if we use the cases operator C:

tP := λa.CNa0(λbb).

From the predecessor function we can define modified subtraction −· :

a−· 0 := a, a−· Sn := P (a−· n)

by the term

t−· := λa,n.RNna(λ ,p.Pp).

If f is defined from g by bounded summation f(~n, n) := Σi<n g(~n, i), i.e.,

f(~n, 0) := 0, f(~n,Sn) := f(~n, n) + g(~n,Sn)

and we have a term tg of type N → . . . → N → N defining g, then we can build a term tf of the same type defining f by

tf := λ~n,n.RNn0(λ ,p.p+ (tg~nn)).
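Read operationally, the recursion and cases operators and the example terms above can be sketched in Python (our transcription, not the book's syntax; R, C and the lower-case function names are ours):

```python
# Sketch (ours) of the recursion operator R_N and cases operator C_N
# over unary numerals, with the T(;) example terms for addition,
# predecessor, modified subtraction and bounded summation.

def R(n, base, step):          # R_N : N -> tau -> (N -> tau -> tau) -> tau
    acc = base
    for i in range(n):         # unfold the recursion n times
        acc = step(i, acc)     # step receives predecessor and previous value
    return acc

def C(n, zero_case, succ_case):   # C_N : N -> tau -> (N -> tau) -> tau
    return zero_case if n == 0 else succ_case(n - 1)

plus = lambda a, n: R(n, a, lambda _, p: p + 1)       # t_+
pred = lambda a: C(a, 0, lambda b: b)                 # t_P
minus = lambda a, n: R(n, a, lambda _, p: pred(p))    # t_minus (mod. subtr.)
def bsum(g, n):                                       # sum of g(i) for i < n
    return R(n, 0, lambda i, p: plus(p, g(i)))

assert plus(3, 4) == 7
assert minus(2, 5) == 0        # modified subtraction truncates at 0
assert bsum(lambda i: i, 5) == 10
```

Note that only the first argument of R is ever recursed on, mirroring the input position ι → in the typing above.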


Higher type definitions. Consider iteration I(n, f) = f^n:

I(0, f, a) := a,

I(n+ 1, f, a) := I(n, f, f(a)),

or

I(0, f) := id,

I(n+ 1, f) := I(n, f) ◦ f.

It can be defined by a term with f a parameter of type N→ N:

If := λn.RN→Nn(λaa)(λ ,p,a(pN→N(fa))).

In T(;), f can be either an input or an output variable, but in LT(;), f will need to be an input variable, because the step argument of recursion is an input argument.
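The first form of iteration can be transcribed as follows (our sketch; iterate is a hypothetical name standing for the term If above):

```python
# Iteration I(n, f) = f^n, sketched (ours) after the recursor at value
# type N -> N: the step composes the previous functional p with f.

def iterate(n, f):             # cf. I_f = lambda n. R^{N->N} n id (step)
    if n == 0:
        return lambda a: a     # base: the identity
    p = iterate(n - 1, f)      # previous functional
    return lambda a: p(f(a))   # step: a |-> p(f(a))

assert iterate(3, lambda a: a + 2)(0) == 6
assert iterate(4, lambda a: 2 * a)(1) == 16
```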

For the general definition of iteration, let the pure safe types ρk be defined by ρ0 := N and ρk+1 := ρk → ρk. Then we can define

Inak . . . a0 := ak^n ak−1 . . . a0,

with ak of type ρk. These variables ak must be output variables, because the value type of a recursion is required to be safe. Therefore, the definition F0ak . . . a0 := Ia0ak . . . a0 which, as noted before, is sufficient to generate all of Godel’s T, is not possible: Ia0 is not allowed.

This observation also confirms the necessity of the restrictions on the type of R. We must require that the value type is a safe type, for otherwise we could define

IE := λn.RN→Nn(λmm)(λ ,p,m(pN→N(Em))),

and IE(n,m) = E^n(m), a function of superelementary growth.

We also need to require that the “previous”-variable is an output variable, because otherwise we could define

S := λn.RNn0(λ ,m(Em)) (superelementary).

Then S(n) = E^n(0).

8.2.3. Elementary functions are definable. We now show that in spite of our restrictions on the formation of types and terms we can define functions of exponential growth.

Probably the easiest function of exponential growth is B(n, a) = a + 2^n, of type B : N → N→ N, with the defining equations

B(0, a) = a+ 1, B(n+ 1, a) = B(n,B(n, a)).

We formally define B as a term in T(;) by

B := λn.RN→NnS(λ ,p,a(pN→N(pa))).

Notice that this will not be a legal definition in the linear term system LT(;), because of the double occurrence of the higher type variable p. From B we can define the exponential function E := λn.Bn0 of type E : N → N, and also iterated exponential functions like λn.E(En).
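A quick check (ours) that the defining equations for B indeed give a + 2^n, making visible the two recursive calls, i.e. the double occurrence of p that LT(;) will forbid:

```python
# B(0, a) = a + 1,  B(n+1, a) = B(n, B(n, a))  gives  B(n, a) = a + 2^n.

def B(n, a):
    if n == 0:
        return a + 1
    return B(n - 1, B(n - 1, a))   # 'p' used twice: p(p(a))

E = lambda n: B(n, 0)              # E(n) = 2^n

assert B(5, 3) == 3 + 2 ** 5
assert E(10) == 1024
```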

Theorem. For every elementary function f there is a term tf of type N → . . . → N → N defining f as a function on inputs.


Proof. We use the characterization in 2.2.3 of the class E of elementary functions: it consists of those number theoretic functions which can be defined from the initial functions: constant 0, successor S, projections (onto the ith coordinate), addition +, modified subtraction −· , multiplication · and exponentiation 2^x, by applications of composition and bounded minimization.

Recall that bounded minimization

f(~n,m) = µk<m(g(~n, k) = 0)

is definable from bounded summation and −· :

f(~n,m) = Σi<m (1 −· Σk≤i (1 −· g(~n, k))).

The claim follows from the examples above.
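The identity behind this definition can be checked directly (our sketch; monus, bounded_mu and via_sum are hypothetical names):

```python
# Checking (ours, not from the book) the bounded-minimization identity
#   mu_{k<m}(g(k) = 0) = sum_{i<m} (1 -. sum_{k<=i} (1 -. g(k))),
# where -. is modified (truncated) subtraction.

def monus(a, b):
    return a - b if a >= b else 0

def bounded_mu(g, m):              # least k < m with g(k) = 0, else m
    for k in range(m):
        if g(k) == 0:
            return k
    return m

def via_sum(g, m):                 # the bounded-summation expression
    return sum(monus(1, sum(monus(1, g(k)) for k in range(i + 1)))
               for i in range(m))

g = lambda k: [2, 1, 0, 3, 0][k % 5]
for m in range(6):
    assert bounded_mu(g, m) == via_sum(g, m)
```

The inner sum counts the zeros of g among k ≤ i, so each summand is 1 exactly while no zero has yet appeared; summing over i < m counts the steps up to the first zero.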

The main problem with the representation of the elementary functions in the theorem above is that they have input arguments only. This prevents substitution of terms involving output variables, which is a severe restriction on the use of such functions in practice. A possible solution is to (1) introduce an additional input argument acting as a bound for the results of intermediate computations, and (2) replace the recursion operator by the cases operator as much as possible, exploiting the fact that the latter is of safe type. For example, addition can be obtained as f+(a, b,m) with f+ of type N→ N→ N → N, defined by

f+(a, b, 0) := 0,
f+(a, b,m+ 1) := b if a = 0, and f+(P (a), b,m) + 1 otherwise,

where P is the predecessor function of type N → N defined above, using the cases operator. Then

a, b ≤ m→ f+(a, b,m) = a+ b.

Similarly, multiplication can be obtained as f×(a, b,m) with f× of type N→ N→ N → N, by

f×(a, b, 0) := 0,

f×(a, b,m+ 1) := 0 if b = 0, and f+(f×(a, P (b),m), a,m+ 1) otherwise.

Then

a, b ≤ m → f×(a, b,m^2) = a · b.
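The role of the bound m can be seen in a small Python transcription (ours; f_plus and f_times are hypothetical names):

```python
# Bounded-recursion versions (sketch, ours) of addition and
# multiplication: the extra argument m is an input bound on the number
# of unfoldings, so the result is correct only once m is large enough.

def P(a):
    return a - 1 if a > 0 else 0

def f_plus(a, b, m):
    if m == 0:
        return 0
    return b if a == 0 else f_plus(P(a), b, m - 1) + 1

def f_times(a, b, m):
    if m == 0:
        return 0
    return 0 if b == 0 else f_plus(f_times(a, P(b), m - 1), a, m)

assert f_plus(3, 4, 10) == 7
assert f_plus(3, 4, 2) != 7          # bound too small: wrong answer
assert f_times(3, 4, 5 ** 2) == 12
```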

Generally, we have

Theorem. For every n-ary elementary function f we can find a T(;)-term tf of type N→ . . .→ N → N such that, for some k,

~a ≤ m→ tf (~a, 2k(m)) = f(~a ).

Proof. We proceed as in the theorem in 8.1.2. For arguments ~a with ~a ≤ m, any elementary function f(~a ) is computable by a register machine M (working in unary notation with basic instructions “successor”, “predecessor”, “transfer” and “jump”) within a number of steps bounded by 2k(m) for some fixed k. Let r1(n), r2(n), . . . , rl(n) be the values held in its registers at step n of the computation, and let i(n) be the number of the machine instruction to be performed next. Each of these functions depends also on the input parameter m, but we suppress mention of this for brevity. The state of the computation 〈i, r1, r2, . . . , rl〉 at step n + 1 is obtained from the state at step n by performing the atomic act dictated by the instruction i(n). Thus the values of i, r1, . . . , rl at step n + 1 can be defined from their values at step n by a simultaneous recursive definition involving only the successor, predecessor and definitions by cases. The terms representing this will be of the (simultaneous) form

tj := λ~aλn.R~N, ~Nj n0~a~0(λ ,~pC0)(λ ,~pC1) . . . (λ ,~pCl)

where j ≤ l, 0~a~0 are the initial values of i, r1, r2, . . . , rl and C0, C1, . . . , Cl are terms which predict the next values of i, r1, r2, . . . , rl given their previous ones ~p. The required term tf will then be t1, assuming r1 is the output register.

8.2.4. Definable functions are elementary. We give an elementary upper bound on the complexity of functions definable in T(;). This will be achieved by a careful analysis of the normalization process. Since the complexity of β-normalization is well known to be elementary, we can treat it separately from the elimination of the recursion operator.

Recall the conversion rules (6.3) for the recursion operator R and (6.4) for the cases operator C. In addition we need the β-conversion rule (6.1), which we will employ in a slightly generalized form; see below. The η-conversion rule (6.2) is not needed, since we are interested in the computation of numerals only. In fact, we can assume that all recursion and cases operators are “η-expanded”, i.e., appear with sufficiently many arguments for the conversion rules to apply: if not, apply them to sufficiently many new variables of the appropriate types and abstract them in front. This η-expansion process clearly does not change the intended meaning of the term. Notice that the property that a term has only η-expanded recursion and cases operators is preserved under the conversion rules (since η-conversion is left out).

The size (or length) ||r|| of a term r is the number of occurrences of constructors, variables and constants in r: ||x|| = ||C|| = 1, ||λnr|| = ||λar|| = ||r||+ 1, and ||rs|| = ||r||+ ||s||+ 1.

Let us first consider β-normalization. Here the distinction between input and output variables and our two type formers → and → plays no role. It will be convenient to allow generalized β-conversion:

(λ~x,xr(~x, x))~ss 7→ (λ~xr(~x, s))~s.

β-redexes are instances of the left side of the β-conversion rule. A term is said to be in β-normal form if it does not contain a β-redex.

We want to show that every term reduces to a β-normal form. This can be seen easily if we follow a certain order in our conversions. To define this order we have to make use of the fact that all our terms have types.

A β-redex (λ~x,xr(~x~ρ, xρ))~ss is also called a cut with cut-type ρ. By the level of a cut we mean the level of its cut-type. The cut-rank of a term r is the least number bigger than the levels of all cuts in r. Now let t be a term of cut-rank k + 1. Pick a cut of the maximal level k in t, such that s does not contain another cut of level k. (E.g., pick the rightmost cut of level k.) Then it is easy to see that replacing the picked occurrence of (λ~x,xr(~x~ρ, xρ))~ss in t by (λ~xr(~x, s))~s reduces the number of cuts of the maximal level k in t by 1. Hence

Theorem (β-Normalization). We have an algorithm which reduces any given term into a β-normal form.

We now want to give an estimate of the number of conversion steps our algorithm takes until it reaches the normal form. The key observation for this estimate is the obvious fact that replacing one occurrence of

(λ~x,xr(~x, x))~ss by (λ~xr(~x, s))~s.

in a given term t at most squares the size of t.

An elementary bound Ek(l) for the number of steps our algorithm takes to reduce the rank of a given term of size l by k can be derived inductively, as follows. Let E0(l) := 0. To obtain Ek+1(l), first note that by induction hypothesis it takes ≤ Ek(l) steps to reduce the rank by k. The size of the resulting term is ≤ l^{2^n} where n := Ek(l), since any step (i.e., β-conversion) at most squares the size. Now to reduce the rank by one more, we convert – as described above – one by one all cuts of the present rank, where each such conversion does not produce new cuts of this rank. Therefore the number of additional steps is bounded by the size l^{2^n} of the term. Hence the total number of steps to reduce the rank by k + 1 is bounded by

Ek(l) + l^{2^{Ek(l)}} =: Ek+1(l).

Theorem (Upper bound for the complexity of β-normalization). The β-normalization algorithm given in the proof above takes at most Ek(l) steps to reduce a given term of cut-rank k and size l to normal form, where

E0(l) := 0,    Ek+1(l) := Ek(l) + l^{2^{Ek(l)}}.
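The recurrence for Ek(l) can be computed directly (our sketch), which makes its finitely iterated exponential growth evident:

```python
# E_0(l) = 0,  E_{k+1}(l) = E_k(l) + l^(2^(E_k(l))): the step-count
# bound for reducing the cut-rank of a size-l term by k.

def E(k, l):
    if k == 0:
        return 0
    e = E(k - 1, l)
    return e + l ** (2 ** e)

assert E(1, 3) == 3              # 0 + 3^(2^0)
assert E(2, 3) == 3 + 3 ** 8     # 3 + 3^(2^3)
```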

We now show that we can also eliminate the recursion operator, and still have an elementary estimate on the time needed.

Lemma (R-Elimination). Let t(~x ) be a β-normal term of safe type. There is an elementary function Et such that: if ~r are safe type R-free terms and the free variables of t(~r ) are output variables of safe type, then in time Et(||~r ||) (with ||~r || := Σi ||ri||) one can compute an R-free term rf(t; ~x;~r ) such that t(~r ) →∗ rf(t; ~x;~r ).

Proof. Induction on ||t||.

If t(~x ) has the form λxu1, then x is an output variable and x, u1 have safe type because t has safe type. If t(~x ) is of the form D~u with D a variable or a constant different from R, then each ui is a safe type term. Here (in case D is a variable) we need that ~x and the free variables of t(~r ) are of safe type.

In all of the preceding cases, the free variables of each ui(~r ) are output variables of safe type. Apply the induction hypothesis to obtain u∗i := rf(ui; ~x;~r ). Let t∗ be obtained from t by replacing each ui by u∗i . Then t∗ is R-free. The result is obtained in linear time from ~u∗. This finishes the lemma in all of these cases.

The only remaining case is when t is an R-clause. The first (input) argument must be present, because the term has safe type and therefore cannot be R alone. Recall that we may assume that t is of the form Rrus~t (by η-expansion with safe variables, if necessary). One obtains rf(r; ~x;~r ) in time Er(||~r ||) by the induction hypothesis. By assumption t(~r ) has free output variables only. Hence r(~r ) is closed, because the type of R requires r(~r ) to be an input term. By β-normalization one obtains the numeral N := nf(rf(r; ~x;~r )) in a further elementary time, E′r(||~r ||). Here nf(·) denotes a function on terms which produces the β-normal form.

For the step term s we now consider sa with a new variable a, and let s′ be its β-normal form. Since s is β-normal, ||s′|| ≤ ||s||+ 1 < ||t||. Applying the induction hypothesis to s′ one obtains a monotone elementary bounding function Esa. One computes all si := rf(s′; ~x, a;~r, i) (i < N) in a total time of at most

Σi<N Esa(||~r ||+ i) ≤ E′r(||~r ||) · Esa(||~r ||+ E′r(||~r ||)).

Consider u, ~t. The induction hypothesis gives u := rf(u; ~x;~r ) in time Eu(||~r ||), and all ti := rf(ti; ~x;~r ) in time Σi Eti(||~r ||). These terms are also R-free by induction hypothesis.

Using additional time bounded by a polynomial P in the lengths of these computed values, one constructs the R-free term

computed values, one constructs the R-free term

rf(Rrus~t; ~x;~r ) := (sN−1 . . . (s1(s0u )) . . . )~t.

Defining Et(l) := P (Eu(l) + Σi Eti(l) + E′r(l) · Esa(l + E′r(l))), the total time used in this case is at most Et(||~r ||).

Let the R-rank of a term t be the least number bigger than the levels of all value types τ of recursion operators Rτ in t. By the rank of a term we mean the maximum of its cut-rank and its R-rank. Combining the last two lemmas now gives the following.

Lemma. For every k there is an elementary function Nk such that every T(;)-term t of rank ≤ k can be reduced in time Nk(||t||) to βR-normal form.

It remains to remove the cases operator C. We may assume that only CN occurs.

Lemma (C-Elimination). Let t be an R-free closed β-normal term of base type N. Then in time linear in ||t|| one can reduce t to a numeral.

Proof. If the term does not contain C we are done. Otherwise remove all occurrences of C, as follows. The term has the form Sr or Crts. Proceed with r and iterate until we reach Crts where r does not contain C. Then r is 0 or Sr0. In the first case, convert C0ts to t. In the second case, notice that s has the form λas0(a). Convert C(Sr0)t(λas0(a)) first into (λas0(a))r0 and then into s0(r0). Each time we have removed one occurrence of C.
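The removal loop can be sketched in Python on a hypothetical miniature term language (our own names `Z`, `S`, `C`, `whnf`, `to_num`, not the book's; the successor branch of C is modelled as a Python function of the predecessor):

```python
# Numerals are built from Z and S; C(r, t, s) has zero branch t and
# successor branch s, a function of the predecessor.

class Z:
    pass

class S:
    def __init__(self, pred):
        self.pred = pred

class C:
    def __init__(self, r, t, s):
        self.r, self.t, self.s = r, t, s

def whnf(t):
    """Eliminate a head C as in the proof:
    C 0 t s converts to t, and C (S r0) t s to s(r0)."""
    while isinstance(t, C):
        r = whnf(t.r)
        t = t.t if isinstance(r, Z) else t.s(r.pred)
    return t

def to_num(t):
    """Read off the numeral; each occurrence of C is eliminated once."""
    n = 0
    t = whnf(t)
    while isinstance(t, S):
        n += 1
        t = whnf(t.pred)
    return n
```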

We can now combine our results and state the final theorem.


340 8. LINEAR TWO-SORTED ARITHMETIC

Theorem (Normalization). Let t be a closed T(;)-term of type N → · · · → N → N, where each arrow may independently be an input or an output arrow. Then t denotes an elementary function.

Proof. We produce an elementary function Ft such that for all numerals ~n with t~n of type N one can compute nf(t~n) in time Ft(||~n||). Let ~x be new variables such that t~x is of type N. The β-normal form β-nf(t~x) of t~x is computed in an amount of time that may be large, but it is still only a constant with respect to ~n.

By R-Elimination one reduces to an R-free term rf(β-nf(t~x); ~x;~n) in time bounded by an elementary function of ||~n||. Since the running time bounds the size of the produced term, ||rf(β-nf(t~x); ~x;~n)|| is also bounded by this elementary function of ||~n||. By a further β-normalization one can therefore compute

βR-nf(t~n) = β-nf(rf(β-nf(t~x); ~x;~n))

in time elementary in ||~n||. Finally, in time linear in the result, we can remove all occurrences of C and arrive at a numeral (elementarily in ~n).

8.3. A Linear Two-Sorted Variant LT(;) of Gödel’s T

We restrict T(;) to a linear style term system LT(;). The consequence is that terms of arbitrary type will now be of polynomial-time complexity.

Recall that in the first example concerning T(;) of a recursion producing exponential growth, we defined B(n, a) = a + 2^n by the term

B := λn. R^{N⊸N} n S (λ_,p,a. p^{N⊸N}(p a)).

Crucially, the higher type variable p for the “previous” value appears twice in the step term. The linearity restriction will forbid this in a fairly brutal way, by simply requiring that higher type output variables are only allowed to appear (at most) once in a term. Now the output arrow ρ ⊸ σ (where ρ is not a base type) really is the linear arrow, one of the fundamental features of “linear logic”.

The term definition will now involve the above linearity constraint. Moreover, the typing of the recursion operator R needs to be carefully modified, because we allow higher types as argument types for →, not just base types like N. The (higher type) step argument may be used many times, hence we need an input arrow → after it, not the ⊸ as before, because the linearity of ⊸ would now prevent multiple use. The type of the recursion operator will thus be

N → τ ⊸ (N ⊸ τ ⊸ τ) → τ.

The point is that the typing now ensures that the step term of a recursion is an input argument. This implies that it cannot contain higher type output variables, which would be duplicated when the recursion is unfolded.
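For contrast, a hedged Python transcription of the forbidden pattern (with a generic `rec` standing in for the recursion operator; the names are ours) makes the growth explicit: the previous value `p` is applied twice in the step, and `b(n)(a)` equals a + 2^n.

```python
# rec(n, base, step) unfolds a recursion on the natural number n,
# standing in for the recursion operator R.
def rec(n, base, step):
    if n <= 0:
        return base
    return step(n - 1, rec(n - 1, base, step))

# B := λn. R n S (λ_ p a. p(p a)): the previous value p, of higher type
# N -> N, is used twice in the step, which the linearity restriction forbids.
def b(n):
    return rec(n, lambda a: a + 1, lambda _, p: (lambda a: p(p(a))))
```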

8.3.1. LT(;)-terms. We extend the usage of arrow types and abstraction terms from 8.2.1, by allowing higher type input variables as well. We work with two forms of arrow types and abstraction terms:

ρ → σ with λx̄ρr,    as well as    ρ ⊸ σ with λxρr,


and a corresponding syntactic distinction between input and output (typed) variables, the intuition being that a function of type ρ → σ may recurse on its argument (if it is of base type) or use it many times (if it is of higher type). On the other hand, a function of type ρ ⊸ σ is not allowed to recurse on its argument if it is of base type, and can use it only once if it is of higher type.

At higher types we shall need a large variety of variable names, and a clear input/output distinction. A convenient way to achieve this is simply to use an overbar to signify the input case. Thus x, y, z, . . . will now denote arbitrary output variables, and x̄, ȳ, z̄, . . . will always denote input variables.

Formally, the types are

ρ, σ, τ ::= ι | ρ → σ | ρ ⊸ σ,

with a finitary base type ι. Again, a type is called safe if it does not contain the input arrow →. The jth component R_j^{~ι,~τ} of a simultaneous recursion operator now has type

ιj → δ0 ▹ δ1 ▹ · · · ▹ δk−1 ▹ τj,

where for each i < k the arrow ▹ after the step type δi must be → if δi demands a recursive call, and the linear ⊸ otherwise.

The typing of R_j^{~ι,~τ} with its careful choices of → and ⊸ deserves some comments. The first argument is the one that is recursed on and hence must be an input term, so the type starts with ι →. The recursive step arguments are of a higher type and will be used many times when the recursion operator is unfolded, so in LT(;) they must be input terms as well. Hence we then need a → after such step types.

For the base type N of (unary) natural numbers the type of the recursion operator RτN now is

N → τ ⊸ (N ⊸ τ ⊸ τ) → τ.

The type of the cases operator is as for T(;) (cf. 8.2.1). Also, both the recursion and cases operators need to be restricted to safe value types τj.

Terms are built from these constants and typed variables x̄σ (input variables) and xσ (output variables) by introduction and elimination rules for the two type forms ρ → σ and ρ ⊸ σ, i.e.,

x̄ρ | xρ | Cρ (constant) |
(λx̄ρrσ)ρ→σ | (rρ→σsρ)σ (s an input term) |
(λxρrσ)ρ⊸σ | (rρ⊸σsρ)σ (higher type output variables in r, s distinct),

where again a term s is called an input term if all its free variables are input variables. The restriction on output variables in the formation of an application rρ⊸σs ensures that every higher type output variable can occur at most once in a given LT(;)-term.

Again a function f is called definable in LT(;) if there is a closed term tf : N → · · · → N → N (each arrow an input or an output arrow) denoting this function.

8.3.2. Examples. We now look at some examples intended to explain what can be done in LT(;), and in particular, how our restrictions on the formation of types and terms make it impossible to obtain exponential growth.


However, for definiteness we first have to say precisely what we mean by a numeral, this time a binary one.

Terms of the form r1ρ ::ρ (r2ρ ::ρ . . . (rnρ ::ρ nilρ) . . .) are called lists; we concentrate on lists of booleans. Let W := L(B), and

1 := nilB,    S0 := λv(ff :: vW),    S1 := λv(tt :: vW).

Particular lists are Si1(. . . (Sin1) . . . ), called binary numerals (or words), denoted by v, w, . . . .

Polynomials. It is easy to define ⊕ : W → W ⊸ W such that v ⊕ w concatenates ||v|| bits onto w:

1 ⊕ w = S0w,    (Siv) ⊕ w = S0(v ⊕ w).

The representing term is

v ⊕ w := R^{W⊸W} v S0 (λ_,_,p,w. S0(p^{W⊸W}w)) w.

Similarly we define ⊙ : W → W → W such that v ⊙ w has output length ||v|| · ||w||:

v ⊙ 1 = v,    v ⊙ (Siw) = v ⊕ (v ⊙ w).

The representing term is v ⊙ w := R^W w v (λ_,_,p. v ⊕ p).

Note that the typing ⊕ : W → W ⊸ W is crucial: it allows using the output variable p in the definition of ⊙. If we try to go on and define exponentiation from multiplication just as ⊙ was defined from ⊕, we find that we cannot go ahead, because of the different typing ⊙ : W → W → W.
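To illustrate, here is a hypothetical Python rendering of ⊕ and of the multiplication defined from it (our names `oplus`, `otimes`, on words coded as bit lists); counting list cells reproduces the stated length behaviour.

```python
# Binary words as bit lists: 1 is the empty list, Si prepends a bit.
def s0(w):
    return [0] + w

# 1 ⊕ w = S0 w , (Si v) ⊕ w = S0(v ⊕ w): recursion on v, w a parameter.
def oplus(v, w):
    if not v:
        return s0(w)
    return s0(oplus(v[1:], w))

# v ⊙ 1 = v , v ⊙ (Si w) = v ⊕ (v ⊙ w): recursion on w, v used in the step.
def otimes(v, w):
    if not w:
        return v
    return oplus(v, otimes(v, w[1:]))
```

Continuing the pattern to exponentiation would need a step of the form λp. (v multiplied with p), placing the previous value p in an input position of the multiplication — exactly what the typing rules out.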

Two recursions. Consider

D(1) := S0(1),    D(Si(w)) := S0(S0(D(w))),
E(1) := 1,    E(Si(w)) := D(E(w)).

The corresponding terms are

D := λw. R^W w (S01) (λ_,_,p. S0(S0p)),    E := λw. R^W w 1 (λ_,_,p. Dp).

Here D is legal, but E is not: the application Dp is not allowed, since D expects an input argument whereas p is an output variable.

Recursion with parameter substitution. Consider

E(1, v) := S0(v),    E(Si(w), v) := E(w, E(w, v)),

or

E(1) := S0,    E(Si(w)) := E(w) ∘ E(w).

The corresponding term

λw. R^{W⊸W} w S0 (λ_,_,p,v. p^{W⊸W}(p v))

does not satisfy the linearity condition: the higher type variable p occurs twice, and the typing of R requires p to be an output variable.

Higher argument types. Recall the definition of iteration I(n, f) = f^n in 8.2.2:

I(0, f, w) := w,    I(n+1, f, w) := I(n, f, f(w)),

or

I(0, f) := id,    I(n+1, f) := I(n, f) ∘ f.

It can be defined by a term with f a parameter of type W ⊸ W:

If := λn. R^{W⊸W} n (λw w) (λ_,p,w. p^{W⊸W}(f w)).

In LT(;), f must be an input variable, because the step argument of a recursion is by definition an input argument. Thus λf If may only be applied to input terms of type W ⊸ W. This severely restricts the applicability


of I, and raises a crucial point. The fact is that we cannot define the exponential function by

λn. R^{W⊸W} n S (λ_,p. Ip 2)

since, on the one hand, the step type requires p to be an output variable, whereas on the other hand Ip is only correctly formed if p is an input variable.
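The obstruction can be mirrored in Python (an illustrative sketch under our own names `d`, `it`, `exp_len`; `d` plays the role of the doubling function D and `it` the role of the iteration I): once the function parameter may be iterated, substituting doubling yields exponential length.

```python
# D(1) := S0 1 , D(Si w) := S0(S0(D w)): the output is twice as long
# as the input (up to the empty-word case).
def d(w):
    if not w:
        return [0]
    return [0, 0] + d(w[1:])

# I(0, f, w) := w , I(n+1, f, w) := I(n, f, f(w)): iteration f^n.
def it(n, f, w):
    for _ in range(n):
        w = f(w)
    return w

# Substituting d for the function parameter gives words of length 2^n - 1:
# the combination that LT(;) blocks by making f an input variable.
def exp_len(n):
    return len(it(n, d, []))
```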

8.3.3. Polynomial-time functions are LT(;)-definable. We show that the functions definable in LT(;) are exactly the polynomial-time computable ones. Recall that for this result to hold it is important that we work with the binary representation W of the natural numbers. As in 8.2.3 we can prove

Theorem. For every k-ary polynomial-time computable function f we can find an LT(;)-term tf of type W^(k) → W → W such that, for some polynomial p,

||~a|| ≤ m → tf(~a, p(m)) = f(~a).

Proof. We analyse successive state transitions of a register machine M computing f, this time working in binary notation with the two successors of W. Otherwise the proof is exactly the same.

Corollary. Each polynomial-time function f can be represented by the term tf(~n, p(max(~n))) of type W^(k) → W.

8.3.4. LT(;)-definable functions are polynomial-time. To obtain a polynomial-time upper bound on the complexity of functions definable in LT(;), we again need a careful analysis of the normalization process. In contrast to the T(;)-case, β-normalization and the elimination of the recursion operator cannot be separated but must be treated simultaneously. Moreover, it will be helpful not to use register machines as our model of computation, but another one closer to the lambda terms we have to work with. This model will be described as we go along; it is routine to see that it is equivalent to the register machine model.

A dag is a directed acyclic graph. A parse dag is a structure like a parse tree but admitting in-degree greater than one. For example, a parse dag for λxr has a node containing λx and a pointer to a parse dag for r. A parse dag for an application rs has a node containing a pair of pointers, one to a parse dag for r and the other to a parse dag for s. Terminal nodes are labeled by constants and variables.

The size ||d|| of a parse dag d is the number of nodes in it. Starting at any given node in the parse dag, one obtains a term by a depth-first traversal; it is the term represented by that node. We may refer to a node as if it were the term it represents.

A parse dag is conformal if (i) every node having in-degree greater than 1 is of base type, and (ii) every maximal (that is, non-extensible) path to a bound variable x passes through the same binding λx node.

A parse dag is h-affine if every higher-type variable occurs at most once in the dag.
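To see why nodes of in-degree greater than one are delicate, note that a dag can be exponentially smaller than the term it represents. A small Python illustration (our own sketch, not the book's machine model): n application nodes, each sharing its sub-dag twice, represent a term of size 2^(n+1) − 1.

```python
# Parse-dag sketch: App nodes hold references, so a single object can
# be shared (in-degree > 1) by both children.
class Var:
    pass

class App:
    def __init__(self, left, right):
        self.left, self.right = left, right

def term_size(node):
    """Size of the represented term: a depth-first traversal that walks
    through a shared node once per incoming edge."""
    if isinstance(node, Var):
        return 1
    return 1 + term_size(node.left) + term_size(node.right)

def shared(n):
    """A dag with n App nodes, each of whose children point to the same
    sub-dag: dag size O(n), represented term size 2^(n+1) - 1."""
    node = Var()
    for _ in range(n):
        node = App(node, node)
    return node
```

This is why the normalization procedure works on the dag itself and only charges for nodes, and why the conformality restriction confines sharing to base type.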

We adopt a model of computation over parse dags in which operations such as the following can be performed in unit time: creation of a node


given its label and pointers to the sub-dags; deletion of a node; obtaining a pointer to one of the subsidiary nodes given a pointer to an interior node; conditional test on the type of node or on the constant or variable in the node. Concerning computation over terms (including numerals), we use the same model and identify each term with its parse tree. Although not all parse dags are conformal, every term is conformal (assuming a relabeling of bound variables).

A term is called simple if it contains no higher type input variables. Obviously simple terms are closed under reductions, taking of subterms, and applications. Every simple LT(;)-term is h-affine, due to the linearity of higher-type output variables.

Lemma (Simplicity). Let t be a base type term whose free variables are of base type. Then nf(t) contains no higher type input variables.

Proof. Suppose a variable x̄σ with lev(σ) > 0 occurs in nf(t). It must be bound in a subterm (λx̄σr)σ→τ of nf(t). By the well known subtype property of normal terms, the type σ → τ either occurs positively in the type of nf(t), or else negatively in the type of one of the constants or free variables of nf(t). The former is impossible since t is of base type, and the latter is impossible by inspection of the types of the constants.

Lemma (Sharing Normalization). Let t be an R-free simple term. Then a parse dag for nf(t), of size at most ||t||, can be computed from t in time O(||t||²).

Proof. Under our model of computation, the input t is a parse tree. Since t is simple, it is an h-affine conformal parse dag of size at most ||t||. If there are no nodes which represent a redex, then we are done. Otherwise, locate a node representing a redex; this takes time at most O(||t||). We show how to update the dag in time O(||t||) so that the size of the dag has strictly decreased and the redex has been eliminated, while preserving conformality. Thus, after at most ||t|| iterations the resulting dag represents the normal-form term nf(t). The total time therefore is O(||t||²).

Assume first that the redex in t is (λxr)s with x of base type (see Figure 1, where ◦ is a node with in-degree at most one and • is an arbitrary node); the argument is similar for an input variable x̄. Replace pointers to x in r by pointers to s. Since s does not contain x, no cycles are created. Delete the λx node and the root node for (λxr)s which points to it. By conformality (i) no other node points to the λx node. Update any node which pointed to the deleted node for (λxr)s, so that it now points to the revised r subdag. This completes the β-reduction on the dag (one may also delete the x nodes). Conformality (ii) gives that the updated dag represents a term t′ such that t → t′.

One can verify that the resulting parse dag is conformal and h-affine, with conformality (i) following from the fact that s has base type.

If the redex in t is (λxr)s with x of higher type (see Figure 2), then x occurs at most once in r because the parse dag is h-affine. By conformality (i) there is at most one pointer to that occurrence of x. Update it to point to s instead, deleting the x node. As in the preceding case, delete the λx and the (λxr)s node pointing to it, and update other nodes to point to the


[Figure 1: Redex (λxr)s with x of base type; the pointers to x in r are redirected to the shared subdag s.]

[Figure 2: Redex (λxr)s with x of higher type; the single pointer to x is redirected to s.]

[Figure 3: Cτ(r ::ρ l)ts ↦ srl, with ρ possibly a base type.]

revised r. Again by conformality (ii) the updated dag represents t′ such that t → t′. Conformality and acyclicity are preserved, observing this time that conformality (i) follows because there is at most one pointer to s.

The remaining reductions are for the constant symbols. We only need to consider the case Cτ(r ::ρ l)ts ↦ srl with ρ possibly a base type; see Figure 3.

Corollary (Base Normalization). Let t be a closed R-free simple term of type W. Then the binary numeral nf(t) can be computed from t in time O(||t||²), and ||nf(t)|| ≤ ||t||.

Proof. By the Sharing Normalization Lemma we obtain a parse dag for nf(t) of size at most ||t||, in time O(||t||²). Since nf(t) is a binary numeral, there is only one possible parse dag for it – namely, the parse tree


of the numeral. This is identified with the numeral itself in our model of computation.

Lemma (R-Elimination). Let t(~x) be a simple term of safe type. There is a polynomial Pt such that: if ~r are safe type R-free closed simple terms and the free variables of t(~r) are output variables of safe type, then in time Pt(||~r||) one can compute an R-free simple term rf(t; ~x;~r) such that t(~r) →∗ rf(t; ~x;~r).

Proof. By induction on ||t||.

If t(~x) has the form λzu1, then z is an output variable and z, u1 have safe type because t has safe type. If t(~x) is of the form D~u with D a variable or a constant different from R, then each ui is a safe type term. Here (in case D is a variable xi) we need that xi is of safe type. In all of these cases, each ui(~r) has only free output variables of safe type. Apply the induction hypothesis as required to simple terms ui to obtain u∗i := rf(ui; ~x;~r); so each u∗i is R-free. Let t∗ be obtained from t by replacing each ui by u∗i. Then t∗ is an R-free simple term; here we need that ~r are closed, to avoid duplication of variables. The result is obtained in linear time from ~u∗. This finishes the lemma in all of these cases.

If t is (λyr)s~u with an output variable y of base type, apply the induction hypothesis to yield (r~u)∗ := rf(r~u; ~x;~r) and s∗ := rf(s; ~x;~r). Redirect the pointers to y in (r~u)∗ to point to s∗ instead. If t is (λyr)s~u with an input variable y of base type, apply the induction hypothesis to yield s∗ := rf(s; ~x;~r). Note that s∗ is closed, since it is an input term and the free variables of s(~r) are output variables. Then apply the induction hypothesis again to obtain rf(r~u; ~x, y;~r, s∗). The total time is at most Q(||t||) + Ps(||~r||) + Pr(||~r|| + Ps(||~r||)), where Q(||t||) is some linear function bounding the time it takes to construct r~u from t = (λyr)s~u.

If t is (λyr(y))s~u with y of higher type, then y can occur at most once in r, because t is simple. Thus ||r(s)~u|| < ||(λyr)s~u|| and hence we may apply the induction hypothesis to obtain rf(r(s)~u; ~x;~r). Note that the time is bounded by Q(||t||) + Pr(s)~u(||~r||) for a degree one polynomial Q, since it takes at most linear time to make the at-most-one substitution in the parse tree.

The only remaining case is if the term is an R-clause. Then it is of the form Rlus~t, because the term has safe type, meaning that s must be present. Since l is an input term, all free variables of l are input variables – they must be in ~x, since the free variables of (Rlus~t)[~x := ~r] are output variables. Therefore l(~r) is closed, implying that nf(l(~r)) is a list. One obtains rf(l; ~x;~r) in time Pl(||~r||) by the induction hypothesis. Then by Base Normalization one obtains the list l := nf(rf(l; ~x;~r)) in a further polynomial time. Let l = b0 ::ρ (b1 ::ρ . . . (bN−1 ::ρ nilρ) . . .) and let li, i < N, be obtained from l by omitting the initial elements b0, . . . , bi. Thus all {bi, li | i < N} are obtained in a total time bounded by some polynomial P′l(||~r||).

Now consider szy with new variables zρ and yL(ρ). Applying the induction hypothesis to szy one obtains a monotone bounding polynomial Pszy.


One computes all si := rf(szy; ~x, z, y;~r, bi, li) in a total time of at most

∑_{i<N} Pszy(||bi|| + ||li|| + ||~r||) ≤ P′l(||~r||) · Pszy(2P′l(||~r||) + ||~r||).

Each si is R-free by the induction hypothesis. Furthermore, no si has a free output variable: any such variable would also be free in s, contradicting that s is an input term.

Consider u, ~t. The induction hypothesis gives u := rf(u; ~x;~r) in time Pu(||~r||), and all ti := rf(ti; ~x;~r) in time ∑i Pti(||~r||). These terms are also R-free by induction hypothesis. Clearly the ti do not have any free (or bound) higher type output variables in common. The same is true of u and all ti.

Using additional time bounded by a polynomial P in the lengths of these computed values, one constructs the R-free term

(λx s0(s1 . . . (sN−1 x) . . . )) u ~t.

Defining Pt(n) := P(Pu(n) + ∑i Pti(n) + P′l(n) · Pszy(2P′l(n) + n)), the total time used in this case is at most Pt(||~r||). The result is a term because u and the ti do not have any free higher-type output variable in common, and because the si do not have any free higher-type output variables at all.

Theorem (Normalization). Let t be a closed LT(;)-term of type W → · · · → W → W, where each arrow may independently be an input or an output arrow. Then t denotes a polytime function.

Proof. One must find a polynomial Qt such that for all R-free simple closed terms ~n of types ~ρ one can compute nf(t~n) in time Qt(||~n||). Let ~x be new variables of types ~ρ. The normal form of t~x is computed in an amount of time that may be large, but it is still only a constant with respect to ~n. By the Simplicity Lemma nf(t~x) is simple. By R-Elimination one reduces to an R-free simple term rf(nf(t~x); ~x;~n) in time Pt(||~n||). Since the running time bounds the size of the produced term, ||rf(nf(t~x); ~x;~n)|| ≤ Pt(||~n||). By Sharing Normalization one can compute

nf(t~n) = nf(rf(nf(t~x); ~x;~n))

in time O(Pt(||~n||)²), so for Qt we can take some constant multiple of Pt(||~n||)².

8.3.5. The first-order fragment T1(;) of T(;). Let T1(;) be the fragment of T(;) where recursion and cases operators have base value types only. It will turn out that – similarly to the restriction of EA(;) to Σ1-induction – we can characterize polynomial-time complexity this way. The proof is a simplification of the argument above. A term is called first-order if it contains no higher type variables. Obviously first-order terms are simple, and they are closed under reductions, taking of subterms, and applications.

Lemma (R-Elimination for T1(;)). Let t(~x) be a first-order term of safe type. There is a polynomial Pt such that: if ~r are R-free closed first-order terms and the free variables of t(~r) are output variables, then in time Pt(||~r||) one can compute an R-free first-order term rf(t; ~x;~r) such that t(~r) →∗ rf(t; ~x;~r).


Proof. By induction on ||t||.

If t(~x) has the form λzu1, then z is an output variable because t has safe type. If t(~x) is of the form D~u with D a variable or a constant different from R, then each ui is a safe type first-order term.

In all of the preceding cases, each ui(~r) has free output variables only. Apply the induction hypothesis as required to first-order safe type terms ui to obtain u∗i := rf(ui; ~x;~r); so each u∗i is R-free. Let t∗ be obtained from t by replacing each ui by u∗i. Then t∗ is an R-free first-order term. The result is obtained in linear time from ~u∗. This finishes the lemma in all of these cases.

If t is (λyr)s~u with an output variable y, apply the induction hypothesis to yield (r~u)∗ := rf(r~u; ~x;~r) and s∗ := rf(s; ~x;~r). Redirect the pointers to y in (r~u)∗ to point to s∗ instead. If t is (λyr)s~u with an input variable y, apply the induction hypothesis to yield s∗ := rf(s; ~x;~r). Note that s∗ is closed, since it is an input term and the free variables of s(~r) are output variables. Then apply the induction hypothesis again to obtain rf(r~u; ~x, y;~r, s∗). The total time is at most Q(||t||) + Ps(||~r||) + Pr(||~r|| + Ps(||~r||)), as it takes at most linear time to construct r~u from (λyr)s~u.

The only remaining case is if the term is an R-clause of the form Rlus~t. Since l is an input term, all free variables of l are input variables – they must be in ~x, since the free variables of (Rlus~t)[~x := ~r] are output variables. Therefore l(~r) is closed, implying that nf(l(~r)) is a list. One obtains rf(l; ~x;~r) in time Pl(||~r||) by the induction hypothesis. Then by Base Normalization one obtains the list l := nf(rf(l; ~x;~r)) in a further polynomial time. Let l = b0 ::ρ (b1 ::ρ . . . (bN−1 ::ρ nilρ) . . .) and let li, i < N, be obtained from l by omitting the initial elements b0, . . . , bi. Thus all {bi, li | i < N} are obtained in a total time bounded by P′l(||~r||) for a polynomial P′l.

Now consider szy with new variables zρ and yL(ρ). Applying the induction hypothesis to szy one obtains a monotone bounding polynomial Pszy. One computes all si := rf(szy; ~x, z, y;~r, bi, li) in a total time of at most

∑_{i<N} Pszy(||bi|| + ||li|| + ||~r||) ≤ P′l(||~r||) · Pszy(2P′l(||~r||) + ||~r||).

Each si is R-free by the induction hypothesis. Furthermore, no si has a free output variable: any such variable would be free in s, contradicting that s is an input term.

Consider u, ~t. The induction hypothesis gives u := rf(u; ~x;~r) in time Pu(||~r||), and all ti := rf(ti; ~x;~r) in time ∑i Pti(||~r||). These terms are R-free by induction hypothesis.

Using additional time bounded by a polynomial P in the lengths of these computed values, one constructs the R-free term

(λx s0(s1 . . . (sN−1 x) . . . )) u ~t.

Defining Pt(n) := P(Pu(n) + ∑i Pti(n) + P′l(n) · Pszy(2P′l(n) + n)), the total time used in this case is at most Pt(||~r||).

Theorem (Normalization). Let t be a closed T1(;)-term of type W → · · · → W → W, where each arrow may independently be an input or an output arrow. Then t denotes a polytime function.


Proof. One must find a polynomial Qt such that for all numerals ~n one can compute nf(t~n) in time Qt(||~n||). Let ~x be new variables of type W. The normal form of t~x is computed in an amount of time that may be large, but it is still only a constant with respect to ~n.

nf(t~x) clearly is a first-order term, since the R- and C-operators have base value types only. By R-Elimination one reduces to an R-free first-order term rf(nf(t~x); ~x;~n) in time Pt(||~n||). Since the running time bounds the size of the produced term, ||rf(nf(t~x); ~x;~n)|| ≤ Pt(||~n||).

By Sharing Normalization one can compute

nf(t~n) = nf(rf(nf(t~x); ~x;~n))

in time O(Pt(||~n||)²). Let Qt be the polynomial referred to by the big-O notation.

8.4. Two-Sorted Systems A(;), LA(;)

Using the fundamental Curry-Howard correspondence, we now transfer the term systems T(;) and LT(;) to corresponding logical systems A(;) and LA(;) of arithmetic. As a consequence, LA(;) and also the Σ1-fragment of A(;) will automatically yield polynomial-time extracts.

The goal is to ensure, by some annotations to proofs, that the extract of a proof is a term, in LT(;) or T(;), with polynomial complexity. The annotations are such that if we ignore them, then the resulting proof is a correct one, in ordinary arithmetic. Of course, we could also first extract a term in T and then annotate this term to obtain a term in LT(;). However, the whole point of the present approach is to work with proofs rather than terms. An additional benefit of annotating proofs is that when interactively developing such a proof and finally checking its correctness w.r.t. input/output annotations, one can provide informative error messages. More precisely, the annotations consist in distinguishing

• two type arrows, ρ → σ and ρ ⊸ σ,
• two sorts of variables, input ones x̄ and output ones x, and
• two implications, A → B and A ⊸ B.

Implication A → B is the “input” one, involving restrictions on the proofs of its premise: such proofs are only allowed to use input assumptions or input variables. In contrast, A ⊸ B is the “output” implication, allowing at most one use of the hypothesis in case its type is not a base type.

8.4.1. Motivation. To motivate our annotations let us look at some examples of arithmetical existence proofs exhibiting exponential growth.

Double use of assumptions. Consider

E(1, y) := S0(y),    E(Si(x), y) := E(x, E(x, y)),

or

E(1) := S0,    E(Si(x)) := E(x) ∘ E(x).

Then E(x) = S0^{2^{||x||−1}}, i.e., E grows exponentially. Here is a corresponding existence proof. We have to show

∀x,y∃v(||v|| = 2^{||x||−1} + ||y||).


Proof. By induction on x. The base case is obvious. For the step let x be given and assume (induction hypothesis) ∀y∃v(||v|| = 2^{||x||−1} + ||y||). We must show ∀y∃w(||w|| = 2^{||x||} + ||y||). Given y, construct w by using (induction hypothesis) with y to find v, and then using (induction hypothesis) again, this time with v, to find w.

The double use of the (“functional”) induction hypothesis clearly is responsible for the exponential growth. Our linearity restriction on output implications will exclude such proofs.

Substitution in function parameters. Consider the iteration functional I(x, f) = f^{(||x||−1)}; it is considered feasible in our setting. However, substituting the easily definable doubling function D satisfying ||D(x)|| = 2||x|| yields the exponential function I(x, D) = D^{(||x||−1)}. The corresponding proofs of

(8.1)    ∀x(∀y∃z(||z|| = 2||y||) → ∀y∃v(||v|| = 2^{||x||−1} + ||y||)),
(8.2)    ∀y∃z(||z|| = 2||y||)

are unproblematic, but to avoid explosion we need to forbid applying a cut here.

Our solution is to introduce a ramification concept. (8.2) is proved by induction on y, hence needs a quantifier on an input variable: ∀ȳ∃z(||z|| = 2||y||). We exclude applicability of a cut by our ramification condition, which requires that the “kernel” of (8.1) – to be proved by induction on x – is safe and hence does not contain such universal subformulas proved by induction.

Iterated induction. It might seem that our restrictions are so tight that they rule out any form of nested induction. However, this is not true. One can define, e.g., (a form of) multiplication on top of addition: first one proves ∀x∀y∃z(||z|| = ||x|| + ||y||) by induction on x, and then ∀y∃z(||z|| = ||x|| · ||y||) by induction on y with a parameter x.

8.4.2. LA(;)-proof terms. We assume a given set of inductively defined predicates I, as in 7.2. Recall that each predicate I is of a fixed arity (“arity” here means not just the number of arguments, but also covers the types of the arguments). When writing I(~r) we implicitly assume correct length and types of ~r. LA(;)-formulas (formulas for short) A, B, . . . are

I(~r) | A → B | A ⊸ B | ∀x̄ρA | ∀xρA.

In I(~r), the ~r are terms from T. Define falsity F by tt = ff and negation ¬A by A → F.

We adapt the assignment in 7.2.3 of a type τ(A) to a formula A to LA(;)-formulas. Again it is convenient to extend the use of ρ → σ and ρ ⊸ σ to the nulltype symbol ◦: for ▹ ∈ {→, ⊸},

(ρ ▹ ◦) := ◦,    (◦ ▹ σ) := σ,    (◦ ▹ ◦) := ◦.

With this understanding we can simply write

τ(I(~r)) := ◦ if I does not require witnesses, and τ(I(~r)) := ιI otherwise,

τ(A → B) := (τ(A) → τ(B)),    τ(∀x̄ρA) := (ρ → τ(A)),
τ(A ⊸ B) := (τ(A) ⊸ τ(B)),    τ(∀xρA) := (ρ ⊸ τ(A)).


A formula A is called safe if τ(A) is safe, i.e., →-free. For instance, every formula without → and without universal quantifiers ∀x̄ρ over an input variable x̄ is safe. Recall the definition of the level of a type (in 6.1.4); types of level 0 are called base types.

The induction axiom for N is

Indn,A : ∀n̄(A(0) ⊸ ∀a(A(a) ⊸ A(Sa)) → A(n̄))

with n̄ an input and a an output variable of type N, and A a safe formula. It has the type of the recursion operator which will realize it, namely

N → τ ⊸ (N ⊸ τ ⊸ τ) → τ,    where τ = τ(A) is safe.

The cases axioms are as expected.

By an ordinary proof term we mean a standard proof term built from axioms, assumption variables and object terms by the usual introduction and elimination rules for both implications → and ⊸ and both universal quantifiers (over input and output variables). The construction is as follows:

cA (axiom) | ūA, uA (input and output assumption variables) |
(λūAMB)A→B | (MA→BNA)B | (λuAMB)A⊸B | (MA⊸BNA)B |
(λx̄ρMA)∀x̄A | (M∀x̄ρA(x̄)rρ)A(r) | (λxρMA)∀xA | (M∀xρA(x)rρ)A(r).

In the two introduction rules for the universal quantifier we assume the usual condition on free variables, i.e., that x must not be free in the formula of any free assumption variable. In the elimination rules for the universal quantifier, r is a term in T (not necessarily in LT(;)).

If we disregard the difference between input and output variables, and also between the two implications → and ⊸ and the two type arrows → and ⊸, then every ordinary proof term becomes a proof term in HAω.

Definition (LA(;)-proof term). The proof terms which make up LA(;) are exactly those whose "extracted terms" (see below) lie in LT(;).

To complete the definition we need to define the extracted term et(M) of an ordinary proof term M. This definition is an adaptation of the corresponding one in 7.2.4. We may assume that M derives a formula A with τ(A) ≠ ε. Then

et(u^A) := x_u^{τ(A)},   et(ū^A) := x_ū^{τ(A)},
et((λu^A M)^{A→B}) := λx_u^{τ(A)} et(M),
et((λū^A M)^{A⊸B}) := λx_ū^{τ(A)} et(M),
et(M^{A→B} N) := et(M^{A⊸B} N) := et(M) et(N),
et((λx^ρ M)^{∀xA}) := λx^ρ et(M),
et(M^{∀xA} r) := et(M) r,


with x an input or output variable. Extracted terms for the axioms are defined in the obvious way: constructors for the introductions and recursion operators for the eliminations, as in 7.2.4.

The LA(;)-proof terms and their corresponding sets CV(M) of computational variables may alternatively be inductively defined. If τ(A) = ε then every ordinary proof term M^A is an LA(;)-proof term and CV(M) := ∅.

• Every assumption constant (axiom) c^A and every input or output assumption variable u^A or ū^A is an LA(;)-proof term. CV(u^A) := {x_u} and CV(ū^A) := {x_ū}.
• If M^A is an LA(;)-proof term, then so are (λu^A M)^{A→B} and (λū^A M)^{A⊸B}. CV(λu^A M) := CV(M) \ {x_u} and CV(λū^A M) := CV(M) \ {x_ū}.
• If M^{A→B} and N^A are LA(;)-proof terms, then so is (MN)^B, provided all variables in CV(N) are input. CV(MN) := CV(M) ∪ CV(N).
• If M^{A⊸B} and N^A are LA(;)-proof terms, then so is (MN)^B, provided the higher type output variables in CV(M) and CV(N) are disjoint. CV(MN) := CV(M) ∪ CV(N).
• If M^A is an LA(;)-proof term, and x ∉ FV(B) for every formula B of a free assumption variable in M, then so is (λxM)^{∀xA}. CV(λxM) := CV(M) \ {x} (x an input or output variable).
• If M^{∀xA(x)} is an LA(;)-proof term and r is an input LT(;)-term, then (Mr)^{A(r)} is an LA(;)-proof term. CV(Mr) := CV(M) ∪ FV(r).
• If M^{∀xA(x)} is an LA(;)-proof term and r is an LT(;)-term, then (Mr)^{A(r)} is an LA(;)-proof term, provided the higher type output variables in CV(M) are not free in r. CV(Mr) := CV(M) ∪ FV(r).

It is easy to see that for every LA(;)-proof term M, the set CV(M) of its computational variables is the set of variables free in the extracted term et(M).

Theorem (Characterization). The LA(;)-proof terms are exactly those generated by the above clauses.

Proof. We proceed by induction on M, assuming that M is an ordinary proof term. We can assume τ(A) ≠ ε, for otherwise the claim is obvious.

Case M^{A⊸B} N^A with τ(A) ≠ ε. The following are equivalent.
• MN is generated by the clauses.
• M, N are generated by the clauses, and the higher type output variables in CV(M) and CV(N) are disjoint.
• et(M) and et(N) are LT(;)-terms, and the higher type output variables in FV(et(M)) and FV(et(N)) are disjoint.
• et(M)et(N) (= et(MN)) is an LT(;)-term.

The other cases are similar.

The natural deduction framework now provides a straightforward formalization of proofs in LA(;). This applies, e.g., to the proofs sketched in 8.4.1. Further examples will be given below.

8.4.3. LA(;) and its provably recursive functions. A k-ary numerical function f is provably recursive in LA(;) if there is a Σ1-formula


G_f(n_1, . . . , n_k, a) denoting the graph of f, and a derivation M_f in LA(;) of

∀n_1,...,n_k ∃a G_f(n_1, . . . , n_k, a).

Here the n_i, a denote input, respectively output, variables of type W.

Theorem. The functions provably recursive in LA(;) are exactly the definable functions of LT(;) of type W^k → W, which are exactly the functions computable in polynomial time.

Proof. Let M be a derivation in LA(;) proving a formula of type W^k → W. Then et(M) belongs to LT(;) and hence denotes a polynomial time function which, by the Soundness Theorem, is f.

Conversely, any polynomial time function f is represented by an LT(;)-term, say t(~n ), and from t(~n ) = t(~n ) we deduce ∀~n ∃a (t(~n ) = a). We may take t(~n ) = a to be the formula G_f. Thus f is provably recursive.

8.4.4. A(;)- and Σ1-A(;)-proof terms. In much the same way as we have defined LA(;) from LT(;) above, we can define an arithmetical system A(;) corresponding to T(;). A(;) is just LA(;), but with all linearity restrictions removed. The analogue of the theorem above is now

Theorem. The functions provably recursive in A(;) are exactly the definable functions of T(;) of type N^k → N, which are exactly the elementary functions.

In 8.3.5 we have defined T1(;) to be the first-order fragment of T(;), where recursion and cases operators have base type values only. Let Σ1-A(;) be the corresponding arithmetical system, that is, the induction and cases axioms are allowed for formulas A of base type only, which is the appropriate generalization of Σ1-formulas in our setting. Σ1-A(;) therefore is the Σ1-fragment of A(;). Then again

Theorem. The functions provably recursive in Σ1-A(;) are exactly the definable functions of T1(;) of type W^k → W, which are exactly the polynomial time computable functions.

8.4.5. Application: Insertion sort in LA(;). We show that the insertion sort algorithm is the computational content of an appropriate proof.

To this end we recursively define a function I inserting an element a into a list l, in the first place where it finds a bigger element:

I(a, nil) := a :: nil,   I(a, b :: l) := [if a ≤ b then a :: b :: l else b :: I(a, l)],

and, using I, a function S sorting a list l into ascending order:

S(nil) := nil, S(a :: l) := I(a, S(l)).
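Read as a functional program, the equations for I and S are exactly insertion sort. Here is a direct Python transcription (the function names are ours; Python lists play the role of the list type, and ≤ is the usual order):

```python
def insert(a, l):
    """I(a, l): insert a before the first element b of l with a <= b."""
    if not l:                      # I(a, nil) = a :: nil
        return [a]
    b, rest = l[0], l[1:]
    if a <= b:                     # I(a, b :: l) = a :: b :: l  if a <= b
        return [a, b] + rest
    return [b] + insert(a, rest)   # otherwise b :: I(a, l)

def sort(l):
    """S(l): insertion sort via S(nil) = nil, S(a :: l) = I(a, S(l))."""
    if not l:
        return []
    return insert(l[0], sort(l[1:]))

print(sort([3, 1, 2]))  # → [1, 2, 3]
```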

These functions need only be presented to the theory by inductive definitions of their graphs. Thus, writing I(a, l, l′) to denote I(a, l) = l′ and, similarly, S(l, l′) to denote S(l) = l′, we have the following axioms:

I(a, nil, a :: nil),

a ≤ b → I(a, b :: l, a :: b :: l),

b < a → I(a, l, l′) → I(a, b :: l, b :: l′),

S(nil, nil),

S(l, l′) → I(a, l′, l′′) → S(a :: l, l′′).


We need that the Σ1 inductive definitions of I and S are admitted in safe LA(;)-formulas. As an auxiliary function we use tl_i(l), which is the tail of the list l of length i, if i < lh(l), and l otherwise. Its recursion equations are

tl_i(nil) := nil,   tl_i(a :: l) := [if i ≤ lh(l) then tl_i(l) else a :: l].

We will need some easy properties of S and tl:

S(l, l′) → lh(l) = lh(l′),

i ≤ lh(l) → tl_i(b :: l) = tl_i(l),

tl_{lh(l)}(l) = l,   tl_0(l) = nil.
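A direct implementation of tl_i (a Python sketch, with a function name of our own choosing) lets one check the recursion equations and the listed properties on examples:

```python
def tl(i, l):
    """tl_i(l): the tail of l of length i if i <= lh(l), and l otherwise."""
    if not l:                                   # tl_i(nil) = nil
        return []
    a, rest = l[0], l[1:]
    # tl_i(a :: l) = tl_i(l) if i <= lh(l), else a :: l
    return tl(i, rest) if i <= len(rest) else [a] + rest

l = [7, 3, 9, 4]
assert tl(0, l) == []                  # tl_0(l) = nil
assert tl(len(l), l) == l              # tl_{lh(l)}(l) = l
assert all(tl(i, [5] + l) == tl(i, l)  # i <= lh(l) -> tl_i(b :: l) = tl_i(l)
           for i in range(len(l) + 1))
```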

We now want to derive S(l)↓ in LA(;), that is, ∃l′ S(l, l′). However, we shall not be able to do this. All we can achieve is, for any input parameter n, lh(l) ≤ n → S(l)↓.

Lemma (Insertion). ∀a,l,n ∀i≤n ∃l′ I(a, tl_{min(i,lh(l))}(l), l′).

Proof. We fix a, l and prove the claim by induction on n. In the base case we can take l′ := a :: nil, using tl_0(l) = nil. For the step we must show

∀i≤n ∃l′ I(a, tl_{min(i,lh(l))}(l), l′) → ∀i≤n+1 ∃l′ I(a, tl_{min(i,lh(l))}(l), l′).

Assume the premise, and i ≤ n+1. If i ≤ n we are done, by the premise. So let i = n+1. If lh(l) ≤ n then the premise for i := n gives ∃l′ I(a, tl_{lh(l)}(l), l′), which is our goal. If n+1 ≤ lh(l) we need to show ∃l′ I(a, tl_{n+1}(l), l′). Observe that tl_{n+1}(l) = b :: tl_n(l) with b := hd(tl_{n+1}(l)), because of n+1 ≤ lh(l). We now use the definition of I. If a ≤ b, we explicitly have the desired value l′ := a :: b :: tl_n(l). Otherwise it suffices to know ∃l′ I(a, tl_n(l), l′). But this follows from the premise for i := n.

Using this we now prove

Lemma (Insertion Sort). ∀l,n,m (m ≤ n → ∃l′ S(tl_{min(m,lh(l))}(l), l′)).

Proof. We fix l, n and prove the claim by induction on m. In the base case we can take l′ := nil, using tl_0(l) = nil. For the step we must show

(m ≤ n → ∃l′ S(tl_{min(m,lh(l))}(l), l′)) → (m+1 ≤ n → ∃l′ S(tl_{min(m+1,lh(l))}(l), l′)).

Assume the premise and m+1 ≤ n. If lh(l) ≤ m we are done, by the premise. If m+1 ≤ lh(l) we need to show ∃l′′ S(tl_{m+1}(l), l′′). Now tl_{m+1}(l) = a :: tl_m(l) with a := hd(tl_{m+1}(l)), because of m+1 ≤ lh(l). By definition of S it suffices to find l′, l′′ such that S(tl_m(l), l′) and I(a, l′, l′′). Pick by the premise an l′ with S(tl_m(l), l′). Further, the Insertion Lemma applied to a, l′, n and i := m gives an l′′ such that I(a, tl_m(l′), l′′). Using lh(l′) = lh(tl_m(l)) = m we obtain tl_m(l′) = l′ and hence I(a, l′, l′′), as desired.

Specializing this to l, n, n we finally obtain

lh(l) ≤ n → ∃l′ S(l, l′).


8.5. Notes

The elementary variant T(;) of Gödel's T developed in 8.2 has many relatives in the literature.

Beckmann and Weiermann (1996) characterize the elementary functions by means of a restriction of the combinatory logic version of Gödel's T. The restriction consists in allowing occurrences of the iteration operator only when immediately applied to a type N argument. For the proof they use an ordinal assignment due to Howard (1970) and Schütte (1977). The authors remark (on p. 477) that the methods of their paper can also be applied to a λ-formulation of T: the restriction on terms then consists in allowing only iterators of the form I_ρ t^N and in disallowing λ-abstractions of the form λx . . . I_ρ t^N . . . where x occurs in t^N; however, no details are given. Moreover, our restrictions are slightly more liberal (input variables in t can be abstracted), and also the proof method is very different.

Aehlig and Johannsen (2005) characterize the elementary functions by means of a fragment of Girard's system F. They make essential use of the Church-style representation of numbers in F. A somewhat different approach for characterizing the elementary functions based on a "predicative" setting has been developed by Leivant (1994).

The material in the present chapter is partially based on Schwichtenberg (2006a).


Bibliography

A. Abel and T. Altenkirch. A predicative strong normalization proof for a λ-calculus with interleaving inductive types. In Types for Proofs and Programs, volume 1956 of LNCS, pages 21–40. Springer Verlag, Berlin, Heidelberg, New York, 2000.

S. Abramsky. Domain theory in logical form. Annals of Pure and Applied Logic, 51:1–77, 1991.

W. Ackermann. Zur Widerspruchsfreiheit der Zahlentheorie. Math. Annalen, 117:162–194, 1940.

P. Aczel, H. Simmons, and S. S. Wainer, editors. Proof Theory. A selection of papers from the Leeds Proof Theory Programme 1990, 1992. Cambridge University Press.

K. Aehlig and J. Johannsen. An elementary fragment of second-order lambda calculus. ACM Transactions on Computational Logic, 6(2):468–480, Apr. 2005.

R. M. Amadio and P.-L. Curien. Domains and Lambda-Calculi. Cambridge University Press, 1998.

J. Avigad and R. Sommer. A model theoretic approach to ordinal analysis. Bulletin of Symbolic Logic, 3:17–52, 1997.

F. Barbanera and S. Berardi. Extracting constructive content from classical logic via control-like reductions. In M. Bezem and J. Groote, editors, Typed Lambda Calculi and Applications, pages 45–59. LNCS Vol. 664, 1993.

H. Barendregt, M. Coppo, and M. Dezani-Ciancaglini. A filter lambda model and the completeness of type assignment. The Journal of Symbolic Logic, 48(4):931–940, 1983.

H. P. Barendregt. The Lambda Calculus. North-Holland, Amsterdam, second edition, 1984.

A. Beckmann and A. Weiermann. A term rewriting characterization of the polytime functions and related complexity classes. Archive for Mathematical Logic, 36:11–30, 1996.

S. Bellantoni and S. Cook. A new recursion-theoretic characterization of the polytime functions. Computational Complexity, 2:97–110, 1992.

S. Bellantoni and M. Hofmann. A new "feasible" arithmetic. The Journal of Symbolic Logic, 67(1):104–116, 2002.

S. Bellantoni, K.-H. Niggl, and H. Schwichtenberg. Higher type recursion, ramification and polynomial time. Annals of Pure and Applied Logic, 104:17–30, 2000.

H. Benl. Konstruktive Interpretation induktiver Definitionen. Master's thesis, Mathematisches Institut der Universität München, 1998.


U. Berger. Uniform Heyting Arithmetic. Annals of Pure and Applied Logic, 133:125–148, 2005a.

U. Berger. Continuous semantics for strong normalization. In Proc. CiE 2005, volume 3526 of LNCS, pages 23–34, 2005b.

U. Berger. Total sets and objects in domain theory. Annals of Pure and Applied Logic, 60:91–117, 1993a.

U. Berger. Program extraction from normalization proofs. In M. Bezem and J. Groote, editors, Typed Lambda Calculi and Applications, volume 664 of LNCS, pages 91–106. Springer Verlag, Berlin, Heidelberg, New York, 1993b.

U. Berger. Programs from classical proofs. In M. Behara, R. Fritsch, and R. Lintz, editors, Symposia Gaussiana. Proceedings of the 2nd Gauss Symposium. Conference A: Mathematics and Theoretical Physics. Munich, Germany, August 2–7, 1993, pages 187–200, Berlin, New York, 1995. Walter de Gruyter.

U. Berger and H. Schwichtenberg. An inverse of the evaluation functional for typed λ-calculus. In R. Vemuri, editor, Proceedings 6th Symposium on Logic in Computer Science (LICS '91), pages 203–211. IEEE Computer Society Press, Los Alamitos, 1991.

U. Berger and H. Schwichtenberg. Program extraction from classical proofs. In D. Leivant, editor, Logic and Computational Complexity, International Workshop LCC '94, Indianapolis, IN, USA, October 1994, volume 960 of LNCS, pages 77–97. Springer Verlag, Berlin, Heidelberg, New York, 1995.

U. Berger and H. Schwichtenberg. The greatest common divisor: a case study for program extraction from classical proofs. In S. Berardi and M. Coppo, editors, Types for Proofs and Programs. International Workshop TYPES '95, Torino, Italy, June 1995. Selected Papers, volume 1158 of LNCS, pages 36–46. Springer Verlag, Berlin, Heidelberg, New York, 1996.

U. Berger, H. Schwichtenberg, and M. Seisenberger. The Warshall Algorithm and Dickson's Lemma: Two Examples of Realistic Program Extraction. Journal of Automated Reasoning, 26:205–221, 2001.

U. Berger, W. Buchholz, and H. Schwichtenberg. Refined program extraction from classical proofs. Annals of Pure and Applied Logic, 114:3–25, 2002.

U. Berger, M. Eberl, and H. Schwichtenberg. Term rewriting for normalization by evaluation. Information and Computation, 183:19–42, 2003.

E. Beth. Semantic construction of intuitionistic logic. Mededelingen der KNAW N.S., 19(11), 1956.

M. Bezem and V. Veldman. Ramsey's theorem and the pigeonhole principle in intuitionistic mathematics. J. London Math. Soc., 47:193–211, 1993.

F. Blanqui, J.-P. Jouannaud, and M. Okada. The Calculus of Algebraic Constructions. In RTA '99. LNCS 1631, 1999.

E. Börger, E. Grädel, and Y. Gurevich. The Classical Decision Problem. Perspectives in Mathematical Logic. Springer Verlag, Berlin, Heidelberg, New York, 1997.

W. Buchholz. Three contributions to the conference on recent advances in proof theory. Handwritten notes, 1980.


W. Buchholz. An independence result for Π¹₁-CA+BI. Annals of Pure and Applied Logic, 33(2):131–155, 1987.

W. Buchholz and S. S. Wainer. Provably computable functions and the fast growing hierarchy. In S. Simpson, editor, Logic and Combinatorics, volume 65 of Contemporary Mathematics, pages 179–198. American Mathematical Society, 1987.

W. Buchholz, S. Feferman, W. Pohlers, and W. Sieg. Iterated Inductive Definitions and Subsystems of Analysis: Recent Proof-Theoretical Studies, volume 897 of Lecture Notes in Mathematics. Springer, Berlin, 1981.

W. Buchholz, A. Cichon, and A. Weiermann. A uniform approach to fundamental sequences and hierarchies. Mathematical Logic Quarterly, 40:273–286, 1994.

S. R. Buss. Bounded Arithmetic. PhD thesis, Princeton University, 1985.

S. R. Buss. The witness function method and provably recursive functions of Peano arithmetic. In D. Prawitz, B. Skyrms, and D. Westerståhl, editors, Proceedings of the 9th International Congress of Logic, Methodology and Philosophy of Science, pages 29–68. North-Holland, 1994.

S. R. Buss, editor. Handbook of Proof Theory, volume 137 of Studies in Logic. Elsevier North-Holland, 1998.

L. Chiarabini. Program Development by Proof Transformation. PhD thesis, Fakultät für Mathematik, Informatik und Statistik der LMU, München, 2009.

A. Cichon. A short proof of two recently discovered independence proofs using recursion theoretic methods. Proceedings of the American Mathematical Society, 87:704–706, 1983.

A. Cobham. The intrinsic computational difficulty of functions. In Y. Bar-Hillel, editor, Logic, Methodology and Philosophy of Science II, pages 24–30. North-Holland, Amsterdam, 1965.

R. L. Constable and C. Murthy. Finding computational content in classical proofs. In G. Huet and G. Plotkin, editors, Logical Frameworks, pages 341–362. Cambridge University Press, 1991.

S. A. Cook and B. M. Kapron. Characterizations of the basic feasible functionals of finite type. In S. Buss and P. Scott, editors, Feasible Mathematics, pages 71–96. Birkhäuser, 1990.

T. Coquand and H. Persson. Gröbner Bases in Type Theory. In T. Altenkirch, W. Naraschewski, and B. Reus, editors, Types for Proofs and Programs, volume 1657 of LNCS. Springer Verlag, Berlin, Heidelberg, New York, 1999.

T. Coquand and A. Spiwack. A proof of strong normalisation using domain theory. In Proceedings LICS 2006, pages 307–316, 2006.

T. Coquand, G. Sambin, J. Smith, and S. Valentini. Inductively generated formal topologies. Annals of Pure and Applied Logic, 124:71–106, 2003.

N. G. de Bruijn. Lambda calculus notation with nameless dummies, a tool for automatic formula manipulation, with application to the Church-Rosser theorem. Indagationes Math., 34:381–392, 1972.

L. Dickson. Finiteness of the odd perfect and primitive abundant numbers with n distinct prime factors. Am. J. Math., 35:413–422, 1913.


J. Diller and W. Nahm. Eine Variante zur Dialectica-Interpretation der Heyting-Arithmetik endlicher Typen. Archiv für Mathematische Logik und Grundlagenforschung, 16:49–66, 1974.

A. Dragalin. New kinds of realizability. In Abstracts of the 6th International Congress of Logic, Methodology and Philosophy of Sciences, pages 20–24, Hannover, Germany, 1979.

Y. L. Ershov. Everywhere defined continuous functionals. Algebra i Logika, 11(6):656–665, 1972.

M. V. Fairtlough and S. S. Wainer. Ordinal complexity of recursive definitions. Information and Computation, 99:123–153, 1992.

M. V. Fairtlough and S. S. Wainer. Hierarchies of provably recursive functions. In Buss (1998), pages 149–207.

S. Feferman. Arithmetization of metamathematics in a general setting. Fundamenta Mathematicae, XLIX:35–92, 1960.

S. Feferman. Classifications of recursive functions by means of hierarchies. Transactions American Mathematical Society, 104:101–122, 1962.

S. Feferman. Formal theories for transfinite iterations of generalized inductive definitions and some subsystems of analysis. In Kino et al. (1970), pages 303–325.

S. Feferman. Logics for termination and correctness of functional programs. In Y. Moschovakis, editor, Logic from Computer Science, Proceedings of a Workshop held November 13–17, 1989, number 21 in MSRI Publications, pages 95–127, New York, 1992. Springer.

M. Felleisen and R. Hieb. The revised report on the syntactic theory of sequential control and state. Theoretical Computer Science, 102:235–271, 1992.

M. Felleisen, D. P. Friedman, E. Kohlbecker, and B. Duba. A syntactic theory of sequential control. Theoretical Computer Science, 52:205–237, 1987.

A. Filinski. A semantic account of type-directed partial evaluation. In Principles and Practice of Declarative Programming 1999, volume 1702 of LNCS, pages 378–395. Springer Verlag, Berlin, Heidelberg, New York, 1999.

H. Friedman. Iterated inductive definitions and Σ¹₂-AC. In Kino et al. (1970), pages 435–442.

H. Friedman. Classically and intuitionistically provably recursive functions. In D. Scott and G. Müller, editors, Higher Set Theory, volume 669 of Lecture Notes in Mathematics, pages 21–28. Springer Verlag, Berlin, Heidelberg, New York, 1978.

H. M. Friedman and M. Sheard. Elementary descent recursion and proof theory. Annals of Pure and Applied Logic, 71:1–45, 1995.

G. Gentzen. Untersuchungen über das logische Schließen. Mathematische Zeitschrift, 39:176–210, 405–431, 1934.

G. Gentzen. Die Widerspruchsfreiheit der reinen Zahlentheorie. Mathematische Annalen, 112:493–565, 1936.

J.-Y. Girard. Une extension de l'interprétation de Gödel à l'analyse, et son application à l'élimination des coupures dans l'analyse et la théorie des types. In J. Fenstad, editor, Proceedings of the Second Scandinavian Logic Symposium, pages 63–92. North-Holland, Amsterdam, 1971.

J.-Y. Girard. Π¹₂-logic. Part I: Dilators. Annals of Mathematical Logic, 21:75–219, 1981.

J.-Y. Girard. Proof Theory and Logical Complexity. Bibliopolis, Napoli, 1987.

J.-Y. Girard. Light linear logic. Information and Computation, 143, 1998.

K. Gödel. Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I. Monatshefte für Mathematik und Physik, 38:173–198, 1931.

K. Gödel. Über eine bisher noch nicht benützte Erweiterung des finiten Standpunkts. Dialectica, 12:280–287, 1958.

K. Gödel. Collected Works, Volume II, Publications 1938–1974. Oxford University Press, 1990.

R. Graham, B. Rothschild, and J. Spencer. Ramsey Theory. Wiley, 1990.

T. G. Griffin. A formulae-as-types notion of control. In Conference Record of the Seventeenth Annual ACM Symposium on Principles of Programming Languages, pages 47–58, 1990.

A. Grzegorczyk. Some classes of recursive functions. Rozprawy Matematyczne, Warszawa, 1953.

P. Hájek and P. Pudlák. Metamathematics of First-Order Arithmetic. Perspectives in Mathematical Logic. Springer Verlag, Berlin, Heidelberg, New York, 1993.

G. H. Hardy. A theorem concerning the infinite cardinal numbers. Quarterly Journal of Mathematics, 35:87–94, 1904.

M.-D. Hernest. Feasible programs from (non-constructive) proofs by the light (monotone) Dialectica interpretation. PhD thesis, École Polytechnique Paris and LMU München, 2006.

A. Heyting, editor. Constructivity in Mathematics, 1959. North-Holland, Amsterdam.

D. Hilbert and P. Bernays. Grundlagen der Mathematik II, volume 50 of Grundlehren der mathematischen Wissenschaften. Springer-Verlag, Berlin, second edition, 1970.

M. Hofmann. Linear types and non-size-increasing polynomial time computation. In Proceedings 14th Symposium on Logic in Computer Science (LICS '99), pages 464–473, 1999.

W. A. Howard. Assignment of ordinals to terms for primitive recursive functionals of finite type. In Kino et al. (1970), pages 443–458.

S. Huber. On the computational content of choice axioms. Master's thesis, Mathematisches Institut der Universität München, 2009.

H. Ishihara. A note on the Gödel-Gentzen translation. Mathematical Logic Quarterly, 46:135–137, 2000.

F. Joachimski and R. Matthes. Short proofs of normalisation for the simply-typed λ-calculus, permutative conversions and Gödel's T. Archive for Mathematical Logic, 42:59–87, 2003.

C. G. Jockusch. Ramsey's theorem and recursion theory. The Journal of Symbolic Logic, 37:268–280, 1972.

I. Johansson. Der Minimalkalkül, ein reduzierter intuitionistischer Formalismus. Compositio Mathematica, 4:119–136, 1937.


K. F. Jørgensen. Finite type arithmetic. Master's thesis, University of Roskilde, 2001.

N. Kadota. On Wainer's notation for a minimal subrecursive inaccessible ordinal. Mathematical Logic Quarterly, 39:217–227, 1993.

L. Kalmár. Ein einfaches Beispiel für ein unentscheidbares Problem (Hungarian, with German summary). Mat. Fiz. Lapok, 50:1–23, 1943.

J. Ketonen and R. Solovay. Rapidly growing Ramsey functions. Ann. of Math., Ser. 2, 113:267–314, 1981.

A. Kino, J. Myhill, and R. Vesley, editors. Intuitionism and Proof Theory, Studies in Logic and the Foundations of Mathematics, 1970. North-Holland, Amsterdam.

L. Kirby and J. Paris. Accessible independence results for Peano Arithmetic. Bulletin of the London Mathematical Society, 14:285–293, 1982.

S. C. Kleene. Introduction to Metamathematics. D. van Nostrand Comp., New York, 1952.

S. C. Kleene. Extension of an effectively generated class of functions by enumeration. Colloquium Mathematicum, 6:67–78, 1958.

U. Kohlenbach. Analysing proofs in analysis. In W. Hodges, M. Hyland, C. Steinhorn, and J. Truss, editors, Logic: from Foundations to Applications. European Logic Colloquium (Keele, 1993), pages 225–260. Oxford University Press, 1996.

A. N. Kolmogorov. On the principle of the excluded middle (Russian). Matematicheskij Sbornik. Akademiya Nauk SSSR i Moskovskoe Matematicheskoe Obshchestvo, 32:646–667, 1925. Translated in J. van Heijenoort, From Frege to Gödel. A Source Book in Mathematical Logic 1879–1931, Harvard University Press, Cambridge, MA, 1967, pp. 414–437.

A. N. Kolmogorov. Zur Deutung der intuitionistischen Logik. Math. Zeitschr., 35:58–65, 1932.

G. Kreisel. On the interpretation of non-finitist proofs I. The Journal of Symbolic Logic, 16:241–267, 1951.

G. Kreisel. On the interpretation of non-finitist proofs II. The Journal of Symbolic Logic, 17:43–58, 1952.

G. Kreisel. Interpretation of analysis by means of constructive functionals of finite types. In Heyting (1959), pages 101–128.

G. Kreisel. Generalized inductive definitions. In Reports for the seminar on foundations of analysis, volume I. Stanford University, mimeographed, 1963.

L. Kristiansen and D. Normann. Total objects in inductively defined types. Archive for Mathematical Logic, 36(6):405–436, 1997.

J.-L. Krivine. Classical logic, storage operators and second-order lambda-calculus. Annals of Pure and Applied Logic, 68:53–78, 1994.

K. G. Larsen and G. Winskel. Using information systems to solve recursive domain equations. Information and Computation, 91:232–258, 1991.

D. Leivant. Syntactic translations and provably recursive functions. The Journal of Symbolic Logic, 50(3):682–688, September 1985.

D. Leivant. Predicative recurrence in finite type. In A. Nerode and Y. Matiyasevich, editors, Logical Foundations of Computer Science, volume 813 of LNCS, pages 227–239, 1994.


D. Leivant. Ramified recurrence and computational complexity I: Word recurrence and poly-time. In P. Clote and J. Remmel, editors, Feasible Mathematics II, pages 320–343. Birkhäuser, Boston, 1995a.

D. Leivant. Intrinsic theories and computational complexity. In D. Leivant, editor, Logic and Computational Complexity, International Workshop LCC '94, Indianapolis, IN, USA, October 1994, volume 960 of LNCS, pages 177–194. Springer Verlag, Berlin, Heidelberg, New York, 1995b.

D. Leivant and J.-Y. Marion. Lambda calculus characterization of poly-time. In M. Bezem and J. Groote, editors, Typed Lambda Calculi and Applications, pages 274–288. LNCS Vol. 664, 1993.

S. Liu. A theorem on general recursive functions. Proceedings American Mathematical Society, 11:184–187, 1960.

M. Löb and S. S. Wainer. Hierarchies of number theoretic functions I, II. Archiv für mathematische Logik und Grundlagenforschung, 13:39–51, 97–113, 1970.

J.-Y. Marion. Actual arithmetic and feasibility. In L. Fribourg, editor, 15th International Workshop, Computer Science Logic, CSL '01, volume 2142 of Lecture Notes in Computer Science, pages 115–139. Springer, 2001.

P. Martin-Löf. Hauptsatz for the intuitionistic theory of iterated inductive definitions. In J. Fenstad, editor, Proceedings of the Second Scandinavian Logic Symposium, pages 179–216. North-Holland, Amsterdam, 1971.

P. Martin-Löf. The domain interpretation of type theory. Talk at the workshop on semantics of programming languages, Chalmers University, Göteborg, August 1983.

G. E. Mints. Quantifier-free and one-quantifier systems. Journal of Soviet Mathematics, 1:71–84, 1973.

G. E. Mints. Finite investigations of transfinite derivations. Journal of Soviet Mathematics, 10:548–596, 1978. Translated from: Zap. Nauchn. Semin. LOMI 49 (1975).

A. Miquel. The implicit calculus of constructions. Extending pure type systems with an intersection type binder and subtyping. In Proceedings of the Fifth International Conference on Typed Lambda Calculi and Applications (TLCA '01), Kraków (Poland), 2001.

C. Murthy. Extracting constructive content from classical proofs. Technical Report 90-1151, Dep. of Comp. Science, Cornell Univ., Ithaca, New York, 1990. PhD thesis.

J. Myhill. A stumbling block in constructive mathematics (abstract). The Journal of Symbolic Logic, 18:190, 1953.

S. Negri and J. von Plato. Structural Proof Theory. Cambridge University Press, 2001.

D. Normann. Computability over the partial continuous functionals. The Journal of Symbolic Logic, 65(3):1133–1142, 2000.

D. Normann. Computing with functionals – computability theory or computer science? The Bulletin of Symbolic Logic, 12:43–59, 2006.

P. Oliva. Unifying functional interpretations. Notre Dame J. Formal Logic, 47:262–290, 2006.

G. Ostrin and S. S. Wainer. Elementary arithmetic. Annals of Pure and Applied Logic, 133:275–292, 2005.


M. Parigot. λμ-calculus: an algorithmic interpretation of classical natural deduction. In Proc. of Log. Prog. and Automatic Reasoning, St. Petersburg, volume 624 of LNCS, pages 190–201. Springer Verlag, Berlin, Heidelberg, New York, 1992.

J. Paris. A hierarchy of cuts in models of arithmetic. In L. Pacholski et al., editors, Model Theory of Algebra and Arithmetic, volume 834 of Lecture Notes in Mathematics, pages 312–337. Springer Verlag, 1980.

J. Paris and L. Harrington. A mathematical incompleteness in Peano Arithmetic. In J. Barwise, editor, Handbook of Mathematical Logic, pages 1133–1142. North-Holland, Amsterdam, 1977.

C. Parsons. Ordinal recursion in partial systems of number theory (abstract). Notices of the American Mathematical Society, 13:857–858, 1966.

C. Parsons. On n-quantifier induction. The Journal of Symbolic Logic, 37(3):466–482, 1972.

G. D. Plotkin. LCF considered as a programming language. Theoretical Computer Science, 5:223–255, 1977.

G. D. Plotkin. Tω as a universal domain. Journal of Computer and System Sciences, 17:209–236, 1978.

D. Prawitz. Natural Deduction, volume 3 of Acta Universitatis Stockholmiensis. Stockholm Studies in Philosophy. Almqvist & Wiksell, Stockholm, 1965.

F. Ramsey. On a problem of formal logic. Proc. London Math. Soc., Ser. 2, 30:264–286, 1930.

Z. Ratajczyk. Subsystems of true arithmetic and hierarchies of functions. Annals of Pure and Applied Logic, 64:95–152, 1993.

P. Rath. Eine verallgemeinerte Funktionalinterpretation der Heyting Arithmetik endlicher Typen. PhD thesis, Universität Münster, Fachbereich Mathematik, 1978.

M. Rathjen. A proof-theoretic characterization of primitive recursive set functions. The Journal of Symbolic Logic, 57:954–969, 1992.

M. Rathjen. The realm of ordinal analysis. In S. Cooper and J. Truss, editors, Sets and Proofs: Logic Colloquium '97, volume 258 of London Mathematical Society Lecture Notes, pages 219–279. Cambridge University Press, 1999.

W. Richter. Extensions of the constructive ordinals. The Journal of Symbolic Logic, 30(2):193–211, 1965.

J. W. Robbin. Subrecursive Hierarchies. PhD thesis, Princeton University, 1965.

R. M. Robinson. An essentially undecidable axiom system. In Proceedings of the International Congress of Mathematicians (Cambridge 1950), volume I, pages 729–730, 1950.

D. Rödding. Klassen rekursiver Funktionen. In Proceedings of the Summer School in Logic, volume 70 of Lecture Notes in Mathematics, pages 159–222. Springer Verlag, Berlin, Heidelberg, New York, 1968.

H. E. Rose. Subrecursion: Functions and hierarchies, volume 9 of Oxford Logic Guides. Clarendon Press, Oxford, 1984.

N. Routledge. Ordinal recursion. Proc. Cambridge Phil. Soc., 49:175–182, 1953.


Bibliography 365

D. Schmidt. Built-up systems of fundamental sequences and hierarchies of number-theoretic functions. Archiv für Mathematische Logik und Grundlagenforschung, 18:47–53, 1976.

K. Schütte. Beweistheoretische Erfassung der unendlichen Induktion in der Zahlentheorie. Mathematische Annalen, 122:369–389, 1951.

K. Schütte. Proof Theory. Springer Verlag, Berlin, Heidelberg, New York, 1977.

H. Schwichtenberg. An arithmetic for polynomial-time computation. Theoretical Computer Science, 357:202–214, 2006a.

H. Schwichtenberg. Minlog. In F. Wiedijk, editor, The Seventeen Provers of the World, volume 3600 of LNAI, pages 151–157. Springer Verlag, 2006b.

H. Schwichtenberg. Eine Klassifikation der ε0-rekursiven Funktionen. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, 17:61–74, 1971.

H. Schwichtenberg. Elimination of higher type levels in definitions of primitive recursive functionals by means of transfinite recursion. In H. Rose and J. Shepherdson, editors, Logic Colloquium '73, pages 279–303. North-Holland, Amsterdam, 1975.

H. Schwichtenberg. Proof theory: Some applications of cut-elimination. In J. Barwise, editor, Handbook of Mathematical Logic, volume 90 of Studies in Logic and the Foundations of Mathematics, chapter Proof Theory and Constructive Mathematics, pages 867–895. North-Holland, Amsterdam, 1977.

H. Schwichtenberg. Proofs as programs. In P. Aczel, H. Simmons, and S. Wainer, editors, Proof Theory. A selection of papers from the Leeds Proof Theory Programme 1990, pages 81–113. Cambridge University Press, 1993.

H. Schwichtenberg. Density and choice for total continuous functionals. In P. Odifreddi, editor, Kreiseliana. About and Around Georg Kreisel, pages 335–362. A.K. Peters, Wellesley, Massachusetts, 1996.

H. Schwichtenberg and S. Bellantoni. Feasible computation with higher types. In H. Schwichtenberg and R. Steinbrüggen, editors, Proof and System-Reliability, Proceedings NATO Advanced Study Institute, Marktoberdorf, 2001, pages 399–415. Kluwer Academic Publisher, 2002.

H. Schwichtenberg and S. S. Wainer. Ordinal bounds for programs. In P. Clote and J. Remmel, editors, Feasible Mathematics II, pages 387–406. Birkhäuser, Boston, 1995.

D. Scott. Domains for denotational semantics. In E. Nielsen and E. Schmidt, editors, Automata, Languages and Programming, volume 140 of LNCS, pages 577–613. Springer Verlag, Berlin, Heidelberg, New York, 1982.

J. Shepherdson and H. Sturgis. Computability of recursive functions. J. Ass. Computing Machinery, 10:217–255, 1963.

W. Sieg. Fragments of arithmetic. Annals of Pure and Applied Logic, 28:33–71, 1985.

W. Sieg. Herbrand analyses. Archive for Mathematical Logic, 30:409–441, 1991.

H. Simmons. The realm of primitive recursion. Archive for Mathematical Logic, 27:177–188, 1988.


S. Simpson. Subsystems of Second Order Arithmetic. Perspectives in Mathematical Logic. Springer Verlag, Berlin, Heidelberg, New York, 1999.

C. Smoryński. Logical Number Theory I. Universitext. Springer Verlag, Berlin, Heidelberg, New York, 1991.

R. Sommer. Ordinal arithmetic in I∆0. In P. Clote and J. Krajíček, editors, Arithmetic, Proof Theory and Computational Complexity. Oxford University Press, 1992.

R. Sommer. Transfinite induction within Peano arithmetic. Annals of Pure and Applied Logic, 76:231–289, 1995.

M. Stein. Interpretationen der Heyting-Arithmetik endlicher Typen. PhD thesis, Universität Münster, Fachbereich Mathematik, 1976.

V. Stoltenberg-Hansen, E. Griffor, and I. Lindström. Mathematical Theory of Domains. Cambridge Tracts in Theoretical Computer Science. Cambridge University Press, 1994.

W. W. Tait. Nested recursion. Math. Annalen, 143:236–250, 1961.

W. W. Tait. Normal derivability in classical logic. In J. Barwise, editor, The Syntax and Semantics of Infinitary Languages, volume 72 of Lecture Notes in Mathematics, pages 204–236. Springer Verlag, Berlin, Heidelberg, New York, 1968.

W. W. Tait. Normal form theorem for bar recursive functions of finite type. In J. Fenstad, editor, Proceedings of the Second Scandinavian Logic Symposium, pages 353–367. North-Holland, Amsterdam, 1971.

M. Takahashi. Parallel reductions in λ-calculus. Information and Computation, 118:120–127, 1995.

G. Takeuti. Proof Theory. North-Holland, Amsterdam, second edition, 1987.

A. S. Troelstra, editor. Metamathematical Investigation of Intuitionistic Arithmetic and Analysis, volume 344 of Lecture Notes in Mathematics. Springer Verlag, Berlin, Heidelberg, New York, 1973.

A. S. Troelstra and H. Schwichtenberg. Basic Proof Theory. Cambridge University Press, second edition, 2000.

A. S. Troelstra and D. van Dalen. Constructivism in Mathematics. An Introduction, volume 121, 123 of Studies in Logic and the Foundations of Mathematics. North-Holland, Amsterdam, 1988.

J. V. Tucker and J. I. Zucker. Provable computable selection functions on abstract structures. In Aczel et al. (1992), pages 275–306.

F. van Raamsdonk and P. Severi. On normalisation. Computer Science Report CS-R9545, Centrum voor Wiskunde en Informatica, 1995.

S. S. Wainer. A classification of the ordinal recursive functions. Archiv für Mathematische Logik und Grundlagenforschung, 13:136–153, 1970.

S. S. Wainer. Ordinal recursion, and a refinement of the extended Grzegorczyk hierarchy. The Journal of Symbolic Logic, 38:281–292, 1972.

S. S. Wainer. Slow growing versus fast growing. The Journal of Symbolic Logic, 54(2):608–614, June 1989.

S. S. Wainer. Accessible recursive functions. The Bulletin of Symbolic Logic, 5(3):367–388, 1999.

A. Weiermann. How to characterize provably total functions by local predicativity. The Journal of Symbolic Logic, 61(1):52–69, 1996.


A. Weiermann. What makes a (pointwise) subrecursive hierarchy slow growing? In S. Cooper and J. Truss, editors, Sets and Proofs: Logic Colloquium '97, volume 258 of London Mathematical Society Lecture Notes, pages 403–423. Cambridge University Press, 1999.

R. S. Williams. Finitely iterated inductive definitions over a predicative arithmetic. PhD thesis, Department of Pure Mathematics, Leeds University, 2004.

F. Zemke. P.R.-regulated systems of notation and the subrecursive hierarchy equivalence property. Transactions of the American Math. Soc., 234:89–118, 1977.

J. Zucker. Iterated inductive definitions, trees and ordinals. In Troelstra (1973), pages 392–453.


Index

→, 20
→+, 20
→∗, 20, 221
A(t), 5
E[~x := ~t], 5
E[x := t], 5
accessible part, 82, 259
Ackermann, iii, 117
Ackermann-Péter function, 67
addition, 334
Alexandrov, 205
  condition, 205
algebra, 208
  explicit, 210
  finitary, 210
  of witnesses, 270
  simultaneously defined, 209
  structure-finitary, 210
algebraic, 202
animate, 281
append, 216
application, 203, 207
approximable map, 204
argument type
  parameter, 208
  recursive, 208
arity, 255, 350
  of a defined constant, 220
assignment, 29, 36
assumption, 6
  cancelled, 7
  closed, 7
  open, 7
atom, 4
  critical, 292
Avigad, 118
axiom of choice, 302
axiom of dependent choice, 38, 83
bar, 29
Barendregt, 221
base type, 210
basis, 202
Bellantoni, 87
Benl, 252
Berger, 251, 265
BHK-interpretation, 246, 253
Börger, 110
boolean, 208
bounded complete, 202
bounded summation, 334
branch
  generic, 38
  main, 26
Brouwer, 157, 159, 246, 253
Bruijn, de, 5
Buchholz, 118, 138
Buss, 118
canonical inhabitant, 216
Cantor normal form, 124
case-construct, 218
C-operator, 217, 333
Church, iii, 86, 102
Cichon, 118, 148, 149
clause, 255
  nullary, 256
clause form, 255
closure ordinal, 80
coclause form, 262
coclosure ordinal, 81
codirected, 81
coinduction, 79, 262
  for monotone inductive definitions, 79
  strengthened form, 79
coinductive definition
  monotone, 78
commutation with directed unions, 206
compact, 202
compatibility, 257
complete partial ordering, 202
composition, 207
comprehension term, 256
computation rule, 216, 220
concatenation, 52
conclusion, 6
confluent, 222
conjunction, 9, 258


  double ∧d, 267, 273
  left ∧l, 267, 273
  non-computational ∧nc, 271, 273
  right ∧r, 267, 273
consistency, 111
consistent, 201
consistent set of formulas, 40
consistently complete, 202
Constable, 311
constant, 4, 333
constructor
  as continuous function, 211
constructor pattern, 220
constructor symbol, 209
constructor type
  nullary, 208, 209
context, 98
continuous, 81
conversion, 220, 229
  permutative, 15
  simplification, 19
conversion relation, 215
conversion rule, 20, 337
Cook, 87
corecursion, 284
corecursion operator, 219
cototality, 262, 263, 282
cpo, 202
Curry, 323
Curry-Howard correspondence, 322
cut, 25
DC, 83
deanimate, 281
decoding, 51
decoration, 312
decoration algorithm, 312
deductive closure, 202
deductively closed, 201
definability
  explicit, 27
definition
  coinductive, 78
  inductive, 78
dense, 247
depth, 211
  of a formal neighborhood, 247
  of an extended token, 247
derivability conditions, 113
derivable, 7
  classically, 10
  intuitionistically, 10
derivation, 6, 209
destructor, 221, 284
diamond property, 222
directed, 81
directed set, 202
disjunction, 9, 259
domain, 202
drinker formula, 13
E-part, 26
E-rule, 7
effectivity principle, 200
elimination, 20
elimination axiom, 256, 268, 270
elimination part, 26
entailment, 201
Eq, 271
equality
  decidable, 221, 257
  Leibniz, 257
  pointwise, 257
equivalent
  computationally, 270
Ershov, 202, 251
Ex-Falso-Quodlibet, 257
ex-falso-quodlibet, 4, 10
existence, 221
existential quantifier, 9, 258, 271
  double ∃d, 267, 273
  left ∃l, 267, 273
  non-computational ∃nc, 271, 273
  right ∃r, 267, 273
explicit definability, 27
extensionality, 258
Fairtlough, 118, 132
falsity F, 257
falsum ⊥, 4
fast growing hierarchy, 127
Feferman, 118
Felleisen, 322
field, 72
finitely observable property, 205
fixed point
  least, 59
forces, 29
formal neighborhood, 201
formula, 4
  Σ1, 88
  atomic, 4
  bounded, 88
  closed, 5
  computationally irrelevant, 269
  computationally relevant, 269
  invariant, 271
  prime, 4
  safe, 332, 351
  with negative content, 300
  with positive content, 300
free (for a variable), 5
Friedman, 33, 118
function
  computable, 57


  constant, 207
  elementary, 46
  µ-recursive, 56
  recursive, 61
  representable, 103
  strict, 212
  subelementary, 46, 86
function symbol, 4
functional
  computable, 211
  recursive in pcond and ∃, 241
FV, 5
gap condition, 190
Gentzen, iii, 3, 117, 136, 137, 148
Girard, iii, 118, 159, 252
Gödel, iii, 86, 87, 105, 107, 110, 111, 113, 253
  kernel, 301
  number, 96
  translation, 301
Goodstein, 117, 148
  sequence, 148, 150
Grädel, 110
Graham, 151
greatest fixed point, 78
Griffin, 322
Grzegorczyk, 86, 117
Gurevich, 110
Hájek, 118
Hardy, 127, 128, 133
  hierarchy, 127, 148, 149
Harrington, 117, 148
Harrop formula, 269
Herbrand, iii, 86
Herbrand-Gödel-Kleene equation calculus, iii
Heyting, 246, 253
Heyting arithmetic, 304
higher type, 210
Hilbert, iii
honest, 67
Howard, 171, 323
I-part, 26
I-rule, 7
i-term, 191
ideal, 201
  cototal, 213
  structure-cototal, 213
  structure-total, 213, 246
  total, 213, 246
identity, 207
identity lemma, 282
implication
  computational →c, 265
  input →, 349
  non-computational →nc, 265
  output →, 349
incompleteness theorem
  first, 105
independence of premise, 302
induction, 261
  computational, 275
  for monotone inductive definitions, 79, 82
  general, 261
  strengthened form, 79, 255
  transfinite, 135
  transfinite structural, 134
inductive definition
  explicit, 260
  monotone, 78
infix, 4
information system, 201
  atomic, 212
  coherent, 212
  flat, 201
  of a type ρ, 210
inhabitant
  canonical, 216
instruction number, 54
interval
  standard rational, 213, 262
introduction axiom, 256
introduction part, 26
intuitionistic logic, 4
invariant, 271
Ishihara, 322
Kadota, 164
Kalmár, 46, 86, 161, 171
Ketonen, 117, 148, 151
Kirby, 118, 148, 150
Kleene, iii, iv, 43, 86, 156, 157, 159, 239, 251
Kleene-Brouwer ordering, 76
Knaster-Tarski theorem, 78, 82
König, 83
König's lemma, 83
Kolmogorov, iii, 246, 253
Kreisel, iv, 117, 251
Löb, 114
Löwenheim, 40
language
  elementarily presented, 96
leaf, 28
least fixed point, 78
least number operator, 46
least-fixed-point axiom, 268
least-fixed-point operator, 239
left f-minimum, 299
Leibniz equality, 271
Leivant, 87


length, 51
  of a segment, 25
  of a term, 337
let, 282
level, 210
list, 209
Liu, 155
Loeb, 118
Löb, 111, 113
logic
  minimal, 7
Malcev, iii
marker, 7
Markov principle, 302
Martin-Löf, 221
maximal segment, 25
McCarthy, 86
measure function, 135, 261, 297
minimum part, 26
minimum principle, 295
Minlog, v, 234, 253, 311, 322, 332
Mints, 118
model, 36
  classical, 37
modus ponens, 7
monotone, 206
monotonicity principle, 200
Moschovakis, 319
Myhill, 155, 159
negation, 5
negation normal form, 40
neighborhood
  formal, 201
node
  consistent, 38
  stable, 38
normal form, 20
  theorem, 55
normalization
  by evaluation, 239
nullterm, 269
nulltype, 269
number
  binary, 209
  positive, 209
  unary, 208
numeral, 102, 342
  binary, 342
object, 201
one-step extension, 213
operator, 78
  Σ0r-definable, 83
  closure of an, 80
  coclosure of an, 81
  cocontinuous, 81
  continuous, 81
  inclusive, 80
  monotone, 78
  selective, 80
order of a track, 26
ordinal, 209
  recursive, 76
parallel or, 241
parameter argument type, 208
parameter premise, 256
parentheses, vii
Paris, 117, 118, 148, 150
Parsons, 117–119, 138
part
  elimination, 26
  introduction, 26
  minimum, 26
  strictly positive, 6
partial continuous functionals, 211
Peano, 114, 118, 119, 123
Peano Arithmetic, 114
Peirce formula, 32
permutative conversion, 15
Péter, iii
PFS, 206
Plotkin, 241, 251
predecessor, 218, 221, 334
predicate
  coinductively defined, 262
  finitary coinductively defined, 263
  finitary inductively defined, 256
  inductively defined, 256
predicate parameter, 255
predicate symbol, 4
  critical, 322
premise, 6
  major, 7, 9
  minor, 7, 9
principle of finite support, 199, 206
product, 209
progressive, 135, 261, 280, 295, 328
  structural, 134
projection, 216
proof, 6
proof term
  ordinary, 351
propositional symbol, 4
Pudlák, 118
Ramsey, 117, 150
rank, 339
Ratajczyk, 118
Rathjen, 118
rational interval
  standard, 213, 262
real
  abstract, 219, 282


realizability, 87, 269
recursion
  general, 218
  operator, simultaneous, 215
recursion operator, 215, 234
recursion theorem
  first, 60
recursive argument type, 208
recursive premise, 256
redex, 337
  β, 20, 216
  D, 216, 220
  η, 216
  permutative, 20
  R, 216
  simplification, 20
reducible, 71
reduction, 20
  inner, 20
  one-step, 20
  parallel, 222
  proper, 20
reduction sequence, 20, 82
reduction system, 82
register machine, 43
register machine computable, 45
relation
  ∆0r-definable, 69
  ∆1r-definable, 73
  Π0r-definable, 69
  Π1r-definable, 72
  Σ0r-complete, 71
  Σ0r-definable, 69
  Σ1r-complete, 75
  Σ1r-definable, 72
  analytical, 72
  arithmetical, 69
  confluent, 222
  definable, 102
  elementarily enumerable, 56
  elementary, 48
  noetherian, 82
  recursive, 101
  recursively enumerable, 68
  representable, 103
  terminating, 82
  universal, 70
relation symbol, 4
renaming, 5
representability, 103
Robbin, 117
Robinson, 109, 111
Rose, 118
Rosser, 105–107, 111
Rothschild, 151
Routledge, 155, 159
rule, 7
satisfiable set of formulas, 40
SC, 235
Schmidt, 164
Schütte, 117, 138
Schwichtenberg, 118
Scott, 200, 202, 205, 251
  condition, 205
  topology, 205
Scott-Ershov domain, 202
segment, 25
  maximal, 25
  minimum, 26
separating, 247
sequence
  reduction, 20
set
  consistent, 201
  unbounded, 299
set of formulas
  definable, 102
  elementary, 101
  primitive recursive, 101
  recursive, 101
  recursively enumerable, 101
Sheard, 118
Shepherdson, 43, 86
Sieg, 118
Σ1-formulas
  of the language L1, 110
signature, 4
signed digit, 213, 219
size of a term, 337
Skolem, iii, 40
slow growing hierarchy, 125
Solovay, 117, 148, 151
Sommer, 118
soundness theorem
  for classical logic, 37
  for Dialectica, 305
  for minimal logic, 31
  for realizability, 276
Spencer, 151
s.p.p., 6
stability, 10, 322
standard rational interval, 213, 262
state
  of computation, 55
step type, 215, 219, 333
stream representation, 213, 219, 262
strengthening, 267
strict, 212
strictly positive, 208, 255
strictly positive part, 6
strong computability, 235
structural existence, 221
structure-cototality, 263
structure-total, 246


structure-totality, 260
Sturgis, 43, 86
subformula, 5
  negative, 6
  positive, 6
  strictly positive, 6
subformula (segment), 25
subformula property, 27
substitution, 5, 98
substitution lemma
  for Σ01-definable relations, 69
subtraction
  modified, 334
sum, 209
syntactically total, 293
T, 234
Tait, 118, 121, 132, 138, 142, 221
Takeuti, 118
Tarski, 103, 104
TCF, 256
term, 4
  extracted, 273
  first-order, 347
  input, 334, 341
  LT(;), 334, 341
  of T+, 220
  of Gödel's T, 215, 234
  simple, 344
term family, 238
theory
  axiomatized, 101
  complete, 101
  consistent, 101
  elementarily axiomatizable, 101
  incomplete, 101
  inconsistent, 101
  primitive recursively axiomatizable, 101
  recursively axiomatizable, 101
token, 201
  extended, 210
total, 246
totality, 254, 260
track, 25
  main, 26
transitive closure, 259, 274
tree, 209
  binary, 209
  infinite, 28
tree model, 29
  for intuitionistic logic, 32
tree ordinal, 161
  structured, 162
Troelstra, 118
truth, 102
Tucker, 118
Turing, iii, 43, 86
type, 208, 333, 341
  base, 210
  dense, 247
  higher, 210
  safe, 333, 341
  separating, 247
  with nullary constructors, 247
type parameter, 208
undefinability theorem, 103
unit, 208
universal quantifier
  computational ∀c, 265
  non-computational ∀nc, 265
validity, 36
valmax, 240
variable, 4
  assumption, 7
  computational, 352
  free, 5
  normal, 325
  object, 7
  safe, 325
variable condition, 8, 9
Wainer, 118, 132, 164, 195
Weiermann, 118
well-foundedness, 294
witnessing predicate, 270
word, 342
Zemke, 118
Zucker, 118, 252