NummSquared: a new well-founded, functional foundation for formal methods

Samuel Howse, poohbist.com, [email protected]

October 10, 2006

Copyright © 2006 Samuel Howse. All rights reserved.

Formal methods

• Logic and mathematics applied to computer science

• Prove that implementation is “correct”
  – Compliant with specification
    • Incorrect specification – a separate issue
  – Well-behaved
    • Terminates, i.e. does not run forever
    • Memory safety – easier to solve with automatic memory management

Some programming paradigms

• Typical operating system – programmer view:
  – File system
  – Interprocess communication
  – Global state – indeterminate, insecure

• Imperative paradigm
  – Side-effects – memory can change unexpectedly

• Functional paradigm
  – No side-effects

• No global state + functional paradigm
  – More secure
  – Easier starting point for formal methods

Some problems in formal methods

• Assuming no global state + functional paradigm, ≥ 2 remaining problems:
  1. Specification
     • Express specification precisely (math)
     • Express implementation behavior (math)
     • Prove implementation satisfies specification (logic)
  2. Termination – does a program run forever?
     • Turing: general answer cannot be computed
     • Prove special cases (math + logic)

1. Specification

• Integrated approach is 1 language for:
  – Specification
  – Implementation
  – Proof

• 1 language must be:
  – A mathematical foundation
  – A programming language
  – A logic

• A typical programming language is too complex to be a mathematical foundation or a logic.

2. Termination

• Does it matter? Just hit Ctrl-C (or Ctrl-Alt-Del and End Process).

• Non-termination is closely related to (but does not always imply) inconsistency, i.e. there exists a proposition P such that P and not(P) are both provable.

• Inconsistency is a critical bug in a logic.

Untyped lambda calculus

• Everything is a function.

• Everything can be used as an argument.

• Non-termination:
  – Let f(x) = x(x).
  – f(f) → f(f) → f(f) → …

Russell’s paradox

• Russell’s paradox: [Seldin]
  – In naïve set theory: Let R be the set of all sets that do not contain themselves. Is R in R?
  – In untyped lambda calculus plus negation
  – Let R(x) = not(x(x)).
  – R(R) = not(R(R)).
  – R(R) is neither true nor false – inconsistency.

• The paradox can be averted by a logic that is in some way non-classical (e.g. Andrews, Gilmore’s NaDSyL, Grue’s map theory).
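The classical half of the argument above can be checked mechanically. The following is a minimal sketch in Coq, assuming that truth values are modelled by Coq's bool (an assumption of this sketch, not something the slides state): if R(R) had a classical truth value b, then b would have to equal its own negation, and no such b exists.

```coq
(* Sketch only: truth values are modelled by Coq's bool (an assumption). *)

(* No classical truth value equals its own negation, so the equation
   R(R) = not(R(R)) cannot be satisfied by assigning true or false to R(R). *)
Lemma no_fixed_point_of_negb : forall b : bool, b <> negb b.
Proof.
  intros b. destruct b; simpl; discriminate.
Qed.
```

The lemma name is hypothetical. The sketch says nothing about the non-classical logics mentioned above, which avoid the problem by other means.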

Type theory

• Types can ensure termination and avoid paradoxes.

• But more complex – 2 fundamental concepts:
  – Type
  – Function

• Compile-time type checking is computable.
  – Benefit – catch some bugs early and automatically
  – Cost – additional constraints on programmer
    • My research addresses this cost by using runtime coercion instead of compile-time type checking.

• Coq proof assistant
  – Nice mix of computation and proof checking
  – Practical tools, including extraction to OCaml
  – Highly developed
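As an illustration of how types rule out the self-application f(x) = x(x), here is a small Coq sketch. The names self_apply, self_apply' and twice are hypothetical. The Fail prefix is a standard Coq command modifier that succeeds exactly when the command after it is rejected.

```coq
(* Self-application is rejected by the type checker: x cannot be both a
   function on nat and a nat. *)
Fail Definition self_apply (x : nat -> nat) := x x.

(* Without annotations, no type can be inferred for x either. *)
Fail Definition self_apply' := fun x => x x.

(* A well-typed higher-order function, by contrast, is accepted and reduces. *)
Definition twice (f : nat -> nat) (n : nat) : nat := f (f n).

Example twice_succ : twice S 0 = 2.
Proof. reflexivity. Qed.
```

This also shows the cost mentioned above: the programmer has to find a typing that the checker accepts before anything runs.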

Set theory

• Everything is a set.

• Well-founded set theory
  – Membership (ε) is well-founded, i.e. no infinite descending ε chains.
  – A set is formed “after” its elements.

• Some well-founded set theories:
  – Zermelo-Fraenkel (ZF)
  – von Neumann-Bernays-Gödel (NBG)

• Usually no reduction (computation)

Well-founded functions

• Everything is a function.

• Well-founded
  – A function is formed “after” elements of its domain and range. [Jones]
  – Membership in the field of a function is a well-founded relation.

• Few foundations of this kind, and not popular

• von Neumann in 1925
  – Others changed functions to sets – became NBG

• Jones’s Pure Functions in 1998

• von Neumann and Jones – do not define reduction (computation) or proof

NummSquared

• Everything a function, no types, no side-effects, no global state, well-founded

• Reduction (computation) and proof

• Sound: the proposition of a valid proof is true

• Termination ensured without proofs by programmer

• Proofs as desired, but not required

• Classical logic

• Follows set theory as much as possible

• Simple variable-free syntax

• Reflection
  – NummSquared is its own macro language.
  – NummSquared is used to manipulate NummSquared proofs.

NummSquared coercion for domain membership

• When calling a function, how to ensure that the argument belongs to the function’s domain?

• NummSquared coercion: if it isn’t so, then make it so!

• Type conversion generalized to higher-order (function with function argument)

• Coerce a function to domain/codomain by applying pre-coercion/post-coercion.

• NummSquared coercion somewhat related to Howe’s restriction of untyped lambda terms.
  – Howe does not ensure termination.

Semantics: small function extensions

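[Diagram: the kinds of small function extension, labelled leaf, pair, rule and simple, with the results they give]

• Leaf null (base case): null<null> = null

• Leaf zero: zero<null> = null

• Leaf one: one<null> = null, one<zero> = zero

• Pair p: p<null> = null, p<zero> = left(p), p<one> = right(p)

• Rule f: dom(f) is small (no larger than a ZFC set); the diagram shows x, …, null in dom(f) and the result f<x>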
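The equations on this slide (null<null> = null, one<zero> = zero, p<zero> = left(p), p<one> = right(p), and so on) can be written down as a small Coq sketch for leaves and pairs only. The names Tree, TNull, TZero, TOne, TPair and raw are hypothetical, and returning None for an argument outside the domain is an assumption of this sketch: the slides only say which results are defined.

```coq
(* Leaves and pairs only; rules are left out of this sketch. *)
Inductive Tree : Type :=
  | TNull : Tree
  | TZero : Tree
  | TOne : Tree
  | TPair : Tree -> Tree -> Tree.

(* raw f x models f<x>; None marks an argument outside dom(f). *)
Definition raw (f x : Tree) : option Tree :=
  match f, x with
  | TNull, TNull => Some TNull
  | TZero, TNull => Some TNull
  | TOne, TNull => Some TNull
  | TOne, TZero => Some TZero
  | TPair _ _, TNull => Some TNull
  | TPair l _, TZero => Some l
  | TPair _ r, TOne => Some r
  | _, _ => None
  end.

Example pair_at_zero : raw (TPair TOne TZero) TZero = Some TOne.
Proof. reflexivity. Qed.

Example pair_at_one : raw (TPair TOne TZero) TOne = Some TZero.
Proof. reflexivity. Qed.

Example one_at_zero : raw TOne TZero = Some TZero.
Proof. reflexivity. Qed.

Example zero_at_one_is_outside : raw TZero TOne = None.
Proof. reflexivity. Qed.
```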

Semantics: domain extensions

• For a rule f, dom(f) is not, in general, amenable to coercion.

• So dom(f) represented by domain extension (tag):
  – Same information as a type in type theory
  – Different purpose – not compile-time type checking, but runtime coercion
  – Domain extensions never appear directly in NummSquared programs, but are available as small function extensions.

• Domain extension:
  – Constant: Null, Nuro (Null or Zero), Leaf, Tree
  – Combination:
    • dependent sum A – dom(A) contains null and certain pairs
    • dependent product A – dom(A) contains null and certain rules
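To make the constant domain extensions concrete, here is a hedged Coq sketch of domain membership for Null, Nuro, Leaf and Tree over leaves and pairs. The domains chosen (Null = {null}, Nuro = {null, zero}, Leaf = all leaves, Tree = all trees) are read off the names on the slide and are an assumption; the precise definitions are in NummSquared Formally. All identifiers are hypothetical.

```coq
Inductive Tree : Type :=
  | TNull : Tree
  | TZero : Tree
  | TOne : Tree
  | TPair : Tree -> Tree -> Tree.

(* Constant domain extensions only; dependent sums and products are omitted. *)
Inductive DomExt : Type := DNull | DNuro | DLeaf | DTree.

(* in_dom A t decides whether t belongs to dom(A) (assumed domains). *)
Definition in_dom (A : DomExt) (t : Tree) : bool :=
  match A, t with
  | DNull, TNull => true
  | DNuro, TNull => true
  | DNuro, TZero => true
  | DLeaf, TPair _ _ => false
  | DLeaf, _ => true
  | DTree, _ => true
  | _, _ => false
  end.

Example zero_in_nuro : in_dom DNuro TZero = true.
Proof. reflexivity. Qed.

Example one_not_in_nuro : in_dom DNuro TOne = false.
Proof. reflexivity. Qed.

Example pair_not_in_leaf : in_dom DLeaf (TPair TNull TNull) = false.
Proof. reflexivity. Qed.

(* In this sketch, every constant domain contains null. *)
Lemma null_in_every_dom : forall A : DomExt, in_dom A TNull = true.
Proof. intro A. destruct A; reflexivity. Qed.
```

For these first-order domains membership is decidable. The slides point out that the difficulty is with rules, which is why coercion is used instead of a membership test.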

Semantics: coercion

• For a valid domain extension A and a tagged small function extension f, coercion to A of f, denoted by coer(A, f), is in dom(A).

• If A is a dependent sum, and f is a pair, coerce left(f) first, then right(f).

• If A is a dependent product, and f is a rule, add pre-coercion and post-coercion to f:
  – A contains domain extension family F.
  – coer(A, f) is the rule r where dom(r) = dom(F) and r<x> = coer(F<x>, f<coer(domExt(f), tagged(F, x))>)
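The dependent sum case ("coerce left(f) first, then right(f)") can be sketched in Coq as follows. This is an illustration under stated assumptions, not the definition from NummSquared Formally: trees are untagged, a value that does not fit is sent to null, a dependent sum is modelled as a base domain extension plus a Coq function giving the family, and all identifiers are hypothetical.

```coq
Inductive Tree : Type :=
  | TNull : Tree
  | TZero : Tree
  | TOne : Tree
  | TPair : Tree -> Tree -> Tree.

(* DSum B F: pairs whose left part is in dom(B) and whose right part is in
   the domain of F applied to the left part. *)
Inductive DomExt : Type :=
  | DNull : DomExt
  | DNuro : DomExt
  | DLeaf : DomExt
  | DTree : DomExt
  | DSum : DomExt -> (Tree -> DomExt) -> DomExt.

(* Structural recursion on the domain extension, so coer always terminates. *)
Fixpoint coer (A : DomExt) (t : Tree) {struct A} : Tree :=
  match A with
  | DNull => TNull
  | DNuro => match t with TZero => TZero | _ => TNull end
  | DLeaf => match t with TPair _ _ => TNull | _ => t end
  | DTree => t
  | DSum B F =>
      match t with
      | TPair l r =>
          (* Coerce the left part first; it selects the domain for the right. *)
          let l' := coer B l in
          TPair l' (coer (F l') r)
      | _ => TNull
      end
  end.

(* Hypothetical family: after a zero any leaf is allowed, otherwise only null. *)
Definition fam (l : Tree) : DomExt :=
  match l with TZero => DLeaf | _ => DNull end.

Example fits_already :
  coer (DSum DNuro fam) (TPair TZero TOne) = TPair TZero TOne.
Proof. reflexivity. Qed.

Example left_forces_right :
  coer (DSum DNuro fam) (TPair TOne TOne) = TPair TNull TNull.
Proof. reflexivity. Qed.

Example non_pair_goes_to_null : coer (DSum DNuro fam) TOne = TNull.
Proof. reflexivity. Qed.
```

The recursion is accepted because each recursive call is on a structurally smaller domain extension, which mirrors the role the slides give to well-foundedness.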

Semantics: result

• See NummSquared Formally for:
  – The precise recursive definition of coercion
  – Justification of that recursive definition by a well-founded relation

• Use coercion to define tagged small function extensions over all tagged small function extensions

• For tagged small function extensions f and x, f(x) = f<coer(domExt(f), x)>

• Coercion and result are computable.
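The equation f(x) = f<coer(domExt(f), x)> and the idea of adding pre-coercion and post-coercion to a rule can be sketched in Coq for a simplified, non-dependent case. Assumptions of this sketch: rules take and return trees only, a rule carries its domain extension as a record field, the codomain is a single domain extension rather than a family, and ill-fitting values coerce to null. All identifiers are hypothetical.

```coq
Inductive Tree : Type :=
  | TNull : Tree
  | TZero : Tree
  | TOne : Tree
  | TPair : Tree -> Tree -> Tree.

Inductive DomExt : Type := DNull | DNuro | DLeaf | DTree.

(* Coercion to a constant domain extension (assumed behaviour). *)
Definition coer (A : DomExt) (t : Tree) : Tree :=
  match A with
  | DNull => TNull
  | DNuro => match t with TZero => TZero | _ => TNull end
  | DLeaf => match t with TPair _ _ => TNull | _ => t end
  | DTree => t
  end.

(* A tagged rule: its domain extension and its raw behaviour f<x>. *)
Record Rule : Type := mkRule { dom_ext : DomExt; raw : Tree -> Tree }.

(* f(x) = f<coer(domExt(f), x)>: coerce the argument, then apply. *)
Definition result (f : Rule) (x : Tree) : Tree :=
  raw f (coer (dom_ext f) x).

(* Coerce a rule to domain D and codomain C: result already pre-coerces
   into f's own domain, and coer C post-coerces what f returns. *)
Definition coer_rule (D C : DomExt) (f : Rule) : Rule :=
  mkRule D (fun x => coer C (result f x)).

(* Hypothetical rule on Nuro: zero goes to one, null goes to zero. *)
Definition flip : Rule :=
  mkRule DNuro (fun x => match x with TZero => TOne | _ => TZero end).

Example flip_zero : result flip TZero = TOne.
Proof. reflexivity. Qed.

Example flip_pair_is_coerced_first : result flip (TPair TOne TOne) = TZero.
Proof. reflexivity. Qed.

Example post_coercion_applies :
  result (coer_rule DLeaf DNuro flip) TZero = TNull.
Proof. reflexivity. Qed.

Example pre_coercion_applies :
  result (coer_rule DLeaf DNuro flip) TOne = TZero.
Proof. reflexivity. Qed.
```

Because result is a total Coq function, the application is defined for every argument, which is the sense in which coercion replaces a domain membership check.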

Semantics: large function extensions

• Abstract over all tagged small function extensions

• f such that, for each tagged small function extension x, f(x) is a tagged small function extension

• A large function extension is also a proposition extension:
  – f is true iff, for each tagged small function extension x, f(x) = one
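The truth condition above ("f is true iff, for each x, f(x) = one") can be stated directly in Coq. This is a sketch assuming that a large function extension is modelled as a total Coq function on untagged trees; the names Large and holds are hypothetical.

```coq
Inductive Tree : Type :=
  | TNull : Tree
  | TZero : Tree
  | TOne : Tree
  | TPair : Tree -> Tree -> Tree.

(* A large function extension, modelled as a total function on trees. *)
Definition Large : Type := Tree -> Tree.

(* f is true iff f(x) = one for every argument x. *)
Definition holds (f : Large) : Prop := forall x : Tree, f x = TOne.

(* The constant-one function is a true proposition. *)
Lemma const_one_holds : holds (fun _ => TOne).
Proof. unfold holds. intro x. reflexivity. Qed.

(* The identity function is not: it returns null at null. *)
Lemma identity_does_not_hold : ~ holds (fun x => x).
Proof.
  intro H.
  unfold holds in H.
  specialize (H TNull).
  simpl in H.
  discriminate H.
Qed.
```

A single counterexample is enough to refute a proposition, while establishing one requires an argument covering every x.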

NsGo

• NummSquared interpreter

• Work in progress

• Mostly automatically extracted from a Coq program
  – Enhanced reliability

• F# and C# .NET assembly
  – Automatic memory management for free

• Complete: integrated Coq with MSBuild (Visual Studio build system)

NsGo components

[Diagram: NsGo components. Boxes: NummSquared program (string); NummSquared abstract program (Coq); NummSquared normalized abstract program (Coq); NummSquared normalized program (string). Steps: Parser (ANTLR), Normalization (Coq), Printer. The labels Parser (ANTLR) and Printer each appear twice.]

The Future

• Continue work on NsGo, the NummSquared interpreter

• Equality for rules:
  – Currently extensional:
    • Functions equal when domains and results equal
    • Equates different algorithms
    • Not computable
  – Future:
    • Gilmore’s intensional equality and HiLog equality distinguish functions by name.
    • Gilmore’s “use” vs. “mention” distinction.

• Lambda syntactic sugar

• Reduction under lambdas

• Efficient reduction algorithms for coercion
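The remark that extensional equality "equates different algorithms" can be illustrated with a short Coq sketch. It assumes functions on untagged trees and pointwise equality as the notion of extensional equality; mirror, ext_eq and the lemma names are hypothetical. Mirroring a tree twice does real work, yet it is extensionally equal to doing nothing.

```coq
Inductive Tree : Type :=
  | TNull : Tree
  | TZero : Tree
  | TOne : Tree
  | TPair : Tree -> Tree -> Tree.

(* Swap the two sides of every pair, recursively. *)
Fixpoint mirror (t : Tree) : Tree :=
  match t with
  | TPair l r => TPair (mirror r) (mirror l)
  | _ => t
  end.

(* Extensional equality: equal results at every argument. *)
Definition ext_eq (f g : Tree -> Tree) : Prop := forall x : Tree, f x = g x.

Lemma mirror_involutive : forall t : Tree, mirror (mirror t) = t.
Proof.
  intro t.
  induction t as [ | | | l IHl r IHr]; simpl; try reflexivity.
  rewrite IHl, IHr. reflexivity.
Qed.

(* Two different algorithms, one extensional function. *)
Lemma mirror_twice_is_identity :
  ext_eq (fun t => mirror (mirror t)) (fun t => t).
Proof. exact mirror_involutive. Qed.
```

The equality had to be proved by induction rather than computed, which matches the point above that extensional equality is not computable in general.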

References

• NummSquared 2006a0 Done Formally
  – poohbist.com
  – Click on NummSquared 2006a0.

• My thesis, NummSquared 2006a0 Explained
  – poohbist.com

• Other references are in the bibliography of NummSquared Formally.

• These slides
  – poohbist.com

Supplementary material

Domain membership

• When calling a function, how to ensure that the argument belongs to the function’s domain?

• Unrestricted domain – untyped lambda calculus

• Type constraints on programmer

• Set theory: higher-order domain membership not computable – need proofs by programmer

• NummSquared: coercion

NummSquared coercion summary

• Coercion to a valid domain extension of a tagged small function extension:
  – Ensures termination
  – Avoids paradoxes
  – Untyped
  – Supports computation

Semantics: small function extensions

• Well-founded to ensure termination and avoid paradoxes

• Defined inductively

• Rule f with:
  – dom(f) a small subset of small function extensions
    • Small means no larger than a ZF set.
  – for x in dom(f), f<x> is a small function extension

• Leaf: null, zero, one
  – null means absence of relevant information – does not mean 0, false, undefined or non-termination

• Pair p such that left(p) and right(p) are small function extensions

• A tree is a small function extension recursively containing only leaves and pairs.

NummSquared syntax

• Variable-free

• Large functions
  – Constants
  – Combinations

• Proofs
  – Axioms
  – Inferences

• Reflection
  – Quoting and unquoting large functions
  – Quoting and unquoting proofs
  – The quoted representation is a tree.
  – Quoting is easy because NummSquared is variable-free.

• See NummSquared Formally for details.

Small function extensions in Coq

• Leaves and pairs are easy.

• Rules are somewhat similar to Benjamin Werner’s “Sets in Types”.

• Definition in Coq:

    Inductive Func_Sm_Ext : Type :=
      | Func_Sm_Ext_null : Func_Sm_Ext
      | Func_Sm_Ext_zero : Func_Sm_Ext
      | Func_Sm_Ext_one : Func_Sm_Ext
      | Func_Sm_Ext_pair : Func_Sm_Ext -> Func_Sm_Ext -> Func_Sm_Ext
      | Func_Sm_Ext_rule :
          forall (A : Type),
          (A -> Func_Sm_Ext) ->  (* domain: must be 1-1 *)
          (A -> Func_Sm_Ext) ->  (* result *)
          Func_Sm_Ext.

• Define appropriate equality on small function extensions
