55
Automated Reasoning for Artificial Intelligence A SAT-based theorem prover for modal logic K Darren Lawton Supervisor: Professor Rajeev Goré College of Engineering and Computer Science The Australian National University This report is submitted for COMP8755: Individual Computing Project May 2018

Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

Automated Reasoning forArtificial Intelligence

A SAT-based theorem prover for modal logic K

Darren Lawton

Supervisor: Professor Rajeev Goré

College of Engineering and Computer ScienceThe Australian National University

This report is submitted forCOMP8755: Individual Computing Project

May 2018

Page 2: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight
Page 3: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

"When there are disputes amongpersons, we can simply say: Let uscalculate, without further ado, to seewho is right."

Gottfried Leibniz, 1685

Page 4: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight
Page 5: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

Declaration

I hereby declare that except where specific reference is made to the work of others, thecontents of this report are original and have not been submitted in whole or in partfor consideration for any other degree or qualification in this, or any other university.This report is my own work and contains nothing which is the outcome of work donein collaboration with others, except as specified in the text and Acknowledgements.

Darren LawtonMay 2018

Page 6: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight
Page 7: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

Acknowledgements

I would like to thank Professor Rajeev Goré for allowing me the opportunity to workon this project and his patience throughout the entire process. He was always availableto assist with any questions or issues that were encountered, and proactively sort waysto try and improve the results we ultimately obtained. I would also like to thank myfamily and friends for all the support throughout my studies at ANU.

Page 8: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight
Page 9: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

Abstract

We present a sound and complete method for using a SAT-solver to implement an unla-belled tableau procedure for automated reasoning in modal logic K. At a fundamentallevel, we distil a modal problem into to a boolean satisfiability problem with additionaltheory for the remaining modal aspects. Furthermore, we use a modal clausal form torepresent the problem.

In benchmarking the performance of our theorem prover to existing methods, wenote poor relative performance. This is largely due to redundant exploration of thetableau tree. Our results indicate that using a SAT solver with an unlabelled tableausystem may potentially not be an effective approach for searching a tableau tree inmodal logic K.

Despite the poor performance, there still remains potential extensions to ourresearch. Particularly in extending our theorem prover to incorporate tableau calculifor multimodal logics of belief.

Page 10: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight
Page 11: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

Table of contents

1 Introduction 11.1 Automated reasoning . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.2 Automated theorem proving . . . . . . . . . . . . . . . . . . . . . . . . 21.3 Report Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

2 Modal Logic 52.1 Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52.2 Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62.3 Tableau calculi for modal logic . . . . . . . . . . . . . . . . . . . . . . . 7

2.3.1 Tableau calculi . . . . . . . . . . . . . . . . . . . . . . . . . . . 82.3.2 Tableau based theorem provers . . . . . . . . . . . . . . . . . . 9

3 A SAT-Based Method 113.1 Modal clausal form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

3.1.1 Lexing and parsing . . . . . . . . . . . . . . . . . . . . . . . . . 123.1.2 Negation normal forming (NNF) . . . . . . . . . . . . . . . . . . 123.1.3 Simplification . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123.1.4 Modal clausal forming . . . . . . . . . . . . . . . . . . . . . . . 13

3.2 SAT solver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143.3 Proving Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

3.3.1 Saturation function . . . . . . . . . . . . . . . . . . . . . . . . . 153.3.2 Transition function . . . . . . . . . . . . . . . . . . . . . . . . . 16

3.4 Formal proof of prover . . . . . . . . . . . . . . . . . . . . . . . . . . . 183.4.1 Termination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183.4.2 Soundness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183.4.3 Completeness . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

Page 12: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

xii Table of contents

4 Optimisation 214.1 Thrashing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214.2 Modal deactivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

5 Evaluation 255.1 Benchmarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255.2 Test Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265.3 Experimental results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

6 Conclusion 296.1 Further Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296.2 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

References 31

Appendix A Independent study contract 33

Appendix B Software artefacts 37

Appendix C README file for software 39

Page 13: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

Chapter 1

Introduction

Given an arbitrary statement, there are potentially many reasons as to why we mightassign truth to it. It could be that we were told it is true by a friend, or that wehave conducted rigorous experimentation that led us to this conclusion. Regardlessof our method, if the reasoning is devoid of a logical underpinning, it is not infalliblesince assumptions can be left unjustified and inferences disputed [8]. Formal logicmitigates these limitations by defining a syntax to abstract away context, and providinga calculus to guide deduction.

Logical reasoning does not reveal new insights into a domain. However, it doesforce one to consider necessary relationships between facts, and highlights instanceswhere possibly undesired states can enter into these relationships [8].

In a practical sense, we can distil a problem into a conclusion and set of assumptions.The conclusion represents a contention, whilst the assumptions represent the knowledgebase of the problem domain. Logical reasoning solves the problem by proving that theconclusion can be inferred from the given assumptions by systematically applying therules of deduction [20]. So naturally, given the syntactic nature of our inference rules,we program computers to derive these inferences for us.

1.1 Automated reasoningAn automated theorem prover is an algorithm that uses a set of logical deduction rules,of a given logic, so as to prove or verify the validity of a proposition. A deductiveargument is said to be sound if and only if it can be shown that the conclusion islogically entailed within premises of the argument. If the logical deductions of thesystem are then perceived as being a formalisation of intended meaning (semantics),then the theorem prover becomes an algorithm of reasoning [4].

Page 14: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

2 Introduction

In practice, there are two key steps to automating reasoning [8]:

1. Express statement of theorems in a formal language

2. Use automated algorithmic manipulations on those formal expressions

In building an automated reasoning program we must provide an algorithmicdescription of a formal calculus for efficient implementation. The primary considerationsof such an implementation involve defining the class of problems to be solved, specifyingthe rules of inference and how best to perform these computations efficiently [20]. Inaddition to efficiency, generality is a very desirable property for a theorem prover.Generality is the ability of the prover to handle a variety of logical systems with enoughflexibility to allow users to define their own axioms and inference rules [15].

1.2 Automated theorem provingModal logic is an extension of classical propositional logic and is used as a formal wayof representing and reasoning about notions of knowledge, belief and time. The directapplications of modal logic include linguistics [17], system verification [8], agent-basedsystems [16] and process analysis [12].

The design and implementation of systems and processes are fundamental to theoperation of any organisation. For example, aircraft manufacturers must design highlycomplex systems which are then built into airplanes. The cost of such a system failingis immeasurable. Therefore, organisations need to ensure that systems will performas required and that undesirable states will be unreachable within the context of thesystem. However, given the growing size and complexity of modern systems, the task offormal verification can be increasingly difficult. Automated reasoning aims to mitigatethis complexity by automating the verification process [19].

Given a logical encoding of a system Γ, and a required property σ, a theorem proveraims to determine whether σ is a logical consequence of Γ, in some given logic L. Thatis, we determine whether the required property is always forced to be true, given theassumptions of the system. Unfortunately, such problems in modal logic are decidablewith PSPACE-complete complexity [14].Example: Consider a plane with an alarm that senses engine failure, specifically whenan engine’s output falls beneath a given threshold. Then considering the primitivepropositions a (alarm on), f (alarm is faulty) and o (output of engine is too low), wecan let Γ consist of the following assumption:

Page 15: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

1.3 Report Overview 3

1. □(a→ f ∨o): meaning, that in all related worlds, a triggered alarm implies thatthe alarm is faulty or the output of the engine is too low.

As a aircraft manufacturer, we might want to show that for every instance, a is alogical consequence of Γ. That is, the alarm always triggers when detected as faulty orengine output falls beneath a given threshold.

Automated reasoning is a mature area of research, in that there are many existingtheorem provers for modal logic. Theorem provers such as FaCT++ [21] and LWB [9]are highly efficient but rely on complicated procedures, sophisticated data structuresand optimised techniques to extract maximum performance [15]. That is, they arehighly tailored to the logic systems that they are built for.

Recent work in this area has focused on satisfiability modulo theories (SMT) whichreduce logics to boolean satisfiability problems by adding equality reasoning theories.This allows for more expressive reasoning simply by adding lightweight theories on topof existing algorithms. Instances of such theorem provers include InKreSAT [11] andBDDTAB [19], both of which perform very efficiently and at a state-of-the-art level(more detail in subsection 2.3.2).

1.3 Report OverviewOur objective in this report is to develop an elegant yet efficient algorithm for provingtheorems in modal logic K by distilling a modal problem into to a boolean satisfiabilityproblem with additional theory for the remaining modal aspects. The fundamentalmotivation of this idea is to leverage the enormous advances in SAT solving techniques,so as to build a simple, yet scalable theorem prover for modal logic K. Furthermore,we use a modal clausal form to represent the problem.

The current chapter provides a high-level motivation as to why automated reasoningmaintains relevance given an increasingly complex world. We also introduced the basictenet of automated theorem proving for modal logic.

Chapter two introduces the semantics and calculus of modal logic. Chapter threeour method which introduces the algorithm underlying our implementation and theclausification technique we have used to preprocess input formulae. Optimisation issuesrelating to our implementation are discussed in chapter four. Chapter five providesan evaluation of our prover, including comparisons against existing state-of-the-arttheorem provers for modal logic K. Finally, in chapter six we highlight potentialopportunities for further work in this area, and our conclusion.

Page 16: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight
Page 17: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

Chapter 2

Modal Logic

Classical propositional logic is truth-functional, in that a statement is either true orfalse. If we seek to add further expressiveness to such statements by way of first orderlogic, then we no longer have a decidable logic [5]. That is, any reasoning algorithm offirst order logic cannot guarantee termination.

Modal logic allows us to enhance expressiveness, but importantly maintains decid-ability. This is achieved by abstracting context to different states where the modalaspect then captures the notion of transitioning between the states. Therefore wecan represent a complex system as a relational frame constructed using worlds andrelations [15]. In order to understand this deeper, we must first explore the syntax andsemantics of modal logic. The following sections introduce the syntax, semantics andtableau procedure for modal logic per the work of Goré [6].

2.1 SyntaxModal logic extends classical propositional logic and accordingly is more expressivewith two additional unary operators - box (□) and diamond (♢).

A formula φ in modal logic is an expression built recursively out of atomic proposi-tions and connectives as follows, where an atomic proposition is any element from acountably infinite set of atomic propositions:

p ::= p0 | p1 | p2 | ...

φ ::= p | φ | □φ | ♢φ| φ∧φ | φ∨φ | φ→ φ

Page 18: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

6 Modal Logic

2.2 SemanticsThe semantics of modal logic are defined by Kripke models [13]. The most basic notionof a model is a Kripke Frame F, which is a directed graph ⟨W, R⟩ where W is anon-empty set of worlds and R ⊆ W x W is a binary relation over W. The valuationon a given F is a map υ: W x atomic proposition 7→ t, f that provides the truth valueon every atomic proposition at every world of W. A Kripke model M comprises both aF and υ, such that M = ⟨W, R, υ⟩.

So given a model M and some world w ∈ W, we can lift υ to compute the truthvalue of a non-atomic formula by recursion on its shape.Semantics for classical propositional connectives:

ϑ(w,¬φ) =

t if υ(w,φ) = ff otherwise

(2.1)

ϑ(w,φ∧γ) =

t if υ(w,φ) = t and υ(w,γ) = tf otherwise

(2.2)

ϑ(w,φ∨γ) =

t if υ(w,φ) = t or υ(w,γ) = tf otherwise

(2.3)

ϑ(w,φ→ γ) =

t if υ(w,φ) = f or υ(w,γ) = tf otherwise

(2.4)

Semantics for modal connectives:

ϑ(w,□φ) =

t ∀v ∈ W if wRv then υ(v,φ) = tf otherwise

(2.5)

ϑ(w,♢φ) =

t ∃v ∈ W such that wRv and υ(v,φ) = tf otherwise

(2.6)

From these semantics we see that classical connectives remain truth functional,whereas the truth of modal connectives are dependent on the underlying R and hencenot truth functional. This is where the additional expressiveness of modal logic isderived.

To this point, modal logic includes a semantic forcing relation ⊩, which is definedon a world w, a model M and a given formula φ:

Page 19: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

2.3 Tableau calculi for modal logic 7

Table 2.1 Consequence relations

Forces We Say We Write When • ⊮ φin a world w forces φ w ⊩ φ υ(w, φ) = t υ(w, φ) = fin a model M forces φ M ⊩ φ ∀w∈W. w ⊩ φ ∃w∈W. w ⊮ φ

Extending these consequence relations, we can define both the logical consequence⊨ and satisfiability, across any set of models K:

Γ ⊨ φ iff ∀M ∈M ⊩ Γ =⇒ M ⊩ φ (2.7)φ is K-satisfiable iff ∃M = ⟨W,R,υ⟩ ∈K.w ∈W.w ⊨ φ (2.8)

We say a formula φ is satisfiable in a model M if and only if there exists someworld w ∈ W in M at which φ is true. Furthermore a set of formulae Γ is forced by amodel when each φ in Γ is satisfied by the model. We can use this fact to determinelogical consequence in the following way:

Γ ⊨ φ iff Γ∧¬φ is not K-satisfiable (2.9)

Therefore if Γ is an empty set, we can show that φ is K-valid by proving that φis a logical consequence of the empty set. This is ultimately the goal of our theoremprover, to show the validity of any given φ.

Modal logic systems are generally characterised by restrictions on the reachabilityrelation R. Accordingly each system comprises various sets of formulae which are validwithin their systems. Modal logic K has no restrictions on R, and is the only logicconsidered in this report.

There are two main deductive systems for determining logical consequence in modallogic. The first is Hilbert calculi which aims to show a sound derivation of φ from Γvia syntactic manipulation. The second deductive system is tableau calculi, which weuse in our implementation.

2.3 Tableau calculi for modal logicTableau-based methods are widely used deductive systems to show satisfiability, andthus logical consequence, in a modal logic (L). In this section, we present the procedureof unlabelled tableau as presented by Gore [6]. We then briefly discuss existing theoremprovers for modal logic K which also utilise tableau based methods.

Page 20: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

8 Modal Logic

2.3.1 Tableau calculi

A tableau rule σ consists of a numerator N and a finite list of denominators possiblyseparated by vertical bars. The numerator is a finite set of formulae, as is each elementin the denominator. Each rule is interpreted as "if the numerator is L-satisfiable, thensome denominator is L-satisfiable".

Tableau proofs are achieved via refutation. That is, to prove that a formula φ

is valid we need to show that its negation ¬φ is L-unsatisfiable. According to thecorrespondence that we established above, if ¬φ is L-unsatisfiable then φ is a logicalconsequence of an empty set and therefore K-valid.

A tableau procedure is represented as a tree for a given formula set ¬φ, where eachchild branch is obtained from their parent nodes by instantiating a tableau rule. Theapplication of a rule can either be a deterministic expansion or a non-deterministicexpansion. A deterministic (e.g. ∧ operator) expansion begets a single child node,whereas a non-deterministic expansion (e.g. ∨ operator) results in two child nodes sincethere are multiple possibilities for truth assignment. Note, ¬φ must be represented innegation normal form (subsection 3.1).

The tableau rules for modal logic K (assuming NNF) are as follows:X, Y, Z are possibly empty multi-sets of formulaeStatic rules:

p;¬p;X(id): x

φ∧ψ;X(∧):φ;ψ;X

φ∨ψ;X(∨):φ;X | ψ;X

Transitional rule:

♢φ;□X;Z(♢K): ∀ψ.□ψ /∈ Z and □X = {ψ | ψ ∈X}φ;X

We let ∆ be the set of tableau rules for a modal logic K. By applying rules from ∆to X, we can obtain a path to a leaf node Y. A branch in the tableau is closed if a leafnode is an instance of (id) - a contradiction. A tableau is closed if all its branches areclosed. Conversely, a tableau is open if it is not closed.

Tableau procedures can be considered a tree search problem. In that, we want toexpand each tableau branch in order to reveal a potential satisfying model. If such a

Page 21: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

2.3 Tableau calculi for modal logic 9

model is found, we have a proof that X is K-satisfiable. Otherwise, we have a closedtableau and X is determined to be K-unsatisfiable.

Therefore, to prove the validity of formula φ, we must show that the tableau of ¬φis closed. Explicitly, we prove that ¬φ is not K-satisfiable and accordingly φ is alwaystrue.

2.3.2 Tableau based theorem provers

There are many existing tableau based algorithms for modal logic that perform verywell against benchmarks and accordingly, are used in applications of modal logic.

Traditional provers are designed as all-encompassing reasoning engines that arehighly specialised and purpose-built to reason within a set of logics. An instance of sucha theorem prover is FaCT++ which is considered a state of the art reasoners in thisfield. FaCT++ is implemented in C++ and uses many optimisation techniques thathave been adapted from propositional logic such dependency-directed backtracking,boolean constant propagation and ordering heuristics [21].

Recent work in this area has focused on satisfiability modulo theories which reducea given logic to boolean satisfiability problems by adding equality reasoning theories.A SAT solver can then be used to decide the satisfiability of formulas in these logics[3].The main benefits of such approaches are simplicity and scalability, as lightweighttheories can be built on top of existing SAT solvers or similar structures.

Using such an approach, Kaminiski and Tebbi develop a theorem prover (InKreSAT)for modal logic by using an incremental SAT solver and an explicit representation ofthe tableau rules [11]. Specifically, InKreSAT relies on a labelled tableau system whichrequires labelled formulae and tableau rules that operate on labelled formulae. A labelrepresents a world w of the Kripke model, to which w ∈ W. A labelled formula is astructure of the form w:φ where w is a label and φ is a formula. It has the semanticmeaning that the formula φ is true in the world w [15].

Using binary decision diagrams (BDDs), Olesen [19] presents a method for imple-menting unlabelled tableau calculus for modal logic K. Given the recursive natureof unlabelled tableau calculus, the tableau expansion does not require labelling ofexpanded formula to maintain which formulae must be forced in which worlds.

Our implementation relies on unlabelled tableau calculus for derivation, meaningthere is no explicit representation of the reachability relation since these propertiesare built into the rules of deduction [6]. Furthermore, our saturation phase relies on aSAT solver which has very similar internal mechanics to that of BBDs.

Page 22: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight
Page 23: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

Chapter 3

A SAT-Based Method

As introduced, we aim to develop an elegant, scalable and automated method forproving theorems in modal logic K. Explicitly, we want to prove the validity of a givenmodal formula. Therefore to show validity, we want to show that the negated formulais not satisfiable. The key insight to our reasoning algorithm is the fact that modallogic is an extension of classical propositional logic. Therefore we try to relegate asmuch of the required deduction to a SAT solver, and use the tableau transition rulefor the remaining modal aspects of the formula [2].

Our implementation is written in Python, and being an interpreted high-levellanguage, we immediately sacrifice some computational efficiency. That said, thepremise of our research is to rely on the optimisation and efficiency of an existing SATsolver, therefore any additional computation cost to our wrapper should be immaterial.We also note that Python does provide a rich library and more importantly has anAPI for the SAT solver we use in our implementation.

The underlying algorithm for our prover is recursive in structure, but the overalldesign is very simple. The result is a robust theorem prover, however shows poorperformance on standard benchmarks.

The remaining sections of this chapter discuss each aspect of our algorithm indetail.

3.1 Modal clausal formSince our algorithm finds a proof by refutation, before completing the followingtransformation to modal clausal form, we first negate the input formula and then let itequal to φ. Transformation into sequent representation was applied to tableau calculiby Nguyen [18], who also showed that a clausal form can give better space bounds and

Page 24: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

12 A SAT-Based Method

improved decision procedures for some modal logics. Additionally, a clausal form alsoallows us to easily distil the propositional components of φ which is important to ourimplementation. The following sequence of functions are called to transform φ intoclausal form:

3.1.1 Lexing and parsing

Lexical analysis and parsing of φWe now represent φ as an abstract syntax tree (AST) for further simplification andtransformation.

3.1.2 Negation normal forming (NNF)

Convert simplified φ to NNFTo minimise the number of inference rules required to decompose a formula set, we firstgenerally convert the formula into NNF before commencing any proof procedure. Aformula is in NNF if and only if all occurrences of ¬ appear in front of atomic formulaeonly and there are no occurrences of → and ↔. Accordingly, we recursively distributenegations over subformulae using the rules in table 3.1.

φ→ υ = ¬φ∨υ ¬(φ→ υ) = φ∧¬υ¬(φ∧υ) = ¬φ∨¬υ ¬(φ∨υ) = ¬φ∧¬υ¬¬φ= φ ¬♢φ= □¬φ¬□φ= ♢¬φTable 3.1 Negation normal forming rules

3.1.3 Simplification

Simplify AST(φ)This step is a quasi optimisation to promote early inconsistencies, but also simplifythe concepts of the expression [3]. Simplifications include removing constants (e.g. a ∧False -> False) and eliminating redundancy in expressions (e.g. ¬¬ a -> a). Theserules are summarised in table 3.2.

Page 25: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

3.1 Modal clausal form 13

φ∧⊤ = φ φ∧⊥ = ⊥φ∨⊤ = ⊤ φ∨⊥ = φ

φ→ ⊤ = ⊤ ⊤ → φ= φ

⊥ → φ= ⊤ ⊤ → ⊥ = ⊥Table 3.2 Basic boolean simplification rules

3.1.4 Modal clausal forming

Now that our formula is in NNF, we can begin our transformation into modal clausalform [7]. We use a,b,c to denote atomic literals and � to denote a possibly emptysequence of modal box operators, which we can think of as an universal modality. Werefer to � as the modal context from herein. A modal clause is a formula of the form⊥, �[a1,..,ak], �[a,□b], �[a,♢b] or �♢b; where �[φ1,..,φk] is written to denote �[φ1∨.. ∨φk].

Per Goré and Nguyen [7], for any modal logic L, every set X of formulae can betransformed in quadratic time to a set of modal clauses X’ such that X is L-satisfiableiff X’ is L-satisfiable.

In the context of our prover, we repeatedly apply the following transformations toNNF(φ) in order to achieve a modal clausal form:

1. If of the form �[γ∧ψ] then replace by �γ and �ψ.

2. If of the form �[γ1, ..,γk,ψ∨ ξ] then replace by �[γ1, ..,γk,ψ,ξ].

3. If of the form �[a1, ..,ah,γ1, ..,γk] where γ1, ..,γk are complex sub-formula, andk ≥ 2, or k = 2 and h > 1, then replace by �[a1, ..,ah,p1, ..,pk], �[¬p1,γ1] ..�[¬pk,γk] where p1, ..,pk are new atomic literals.

4. If of the form �[a,∇γ] where ∇ is either □ or ♢, and γ is a complex sub-formula,then replace by �[a,∇p] and �□[¬p,γ] where p is a new atomic literal. We alsonote that in the absence of a, we simply let a 7→ ⊥ and include it in the originalformula.

5. If of the form �[a,γ∧ψ], then replace by �[a,γ] and �[a,ψ].

The above transformation makes at most n replacements, where n is the sum ofthe lengths of the each formula in NNF(φ) [7]. Testing was completed, by use of anexternal theorem prover, to ensure that our entire transformation preserved the if andonly if satisfiability relation of a given ϕ and its modal clausal form. Using the theorem

Page 26: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

14 A SAT-Based Method

prover we checked whether ¬ϕ was not valid, meaning that ϕ was satisfiable. If ϕ wassatisfiable, we then needed to check that ¬mcf(ϕ) was also deemed not valid by theprover.

3.2 SAT solverA key aspect of our theorem prover is the use of a SAT solver. For the purpose of ourimplementation, we have used Z3, a SAT solver developed by Microsoft Research [3].The primary motivation for using Z3 was the fact that it is highly optimised and alsoprovides an API for Python. The extent to which we use the SAT solver is to retrievesatisfying models, if possible, based on a set of propositional constraints. This willbecome more evident in section 3.3.1.

3.3 Proving AlgorithmAt a fundamental level, our algorithm follows a tableau procedure, in that we simplyconstruct a search tree to find a satisfying model. This search is conducted in arecursive depth-first manner, with the termination conditions being a satisfying modelwas found (i.e. an open tableau branch) or closed tableau. The algorithm separatesthis tableau procedure into two phases, a saturation phase (3.3.1) and a transitionphrase (3.3.2). The saturation phase achieves a downward-saturated and consistent set;meaning all classical connectives, for a given world, have been expanded and there areno contradictions in the formulae. To prevent destroying any proceeding closures inour tableau, we only apply the transitional rule to downward-saturated and consistentsets [6].

Given the transformation of φ to modal clausal form, it is represented as a dictionarywhen passed the prover. Each key in the dictionary represents the depth of a modalcontext in φ. Values are then mapped to each key based on modal depth W and arerepresented as a set of four possibly empty lists constructed as following:

• A is a list of formulae with the form �[a1, ..,ak]

• IB is a list of formulae with the form �[a,□b]

• ID is a list of formulae with the form �[a,♢b]

• D is a list of formulae with the form �♢b

Page 27: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

3.3 Proving Algorithm 15

For the purpose of our implementation the keys in the dictionary are always relativeto the "starting world". To reflect modal jumps we track modal depth W and this valueis used to calculate the effective modal context for each formulae in the dictionary.Simply, the effective modal context of a formulae is the modal context relative to the"starting world" (dictionary key) less modal depth.Example:Let φ = ♢(p101∧p201) → ♢(p101∧p201)Therefore, nnf(φ) = □(¬p101∨¬p201)∨♢(p101∧p201)Finally, mcf(φ) = [{⊥,♢p0}],□[{¬p0∨p101},{¬p0∨p201},{¬p101∨¬p201}]

By applying our dictionary representation to mcf(φ), we get:key: 0 ; value: D: {{⊥,♢p0}}key: 1 ; value: A: {{¬p0∨p101}, {¬p0∨p201}, {¬p101∨¬p201}}Where each key represents the depth of a modal context (i.e. length of nested boxes).

From our main function we call the saturation function first at a modal depth of 0.Since this is proof by refutation, if the algorithm returns satisifiable then the inputformula is not valid. Otherwise, the input formula is valid.

3.3.1 Saturation function

In a traditional tableau construction, during a saturation phase, we would onlyinstantiate static tableau rules - (id), (∧) and (∨). However, per Kripke semantics,we see that these rules are all contained within a single world, hence are positionalin nature. Therefore, we allow the SAT solver to implicitly apply all static rules toachieve a downward-saturated and consistent set. By extension, we also observe thatthere are no modal aspects in the saturation phase.

Figure 3.1 shows our algorithm for the saturation phase of our tableau construction.In this algorithm, modal depth is analogous to the number of modal jumps from our"starting world". Based on this modal depth, we extract the sets A, IB, ID and Dwhere effective modal context is 0.

The function aggregates the propositional constraints in A (see section 3.3) and theset of forced modal propositions from a directly related prior world. The aggregatedset of constraints are then added to an instance of the SAT solver. If the SAT solvercan find a satisfying valuation, it returns the satisfying model for that world. We thenpass this valuation to the transition function to ensure all modal aspects, in IB, IDand D, are satisfied in this current world.

Page 28: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

16 A SAT-Based Method

'''M: Dictionary representation of modal clausal form per above.Key: modal depth (i.e. length of modal context)Value: [A, IB, ID, D]X: Dictionary of active modal literals from previous worldV: Dictionary of valuations at each modal depth.W: Modal depth (initially 0)'''def saturation_function(M, X, V, w):

if M[w] is None and X is None:return satisifiable

else:A_w, IB_w, ID_w, D_w <- M[w]sat_constraints = set()sat_constraints <- add {X[w], A_w}

valuation <- call_sat_solver(sat_constraints)if valuation is satisfiable:

V[w] <- valuationreturn transition_function(M, X, V, w)

else:return unsatisifiable

Fig. 3.1 Saturation of propositional connectives

If the SAT solver cannot find a satisfying valuation, this is an instance of (id) andthe branch will be closed.

3.3.2 Transition function

The transition function receives a valuation V, which is a downward-saturated andconsistent set. However, we still need to check the formulae in IB, ID and D (seesection 3.3), to ensure that any active modal propositions are satisfied.

Figure 3.2 shows our algorithm for the transition phase of our tableau construction.Formulae in lists IB and ID are of the form �[p,□b] and �[p,♢b] respectively. Thereforeif p is satisfied in V (or can be without contradiction), then each of the clauses in IBand ID are satisfied. If p is not satisfied, then the modal propositions are active andwill need to be forced in the model. By construction, diamond modal propositions inD are active, and therefore must also be forced in the model.

Page 29: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

3.3 Proving Algorithm 17

def transition_function(M, X, V, w):w1 = w + 1A_w, IB_w, ID_w, D_w <- M[w]

''' get_active returns set of active modalities basedon whether each clause in ID_w or IB_w is (or can be)satisified by V[w] '''active_box_literals <- get_active(IB_w, V[w])active_diamond_literals <- get_active(ID_w, V[w]), D_w

if active_diamond_literals is None: return satisifiable

for diamond in active_diamond_literals:X[w] <- add(diamond, active_box_literals)

branch = saturation_function(M, X, V, w1)if branch == unsatisifiable:

sat_constraints <- add {X[w], A_w}''' get new valuation for w: OR-branch '''valuation = call_sat_solver(sat_constraints)if valuation is satisifiable:

V[w] <- valuationreturn transition_function(M,X,V,w)

else: return unsatisifiablereturn satisifiable

Fig. 3.2 Transition by diamond modalities

If there are no active diamond propositions, then the branch is satisfied since anybox propositions are vacuously true. Conversely, if there are indeed active diamondpropositions, we apply the following modified transition rule:

♢φ1; ..;♢φi;□X;Y ;Z(♢K*): ∀ψ.□ψ /∈ Z and □X = {ψ | ψ ∈X}

φ1;X;Y * || .. || φi;X;Y *

For notational clarity, Y is the set of formulae where the effective modal context of theformulae is non-empty. In Y* one box has been stripped from the modal context foreach formulae 1.

Each denominator is passed to the saturation function at an incremented modaldepth. This in effect achieves a modal jump with AND-branching. That is, if any of

1For the purpose of implementation, this reduction to modal context is based on the incrementmade to modal depth.

Page 30: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

18 A SAT-Based Method

the denominators are not satisfiable then the entire branch closes. On this closure,we backtrack and seek a new satisfying valuation at the prior modal depth, whichwe directly pass into the transition function. This process of finding a new valuationdictates the OR-branching for our tableau tree, as only one of these models needs toresult in a branch that remains open.

3.4 Formal proof of proverTo prove correctness of our algorithm we need to formally show the following:

3.4.1 Termination

Termination requires that our algorithm terminates given any input, this also showsdecidability. The two termination conditions of our tableau are 1) an open tableaubranch is found, and 2) the tableau is closed.

In relation to condition 1, for a tableau branch to be open, means that every nodealong a branch, including the leaf node, must be satisfiable. A leaf node will not enforcea modal jump, therefore we should have a saturated and consistent set with no activediamond propositions. We can see from figure 3.2 that this is indeed a terminationcondition of our algorithm that recursively passes the satisfied result to the top levelprover. Furthermore, we note that each application of the (♢K*) rule reduces themaximal-modal degree of the formula, therefore ensuring this termination condition.

In relation to condition 2, a closed tableau is one in which every branch is closed.This means that the leaf node of every branch is an instance of (id) - i.e. unsatisfiable.Figure 3.1 shows that this is indeed a termination condition of our saturation phasewhen SAT solver cannot find a satisfying valuation. The algorithm then closes the (id)branch, and backtracks to a new OR-branch. OR-branching is exhaustive since thereare only a finite number of satisfying valuations for a fixed set of constraints.

3.4.2 Soundness

Soundness requires that our rules of inference are truth-preserving. Since our algorithmis a proof by refutation, this means if we can derive a closed tableau for ¬φ, then wemust be able to show that φ is logically valid with respect to the semantics of modallogic K [20].

The static rules (subsection 2.3.1) of our tableau procedure are propositional innature, therefore are inherently truth-functional. Assuming φ contains no modal

Page 31: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

3.4 Formal proof of prover 19

aspects, the SAT solver would handle the entire inference, which we can assume issound. Now we must simply prove the soundness of our transition rule (♢K*).Lemma: If {♢φ1; ..;♢φi;□X;Y ;Z} is K-satisfiable then {φ1;X;Y *} || .. || {φi;X;Y *

} is K-satisfiable.

Proof: Suppose {♢φ1; ..;♢φi;□X;Y ;Z} is K-satisfiable, where X, Y, Z are possiblyempty multisets of formulae. Then it follows that:

• There exists a Kripke model M = ⟨W,R,υ⟩ and some w∈W with w⊩♢φ1; ..;♢φi;□X;Y ;Z

• This means that there exists some v ∈W where vRw and w ⊩ ♢φj, for each j =1 .. i. We now observe the following forced relations:

– v ⊩ φj and;

– v ⊩X and;

– v ⊩ Y * being the set of formulae where modal context is stripped of onebox. Y* is forced inductively.

Therefore {φ1;X;Y *} || .. || {φi;X;Y *} is K-satisfiable and our rules of inference,specifically (♢K*), are a sound. ❚

3.4.3 Completeness

Completeness requires that if φ is a logical consequence of its assumptions withrespect to the semantics of modal logic K, then φ should also be derivable from theseassumptions. In application, this means that if our theorem prover returns φ as beingnot valid by finding an open tableau for ¬φ, then φ must indeed not be valid.

To show this, we build a Kripke model with a "starting world" w0 that satisfies ¬φ.As introduced in section 3.3, we first convert ¬φ to modal clausal form and store it asdictionary Y. Each key in Y represents a modal depth; for example, at w0 the modaldepth would be 0 and all formulae mapped to 0 would have an empty modal context.

Proof: Supposing no tableau branches close for ¬φ, we then consider the followingsystematic procedure:

• Stage 0: In the saturation function, let w be set of formulae where effectivemodal context is empty, and let C be set of formula where effective modal contextnon-empty.

Page 32: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

20 A SAT-Based Method

• Stage 1: Using the SAT solver, derive a downward-saturated and consistent nodew* by finding a satisfying valuation for the propositional components of w.

• Stage 2: Now in the transition function, let ♢γ1; ..;♢γi be all active diamonds inthe w*. We also let X be the set of active box propositions. So w* now lookslike: ♢γj,□X;C;Z for each i in 1 .. i. To force the modal aspects of the node,we apply (♢K*) to each diamond i - this includes incrementing the modal depthby one and recursing to the saturation function. This recursive step is analogousto a modal jump to vi where viRwi.

We repeat these steps for each node vj = (γj;X;C) until termination - see subsection3.4.1. Finally we let W be the set of all -nodes (downward-saturated and consistentnodes), where w* ◁ v*. This constitutes a model graph ⟨W,◁⟩ which by Hintikka is asufficient condition to showing ¬φ is satisfiable [6]. ❚

Page 33: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight
Page 34: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

Chapter 4

Optimisation

The naive implementation of our theorem prover shows poor performance, where wemeasure performance by timing efficiency. Therefore, optimisation is required in orderto achieve performance competitive with existing provers. However, in application, suchoptimisation proved to be difficult given the underlying structure of our implementation.

By using depth-first search, we expand each branch of our tableau completely beforewe begin expanding the next. To find an open tableau, we simply need to find thefirst open tableau branch, which represents a satisfying model for the negated formula.Conversely, to conclude that the tableau is closed, we need to expand all branches andshow that each leaf is an instance of (id). Therefore to minimise the computationalrequirement, we want to prioritise the expansion of branches with a higher likelihoodof remaining open and quickly fail on branches with potential contradictions. To thispoint, the key issue that our implementation encounters is thrashing.

4.1 ThrashingThrashing occurs during tableau search when branches with the same underlyingformulae are explored. This can have significant impacts on the performance of atheorem prover, particular where the branching factor of a formula is high. Hustadt andSchmidt [10] propose a backtracking technique so as to mitigate this issue. Essentiallythey propose that on an unsatisfiable node, the algorithm backtracks to the node ofthe tableau tree which is relevant to the failure of the given branch. Explicitly, thismeans that if the algorithm expands an instance of (id), it must then backtrack to thenode prior to the node responsible for allowing the eventual instantiation of the (id)leaf. This, of course, requires the maintenance of assumption sets which contain all theformulae and rules which have contributed to the closure of a branch.

Page 35: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

22 Optimisation

In terms of our implementation this method of backtracking, or similar, is notpossible given the use an unlabelled tableau procedure and the fact that we do notcache the formulae and rules from preceding nodes and branches. Therefore we cannotextract sufficient information from an contradicting leaf in order to guide our search.As an example, say we find a contradiction at a modal depth of 1, our algorithm willthen backtrack to a modal depth of 0 and seek a new satisfying valuation (i.e. newOR-branch at "starting world"). If such a valuation is found, the prover will transitionto a modal depth of 1 for each active diamond. Despite the constraints at this modaldepth being potentially very similar to the previous OR-branch (at the same depth),there is no available information that we can use to determine whether this branchshould be immediately closed. Therefore the SAT-solver will naively generate everyOR-branch which leads to redundant expansion. The transition rule is the only aspectof the tableau that we have complete control over. Accordingly, we look to adapt abacktracking technique suitable to our implementation.

4.2 Modal deactivationBy analysing the structure of our tableau, we see that there are two types of branching;OR and AND branches. OR-branches are generated by the SAT solver, hence limitingour control over the sequence of OR-branching. That said, we can influence valuationsvia the constraints we add to the SAT solver. AND-branches are instantiated by thetransition rule, which we apply to every active diamond modality. However, given thestructure of formulae in our ID (implied diamonds) or IB (implied boxes) sets, theactivity of these modal literals are dictated by the valuations provided by the SATsolver.

By using a modified backtracking technique, we attempt to promote potentiallyopen branches in our tableau by blocking modal literals that we deem responsible forcontradictions in proceeding worlds. This is achieved by the following procedure:

1. If a given world (w1) returns unsatisfiable, we identify the modal literals respon-sible for each contradiction in that world. In the case that the set of responsiblemodal literals is empty, the tableau is closed as there is an unavoidable contra-diction at the given depth for all branches.

2. If the responsible modal literals are from the sets ID or IB, then this means theywere activated from an implication clause in a preceding world (w). Therefore,

Page 36: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

4.2 Modal deactivation 23

we create a set C, comprising the antecedents of each relevant modal implicationclause.

3. Finally we backtrack to w, where we add C as a soft constraint 1 to the SATsolver. The aim being to block the activity of the responsible modal literals in w.

Ultimately, thrashing remains a critical problem for implementation that we cannoteasily avoid. Even by promoting positive branches as best we could, the performanceof our theorem prover remains poor in comparison to existing theorem provers. Adetailed discussion of performance is continued in chapter 5.

1 Soft constraints are those we would like to be true, but are not necessary in order to satisfy themodel.

Page 37: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight
Page 38: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

Chapter 5

Evaluation

This chapter provides an evaluation of our prover, including comparison against existingstate-of-the-art theorem provers for modal logic K. However first we introduce thebenchmarks we have used and provide a description of our test environment.

5.1 BenchmarksTo determine the performance of our implementation we use LWB benchmarks asconstructed by Balsiger [1]. These benchmarks are designed precisely to comparevarious methods for proof search in modal logic K. Per Balsiger, the key considerationsthat underpin the construction of these benchmarks are:

• Formulas are a combination valid and not-valid

• Formulas are of differing shapes and structures

• Benchmark formulas are difficult enough for future implementations

• The outcome of formula proofs are deterministic

• Artificial optimisations do not enhance provability

• Applying the benchmarks to a prover does not take an unreasonable amount oftime

• Results are easily summarised

In total there are nine subclasses of provable and non-provable formula, with eachclass having 21 instances of formula that increase in difficulty. Given the considerations

Page 39: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

26 Evaluation

above, the LWB benchmarks are the standard by which theorem provers for modallogic are compared.

5.2 Test EnvironmentOur testing was run on an intel 2.60GHz CPU with 8GB of memory. We used a timelimit of 60 seconds per formula which included parsing, transformation to modal clausalform and proof search of the tableau.

5.3 Experimental resultsWe evaluate the performance of our prover against BDDtab, InKreSAT and FaCT++;all of which achieve state-of-the-art performance. Table 5.1 shows the number ofinstances, per problem subclass, that each prover solves. It is clear from these resultsthat our prover does not compete with any of these existing methods.

Table 5.1 Results on LWB benchmarks for Modal Logic K

Subclass SAT-based BDDtab InKreSAT FaCT++branch_n 9 16 13 10branch_p 16 21 16 9d4_n 2 21 21 21d4_p 3 21 21 21dum_n 10 21 21 21dum_p 0 21 21 21grz_n 0 21 21 21grz_p 0 21 21 21lin_n 14 21 21 21lin_p 1 21 21 21path_n 9 21 10 21path_p 12 21 10 21ph_n 4 9 21 13ph_p 3 9 9 7poly_n 21 21 21 21poly_p 21 21 21 21t4p_n 1 21 21 21t4p_p 0 21 21 21

Our implementation performs well on subclasses branch and poly. Whereas, perfor-mance is extremely poor on subclasses grz and t4p. In subclass grz we fail to find a

Page 40: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

5.3 Experimental results 27

single proof within the defined time limit. In general, however, our theorem provershows sub par performance when compared to the other methods. Given the nature ofour implementation and the use of a higher level scripting language, these poor resultsare not entirely surprising.

The inefficiencies of our algorithm mainly stem from the fact that we are unableto optimally guide the branching of our tableau, particularly in relation to redundantbranches. This is evidenced by the non valid problems for subclass grz, whereby anopen branch can be found at a low modal depth (in smaller problems), however sinceour prover is stuck thrashing on one side of the tableau, it runs out of time to findthe open branch on the other side. We note, that by using an iterative depth-firstapproach we were able to find proofs in grz_n, however this was at a greater detrimentto other subclasses.

Using a SAT solver to generate the OR-branches of the tableau appears to hinderperformance. This is due to an inability to influence the order in which valuations aregenerated and therefore need to exhaustively expand each branch without the abilityto promote branches with a higher likelihood of remaining open. This is compoundedby the fact that we have to manually constrain the SAT solver each time we require anew satisfying valuation of a world previously expanded. Subclasses such as branchand poly have limited OR-branching factors, hence our prover performs very well inthese instances.

By using a binary decision diagram (BBD) in place of a SAT solver, BDDtab is ableto achieve much greater performance given the compact data structure and controlover how the BBD is refined when finding a new satisfying valuation on the closure ofa branch. Since a BBD is represented as a directed acyclic graph, when it is refined,the subformula in the closed leaf is removed from the set of satisfying valuations.Accordingly, all potential OR-branches containing that instance of subformula areremoved as well, therefore eliminating many more branches than the one originallyclosed [19]. Similarly, given the use of a labelled tableau calculi and a single instanceof a SAT solver, InKreSAT is able to maintain an agenda of unexpanded formulae ateach world hence allowing greater control over when and which branches are expanded[11]. Ultimately, our SAT-based method is inefficient given the naive manner in whichwe are forced to generate OR-branches.

Figures 5.1 and 5.2 compare the actual total time taken by our implementation(MCF transformation and proving algorithm) and BBDtab, for the problems in subclasspoly - not valid (n) and valid(p). Despite being able to solve all 42 instances in thissubclass, there is an evident and increasing discrepancy in the amount of time required

Page 41: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

28 Evaluation

by our theorem prover compared to BBDtab. The time complexity of our algorithmappears to grow at an exponential rate. We also note that, on average, the modalclausal transformation takes 12.80% of the total time per problem in subclass poly.However, this percentage proportion increases steadily as the problem sizes increased.Therefore, it is very evident that our inefficiencies stem from our proving algorithm.

Fig. 5.1 Not valid instances from subclass poly

Fig. 5.2 Valid instances from subclass poly

Page 42: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight
Page 43: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

Chapter 6

Conclusion

6.1 Further WorkDespite the poor performance of our naive SAT-based theorem prover, there remainseveral ideas for further improvement and experimentation in this area:

• Application to multimodal logics of beliefPer Goré and Nguyen [7], we transform input formula into a modal clausal form.However, in their research, the intended use of this sequent form was to developa set of clausal tableau calculi for multimodal logics designed for reasoning aboutmulti-degree belief in multi-agent systems. Therefore there exists scope to extendour proving algorithm to incorporate these tableau calculi.

• Application to Modal Logic S4Claessen proposed an SAT-based theorem prover for Intuitionistic Logic thatachieved very good results [2]. Given the accessibility relations of IntuitionisticLogic (reflexive, transitive and persistent), a single instance of an SAT solverwas required throughout an entire proof because constraints could incrementallybe added. Whilst Modal Logic S4 does not require a persistence condition, boxmodalities are still global constraints. As such, there is an immediate reductionin the amount of work that our SAT-solver would need to do, which could havea positive impact on performance.

• Re-implement in a low level programming languageOur initial hypothesis was that by relying on a highly optimised SAT-solver todo the bulk of computation the resulting performance of our prover would beacceptable. However, this was not the case. Using python to implement our

Page 44: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

30 Conclusion

algorithm certainly compounded any inefficiencies. Therefore scope certainlyexists to improve performance by reimplementation in a lower level programminglanguage.

• Algorithm optimisationOur current implementation of this SAT-based prover is rather naive. Whilstvarious backtracking techniques were considered, they were deemed infeasiblegiven an inability to control and maintain interactions between clauses thatwere passed to the SAT solver. Therefore methods to optimise OR-branching,and avoid the expansion of redundant branches given knowledge of realisedcontradictions, could be further explored in order to extract better performance.A potential consideration could be the use of caching, to store causes responsiblefor contradictions.

6.2 ConclusionThe objective of this report was to develop and implement a theorem prover for modallogic K that leveraged the efficiencies of an existing SAT solver. For this purpose, wedeveloped an algorithm that isolated the propositional and modal aspects of a tableauproof, and utilised a SAT solver for the propositional components of the proof search.Furthermore, we also show that algorithm that underpins our theorem prover is soundand complete.

Whilst memory efficient, the resulting performance of our theorem prover is verypoor when compared to other state-of-the-art theorem provers. Our results indicatethat using a SAT solver with an unlabelled tableau system may potentially not be aneffective approach for searching a tableau tree in modal logic K. This is due to the factthat we are forced to generate our branching in a very naive manner. However, ourability to conclude ineffectiveness of this SAT-based method is not definite given ourchoice of programming language and potential for algorithm optimisation. Despite this,given our use of modal clausal forms, there remain avenues to pursue this researchfurther in the implementation of tableau calculi for multimodal logics of belief [7].

Page 45: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight
Page 46: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

References

[1] Balsiger, P., Heuerding, A., and Schwendimann, S. (2000). A benchmark methodfor the propositional modal logics k, kt, s4. Journal of Automated Reasoning,24(3):297–317.

[2] Claessen, K. and Rosén, D. (2015). Sat modulo intuitionistic implications. In Davis,M., Fehnker, A., McIver, A., and Voronkov, A., editors, Logic for Programming,Artificial Intelligence, and Reasoning, pages 622–637, Berlin, Heidelberg. SpringerBerlin Heidelberg.

[3] de Moura, L. and Bjørner, N. (2008). Z3: an efficient smt solver. 4963:337–340.

[4] Everitt, T. (2010). Automated theorem proving. Master’s thesis, StockholmUniversity.

[5] Gabbay, D. M. and Guenthner, F. (2013). Handbook of Philosophical Logic: Volume17. Springer Publishing Company, Incorporated.

[6] Goré, R. (1999). Tableau Methods for Modal and Temporal Logics, pages 297–396.Springer Netherlands, Dordrecht.

[7] Goré, R. and Nguyen, L. A. (2009). Clausal tableaux for multimodal logics of belief.Fundam. Inf., 94(1):21–40.

[8] Harrison, J. (2009). Handbook of Practical Logic and Automated Reasoning. Cam-bridge University Press.

[9] Heuerding, A., Jäger, G., Schwendimann, S., and Seyfried, M. (1996). The logicsworkbench lwb: a snapshot. Euromath Bulletin, 2:177–186.

[10] Hustadt, U. and Schmidt, R. A. (1998). Simplification and backjumping in modaltableau. In de Swart, H., editor, Automated Reasoning with Analytic Tableaux andRelated Methods, pages 187–201, Berlin, Heidelberg. Springer Berlin Heidelberg.

[11] Kaminski, M. and Tebbi, T. (2013). Inkresat: modal reasoning via incrementalreduction to sat.

[12] Koehler, J., Tirenni, G., and Kumaran, S. (2002). From business process modelto consistent implementation: A case for formal verification methods. In EDOC.

[13] Kripke, S. A. (1963). Semantical considerations on modal logic. Acta PhilosophicaFennica, 16(1963):83–94.

Page 47: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

32 References

[14] Ladner, R. E. (1977). The computational complexity of provability in systems ofmodal propositional logic. SIAM Journal of Computing.

[15] Li, Z. (2008). Efficient and Generic Reasoning for Modal Logics. PhD thesis, TheUniversity of Manchester.

[16] Marek, V. W., Shvarts, G. F., and Truszczynski, M. (1991). Modal nonmonotoniclogics: Ranges, characterization, computation. J. ACM, 40:963–990.

[17] Moss, L. S. and Tiede, H.-J. (2007). 19 applications of modal logic in linguistics.In Blackburn, P., Benthem, J. V., and Wolter, F., editors, Handbook of Modal Logic,volume 3 of Studies in Logic and Practical Reasoning, pages 1031 – 1076. Elsevier.

[18] Nguyen, L. A. (1999). A new space bound for the modal logics k4, kd4 ands4. In Kutyłowski, M., Pacholski, L., and Wierzbicki, T., editors, MathematicalFoundations of Computer Science 1999, pages 321–331, Berlin, Heidelberg. SpringerBerlin Heidelberg.

[19] Olesen, K. (2012). Using bdds to implement propositional modal tableaux. Master’sthesis, The Australian National University.

[20] Portoraro, F. (2014). Automated reasoning. In Zalta, E. N., editor, The StanfordEncyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, winter2014 edition.

[21] Tsarkov, D. and Horrocks, I. (2006). Fact++ description logic reasoner: Systemdescription. In Furbach, U. and Shankar, N., editors, Automated Reasoning, pages292–297, Berlin, Heidelberg. Springer Berlin Heidelberg.

Page 48: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

Appendix A

Independent study contract

The final project description was to implement a sound and complete SAT-basedtheorem prover for modal logic K. The theorem prover was to show validity of aninput modal formula. The prover was to be implemented in Python and make use ofan existing SAT-solver. For this purpose, we decided to use Z3-solver by Microsoft.Another key requirement of the implementation was transformation of the modalproblem into modal clausal form.

A signed copy of the independent study contract is contained on the followingtwo pages. We note that the main deviations from this initial contract relate to theprogramming language and SAT solver used in the implementation.

Page 49: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight
Page 50: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight
Page 51: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight
Page 52: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

Appendix B

Software artefacts

Our theorem prover has been implemented using Python, and requires installationof the Z3 python package. The software structure has been separated into two mainpackages:

1. Clausifier: lexical analysis, parsing and transforming the input formula intomodal clausal form

2. Prover: constructs the tableau proof and returns the result to the top level caller

Both these packages are contained within the src folder and every sub modulewithin these packages have been written by me for the purpose of this project. Thetop level calling function is the main.py file, which is also in the src folder.

Both the clausifier and prover were tested using benchmark formula, as describedin subsection 5.1, for modal logic K. The test scripts are saved within each packagefolder along with a single set of results.

The test method for our clausification functions are detailed in subsection 3.1.4.We note that this test script relies on an external theorem prover which must acceptformula represented in the same syntax per the README file included in Appendix C.For the purpose of our testing we used BBDtab - this software has not been includedwithin our submitted artefacts.

To test the correctness of our prover, we simply analysed the output per thebenchmark problems saved in the submitted artefacts folder. The desired result of eachbenchmark problem is known from the outset, therefore so we compared the output ofour theorem prover to known result in order to determine correctness. There are 378problem instances in the LWB benchmarks.

Page 53: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

38 Software artefacts

Our testing and implementation was completed on a standard desktop using a linuxoperating system. The specifications of the desktop are an intel 2.60GHz CPU, with8GB of memory. Python3.6 was the version of compiler used.

Page 54: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

Appendix C

README file for software

A copy of the README file for this software is contained on the following page.

Page 55: Automated Reasoning for Artificial Intelligencecourses.cecs.anu.edu.au/courses/CSPROJECTS/18S1/reports/u5453384.pdf · Chapter1 Introduction Givenanarbitrarystatement,therearepotentiallymanyreasonsastowhywemight

README.md Sat May 19 22:47:49 2018 1

# SATtabA SAT-Based Theorem Prover for Modal Logic K.

### PrerequisitesTo run this software you will need to install the ply and z3-solver packages.Depending on your Python version, these packages can be installed using the following commands: * pip3 install ply * pip3 install z3-solver

### Using the softwareUsage: python3 src/main.py [-v]

The prover will read one line of standard input as a modal logic formula, andwill return whether this formula is provable or not. That is, it will negatethe input formula, and test whether this negation is satisfiable.

There are several further options:-v Output verbose basic statistics about the internal workings of the program.

The prover accepts formulae in the following syntax:

fml ::= ’(’ fml ’)’ ( parentheses ) | ’True’ ( truth ) | ’False’ ( falsehood ) | ’˜’ fml ( negation ) | ’<’ id ’>’ fml | ’<>’ fml ( diamonds ) | ’[’ id ’]’ fml | ’[]’ fml ( boxes ) | fml ’&’ fml ( conjunction ) | fml ’|’ fml ( disjunction ) | fml ’=>’ fml ( implication ) | fml ’<=>’ fml ( equivalence ) | id ( classical literals )

where identifiers (id) are arbitrary nonempty alphanumeric sequences([’A’-’Z’ ’a’-’z’ ’0’-’9’]+)

## Running the benchmarksThere are several benchmark sets that are available in the ’benchmarks’ folder. The problem files ending with .k are located within each subfolder and are represented in the correct syntax.

A benchmark testing script can be run as following: * python3 src/prover/testing/testing_prover.py

## Authors

* **Darren Lawton**

## License

This project is licensed under the MIT License

## Acknowledgments

* Professor Rajeev Gore