A QUALITATIVE REASONING FRAMEWORK FOR THE SIMULATION OF SN1 AND SN2 MECHANISMS

IN ORGANIC REACTIONS

TANG YEE CHONG

THESIS SUBMITTED IN FULFILMENT OF THE REQUIREMENTS

FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

FACULTY OF COMPUTER SCIENCE & INFORMATION TECHNOLOGY

UNIVERSITY OF MALAYA KUALA LUMPUR

FEBRUARY 2011

Abstract

In organic chemical reactions, one has to understand the many cognitive steps involved before a stable product is formed. Understanding these cognitive steps is among the many difficulties faced by chemistry students. Traditional chemistry educational software is inadequate in promoting understanding of why and how things happen. These programs do not “explain”, simply because their results are obtained through chaining of rules or by searching reaction routes that have been pre-coded in the software.

This thesis describes a qualitative reasoning framework for the simulation of SN1 and SN2 mechanisms in organic reactions based on Qualitative Process Theory (QPT). The modelling constructs of QPT provide grounds for representing chemical theories qualitatively with notions of causality, which can be used to explain the behaviour of a chemical system. The major theme of this framework is that, in a qualitative simulation environment, students are able to articulate their knowledge through the inspection of explanations generated by the software. These students are seen as the recipients of knowledge delivered via the “explanation” pedagogy. To test the framework, a simulator prototype named QRiOM (Qualitative Reasoning in Organic Mechanism) was implemented.

Specifically, this thesis investigates the qualitative reasoning approach and the QPT ontology as applied to the task of constructing qualitative models and generating explanations for the simulation of organic chemical reactions. The framework focuses on two issues: (1) automation of qualitative model construction for organic reaction processes, and (2) improvement of the explanation generation approach, since current chemistry software cannot appropriately explain a chemical phenomenon. In this work, “make-bond” and “break-bond” are identified as two generic processes in the simulation of organic reactions. From an analysis of various chemical reactions occurring under SN1 and SN2 mechanisms, the common set of chemical theories and behaviour for the generic processes has been identified, from which the model automation procedures are formulated. The lack of explanation in chemistry software is addressed by embedding a causal explanation generator that produces explanations in various forms. The generator justifies and explains a simulated result by tracing the chains of causality that stem from QPT model reasoning. These features are demonstrated via QRiOM.

Since QRiOM was developed to promote learners’ understanding of organic chemical reactions, its effectiveness in explaining organic chemical phenomena has also been evaluated. The evaluation results show that the tool enhanced students’ knowledge of organic chemical reactions and mechanisms.

This thesis makes two main contributions. The first is the application of QPT to model various organic chemical reactions occurring under SN1 and SN2 mechanisms and to reproduce the chemical behaviour of the SN1 and SN2 mechanisms “intuitively”. The thesis also provides justification that QPT can be used effectively to support learning. The second contribution is the development of an explanation module obtained directly from the process model. This explanation module can be generalized and used in other systems.

Abstrak

In organic chemical reactions, one must understand the cognitive steps involved before a stable product is formed. Understanding these cognitive steps is one of the problems faced by chemistry students. Traditional chemistry software for learning is inadequate in improving understanding of why and how something happens. These programs cannot explain a concept because their results are obtained through the use of rules and facts or by searching reaction routes that have been coded into the software.

This thesis describes a qualitative reasoning framework for simulating SN1 and SN2 mechanisms in organic reactions based on Qualitative Process Theory (QPT). The modelling constructs available in QPT provide a basis for representing chemical theories qualitatively, which can be used to explain the behaviour of a chemical system. The main theme of this framework is that, in a qualitative simulation environment, students are able to articulate their knowledge by inspecting the explanations generated by the software. These students are regarded as recipients of knowledge delivered through the “explanation” pedagogy. To test the framework, a simulator prototype named QRiOM (Qualitative Reasoning in Organic Mechanism) was developed.

Specifically, this thesis examines the qualitative reasoning approach and the QPT ontology for building qualitative models and simulations that generate explanations for organic chemical reactions. The framework focuses on several issues relating to: (1) automatic construction of qualitative models for organic reactions, and (2) improvement of the approach to “explanation”, because current chemistry software cannot accurately describe chemical phenomena. In this study, “make-bond” and “break-bond” are identified as two generic processes in the simulation of organic reactions. From an analysis of various chemical reactions occurring under the SN1 and SN2 mechanisms, the common theories and behaviour of these generic processes have been identified, from which the procedures for model automation are formulated. The lack of explanation in chemistry software is addressed with a “causal explanation” generator that produces various forms of explanation. The generator describes and explains the simulation results by tracing the causal chains from the QPT model. These features are demonstrated through QRiOM. The purpose of QRiOM is to improve students’ understanding; therefore, the effectiveness of QRiOM in explaining organic chemistry phenomena has been evaluated. The evaluation results show that QRiOM can improve students’ knowledge of organic chemistry and reaction mechanisms.

This thesis has two main contributions. The first is the use of QPT to model various organic chemical reactions under the SN1 and SN2 mechanisms and to reproduce the behaviour of the SN1 and SN2 mechanisms “intuitively”. The thesis also provides justification that QPT can be used effectively to support learning. The second contribution is the development of an explanation module in the QRiOM prototype. This module can also be used in other systems.

Acknowledgements

I would like to express my sincere gratitude to my supervisor, Dr. Rukaini Abdullah, for her intellectual support, guidance and suggestions for improvements on my thesis. These had a significant impact on this thesis. I would also like to thank her for offering abundant ideas on fixing the structure of this thesis and for being very patient with my progress.

My heartfelt thanks are extended to Professor Dr. Sharifuddin Mohd Zain for being a great supervisor and friend. Many thanks are due for his insightful comments, research perspective, and encouragement throughout the period of this research. I feel extremely fortunate to have him as my supervisor.

I would also like to express my sincere appreciation to Professor Dr. Noorsaadah Abdul Rahman. I can never forget the several extremely valuable discussions we had during the undertaking of this research. I can still remember the first time I studied organic chemistry, when I could not differentiate between nucleophiles and electrophiles. I would like to thank her for having confidence and trust in me. The support from my three supervisors formed the kernel around which this thesis has developed.

Special gratitude is extended to Associate Professor Dr. S.M.F.D. Syed Mustapha for introducing me to the wonders of qualitative reasoning. His contributions to this work include the time he devoted to guiding me and the expertise he shared with me at the beginning of this research work.

I would also like to thank the following colleagues of mine for their motivation and help:

• Dr. Sharifah Mumtazah Syed Ahmad (Systems & Networking Department, Head) – for her full support to enable me to complete this work at the earliest possible time.

• Assoc. Prof. Dr. Siti Salbiah (College of Information Technology, Dean) – for approving several sponsorships to enable me to present papers at overseas conferences.

• Dr. Abdul Rahim Ahmad (College of Information Technology, Deputy Dean) – for his willingness and kindness to share his research thoughts with me.

• Dr. Chai Mee Kin – who helped me in the earlier part of this research where I needed expertise in the domain of inorganic and organic chemistry.

• Asma Shakil – for proofreading a few chapters of this thesis.

• My colleagues in College of Engineering, Science & Mathematics department – for stimulating many interesting ideas in my work.

Lastly, closer to home, I am most grateful to my family members for their love and support throughout the long process required to complete my PhD. My deepest gratitude goes to my father for his encouragement during the course of this research.

Dedication

This thesis is dedicated to a woman of strength and wisdom, who wished to see me reach this point in my education path. She is my late mother – Madam H.M. Chua.

Table of Contents

Abstract..............................................................................................................................ii

Abstrak..............................................................................................................................iv

Acknowledgements...........................................................................................................vi

Dedication.......................................................................................................................viii

Table of Contents.............................................................................................................ix

List of Figures.................................................................................................................xiii

List of Tables....................................................................................................................xx

List of Abbreviations.....................................................................................................xxii

Chapter 1 Introduction .................................................................................................... 1

1.1 Introduction ......................................................................................................... 1

1.2 Background Review ............................................................................................ 2

1.2.1 Qualitative Reasoning ............................................................................... 2

1.2.2 Qualitative Process Theory (QPT) ............................................................ 8

1.2.3 Organic Reaction and Organic Mechanism ............................................. 12

1.3 Problem Statement ............................................................................................ 17

1.4 Objectives ......................................................................................................... 19

1.5 Research Questions ........................................................................................... 20

1.6 Scope of Research ............................................................................................. 25

1.6.1 System Scope........................................................................................... 25

1.6.2 Course Scope ........................................................................................... 26

1.7 Main Results ..................................................................................................... 26

1.8 Thesis Structure ................................................................................................ 27

Chapter 2 Literature Review......................................................................................... 31

2.1 Introduction ....................................................................................................... 31

2.2 Review of the Literature on Qualitative Reasoning Applications .................... 31

2.2.1 In Industry ............................................................................................... 32

2.2.2 In Education............................................................................................. 33

2.3 Review of the Literature on Work Using Qualitative Process Theory ............. 36

2.4 Analyzing Domain Suitability .......................................................................... 38

2.4.1 Explaining Organic Chemical Reactions in the Classroom .................... 40

2.5 Use of Artificial Intelligence in Organic Chemistry ......................................... 40

2.5.1 The Traditional Knowledge-based Approach ......................................... 41

2.5.2 The Machine Learning Approach ............................................................ 41

2.6 Molecular Representation Schemes .................................................................. 43

2.6.1 The Simplified Molecular Input Line Entry System (SMILES) Codes .. 43

2.6.2 International Chemical Identifier (InChI) ............................................... 44

2.7 Related Works ................................................................................................... 45

2.7.1 LHASA ................................................................................................... 46

2.7.2 QALSIC ................................................................................................... 47

2.7.2.1 Limitations and Problems in the QALSIC Program ................. 48

2.7.2.2 Organic Reactions Versus Inorganic Reactions ........................ 50

2.7.2.3 Inorganic Reactions in Qualitative Reasoning: The Problems . 52

2.7.2.4 Discussion ................................................................................. 58

2.8 Conclusion ........................................................................................................ 59

Chapter 3 Qualitative Modelling of Organic Reactions ............................................. 60

3.1 Introduction ....................................................................................................... 60

3.2 State of the Art in Qualitative Modelling ......................................................... 61

3.3 Domain Knowledge Acquisition ....................................................................... 64

3.4 Understanding Organic Chemistry Reactions ................................................... 65

3.5 Organic Reaction as Modelling Task ................................................................ 67

3.5.1 Chemical Equation as a Reasoning Task................................................. 69

3.6 The Underlying Thought Processes for Organic Reactions .............................. 70

3.6.1 Individual Views Identification ............................................................... 75

3.6.2 Representing Individual Views ............................................................... 77

3.6.3 Relation Between View Pairs and Organic Processes ............................. 79

3.7 Reaction Steps Classified as “make-bond” and “break-bond” Processes ........ 81

3.7.1 Proof of Common Behaviour Exhibited in Organic Processes ............... 82

3.7.1.1 Behaviour Generalization for “make-bond” Process ................ 84

3.7.1.2 Behaviour Generalization for “break-bond” Process ................ 87

3.8 Representing Organic Chemistry Theories Using QPT Constructs .................. 90

3.8.1 Direct and Indirect Influences in Organic Reaction Simulation ............. 91

3.8.2 Postulating Limit Points .......................................................................... 91

3.8.3 The Quantities and Quantity Spaces........................................................ 94

3.9 Useful Guidelines in Modelling Views for Organic Reactions ........................ 95

3.10 Learning with Qualitative Models .................................................................... 98

3.10.1 Ontology Primitives as Explanation Facilitator .................................... 99

3.10.2 Learning Activities Manifestation ....................................................... 100

3.11 Conclusion ...................................................................................................... 102

Chapter 4 Qualitative Simulation and Explanation Generation ............................. 103

4.1 Introduction ..................................................................................................... 103

4.2 State of the Art in Qualitative Simulation and Explanation in Education ...... 104

4.3 Qualitative Simulation Scenario ..................................................................... 109

4.3.1 An Overview of the Simulation Architecture ........................................ 110

4.3.2 Reproducing Behaviour of Organic Reactions via QPT Reasoning ..... 113

4.4 Chemical Behaviour of SN1 and SN2 Mechanisms ......................................... 115

4.4.1 The SN1 Mechanism .............................................................................. 116

4.4.2 The SN2 Mechanism .............................................................................. 120

4.5 Simulation Scenario for Reproducing the Behaviour of SN1 .......................... 121

4.5.1 Contents of the View Instance Structure (VIS) During Reasoning ....... 124

4.5.2 Stopping Conditions for Reaction Steps and the Entire Simulation...... 125

4.6 QPT Process Model as Reusable Component ................................................. 126

4.6.1 Model Reuse by SN2 .............................................................................. 128

4.6.2 Model Reuse Scenario ........................................................................... 129

4.7 Qualitative Explanation Manifestation ........................................................... 131

4.7.1 Generating a Causal Graph .................................................................... 132

4.7.2 Design of Causality ............................................................................... 134

4.7.3 Interpreting a Causal Graph................................................................... 138

4.7.4 Deriving Explanation From a Causal Graph ......................................... 140

4.8 Discussion ....................................................................................................... 141

4.9 Conclusion ...................................................................................................... 142

Chapter 5 Qualitative Reasoning Framework for Organic Reaction Simulation .. 144

5.1 Introduction ..................................................................................................... 144

5.2 The Qualitative Reasoning Framework .......................................................... 145

5.2.1 Inputs ................................................................................................. 145

5.2.2 Outputs ................................................................................................. 146

5.2.3 Software Components ........................................................................... 146

5.3 Component Design .......................................................................................... 152

5.3.1 The Two-tier Architecture of Knowledge Base .................................... 152

5.3.2 The Chemical Knowledge Base ............................................................ 154

5.3.3 OntoRM: Objectives and Motivations ................................................... 155

5.3.3.1 The Design of OntoRM .......................................................... 156

5.3.3.2 Validation Examples ............................................................... 161

5.3.4 The Substrate Recognizer ...................................................................... 165

5.3.5 The Model Constructor for Organic Processes ..................................... 166

5.3.6 The Reasoning Engine for Reaction Simulation ................................... 167

5.3.7 The Causal Model Generator ................................................................. 168

5.4 Storing Molecular Patterns in Software .......................................................... 170

5.4.1 Design of Attributes and Methods for an Atom .................................... 171

5.4.2 Connection Table................................................................................... 172

5.4.3 The Molecule Table ............................................................................... 173

5.5 Knowledge Structuring ................................................................................... 175

5.6 The Protocol for Interacting with QRiOM ...................................................... 178

5.7 Simulation Results and Discussion ................................................................. 180

5.7.1 Reaction Route ...................................................................................... 182

5.7.2 QPT Model ............................................................................................ 184

5.7.3 Causal Graph ......................................................................................... 185

5.7.4 Parameter State History and Atom Property Tables .............................. 186

5.7.5 List of Reacting Species (View Pairs) ................................................... 187

5.8 Conclusion ...................................................................................................... 188

Chapter 6 Evaluation of QRiOM ................................................................................ 190

6.1 Introduction ..................................................................................................... 190

6.2 The Evaluation Context .................................................................................. 190

6.3 Procedures Used for Conducting the Questionnaires ..................................... 192

6.3.1 Students’ Feedback on the use of QPT and Qualitative Reasoning Approaches ............................................................................................ 195

6.3.2 Assessment of Students’ Skills in Core Areas of Organic Reactions – The Pre-Questionnaire .................................................................................. 197

6.3.3 Assessment of Students’ Skills in Core Areas of Organic Reactions – The Post-Questionnaire ................................................................................ 198

6.3.4 Assessment of Effectiveness of QRiOM’s Explanation Facility .......... 200

6.3.5 Assessment of the Usefulness and Helpfulness of QRiOM .................. 202

6.3.6 Comments on Graphical User Interface Design .................................... 204

6.4 Conclusion ...................................................................................................... 205

Chapter 7 Conclusion ................................................................................................... 207

7.1 Thesis Summary .............................................................................................. 207

7.2 Results and Contributions ............................................................................... 209

7.2.1 Conceptual Framework Development ................................................... 210

7.2.1.1 QPT as the Knowledge Capture Tool ..................................... 211

7.2.1.2 Model Automation .................................................................. 212

7.2.2 QRiOM – A Tool for Explaining Organic Reactions ............................ 213

7.2.3 Evaluation Results of QRiOM............................................................... 214

7.3 Limitations ...................................................................................................... 217

7.4 Future Works .................................................................................................. 217

7.5 Concluding Remarks ....................................................................................... 219

References.......................................................................................................................220

Appendix A: A Summary of Systems Related to Qualitative Reasoning........................230

Appendix B: Collection of Flowcharts for the Qualitative Reasoning Framework........237

Appendix C: Questionnaires Used for Collecting Students’ Feedback on the use of QPT and Qualitative Reasoning Approaches ....................................................243

Appendix D: Selected Computer Screenshots.................................................................246

Appendix E: Program Snippets for the Main Software Modules in QRiOM....................259

List of Figures

Figure Page

1.1 A brick and an elastic string tied up at one end for demonstrating qualitative reasoning technique. .................... 4

1.2 A general view of qualitative reasoning. .................... 7

1.3 A general view of QPT. .................... 10

1.4 A “charge” quantity and its space. .................... 10

1.5 The five slots of a QPT process. .................... 11

1.6 Students’ barriers to understanding the organic chemistry course. .................... 15

1.7 Thesis layout. .................... 30

2.1 Some of the benefits of applying qualitative reasoning to a chemical system simulation. .................... 42

2.2 A proposed scheme to classify inorganic experiment types. .................... 50

3.1 Simulation entails reasoning from model. .................... 64

3.2 The conversion of a tertiary alcohol to yield alkyl chloride can be described as a series of three small steps. .................... 72

3.3 The production of a tertiary alcohol can be described as a series of three reaction steps. .................... 74

3.4 The “dissociation” and “reaction with HO−” are concerted steps. This is a typical SN2 backside attack reaction. .................... 75

3.5 (a) Generic definition for an electrophile described using QPT (b) An electrophile used in “make-bond” process (c) An electrophile used in “break-bond” process. .................... 78

3.6 (a) Generic definition for a nucleophile described using QPT (b) A nucleophile used in “make-bond” process (c) A “delta-minus” view. It is used when the covalent bond between a delta-plus and a delta-minus species is deleted. .................... 79

3.7 An instantiated “make-bond” process described using QPT modelling constructs. The process focuses on the nucleophile (the “OH”) to be replaced and the proton. .................... 89

3.8 An instantiated “break-bond” process described using QPT modelling constructs. The process focuses on the leaving group and the electrophilic carbon centre. .................... 90

3.9 Alcohol reactivity under SN1 mechanism. .................... 97

3.10 The QPT process specification that models the behaviour of a “make-bond” process. .................... 99

4.1 The use of qualitative reasoning, simulation and explanation within the context of this work. .................... 109

4.2 Workflow of the QPT-based reasoning. .................... 112

4.3 A “make-bond” model fragment represented using QPT. This model fragment is used to reproduce the behaviour of the first reaction step for “(CH3)3C–OH + HCl” reaction. .................... 114

4.4 Reaction between the hydroxide ion (OH−) and a tertiary halide. .................... 116

4.5 The stability of various structures of a carbocation under SN1 mechanism. .................... 118

4.6 A reaction that needs SN1. .................... 119

4.7 The organic processes occurred in the order of “make-bond” (Step 1), “break-bond” (Step 2) and “make-bond” (Step 3). The reaction can be explained by the SN1 mechanism. .................... 120

4.8 A “break-bond” model fragment represented using QPT. This model fragment is used to reproduce the behaviour of the second step of “(CH3)3COH + HCl”. .................... 123

4.9 A “make-bond” model fragment represented using QPT. This model fragment is used to reproduce the behaviour of the third step of “(CH3)3COH + HCl”. .................... 124

4.10 The contents in VIS during the simulation of “protonation” process. The VIS is constantly updated to reflect the new intermediates produced until the entire reaction is ended. Content in (d) is the final product. .................... 125

4.11 The contents in VIS during the simulation of the “dissociation” process. Content in (d) is the final product of this reaction. .................... 125

4.12 The QPT process models constructed for Equation 3.2 can be reused by other chemical equation simulation such as Equation 4.3. .................... 127

4.13 The mechanism used in this simulation is SN2. The organic processes that occurred are “break-bond” (expulsion of the leaving group) and “make-bond” (the approaching of the hydroxide ion to form a bond to the carbon centre). .................... 128

4.14 Model reuse scenario for the simulation of organic reactions. .................... 130

4.15 (a) A problem solving method that uses concepts to tackle multiple problems (b) A precoded KB of an expert system in solving a specific problem. .................... 131

xv

4.16 A causal graph showing cause-effect relationship of chemical parameters during the simulation of “(CH3)3C–OH + HCl” reaction.
4.17 Causal graph for the “protonation” process. The inequality above the dotted line is the entry condition to the process.
4.18 Causal graph for the “dissociation” process. The process stops when the oxygen (“O”) regains its equilibrium state.
4.19 Causal graph for the “Capturing of carbocation by anion” process.
5.1 A schematic view of the qualitative reasoning framework described in terms of the input, process, output and the knowledge bases.
5.2 Main software components of QRiOM.
5.3 Architectural design of the knowledge base.
5.4 Examples of chemical facts and theories used in reaction simulation.
5.5 Basic concepts in OntoRM ontology are hierarchically structured using the IS-A relation.
5.6 Properties of basic concepts defined in the ontology are encapsulated in the format of a Java class.
5.7 Chemical properties of SN1 and SN2.
5.8 The main steps in the model constructor module.
5.9 The main steps of the simulation algorithm.
5.10 Main steps in the QSA module.
5.11 The flowchart for generating a causal graph.
5.12 A substrate’s functional group represented as a connection table.
5.13 Connection table for initial structure of the substrate.
5.14 Connection table after the “protonation” process (“make-bond”). The digit “1” is filled in the correct entry based on the individuals that activate the process. “H2” indicates the newly added atom.
5.15 Algorithmic steps in the MUR module that updates the molecule table in order to prepare the reaction route of a chemical reaction.
5.16 A molecule table is represented as a 2D array. This is the initial structure of the alcohol substrate.
5.17 The “H” has been attached to the main compound. This is the effect of the generic “make-bond” process.

5.18 Protocol in using the simulator (Labels A – H can be found in Figure 5.19).
5.19 Main interface of the QRiOM software.
5.20 Screenshots showing two reaction routes generated by QRiOM at the end of a simulation.
5.21 A computer generated QPT model.
5.22 A causal graph generated by QRiOM that enables learners to examine the cause-effect relationships of chemical parameters during reasoning.
5.23 The states of chemical parameter of each reacting species involved in a simulation task can be examined in greater detail.
5.24 (a) The chemical states possessed by each reacting unit during simulation are stored in the atom property table (b) A reaction route drawn from using the data values in the atom property table.
5.25 The choice of reacting units for each reaction step and the intermediates produced are displayed for further inspection.
6.1 Flowchart of the QRiOM evaluation exercise.
6.2 Examples of survey questions used for measuring students’ understanding towards QPT.
6.3 Sample questions in a survey form that collect students’ opinions about qualitative reasoning and modelling approaches.
6.4 Students’ responses towards understanding QPT and qualitative reasoning approaches.
6.5 The survey form for course competency assessment distributed before/after using the simulator.
6.6 Student pre-test and post-test responses to the core skills.
6.7 Questions in the survey form for the measure of explanation-based learning in skills reinforcement.
6.8 Students’ feedback on the extent to which the tool improves one’s knowledge in terms of skill reinforcement through explanation-based learning.
6.9 Examples of the survey questions for the measure of usefulness and helpfulness of QRiOM in a student’s learning endeavour.
6.10 Students’ feedback on helpfulness (motivated) and usefulness (gain more confidence) of QRiOM.

7.1 Accomplishment of the QR approach when implemented in a tool for learning organic reactions.
B.1 Workflow of the QPT-based modelling, reasoning and explanation framework.
B.2 The task performed by the “Substrate Recognizer”.
B.3 Workflow for automating QPT model for organic processes.
B.4 Workflow of the QPT-based simulation and the micro steps in the QSA module.
B.5 Workflow of the technique used in handling and generating an explanation.
C.1 Questionnaire to assess students’ understanding on QPT.
C.2 Questionnaire to collect students’ opinions on qualitative modelling and reasoning approaches of problem solving for organic chemistry.
D.1 Login page.
D.2 Front page of the QRiOM qualitative simulator.
D.3 Main interface of QRiOM.
D.4 More learning activities and explanation can be viewed by clicking A, B and C buttons.
D.5 Reaction route for the simulation of “CH3Cl + HO−” is formed.
D.6 Reaction route for the simulation of “CH3CH3CH3Br + H2O” is formed.
D.7 Reaction route for “CH3CH3CH2Cl + HO−”.
D.8 QPT model inspection page.
D.9 A “make-bond” process described in QPT terms (between a charged nucleophile and a charged electrophile).
D.10 A causal graph showing the cause and effect relationships of the various chemical parameters during qualitative reasoning.
D.11 Causal graph inspection page with annotation.

D.12 Brief explanation of each slot in a QPT model.
D.13 More explanation for the various modelling constructs of QPT.
D.14 Contents of the View Instance Structure (VIS) give the pairs of reacting species used in each small reaction step.
D.15 A snapshot of the contents of the VIS during the simulation of “CH3CH3CH3COH + HBr”.
D.16 Each chemical state change (parameter state history) is recorded for further examination.
D.17 Chemical states for “HO-” are retrieved and displayed.
D.18 Contents in the “substrate table” showing the functional units involved in a reaction.
D.19 The screenshot for a specific case where QRiOM is unable to predict the output, where the reason is displayed via a pop-up window.
D.20 A screenshot of “no reasoning” for an input pair of <CH3Cl, HF>, where the system simply returns a short message.
D.21 A QPT learning corner is included in the software.
D.22 A “terminology help window” that provides quick notes for important organic chemistry terms used in simulation and explanation.
D.23 The main interface for “model building” by the students – for future expansion of the tool.
D.24 Knowledge base Editor – for adding/deleting chemical facts and theories.
E.1 The Java code for retrieving chemical facts of reacting species and for constructing a QPT process.
E.2 The associated Java statements for updating the VIS in order to suggest the next organic process in the qualitative simulation environment.
E.3 The Java code for updating the chemical parameters’ states of each atom during simulation.
E.4 The Java statements for constructing a causal graph.
E.5 The Java statements for retrieving the parameter history of a reacting unit.
E.6 A sample set of definitions for nucleophiles, electrophiles and the basic concepts of organic mechanisms.

E.7 A Java method that checks the nucleophilic reactivity for a pair of nucleophiles for possible substitution.
E.8 A Java method that checks whether a substrate can undergo SN1 or SN2.
E.9 A Java method that checks the types of individual views in order to recommend a suitable chemical process.
E.10 The associated Java statements to stop the entire reaction simulation.
E.11 The Java statements for displaying the organic processes in the order of occurring.
E.12 The Java statements to display the final product.

List of Tables

1.1 Some notations and semantics of the QPT modelling constructs.
1.2 Relationships between research problems and questions, objectives and the corresponding thesis chapters that answered them.
2.1 Some examples of SMILES codes.
2.2 Comparison of InChI to SMILES formats.
2.3 Comparison of the actual and simulated results for a selected sample of inorganic chemistry reactions.
3.1 Relationship between view pair and covalent bonding.
3.2 A summary of the covalent bonding needed by three chemical equations presented in this thesis.
3.3 Reacting species and their chemical changes in the “protonation” process (“make-bond”) of Equation 3.2.
3.4 Reacting species and their chemical changes in the “capturing of halide anion by carbocation” process (“make-bond”) of Equation 3.2.
3.5 Reacting species and their chemical changes in the “reacts with water” process (“make-bond”) of Equation 3.3 for the formation of alcohol.
3.6 Reacting species and their chemical changes in the “nucleophile attacks” process (“make-bond”) of Equation 3.4 for the formation of ethanol.
3.7 The reacting species involved in this “break-bond” process are “C” from the alkyl group and the “O” from the oxonium ion. The carbon is δ+, so that the electrons are pushed towards “O” which is more electronegative.
3.8 In this “break-bond” process, the atoms involved are “C” and “Br” from the same molecule.
3.9 In this “break-bond” process, the atoms involved are “C” and “Br”. Bromine is more electronegative than the other hydrogen substituents. So, it is the Br that leaves the molecule.
3.10 Quantity spaces and limit points for the three main quantities used in the framework.
3.11 Examples of quantities and associated quantity spaces.
4.1 A set of queries and explanations. The explanation is generated based on Step 1 in the causal graph presented in Figure 4.16.

4.2 A set of queries and explanations. The explanation is generated based on the second step of the causal graph presented in Figure 4.16.
5.1 Main modules and their roles.
5.2 Software modules and the associated inputs and outputs.
5.3 Data types and the associated values.
5.4 Some attributes and methods associated with an atom.
5.5 Three abstraction levels of knowledge for use in QRiOM.
5.6 Knowledge types, abstraction levels and roles for use in QRiOM.
5.7 Computer screenshots, objectives and the questionnaires used to test it.
6.1 Questionnaires and the fulfilment of respective educational objective.
A.1 Examples of educational software employing QR approaches.

List of Abbreviations

AI Artificial Intelligence

CAD Computer-Aided Design

CHMTRN Chemistry Translation

ES Expert Systems

GUI Graphical User Interface

InChI International Chemical Identifier

IT Information Technology

ITS Intelligent Tutoring Systems

IUPAC International Union of Pure and Applied Chemistry

KB Knowledge Base

KBS Knowledge-Based Systems

LG Leaving Group

LHASA Logic and Heuristics applied to Synthetic Analysis

MUR Molecule Update Routine

OntoRM Ontology for Reaction Mechanisms

OOP Object Oriented Programming

QPT Qualitative Process Theory

QR Qualitative Reasoning

QRiOM Qualitative Reasoning in Organic Mechanisms

QSA Quantity Space Analyzer

SMILES Simplified Molecular Input Line Entry System

SOM Self-Organizing Map

SN1 Unimolecular Nucleophilic Substitution

SN2 Bimolecular Nucleophilic Substitution

VIS View Instance Structure

Chapter 1 Introduction

1.1 Introduction

In the past, simulations were based on complex mathematical procedures. These

procedures are used for calculating how the specific aspects within the simulation are to

be manipulated. Numerical analysis based on mathematical models provides no

conceptual access to the objects and their behaviour in the simulation. It is not possible

to derive causal explanation of the behaviour of a particular system from the

mathematical models. As a result, approaches based on mathematical models are not

suitable for inclusion in learning tools for many science subjects such as organic

chemistry. A substantial body of research in Qualitative Reasoning (QR) has shown

that much powerful reasoning can be done with only partial or less detailed knowledge

and without using mathematical models with differential equations (Iwasaki, 1997).

QR can make acceptable predictions about new situations using only qualitative

information. Even though QR has been around for many years, to our knowledge no work

has been reported on organic reaction simulation using QR technology.

There have been many efforts to innovate in teaching and learning chemistry using

computer software. However, most chemistry educational software has used

traditional approaches (Cartwright, 1993). In standard rule-based systems, an

explanation is generated by tracing all the rules that were executed during a search for a

solution. As such, these systems are incapable of providing behavioural types of

explanation on demand such as explaining why things happen and how they happen.

The programs were often difficult and time consuming to learn to use. To improve

understanding in chemistry, chemical knowledge and chemical commonsense can be

represented using an appropriate ontology (as the knowledge representation tool) for

reasoning and simulation use. The reasoning approach referred to here is QR and the

appropriate ontology is Qualitative Process Theory (Forbus, 1984), a process-based QR

ontology. As there is a strong link between mental model and knowledge representation,

this relationship can help build a new kind of educational software for teaching science

subjects such as organic chemistry more effectively. In view of this, the QR approach

based on qualitative process theory was investigated and applied to the problem domain

described in this work.

1.2 Background Review

1.2.1 Qualitative Reasoning

Qualitative Reasoning (QR) is an area of research combining Artificial Intelligence (AI)

and cognitive science. Briefly, AI is an attempt to reproduce intelligent reasoning using

machines while cognitive science is the study of the human mind (thought). The field

of cognitive science overlaps AI. Cognitive science is an interdisciplinary field that has

arisen over the past few decades at the intersection of a number of existing disciplines,

including psychology, linguistics, computer science, philosophy, and physiology. The

study of QR was originally motivated by observing human reasoning. For example,

people who do not know differential equations reason about many physical phenomena

perfectly well. Scientists and engineers also rely on simpler, qualitative models when

interpreting data at an initial stage of a design. In numerical simulation, many of the

processes are characterized by differential equations that describe how the parameters of

objects change over time. However, the notion of “process” is more structured than

the appearance of the set of equations itself.

One of the main goals of qualitative reasoning research is to formalize the rules people

use to mentally simulate the behaviour of a system through time. The technique was

initially developed to model commonsense reasoning and human-like reasoning within

the physical world. Commonsense is not something that can be easily explained to

computers. Voltaire (1694–1778), one of the greatest French philosophers and writers,

once said that “commonsense is not so common”, and it is even more difficult to

formalize it for representation in computers. In a 2004 interview with Ubiquity

(a Web-based publication of the Association for Computing Machinery,

http://www.acm.org/ubiquity/interviews/v4i45_kuipers.html), Kuipers gave his perspective

on what he sees as “commonsense” knowledge. He defines commonsense as:

“…knowledge about the structure of the external world that is acquired and

applied without concentrated effort by any normal human that allows him or her

to meet the everyday demands of the physical, spatial, temporal and social

environment with a reasonable degree of success.”

According to Kuipers, if computational models of the human mind need to be built, then

the formalization of the kind of commonsense needed would have to be worked out. He

added that people need not argue about exactly what “commonsense” means,

just as biologists seldom argue about exactly what “life” means. One of the most

notable things about human commonsense is people’s ability to make sensible

judgments even when they do not know all the relevant information about a situation.

This is the so-called “incomplete knowledge”. Kuipers believes that part of the power

of human commonsense knowledge comes from the ability to represent and use

knowledge even when it is incomplete. Commonsense knowledge is also knowledge of

certain domains that children learn about at a young age (e.g. space, time, the conditions

and results of actions, objects and their properties and the properties of materials).

Based on his interpretation and description of the term “commonsense”, the domain

knowledge required to solve the organic reaction problem described in this thesis is

represented as mental models having only partial knowledge.

The understanding of commonsense reasoning would require the study of how to reason

qualitatively about processes, namely, the kinds of changes that occur and their effects.

The central role is played by qualitative simulation, i.e. the prediction of possible

behaviours consistent with the incomplete knowledge of the structure of the physical

system. Adopting the “brick and elastic string” example from Forbus (1984),

Figure 1.1 illustrates a physical situation that portrays a brick and an elastic string tied

up at one end. This example needs commonsense knowledge and qualitative

representation to represent its behaviours. By qualitative reasoning, some of the

conclusions that can be drawn from Figure 1.1 are as follows:

• Question 1: “What if it gets pumped?” A possible conclusion could be “If there is

no friction, the elastic spring will eventually break. If there is friction and the driving

energy is constant, then there will be a stable oscillation”.

• Question 2: “What happens if we let go of the block?” A possible conclusion would

be “Assuming the elastic spring does not collapse, the block will oscillate back and

forth and, if there is friction, it will eventually stop”.

Figure 1.1 A brick and an elastic string tied up at one end for demonstrating qualitative reasoning technique.

The above explanation is reasonable and much like the way a human interprets and

concludes what he/she observes. People can simulate these kinds of dynamical systems

with purely qualitative (symbolic) knowledge. This type of reasoning happens without

processing formal scientific theories that encompass the relevant features in a detailed,

numerical manner (or the specific numbers that would be required for a

mathematical model to run). Notice also that the commonsense conclusions can be

drawn without involving any mathematical expression, such as the physical law

F = m·a, where m is mass, a is acceleration and F is force. From the equation, it

can easily be seen that, for a fixed mass, increasing values of a are accompanied by increasing values of F.

However, the mathematical equation itself captures nothing about this important notion.

This is because in quantitative problem solving the representation of a system is a set of

mathematical formulas expressing the relations between the different parameters in the

system, without the representation of causality and physical structure. Given a set of

relations, there is no knowledge available about how the parameters relate to the

physical organization of the system, that is, topological structure is not explicit. QPT

(Section 1.2.2) offers the necessary modelling constructs to support notions which are

implicit in the equation, for example by stating that F and a have a monotonic

relationship.
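
The monotonic dependency that the equation leaves implicit can be stated directly as a qualitative proportionality. The following is a minimal Python sketch of that idea only; it is not the QPT notation itself and not the thesis's Java implementation. The names `QualProp` and `propagate` are hypothetical, and the mass is assumed to be constant and positive.

```python
from dataclasses import dataclass

# Signs of change: -1 decreasing, 0 steady, +1 increasing.
SIGNS = (-1, 0, 1)


@dataclass(frozen=True)
class QualProp:
    """Hypothetical record: `dependent` is qualitatively proportional to `cause`."""
    dependent: str
    cause: str
    polarity: int  # +1 for P+, -1 for P-

    def propagate(self, cause_sign: int) -> int:
        """Return the sign of change induced in the dependent quantity."""
        if cause_sign not in SIGNS:
            raise ValueError("sign of change must be -1, 0 or 1")
        return self.polarity * cause_sign


# Assumption: mass is constant and positive, so F rises and falls with a.
force_from_acceleration = QualProp(dependent="F", cause="a", polarity=+1)

for sign in SIGNS:
    induced = force_from_acceleration.propagate(sign)
    print(f"a changes by {sign:+d} -> F changes by {induced:+d}")
```

The sketch records only the direction of the dependency, which is exactly the causal information that the bare equation does not make explicit.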

Of all application areas, education has the closest links to QR research. An overview of QR research is

given in Bredeweg and Struss (2003), while an analysis of QR in education can be

found in Bredeweg and Forbus (2003). As noted earlier, symbolic reasoning is among

the fundamental capabilities of human intelligence. Systems based on qualitative

reasoning are expected to possess the ability to predict and explain the behaviour of

physical systems in qualitative terms without involving mathematical equations. Many

application areas can benefit from QR approaches. Some advantages are: (1) the ability

to cope with incomplete information, (2) the ability to return imprecise but correct

predictions, (3) the ability to explore alternatives, and (4) inherent

automatic interpretation. Recently, a new generation of QR-related tools has been

developed. These include software tools for domains such as ecology, engineering,

spatial data mining, information technology services, strategy game, Web services,

chemistry and the building of articulate software for commercial and educational uses.

A review of the literature on qualitative reasoning applications and on qualitative process

theory applications in education and training is given in Section 2.2 of Chapter 2.

Several ontologies for qualitative reasoning were introduced in the 1980s. Among

the well-known QR ontologies are process-centred (Forbus, 1984), component-based

(de Kleer and Brown, 1984) and constraint-based (Kuipers, 1986) approaches. These

languages provide new capabilities for science education software. By embedding

human-like models of entities and processes in the software, explanations that are

directly coupled to how specific results were derived can be provided. Although the

QR field has addressed diverse problem areas and developed a variety of theories and

systems, there are several features that are typical for many of the approaches and

theories. A general view of QR approaches and their typical features is presented in

Figure 1.2. Some of the most important ones are described in Bredeweg and Struss

(2003). The following five typical features are used in this work:

• Qualitative reasoning provides explicit representations of the conceptual knowledge,

and it requires ontologies to support its knowledge representation. This layer of

representation is crucial to any attempt to support model building and even more to

automate it. The two families of ontology are (1) interacting processes, and (2)

interconnected components.

• Explaining the behaviour of a system in terms of its cause-effect chain (more

commonly termed as causality). QR formalisms which make causality explicit are

of value in education (Forbus and Gentner, 2009).

• Most QR systems adopt a reductionist view of the world and aim at building

libraries of elementary, independent model fragments (e.g. processes, component

behaviour). This approach is called compositional modelling, which provides a basis

for reusing models, a desirable feature for many industrial applications.

• Include only those distinctions in a behaviour model that are essential for solving a

particular task. The goal is to obtain a finite representation that leads to coarse,

intuitive representations of models (not a detailed design with a complete set of

numerical data). This feature is termed “qualitativeness”.

• Inference of behaviour from structure.

Figure 1.2 A general view of qualitative reasoning.

(a) Main approaches of qualitative reasoning:

Component-based: de Kleer & Brown’s confluence-based qualitative physics. E.g. Models of automotive electronics (Struss and Price, 2004).

Process-based: Forbus’s qualitative process theory. E.g. CyclePad (Forbus et al., 1999) and Garp workbench (Bredeweg et al., 2007).

Constraint-based: Kuipers’s QSIM, where qualitative mathematics is used directly in simulation. E.g. Error-based Simulation (Horiguchi et al., 2007).

(b) Typical features of qualitative reasoning: explicit representation of conceptual knowledge; inference of behaviour from structure; qualitativeness (intuitive representation of model); reductionist (compositional modelling); explaining the behaviour in terms of the cause-effect chain.

1.2.2 Qualitative Process Theory (QPT)

An ontology is a specification of a representational vocabulary for a shared domain of

discourse. In the simplest sense, it is a model of “meanings” that represents our

conceptualization of the world. An ontology has the potential to facilitate the formation of

semantic relationships between various portions of useful information to enhance the

learning experience in an educational setting (Yang, 2007). Our qualitative models are

constructed using QPT. QPT is a process-centred ontology that supports knowledge

acquisition (gathering the relevant knowledge) and model construction (creation of

relationships among chemical parameters) in the simulation environment. Figure 1.3

shows a general view of QPT.

A physical situation is usually described in terms of a collection of objects, their

properties and the relationships between them. QPT provides the means to draw basic

qualitative deductions and to reason about the combined effects of several

processes in a physical situation. The modelling constructs of QPT will be discussed in

turn. Note that words in italics, which will appear in later illustrations, are QPT

modelling constructs. In QPT, a model can be constructed for an individual view or a

process. The individual views describe objects and their general characteristics while

processes are the agents that cause changes in objects over time. In other words, a

process supports changes in system behaviour. There are five slots in a process

specification, namely Individuals, Preconditions, Quantity-conditions, Relations

(statements about functional dependencies among objects’ characteristics) and Direct

Influences (denoted by I+ or I−).
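
The five slots of a process specification can be pictured as a simple record. The sketch below is an assumption-laden illustration in Python, not the Java classes used in QRiOM: the `Process` class, its field names and the slot contents of the example “make-bond” process are all hypothetical placeholders chosen to mirror the slot names above.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A qualitative state maps a quantity name to its current qualitative value.
State = Dict[str, str]
Condition = Callable[[State], bool]


@dataclass
class Process:
    """Hypothetical container mirroring the five slots of a QPT process."""
    name: str
    individuals: List[str]                                # objects acted on
    preconditions: List[Condition] = field(default_factory=list)
    quantity_conditions: List[Condition] = field(default_factory=list)
    relations: List[str] = field(default_factory=list)    # e.g. P+/P- statements
    influences: List[Tuple[str, str]] = field(default_factory=list)  # (I+ or I-, quantity)

    def is_active(self, state: State) -> bool:
        """A process is active only while all of its conditions hold."""
        checks = self.preconditions + self.quantity_conditions
        return all(check(state) for check in checks)


# Illustrative slot contents only; they are not copied from the QRiOM models.
make_bond = Process(
    name="make-bond",
    individuals=["nucleophile", "electrophile"],
    preconditions=[lambda s: s.get("same_solution") == "yes"],
    quantity_conditions=[
        lambda s: s.get("charge(nucleophile)") == "negative",
        lambda s: s.get("charge(electrophile)") == "positive",
    ],
    relations=["charge(nucleophile) P- lone_pairs(nucleophile)"],
    influences=[("I+", "covalent_bonds(electrophile)")],
)

state = {
    "same_solution": "yes",
    "charge(nucleophile)": "negative",
    "charge(electrophile)": "positive",
}
print(make_bond.is_active(state))
```

Keeping the conditions separate from the relations and influences reflects the distinction in the text: the first three slots decide when a process applies, while the last two describe what it changes.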

One of the important modelling constructs for describing the relationships between

quantities is the qualitative proportionality (P+/P-). Qualitative proportionalities

express unknown monotonic functions (increasing/decreasing/unchanged) between two

quantities (e.g. charge, covalent bond, lone pair electrons, electro-negativity and

nucleophilic reactivity) and thereby propagate the effects of processes. A quantity space is

defined by a set of alternating points (e.g. [negative, neutral, positive]). For example, at

any given point of time, the charge of any atom is either negative or neutral or positive

(Figure 1.4). The signs of change (e.g. [-1, 0, 1]) are used for assigning values to

quantities. Note that “-1” means decreasing (i.e. the value on the left side of the current

state in a quantity space will be assigned); “0” is non-changing, and “1” means

increasing (i.e. the value on the right side of the current state in a quantity space will be

assigned). When a quantity’s value is above or below a specific limit point, some

physical phenomena occur. As an example, suppose that an atom’s current charge is

“neutral” and the sign of change for this parameter in a given process is “1”, then the

new state of its charge will be “positive”. Direct influences are represented as I+ (Q1,

Q2) and I- (Q1, Q2) where “Q” stands for “Quantity”. Influences can either be positive

or negative. A subset of the QPT notational system used in this work is defined in Table

1.1. In chemistry, changes are caused by continuous physical processes. These changes

propagate through the system via qualitative proportionalities which indicate causal

relationships between quantities.
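The quantity space and sign-of-change ideas described above can be illustrated with a minimal Python sketch. The class name QuantitySpace and its shift method are hypothetical names introduced only for this illustration; they are not taken from the QRiOM implementation. The sketch assumes that a sign of change moves a quantity exactly one step along its ordered quantity space.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuantitySpace:
    """Ordered candidate values for one qualitative parameter (hypothetical helper)."""
    name: str
    values: tuple

    def shift(self, current, sign):
        """Return the value reached from `current` under a sign of change.

        sign = -1 takes the value on the left, 0 keeps the value,
        and 1 takes the value on the right of the current state.
        """
        if sign not in (-1, 0, 1):
            raise ValueError("sign of change must be -1, 0 or 1")
        index = self.values.index(current) + sign
        if not 0 <= index < len(self.values):
            raise ValueError(f"{self.name} cannot move beyond {current}")
        return self.values[index]


# The "charge" quantity and its space, as in Figure 1.4.
charge = QuantitySpace("charge", ("negative", "neutral", "positive"))

# A neutral atom whose charge has sign of change 1 becomes positive.
new_state = charge.shift("neutral", 1)
```

Here new_state holds "positive", which matches the worked example in the text (a neutral charge with sign of change "1").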


Figure 1.3 A general view of QPT. [Figure content: Qualitative Reasoning Ontology; Individual-Views modelling (describe objects and their general characteristics); Processes modelling (processes are agents that cause changes in objects over time); Interacting Processes (all causal changes stem from physical processes). Key ideas: 1. Direct influences; 2. Qualitative proportionalities; 3. Sign of change; 4. Amount in magnitude; 5. Correspondences; 6. Quantity space.]

Figure 1.4 A “charge” quantity and its space. [Figure content: the quantity “charge” with the quantity space [negative, neutral, positive].]


Table 1.1: Some notations and semantics of the QPT modelling constructs.

Notation    Description

Is    Direct influence notation. E.g. s = {-1, +1}, where +1 = increase (more covalent bonds are made); -1 = decrease (reduce a covalent bond). Only processes can have direct influence.

P    Proportionality statements. E.g. Q1 $P^{+}_{-}$ Q2 means “an increase in Q2 will cause a decrease in Q1”, where Q1 and Q2 are chemical parameters.

Am    Amount in magnitude. E.g. the number of valence electrons of an atom.

Ds    Sign of change. E.g. -1 = negative (move to the left side of the quantity space), 1 = positive (move to the right side of the quantity space). A quantity space is a set of candidate values for a chemical parameter used in simulation.

This work expresses the general chemical principles of organic reactions as QPT

processes, where processes are the mechanisms that support changes in the chemical

system behaviours. The five slots of a QPT process specification are depicted in Figure

1.5.

Figure 1.5 The five slots of a QPT process. [Figure content: QPT process with the slots 1. Individuals; 2. Preconditions; 3. Entry-condition; 4. Direct-influence; 5. Relations.]
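The five-slot process specification can be pictured as a simple record. The following Python sketch is illustrative only: the class QPTProcess, the is_active check and the string encodings of conditions are assumptions made for this example, not the representation used in QRiOM. The break-bond contents are loosely based on the tert-butyloxonium example discussed in Section 1.2.3.

```python
from dataclasses import dataclass, field


@dataclass
class QPTProcess:
    """One QPT process with its five slots (hypothetical encoding)."""
    name: str
    individuals: list = field(default_factory=list)
    preconditions: list = field(default_factory=list)
    quantity_conditions: list = field(default_factory=list)
    relations: list = field(default_factory=list)
    influences: list = field(default_factory=list)

    def is_active(self, facts):
        """Assume a process is active when every condition is among `facts`."""
        conditions = self.preconditions + self.quantity_conditions
        return all(condition in facts for condition in conditions)


# Hypothetical instance; the condition strings are labels, not parsed formulas.
break_bond = QPTProcess(
    name="break-bond",
    individuals=["C", "O"],
    preconditions=["bonded(C, O)"],
    quantity_conditions=["no-of-bond(O) > valency(O)"],
    relations=["charge(C) increases when no-of-bond(C) decreases"],
    influences=["I-(no-of-bond(O))"],
)

active = break_bond.is_active({"bonded(C, O)", "no-of-bond(O) > valency(O)"})
```

Keeping the conditions separate from the relations and influences mirrors the split between when a process applies and what it changes.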

This ontology is suitable for testing our reaction cases since in this formalism, changes

are caused by continuous physical processes (e.g. the series of organic processes),

which provide the notion of mechanism for causality (the way a phenomenon or a


prediction is explained). In QPT, histories are used to represent how things change

through time. This notion of history provides the means to describe the mechanism

used to produce a synthesis path (the path that leads the initial substrate to the formation

of the final product). We will also demonstrate that the modelling constructs presented

in Table 1.1 are sufficient to provide explanation at a conceptual and intuitive level to

chemistry students. Subsequently, we use representations inspired by QPT to develop

the reasoning algorithm for the simulator prototype which will be discussed in detail in

Chapters 3, 4 and 5.

1.2.3 Organic Reaction and Organic Mechanism

Reactions: An organic reaction is a chemical reaction involving organic compounds,

usually between an electrophilic centre and a nucleophilic centre. In any chemical

reaction, some bonds are broken and new bonds are made. A bond is what links two

atoms together within a structure. It is formed by the sharing of a pair of electrons

between two atoms. Atoms can form bonds by sharing unpaired electrons or by donating a

non-bonding pair (“lone pair electrons”). Often, these changes are too complicated to happen in one

simple stage. Thus, a reaction usually involves a series of small changes one after

the other. A reaction mechanism describes this series of changes. Reactions can be

classified as acid/base reactions, functional group transformations (one functional group

can be converted into another) or as carbon-carbon bond formations. Reactions can also

be classified according to the process or mechanism taking place and these are specific

for particular functional groups (Groutas, 2000; Atkins and Carey, 1997).

Mechanisms: A detailed description of how a reaction can occur is called a “reaction

mechanism” (Fessenden and Fessenden, 1998). A reaction mechanism (also

called “organic mechanism” or simply “mechanism”) can be defined as “a description

of the sequence of steps that occur during the conversion of reactants to product”. The

mechanism tells us how bonds are formed and broken and in what order things happen.

In other words, it is a structural description of the individual reaction steps during

conversion. Reaction mechanisms show that chemical reactions occur by specific

routes. Hence, the route can be used to justify results and to explain why the reactions

proceed the way they do. This is because the “how” of a chemical reaction is the issue

to be explained when a mechanism is proposed for it. The above description is

symbolic and qualitative: it needs no quantitative data to predict the final products or

to explain the simulated results.

Organic synthesis, on the other hand, is the study of creating new compounds and the

planning for the “creation” task would require understanding of organic mechanisms.

Often, organic chemists will identify the electron-poor site and electron-rich group

when trying to work out a reaction mechanism. Most of the time, organic chemists

could work out the mechanisms by using commonsense developed from their chemical

intuition and knowledge. As one can see, the nature of the problem is “qualitative” in

that it is about electron movement, for example, from where should one start moving

the electrons around and to where the electrons should go and why so – a very suitable

field for applying the qualitative reasoning approach.

An organic chemist will usually look into the reaction mechanism to help explain the

outcome of a reaction. When chemists want to create a novel compound, they would

first draw the reactant structures and then draw the structure of the product(s). With

their chemistry knowledge and chemical insights, they then work out possible

mechanisms from reactant to product. In this scenario, the chemists will attempt to


carry out organic synthesis by following the mechanisms they proposed. Examples of

reaction mechanism are SN1 (unimolecular nucleophilic substitution), SN2 (bimolecular

nucleophilic substitution) and elimination.

Families of organic compounds are characterized by the presence of distinctive

functional groups. A vast majority of organic reactions take place at functional groups.

Functional groups are the structural units responsible for a given molecule’s chemical

reactivity. A functional group is a portion of an organic molecule that contains atoms other than carbon

and hydrogen (the normal hydrocarbon framework) or that contains bonds other than

C−C and C−H. In this approach, each organic reaction is described as changes made

on the chemical parameters (e.g. charge, covalent bond and lone pair electrons) of the

functional groups. These units will determine what type of chemical process can be

activated. In the scope of this work, two specific functional groups, namely “OH” and

halogen atoms were tested. The mechanisms used for reactions described in this thesis

are SN1 and SN2 involving the two functional groups given earlier.

Many chemistry students learn organic reactions by memorizing the steps and formulas

of each reaction which can easily be forgotten. They face difficulties in dealing with

the principles governing the processes and the cause-effect interaction (the causal

theories) among these processes. Traditional approaches to organic chemistry

modelling are based on formulas and quantitative data. These approaches do not make

good use of the qualitative nature of organic mechanism knowledge. The lack of tight

coupling between concepts and their embodiment makes most education software

unable to explain or justify its results. Meanwhile, in the chemistry lab, students only see

the results of a reaction which may take the form of either some gases being released or

colour changes occurring in the reaction. Without proper explanation, these


observations do not help much in nurturing their understanding of the subject. Learning

organic reaction mechanism needs some basic skills and these skills are related to the

nature of the problems. Students’ barriers to understanding the course are depicted in

Figure 1.6.

Figure 1.6 Students’ barriers to understanding the organic chemistry course. [Figure content: understanding organic reaction and organic mechanism; seen as a difficult subject; students learn the subject by memorizing the reaction steps; in lab, experiments cannot explain results; requires good chemical intuition; needs to know the principles that govern the processes; in classroom, pens & chalks are used to show the students how to use arrows to indicate movement of electrons.]

Many AI techniques have been used to develop software for organic synthesis and the

study of reaction mechanisms. These applications do not utilize qualitative reasoning

approach, and they are not QPT-based systems. Previously, simulation of chemical

reactions relied heavily on precoded facts and rules where knowledge is first sought

from chemists and then transferred into a computer representation. The process is time

consuming and always causes bottlenecks in information upgrading (refer to Chapter 2

for further details). On the other hand, qualitative reasoning provides an alternative

way for chemists to represent, develop, organize, and implement models. Advantages of

this approach include the possibility of deriving conclusions about the organic

chemistry phenomena without numeric data; a compositional approach that enables the


reusability of models representing partial behaviours (such as a small step in the entire

reaction route) and the capability to provide causal interpretation of system behaviour.

We will now give one example of an organic chemical reaction to show that qualitative

description is sufficient for understanding its underlying chemical principles. In the

example, quantitative data and precise measurements are not at all required. In

chemistry class, students are taught that the compound “(CH3)3C−OH2+” will undergo a

“break-bond” process. The cleavage of the carbon-oxygen bond in tert-butyloxonium

ion ((CH3)3C−OH2+) is due to the instability of the oxygen atom since it now has

three covalent bonds (the valency of oxygen is two). Once the carbon-oxygen bond is

broken, the oxygen will regain its stability. However, the charge on carbon in the main

chain of the organic compound will become positive since one of its valence electrons is

donated to the oxygen in order to neutralize it. The tertiary carbocation ((CH3)3C+) is

now unstable and it is reactive. Carbocations are transient, electron-poor and highly

reactive species. The changes that propagate from a chemical parameter to another can

be easily represented and modelled as a few functional dependency statements (or

“qualitative proportionalities” in QPT terms), as follows. Note that “Y $P^{+}_{-}$ X” means

“increasing X followed by decreasing Y”, and the words after the “//” sign are

remarks. The initial values that are assigned to the parameters can be taken from the

basic chemical facts knowledge base.

lone-pair-electron(O) $P^{-}_{+}$ no-of-bond(O)

// decreasing oxygen’s covalent bond will increase its lone-pair electron

charge(O) $P^{+}_{-}$ lone-pair-electron(O)

// increasing oxygen’s lone-pair will decrease the charge on it; oxygen is being neutralized

charge(C) $P^{-}_{+}$ no-of-bond(C)

// decreasing carbon’s covalent bond will increase its charge; carbon is now positive
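To make the propagation concrete, the three statements above can be chained in a few lines of Python. This is only a sketch under stated assumptions: the tuple encoding, the function names propagate and next_value, and the starting charges (oxygen positive, carbon neutral in the oxonium ion) are introduced for this illustration and are not taken from the simulator.

```python
# Each entry: (effect, cause, sign of change of cause, resulting sign of effect).
# Hypothetical encoding of the three proportionality statements.
PROPORTIONALITIES = [
    ("lone-pair-electron(O)", "no-of-bond(O)", -1, +1),
    ("charge(O)", "lone-pair-electron(O)", +1, -1),
    ("charge(C)", "no-of-bond(C)", -1, +1),
]

CHARGE_SPACE = ("negative", "neutral", "positive")


def propagate(direct_changes):
    """Chain the statements until no new sign of change can be derived."""
    signs = dict(direct_changes)
    changed = True
    while changed:
        changed = False
        for effect, cause, cause_sign, effect_sign in PROPORTIONALITIES:
            if signs.get(cause) == cause_sign and effect not in signs:
                signs[effect] = effect_sign
                changed = True
    return signs


def next_value(space, current, sign):
    """Move one step left (-1) or right (+1) in an ordered quantity space."""
    index = space.index(current) + sign
    if not 0 <= index < len(space):
        raise ValueError("no value beyond the end of the quantity space")
    return space[index]


# The break-bond process removes one covalent bond from both O and C.
signs = propagate({"no-of-bond(O)": -1, "no-of-bond(C)": -1})

# Assumed starting charges: oxygen positive, carbon neutral.
oxygen_charge = next_value(CHARGE_SPACE, "positive", signs["charge(O)"])
carbon_charge = next_value(CHARGE_SPACE, "neutral", signs["charge(C)"])
```

Under these assumptions oxygen_charge becomes "neutral" and carbon_charge becomes "positive", in agreement with the remarks attached to the statements.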


Overall, organic reactions are much easier to classify into generic structural types than

inorganic reactions (the specific reasons are discussed in Chapter 2). However, in order

to characterize and reason with instances of these structural types, appropriate

qualitative representations are needed. It is a challenge indeed to cast expert knowledge

into QPT models representing the behaviour of organic reactions and even more

challenging when the chemical processes are to be made reusable to fulfil one of the

research objectives as given in Section 1.4.

1.3 Problem Statement

When students were asked whether they find learning organic reaction difficult, most of

them claimed it is so. Students get confused mainly due to the abstract nature of the

problem. As such, most of the students learn organic reaction by memorizing the steps

involved in a reaction, and the formulas taught in classes. Consequently, some students,

particularly weak learners would require additional learning aids such as a software tool

to assist them with their learning. If students learn the subject by memorizing the steps

and format/pattern of each reaction, then they may not be able to answer simple

questions such as: Why would this reaction go this way? What is favourable about this

particular step? What is causing the reaction to begin? Why was the process stopped?

What happened to the nucleophile and electrophile? This is the educational problem

that is being solved, as memorizing formulas is not a good method in any type of

learning. In science education, it is believed that students should understand the

qualitative principles that govern the subject including the cause-effect relationships in

processes before they are immersed in complex problem solving. When these

fundamental skills are acquired, the entire learning activity can be made more effective.

The nature of the chemistry domain described in this work is very qualitative (Tang and


Syed Mustapha, 2006) and understanding the subject would require application of

chemical insight and good use of chemical commonsense. The models people use in

reasoning about the physical world are called mental models (Gentner and Stevens, 1983).

It is useful to study the connection between the mental models chemists use when solving

reaction problems and the qualitative reasoning approach. The result of the study can help

represent domain knowledge in the modelling constructs of the QPT ontology.

Since no work has been done on the application of qualitative reasoning for modelling

organic chemical reactions, along with an explanation facility for describing those

reactions, this work aims to determine to what extent qualitative reasoning could be useful

for predicting the final products of an organic reaction and explaining its simulation

results.

Automating the construction of QPT process models at runtime also remains to be

done, since so-called “model automation” is still an open issue in the development of

QR-related tools. In particular, we seek an answer to the question “can the

qualitative models be automated in this domain?” Furthermore, the extent to which

chemical process generalization can be accomplished to support different types of

reaction mechanism is also of interest to us.

Computers have always been regarded as powerful tools for educational purposes.

Traditional ways of using computers in educational settings are often limited to

supporting the creation, sharing and presentation of information. The traditional

approach for developing science educational software has shortcomings, in that the

conceptual understanding can only be found in the program’s documentation and not in

the software itself. The lack of tight coupling between concepts and their embodiment


makes it difficult to explain the results (Forbus, 2001). These programs cannot

“explain” because the results are obtained through chaining the rules during runtime or

by searching the reaction routes that have been precoded. As such, traditional

chemistry educational software is inadequate to promote understanding in chemistry

subjects, as programs built with the traditional approach only return the result. We see

there is a need to find new approaches to develop software that can help explain

chemistry phenomena to chemistry students. The qualitative reasoning approach

addresses this issue to a promising extent (Bredeweg and Struss, 2003).

QALSIC (Pang et al., 2001; Syed Mustapha et al., 2002; Syed Mustapha et al., 2005), a

previous work that uses QPT for modelling inorganic chemistry reactions, faced several

problems. One of the major problems is the incorrect behaviour prediction of the

chemical system. The QALSIC program has not accomplished three major tasks: (1)

The modelling phase is not automated, (2) reusable processes are not clearly

demonstrated, and (3) evaluation of the software and mental change survey were not

conducted. Moreover, most of the explanations are precoded. A complete review of

QALSIC’s limitations is given in Chapter 2.

1.4 Objectives

There are three main objectives of this work. The primary objective of this study is to

design a qualitative reasoning framework that can be used to qualitatively model,

simulate and explain organic reaction mechanism for learning purposes. Due to the

wide scope of qualitative reasoning (from task-level reasoning, ontologies, techniques,

cognitive modelling, application, to creating new kinds of educational system called

articulate software), in the design of the framework, we focused on a few issues relating


to: (1) Improvement of the explanation generation approach since current chemistry

software cannot appropriately explain a chemical phenomenon, (2) Design of the

qualitative models from the perspective of promoting “model reuse” in order to support

the simulation of multiple organic chemistry reactions, and (3) Automation of the QPT

model construction for organic processes once the user has entered a pair of reactants.

The secondary objective is to develop a software tool (simulator prototype) to study the

extent to which the explanation generation approach can help improve a student’s

understanding of the subject. The tool should be able to accomplish the following

tasks: (1) To make correct predictions for a large set of reacting species, with no

specific answers in the knowledge base; purely through reasoning from the fundamental

principles of organic reactions, and (2) To improve students’ reasoning ability and their

understanding of the organic chemistry subject when they are exposed to the tool. We

believe that using a software tool installed on the student’s machine can provide

a valuable advantage in education. Using computers to complement human instructors

has also been a long-standing motivation for research on AI in education.

The third objective is to investigate the main problem faced by QALSIC. The

investigation serves to identify the main reason why the software sometimes returns

incorrect answers.

1.5 Research Questions

Reflecting on the current state of the art in chemistry educational software, what is still

missing is the application of qualitative reasoning to solve organic chemistry problems

and software that can explain its results. The overall research goal of this work is to

investigate how the qualitative reasoning approach can be utilized for this purpose and


to develop a framework for the simulation of organic chemistry reactions using the

approach. The framework will be implemented in a simulator prototype called QRiOM

(Qualitative Reasoning in Organic Mechanism) in order to test the simulation and

explanation capabilities. Briefly, the software is developed to play two roles: to

substantiate the achievement of the objectives and to support student learning in the

organic chemistry course. To accomplish the work, the research questions are set as

follows:

• How do chemistry experts construct their mental models when solving problems?

• How can QPT be used to represent the domain knowledge?

• How can qualitative reasoning be used to support organic reaction simulation?

• How can qualitative reasoning be used to support learning of organic processes?

• How can the modelling constructs of QPT be used to explain a chemical

phenomenon?

• How can qualitative model construction be automated?

• How can the qualitative model be made reusable?

• How can causal graph generation be automated at runtime?

• How can knowledge validation be carried out to ensure correct use of domain

knowledge in a simulation?

• How effective and useful is QRiOM as viewed by the users?

• How can a student’s mental change be measured?

• How can QALSIC be tested to reveal its prediction deficiency?

The above questions will be addressed in this work by a combination of theoretical

analysis, algorithm development, rapid prototyping, and user evaluation. The main

educational goals of the qualitative simulator are that


a) The students’ conceptual understanding of the subject is improved

b) After using the software, the students are able to explain a chemical phenomenon

in a more elaborate way as a result of acquiring skills in knowledge articulation

c) The students will undergo mental change such that they gain more confidence in

solving new problems

It is important to study the connection between mental models and the qualitative

reasoning approach. When the mental models of chemistry experts can be captured and

represented in QPT, weak students can be assisted in solving the same problem.

As a result, the students’ reasoning ability and their logical thinking will be improved

when learning from the software, especially from the causal explanation that explains a

situation using only ontological primitives of QPT. Moreover, conducting experiments

could be expensive and hazardous, but the software simulation approach allows users to

repeat an experiment any number of times at no extra cost.

In order to achieve all the objectives, the following tasks are needed:

• Capturing human expertise in the field of organic reaction mechanisms and

representing them as qualitative data in the form of qualitative models.

• Using a process-based ontology (the QPT) to represent chemical knowledge

qualitatively.

• Designing a qualitative reasoning algorithm for reaction mechanism simulation.

• Finding an easy way of generating explanation effectively in order to facilitate

mastering of organic reaction concepts via the QPT-based explanation.

• Developing an algorithm that enables model automation.


• Classifying chemical processes for a variety of organic substrates in order to

promote model reusability.

• Developing a framework for hierarchical structuring of processes to facilitate

effective use of knowledge. This is not found in QALSIC.

• Developing a small chemistry ontology called OntoRM (Ontology for

Reaction Mechanism) for use with reaction mechanisms. The ontology will be

used to perform validation during reasoning.

• Developing and implementing a simulator prototype QRiOM.

• Evaluating the effectiveness of QRiOM.

• Studying how learners gain conceptual understanding when interacting with

QRiOM.

• Investigating the QALSIC software by testing it with a mixture of inorganic

reaction experiments with the intention of finding the main cause of the system errors

(e.g. incorrect predictions).

Table 1.2 tabulates the relationship between the research problems and questions, the

objectives, and the thesis chapters that fulfilled them.


Table 1.2: Relationships between research problems and questions, objectives and the corresponding thesis chapters that answered them.

Research problem: Organic reaction mechanism is a difficult subject to learn, even at the conceptual level of understanding.

a. Most of the students learn organic chemical reactions by memorizing the steps involved in a reaction

b. No work has been done on solving organic reaction problems using the qualitative reasoning approach

Research questions:

• How do chemistry experts construct their mental model when solving the problem?

• How can qualitative reasoning be used to support the learning of organic processes?

• How can QPT be used to represent the domain knowledge?

• How can qualitative model construction be automated?

• How can the qualitative models be made reusable?

• How can qualitative reasoning be used to support organic reaction simulation?

Objectives:

• To capture human expertise in the field of organic reaction mechanisms and represent them as qualitative data in the form of qualitative models.

• To examine and use a process-based ontology (the QPT) to represent chemical knowledge qualitatively in order to model the behaviour of organic reaction mechanisms.

• To develop an algorithm that enables model automation.

• To classify chemical processes for a variety of organic substrates in order to promote model reusability.

• To design a qualitative reasoning algorithm for reaction mechanism simulation.

Related chapters: Chapter 3, Chapter 4

Research problem: Existing chemistry educational software cannot explain simulated results.

a. Traditional chemistry educational software is inadequate to promote understanding

b. There is a need to find new approaches to develop software that can help explain chemistry phenomena

Research questions:

• How can the modelling constructs of QPT be used to explain a chemical phenomenon?

• How to automate causal graph generation at runtime?

• How can the domain knowledge (represented in QPT), and OntoRM ontology be effectively used?

• How can knowledge validation be carried out?

• How effective is the simulator as viewed by the users?

• How can a student’s mental change be measured?

Objectives:

• To find an easy way of generating explanation effectively in order to facilitate mastering of organic reaction concepts via the QPT-based explanation.

• To automate causal graph (state graph) generation as a means to explain an organic process phenomenon.

• To develop a reasoning framework for organic reaction simulation and explanation.

• To develop a small chemistry ontology called OntoRM for use with reaction mechanisms for knowledge validation.

• To define the types and roles of chemical knowledge at different abstraction levels in order to facilitate effective use of the knowledge. This is not found (or rather unclear) in QALSIC.

• To develop and implement a simulator prototype QRiOM.

• To evaluate the effectiveness of QRiOM and its explanation facility.

• To measure a learner’s mental change when interacting with the software.

Related chapters: Chapter 4, Chapter 5, Chapter 6

Research problem: QALSIC, a qualitative simulator for modelling inorganic chemistry reactions, faced several problems.

Research questions:

• How can QALSIC be tested to reveal its prediction deficiency?

Objectives:

• To test QALSIC software with a mixture of inorganic reaction experiments with the intention of finding the main cause of the system errors (e.g. producing incorrect predictions).

Related chapter: Chapter 2


1.6 Scope of Research

In this work, qualitative reasoning based on the qualitative process theory ontology is used

to simulate nucleophilic substitution reactions, specifically the following two organic

mechanisms:

• Unimolecular nucleophilic substitution (SN1)

• Bimolecular nucleophilic substitution (SN2)

1.6.1 System Scope

A simulator prototype named QRiOM was developed. Features of QRiOM are

summarized below:

• The system can only accept and recognize organic compounds as substrates. The

substrates are limited to alkanes, alcohols and alkyl halides.

• The system can model the behaviour of organic reaction automatically based on a

<substrate, reagent> input pair.

• The system can predict and return the final products based on qualitative

reasoning approach.

• The system is able to recommend an organic mechanism for a given pair of

reactants.

• The system can generate various forms of explanation (texts and diagrams) based

on the modelling constructs of QPT.


1.6.2 Course Scope

The software is suitable for use in the following courses:

• “Introduction to organic chemistry” at undergraduate level.

• “Organic reaction mechanism” at undergraduate level.

1.7 Main Results

A qualitative reasoning framework that supports model construction, model reasoning,

results prediction and justification has been developed. The framework has also been

implemented, resulting in QRiOM, a simulator prototype that can simulate and explain a

number of organic reactions to the chemistry students. QRiOM is the first chemistry

learning software that uses qualitative reasoning to explain chemical phenomena of

organic reactions. In particular, this work describes the use of QPT ontology to model

the conceptual knowledge and chemical theories of organic reactions and reaction

mechanisms at the finest granularity of processes, such that explanation at deeper level

can be achieved. Qualitative reasoning based on QPT models is able to predict the final

products (outcomes) of a reaction and to explain the predicted outcomes. Besides, the

qualitative models in the reasoning framework can support many types of reaction

mechanisms (other than the nucleophilic substitution reaction defined in this work).

Upon completion of the QRiOM software development, a preliminary system

evaluation was carried out. The results of the initial evaluation of QRiOM showed that

it is effective in terms of its ability to promote understanding of organic reactions

through the inspection of the explanation generated by the software.

In summary, the main results in the design and development of the reasoning

framework and the evaluation of QRiOM are as follows:


• A framework for modelling and simulation of organic reactions has been developed.

• Model automation logic has been formulated.

o Automating the construction of QPT models is made possible by first

identifying the type of the reacting species, then the chemical process that can

occur.

• Model reuse is supported by the framework.

• A simulator prototype for explaining organic reaction to the chemistry students has

been developed and implemented.

• OntoRM has been designed for use in validating the knowledge used in reaction

mechanism simulation, as well as the simulated results.

• An analysis of application of QR approach in inorganic versus organic reaction

simulation was carried out.

o Organic chemistry reactions are relatively easier to model using QR

approaches as compared to inorganic chemistry reactions.

• A user evaluation of the tool was conducted, and the response from students was

positive.
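The model automation result listed above (first identify the type of the reacting species, then the chemical process that can occur) can be pictured with the following Python sketch. It is a simplified illustration, not the automation logic of QRiOM: the feature dictionary, the classify function and the lookup table are hypothetical, and only the two species from the tert-butyl example in Section 1.2.3 are covered.

```python
# Hypothetical lookup: type of reacting species -> generic process that can occur.
PROCESS_FOR_SPECIES = {
    "oxonium ion": "break-bond",  # e.g. (CH3)3C-OH2+ loses its C-O bond
    "carbocation": "make-bond",   # e.g. (CH3)3C+ can bond to a nucleophile
}


def classify(species):
    """Rough classifier over labelled features; it does not parse formulas."""
    if species.get("charge") != "positive":
        return "unknown"
    if species.get("charged_atom") == "O":
        return "oxonium ion"
    if species.get("charged_atom") == "C":
        return "carbocation"
    return "unknown"


def select_process(species):
    """Step 1: identify the species type. Step 2: look up the process."""
    return PROCESS_FOR_SPECIES.get(classify(species))


oxonium = {"formula": "(CH3)3C-OH2+", "charge": "positive", "charged_atom": "O"}
cation = {"formula": "(CH3)3C+", "charge": "positive", "charged_atom": "C"}

first_step = select_process(oxonium)
second_step = select_process(cation)
```

In this sketch first_step is "break-bond" and second_step is "make-bond"; a species that matches neither pattern yields None, so no process is selected for it.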

1.8 Thesis Structure

This thesis consists of seven chapters. Chapter 2 is a review of the relevant literature.

The review concentrates on early qualitative reasoning applications and systems that are

developed using the qualitative process theory technique. Next, the chapter presents the

chemists’ way of solving organic chemistry problems. Then a review of computer-

assisted applications in organic chemistry is given. This chapter also presents two

representation schemes for organic molecules (SMILES and InChI). A review of the

literature on two related works (LHASA and QALSIC) is then presented. Finally, a

brief description of the strengths and weaknesses of the reviewed approaches and

systems is presented.

Chapter 3 provides an overall description and justification of our model construction

logic. A review of existing literature focusing specifically on qualitative modelling is

first provided, followed by a study of chemical reactions involving alcohols

and alkyl halides. From the study, “make-bond” and “break-bond” were identified as

the generic processes in the simulation of organic reactions involving the two groups of

substrates. From the analysis of various chemical reactions occurring under SN1 and

SN2 mechanisms, a common set of chemical theories and behaviours has been

identified for the two processes, from which the model automation procedures are

formulated. Proofs are given to justify the model automation procedures. Next, the

mapping of chemical theories onto QPT constructs is discussed. This chapter ends with

several examples of learning activities that are derived from the inspection of qualitative

models.

Chapter 4 describes the qualitative reasoning scenario for numerous organic reaction

problems. A review of existing literature specifically for qualitative simulation and

explanation is first provided, followed by a detailed discussion on the simulation and

explanation generation techniques. This chapter shows how the QR approach can

be used to support a learning task, and that students actually manifest the kinds of

learning behaviour we anticipated. Reusability of processes is also described in this

chapter while the entire reasoning framework is left for the next chapter of this thesis.

Chapter 5 describes the QR framework developed in this research and discusses the

simulation results. This chapter discusses the roles of each functional component in the

framework. The chapter starts with a schematic view of the framework. Then it

describes the workflow of the framework. The design logic for each component is also

presented. This chapter also outlines all the algorithms used in each main component of

the framework. Reusability of the framework components is duly described in this

chapter. The OntoRM ontology for reaction mechanism is also presented. A few

validation examples to show how OntoRM helps validate (and constrain) the use of the

chemical knowledge are also included. The idea and motivation of knowledge

structuring are also presented. Finally, the simulated results are presented and

discussed.

Chapter 6 presents the evaluation results of the simulator prototype. The design of the

evaluation process places particular emphasis on how the explanation can help enhance

students’ understanding of organic reactions and the reaction mechanisms used in the

simulation.

In Chapter 7, the thesis is concluded by presenting the main results and achievements of

this work. Some of its limitations are described, and suggestions for future research are

also provided.

Figure 1.7 summarizes the thesis layout in graphical form.

Figure 1.7 Thesis layout. (Diagram listing the chapters: Chapter 1 Introduction; Chapter 2 Literature Review; Chapter 3 Qualitative Modelling of Organic Reactions; Chapter 4 Qualitative Simulation and Explanation Generation; Chapter 5 Qualitative Reasoning Framework for Organic Reaction Simulation; Chapter 6 Evaluation of QRiOM; Chapter 7 Conclusion.)

Chapter 2 Literature Review

2.1 Introduction

In this chapter, the state of the art of qualitative reasoning applications is reviewed, with

special attention paid to application of QPT to building educational software for

teaching and learning purposes. There is a wealth of literature on the topic of

qualitative reasoning systems in generating explanation. However, very little of this

literature is directly related to our study domain. Instead, as will be seen through this

review, the majority of the studies discuss the modelling activity as the way to acquire

knowledge, and their application domains are not organic reaction mechanisms. The

structure of this chapter is as follows. Section 2.2 reviews the relevant literature on

qualitative reasoning applications. Section 2.3 reviews the literature on previous work

using qualitative process theory. In Section 2.4, the suitability of the selected domain is

discussed. In Section 2.5, computer-assisted applications and use of AI in chemistry are

presented. Section 2.6 discusses the SMILES and InChI standards for encoding

organic molecules, popular input formats for complex organic compounds. A

comprehensive study of related works is given in Section 2.7. Section 2.8 concludes

this chapter.

2.2 Review of the Literature on Qualitative Reasoning Applications

As development in qualitative reasoning has direct influence on educational software

development, some representatives of QR related tools (software systems) are reviewed.

The tools are divided into two categories, namely tools for industry and tools for education.

2.2.1 In Industry

Earlier discussion of QR applications focused mainly on physics and engineering.

One strand applies qualitative reasoning to complex controllers

(Bratko and Šuc, 2002; Bratko and Šuc, 2003a). The work explored qualitative data

mining to find qualitative patterns from numerical data. Their work includes (1)

behavioural cloning, (2) reverse engineering of controllers, and (3) the QUINN induction

program for machine learning of qualitative trees (Bratko and Šuc, 2003b).

The application of QR and model-based technology in the automotive industry to support

on-board diagnosis, failure analysis, and automation of electrical design analysis tasks

is among the research conducted by the Advanced Reasoning Group at The University of

Wales, Aberystwyth (Advanced Reasoning Group Homepage,

http://www.aber.ac.uk/compsci/Research/mbsg/). Many prototypes and products have

been developed by the group. Some of these are SoftFMEA, Dougal and AutoSteve,

which provide the FMEA needed by some of the diagnoses carried out in the VMBD project,

which runs on a model-based on-board demonstrator vehicle. Some of the solutions (e.g.

the industry processes) have been deployed by car manufacturers (Struss and Price,

2004). Application of QR to spatial data mining in pandemic disease outbreaks and

spatial reasoning charts another milestone for the practical value of QR. For example,

Bailey et al. (2006) reported how analysis of spatial datasets can be done through

model-based reasoning for more effective use of data. Bailey and Zhao (2003) also

described approaches to data-poor and data-rich problems in Qualitative Spatial

Reasoning. More recent applications include the use of QR in learning turn-based strategy

games (Hinrichs et al., 2006), where qualitative models are used for providing and

acquiring strategies for an unsupervised player. Ricardo et al. (2006) applied QR to

manage Information Technology (IT) services such as capacity and incident

management when accessing the Web/Application/Database servers via Web pages.

The causal relations found in qualitative models were said to be important for

understanding IT systems when improving Web services.

2.2.2 In Education

Education is a popular application area in qualitative reasoning research. The potential

of this new methodology for building science educational software has been

demonstrated by several highly cited works such as CyclePad (Forbus et al., 1999),

VisiGarp (Bredeweg and Winkels, 1998; Bouwer and Bredeweg, 2001; Bouwer, 2005),

ALI (D’Souza et al., 2001), and Betty’s Brain (Biswas et al., 2001). These systems

possess a common feature, namely the ability to predict and explain the behaviour of

physical systems in qualitative terms in an educational and training setting. The success

of the software to promote and induce learning and the birth of articulate software

(Forbus, 1997; Forbus and Whalley, 1994) marked another milestone for further

investigation, application, and popularity of qualitative reasoning techniques.

Modelling provides a means for articulating knowledge (Bredeweg and Forbus, 2003).

Describing physical processes in qualitative (conceptual) terms, and building a model

using a suitable QR ontology will require the learner to acquire reasonable concepts

(e.g. be able to articulate various knowledge aspects) about the subject. There are

different ways in which learners can acquire knowledge. Inspecting ready-made

simulations is one; another approach is to engage learners in building models as a way

to acquire knowledge. Typical examples are VModel (Forbus et al., 2001), Betty's

Brain (Biswas et al., 2001) and VisiGarp, which became part of Garp3 (Bredeweg et

al., 2007), an environment that can be used to build models and also to learn

from running and inspecting ready-made models. Both the VModel and VisiGarp

environments use diagrammatic representations to facilitate knowledge articulation.

Over the past 20 years many prototypes and full systems using the QR approach have

been developed. Besides the five systems (CyclePad, VisiGarp, ALI, VModel, and

Betty’s Brain) mentioned above, other QR-related tools for educational purposes include,

but are not limited to, the following:

• High school level mathematics by Neuper and Wotawa (2002). It is a framework

for handling knowledge based on model-based reasoning. The work constructs

a mathematical model from a textual description to describe a “mathematical concept”.

Generation of explanation is only possible in the modelling phase. The techniques

used are scripts and formula rewriting by application of theorems.

• CPRODS (Sime, 2002). The work consists of six qualitative and quantitative

models. Instructional design is provided. An overview of hypothesis scratchpad is

included. However, there is no reflection on the learning process. As a result, it

cannot differentiate between an excellent teacher and a poor teacher.

• A cognitive tool called MMforTED (Toppano, 2002). It is implemented as hypermedia

and consists of a collection of cases of simple electrical and fluid mechanical

devices. The tool provides graphs of models that can be used for problem solving

or communication, and for reasoning about domain concepts and their

relations. The use of the Web as both content provider and delivery medium

of instruction was considered pioneering at that time.

• Learn C++ tutoring system (Kumar, 2002). The tutoring system supports program

animation.

• Intelligent Tutoring Systems for Training by Vadillo and Ilarraza (1995). The

simulation run by the system is based on components ontology and QPT.

Structured behavioural explanations can be generated based on a causal domain

representation.

• QALSIC is a system for inorganic chemistry analysis and simulation using a

process-based simulation approach. More discussion on this work is presented in Section

2.7.2.

• Salles and Bredeweg (2002) and Salles et al. (2003) explored qualitative models in

ecology and their use in intelligent tutoring system. The goal is to model the

effects of fire on vegetation dynamics for educational purposes. QR approaches

used are modelling based on SIMAO (Guerrin, 1991; Guerrin, 1992) and QPT-

based modelling.

• Authoring Graph of Microworld (Horiguchi and Hirashima, 2005; Horiguchi and

Hirashima, 2008; Horiguchi and Hirashima, 2009). It is a method that can assist an

author in indexing a set of microworlds based on the constructed qualitative

models. By using Graph of Microworld, it is possible to adaptively select the

microworld a student should learn next.

• Error-Based Simulation (EBS) applies QSIM as the qualitative reasoning technique

to predict qualitative behaviour in mechanics problems and to generate feedback

for learning from mistakes (Hirashima et al., 1998; Hirashima and Horiguchi, 2001;

Horiguchi and Hirashima, 2006). Hirashima’s group has developed a prototype

and conducted an evaluation to measure what conceptual changes are caused by

using the EBS approach. Results showed that EBS learning was useful (Horiguchi

et al., 2005; Horiguchi et al., 2007).

A summary of the reviewed literature on QR work (and systems) is provided in

Appendix A. Most of the QR work has side products which are Intelligent Tutoring

Systems (ITS). Our work is not to build an ITS, but to prove that our reasoning

framework is feasible and practical and that, when implemented, the simulation results and

explanation generated by the software can help improve chemistry students’

understanding of organic reaction processes. Traditionally, simulations are based on

mathematical models, which have several shortcomings when it comes to explaining

them to relative novices. Since our target users are students, we believe qualitative

explanation will be more appropriate and useful. Model construction activity is not

included in our implementation because it does not suit the background of the

learners targeted by this work. Instead, qualitative modelling is automated for their

inspection.

2.3 Review of the Literature on Work Using Qualitative Process Theory

Representatives of application of qualitative reasoning based on QPT are as follows:

• Ecology simulation (Salles et al., 1996; Bredeweg et al., 2006). The simulator is

able to predict and explain the behaviour of physical systems in qualitative terms.

• GARP (Generic Architecture for Reasoning about Physics) by Bredeweg (1992).

This qualitative reasoning engine is implemented in SWI-Prolog that allows users

to simulate qualitative models.

• HOMER by Machado and Bredeweg (2002). In HOMER, concepts and their

relationships are represented graphically. The system included the design of a

support module that can guide the users through the model building process. A

causal model viewer is also included.

• VisiGarp by Bouwer and Bredeweg (2001). The system implements a graphical

interface to GARP which allows users to inspect qualitative simulation models by

interacting with automatically generated visualizations.

• WiziGarp by Bouwer (2005). It is a prototype that extended the functionalities of

VisiGarp by utilizing aggregation techniques to simplify qualitative simulations

and by incorporating diagrams and textual means of communication.

• VModel (Forbus et al., 2001). The approach to modeling is to create a student-

friendly visual notation for qualitative process theory (Forbus 1984) and create a

software environment that helps students express their qualitative, conceptual

models as an aid to learning. The VModel qualitative modeling framework is

richer, incorporating physical processes and a student extendable ontology of types

of entities.

• QCM (Dehghani and Forbus, 2009) is a successor to VModel. QCM provides the

basic functionality needed for cognitive scientists to build, simulate and explore

qualitative mental models. QCM is the first modelling tool which has been

specifically designed for cognitive scientists. QCM provides a framework in which

the agent’s knowledge about the causal structure of the world can be captured using

the QPT formalism while the agent’s uncertain knowledge and expectations about

the outcomes of his/her actions can be captured by subjective probabilities and

represented by a Bayesian Network. Modellers can switch the mode of reasoning

from QPT to Bayesian and make probabilistic models. This feature allows

cognitive scientists to take advantage of different types of reasoning available in

both formalisms.

• CyclePad (Forbus et al., 1998) functions as a computer-aided design (CAD) system

for the conceptual design of thermodynamic cycles. Technologies used in CyclePad

are: Constraint propagation, logic-based truth maintenance, qualitative

representations, and compositional modelling (Forbus and Whalley, 1994; Forbus

et al., 1998; Forbus et al., 1999; Forbus et al., 2001). Explanations in CyclePad are

represented by structured explanations, an abstraction layer between the reasoning

system and the interface.

• QALSIC (Pang et al., 2001; Syed Mustapha et al., 2002; Syed Mustapha et al.,

2005). A qualitative simulator for learning inorganic chemistry. The limitations of

this software tool will be discussed in Section 2.7.2.

The last two applications of QPT motivated the research

described in this thesis, as did the need in university-level organic chemistry teaching

for a novel approach to the problems outlined in Chapter 1 (Section

1.3 – Problem Statements). The work described in this thesis combines strengths of

these various approaches including the abstraction of simulation results for generating

explanation simple enough for chemistry students to understand, the use of qualitative

models to represent a chemist’s mental model to explicate the system behaviour of

organic reaction mechanisms, the use of causal theories for learning a topic in organic

chemistry, and the use of qualitative models to derive system behaviour and generate

explanations.

2.4 Analyzing Domain Suitability

Work on a qualitative simulator in the domain of organic reactions has not been

recorded in the available literature. Hence, a preliminary study was carried out before

the full research was embarked upon. The suitability of applying qualitative reasoning in

the problem domain is accordingly the motivation for the development of this project.

Organic chemistry is the study of carbon compounds. The study normally includes

examining the molecular structure and the chemical bonding of the compounds.

Organic chemistry is a science that deals with the composition, structure and properties

of substances and of the transformations that they undergo. When organic chemists

want to create a novel compound, they first draw the reactant structure and then

draw the structure of the compound they want to create, i.e. the product. They then

work out possible mechanisms from reactant to product. If there are several possible

mechanisms, they test them by experiment. In this scenario, the chemists

are doing organic synthesis by following the mechanisms they proposed. Most of the

time, organic chemists can work out the mechanisms using only the common sense

developed from chemical intuition and knowledge.

Even though the number of known organic compounds is more than 10 million, they

belong to a relatively few structural types and there are even fewer reaction types than

structural types. This makes “process generalization” possible and easier. Functional

groups are the structural units responsible for a given molecule’s chemical reactivity

under a particular set of conditions. Examples of common functional groups in organic

chemistry are alkenes, alkynes, alcohols, amines, amides, ketones, phenols and thiols.

We believe that chemical principles can be modelled and explained qualitatively by the

modelling constructs of QPT to enable the lowest level of reasoning. Traditional

approaches have not successfully addressed the issue. It will be shown in this

thesis that the explanation follows almost isomorphically from the underlying QPT

reasoning. As a result, there is little need for complicated explanation generation

facilities.

2.4.1 Explaining Organic Chemical Reactions in the Classroom

When we asked the question: “How do chemistry lecturers explain organic reactions in

the classroom?” the answer was that most instructors use chalk and board and show

the students how to use arrows to indicate movement of electrons. They might show

some animation (which can be downloaded from the Internet) or they use PowerPoint

and painstakingly use the animation in the PowerPoint to show where electrons flow.

The lecturers at the University of Malaya and Universiti Tenaga Nasional do not use

any software that can suggest and explain a reaction from the “mechanism” point of

view. There is nothing wrong with organic chemistry instruction, but the explanation in the

classroom lacks “reasoning”. The work described in this thesis is the first use of QR to

predict organic mechanisms and to explain and justify each reaction step in the entire

reaction route leading from the substrate until the most stable product is formed. The

program LHASA (Logic and Heuristics applied to Synthetic Analysis) developed by the

group under the leadership of Professor Corey at Harvard University uses AI techniques

to discover sequences of reactions which may be used to synthesize a compound.

However, the program does not show or suggest the mechanism used in a synthesis.

LHASA will be discussed separately in Section 2.7.1.

2.5 Use of Artificial Intelligence in Organic Chemistry

The application of Artificial Intelligence (AI) to chemistry problems started sometime

in the 1970s. Computer applications mean that two components are needed in problem

solving, i.e. a computer and data. There are two key aspects of AI research in chemistry.

These are knowledge representation and knowledge manipulation. Among the

prominent knowledge representation schemes are production rules, semantic nets,

neural nets, and state-space representation. There are, on the other hand, a few

knowledge manipulation techniques such as state space search and backward chaining.

In our approach, the reasoning engine is based on a chosen ontology (to model chemical

principles) and a suite of QR algorithms (to use the knowledge as if it is done by a

human chemist). To date, a QR approach to reaction mechanism simulation and

explanation has not been investigated.
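
As an illustration of one knowledge manipulation technique named above, the following minimal sketch shows backward chaining over production rules. The rule base is a toy assumption made for this example: the goal and fact names are hypothetical and encode only textbook tendencies of SN1 and SN2 reactions. It is not taken from any of the systems reviewed here.

```python
# Toy rule base (hypothetical): each goal maps to alternative premise lists.
RULES = {
    "favours_SN1": [["tertiary_substrate", "weak_nucleophile"]],
    "favours_SN2": [["unhindered_substrate", "strong_nucleophile"]],
    "unhindered_substrate": [["methyl_substrate"], ["primary_substrate"]],
}


def prove(goal, facts, rules=RULES):
    """Backward chaining: a goal holds if it is a known fact, or if every
    premise of at least one rule concluding it can itself be proved."""
    if goal in facts:
        return True
    return any(
        all(prove(premise, facts, rules) for premise in premises)
        for premises in rules.get(goal, [])
    )


facts = {"primary_substrate", "strong_nucleophile"}
print(prove("favours_SN2", facts))  # True
print(prove("favours_SN1", facts))  # False
```

The search starts from the goal and works back to the stored facts, so the answer is only as good as the rules that were encoded in advance.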

2.5.1 The Traditional Knowledge-based Approach

Existing knowledge-based systems for organic chemistry do not use qualitative

reasoning as the problem-solving technique. Current Knowledge Based Systems (KBS)

or Expert Systems (ES) for solving organic chemistry problems still rely heavily

on precoded facts and rules, and on search techniques. These programs search

the entire knowledge base (KB) for possible “conclusions” for a given problem. As a

result, programs are not able to recognize and process inputs that are not stored in the

knowledge base. Explanation generation is also a great challenge to this type of

programming and development paradigm. There is almost no evaluation of the efficacy

of the traditional approach based on the generated explanation. Evaluation that can be

found is on the generated outputs (not on the system-based or domain-based

explanation).
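
The limitation described above can be made concrete with a small sketch of a precoded lookup. The dictionary PRECODED and the function lookup_products are hypothetical names introduced for this illustration and do not correspond to any reviewed system; the single stored entry is the textbook substitution of bromomethane by hydroxide.

```python
# Hypothetical precoded knowledge base: reactant pairs mapped to stored products.
PRECODED = {
    frozenset({"CH3Br", "OH-"}): ["CH3OH", "Br-"],
}


def lookup_products(reactant_a, reactant_b):
    """Return the stored products, or None when the pair was never encoded."""
    return PRECODED.get(frozenset({reactant_a, reactant_b}))


print(lookup_products("OH-", "CH3Br"))     # ['CH3OH', 'Br-']
print(lookup_products("CH3CH2Br", "OH-"))  # None: the pair is not stored
```

Bromoethane reacts with hydroxide in much the same way as bromomethane, yet the lookup returns nothing for it and carries no information from which an explanation could be produced.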

2.5.2 The Machine Learning Approach

Machine learning is a broad field which includes methodologies such as neural

networks, genetic algorithms, symbolic inductive learning, explanation-based learning

and conceptual clustering. Techniques described in (Dolata, 1998) are self-organizing

map, neural networks and genetic algorithms. These techniques are sub-symbolic and

they rely heavily on massive precoded facts and rules. The two general purposes for

which machine learning has been used in computational chemistry are the classification

and generalization of data (Rose, 1998; Zupan, 1998). The common theme shared by

the use of machine learning is that it is used to extract regularity from data. There are

many expert systems in chemistry, including toxicological systems, structure

elucidation, and reaction mechanism analysis. Nevertheless, these systems are not based

on qualitative representation and reasoning for describing and manipulating chemical

knowledge. None of the above approaches can generate explanations dynamically to

explain the why, why-not, and how of a particular question. This is mainly because

systems that use precoded facts and rules cannot provide explanation that is both natural

and causal in nature, whereas our approach makes this type of explanation possible.

Figure 2.1 summarizes the benefits of applying the QR approach in the organic chemistry

domain.

Figure 2.1 Some of the benefits of applying qualitative reasoning to a chemical system simulation. (Diagram titled “Why Qualitative Reasoning?” with five benefits: represents conceptual knowledge explicitly; supports causal explanation; supports behaviour prediction of chemical system; provides good means to represent mental attributes; allows partial knowledge reasoning.)

2.6 Molecular Representation Schemes

Two standards for representing chemical molecules as linear structures (in software) are

reviewed. The two standards are: (1) Simplified Molecular Input Line Entry System

(SMILES) and (2) International Chemical Identifier (InChI). The two schemes were

studied for the purpose of finding one standard for the internal representation of organic

substrates used in this work.

2.6.1 The Simplified Molecular Input Line Entry System (SMILES) Codes

SMILES (Daylight Chemical Information Systems Inc.,

http://www.daylight.com/smiles/) has been surveyed. The purpose of SMILES is to

provide a simplified way for entering complex organic molecules and at the same time

reducing the storage taken up by large volumes of organic compounds. The

SMILES standard is not needed since this work limits the types and families of

organic substrates, and the structures selected are not complex. Moreover, SMILES

code is better suited to exchanging data over the World Wide Web, but

the final system is not meant for storing organic compounds for Web retrieval.

Another reason for not using the syntax proposed in SMILES for the input strings is that

the conversion from SMILES to IUPAC1 names will incur extra processing time. Table

2.1 shows some examples of SMILES codes of molecules.

1 IUPAC stands for International Union of Pure and Applied Chemistry. It is recognized as the world authority on chemical terminology and nomenclature. This nomenclature provides a unique name to each chemical structure.

Table 2.1: Some examples of SMILES codes.

Atom/Molecule    SMILES                    Name
H3O+             [OH3+]                    Hydronium cation
CH3CH3           CC                        Ethane
CO2              O=C=O                     Carbon dioxide
CH3CH2OH         [CH3][CH2][OH], or CCO    Ethanol
CH4              C                         Methane
NH3              N                         Ammonia
HCl              Cl                        Hydrochloric acid
H2O              O                         Water
H2S              S                         Hydrogen sulfide
H+               [H+]                      Proton
OH-              [OH-]                     Hydroxide anion
Fe2+             [Fe++]                    Iron (II) cation
CH2=CH2          C=C                       Ethene
H2               [H][H]                    Molecular hydrogen
CH3CH2CH3        CCC                       Propane
(CH3)3COH        CC(C)(C)O                 tert-butyl alcohol
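
To make the notation concrete, the sketch below splits a simple SMILES string into its atom tokens. It is a deliberately small illustration, assuming only bracket atoms and the upper-case organic-subset symbols (B, C, N, O, P, S, F, Cl, Br, I). Aromatic lower-case atoms, ring closures and branches are not interpreted, and the function name atoms is introduced here for illustration only.

```python
import re

# Bracket atoms such as [OH3+] first, then two-letter symbols, then one-letter ones.
ATOM = re.compile(r"\[[^\]]+\]|Cl|Br|[BCNOPSFI]")


def atoms(smiles):
    """Return the atom tokens of a simple SMILES string, skipping bond symbols."""
    return ATOM.findall(smiles)


print(atoms("CCO"))      # ['C', 'C', 'O']
print(atoms("O=C=O"))    # ['O', 'C', 'O']
print(atoms("[OH3+]"))   # ['[OH3+]']
```

Hydrogens are implicit for atoms written outside brackets, which is why ethanol needs only the three tokens C, C and O.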

2.6.2 International Chemical Identifier (InChI)

InChI is a textual identifier for chemical substances, designed to provide a standard and

human-readable way to encode molecular information and to facilitate the search for

such information in databases and on the web

(http://en.wikipedia.org/wiki/International_Chemical_Identifier). InChI was developed

jointly by IUPAC and the National Institute of Standards and Technology (NIST).

It provides a way of describing chemical structures in text. The InChI string is

completely derived from the structure of a compound. One unique feature of the InChI

format is to assign the same InChI string to a compound regardless of the way it is

drawn. InChI can thus be seen as akin to a general and extremely formalized version of

IUPAC names. InChI notation can express more information than the simpler SMILES

notation and differs in that every structure has a unique InChI string, which is important

in database applications. For the same reasons given in Section 2.6.1, InChI is also not

used in this work. Table 2.2 compares InChI to SMILES chemical format. SMILES

code is normally not unique but has the possibility of a canonical form that is unique for

each structure. On the other hand, InChI formats separate the information about atoms

and bonds, and thus reading them requires some knowledge of the format.

Table 2.2: Comparison of InChI to SMILES formats.

InChI SMILES

Linearized Yes Yes

Unique, canonical Yes Possibly

Human readable Hardly Easily

Includes atom coordinates No No

(Source: http://www.inchi.info/inchi_comparison_en.html)
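
The layered structure noted above can be illustrated by splitting an InChI string into its parts. The sketch below is a minimal illustration rather than a full parser: the function name inchi_layers is an assumption introduced here, the example string is the standard InChI of ethanol, and the sketch assumes a string that carries a formula layer followed by single-letter-prefixed layers.

```python
def inchi_layers(inchi):
    """Split an InChI string into its version, formula and prefixed layers."""
    prefix, _, body = inchi.partition("=")
    if prefix != "InChI":
        raise ValueError("not an InChI string")
    version, formula, *rest = body.split("/")
    layers = {"version": version, "formula": formula}
    for layer in rest:
        # The first character names the layer, e.g. 'c' (connectivity), 'h' (hydrogens).
        layers[layer[0]] = layer[1:]
    return layers


ethanol = inchi_layers("InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3")
print(ethanol["formula"])  # C2H6O
print(ethanol["c"])        # 1-2-3
print(ethanol["h"])        # 3H,2H2,1H3
```

The connectivity layer and the hydrogen layer are stored separately, which is why the string is harder to read at a glance than the SMILES code CCO for the same molecule.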

2.7 Related Works

Two systems that share some similarities with our work (in terms of the application

domain and the technique used) are reviewed. The first system is LHASA

(http://derek.harvard.edu/). LHASA is an expert system using a database of

retro-reactions (called transforms). It has been under development at Harvard since the late

1960s. LHASA is one of the first computer programs for synthesis planning. The

project has produced over 20 PhD graduates. In the discussion the differences between

the system and ours will be highlighted, in terms of the techniques used in the

development. The second system is the QALSIC program. Even though QALSIC uses

the same technique (i.e. QPT reasoning) as we do, it was implemented differently in

software. The purpose of the program is to perform inorganic chemistry simulation (and

not organic reaction mechanism). This thesis will fill the gaps left by the QALSIC

program (in terms of model automation, knowledge structuring, and knowledge

validation).

2.7.1 LHASA

The LHASA program utilized heuristics provided by human experts, i.e. a number of

expert chemists. These heuristics gave a numerical estimate which may be used to

inform if a synthetic plan was progressing in the right direction. Originally, LHASA

only generated retrosynthetic routes. More recently it included a few software modules.

Examples of the modules are APSO for teaching of organic synthesis, PROTECT for

functional group protection, DEREK for toxicology prediction and LCOLI (LHASA for

Compound Libraries) for compound library generation. A proprietary language called

CHMTRN (CHeMistry TRaNslator) is used for the knowledge base development. The

knowledge base contains “rules” which dictate LHASA’s behaviour towards a target

molecule. The transform descriptions are an integral part of the knowledge base. When

LHASA reads a transform entry, it finds instructions (e.g. to build a precursor from the

target structure) and acts accordingly. The work described in this thesis, however, is to

predict and explain the target molecule (forward planning).

LHASA relied heavily on experienced chemists to find and select the best retrosynthetic

routes in an interactive and time-consuming manner. LHASA incorporated a more

complex strategy system which included multi-step plans based on useful reactions,

such as the Diels-Alder reaction and the Robinson annulation. This allows the program

to rapidly find synthetic sequences which make useful changes. However, there are

some associated problems with this approach. According to the reference site, the long-

range transforms, which were created based on the expectations of a small set of expert

chemists, took as much as six months to prepare, and the program could easily give

cumbersome plans for molecules that contained unusual or unforeseen combinations of

functional groups. In addition, the modules were not dynamically updated when new

reactions were added so the modules slowly slipped out of date as new reactions were

discovered. As such, for sustainability purposes, as more syntheses and hence new

compounds are anticipated, information and search updating could be a burden (in

terms of time and effort incurred). Nevertheless, LHASA is still in use today, and new

strategic modules are still being added. The group is continuously seeking collaboration

for additions to the knowledge base and enhancement of the CHMTRN language. In

the course of development of the LHASA program, the knowledge base organization

has become very complex. This will not happen to our system since there is no

precoded solution or any reaction route kept in the knowledge base, rather only

chemical theories and basic facts required to perform the organic reactions defined in

the scope of this work. Consequently, less storage space is taken up. LHASA also relies

heavily on the expertise of chemists to select the order in which the generated precursor molecules

are processed further. It is believed that the QPT-based simulation avoids the

drawbacks outlined above. LHASA’s methods to overcome these problems include: (a)

Generating precursors for a synthetic target using all available tactics instead of a single

user-selected tactic, (b) Generating precursors automatically to find a solution sequence,

(c) Storing results in a relational database of essentially unlimited size, (d) Developing

algorithms and heuristics that emulate the decisions of an expert user in selecting which

precursors to process further.

2.7.2 QALSIC

QALSIC is among the earliest applications of QPT in chemistry for the qualitative

simulation of a small set of inorganic chemical reactions. QALSIC has managed to

answer the proof-of-principle question of how inorganic chemistry can be presented in

qualitative terms especially in reasoning on its dynamic processes (such as precipitation

and dissociation). Although the QALSIC related literature claimed that the system is

able to simulate unknown reactants (substances whose names are not found in

the knowledge bases), further examination reveals that the system can make a correct

prediction only if the chemical equation has the pattern “AB + CD → AD + CB”, i.e.

direct cross-linking of elements is obeyed. Furthermore, even with known reactants the

prediction can still be erroneous. Sample reactions with erroneous answers are given in

Table 2.3. In the software, the equation balancing task is well-handled by the encoded

chemical theories and facts in QPT. For example, the system uses the orbital

information for checking the valence electrons per atom and the oxidation number is

used for assigning charges to respective ions. Nevertheless, a number of limitations

remained open, as discussed in the following three subsections.

2.7.2.1 Limitations and Problems in the QALSIC Program

Briefly, QALSIC does not cater for knowledge validation. This is the main reason why

wrong results are returned. Although the QALSIC program checks the type of a

substance, not all substances in the periodic table are included in the knowledge base.

Furthermore, only two processes were fully implemented in software (precipitation and

dissociation). Processes in QALSIC are precoded (in terms of the five-slot template).

This is because QALSIC software does not have a model construction module like the

one embedded in the QRiOM software. In addition, most of the explanations are

handcrafted. In contrast, QRiOM is able to construct qualitative models at runtime and

to provide various forms of explanation on demand.
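To illustrate what a precoded process entry looks like, the sketch below fills in the five slots that QPT uses to describe a process (individuals, preconditions, quantity conditions, relations and influences) for a dissociation process. The class name, the field contents and the example strings are assumptions made for illustration; they are not taken from the QALSIC or QRiOM source.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class QPTProcess:
    """Hypothetical container for the five slots of a QPT process."""
    name: str
    individuals: List[str] = field(default_factory=list)
    preconditions: List[str] = field(default_factory=list)
    quantity_conditions: List[str] = field(default_factory=list)
    relations: List[str] = field(default_factory=list)
    influences: List[str] = field(default_factory=list)

    def slots(self) -> List[str]:
        """Return the names of the slots that have been filled in."""
        return [k for k, v in vars(self).items() if k != "name" and v]


# A precoded entry: every slot is written by hand before any simulation runs.
# The slot contents below are illustrative assumptions only.
dissociation = QPTProcess(
    name="dissociation",
    individuals=["compound: an ionic substance", "solvent: water"],
    preconditions=["compound is placed in solvent"],
    quantity_conditions=["amount-of(compound) > zero"],
    relations=["amount-of(ions) increases with the amount of dissolved compound"],
    influences=["I-(amount-of(compound))", "I+(amount-of(ions))"],
)

print(dissociation.name, dissociation.slots())
```

A system that only stores such entries can simulate exactly the processes that were written in advance, which is the restriction discussed in this subsection.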

All of the processes are precoded and that is why some experiments are restricted. The

software will not necessarily succeed in simulating a reaction. As a result, the system is

unable to predict correctly (and reasonably) for a large number of chemical equations.

Table 2.3 shows some of the tested chemical reactions. Our further investigation shows

that the wrong predictions are caused by the nature of the inorganic chemical reactions

(Section 2.7.2.2).

Table 2.3: Comparison of the actual and simulated results for a selected sample of inorganic chemistry reactions.

Chemical reaction | Actual result | Result given by QALSIC | Correct/Wrong
2Mg(s) + O2(g) | 2MgO(s) | MgMg, and the message: “no element to dissociate”. | Wrong
2CO(g) + O2(g) | 2CO2(g) | No answer returned, and a message that says “dissociation” begins. In fact, 2CO does not dissociate. | Wrong
Fe(s) + S(g) | FeS(s) | FeS(s) | Correct
CaO(s) + CO2(g) | CaCO3(aq) | 2CaO + C2O | Wrong
SO2(g) + H2O(l) | H2SO3(aq) | 2H2S + 4H2O | Wrong
Na2O(s) + H2O(l) | 2NaOH(aq) | Na2O + H2O | Wrong
SO3(g) + H2O(l) | H2SO4(aq) | H2SO3 + O (“O” never exists) | Wrong
Ne + F2 | No reaction | Nil, and the message “F2 cannot dissociate”. The message given is actually not the real reason. | Wrong
Ba(s) + ZnSO4(aq) | BaSO4(aq) + Zn(s) | Ba + Zn4SO | Wrong
2Al(s) + 6HCl(aq) | 2AlCl3(aq) + 3H2(g) | AlCl3 + 3H | Wrong
HCl(aq) + KOH(aq) | KCl(aq) + H2O(l) | KCl(aq) + H2O(l) | Correct
BaCl2(aq) + 2AgNO3(aq) | 2AgCl(s) + Ba(NO3)2(aq) | 2AgCl(s) + Ba(NO3)2(aq) | Correct
AgNO3(aq) + NaCl(aq) | AgCl(s) + NaNO3(aq) | AgCl(s) + NaNO3(aq) | Correct
CaO(s) + 2HCl(aq) | CaCl2(s) + H2O(l) | CaCl2(s) + H2O(l) | Correct
K2S + 2HCl | H2S(g) + 2KCl | H2S(g) + 2KCl | Correct
Ba(s) + 2H2O(l) | Ba(OH)2(aq) + H2(g) | BaO + 2H | Wrong
Zn + BaSO4 | no reaction (since Zn is a less active metal than Ba) | ZnSO4 + Ba | Wrong

In Table 2.3, many incorrect results were produced (by QALSIC). The program was

not able to predict correctly for a large number of chemical reactions due to the

difficulty in associating processes with inorganic elements (refer to Section 2.7.2.3 for

further explanation). We tried to devise a scheme that correlates reaction types

and processes. The intention was to classify experiments (hence reactions) so as to arrive

at a classification scheme like the one shown in Figure 2.2.

[Figure 2.2: tree diagram. The root “Experiment Types” branches into Type-I (double-displacement), Type-II (single-displacement), …, Type-N; each type is linked to candidate processes (Process-I, Process-II, Process-III, Process-IV, …).]

Figure 2.2 A proposed scheme to classify inorganic experiment types.

If such a classification scheme existed, it would provide some help in choosing

the correct chemical processes for the inorganic reactants. However, it is very difficult

to associate each inorganic reactant with the processes (e.g. Process-I, etc. in Figure 2.2).

The reason is that a chemical reaction between inorganic reactants is largely determined by the

substances involved in the reaction. In other words, every analysis must go down to the

basic elements (e.g. atoms or ions). We call this condition the “substance-dependent

syndrome”, in which, most of the time, the following statement does not hold: “if you

are in this group, you will exhibit this particular behaviour, and possess properties A

and B, then undergo processes C and D”. More results of analysis are presented in the

next two subsections in order to justify our claim that “qualitative reasoning is less

suitable for modelling inorganic chemical reactions”.

2.7.2.2 Organic Reactions Versus Inorganic Reactions

In an organic reaction, the simulation focus is on the entire organic compound (the main

chain and the structural units). The reasoning performed is therefore

molecule-centred. On the other hand, in inorganic reactions, reasoning is

done on the individual atoms or ions. Therefore, when it comes to modelling and

reasoning, the way of characterization is different. In the organic reaction, we observe

changes made to the whole compound (that serves as the initial substrate) with

particular emphasis on the nucleophile being substituted, whilst in inorganic reaction

the final products are a mixture (exchange) of several subunits of the reactants.

Qualitative reasoning is best suited to domains and subjects that meet two basic criteria.

First, the problem description is qualitative in nature; second, the degree of

generalization should be high, meaning that the model holds for a large

number of possible values. It seems that the latter requirement is not met in

qualitative simulations for inorganic chemical reactions, where classifying the

experiment types cannot help generalize their chemical processes (and chemical

principles). There are no pre-defined processes that can be associated with each

reaction type. Even within the same reaction type, the chemical equations can only be

constructed (hence simulated) after examining individual atoms and ions that formed

these substances. Different substances exhibit different chemical behaviour even

though they are in the same class (e.g. same column or row in the periodic table). In an

interview we conducted, when asked which processes occur for a given set of

reactants, the chemists answered that they have to look at the specific reactants to

determine the reaction type (e.g. acid + base or single-displacement) and then study the

properties of the substances. This answer suggests that predicting inorganic chemical

reactions requires analysis at the elementary level. On many occasions, general chemical

principles cannot be used in qualitative simulation of inorganic chemical reactions. As a

result, a system that is based on QR approach for predicting inorganic reaction cannot

generate reliable results. To resolve this problem, the system can be filled with as many

specific cases (chemical facts) as possible, without which the system cannot simulate all

reactions without errors. The system will eventually become a database, losing its

“intelligent” nature.

2.7.2.3 Inorganic Reactions in Qualitative Reasoning: The Problems

In this section, multiple cases of inorganic reaction with their associated problems (also

typical problems in QALSIC) are demonstrated. We then relate these

problems to the claim made in Section 2.7.2.2. Inorganic chemistry reactions can be

classified by the nature of the reaction. There exist five chemical reaction types (Murov

and Stedjee, 1997; Peller, 1997; Peller, 2003), namely the

Synthesis (or combination) reaction, Decomposition reaction, Combustion reaction,

Single-displacement reaction and the Double-displacement reaction.

A synthesis reaction takes two reactants to give one product. Typically the chemical

equation is written as “A + B → AB”. Synthesis reactions can further be divided into

types (i) to (v) below. Note that s=solid, l=liquid, aq=aqueous, and g=gas. Representative

synthesis reactions are given as follows:

(i) Metal + oxygen →∆ metal oxide
2Mg(s) + O2(g) →∆ 2MgO(s)
4Al(s) + 3O2(g) →∆ 2Al2O3(s)
4Li(s) + O2(g) →∆ 2Li2O(s)

(ii) Nonmetal + oxygen →∆ nonmetal oxide
N2(g) + O2(g) →∆ 2NO(g)
S(s) + O2(g) →∆ SO2(g)
2CO(g) + O2(g) →∆ 2CO2(g)

(iii) Metal + nonmetal → salt
2Na(s) + Cl2(g) → 2NaCl(s)
2Al(s) + 3Br2(l) → 2AlBr3(s)
Fe(s) + S(g) → FeS(s)
Mg(s) + Cl2(g) → MgCl2(s)

The reactions outlined above do not require qualitative reasoning since the reaction is

straightforward, i.e. when a metal reacts with oxygen, metal oxide is the product and

when a non-metal reacts with oxygen, one will get non-metal oxide as product. The

process needed is just one, namely the “oxidation”. In type (iii), the process is rather

straightforward, i.e. formation of precipitation. This type of reaction does not require

qualitative reasoning since it needs neither chemistry expertise nor commonsense to

solve it. All of these are dependent on the specific substance used. In experiment

subtypes (iv) and (v) below, “hydrolysis” is the process that each reaction will undergo.

(iv) Metal oxide + water → metal hydroxide
Na2O(s) + H2O(l) → 2NaOH(aq)
CaO(s) + H2O(l) → Ca(OH)2(aq)
MgO(s) + H2O(l) → Mg(OH)2(s)

(v) Nonmetal oxide + water → oxy-acid
SO3(g) + H2O(l) → H2SO4(aq)
N2O5(s) + H2O(l) → 2HNO3(aq)

However, when looking at the equations in (iv) and (v) above, there is no specific rule

to follow. What are the specific processes and in what sequence do these processes

occur? Can they be applied to all reactions that fall under the synthesis reaction? The

answer is “no”. When this cannot be done then the only way to model its behaviour is

by the traditional rule-based approach, i.e. store as many examples as we can in the

knowledge base. This is surely not practical because there are many metal (non-metal)

oxides. Qualitative reasoning is again not suitable to predict the outcomes. Each of the

above reactions employs different processes to yield the final products. That is, merely

knowing the experiment type cannot help to determine the processes to be applied

because the nature of inorganic substance reaction is substance dependent. As a result,

no generalization can be done. This is true for all of the main reaction types and not

only the chemical reactions classified as synthesis.

Next, some examples of the Single-displacement reaction type (or substitution) are

analyzed. This reaction involves an element (e.g. A) with a compound (e.g. BC) such

that the element (A) replaces one of the elements (B or C) in the compound. There exist

two formulas: “A + BC → B + AC” (if A is a metal and more reactive than B) and “A +

BC → C + BA” (if A is a halogen, it will replace C to form BA, provided A is a more

reactive halogen than C).

Single-displacement reaction:
Ba(s) + ZnSO4(aq) → BaSO4(aq) + Zn(s)
Zn(s) + 2HCl(aq) → H2(g) + ZnCl2(aq)
H2(g) + CuCl2(aq) → Cu(s) + 2HCl(aq)
2Al(s) + 6HCl(aq) → 2AlCl3(aq) + 3H2(g)
Cl2(g) + CuBr2(aq) → CuCl2(aq) + Br2(aq)

Modelling this class of reaction is somewhat easier because the rules can be applied to

most cases, but not all. There are still reaction cases that violate the rules. For

example, the following pairs of substances do not react.

No reaction cases:
Zn + BaSO4 → no reaction (since Zn is a less active metal than Ba)
Cu(s) + ZnCl2(aq) → no reaction (since Cu is a less active metal than Zn)
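The dependence of these outcomes on substance-specific facts can be made concrete with a small sketch. It assumes a partial activity series stored as a lookup list (a standard textbook ordering; the subset of elements is chosen for this example) and applies the metal pattern “A + BC → B + AC” only when A is more active than B. The function names are hypothetical, and coefficients and physical states are deliberately ignored.

```python
# Partial activity series, most active first (standard textbook ordering;
# the subset of elements is an assumption made for this example).
ACTIVITY_SERIES = ["K", "Ba", "Ca", "Na", "Mg", "Al", "Zn", "Fe", "Pb", "H", "Cu", "Ag", "Au"]


def displaces(element: str, displaced: str) -> bool:
    """Return True if `element` is more active than `displaced`."""
    return ACTIVITY_SERIES.index(element) < ACTIVITY_SERIES.index(displaced)


def single_displacement(a: str, b: str, c: str) -> str:
    """Apply A + BC -> B + AC for a metal A, or report that nothing happens.

    Coefficients and physical states are not handled in this sketch.
    """
    if displaces(a, b):
        return f"{b} + {a}{c}"
    return "no reaction"


print(single_displacement("Ba", "Zn", "SO4"))
print(single_displacement("Zn", "Ba", "SO4"))
print(single_displacement("Cu", "Zn", "Cl2"))
```

The decision is made entirely by the stored ordering of the elements, not by the class “metal”, which is the point argued in this subsection.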

We envisage that a qualitative reasoning engine will generate an incorrect result for the

above reactions (“Zn + BaSO4” and “Cu(s) + ZnCl2”), since neither of the two general

formulae (“A + BC → B + AC” and “A + BC → C + BA”) can be used. This is why

QALSIC produces a wrong result for the “Zn + BaSO4” reaction (Table 2.3); simply knowing

that Zn and Ba are metals is not enough to model their behaviour correctly. Let us show

another single-displacement reaction example. Note that the answer is not “B + AC”,

“C + BA”, or “no reaction”. The product of this reaction is “Ba(OH)2 + H2” instead.

Ba(s) + 2H2O(l) → Ba(OH)2(aq) + H2(g)   (the reactants have the form A + BC)

This is because very active metals (e.g. barium) can even displace hydrogen from

water, so the reaction does not obey either of the general formulae. For cases like this, it is

suggested to treat them as special cases in the software rather than by the qualitative reasoning

approach.

We now examine some examples in the Double-displacement reaction (or Metathesis).

Double-displacement reactions can further be divided into types (i) to (iv) below.

Reactions in this class involve two ionic compounds in an aqueous solution and

usually one of the products is a compound insoluble in water (precipitate), a gas or a

slightly ionized compound (Goldberg, 1998). Typically the chemical equation is

written as “AB + CD → AD + CB”, i.e. it involves an exchange of positive and

negative groups. Examples in (i) show acid-base neutralization processes.

Neutralization means that H+ from the acidic solution reacts with OH− from the

basic solution to form the primary product H2O and a salt as the side product.

(i) Acid-base Neutralization
HCl(aq) + KOH(aq) → KCl(aq) + H2O(l)
H2SO4(aq) + Ca(OH)2(aq) → CaSO4(aq) + 2H2O(l)
HCl(aq) + NaOH(aq) → NaCl(aq) + H2O(l)

However, some reactants are not commonly regarded as acids or bases, yet they still give

water and a salt. Examples are:

Na2O + 2HCl → 2NaCl + H2O(l)
2NaOH + CO2 → Na2CO3 + H2O(l)
[H+ is not seen in the equation, but when CO2 is dissolved in water, H+ will exist]

The same interpretation applies to the examples in subtypes (ii) and (iii), given below.

(ii) Metal oxide + Acid
CuO(s) + 2HNO3(aq) → Cu(NO3)2(aq) + H2O(l)
CaO(s) + 2HCl(aq) → CaCl2(s) + H2O(l)
[When oxides of metals are added in water, bases are formed]

(iii) Nonmetal oxide + Base
SO3 + 2KOH → K2SO4 + H2O
CO2 + Ca(OH)2 → CaCO3 + H2O
[When oxides of non-metals are added in water, acids are formed]

If the above reactions are modelled based on their general chemical principle, i.e. <acid

+ base> pair will give <water + salt> as the product pair, then the system is (again)

unable to return correct answers unless all reactants that become acidic or basic in

water are stored as chemical facts.

Modelling the double-displacement reaction examples shown in (iv) requires minimal

effort since the “AB + CD → AD + CB” formula can be applied almost throughout. In

most cases, the products can be determined from the knowledge of the ionic charges of

the compounds.

(iv) Precipitation Reactions

E.g. Formation of an insoluble precipitate
BaCl2(aq) + 2AgNO3(aq) → 2AgCl(s) + Ba(NO3)2(aq)
AgNO3(aq) + NaCl(aq) → AgCl(s) + NaNO3(aq)
BaCl2(aq) + Na2SO4(aq) → BaSO4(s) + 2NaCl(aq)
Ca(NO3)2(aq) + Na2C2O4(aq) → CaC2O4(s) + 2NaNO3(aq)

E.g. Formation of gas
H2SO4(l) + NaCl(s) → NaHSO4(s) + HCl(g)
K2S + 2HCl → H2S(g) + 2KCl
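The claim that the products of type (iv) reactions follow from the ionic charges can be sketched in a few lines. The code below is a minimal illustration, assuming each compound is supplied as a (cation, anion) pair with known charges; the type alias and function names are hypothetical. It swaps the anions according to “AB + CD → AD + CB” and builds charge-neutral formulas, but it does not balance coefficients or decide which product precipitates.

```python
from math import gcd
from typing import Tuple

Ion = Tuple[str, int]  # (symbol, charge), e.g. ("Ba", 2) or ("NO3", -1)


def neutral_formula(cation: Ion, anion: Ion) -> str:
    """Combine a cation and an anion into a charge-neutral formula string."""
    (c_sym, c_chg), (a_sym, a_chg) = cation, anion
    common = gcd(c_chg, -a_chg)
    n_cation = -a_chg // common   # number of cations needed
    n_anion = c_chg // common     # number of anions needed

    def part(sym: str, n: int) -> str:
        if n == 1:
            return sym
        # Ions with more than one capital letter are treated as polyatomic
        # and written in parentheses, e.g. (NO3)2.
        polyatomic = sum(ch.isupper() for ch in sym) > 1
        return f"({sym}){n}" if polyatomic else f"{sym}{n}"

    return part(c_sym, n_cation) + part(a_sym, n_anion)


def double_displacement(ab: Tuple[Ion, Ion], cd: Tuple[Ion, Ion]) -> Tuple[str, str]:
    """Apply AB + CD -> AD + CB by exchanging the anions of the two compounds."""
    (a, b), (c, d) = ab, cd
    return neutral_formula(a, d), neutral_formula(c, b)


barium_chloride = (("Ba", 2), ("Cl", -1))
silver_nitrate = (("Ag", 1), ("NO3", -1))
print(double_displacement(barium_chloride, silver_nitrate))
```

For the first precipitation example the sketch returns the formulas Ba(NO3)2 and AgCl; deciding that AgCl is the insoluble product would still require stored solubility facts.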

However, there are still cases that do not follow the general principles. This is

illustrated by the following double-displacement reaction examples. In principle, the

following reactions should give two and only two products but they give three products.

NH4Cl + NaOH → NaCl + NH3(g) + H2O (3 products)
H2SO4 + Na2CO3 → Na2SO4 + H2O + CO2(g) (3 products)
K2CO3(aq) + 2HNO3(aq) → 2KNO3(aq) + H2O(l) + CO2(g) (3 products)

Let us take another example: “CaO + CO2” is supposed to give “CaO2 + CO” as

products, since “CaO + CO2” has the format of “AB + CD”, but the reaction formula

“CaO + CO2 → CaO2 + CO” is invalid. A reasoning engine that relies solely on general

principles will not be able to predict the actual outcome. The correct output is CaCO3.

CaO + CO2 → CaCO3 (1 product)

This is because even though the reaction formula has the format of what is seen in the

left-hand side of a double-displacement reaction, it is actually a combination reaction

because both compounds contain oxygen. As one can see, the specific type of reaction

is determined by the substance in use. Furthermore, when two non-metallic elements

combine, the product formed often depends on the relative quantities of the reactants

(refer to the reactions shown below), i.e. quantitative data is needed.

C(s) + O2 (g, excess oxygen) →∆ CO2(g)
2C(s) + O2 (g, limited quantity of oxygen) →∆ 2CO(g)

In the “C(s) + O2” reaction, carbon is subjected to flowing oxygen. Here, the oxygen

is not measured in moles but, in practice, by the volume of O2 used. In the “2C(s) + O2”

reaction, the number of moles is usually computed and converted to a volume, and only

that much O2 is introduced into the reaction. The second equation reads “2 moles of

carbon react with 1 mole of oxygen to produce 2 moles of carbon monoxide”. So, if 2

moles of CO (molar mass 12.01 + 16.00 = 28.01 g/mol) are required, 1 mole of oxygen (32 g) is

needed and carbon is used in slight excess, i.e. a little more than 2 moles (e.g. 24.2 g rather than 24.02 g).

Prediction cannot be done by using commonsense knowledge about chemistry. In this

case, quantitative data is required to simulate and return the correct answer. There is

another case where a general principle cannot be applied. The reactants Ne and F2 in the

equation “Ne + F2(g) → nr (no reaction)” will not result in any reaction, simply because

the gas neon is too stable. In such a case, what is required is a simple look-up table and

no reasoning is needed.
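The quantitative data needed for the carbon monoxide case discussed above amounts to simple mole arithmetic, worked through in the sketch below. The molar masses are standard values rounded to two decimals; the variable names are chosen for this example only.

```python
# Molar masses in g/mol (standard values rounded to two decimals).
M_C = 12.01
M_O = 16.00

M_CO = M_C + M_O          # carbon monoxide
M_O2 = 2 * M_O            # molecular oxygen

# 2C(s) + O2(g) -> 2CO(g), with a target of 2 mol of CO.
mol_CO = 2.0
mol_O2 = mol_CO / 2       # 1 mol O2 is consumed per 2 mol CO
mol_C = mol_CO            # 1 mol C is consumed per mol CO

mass_CO = mol_CO * M_CO
mass_O2 = mol_O2 * M_O2
mass_C = mol_C * M_C      # minimum; in practice carbon is used in slight excess

print(f"CO: {mass_CO:.2f} g, O2: {mass_O2:.2f} g, C (minimum): {mass_C:.2f} g")
```

This gives 56.02 g of CO from 32.00 g of O2 and at least 24.02 g of carbon, none of which can be derived from qualitative knowledge alone.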

2.7.2.4 Discussion

There are many inorganic reactions that do not follow general chemical rules. For such

reactions, QPT is not the right reasoning technique to be used because the formation of

final products cannot be derived from chemical intuition alone. It is difficult to associate

each reactant with an experiment type; if it were possible, the following statement would hold: “if

object A belongs to the same class as another object, then object A will behave similarly

to the other object in that class”. But this is not true for inorganic reactions. For

instance, when an individual is identified as “ion”, one still needs to look at what is the

specific ion (e.g. Cl−, Mg2+). So, it can be concluded that the reason why QALSIC

made some wrong predictions is largely due to the nature of the problem domain, rather

than the limitation in the QPT representation. This is the main reason that QALSIC is

unable to predict correctly for substances unknown to the system. Simply said, there are

many exceptions to the general rules of inorganic chemistry. We will show in Chapter

3 and Chapter 4 that organic chemical reactions are much easier to classify into

generic structural types (groups) compared to inorganic reactions. As long as a

compound contains a particular functional group, it will possess the same chemical

property hence similar behaviour and as such the general principles can be applied

throughout all organic compounds that contain the functional group. Furthermore, two

reusable chemical processes have been identified for simulation use, namely the “make-

bond” and the “break-bond” processes, which will be discussed separately in

subsequent chapters. In summary, qualitative reasoning is not suitable for inorganic

reaction simulation because generalization is difficult to accomplish. If one proceeds,

there are two options, firstly to limit the number of experiments and secondly to adopt

the traditional rule-based approach.

2.8 Conclusion

This chapter reviewed relevant literature on qualitative reasoning applications in science

and engineering. The chapter also reviewed AI applications in chemistry. A

comprehensive study of a related work, QALSIC has also been presented. The

QALSIC program was tested with an intention to seek the main cause in its prediction

deficiency. Our investigation showed that qualitative reasoning is less suitable for

predicting the outcome of inorganic reactions because it is more difficult to seek

chemical property generalization in this problem domain. From our discussion, it can

be said that none of the educational software for organic chemistry addresses the

simulation and explanation issues from the standpoint of qualitative reasoning. Tight

coupling between concepts and their embodiment in software is crucial in building

smart educational software that can “reason” the processes intuitively. This is important

since conceptual understanding of a subject and the ability to provide explanation are

basic requirements for effective learning.

Chapter 3 Qualitative Modelling of Organic Reactions

3.1 Introduction

This chapter describes how the following objectives are achieved:

• To acquire human expertise in the field of organic reaction mechanisms.

• To examine QPT and use it to represent chemical theories qualitatively in order to

model the behaviour of organic reactions.

• To classify chemical processes for a variety of organic substrates in order to

promote model reusability.

• To develop an algorithm that automates model construction.

The organization of this chapter is as follows: A review of the state of the art in

qualitative modelling is presented in Section 3.2. Section 3.3 gives the procedures used

in domain knowledge acquisition. Section 3.4 – Section 3.6 present the domain theory,

the chosen reaction examples, and the underlying thought processes for organic

reactions. The approach for classifying reaction steps (organic processes) as “make-

bond” and “break-bond” is discussed with proof in Section 3.7. The identification of

individual views and their associated organic processes are also discussed since the

reproduction of the behaviour of reaction mechanisms relied on both the use of

individual views and the organic processes that occurred. Section 3.8 discusses the

design decision underlying the model automation task especially on the usage of the

modelling constructs of QPT to represent general organic reaction knowledge and

chemistry theories. Section 3.9 gives useful guidelines for modelling views for organic

reaction use. A handcrafted model is also presented in Section 3.10 to show how the

model is used to support learning. Section 3.11 concludes the chapter.

3.2 State of the Art in Qualitative Modelling

Much research in the field of qualitative reasoning has been committed to the questions

of representation of qualitative models. Typically, the qualitative models are used to

create the groundwork on which the quantitative models can be explained (Frederiksen

and White, 2002; Sime, 1996; Sime and Leitch, 1992; White and Frederiksen, 1990).

There are two important questions concerning qualitative modelling. First is how to

construct qualitative models, and second is how to automate model construction for the

application of qualitative modelling techniques especially for the QR technique to be of

widespread use (Bratko and Dorian, 2003a). Progress in qualitative reasoning about

physical systems has also led to new modelling languages that describe entities and

processes in conceptual terms and represent notions of causality explicitly

(Falkenhainer and Forbus, 1991; Forbus, 1996b; Weld and de Kleer, 1990).

Traditional mathematical and computer modelling languages do not attempt to

formalize such notions because they are designed for experts who already know such

things.

Bredeweg and Forbus (2003) expressed their hope that qualitative modelling

vocabularies will come into widespread use among educators as a new means to express

aspects of their expertise that are currently described as “intuition”. Their review also

emphasizes the importance of conceptual knowledge and causal theories in education,

particularly concerning reasoning about system behaviour. Mastering the causal

theories of physical phenomena can help students in answering fundamental questions

in science education (e.g. what happens? why does it happen? what does it affect?).

Our work shares this emphasis. In this work, qualitative model

development involves the mapping of human reasoning model to ontological primitives

of the QPT while the role of qualitative modelling in this work is to prepare organic

processes (in terms of the modelling constructs of QPT). When reasoning is performed

on these models, the behaviour of organic reactions can be reproduced. Reproducing

the chemical behaviour of organic reactions can help predict the outcomes of a chemical

reaction.

The formalization of automated modelling techniques has been one of the hallmarks of

QR where a model for a scenario is automatically constructed from a structural

description and task constraints (Falkenhainer and Forbus, 1991). The approach used in

assembling a model (given a scenario or initial situation) in CyclePad and Garp3

(Bredeweg et al., 2007) is compositional modelling (Falkenhainer and Forbus, 1991).

More recent work done by Horiguchi and Hirashima (2009) also uses the compositional

modelling technique to provide intelligent support for authoring graph of microworlds.

Most qualitative reasoning systems adopt a reductionist view of the world and are

aimed at building libraries of independent, elementary model fragments. In

compositional modelling, model fragments are chained. This idea provides the basis for

reusing models, a highly desirable feature for industrial applications (Bredeweg and

Struss, 2003). QRiOM also uses a kind of model composition technique to construct

QPT models at runtime. The logical steps for automating a QPT model based on a

simple substrate input will be presented in the following sections.
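The idea of assembling a model from a library of independent fragments can be sketched as follows. This is a minimal illustration only and not the QRiOM algorithm: the ModelFragment class, the compose function and the fact labels are hypothetical names, and simple forward chaining over condition sets stands in for a full compositional modelling procedure.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class ModelFragment:
    """Hypothetical model fragment that applies when its conditions hold."""
    name: str
    conditions: FrozenSet[str]
    consequences: FrozenSet[str]


def compose(scenario: Set[str], library: List[ModelFragment]) -> List[str]:
    """Chain fragments: keep adding any fragment whose conditions are satisfied."""
    facts = set(scenario)
    used: List[str] = []
    changed = True
    while changed:
        changed = False
        for frag in library:
            if frag.name not in used and frag.conditions <= facts:
                facts |= frag.consequences
                used.append(frag.name)
                changed = True
    return used


# Illustrative library; the order of entries does not determine the result.
library = [
    ModelFragment("capture", frozenset({"carbocation", "halide-anion"}),
                  frozenset({"alkyl-halide"})),
    ModelFragment("protonation", frozenset({"alcohol", "proton"}),
                  frozenset({"protonated-alcohol"})),
    ModelFragment("dissociation", frozenset({"protonated-alcohol"}),
                  frozenset({"carbocation", "water"})),
]

print(compose({"alcohol", "proton", "halide-anion"}, library))
```

Because each fragment states only its own conditions and consequences, the same library can be reused for different initial scenarios, which is the reusability benefit noted above.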

Salles and Bredeweg (1997) noted that the type of user and the role of the model are

important factors for understanding the purpose of a model in a qualitative simulation

context. Starting with the former, it makes a difference whether the constructed model

is to be used by experienced domain experts or by students in colleges/universities. The

latter, on the other hand, suggests that a model may be used as a tool for inspecting the

dynamics of a physical system. In this situation the emphasis will be on correct and

complete simulation. In other situations however, different aspects may become more

important. Particularly, in an educational setting, the understanding of model

articulation is more important to the learner. Since our work has the role of assisting

the students, the “type of user” is an important factor in our design consideration, in that

the constructed models only consist of sufficient knowledge in understanding a

chemistry phenomenon.

A more recent qualitative modelling tool is the QCM system (Dehghani and

Forbus, 2009). QCM is a tool (aimed at cognitive scientists) that allows users to create

situation-specific descriptions of physical processes rather than asking the user to first

create and then instantiate a first-principles domain theory. The use of experiential

knowledge is believed to have profound effects and consequences for human reasoning

(Forbus and Gentner, 2009). The term “dark knowledge” is used in their work to refer

to specific cases derived from personal experience or through culture. However, “dark

knowledge” is the type of knowledge that currently has no well-designed formalisms in

QR. Forbus and Gentner added that the episodic memories and experiences people

have are powerful mechanisms to construct generalizations and hence they play a

central role in human mental models.

A qualitative model can be used to generate predictions given an initial situation. This

process, as well as its result, is called qualitative simulation (discussed in Chapter 4).

The relationship between modelling and reasoning is depicted in Figure 3.1.

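[Figure 3.1: block diagram in which “Qualitative Simulation” comprises two blocks, “QPT Model Building” and “QPT Model Reasoning”.]

Figure 3.1 Simulation entails reasoning from model.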

3.3 Domain Knowledge Acquisition

Knowledge acquisition involves the acquisition of knowledge from human experts,

books, documents, or computer files (Turban, 1999). The term knowledge elicitation, on

the other hand, often implies that the transfer is accomplished by a series of interviews

between a domain expert and a knowledge engineer who then develops a computer

program representing the knowledge. As such, domain experts (chemists) and

chemistry students were interviewed to elicit the domain knowledge and chemical

intuition they apply when solving organic reaction problems. From the interviews, it is

particularly ascertained that understanding reaction mechanisms requires the application

of chemical insight and chemical commonsense at intuitive level – a suitable application

domain for QR technology.

According to Chi et al. (1981), experts and novices categorise problems differently,

where experts tend to categorise problems using the underlying principles rather than

entities contained in the problem statement. The difference lies in the identification of

important features of the domain and the interpretation of this information. This

categorization can be seen as a means of accessing the most appropriate model of the

domain. By only selecting relevant information during the modelling process, the

resultant model is simpler and more importantly, reasoning from that model is

simplified. The work found that expert schemas contained a great deal of procedural

knowledge with explicit conditions of use. Novice schemas contained declarative

knowledge but lacked the abstracted solution methods. Based on the findings of Chi et

al., the acquisition of expert knowledge was carried out as follows:

• Problem characteristics and the behaviour of organic reaction mechanisms were

sought and studied. This includes the conceptual understanding about organic

chemistry, organic reaction, and their mechanisms.

• The course outline of the chemistry subject was reviewed and the mental model of

a few domain experts (chemists at University of Malaya and Universiti Tenaga

Nasional) was documented.

• Dialogues with chemists were conducted to find the possibility of representing the

required knowledge in qualitative terms using QPT.

o Collecting the intuitive and causal aspects of chemists’ mental models helps

in the design of the cognitive steps used in the simulation algorithms.

3.4 Understanding Organic Chemistry Reactions

In this work, “A + B → C + D” is called a chemical equation; “A + B” is an organic

reaction (where “A” is an organic substrate). Before a simulation can begin, the

reaction steps of a chemical equation must first be identified. For example, the

simulation of equation 3.1 can be described as a series of processes that occur and these

processes will be used to explain how the product is formed (= “the mechanism used”),

where “X” is any halogen (group VII in the periodic table). Examples of halogen are I,

Br, Cl, and F. The symbol “R” can take the forms of anyone of the following strings:

CH3, CH3CH2, CH3CH3CH, or (CH3)3C.

R−OH + HX � R−X + H2O (3.1) Alcohol Hydrogen halide Alkyl halide Water

The organic processes that occurred are identified as follows:

Process I: Protonation (=“make-bond”). There is a proton (H+, in our case) which is an

electrophile (electron poor species) and there is a lone-pair electron on the “O”. From

the chemical knowledge base, “OH” is a poor leaving group. The process that occurs is

“H+” sticks to “O:” (oxygen with a lone-pair) and this gives rise to “−OH2+” which is

unstable, so that the next process can begin (i.e. the “dissociation” process will occur, as

below).

Process II: Dissociation (=“break-bond”). This process describes the deletion of the R−O bond in the “R−OH2+” compound. It happens because the “O” in the compound is unstable: it carries three covalent bonds (the normal maximum for oxygen is two) and a positive charge that it should not have. This step produces H2O and R+ (a carbocation, which is stable when R is tertiary).

Process III: Capturing of carbocation by halide anion (=“make-bond”). The process is

called upon since the reacting species are still in their unstable states with charges

around (very reactive). This step describes the formation of a covalent bond between

the two ions X− (anion) and R+ (cation). This process returns R−X (alkyl halide) as the

product.

By applying chemical knowledge, we know that the reaction stops after the third process, since the final products (a water molecule and an alkyl halide) are very stable and have lower energy than the initial substrate. The sequence in which the organic processes are used (e.g. “make-bond”, “break-bond” and another “make-bond”) is stored and then checked against the chemical KB in order to determine the mechanism used in the prediction of the outcomes.
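The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration: the names `MECHANISM_KB`, `record_steps` and `identify_mechanism` are hypothetical, and the single knowledge-base entry assumes that the make/break/make sequence of equation 3.1 (with a tertiary R) corresponds to SN1. It is not the QRiOM implementation.

```python
# Hypothetical sketch: a tiny stand-in for the chemical KB (assumed layout).
MECHANISM_KB = {
    ("make-bond", "break-bond", "make-bond"): "SN1",
}

def record_steps(steps):
    """Return the sequence of generic processes used by (name, kind) steps."""
    return tuple(kind for _name, kind in steps)

def identify_mechanism(sequence):
    """Check a stored sequence against the (assumed) knowledge base."""
    return MECHANISM_KB.get(tuple(sequence), "unknown")

# The three processes identified for equation 3.1.
equation_3_1 = [
    ("protonation", "make-bond"),
    ("dissociation", "break-bond"),
    ("capturing of carbocation by halide anion", "make-bond"),
]

sequence = record_steps(equation_3_1)
mechanism = identify_mechanism(sequence)
print(sequence, "->", mechanism)
```

Keeping the sequence as an ordered tuple matters here, because the same two generic processes in a different order may point to a different mechanism.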

3.5 Organic Reaction as Modelling Task

This section explains the fundamentals of the domain theory, namely the organic

reactions and mechanisms and how to represent them in QPT notations. QPT is

adequate for representing qualitative knowledge (Salles et al., 1996). This is because

the modelling constructs of QPT provide good means for describing processes in

conceptual terms, and embodies notions of causality which are important to explain

behaviour of chemical systems. Thus, it is useful as a language to write dynamical

theories in expressing the intuitive ideas of organic reactions. The QPT’s qualitative

proportionalities and influences are powerful primitives to be used in organic reaction

modelling in building chains of causality to describe and explain a mechanism. The

causal relationship provides good means to explain the overall change in a reaction

simulation, called “mechanism” in organic chemistry.

Three specific chemical equations (equation 3.2 – equation 3.4) will be used to show

how the general behaviour of organic reactions can be modelled. An organic reaction

usually takes place between a nucleophile and an electrophile. The three equations will

share some organic processes (hence reusability is achieved), and may use different

reaction mechanisms to yield the respective products. With a chemical reaction, we know the starting materials, and after qualitative simulation we know the final product; to understand the reaction, however, we also need the story in between, which is called the “mechanism”. An organic mechanism is normally used to explain how a product is formed. Two substrates (both organic compounds) are tested in this work: (1) ROH (alcohol or alkanol), whose functional group contains an oxygen atom joined by a single bond; the hydroxide ion “OH−” is a strong base and hence a poor leaving group; and (2) RX (where X = F, Cl, Br or I), an alkyl halide, whose functional group contains a halogen atom. The selection of the chemical

equations is also based on the two different organic mechanisms defined in the scope of

this research. As shall be explained, process reusability can also be demonstrated by

choosing these three chemical equations. The products of equation 3.2 and equation 3.3

necessitate the SN1 mechanism while SN2 mechanism is used to explain equation 3.4.

SN stands for nucleophilic substitution. The “1” indicates that the reaction is first order, or unimolecular: only one of the reactants (the substrate) affects the reaction rate. The “2” indicates that the reaction is second order, or bimolecular: the rate depends on both the concentration of the alkyl halide and the concentration of the incoming nucleophile. The specific behaviour of SN1 and SN2 is discussed in Chapter 4, which presents simulation scenarios that reproduce the behaviour of both mechanisms using the qualitative reasoning algorithm.
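The difference in reaction order can be made concrete with the standard textbook rate laws, rate = k[RX] for SN1 and rate = k[RX][Nu] for SN2. The sketch below is a numerical aside, not part of the qualitative framework; the function names and the values of `k`, `rx` and `nu` are arbitrary illustrations.

```python
def sn1_rate(k, substrate):
    """First-order rate law: rate = k[RX]."""
    return k * substrate

def sn2_rate(k, substrate, nucleophile):
    """Second-order rate law: rate = k[RX][Nu]."""
    return k * substrate * nucleophile

k = 2.0   # illustrative rate constant (arbitrary units)
rx = 0.5  # illustrative substrate concentration

# Doubling the nucleophile concentration leaves the SN1 rate unchanged
# but doubles the SN2 rate.
for nu in (1.0, 2.0):
    print(nu, sn1_rate(k, rx), sn2_rate(k, rx, nu))
```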

Most chemists will construct thought processes (a series of small reaction steps) in their

minds when solving a chemical equation and the organic reaction problem is also

solved along this line. The following chemical equations will be described as series of

small reaction steps. The thought processes for equation 3.2 – equation 3.4 are

presented in Section 3.6 (Figure 3.2 – Figure 3.4).

(CH3)3COH + HCl → (CH3)3CCl + H2O (3.2)

(CH3)3CBr + H2O → (CH3)3COH + HBr (3.3)

HO− + CH3CH2Br → CH3CH2OH + Br− (3.4)

3.5.1 Chemical Equation as a Reasoning Task

One of the objectives of this work is to use a QR technique to design a learning tool that

can improve students’ conceptual understanding of the subject. If a student learns what is behind the above three equations as “it is only an exchange of nucleophile, hence it is just a futile problem”, or simply memorizes that “A + B” gives rise to “C + D”, then the learner will not be able to answer basic questions such as:

1. What will be the first reaction step?

2. Why did the process occur?

3. What is favourable in the step?

4. Why is a bond made at a particular atom and not at another atom?

5. What is the main cause for the reduction of lone pair electrons on a particular atom?

6. What breaks the bond between atom1–atom2?

The QR approach provides a systematic way for converting and solving the chemical

equations so that students would know why a particular process is taking place and how

a particular outcome is produced. Deep knowledge is perceived as essential in

qualitative simulation of organic reactions, and qualitative reasoning is one form of reasoning with deep knowledge. The qualitative reasoning approach can generate explanations in response to why, what and how queries, which this thesis discusses in detail. We believe this approach of modelling organic

reactions will help improve a learner’s reasoning ability. An increasing level of

conceptual understanding about the subject can also be expected.

3.6 The Underlying Thought Processes for Organic Reactions

It is important that chemists can predict whether a reaction will occur and also where it will occur. Most reactions involve electron-rich molecules forming bonds to electron-deficient molecules (i.e. nucleophiles bond to electrophiles). The bond is formed specifically between the nucleophilic centre of the nucleophile and the electrophilic centre of the electrophile. For completeness, brief definitions are given. Nucleophiles are electron-rich molecules that react with electrophiles. Electrophiles are electron-deficient molecules that react with nucleophiles. The nucleophilic centre of a nucleophile is the specific atom or region of the molecule that is electron-rich. The electrophilic centre of an electrophile is the specific atom or region of the molecule that is electron-deficient. An electrophile accepts electrons in order to fill its valence shell. If a molecule has a negative charge, it is electron-rich and is therefore a nucleophile; the nucleophilic centre is the atom that bears the negative charge. Likewise, if the charge on a molecule is positive, it is electron-deficient; the electrophilic centre is the atom bearing the positive charge.

The symbol “δ+” (delta-plus) denotes a species with a partial positive charge (a neutral electrophile), while “δ−” (delta-minus) denotes a species with a partial negative charge (a neutral nucleophile) that tends to pull electrons towards itself. For example, O−H bonds are polar covalent because the oxygen atom is significantly more electronegative than the hydrogen atom. As a result, the oxygen atom has a greater share of the electrons in the O−H bond and is slightly negative (i.e. δ−). As an example of δ+, consider C−X (where X is a halogen): the carbon (C) is δ+ because it is less electronegative than X and so has the lesser share of the electrons in the C−X bond.
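The δ+/δ− assignment follows directly from comparing electronegativities. The sketch below shows one way to do this, assuming Pauling electronegativity values; the table `PAULING` and the function `partial_charges` are illustrative names and are not taken from the thesis software.

```python
# Pauling electronegativities for the atoms discussed in this chapter.
PAULING = {"H": 2.20, "C": 2.55, "O": 3.44,
           "F": 3.98, "Cl": 3.16, "Br": 2.96, "I": 2.66}

def partial_charges(atom_a, atom_b):
    """Return the partial-charge labels (label_a, label_b) for the bond a-b."""
    ea, eb = PAULING[atom_a], PAULING[atom_b]
    if ea > eb:
        return ("delta-minus", "delta-plus")
    if ea < eb:
        return ("delta-plus", "delta-minus")
    return ("neutral", "neutral")

print(partial_charges("O", "H"))   # O-H bond: oxygen is delta-minus
print(partial_charges("C", "Br"))  # C-X bond: carbon is delta-plus
```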

Thought processes for equation 3.2: The chemical equation “(CH3)3COH + HCl →

(CH3)3CCl + H2O” describes a functional group transformation reaction, where

nucleophilic substitution (halogen substitution) is the mechanism for obtaining the final

product. Halogens are atoms in the 7th column of the periodic table. The series of the

small reaction steps involved in converting the starting material ((CH3)3COH, a tertiary

alcohol) to the final product ((CH3)3CCl, alkyl halide) is depicted in Figure 3.2. Note

that double dots represent the electrons associated with the particular atom in the

molecule.

(a) Reaction step 1 (O = nucleophilic centre, H+ = electrophile):

(CH3)3C–OH + H–Cl ↔ (CH3)3C–OH2+ + Cl−

tert-butyl alcohol + hydrogen chloride ↔ tert-butyloxonium ion + chloride ion

(b) Reaction step 2 (C = δ+, O = δ−):

(CH3)3C–OH2+ ↔ (CH3)3C+ + H2O

tert-butyloxonium ion ↔ tert-butyl cation + water

(c) Reaction step 3 (C+ = electrophilic centre, Cl− = nucleophile):

(CH3)3C+ + Cl− → (CH3)3C–Cl

tert-butyl cation + chloride ion → tert-butyl chloride

(d) Reactants and their associated chemical processes:

| Name of the chemical process | Reactant 1 | Reactant 2 |
|---|---|---|
| Protonation | (CH3)3COH (nucleophile) | H+ (electrophile) |
| Dissociation | (CH3)3C–OH2+ | |
| Capturing of anion by carbocation | (CH3)3C+ (electrophile) | Cl− (nucleophile) |

Figure 3.2 The conversion of a tertiary alcohol to yield alkyl chloride can be described as a series of three small steps.

The reaction steps are explained below.

• Step 1: Protonation of tert-butyl alcohol to produce an oxonium ion. A curved arrow that starts from a double dot (“..”) and points to an atom means that electrons are donated to form a covalent bond, while a curved arrow that starts from a link (“−”) and points to an atom means that the link is broken and its electrons are donated to that atom.

• Step 2: Dissociation of tert-butyloxonium ion to produce a carbocation. Note that

in “(CH3)3C–OH2+” the “O” pulls electrons to it because by doing so it will produce

a stable and neutral product (water molecule).

• Step 3: Capture of tert-butyl cation by chloride ion.

Thought processes for equation 3.3: The chemical equation “(CH3)3CBr + H2O →

(CH3)3COH + HBr” describes the production of alcohol from an alkyl halide. The final

product is obtained through a series of reaction steps as depicted in Figure 3.3. The

reaction steps are explained below.

• Step 1: Dissociation. “Br−” is a weak base thus a good leaving group.

• Step 2: Reaction with water.

• Step 3: Fast acid-base reaction.

(a) Reaction step 1 (C = δ+, Br = δ−):

(CH3)3C–Br ↔ (CH3)3C+ + Br−

(b) Reaction step 2 (C = electrophilic centre, O = nucleophilic centre):

(CH3)3C+ + H2O → (CH3)3C–OH2+

(c) Reaction step 3 (H = electrophilic centre, O = nucleophilic centre):

(CH3)3C–OH2+ + H2O ↔ (CH3)3C–OH + H3O+

(d) Reactants and their associated chemical processes:

| Name of the chemical process | Reactant 1 | Reactant 2 |
|---|---|---|
| Dissociation | (CH3)3C–Br | |
| Reaction with water | (CH3)3C+ (electrophile) | H2O (nucleophile) |
| Fast acid-base reaction | (CH3)3COH2+ (electrophile) | H2O (nucleophile) |

Figure 3.3 The production of a tertiary alcohol can be described as a series of three reaction steps.

Thought processes for equation 3.4: The chemical equation “HO− + CH3CH2Br → CH3CH2OH + Br−” describes the substitution of one nucleophile (Br, the leaving group) by an incoming nucleophile (OH−), as shown in Figure 3.4. In chemical theory, the two steps are concerted, i.e. bond formation and bond cleavage happen at the same time. In the QPT representation (and thus in program development), however, these two processes are assumed to occur at two different time instants. First, a bond is broken (bromine acts as the δ− and the carbon bearing the fewest hydrogen substituents acts as the δ+); then a new bond is made (the nucleophile is OH−, while the carbon attached to the bromine atom is the electrophilic centre). Overall, presenting the thought processes in this way allows easier identification of the individual views needed in each process.

(a) Concerted steps:

HO− + CH3CH2Br → [HO ----- CH2CH3 ----- Br] → CH3CH2OH + Br−

(reactants → transition state, with HO = δ−, C = δ+ and Br = δ− → final products)

(b) Reactants and their associated chemical processes:

| Name of the chemical process | Reactant 1 | Reactant 2 |
|---|---|---|
| Dissociation (of Br–CH2CH3) | Br (δ−) | C (δ+) |
| Reaction with HO− | C+H2CH3 (electrophile) | HO− (nucleophile) |

Figure 3.4 The “dissociation” and “reaction with HO−” are concerted steps. This is a typical SN2 backside attack reaction.

3.6.1 Individual Views Identification

The view identification technique has just been presented in Section 3.6. Analysis

showed that only two chemical processes are required for the entire simulation to

reproduce the behaviour of the specified organic mechanisms. First, we discuss the

individual views, then the two chemical processes. In QPT, an individual view describes both the contingent existence of objects and object properties that change

significantly with time. Individual views are used to model the behaviour of individuals

(objects) and to provide explanation about their general characteristics. Automatic

construction of individual views is made possible through recognizing the reacting

species as either a nucleophile or an electrophile. Our approach suggests that an organic

reaction is triggered based on the recognition of the reacting species, which is called

“view pair” in this work. Each view corresponds to a QPT’s “individual view”.

Individual views identified for equation 3.2 are the following:

• Individual-View Proton (e.g. H+)

(An electrophile used by step 1)

• Individual-View Hydroxyl (e.g. −OH)

(A nucleophile used by step 1)

• Individual-View Oxonium ion (e.g. the H2O+ in the (CH3)3COH2+ compound)

(“C” is delta-plus, “O” is delta-minus, both are used in step 2)

• Individual-View Halide-Ion (e.g. Cl−)

(A nucleophile used by step 3)

• Individual-View Carbocation (e.g. (CH3)3C+)

(An electrophile used by step 3)

Nucleophiles and electrophiles can further be classified as charged or neutral (the

states). As demonstrated in the above example, in some reactions, the reacting species

involved is a neutral electrophile (e.g. “C”) rather than a charged electrophile (e.g. “C+”

or “H+”). Also, a nucleophile can be charged (e.g. “Cl−”) or neutral (e.g. the “O” of an alcohol), as given above. The views used in this work are designed to cater for both charged and neutral nucleophiles and electrophiles. Examples of charged nucleophiles, charged electrophiles, neutral nucleophiles, and neutral electrophiles are given below (in predicate logic format).

nucleophile('O', neutralNu).

nucleophile('Cl-', chargedNu).

nucleophile('Br-', chargedNu).

nucleophile('HO-', chargedNu).

nucleophile('O+', chargedNu).

nucleophile('Br', neutralNu).

nucleophile('Cl', neutralNu).

nucleophile('I', neutralNu).

nucleophile('F', neutralNu).

electrophile('H+', chargedElec).

electrophile('H', neutralElec).

electrophile('C', neutralElec).

electrophile('C+', chargedElec).
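For readers more familiar with a general-purpose language, the same facts can be held in two lookup tables. This is a sketch only: the dictionary names and the `classify` helper are hypothetical and simply mirror the predicates listed above.

```python
# The nucleophile/2 and electrophile/2 facts above, as lookup tables.
NUCLEOPHILES = {
    "O": "neutralNu", "Cl-": "chargedNu", "Br-": "chargedNu",
    "HO-": "chargedNu", "O+": "chargedNu", "Br": "neutralNu",
    "Cl": "neutralNu", "I": "neutralNu", "F": "neutralNu",
}
ELECTROPHILES = {
    "H+": "chargedElec", "H": "neutralElec",
    "C": "neutralElec", "C+": "chargedElec",
}

def classify(species):
    """Return every (role, state) pair recorded for a reacting species."""
    roles = []
    if species in NUCLEOPHILES:
        roles.append(("nucleophile", NUCLEOPHILES[species]))
    if species in ELECTROPHILES:
        roles.append(("electrophile", ELECTROPHILES[species]))
    return roles

print(classify("Cl-"))
print(classify("C+"))
```

A species that is not recorded in either table yields an empty list, which a caller can treat as "no view can be constructed".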

3.6.2 Representing Individual Views

Note that equation 3.2 – equation 3.4 each uses different substrates and reagents but so

long as we are able to identify what class/group each individual view belongs to then

modelling QPT processes may be automated. A representative result has been tabulated

in Tang and Mustapha (2006). The main result is that, in a particular covalent bonding,

all nucleophiles will undergo the same chemical change. Likewise all electrophiles will

follow another pattern of change. This section shows the properties of views in QPT

terms. Properties unique to electrophile (e.g. C, H+, −C+) are given in Figure 3.5 while

properties unique to nucleophile and leaving group views are presented in Figure 3.6.

These specifications can serve as generic views for any functional group defined in the

scope of this work.

Figure 3.5 (a) Generic definition for an electrophile described using QPT (b) An electrophile used in “make-bond” process (c) An electrophile used in “break-bond” process.

(a) Individual-View “Nucleophile” (e.g. OH−, Cl−)

    Individuals
        p ; a piece-of-stuff
    Preconditions
        nucleophile(p)
    Relations
        Ds[charge(p)] = 1
        lone-pair-electron(p) ${}^{+}_{-}P$ no-of-bond(p)
        charge(p) ${}^{-}_{+}P$ lone-pair-electron(p)

(b) Individual-View “Charged-Nucleophile”

    Individuals
        p ; a piece-of-stuff, e.g. Cl− or OH−
    Quantity-Conditions
        charges(p, negative)
        Am[lone-pair-electron(p)] >= ONE

(c) Individual-View “Delta-Minus” (e.g. the “O” that bonds to “C” in the main chain)

    Individuals
        p ; is a piece of stuff, the leaving group
    Preconditions
        electronegativity(p) > electronegativity(carbon)
    Quantity-Conditions
        leaving-group(p, good)
    Relations
        Ds[charge(p)] = -1
        lone-pair-electron(p) ${}^{-}_{+}P$ no-of-bond(p)
        charge(p) ${}^{+}_{-}P$ lone-pair-electron(p)

Figure 3.6 (a) Generic definition for a nucleophile described using QPT (b) A nucleophile used in “make-bond” process (c) A “delta-minus” view. It is used when the covalent bond between a delta-plus and a delta-minus species is deleted.
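The slot structure of an individual view maps naturally onto a small record type. The sketch below is one possible encoding, under the simplifying assumption that each condition is an opaque string checked by membership in a set of known facts; the class `IndividualView` and its `is_active` method are hypothetical and do not describe how QRiOM stores views.

```python
from dataclasses import dataclass, field

@dataclass
class IndividualView:
    """A QPT individual view with the slots used in Figure 3.6."""
    name: str
    individuals: list
    preconditions: list = field(default_factory=list)
    quantity_conditions: list = field(default_factory=list)
    relations: list = field(default_factory=list)

    def is_active(self, facts):
        """A view is active when all of its conditions are known facts."""
        required = self.preconditions + self.quantity_conditions
        return all(cond in facts for cond in required)

# The "Charged-Nucleophile" view of Figure 3.6(b).
charged_nucleophile = IndividualView(
    name="Charged-Nucleophile",
    individuals=["p"],
    quantity_conditions=[
        "charges(p, negative)",
        "Am[lone-pair-electron(p)] >= ONE",
    ],
)

facts = {"charges(p, negative)", "Am[lone-pair-electron(p)] >= ONE"}
print(charged_nucleophile.is_active(facts))
```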

3.6.3 Relation Between View Pairs and Organic Processes

In this work, a view pair is defined as having two individual views, in the form of

<Individual-View-1, Individual-View-2>. A view pair is used as the means to select

(and activate) a chemical process. In this thesis, a chemical process means an organic process. The term “organic process” is used interchangeably with the term “organic reaction”, since organic reactions are modelled as QPT processes. The view pairs and their associated organic processes are stored as basic facts in the chemical knowledge base. Sample cases are given in Table 3.1. These results were obtained through a detailed analysis of numerous reaction cases and have been verified with the chemists. The first row in Table 3.1 says: “a bond will be made (or formed) when the view pair <neutral nucleophile, charged electrophile> exists”. Likewise, a bond will be deleted when one of these view pairs exists: <neutral electrophile, neutral nucleophile> or <neutral electrophile, charged nucleophile>.

Table 3.1: Relationship between view pair and covalent bonding.

| No. | Individual-View 1 | Individual-View 2 | Covalent bonding |
|---|---|---|---|
| 1 | neutral nucleophile | charged electrophile | make bond |
| 2 | charged electrophile | charged nucleophile | make bond |
| 3 | neutral electrophile | charged nucleophile | make bond |
| 4 | neutral electrophile | charged nucleophile | break bond |
| 5 | neutral nucleophile (delta minus) | neutral electrophile (delta plus) | make bond |
| 6 | neutral nucleophile (delta minus) | neutral electrophile (delta plus) | break bond |

In Table 3.1, individual views in No. 3 and No. 4 are the same, but the covalent bonding

that will take place is different. So, which one would the software advise to occur?

The solution is to use the OntoRM ontology to disambiguate the situation (OntoRM will

be discussed in Chapter 5). We have presented the following two view pairs: <carbon,

−OH2+> and <carbon, HO−> in equation 3.2 and equation 3.4 respectively. Both view

pairs consist of a neutral electrophile and a charged nucleophile. But once they are

checked with the chemical knowledge, and if the two reacting units come from the same

compound then “deleting a bond” is the process that should occur. The OntoRM will

also check whether the carbocation is a stable one (in this case it is tertiary, so the

answer is yes). On the other hand, if the charged nucleophile is the one approaching the

compound then the reasoning engine will suggest the process of “adding a bond”

instead. By doing so, the nucleophile will substitute the leaving group. This is the

phenomenon of expulsion of the leaving group.

Let us examine another example. In the same table, No. 5 and No. 6 have the same view pair <delta-minus, delta-plus>, but the covalent bonding that takes place can be either “adding a bond” or “deleting a bond”. The reasoning engine resolves this situation by looking at the structural unit of the compound to see whether there is still a charged atom around. If so, it will suggest breaking the bond (e.g. the lone-pair electrons on the oxygen atom in “H2O” are donated to the hydrogen atom in the substrate “(CH3)3COH2+”; hence a stable compound “(CH3)3COH” is formed). Otherwise the reasoning engine will recommend that a bond should be added.
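The selection rules of Table 3.1, together with the two disambiguation checks just described, can be sketched as a single function. The function name, its parameters and the boolean flags `same_compound` and `charged_atom_nearby` are assumptions made for illustration; in the thesis the disambiguation is delegated to the OntoRM ontology, which also checks carbocation stability (omitted here).

```python
def select_process(nucleophile_state, electrophile_state,
                   same_compound=False, charged_atom_nearby=False):
    """Pick "make-bond" or "break-bond" for a view pair (Table 3.1).

    States are "charged" or "neutral". The two flags stand in for the
    structural checks that resolve rows 3/4 and rows 5/6.
    """
    pair = (nucleophile_state, electrophile_state)
    if pair in (("neutral", "charged"), ("charged", "charged")):
        return "make-bond"  # rows 1 and 2
    if pair == ("charged", "neutral"):
        # Rows 3 and 4: both units in one compound means the bond is deleted.
        return "break-bond" if same_compound else "make-bond"
    if pair == ("neutral", "neutral"):
        # Rows 5 and 6: a charged atom still around means the bond is broken.
        return "break-bond" if charged_atom_nearby else "make-bond"
    raise ValueError(f"unknown view pair: {pair}")

# <carbon, -OH2+> in equation 3.2: both units come from the same compound.
print(select_process("charged", "neutral", same_compound=True))
# <carbon, HO-> in equation 3.4: the nucleophile approaches the compound.
print(select_process("charged", "neutral"))
```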

3.7 Reaction Steps Classified as “make-bond” and “break-bond” Processes

To build a model, it is necessary to identify the relevant entities (views), properties and relationships. The properties of the views were therefore studied first, followed by their general characteristics, by examining how the states of these views (or reacting species) change along the reaction route from the initial state until the simulation ends. The aim was to assemble the general properties and behaviour patterns of the reacting species and the associated organic processes (covalent bonding). This is our technique for automating the construction of views and QPT processes. In our earlier study (Tang and Syed Mustapha, 2006), we hand-instantiated QPT models representing the chemical theories of the “make-bond” and “break-bond” processes, and these models can be used in the simulation of many organic reactions. From examining a number of chemical equations, one conclusion is that simulating the behaviour of the reactions defined in the scope of this work requires only nucleophiles and electrophiles (both charged and neutral types). Further investigation showed that the different chemical processes involved in a reaction can each be placed under one of two activities, “adding a bond” or “deleting a bond” (the proof of this claim is presented in Section 3.7.1). The two chemical bonding activities are named “make-bond” and “break-bond” respectively. Hereafter, these two terms are used to refer to adding a bond and deleting a bond.

3.7.1 Proof of Common Behaviour Exhibited in Organic Processes

As shown in Section 3.7, regardless of the name of a chemical process, so long as the process takes a view pair (consisting of a nucleophile and an electrophile), it is either a “make-bond” or a “break-bond” process. In this work, two generic processes are identified, namely “make-bond” and “break-bond”. This subsection details the proof of our claim that the chemical properties of the two organic processes are easier to define (compared with their inorganic counterparts), which eases the construction of the generic processes (represented as QPT models) that support task-level reasoning. Examples of task-level processes are “protonation”, “dissociation”, “halogenation”, and so forth. Even though the names of the chemical processes differ, each can be grouped under one of the two generic processes. All organic chemistry processes that involve adding a covalent bond between two views exhibit the same chemical behaviour, and the individuals in the views undergo the same chemical changes. The sequence in which these processes are used is vital in suggesting a particular reaction mechanism. It is also ascertained that both the SN1 and SN2 mechanisms need just two processes (“make-bond” and “break-bond”), as given in Table 3.2.

Table 3.2: A summary of the covalent bonding needed by three chemical equations presented in this thesis.

| | Equation 3.2 | Equation 3.3 | Equation 3.4 |
|---|---|---|---|
| First reaction step | Protonation: (CH3)3COH + H+ (“make-bond” process) | Dissociation: (CH3)3C–Br (“break-bond” process) | Br (a halogen) is leaving: Br [HO-----CH2CH3] (“break-bond” process) |
| Second reaction step | Dissociation: (CH3)3C–OH2+ (“break-bond” process) | Reaction with water: (CH3)3C+ + H2O (“make-bond” process) | Nucleophile attacks: (HO− ---CH2CH3-----Br) (“make-bond” process) |
| Third reaction step | Capturing of anion by cation: (CH3)3C+ + Cl− (“make-bond” process) | Fast acid-base reaction: (CH3)3COH2+ + H2O (“make-bond” process) | -- |
| Remarks | Mechanism used = SN1 (existence of carbocation intermediate) | Mechanism used = SN1 | Mechanism used = SN2 (concerted process) |

In summary, Table 3.2 shows that:

• Equation 3.2 can be explained by the SN1 mechanism, and its reaction steps comprise “make-bond”, “break-bond” and another “make-bond”.

• Equation 3.3 can be explained by the SN1 mechanism, and its reaction steps comprise “break-bond”, “make-bond” and another “make-bond”.

• Equation 3.4 can be explained by the SN2 mechanism, and its reaction steps comprise “make-bond” and “break-bond”.

The above information is included in the chemical knowledge base. Such information

can be retrieved at any stage during a simulation task (e.g. before suggesting an organic

process to take place). Despite the different names (e.g. protonation, capturing of anion by cation, etc.) used in each reaction step, these reactions are either “make-bond” or “break-bond” processes. In addition, for a particular organic process, regardless of the specific names of the view pairs (e.g. <proton, alcohol oxygen> or <carbocation, halide ion>), all the reacting species undergo the same specific chemical changes. To justify this claim, three chemical equations are used to demonstrate the chemical behaviour that is common to these two organic processes across different view pairs.
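The grouping of task-level processes under the two generic processes can be sketched as a simple mapping. The table `GENERIC_PROCESS`, the step labels and the `generic_sequence` helper are hypothetical names; the assignments follow the process types reported for equations 3.2 to 3.4 in this chapter, including the assumption that the fast acid-base step counts as a “make-bond”.

```python
# Task-level process name -> generic process (assumed from this chapter).
GENERIC_PROCESS = {
    "protonation": "make-bond",
    "dissociation": "break-bond",
    "capturing of anion by cation": "make-bond",
    "reaction with water": "make-bond",
    "fast acid-base reaction": "make-bond",
    "halogen is leaving": "break-bond",
    "nucleophile attacks": "make-bond",
}

EQUATIONS = {
    "3.2": ["protonation", "dissociation", "capturing of anion by cation"],
    "3.3": ["dissociation", "reaction with water", "fast acid-base reaction"],
    "3.4": ["halogen is leaving", "nucleophile attacks"],
}

def generic_sequence(task_steps):
    """Translate task-level step names into generic processes."""
    return [GENERIC_PROCESS[step] for step in task_steps]

for number, steps in EQUATIONS.items():
    print(number, generic_sequence(steps))

# Only two generic processes are needed across all three equations.
used = {kind for steps in EQUATIONS.values() for kind in generic_sequence(steps)}
print(sorted(used))
```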

3.7.1.1 Behaviour Generalization for “make-bond” Process

In this section, three chemical equations for properties generalization purposes are

analyzed. Results of behaviour generalization for the two chemical bonding activities

are tabulated in Table 3.3 – Table 3.6. We rewrite the three chemical equations, as

follows:

(CH3)3COH + HCl → (CH3)3CCl + H2O (SN1 mechanism)

(CH3)3CBr + H2O → (CH3)3COH + HBr (SN1 mechanism)

HO− + CH3CH2Br → CH3CH2OH + Br− (SN2 mechanism)

Table 3.3: Reacting species and their chemical changes in the “protonation” process (“make-bond”) of Equation 3.2.

| Nucleophile (O) | Before | After | Remarks | Electrophile (H+) | Before | After | Remarks |
|---|---|---|---|---|---|---|---|
| Charge | Neutral | Positive | Unstable | Charge | Positive | Neutral | Stable |
| No. of covalent bond | 2 | 3 | More than what it should have | No. of covalent bond | 0 | 1 | Not informative |
| Lone pair electrons | 2 | 1 | Have not reached maximum pair | Lone pair electrons | 0 | 0 | No change and not informative |

Table 3.4: Reacting species and their chemical changes in the “capturing of halide anion by carbocation” process (“make-bond”) of Equation 3.2.

| Nucleophile (Cl−) | Before | After | Remarks | Electrophile (C+) | Before | After | Remarks |
|---|---|---|---|---|---|---|---|
| Charge | Negative | Neutral | Stable | Charge | Positive | Neutral | Stable |
| No. of covalent bond | 0 | 1 | Not informative | No. of covalent bond | 3 | 4 | Stable |
| Lone pair electrons | 4 | 3 | Stable | Lone pair electrons | 0 | 0 | No change |

Table 3.5: Reacting species and their chemical changes in the “reacts with water” process (“make-bond”) of Equation 3.3 for the formation of alcohol.

| Nucleophile (the “O” in OH2) | Before | After | Remarks | Electrophile (C+) | Before | After | Remarks |
|---|---|---|---|---|---|---|---|
| Charge | Neutral | Positive | Unstable | Charge | Positive | Neutral | Stable |
| No. of covalent bond | 2 | 3 | More than what it should have | No. of covalent bond | 3 | 4 | Stable |
| Lone pair electrons | 2 | 1 | Have not reached maximum pair | Lone pair electrons | 0 | 0 | No change |

Table 3.6: Reacting species and their chemical changes in the “nucleophile attacks” process (“make-bond”) of Equation 3.4 for the formation of ethanol.

| Nucleophile (HO−) | Before | After | Remarks | Electrophile (the “C” that bonds to the bromine) | Before | After | Remarks |
|---|---|---|---|---|---|---|---|
| Charge | Negative | Neutral | Stable | Charge | Positive | Neutral | Stable |
| No. of covalent bond | 1 | 2 | Stable | No. of covalent bond | 3 | 4 | Stable |
| Lone pair electrons | 3 | 2 | Not informative | Lone pair electrons | 0 | 0 | No change |

We will now show how the set of functional dependencies for the “make-bond” process is established. When the numerical data in Table 3.3 – Table 3.6 are examined, the following quantity dependencies and effect propagation can be established (in QPT notation). Note that “Y ${}^{+}_{-}P$ X” is used instead of the original QPT equivalent “Y α ${}^{+}_{-}Q$ X”. The symbol “P” is used in the software and means “proportionality”. One reason is that “P” is more accessible than “α” to chemistry students. It is also our intention to avoid additional symbols in the software, so that chemistry students do not have to learn new jargon, which can be a burden to them.

The numerical data in Table 3.3, Table 3.5 and Table 3.6 allow us to define the

following qualitative proportionalities:

lone-pair-electron(O) ${}^{+}_{-}P$ no-of-bond(O) …(a)

charge(O) ${}^{-}_{+}P$ lone-pair-electron(O) …(b)

lone-pair-electron(H+) P no-of-bond(H+) …(c)

charge(H+) ${}^{+}_{-}P$ no-of-bond(H+) …(d)

Likewise, the numerical values from Table 3.4, Table 3.5 and Table 3.6 also enable us

to define the following relationships:

lone-pair-electron(Cl−) ${}^{+}_{-}P$ no-of-bond(Cl−) …(e)

charge(Cl−) ${}^{-}_{+}P$ lone-pair-electron(Cl−) …(f)

lone-pair-electron(C+) P no-of-bond(C+) …(g)

charge(C+) ${}^{+}_{-}P$ no-of-bond(C+) …(h)

Interpretation of the above is given here. In all cases, an increase in no-of-bond of the

nucleophile (e.g. O and Cl−) will cause a decrease in its lone-pair-electron ((a) and (e)).

This in turn will increase the charge of the affecting species either from neutral to

positive (increasing) or from negative to neutral (also increasing). Recall that the

quantity space designed for charge is [negative, neutral, positive]. Notice that the

charge on electrophile is neutral after the “make-bond” process in each case (shown in

(d) and (h)). The chemical properties and reaction behaviour of the nucleophile in

Table 3.3 and the nucleophiles in Table 3.5 and Table 3.6 are the same while the

electrophiles in Table 3.4 and Table 3.5 share similar behaviour to that in Table 3.3.

This result confirms that the same set of chemical properties for modelling the Relation

slot of a QPT model for “make-bond” process can be reused. This means, in a “make-

bond” process, if the individual is an electrophile (or nucleophile) the same set of

general properties can be applied. For example, regardless of whether it is C+ or H+, so long as it is an electrophile it will demonstrate the same change in chemical properties. Once a “make-bond” process has been determined (based on the view-pair concept mentioned earlier), the required properties are retrieved and composed to give the required QPT model. Similarly, we can obtain the general properties for the “break-bond” process. In this work, this set of qualitative proportionalities is stored in the knowledge base as chemical theories (in the form qprop(X,Y,SignX,SignY)), where “qprop” stands for “qualitative proportionality”. For example, charge(p) $P^{+}_{-}$ no-of-bond(p) is represented as qprop(no-of-bond, charge, plus, minus).
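As a minimal sketch of how such qprop facts might be stored and queried, the Python fragment below encodes relations (a), (b) and (d) as tuples. The field order follows qprop(X, Y, SignX, SignY) from the text; the names `QProp`, `KB` and `effects_of` are hypothetical and are not taken from QRiOM.

```python
from typing import NamedTuple


class QProp(NamedTuple):
    """One qualitative proportionality, mirroring qprop(X, Y, SignX, SignY)."""
    x: str       # influencing quantity
    y: str       # dependent quantity
    sign_x: str  # direction of change in X ("plus" or "minus")
    sign_y: str  # resulting direction of change in Y


# Hypothetical knowledge base for the two roles in a "make-bond" process.
KB = {
    "nucleophile": [
        QProp("no-of-bond", "lone-pair-electron", "plus", "minus"),  # (a)
        QProp("lone-pair-electron", "charge", "minus", "plus"),      # (b)
    ],
    "electrophile": [
        QProp("no-of-bond", "charge", "plus", "minus"),              # (d)
    ],
}


def effects_of(role: str, quantity: str, direction: str) -> list[tuple[str, str]]:
    """Return (dependent quantity, direction) pairs triggered by one change."""
    return [(p.y, p.sign_y) for p in KB[role]
            if p.x == quantity and p.sign_x == direction]


print(effects_of("electrophile", "no-of-bond", "plus"))  # [('charge', 'minus')]
```

Because the facts are keyed by role rather than by atom, the same entries would serve C+ and H+ alike, which is the reuse argument made in the text.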

3.7.1.2 Behaviour Generalization for “break-bond” Process

Table 3.7 tabulates the changes in chemical parametric values for the species in the

“dissociation” process of equation 3.2 for the production of alkyl halide. The

“(CH3)3COH2+” is viewed as having two reacting units (hence individuals). Here, “C”

and “O+” are modelled as the two individual views even though they are from the same

compound. Table 3.8 shows the states and values of the reacting species in the

“dissociation” process of equation 3.3 for the production of alcohol. Table 3.9 gives the

chemical data changes for the reacting species that are involved in the “X is leaving”

process of equation 3.4 for the production of ethanol.

Table 3.7: The reacting species involved in this “break-bond” process are “C” from the alkyl group and the “O” from the oxonium ion. The carbon is δ+, so that the electrons are pushed towards “O” which is more electronegative.

| Property | O+ Before | O+ After | O+ Remarks | C Before | C After | C Remarks |
|---|---|---|---|---|---|---|
| Charge | Positive | Neutral | Stable | Neutral | Positive | Unstable |
| No. of covalent bond | 3 | 2 | Stable | 4 | 3 | Unstable |
| Lone pair electrons | 1 | 2 | Stable | 0 | 0 | Not informative |

Table 3.8: In this “break-bond” process, the atoms involved are “C” and “Br” from the same molecule.

| Property | Br Before | Br After | Br Remarks | C Before | C After | C Remarks |
|---|---|---|---|---|---|---|
| Charge | Neutral | Negative | Unstable & reactive | Neutral | Positive | Unstable |
| No. of covalent bond | 1 | 0 | Not informative | 4 | 3 | Unstable |
| Lone pair electrons | 3 | 4 | Not informative | 0 | 0 | No change |

Table 3.9: In this “break-bond” process, the atoms involved are “C” and “Br”. Bromine is more electronegative than the other hydrogen substituents. So, it is the Br that leaves the molecule.

| Property | Br Before | Br After | Br Remarks | C Before | C After | C Remarks |
|---|---|---|---|---|---|---|
| Charge | Neutral | Negative | Unstable & reactive | Neutral | Positive | Unstable |
| No. of covalent bond | 1 | 0 | Not informative | 4 | 3 | Unstable |
| Lone pair electrons | 3 | 4 | Not informative | 0 | 0 | No change |

The functional dependencies and effect propagation that can be derived from the above three tables are as follows. From Table 3.7, the following functional dependencies are obtained:

lone-pair-electron(O+) $P^{-}_{+}$ no-of-bond(O+) …… (i)

charge(O+) $P^{+}_{-}$ lone-pair-electron(O+) …… (j)

lone-pair-electron(C) $P$ no-of-bond(C) …… (k)

charge(C) $P^{-}_{+}$ no-of-bond(C) …… (l)

Reactions in Table 3.8 and Table 3.9 exhibit the same behaviour as shown below:

lone-pair-electron(Br) $P^{-}_{+}$ no-of-bond(Br) …… (m)

charge(Br) $P^{+}_{-}$ lone-pair-electron(Br) …… (n)

lone-pair-electron(C) $P$ no-of-bond(C) …… (o)

charge(C) $P^{-}_{+}$ no-of-bond(C) …… (p)

The explanation is as follows. The δ− species that pulls in electrons during the “break-bond” process regains one pair of unshared electrons (shown in (i) and (m)), which in turn causes its charge to decrease ((j) and (n)). On the other hand, the species that loses electrons will increase its charge ((l) and (p)).

The above derivation of how the modelling decision is obtained has also been verified

and accepted by the domain experts. Based on the general properties, it is more

apparent how the various slots in Figure 3.7 (a “make-bond” specification in QPT) are

defined. The model can be used to reproduce the chemical behaviour of the first

reaction step of equation 3.2. In other words, when reasoning is applied to a QPT

model, the behaviour of adding a bond (or deleting a bond) can be reproduced.

On the other hand, the QPT model for “break-bond” process used in this work is

presented in Figure 3.8. The “break-bond” model is used to simulate the chemical

behaviour of the second step of equation 3.2. The QPT models were conceptually

validated by two chemists. The chemists concluded that the representation of chemical

theories in the models is acceptable.

| Process Slots | Neutral Nucleophile (e.g. O) | Charged Electrophile (e.g. H+) |
|---|---|---|
| Pre-Conds | Am[no-of-bond(O)] = TWO; is_reactive((CH3)3COH); leaving_group(OH, poor) | -- |
| Qty-Conds | Am[lone-pair-electron(O)] >= ONE; charges(O, neutral); nucleophile(O, neutral) | charges(H, positive); electrophile(H, charged) |
| Qualitative Proportionality | lone-pair-electron(O) $P^{+}_{-}$ no-of-bond(O); charge(O) $P^{-}_{+}$ lone-pair-electron(O) | lone-pair-electron(H) $P$ no-of-bond(H); charge(H) $P^{+}_{-}$ no-of-bond(H) |
| Direct Influences | I+ (no-of-bond(O), Am[bond-activity]) | I+ (no-of-bond(H), Am[bond-activity]) |

Figure 3.7 An instantiated “make-bond” process described using QPT modelling constructs. The process focuses on the nucleophile (the “OH”) to be replaced and the proton.

| Process Slots | Delta-Minus (δ−) (e.g. O+) | Neutral Electrophile (δ+) (e.g. C) |
|---|---|---|
| Pre-Conds | bond-between(C, O); electronegativity(O) > electronegativity(C) | |
| Qty-Conds | Am[no-of-bond(O)] > Am[max-bond-allowed(O)]; charges(O, positive) | -- |
| Qualitative Proportionality | lone-pair-electron(O) $P^{-}_{+}$ no-of-bond(O); charge(O) $P^{+}_{-}$ lone-pair-electron(O) | lone-pair-electron(C) $P$ no-of-bond(C); charge(C) $P^{-}_{+}$ no-of-bond(C) |
| Direct Influences | I− (no-of-bond(O), Am[bond-activity]) | I− (no-of-bond(C), Am[bond-activity]) |

Figure 3.8 An instantiated “break-bond” process described using QPT modelling constructs. The process focuses on the leaving group and the electrophilic carbon centre.

The common set of parameter dependency statements presented in this section forms the basis for the automation of QPT models. The modelling constructs of QPT are suitable for representing an organic reaction in terms of the movements of electrons around an organic compound. This type of micro-level description mimics the way a chemist explains how a reaction takes place.

3.8 Representing Organic Chemistry Theories Using QPT Constructs

In qualitative reasoning research, even though the key role is played by qualitative simulation, the first and foremost task is to construct a model. Without a model, reasoning cannot start. As such, a process model must be constructed before qualitative reasoning can be performed. Once the right organic process is determined, the fixed set of qualitative proportionalities (presented in Section 3.7.1) is retrieved from the chemical KB and the required QPT processes are constructed. Since chemical process reasoning based on the QPT ontology starts from the direct influence, the choices made in representing the QPT slots are discussed in the following subsections.

3.8.1 Direct and Indirect Influences in Organic Reaction Simulation

There are only two direct influences: adding a bond to the organic compound and deleting a bond from the compound. The former is the direct effect of a “make-bond” process while the latter is the direct effect of a “break-bond” process. When a bond is added or deleted, it brings about other effects, as shown in the Relation slot of the QPT models (Figure 3.7 and Figure 3.8).
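The step from a direct influence to its indirect effects can be sketched as a simple chaining over the Relation slot. The fragment below is an illustrative assumption, not the QRiOM implementation: it takes the two relations listed for “O” in Figure 3.7 and follows them from the direct influence I+ on no-of-bond(O). The names `RELATIONS` and `propagate` are hypothetical.

```python
# Relations for the nucleophile "O" in Figure 3.7, written as
# (X, sign of X, Y, sign of Y): a change of X in the given direction
# causes a change of Y in the given direction.
RELATIONS = [
    ("no-of-bond", "plus", "lone-pair-electron", "minus"),
    ("lone-pair-electron", "minus", "charge", "plus"),
]


def propagate(quantity, sign, relations):
    """Follow indirect effects outward from one directly influenced quantity."""
    trace = [(quantity, sign)]
    frontier = [(quantity, sign)]
    while frontier:
        q, s = frontier.pop(0)
        for x, sx, y, sy in relations:
            # Fire a relation only when both the quantity and its direction match.
            if (x, sx) == (q, s) and (y, sy) not in trace:
                trace.append((y, sy))
                frontier.append((y, sy))
    return trace


# Direct influence of "make-bond" on the nucleophile: I+ on no-of-bond(O).
for q, s in propagate("no-of-bond", "plus", RELATIONS):
    print(q, s)
```

The printed trace reads as the causal story told in the text: one more bond, one fewer lone pair, and a higher charge.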

3.8.2 Postulating Limit Points

Postulating the existence of limit points is a challenge in this domain. In QPT

formalism, use of limit points is important since they are crucial for prediction. For

example, some physical phenomena occur when a quantity’s value is above or below a

limit point. Table 3.10 gives samples of the limit points used in this work.

Table 3.10: Quantity spaces and limit points for the three main quantities used in the framework.

| Quantity | Quantity space | Limit points |
|---|---|---|
| Charge | [negative, neutral, positive], where negative means unstable, neutral means stable, and positive means unstable | [negative, neutral, positive]. When the value of the “charge” is either negative or positive, then the current process will stop and the next process may begin. |
| Covalent bond | [lessThan, enough, moreThan], where lessThan means unstable, enough means stable, and moreThan means unstable | [lessThan, enough, moreThan]. When the value of the “covalent bond” is either lessThan or moreThan, then the current process will stop and the next process may begin. |
| Lone pair electrons | [lessPair, enough, extraPair], where lessPair means unstable, enough means stable, and extraPair means unstable | [lessPair, enough, extraPair]. When the value of the “lone pair electrons” is either lessPair or extraPair, then the current process will stop and the next process may begin. |

In the literature, a quantity space is defined as the set <decrease, nil, increase> or the set <plus, zero, minus>. However, chemistry students are more familiar with the numerical representation of an atom’s basic chemical properties. Because “stable” and “unstable” are related to “how many”, results are presented in numeric form. Since it is easier for students to match the answer to the one in their mind, numerical output is shown on the Graphical User Interface (GUI) when interacting with the software. In other words, the numerical data are just an “output representation” and are not used as variables in the simulation. Since the ultimate goal of this work is to help students understand the underlying subject, the decision to display numerical outputs is considered acceptable and valid. Moreover, there is no mismatch between the system-generated answers and those in the textbook, and this decision imposes no extra cost in terms of the effort to learn a new formalism.

Owing to the specific requirements of this chemistry system, limit analysis (Forbus, 1984) is not used in the standard way. Instead, a (new) phenomenon will occur whenever a limit point is reached (i.e. having the exact condition). In QPT, when a limit point is reached, something will occur and will cause the current process to stop. In this work, however, the nature of the problem domain suggests that the limit point need not necessarily be passed (in order to terminate a process), as illustrated in the following example. The quantity space and its limit analysis have been adapted to cope with the “a-bit-weird” condition caused by the nature of the problem. So, the standard concept of limit point is not fully implemented; it is replaced by tracking the qualitative states (and their associated numeric equivalents) in the quantity space.

Let us take no-of-bond as an example, and recall that the numerical data space for no-of-

bond is: bond = [0, 1, 2, 3, 4]. The qualitative states for no-of-bond are: bond =

[lessThan, enough, moreThan].

We now look at the oxygen atom (O) and the carbon atom (C) in the main chain of an

alcohol (e.g. the (CH3)3C−OH presented in equation 3.1 on page 65):

For “O”, its stable state = enough = 2 (in numeric). After “protonation”, the new value

is moreThan, which is a limit point, so the process stops. On the other hand, for “C”

atom, its stable state = enough = 4. After “dissociation”, the new value is lessThan,

which is also a limit point, so the process stops. The above two cases obey the use of

limit points.

As for the incoming nucleophile/electrophile and the leaving group, their values are used to determine whether a reaction step in the simulation route should stop due to violation of its quantity-condition, as shown below. The proton (H+) has no bond; when the “protonation” process occurs, the “H+” (which serves as an incoming electrophile) binds to the oxygen atom. The hydrogen now has its covalent bond = enough (not a limit point) = 1 (in numeric), so the quantity-condition of this process is violated and the protonation is deemed to stop. Similarly, after the “dissociation” process, the “O+” has its covalent bond = enough (not a limit point) = 2; the process still ends since the quantity-condition has been violated. As for the “Cl−”, after the “make-bond” process it is stable = enough (also not a limit point), and since the quantity-condition is violated, the process is deemed to stop.
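The mapping between numeric bond counts and qualitative states used in the examples above can be sketched as follows. This is an illustrative assumption about how the tracking might be coded, using the stable values quoted in the text (O = 2, C = 4, H = 1); the function names are hypothetical.

```python
def qualitative_state(value: int, stable: int) -> str:
    """Map a numeric bond count onto [lessThan, enough, moreThan]."""
    if value < stable:
        return "lessThan"
    if value > stable:
        return "moreThan"
    return "enough"


def is_limit_point(state: str) -> bool:
    """In this scheme, both unstable ends of the quantity space are limit points."""
    return state != "enough"


# "O" after protonation: three bonds against a stable value of two.
print(qualitative_state(3, stable=2))   # moreThan
# "C" after dissociation: three bonds against a stable value of four.
print(qualitative_state(3, stable=4))   # lessThan
# "H" after protonation: one bond, which is its stable value.
print(qualitative_state(1, stable=1))   # enough
```

The first two cases end at a limit point; the third does not, so in that case the process has to be stopped by the violated quantity-condition instead, as the text describes.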

3.8.3 The Quantities and Quantity Spaces

Characteristics of objects are represented by quantities (the object’s properties). Some important quantities and their associated quantity spaces (sets of values) are provided in Table 3.11. Only “changing” quantities are shown in the table and not the physical quantities. Examples of physical quantities are atomic number and period-number. When quantities are coupled with derivatives, quantity space analysis can be performed. For example, at any time the charge of any atom is either negative, neutral or positive. The value of a quantity changes from left to right or from right to left depending on the sign of the change (-1, 0, 1). There should not be any jump or skip of value. In this work, the Quantity Space Analyzer (QSA) is the module that keeps track of all these changes. According to the literature, QPT was developed with continuous changes in mind. When designing the quantity spaces for use with the quantities during reasoning, we slightly modified the standard way of defining quantity spaces. This non-standard usage is due to the specific requirements of the problem domain, which we explain here. Applications of QPT rely on the understanding of physical laws and their mathematical expression in physical and engineering systems (Forbus, 1993). These laws are used to specify criteria for selecting values when composing each variable’s quantity space, expressed as the relevance principle by Forbus (1984), and when combining values of different variables. However, the organic chemistry system we are dealing with has no equivalent physical laws or mathematical formalisms that can be used to specify criteria for composing each variable’s quantity space. As such, we modified the convention used in selecting the quantity spaces to implement our QPT-based system.
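The “no jump or skip” rule for moving through a quantity space can be illustrated with a short sketch. It is not the QSA module itself; the list representation and the `step` function are assumptions made for illustration.

```python
CHARGE_SPACE = ["negative", "neutral", "positive"]


def step(space, current, sign):
    """Move one position through an ordered quantity space.

    sign is the direction of change: -1 (left), 0 (no change) or 1 (right).
    Moving by exactly one position at a time rules out jumps or skips.
    """
    if sign not in (-1, 0, 1):
        raise ValueError("sign of change must be -1, 0 or 1")
    index = space.index(current) + sign
    if not 0 <= index < len(space):
        raise ValueError("change would leave the quantity space")
    return space[index]


print(step(CHARGE_SPACE, "neutral", 1))    # positive
print(step(CHARGE_SPACE, "negative", 1))   # neutral
print(step(CHARGE_SPACE, "positive", 0))   # positive
```

Under this sketch, going from negative to positive would take two separate steps, each of which passes through neutral.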

Qualitative simulation in this work has two levels of stopping conditions. Each small reaction step completes when the updated states of some parameters (quantities) no longer match the entry conditions of the process (e.g. “H+” becomes “H”). In Chapter 4 we will show that the overall simulation of a chemical equation ends when there are no more view instances to be paired up. This situation also implies that the substrate is in its most stable form.

Table 3.11: Examples of quantities and associated quantity spaces.

| Quantity | Quantity Space | Remarks |
|---|---|---|
| charges | [negative, neutral, positive] | At any time the charge of any atom is either negative, neutral or positive. |
| lone-pair-electron | [lessPair, enough, extraPair] | “enough” refers to the lone-pair in stable state. |
| no-of-bond | [lessThan, enough, moreThan] | “enough” refers to the number of covalent bonds in stable state. |
| bond-status | [partial, complete] | Complete or incomplete octet status. A species with an incomplete octet is still unstable and tends to react further. |
| bond-activity | [break-bond, make-bond] | During a process, a bond is either being made or broken. |
| nucleophile-reactivity | [charged, neutral] | A charged species is more reactive than a neutral one. |
| bond-type | [single, double, triple] | Only single bonds are considered at the current stage of work. |
| reactivity | [first-degree, second-degree, third-degree] | Makes provision for checking the carbocation stability. |
| electro-negativity | [low, high] | Index for comparing the electro-negativity level of two species in the same compound. |

3.9 Useful Guidelines in Modelling Views for Organic Reactions

There are some useful tips for the modelling activity, especially for mapping the chemical properties to QPT primitives. For example, in inorganic chemistry, a reaction takes place by dissolving the reactants to produce ions, and these ions have similar chemical properties, such as “an increase in concentration will result in an increase in product formation”. When an organic compound is used as the substrate, however, it is the structure of the compound that determines what reaction mechanism to apply in the synthesis route. Even though it is the functional group that is responsible for a reaction, the mechanism used depends on the carbon centre (in the main chain of a molecule). The primary (1°), secondary (2°), tertiary (3°), or quaternary (4°) nomenclature is used to define a carbon centre. One way of determining whether a carbon centre is primary, secondary, tertiary, or quaternary is to count the number of its bonds that are not to hydrogen (Patrick, 1997a; Patrick, 2000). For instance, a 3° (tertiary) carbon is stable when its functional group (the “OH”) is removed, so the SN1 mechanism can be used to produce an alkyl halide.

Since the structure of a compound is important and there are many structural units in a given substrate, more than one view is required, one for each unit (and not one view per substrate as used in QALSIC), as illustrated in Figure 3.9. Specifically, Figure 3.9 shows alcohol substrates with different degrees of carbon, which therefore exhibit varying reactivity under the SN1 mechanism. For example, in the 3° case, one view should be designed for the OH portion and one more view for the “(CH3)3C–”. This is what we mean by saying that “looking at structures” is needed. This is consistent with the example shown earlier (page 75), in which the tert-butyloxonium ion “(CH3)3COH2+” is modelled as two views: “(CH3)3C−”, the alkyl part of the substrate, and “−OH2+”, the oxonium ion.

    1° (Primary)          2° (Secondary)        3° (Tertiary)
    One carbon            Two carbons           Three carbons
    (Not reactive)                              (Most reactive)

          H                     H                     CH3
          |                     |                     |
    CH3 − C − O−H         CH3 − C − O−H         CH3 − C − O−H
          |                     |                     |
          H                     CH3                   CH3

    View-1   View-2       View-1   View-2       View-1   View-2

Figure 3.9 Alcohol reactivity under SN1 mechanism.
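The degree labels in Figure 3.9 can be illustrated with a short sketch. It assumes, following the “One carbon / Two carbons / Three carbons” labels of the figure, that the degree of the carbon centre is given by the number of carbon atoms attached to it. The function name and the neighbour-list input are hypothetical.

```python
DEGREE_NAMES = {1: "primary", 2: "secondary", 3: "tertiary", 4: "quaternary"}


def carbon_degree(neighbours: list[str]) -> str:
    """Classify a carbon centre from the element symbols of its neighbours."""
    carbons = sum(1 for atom in neighbours if atom == "C")
    if carbons == 0:
        return "methyl"  # no carbon neighbour at all
    return DEGREE_NAMES[carbons]


# Carbon centres of the three alcohols drawn in Figure 3.9.
print(carbon_degree(["C", "H", "H", "O"]))   # primary
print(carbon_degree(["C", "C", "H", "O"]))   # secondary
print(carbon_degree(["C", "C", "C", "O"]))   # tertiary
```

A classification of this kind is what lets the view for the alkyl part carry the information that decides whether SN1 is applicable.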

Three challenges were faced during the early part of the modelling work. First, knowledge abstraction is difficult because chemical commonsense is required. Humans tend to make many assumptions in their reasoning, and the chemical intuition required to suggest reaction mechanisms depends largely on the commonsense knowledge one has. Second, setting the inequality for the quantity-condition is challenging. Unlike other physical systems, the modelling of reaction mechanisms is not a straightforward task, in that it is difficult to write (differential) equations to establish relationships among variables. For example, one can easily relate the equation F = m.a to “net-force $P^{+}_{+}$ mass” and “acc $P^{+}_{-}$ mass”. Likewise, in ecology, “growth-rate $P^{+}_{+}$ recruitment” and “growth-rate $P^{+}_{-}$ mortality” represent the expression growth-rate = recruitment – mortality (Salles et al., 1996; Salles and Bredeweg, 1997). This is also the case in the description of a heat-flow process, where it can easily be identified that there is a difference between the source and the sink temperatures (source-temp > sink-temp or source-temp – sink-temp > zero). However, this type of relationship is not clear in our problem. Nevertheless, we have identified and used the most fundamental aspect of a chemical reaction to trigger the series of steps in a reaction, namely that the reacting species should be in their unstable states, such as incomplete octets (valences have not been completed). Third, QPT was developed with continuous changes in mind. However, this organic chemistry domain is found to be very precise for a qualitative reasoning application from the perspective of the use of qualitative constraints over the variables. As mentioned earlier, the use of qualitative constraints over variables with discrete values is part of our implementation effort to put “chemistry clothing” on the software. Nevertheless, as far as the simulated results are concerned, the organic reaction simulation can still be appropriately handled using QPT.

3.10 Learning with Qualitative Models

Model inspection can help sharpen a learner’s reasoning ability, in that the learner has to think hard about why the statements in each slot (of the model) are relevant or negligible. Note that in the evaluation stage, the students were given a short lecture on the meaning and purpose of each QPT slot. They were then expected to read the model constructed by the simulator prototype. In this section we show how a qualitative model can help articulate ideas about a learning task and improve a learner’s reasoning ability. The model inspection activity is divided into a series of learning tasks. All learning tasks are based on the “protonation” process as illustrated in Figure 3.10 (representing the behaviour of the first reaction step for equation 3.2 on page 68). Readers may refer to Appendix D.8 for a computer-generated QPT model (captured from the model viewer interface).

Process “Protonation” (e.g. ((CH3)3COH) is protonated by H+)

Individuals
    ; electrophile (charged)
    1. H ; hydrogen ion
    ; nucleophile (neutral)
    2. O ; alcohol oxygen has lone-pair electrons

Preconditions
    3. Am[no-of-bond(O)] = TWO
    4. is_reactive((CH3)3COH)
    5. leaving_group(OH, poor)

Quantity-Conditions
    6. Am[lone-pair-electron(O)] >= ONE
    7. charges(H, positive)
    8. electrophile(H, charged)
    9. nucleophile(O, neutral)
    10. charges(O, neutral)

Relations
    11. Ds[charge(H)] = -1
    12. Ds[charge(O)] = 1
    13. lone-pair-electron(O) $P^{+}_{-}$ no-of-bond(O)
    14. charge(O) $P^{-}_{+}$ lone-pair-electron(O)
    15. lone-pair-electron(H) $P$ no-of-bond(H)
    16. charge(H) $P^{+}_{-}$ no-of-bond(H)

Influences
    17. I+ (no-of-bond(O), Am[bond-activity])

Figure 3.10 The QPT process specification that models the behaviour of a “make-bond” process.

3.10.1 Ontology Primitives as Explanation Facilitator

During a reaction simulation, several types of queries may be expected. From the

interview conducted during the domain knowledge acquisition stage, the most popular

questions the students would ask are:

• What are the reacting species (the “individuals” in QPT terms) used in the chemical

process that occurred? Refer to learning task 1 for the answer.

• What type of alcohol (the “views” in QPT terms) is involved? Refer to learning task 1 for the answer.

• What are the chemical facts and properties that remain true even after a chemical process has occurred? Refer to learning task 2 for the answer.

• Will a covalent bond be added or deleted from the compound? Refer to learning

task 3 for the answer.

• What happens to the functional group? Refer to learning task 4 for the answer.

• Why did the process occur? Refer to learning task 5 for the answer.

• Why was the process stopped? Refer to learning task 6 for the answer.

In this section, only the usage of the various modelling constructs of QPT for answering some conceptual questions is explained, while Chapter 4 discusses the approach used in handling questions after the entire simulation has been performed (i.e. the understanding of the entire reaction route). One typical question of the latter kind is: in what sequence are the processes activated, and why?

3.10.2 Learning Activities Manifestation

When the general chemical theories of a reaction are modelled as a QPT process, a

number of learning tasks can be devised, as follows. Note that line numbers are based

on the enumeration used in Figure 3.10 and equation 3.2 ((CH3)3COH + HCl →

(CH3)3CCl + H2O).

• Learning task 1: A proton (H+) and an alcohol oxygen (OH) are needed by the equation 3.2 simulation. Learners would be able to find this by inspecting the “Individuals” slot. Briefly, the slot says that, in order to begin the first step of the equation 3.2 simulation, a proton is needed to serve as an electrophile, together with a species which has a nucleophilic centre. In this case, the nucleophilic centre is the “O” from the “OH” group (termed the alcohol oxygen), which has lone-pair electrons to donate. Line 1 and Line 2 show the existence of the hydrogen ion together with the “OH” functional group from the alcohol, which helps explain why the two substances are required.

• Learning task 2: Lines 3, 4 and 5 collectively say that the number of covalent bonds on “O” is two, “(CH3)3COH” is reactive, and “OH” is a poor leaving group. This is basic information about which chemical properties remain valid throughout a reaction for the substances involved.

• Learning task 3: In Line 6, the inequality (lone-pair-electron >= ONE) for the “O”

(which is the nucleophilic centre of the alcohol substrate) says “there is at least one

lone pair of electrons to be donated to H+”. Lines 7 – 8 indicate that “H” is a charged

species and thus it will act as an electrophile in the reaction. As such a covalent

bond would be added to the compound (the alcohol substrate).

• Learning task 4: When the “make-bond” process begins, the “O” will have an extra

covalent bond while the “H” will be neutralized. Chemistry students can appreciate

such concept by examining the functional dependencies as defined in Lines 13–16.

• Learning task 5: The process occurred because the statements in quantity-

conditions (Lines 6 – 10) are satisfied, which states that “alcohol oxygen with at least

one lone pair of electrons is needed so that the electrons can be donated to the proton

in order to make a bond”.

• Learning task 6: Lines 13–16 show that when the process begins, the “O” will have an extra covalent bond while “H” will be neutralized. When more covalent bonds are made on “O”, its number of lone pair electrons will decrease via the inverse qualitative proportionality. When the lone-pair-electron on “O” decreases, the charge on “O” will increase. These relationships explain how the “O” donates a pair of electrons in order to form a bond. At this point, the quantity-condition has been violated. Therefore, the process is deemed to stop.
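Learning tasks 5 and 6 both rest on checking the quantity-conditions against the current state. The sketch below illustrates that check for three of the conditions in Figure 3.10. The dictionary layout, the `violated` function and the numeric lone-pair counts (two before protonation, one after) are illustrative assumptions rather than the QRiOM data structures.

```python
# Quantity-conditions from lines 6, 7 and 10 of Figure 3.10, each paired
# with a test over a state dictionary (a hypothetical representation).
PROTONATION_QTY_CONDS = {
    "Am[lone-pair-electron(O)] >= ONE": lambda s: s["lone-pair-electron(O)"] >= 1,
    "charges(H, positive)": lambda s: s["charge(H)"] == "positive",
    "charges(O, neutral)": lambda s: s["charge(O)"] == "neutral",
}


def violated(conditions, state):
    """Return the labels of all quantity-conditions that do not hold."""
    return [label for label, holds in conditions.items() if not holds(state)]


before = {"lone-pair-electron(O)": 2, "charge(H)": "positive", "charge(O)": "neutral"}
after = {"lone-pair-electron(O)": 1, "charge(H)": "neutral", "charge(O)": "positive"}

# Nothing is violated before the step, so the process may start (task 5).
print(violated(PROTONATION_QTY_CONDS, before))   # []
# After the bond is made, two conditions fail, so the process stops (task 6).
print(violated(PROTONATION_QTY_CONDS, after))
```

Returning the labels of the failed conditions, rather than a bare true or false, keeps the answer in the vocabulary of the model, which is what an explanation of “why was the process stopped?” needs.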

3.11 Conclusion

This chapter fulfilled four objectives. First, domain knowledge acquisition has been performed. Second, the QPT ontology has been examined and applied to modelling the general behaviour of organic reactions. Third, classification of chemical processes for a variety of organic substrates (as defined in the scope) has been accomplished. Last, the chapter demonstrated the logic behind model automation and its justification, by analyzing many chemical equations to reach a generalization. In particular, this chapter has answered two research questions: (1) “How can QPT be used to model the behaviour of organic reactions?”, and (2) “How can qualitative model construction be automated?” Our investigation determined that two main processes are needed, namely “make-bond” and “break-bond”, for the entire reaction mechanisms, specifically for SN1 and SN2. In the modelling stage, the basic principles of organic reactions were learned, and the mental models of a few domain experts were sought and studied. Then, the representation of the chemical theories in qualitative terms using QPT was attempted. The QPT ontology allows representation of chemical process elements at the finest level of granularity. In the following chapter we shall discuss how these QPT models can be used to support prediction of final products for a pair of reactants, as well as the explanation generation approaches.

Chapter 4 Qualitative Simulation and Explanation Generation

4.1 Introduction

This chapter describes the work done in achieving the following three objectives as

stated in Chapter 1:

• To design the qualitative reasoning algorithm for reaction mechanism simulation.

• To find an easy (yet natural) way of generating explanation effectively, in order to facilitate mastering of organic reaction concepts via QPT-based reasoning.

• To automate causal graph (state graph) generation as a means to explain an organic

process phenomenon.

Qualitative simulation and explanation generation play crucial roles in qualitative reasoning research. Simulation, along with explanation and justification of the simulated result, is the central concern of this work. The organization of this chapter is as follows. Section 4.2 reviews the state of the art in qualitative simulation and explanation in education. Section 4.3 gives a scenario for the simulation of organic reactions through QPT-based reasoning. The qualitative simulation workflow and organic reaction reasoning will first be presented. Then the simulation task will be discussed in a step-by-step manner, from the identification of what chemical process to activate until the production of the most stable outputs. Section 4.4 explains the chemical behaviour of SN1 and SN2 mechanisms. Section 4.5 discusses the specific simulation for reproducing the behaviour of SN1. Section 4.6 presents examples and situations for QPT model reusability. A scenario to support the claim that our models are reusable will also be provided. Qualitative explanation and the approaches used in justifying a simulated result are presented in Section 4.7, through the use of causal graphs for deriving explanation. In particular, we will show how the explanation follows isomorphically from the underlying QPT-based reasoning. The justification is presented in several output forms (formats) using the vocabulary of the QPT formalism. Section 4.8 discusses the simulation results based on QPT reasoning. Section 4.9 concludes the chapter.

4.2 State of the Art in Qualitative Simulation and Explanation in Education

Qualitative simulation allows the modeller to explicitly represent and reason about an

ill-defined dynamic system (with imprecise or partial knowledge) using only an abstract

structural model. From qualitative simulation, a description of all possible qualitatively

distinct behaviours can be derived (Forbus, 1984; de Kleer and Brown, 1984; Kuipers,

1994). These techniques have been used for tasks such as design, monitoring, and

explanation. In the past twenty years, simulation has been an effective method used in

modelling real world processes and objects for analyzing subjects such as behaviour of

a system. One of the earliest examples of the use of qualitative simulation in education

is the Meteorology Tutor (Brown et al., 1973). Brown and his group continued their

work which resulted in the SOPHIE systems (SOPHIE I, II, and III), in which a learner

can perform experiments easily and safely and receive informed feedback for the

troubleshooting of electronic circuits through the artificial lab or reactive learning

environment (Brown and Burton, 1982). In SOPHIE III, a qualitative simulator was

incorporated in an attempt to move towards more humanlike reasoning and explanation

capabilities. A significant feature of the SOPHIE systems was the robust natural

language interface which can handle a broad range of queries from the users. Kuipers

developed QSIM, another qualitative simulation program. The ontology used is

constraint-based (Kuipers, 1986). The approach started with a set of constraints

105

abstracted from a differential equation and proved that the QSIM algorithm is

guaranteed to produce a qualitative behaviour corresponding to any solution to the

original equation. His work also showed that any qualitative simulation algorithm will

sometimes produce spurious qualitative behaviours, that is, answers which do not

correspond to any mechanism satisfying the given constraints. Kuipers (1993) stated

that special care must be taken in designing applications of qualitative causal reasoning

systems and in constructing and validating a knowledge base of mechanism

descriptions.

The Teachable Agents project at Vanderbilt University (Biswas et al., 2001) shows an

example of how qualitative modelling can be useful for students. The work extended

intelligent learning environments with teachable agents to enhance learning. Their

Betty’s Brain system uses qualitative representations expressed in concept maps to

foster learning. Their qualitative modelling framework uses qualitative mathematics,

with tables for composing discrete values to provide qualitative simulation. Basically,

the task they use is to “teach” Betty (software) by building concept maps so that Betty

can produce explanations (Leelawong et al., 2001).

The QPT framework that supports articulate knowledge representation is another

qualitative simulation and explanation example that has been employed in educational

software of various kinds. For example, in the form of the self-explanatory simulation

software which incorporates simulations of liquids cooling and evaporating from cups

of different materials to help students understand the principles of thermodynamics

(Forbus, 1993; Forbus, 1996a) and in the form of the CyclePad Articulate Virtual

Laboratory software which supports learning about thermodynamical engineering by

the design of thermodynamic cycles (Forbus et al., 1999).


Another example of qualitative reasoning work is Garp3 (Bredeweg et al., 2007).

Garp3 is a new workbench that allows modellers to build, simulate and inspect

qualitative models. The system was developed by Bredeweg and his research group

based on a number of earlier software tools (e.g. GARP, Garp2 and VisiGarp). The entire

framework includes a simulation engine with access to model fragments and results via

a command-line interface. Model building tools such as HOMER (Bessa Machado and

Bredeweg, 2002; Bessa Machado and Bredeweg, 2003) and MOBUM (Bessa Machado,

2004; Bessa Machado et al., 2005) are part of the framework. VisiGarp and WiziGarp

are two recent add-ons to the workbench. The former allows users to inspect qualitative

simulation models by interacting with automatically generated visualizations. The work

investigated how explanations of dynamic phenomena can be generated using

qualitative simulations. The potential of aggregation principles to reduce the complexity

of qualitative simulations has also been explored (Bouwer, 2005). The latter, WiziGarp

prototype, incorporates the aggregation mechanisms and expands the communicative

functions of VisiGarp.

Salles and his team have also presented work on qualitative simulation in interactive

learning environments (in the ecology domain), especially on how to find the minimum set of

model fragments needed for simulating the behaviour of a system and to answer

adequately a specific question, and how to provide the system with fault models that

reflect common misconceptions (Salles et al., 1997; Salles and Bredeweg, 1997; Salles

and Bredeweg, 2001; Salles et al., 2006).

Hirashima et al. (1998) introduced their Error-Based Simulation (EBS) method to

visualize an erroneous equation in a mechanical problem. In their work, the EBS-

manager predicts qualitative behaviour of the EBS by using qualitative simulation and


compares it with qualitative behaviour of a normal simulation. When a qualitative

difference is found, the EBS-manager judges that the EBS is effective for error

visualization. The EBS-manager also tries to find parameters by using comparative

analysis of which perturbation causes the qualitative differences between the EBS and

the normal situation. After deriving the sequence of qualitative states based on an

erroneous equation by QSIM (Kuipers, 1994), the EBS-manager derives the sequence

of qualitative directions corresponding to the sequence of qualitative states with

perturbation of a parameter by using DQ-analysis (Weld, 1988).

Explaining dynamic phenomena is particularly hard because it involves describing how

a system changes over time. The expertise required to generate explanation about the

behaviour includes knowledge about the system under study, knowledge of the entities

of interest and how they relate to each other and knowledge of the processes that apply

to the situation and how they change the state of the system. In Laraba (2006),

explanation was viewed as a problem solving process with its own reasoning and

knowledge. Laraba sees the acquisition of a new practice in a contextual graph

as corresponding to the addition of actions and contextual elements justifying the addition of

the action(s), hence providing explanation for why it is done in such a way. In his

work, the explanation module for qualitative simulation is used for justifying any state

transition in the behaviour tree and for explaining why an expected behaviour is

missing. The work has been continued to explore the explanatory tool that makes use of

contextual knowledge and of contextual graphs for modeling agent activity. With

contextual knowledge and graphs, the generation of user-based and real-time

explanations is possible (Laraba and Brezillon, 2009).


According to Valley (1992), there are two types of explanation, system-based and

domain-based. The former describes what has happened during a consultation, for

example, which rules have been fired and which facts have been deduced. To generate

this kind of explanation, a trace of the consultation must be kept. Domain-based

explanations contain information about the domain knowledge and justify system-based

explanations. Our system supports both types of explanation. To achieve system-

based explanation, several data structures are maintained (presented in Chapter 5), so

that it can be retrieved, translated into a user-friendly layout and then presented to the

user. To achieve domain-based explanations, the domain knowledge is explicitly

represented via the QPT process notion. Therefore, QRiOM can explain not only the

reaction steps that occurred but also the reasons for following these steps.
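The distinction between the two explanation types can be illustrated with a minimal sketch. It assumes a simple trace record per activated process; the names `TraceEntry` and `ExplanationTrace` are hypothetical and are not the data structures of QRiOM, which are presented in Chapter 5.

```python
from dataclasses import dataclass, field


@dataclass
class TraceEntry:
    step: int       # position of the process in the reaction route
    process: str    # e.g. "make-bond" (what happened)
    reason: str     # domain knowledge justifying the step (why it happened)


@dataclass
class ExplanationTrace:
    entries: list = field(default_factory=list)

    def record(self, process, reason):
        """Keep a trace of each process as it is activated."""
        self.entries.append(TraceEntry(len(self.entries) + 1, process, reason))

    def system_based(self):
        """Report what happened: the processes, in order."""
        return [f"Step {e.step}: {e.process}" for e in self.entries]

    def domain_based(self):
        """Justify each step with the recorded domain knowledge."""
        return [f"Step {e.step}: {e.process} because {e.reason}"
                for e in self.entries]
```

In this sketch the system-based view reads only the order of activation, while the domain-based view adds the stored justification to the same entries.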

The work described in this thesis combines the strengths of these various approaches,

including the use of simulations for education, the use of qualitative models to simulate

system behaviour and generate causal explanations. As far as qualitative simulation is

concerned, our framework supports four main tasks, as follows:

• Modelling (Chapter 3): automatic modelling of domain knowledge (into QPT

models) to simulate organic reactions.

• Reasoning and simulation (this chapter): apply reasoning algorithms to the

constructed models to reproduce the behaviour of “make-bond” and “break-bond”

organic reactions.

• Predicting (this chapter): make prediction of the most probable outcomes (final

products).

• Explaining (this chapter): provide explanation and justification for a simulated result

by using the proposed organic mechanism.


Figure 4.1 relates the three common terms (reasoning, simulation and explanation)

described in this chapter.

[Figure 4.1 diagram. Elements: Causal Reasoning (the study of the cause-effect interaction among parameters); Reasoning about the behaviour of the model, with three listed items: (1) causal changes (that stem from QPT process reasoning), (2) qualitative states of all parameters, (3) a piece of “history” of processes that occurred in a simulation; Behaviour Prediction; Explanation.]

Figure 4.1 The use of qualitative reasoning, simulation and explanation within the context of this work.

4.3 Qualitative Simulation Scenario

There are many cognitive steps leading from a chemical reaction to a chemical solution.

Understanding these steps is among the many difficulties chemistry students face: they

often lack the skills to analyse the steps and to translate the reactions into

forms that can be used to predict the final product in reasonable and justifiable ways.

This section will demonstrate how this problem can be addressed by the qualitative

reasoning approach.

Modelling organic processes for reproducing the behaviour of SN1 and SN2 requires the

inclusion of the equilibrium phenomena. In our context, equilibrium is achieved when

all of the reacting species reach the so-called “complete valence” state. This also

serves as the stopping condition. On the other hand, if valency is incomplete, reasoning

on the next process will proceed.
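The stopping condition can be sketched as follows. This is a minimal illustration, assuming that each reacting centre is described by its atom type, its current number of covalent bonds and its charge; the dictionary `MAX_BONDS` and both function names are hypothetical stand-ins for facts held in the knowledge base.

```python
# Assumed valency facts (maximum number of covalent bonds per atom type).
MAX_BONDS = {"H": 1, "O": 2, "C": 4, "Cl": 1}


def valence_complete(species):
    """A centre is 'complete' when it is neutral and has used all its bonds."""
    return (species["charge"] == 0
            and species["bonds"] == MAX_BONDS[species["atom"]])


def reaction_finished(all_species):
    """Stopping condition: every reacting species has a complete valence."""
    return all(valence_complete(s) for s in all_species)
```

For example, an oxonium oxygen with three bonds and a positive charge fails the check, so reasoning would continue with the next process.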

4.3.1 An Overview of the Simulation Architecture

Figure 4.2 depicts the workflow of the QPT-based reasoning for reproducing the

behaviour of organic reactions as well as the two mechanisms (SN1 and SN2). The

detailed version of the qualitative reasoning framework is presented as a collection of

flowcharts in Appendix B. In our approach, the construction of the qualitative model is

automated by a simple pair of substrate and reagent (the inputs). The key is to

recognize the name of the functional group that is attached to the input substrate. When

the “input recognizer” identifies the type of the inputs, the nucleophilic and

electrophilic centres will be known. The identities can then be used for determining the

chemical bonding activity (add or delete a bond). After that, a candidate organic

process will be selected and activated. During reasoning, some intermediates are

produced (and some are converted to other molecules) and they are placed in the View

Instance Structure (VIS). So long as there are view pairs left in the VIS, a new process

will be initiated and the reasoning process is repeated. When the entire reaction has

ended, users may ask for an explanation on any aspect of the organic reactions.

The workflow of the QPT-based reasoning can be summarized as follows: Given a

reaction in the form “A (substrate) + B (reagent)”, the nucleophilic and electrophilic

centres will be identified resulting in the nucleophile(s) and electrophile(s) to be stored

in the VIS. These electrophiles/nucleophiles are called “individuals” in QPT. After

this, a suitable process is determined by the view-pair concept. A candidate process is


the one that satisfies quantity-conditions and pre-conditions. Besides, individuals that

are needed by the process must also be available in the VIS. When a process is being

activated, qualitative reasoning will begin. The reasoning engine will keep track of the

values of the changing states of the quantities being affected, starting from the first

process until the entire reaction ends. A process will stop when the statements in its

quantity-condition slot are invalid. When a process has occurred, some individuals

may cease (become new individuals) and updating of the VIS is necessary. If the

process produces a stable product, it will be stored separately for future retrieval. If

there are still reactive units (charged species or species that still have not completed

their valences), the reasoning process will be repeated, acting on other process

instances. The entire reaction will end when there are no more views to be paired up.
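The workflow summarized above can be sketched as a simple loop. The sketch below is illustrative only: the VIS is assumed to be a plain list of species labels, and `select_process`, `protonate`, `dissociate`, `capture` and `run_reaction` are hypothetical names with hard-coded toy rules for an alcohol "R-OH" reacting with HCl. The actual reasoning algorithm is presented in Chapter 5.

```python
def select_process(vis):
    """Pick the next process from the views left in the VIS (toy rules)."""
    if "H+" in vis and "R-OH" in vis:
        return "make-bond", protonate
    if "R-OH2+" in vis:
        return "break-bond", dissociate
    if "R+" in vis and "Cl-" in vis:
        return "make-bond", capture
    return None  # no more view pairs: the reaction has ended


def protonate(vis):
    rest = [v for v in vis if v not in ("H+", "R-OH")]
    return rest + ["R-OH2+"], []          # new intermediate, no side product


def dissociate(vis):
    rest = [v for v in vis if v != "R-OH2+"]
    return rest + ["R+"], ["H2O"]         # carbocation stays, water is set aside


def capture(vis):
    rest = [v for v in vis if v not in ("R+", "Cl-")]
    return rest + ["R-Cl"], []            # neutral final product


def run_reaction(vis, max_steps=10):
    """Repeat process selection and reasoning until no view pair is left."""
    side_products, history = [], []
    for _ in range(max_steps):
        choice = select_process(vis)
        if choice is None:
            break
        name, apply_process = choice
        vis, produced = apply_process(vis)  # update the VIS after the process
        side_products += produced
        history.append(name)                # record the process for explanation
    return vis, side_products, history
```

Calling `run_reaction(["R-OH", "H+", "Cl-"])` walks through the three processes in turn and stops when no further view pair can be formed; the recorded history is what a later explanation step would draw on.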

When a reaction has ended, outputs will be displayed, together with all the

steps/processes that occurred to produce the outcomes. This task is rather

straightforward since all the processes that occurred are recorded and the changes made

to each individual (described in terms of the functional dependency among quantities)

are also recorded. If a user needs an explanation for the results or has a question

regarding the behaviour of a quantity, then the explanation module will be executed.

Using this architecture, for each reaction, four main outputs can be obtained. These are:

(1) the final products, (2) the steps/processes that occurred, (3) the mechanism used, and

(4) the various forms of explanation to justify the simulated results (e.g. the reaction

route to produce the outcome).


[Figure 4.2 flowchart. Elements: Start; Select a pair of <substrate, reagent>; Recognize reacting units in the substrate and reagent, and construct view pairs; decision "Any more view pair?" (yes/no); Suggest an organic process based on the view pairs; Construct QPT model based on the suggested chemical process; Process Reasoning (Qualitative Simulation); Write final product to special purpose array; Display the final product and the mechanism used; Generate explanation; Stop. Region labels: Module 1, Module 2. Annotation: Explanations are generated based on the chemical theories represented in the QPT’s design primitives such as the direct (I+/I-) and indirect (Ps) influences.]

Figure 4.2 Workflow of the QPT-based reasoning.

This chapter discusses module 1 and module 2 of Figure 4.2. Since QPT does not

describe how the constructed models are used, we have to design the “how” in our

reasoning algorithm (presented in Chapter 5) for running the qualitative simulator. The

simulation of a chemical equation is accomplished by “reasoning” about the behaviour

of the models constructed for the generic processes. As such, the application of QPT

reasoning to one specific organic process will first be introduced (Section 4.3.2) then


moved on to describe the complete reasoning route for one specific chemical equation

simulation (Section 4.3.3).

4.3.2 Reproducing Behaviour of Organic Reactions via QPT Reasoning

A QPT model (as shown in Figure 4.3) is used to describe a specific reasoning task of a

reaction that adds a covalent bond between a nucleophile and an electrophile (so, it is a

“make-bond” process). In the figure, you may read the right column as “If (A and B)

Then (C and D)”. In this case, C and D are qualitatively reasoned.

This process occurs when the individuals (a nucleophile and an electrophile) are present

in the VIS. It is the candidate process because the statements in quantity-conditions are

satisfied (Lines 3 – 7), which say “the process needs a proton and alcohol oxygen with

at least one lone pair of electrons to be donated to the proton in order to make a bond”.

The notion of “processes” defined in (Forbus, 1984) is used, in which “processes” are

the main causes of change in a chemical system. We represent chemical changes as

starting from the direct influence which then propagates via indirect influences.

Influences contain statements that specify what can cause a quantity to change, through

direct influences imposed by the process (label C). As the process occurs, bond-activity

is a direct influence’s process quantity and it has a positive influence (I+) on the no-of-

bond, which is defined as two direct influence statements using the “I+/I-” notation of

the QPT, as shown in Line 8 and Line 9. Other propagation of effect is defined in

Relation-slot (label D). It is propagated via a set of qualitative proportionalities defined

in the QPT process model. In this case, the number of covalent bonds the “O”

possesses is directly influenced by the process. The oxygen’s lone pair electrons will

decrease when more covalent bonds are made on the “O” via the inverse qualitative


proportionality defined in Line 12. Decreasing lone-pair electrons on the “O” will cause

an increase in its charge (Line 13). This will make the “O” a positively charged species

hence it is unstable. When the “O” is protonated, it is no longer neutral (explained in

Line 13) thus violating the statement in the quantity-conditions slot. As presented in

Chapter 3, these are the standard set of changes caused by a “make-bond” process for

the pair of <neutral nucleophile, charged electrophile> views. But how will this change

affect the re-evaluation of the quantity conditions? The Quantity Space Analyzer (QSA)

will update the initial values of the affected quantities. If the entry conditions (as

defined in B) are violated, then this process will stop. The design of the algorithm for

the QSA module will be discussed in Chapter 5.

Process Slots and modelling constructs in QPT

A. Individuals
  1. H ; represents hydrogen
  2. O ; represents the alcohol oxygen

B. Quantity-Conditions
  3. Am[lone-pair-electron(O)] >= ONE
  4. charges(H, positive)
  5. electrophile(H, charged)
  6. nucleophile(O, neutral)
  7. charges(O, neutral)

C. Direct Influences
  8. I+ (no-of-bond(O), Am[bond-activity])
  9. I+ (no-of-bond(H), Am[bond-activity])

D. Relations
  10. DS[charge(H)] = -1 ; decreasing sign
  11. DS[charge(O)] = 1 ; increasing sign
  12. lone-pair-electron(O) P(+/−) no-of-bond(O)
  13. charge(O) P(−/+) lone-pair-electron(O)
  14. lone-pair-electron(H) P no-of-bond(H)
  15. charge(H) P(+/−) no-of-bond(H)

Figure 4.3 A “make-bond” model fragment represented using QPT. This model fragment is used to reproduce the behaviour of the first reaction step for “(CH3)3C–OH + HCl” reaction.
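The propagation described for this model fragment (a direct influence followed by indirect influences through qualitative proportionalities) can be sketched with signs only. This is an illustrative sketch, not the QSA algorithm of Chapter 5: `DIRECT`, `PROPORTIONALITIES` and `propagate` are hypothetical names, +1 stands for "increasing", -1 for "decreasing", and each proportionality is encoded as inverse, following the textual description of Lines 12, 13 and 15. Line 14 carries no sign in the figure and is left out.

```python
# Direct influences of the process quantity (bond-activity): I+ on both bonds.
DIRECT = {"no-of-bond(O)": +1, "no-of-bond(H)": +1}

# Indirect influences as (dependent quantity, sign, source quantity);
# sign -1 means the dependent quantity moves opposite to its source.
PROPORTIONALITIES = [
    ("lone-pair-electron(O)", -1, "no-of-bond(O)"),   # Line 12
    ("charge(O)", -1, "lone-pair-electron(O)"),       # Line 13
    ("charge(H)", -1, "no-of-bond(H)"),               # Line 15
]


def propagate(direct, proportionalities):
    """Propagate directions of change from direct to indirect influences."""
    change = dict(direct)
    updated = True
    while updated:                      # repeat until nothing new is derived
        updated = False
        for dependent, sign, source in proportionalities:
            if source in change and dependent not in change:
                change[dependent] = sign * change[source]
                updated = True
    return change
```

Under these assumptions the derived directions agree with the DS entries of the fragment: the charge on "O" increases and the charge on "H" decreases.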


So far, “where” the reaction might occur can be predicted but not what sort of reaction

will occur. A “mechanism” is the story of how a reaction takes place. It tells us how

the starting materials (organic substrates) and reagents react together to give the final

product (Patrick, 1997b).

4.4 Chemical Behaviour of SN1 and SN2 Mechanisms

Organic mechanisms can be used as a means to facilitate the mastering and

understanding of the fundamental principles of organic chemistry. This work looks into

one particular type of organic mechanism called nucleophilic substitution.

Nucleophilic substitution of an alkyl halide involves the substitution of the halogen

atom with a different nucleophile. The halogen is lost as a halide ion. The presence of

a strongly electrophilic carbon centre makes alkyl halides susceptible to nucleophilic

attack whereby a nucleophile displaces the halogen as a nucleophilic halide ion.

Equation 4.1 will be used to explain the behaviour of nucleophilic substitution.

RX + Nu− → RNu + X− (4.1)

In the transition state for this process, the new bond from the incoming nucleophile is

partially formed and the C−X bond is partially broken. The hydroxide ion (OH−) is a

nucleophile and uses one of its lone pair of electrons to form a new bond to the

electrophilic carbon of the alkyl halide. At the same time the C−X bond breaks. Both

the electrons in that bond move onto the halogen to give it a fourth lone pair of electrons

and a negative charge. Since the halide is electronegative, this charge can be stabilized,

thus the overall process is favoured. There are two types of mechanism for alkyl

halides – SN1 and SN2. The two mechanisms are explained in the following subsections.

Initially this work only investigated SN1. Later on, however, we found that the


automated processes can be used to also support the simulation of SN2 – hence model

reusability is achieved.

4.4.1 The SN1 Mechanism

Let us examine the reaction of the hydroxide ion (OH−) with a tertiary halide (Figure

4.4). The hydroxide ion is obtained from NaOH when it dissociates. This reaction

gives an alcohol product. Nucleophilic substitution has taken place and the rate of

reaction depends only on the concentration of the alkyl halide. Since the reaction rate

only depends on the concentration of the alkyl halide, the mechanism is known as the

SN1 reaction.

(CH3)3C−Br → (CH3)3C−OH (reagent over the arrow: NaOH)

Figure 4.4 Reaction between the hydroxide ion (OH−) and a tertiary halide.

SN1 is a two stage mechanism. In the above reaction, in stage 1, the C−Br bond breaks

first with both electrons moving onto the bromine. The carbocation is stabilized by the

electron-donating effect of the three alkyl groups as well as hyperconjugation. This is

the rate determining step. The production of carbocation as an intermediate is unique to

SN1. The remaining alkyl portion becomes a planar carbocation. Steric strain is also

relieved. In stage 2, the hydroxide oxygen has a negative charge and is therefore a

nucleophilic centre. A lone pair of electrons is used to form a bond to the electrophilic

centre of the carbocation. The tetrahedral centre is reformed.


Due to the steric bulk of the alkyl substituents, it is very difficult for an attacking

nucleophile to reach the electrophilic carbon centre of a tertiary alkyl halide. These

compounds should therefore not react with nucleophiles. However, in practice they do. How

then could the carbon centre become more accessible? The crowding arises because the carbon

centre is tetrahedral. If, however, the C−Br bond is broken, a flat carbocation is obtained.

When the C−Br breaks, both electrons in the bond move onto the bromine atom to give

a fourth lone pair of electrons on bromine. Therefore, bromine gains a negative charge.

However, bonds do not normally break without reason. There are a few driving forces

which make the process possible. One of these is steric. The other reason is electronic.

There are two electronic effects which are involved in stabilizing the intermediate

carbocation. One of these is the inductive effect. Alkyl groups can have an inductive

effect whereby they “push” electrons to a neighbouring centre. This electron donating

effect could be seen as encouraging the departure of the halide ion. More importantly,

the electron donating effect of the alkyl groups helps to stabilize the carbocation since

the inductive effect reduces and hence stabilizes the positive charge. An advantage is

that the ion is more stable when it is planar since all three alkyl groups are as far apart

from each other as possible. The carbon atom is now sp2 with an empty 2py orbital. In

the carbocation, the alkyl groups are further apart and there is less steric strain as a result. The

order of reactivity of alkyl halides under SN1 is that tertiary alkyl halides are more

reactive than secondary alkyl halides, with primary alkyl halides not reacting at all.

The stability of a carbocation is an important factor for reproducing the behaviour of the

SN1 mechanism. Our choice of including “stability” as an essential chemical parameter

in the KB is based on the conditions presented in Figure 4.5. Obviously, this


information is needed in order to make a decision whether to break a bond thus

allowing a structural unit to leave the compound.

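Increasing carbocation stability (left to right):

Primary (1°) carbocation: C−C+ with H and H on the charged carbon
Secondary (2°) carbocation: C−C+ with H and C on the charged carbon
Tertiary (3°) carbocation: C−C+ with C and C on the charged carbon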

Figure 4.5 The stability of various structures of a carbocation under SN1 mechanism.
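The ordering in Figure 4.5 can be encoded as a small lookup. The sketch below assumes that a carbocation is described by the three atoms attached to its charged carbon; `STABILITY_ORDER`, `carbocation_class` and `more_stable` are hypothetical names, and only the three cases shown in the figure are covered.

```python
# Classes in order of increasing stability, as in Figure 4.5.
STABILITY_ORDER = ["primary", "secondary", "tertiary"]


def carbocation_class(substituents):
    """Classify a carbocation by the number of carbons on the charged carbon."""
    carbons = sum(1 for atom in substituents if atom == "C")
    if not 1 <= carbons <= 3:
        raise ValueError("only primary, secondary and tertiary cases are covered")
    return STABILITY_ORDER[carbons - 1]


def more_stable(class_a, class_b):
    """Return whichever class sits later in the stability ordering."""
    return max(class_a, class_b, key=STABILITY_ORDER.index)
```

A decision on whether to break a bond could then compare the class of the carbocation that would result against this ordering.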

In order to make things clearer, an organic chemistry reaction is explained from the

perspective of organic mechanism. Our illustration is based on the following reaction:

“ROH + HX → RX + H2O”. One of the specific equations derived from the general

equation is: “(CH3)3C−OH + HCl → (CH3)3C−Cl + H2O”. The mechanism needs to be

the one that can explain a reaction which involves a hydroxyl (OH) functional group

transformation. To chemists, when a tertiary alcohol is used as the substrate, and

hydrogen halide as the reagent to produce alkyl halide, the SN1 mechanism is

necessitated. In summary, if the following evidence is observed, then the mechanism

for “(CH3)3C−OH + HCl → (CH3)3C−Cl + H2O” is SN1:

• There is a tertiary alcohol, where the functional group (i.e. the “OH”) is a poor

leaving group.

• There is a halide ion (X−) that acts as the nucleophile. HX dissociates to form H+

and X− since strong mineral acids (HCl, HBr, HNO3, etc.) are fully ionized in

solution.

• Protonation is required, i.e. a “make-bond” process is first needed.


• A “break-bond” process will follow, in order to break the bond between the carbon

atom (in the long chain) and the functional group. This is to make room for the

substitution of the nucleophile that is attached to the initial compound. This step will

produce a stable carbocation.

• A final “make-bond” process is needed to complete the entire reaction since the

nucleophile in the organic compound (substrate) has been substituted and a stable

output is produced.
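The evidence listed above can be read as a rule that proposes a sequence of processes. The following is a minimal sketch under stated assumptions: the substrate is summarized by its class and functional group, the reagent by its formula, and `HYDROGEN_HALIDES` and `sn1_plan` are hypothetical names rather than part of the chemical KB.

```python
# Hydrogen halides named in the text as fully ionized acids (assumed subset).
HYDROGEN_HALIDES = {"HCl", "HBr"}


def sn1_plan(substrate_class, functional_group, reagent):
    """Return the expected process sequence if the SN1 evidence is present."""
    if (substrate_class == "tertiary"
            and functional_group == "OH"       # poor leaving group: protonate first
            and reagent in HYDROGEN_HALIDES):  # supplies H+ and the halide ion
        return ["make-bond", "break-bond", "make-bond"]
    return None  # evidence for SN1 not observed
```

For the tertiary alcohol and HCl discussed here, the sketch returns the three-step sequence of “make-bond”, “break-bond” and “make-bond”.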

Based on the above facts, the thought processes underlying the general approach used in

writing mechanisms can be explicitly stated. For example, the organic mechanism for

the reaction in Figure 4.6 can be illustrated as shown in Figure 4.7. The three steps in

Figure 4.7 are modelled as QPT processes (as discussed in Chapter 3).

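[Reaction scheme (skeletal structures not recoverable): alcohol (−OH) → alkyl chloride (−Cl) + H2O, with HCl over the arrow.]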

Figure 4.6 A reaction that needs SN1.

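[Reaction scheme in three steps. Step 1: H+ bonds to a lone pair on the alcohol oxygen. Step 2: the protonated group leaves, giving the 3° carbocation. Step 3: Cl− bonds to the carbocation, giving the alkyl chloride + H2O. Species labels in the original: (Lewis base), (3° Carbocation), (Lewis acid).]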

Figure 4.7 The organic processes occurred in the order of “make-bond” (Step 1), “break-bond” (Step 2) and “make-bond” (Step 3). The reaction can be explained by the SN1 mechanism.

4.4.2 The SN2 Mechanism

The order of reactivity for alkyl halides under SN2 changes dramatically from that

observed in the SN1 reaction, such that primary and secondary alkyl halides can undergo

the SN2 mechanism, but tertiary halides can react only very slowly. Primary alkyl

halides undergo the SN2 reaction faster than secondary alkyl halides. Reaction between

methyl bromide and the hydroxide ion (HO−) is a simple example of SN2 reaction

involving a primary alkyl halide, as shown in equation 4.2. The hydroxide ion is

nucleophilic because it has negative charge. The negative charge is on the oxygen, so

this is the nucleophilic centre. There is a concerted process where the incoming

nucleophile forms a bond to the reaction centre at the same time as the C−Br bond is

broken. The transition state involves the incoming nucleophile approaching from one

side of the molecule and the outgoing halide departing from the other side.


Br−CH3 + HO− → HO−CH3 + Br− (4.2)

Factors affecting SN1 versus SN2 are: (1) solvent, (2) nucleophilicity, and (3) leaving

group. These three factors are considered in the design of the OntoRM ontology

(Chapter 5).
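The reactivity orders stated for the two mechanisms can be encoded as ordinal ranks. This sketch considers substrate class only; it deliberately ignores the three factors above (solvent, nucleophilicity and leaving group), and the names `REACTIVITY` and `favoured_mechanisms` as well as the rank values are assumptions made for illustration.

```python
# Ordinal ranks (higher = more reactive) taken from the orders in the text:
# SN1: tertiary > secondary > primary; SN2: primary > secondary > tertiary.
REACTIVITY = {
    "SN1": {"primary": 0, "secondary": 1, "tertiary": 2},
    "SN2": {"primary": 2, "secondary": 1, "tertiary": 0},
}


def favoured_mechanisms(substrate_class):
    """Return the mechanism(s) with the highest rank for this substrate class."""
    ranks = {m: table[substrate_class] for m, table in REACTIVITY.items()}
    best = max(ranks.values())
    return sorted(m for m, rank in ranks.items() if rank == best)
```

A secondary substrate returns both mechanisms, which is where the additional factors would be needed to decide.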

4.5 Simulation Scenario for Reproducing the Behaviour of SN1

The organic mechanism tells us how bonds are formed and broken and in what order

things happen. How do chemists interpret the chemical equation: “(CH3)3COH + HCl

→ (CH3)3CCl + H2O”? They will propose the SN1 mechanism to explain the production

of the alkyl halide (i.e. (CH3)3CCl). In the first step, the alcohol oxygen (the

“O” from the “OH” group) is protonated, that is, the “O” captures a proton (H+).

This reaction produces the OH2+ which is a good leaving group. The next step is the

cleavage of the link between “C” and “OH2+”. Once the link is broken, a stable tertiary

carbocation intermediate is produced. The production of this intermediate is unique to

SN1 for an alcohol as the starting material. The cleavage occurs because the “O” in

“(CH3)3C−OH2+” is unstable since it has three covalent bonds (the valency of oxygen is

two). At the end of this step, it will produce a water molecule (H2O). In the last step,

the incoming nucleophile (Cl−) can bond to the carbocation to form a neutral and stable

final product. The reaction step occurs since the two views are reactive (charged

species).

Based on the behaviour of SN1 presented in Section 4.4.1 and the way chemists interpret

the chemical equation, the overall simulation works as follows. Initially, there are three

reacting species: a proton (charged electrophile), the chlorine ion (charged nucleophile)

and the alcohol substrate (neutral nucleophile). Based on the <charged electrophile,


neutral nucleophile> view pair, the “make-bond” process (Figure 4.3) is activated in

order to simulate the chemical behaviour of the first reaction step for the chemical

equation “(CH3)3COH + HCl→ (CH3)3CCl + H2O”.

• Step 1: The simulation scenario for this reaction step has been presented in Section

4.3.2. The new quantity created by this “make-bond” process is the oxonium ion

(−OH2+) and it will be placed into the VIS. All values assigned to each individual

are retrieved from the quantity spaces that keep track of the current values of each

quantity and the direction of change.

• Step 2: At this point, the oxonium ion (Line 1) required by the dissociation process

(Figure 4.8) is available in the VIS. The simulation thus continues by switching to

the second reaction step. Note that only the nucleophilic centre (“O”) is shown

rather than the entire structural unit (the “−OH2+”). The two reacting units come

from the same compound. This process describes the cleavage of the carbon-oxygen

bond in tert-butyloxonium ion ((CH3)3C−OH2+) which is unstable since the “O” is

now charged and has three covalent bonds. Changes that propagate via qualitative

proportionalities are as follows: The acceptance of two electrons from the

dissociation process will neutralize the “O” in “OH2+” (Line 7 and Line 8), hence

the water molecule is formed (and it will be kept in the side product array). At this

point, both the conditions in quantity-conditions slot are no longer valid. On the

other hand, the deletion of a covalent bond between “C” and “O” will affect its

charge, in that “C” is now positively charged (Line 10).

• Step 3: The carbocation ((CH3)3C+) and the chlorine ion (Cl−) are now left in the

VIS and happen to be the two individuals that are required to activate the generic

“make-bond” process. The task specific name for this bond making process is

called “Capturing of carbocation by anion”. In this third step, the individuals are


“Cl−” (charged nucleophile) and “C+” (charged electrophile). The functional

dependencies defined in the Relations slot of Figure 4.3 can be reused (since both

processes are “make-bond”). Only the contents in the “Individuals” slot need

modification, where the two new individuals for this process are “C+” and “Cl−”

(refer to Figure 4.9). The start of this process can be explained by the incomplete

octets of a carbocation and a chloride ion and it stops due to the production of the most

stable species where both ions complete their valences. Lines 9–12 in Figure 4.9

describe the following scenario: The increasing covalent bond on “Cl−” (Line 9)

propagates its effect to bring about the reduction of the number of its lone pair

electrons and further affecting the changing sign of its charge (from negative to

neutral, i.e. it is increasing). As for the other reacting species, the process’s quantity

(i.e. bond-activity) necessitates an increase in the number of covalent bonds on “C”.

Here, “C” has regained its maximum bonds when its value is checked against the

simple facts stored in the knowledge base. The entire simulation ends here, as

chlorine and carbon atoms are both in neutral state.


Process Slots and modelling constructs in QPT

A. Individuals
  1. O+ (oxonium ion – functions like a delta minus)
  2. C (carbon – functions like a delta plus)

B. Quantity-Conditions
  3. Am[no-of-bond(O)] > Am[max-bond-allowed(O)]
  4. charges(O, positive)

C. Direct Influences
  5. I- (no-of-bond(O), Am[bond-activity])
  6. I- (no-of-bond(C), Am[bond-activity])

D. Relations
  7. lone-pair-electron(O) P(−/+) no-of-bond(O)
  8. charge(O) P(+/−) lone-pair-electron(O)
  9. lone-pair-electron(C) P no-of-bond(C)
  10. charge(C) P(−/+) no-of-bond(C)

Figure 4.8 A “break-bond” model fragment represented using QPT. This model fragment is used to reproduce the behaviour of the second step of “(CH3)3COH + HCl”.
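The role of the quantity-conditions in starting and stopping this process can be sketched as a predicate over the current values. The sketch is illustrative: `MAX_BOND_ALLOWED` and `break_bond_active` are hypothetical names, and the charge is assumed to be held as a qualitative label.

```python
# Valency of oxygen as stated in the text (maximum covalent bonds allowed).
MAX_BOND_ALLOWED = {"O": 2}


def break_bond_active(no_of_bond_o, charge_o):
    """Quantity-conditions of the fragment (Lines 3 and 4) must both hold."""
    return (no_of_bond_o > MAX_BOND_ALLOWED["O"]   # Line 3
            and charge_o == "positive")            # Line 4
```

Before dissociation the oxonium oxygen has three bonds and a positive charge, so the process is active; once the bond is deleted and the oxygen is neutral, both conditions fail and the process stops.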



Process Slots and modelling constructs in QPT

A. Individuals
  1. Cl− (the incoming charged nucleophile)
  2. C+ (the charged electrophile)

B. Quantity-Conditions
  3. Am[no-of-bond(C)] < Am[max-bond-allowed(C)]
  4. Am[lone-pair-electron(Cl)] > Am[max-lone-pair-electron(Cl)]
  5. charges(C, positive)
  6. charges(Cl, negative)

C. Direct Influences
  7. I+ (no-of-bond(Cl), Am[bond-activity])
  8. I+ (no-of-bond(C), Am[bond-activity])

D. Relations
  9. lone-pair-electron(Cl) P(+/−) no-of-bond(Cl)
  10. charge(Cl) P(−/+) lone-pair-electron(Cl)
  11. lone-pair-electron(C) P no-of-bond(C)
  12. charge(C) P(+/−) no-of-bond(C)

Figure 4.9 A “make-bond” model fragment represented using QPT. This model fragment is used to reproduce the behaviour of the third step of “(CH3)3COH + HCl”.
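To make the structure of such a model fragment concrete, the sketch below encodes Figure 4.9 as a plain data record. It is only an illustration: the class and field names (`ModelFragment`, `Influence`, `Proportionality`) are hypothetical and are not taken from QRiOM, and the signs of the proportionalities are read from the prose description of Lines 9–12 (more bonds on “Cl−” means fewer lone pairs, and fewer lone pairs means a higher charge).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Influence:
    """Direct influence of the process quantity on one quantity."""
    quantity: str
    sign: int  # +1 for I+, -1 for I-


@dataclass(frozen=True)
class Proportionality:
    """Qualitative proportionality: `effect` changes when `cause` changes."""
    effect: str
    cause: str
    sign: int  # -1 means the effect moves opposite to the cause


@dataclass
class ModelFragment:
    name: str
    individuals: list
    quantity_conditions: list
    influences: list = field(default_factory=list)
    relations: list = field(default_factory=list)


# Hypothetical encoding of Figure 4.9. Relation 11 carries no sign in the
# figure, so it is left out of this sketch.
make_bond = ModelFragment(
    name="make-bond",
    individuals=["Cl-", "C+"],
    quantity_conditions=[
        "Am[no-of-bond(C)] < Am[max-bond-allowed(C)]",
        "Am[lone-pair-electron(Cl)] > Am[max-lone-pair-electron(Cl)]",
        "charges(C, positive)",
        "charges(Cl, negative)",
    ],
    influences=[
        Influence("no-of-bond(Cl)", +1),
        Influence("no-of-bond(C)", +1),
    ],
    relations=[
        Proportionality("lone-pair-electron(Cl)", "no-of-bond(Cl)", -1),
        Proportionality("charge(Cl)", "lone-pair-electron(Cl)", -1),
        Proportionality("charge(C)", "no-of-bond(C)", -1),
    ],
)
```

Keeping the four QPT slots (individuals, quantity-conditions, direct influences, relations) as separate fields mirrors the layout of the figure, so the same record type could hold the “break-bond” fragment of Figure 4.8 with the influence signs flipped.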

Now, the VIS is left with one species and the entire reaction is completed. The final

products are alkyl chloride ((CH3)3CCl) and water (H2O) which are very stable. The

sequence of process activation is “Protonation, Dissociation, followed by Capturing of

carbocation by chloride ion”. These three steps (reaction route) can be used to explain

the overall chemical change that occurred. At the end of the simulation, the sequence of

processes that occurred is checked against the chemical KB, which shows that it is in fact the

SN1 mechanism that makes the production of the final output possible. With this, the reaction

“(CH3)3COH + HCl → (CH3)3CCl + H2O” is successfully simulated following the

behaviour of the SN1 mechanism.

4.5.1 Contents of the View Instance Structure (VIS) During Reasoning

During qualitative reasoning, the structures of the reacting species are updated and

recorded in VIS. Figure 4.10 depicts the running contents of the VIS during simulation


of equation 3.2 while Figure 4.11 gives the contents in the VIS during the simulation of

equation 3.3. When a simulation is completed, the contents of the structure are

retrieved and these would be the final products of a reaction. Intermediate Structure

(IS) is used to store side products (if any). More internal structure representation and

molecular patterns stored as two-dimensional (2D) arrays are presented in Chapter 5.

[Figure 4.10: four panels showing the contents of the VIS at (a) the initial state, (b) after step 1, (c) after step 2 and (d) after step 3, together with the Intermediate Structure.]

Figure 4.10 The contents in the VIS during the simulation of the “protonation” process. The VIS is constantly updated to reflect the new intermediates produced until the entire reaction is ended. Content in (d) is the final product.

[Figure 4.11: four panels showing the contents of the VIS at (a) the initial state, (b) after step 1, (c) after step 2 and (d) after step 3, together with the Intermediate Structure.]

Figure 4.11 The contents in the VIS during the simulation of the “dissociation” process. Content in (d) is the final product of this reaction.

4.5.2 Stopping Conditions for Reaction Steps and the Entire Simulation

A process will stop when the entry-condition is no longer valid. When the individual

instances required by the second process are available, simulation continues by

switching to the second step of the mechanism and finally reaches the step that produces

the most stable product. In this work, quantity-conditions only serve to start (or stop)

the series of steps, while the overall process stops when only one species remains

in the VIS and that species is in a stable form.
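The two stopping rules can be sketched as a small loop. This is a simplified illustration under stated assumptions, not the QRiOM algorithm: species are plain strings, each entry condition is reduced to “the required species are present in the VIS”, and the names `STEPS`, `STABLE` and `simulate` are hypothetical.

```python
# Species assumed stable for this sketch (assumption, not taken from the KB).
STABLE = {"(CH3)3CCl"}

# (process name, species required in the VIS, species written to the VIS,
#  side products moved to the Intermediate Structure)
STEPS = [
    ("protonation", {"(CH3)3COH", "H+"}, {"(CH3)3COH2+"}, set()),
    ("dissociation", {"(CH3)3COH2+"}, {"(CH3)3C+"}, {"H2O"}),
    ("capture", {"(CH3)3C+", "Cl-"}, {"(CH3)3CCl"}, set()),
]


def simulate(initial):
    """Run processes until one stable species is left in the VIS."""
    vis = set(initial)
    intermediate_structure = []
    route = []
    # Overall stopping rule: exactly one species remains and it is stable.
    while not (len(vis) == 1 and vis <= STABLE):
        for name, needs, makes, side in STEPS:
            # A process is activated only while its individuals are present;
            # once they are consumed, its entry condition no longer holds.
            if needs <= vis:
                vis = (vis - needs) | makes
                intermediate_structure.extend(sorted(side))
                route.append(name)
                break
        else:
            raise RuntimeError("no process can be activated")
    return vis, intermediate_structure, route
```

Calling `simulate({"(CH3)3COH", "H+", "Cl-"})` walks through protonation, dissociation and capture in that order, leaves the alkyl chloride as the only species in the VIS, and places water in the Intermediate Structure.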

4.6 QPT Process Model as Reusable Component

This section provides evidence to support our claim in Chapter 1 that the QPT models

constructed are reusable. The reusable components are the models developed for the

“make-bond” and “break-bond” processes. As described earlier, a reaction mechanism

describes the series of small changes that happen to a given pair of individual views.

These changes are caused by the occurrence of either the “make-bond” or the

“break-bond” process (in a variety of sequences). Organic processes occur between a

pair of views such as <charged nucleophile, neutral electrophile>, <neutral nucleophile,

charged electrophile>, <charged nucleophile, charged electrophile> and <charged

electrophile, neutral nucleophile>, so the QPT models constructed for the organic

processes can support the simulation of many chemical equations. The reaction as given by

equation 4.3 is used to illustrate this situation.

(CH3)3CX + H2O → (CH3)3COH + HX (4.3)

where (CH3)3CX is the alkyl halide, H2O is in excess and (CH3)3COH is the alcohol.

Figure 4.12 depicts how bonds are formed and broken during the conversion of alkyl

halide to the alcohol product. The simulation of “(CH3)3CX + H2O → (CH3)3COH +

HX” is achieved by reasoning about the QPT models in the following order: “break-

bond”, “make-bond” and “make-bond”. The diagram also tells in what order these

processes happen. This is essentially the “mechanism” used in predicting and explaining

the result. The mechanism used is SN1 since it involves a tertiary alkyl halide and it is a


unimolecular reaction where only the concentration of the alkyl halide will affect the

reaction rate.

(CH3)3C–X

(a) The first reaction step is “break-bond”, which dissociates the halogen atom from the molecule. The QPT model constructed for breaking the (CH3)3COH2+ bond in Equation 3.2 is reused here.

(CH3)3C+ + :OH2

(b) The second step is “make-bond”, which bonds the electrophilic carbon centre to the nucleophilic oxygen centre of a water molecule. The QPT model constructed for making a bond between the alcohol oxygen of (CH3)3C–OH and a proton (H+) in Equation 3.2 is reused here.

(CH3)3C–O+H2 + :OH2

(c) The third reaction step is also “make-bond”. The QPT model constructed for making a bond in Equation 3.2 can be used again.

Figure 4.12 The QPT process models constructed for Equation 3.2 can be reused in the simulation of other chemical equations such as Equation 4.3.

Even though the substrates used in both reactions (Figure 4.7 and Figure 4.12) are

different, the processes designed for Figure 4.7 can also be used by the reaction in

Figure 4.12. As one can see the two chemical equations can be explained by the same

mechanism, but both start with different substrates and produce different products. In

Section 4.6.1 (below), we will illustrate that the same QPT process models can be used

again for the SN2 mechanism.


4.6.1 Model Reuse by SN2

Reusability of components is made possible by recognizing the individual small steps

required while each of the steps is determined by the individuals that are present in the

VIS. How does the SN2 reuse the QPT model? Figure 4.13 shows a reaction that

necessitates SN2. Even though the mechanism of this reaction is SN2, the models

developed for Figure 4.7 and Figure 4.12 can still be used. In principle, the two steps

are concerted, but here they are modelled so that they execute in sequence (“break-bond”

followed by “make-bond”). Since the view pairs used to activate a QPT process

are different, different outcomes can be obtained. For example, the reaction in

Figure 4.13 uses the <delta-minus, delta-plus> pair for both its “break-bond” and

“make-bond” processes. Recall that in the simulation of Figure 4.7, a different pair of

views (i.e. <neutral nucleophile, charged electrophile>) is used to activate the “make-

bond” process. As long as it is a bond formation process, the earlier constructed “make-

bond” model can be used. Likewise, if it is determined to be a bond cleavage process,

the “break-bond” model is retrieved. Letting the view pairs determine

“when” a model is relevant addresses the “model selection” issue in QR research.

HO− + CH3X → [HO···CH3···X]− → CH3OH + X−

Figure 4.13 The mechanism used in this simulation is SN2. The organic processes that occurred are “break-bond” (expulsion of the leaving group) and “make-bond” (the approaching of the hydroxide ion to form a bond to the carbon centre).


The above description suggests that the reasoning order of the QPT models is vital in

the prediction of the final product(s) of a chemical reaction and in the reproduction of

behaviour of various organic mechanisms. The algorithm developed in the framework

can cater for such “select and sequence” ability, as shall be discussed in Chapter 5.
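The reuse argument can be checked with a few lines of bookkeeping. The sketch below records the process order that the text gives for each of the three reactions discussed so far and counts how often each model is used. The dictionary layout and the function names are hypothetical; only the reaction strings, mechanism labels and process orders come from the text.

```python
from collections import Counter

# reaction -> (mechanism, ordered list of QPT process models used)
ROUTES = {
    "(CH3)3COH + HCl": ("SN1", ["make-bond", "break-bond", "make-bond"]),
    "(CH3)3CX + H2O": ("SN1", ["break-bond", "make-bond", "make-bond"]),
    "HO- + CH3X": ("SN2", ["break-bond", "make-bond"]),
}


def reuse_counts(routes):
    """Count how many times each process model is activated overall."""
    counts = Counter()
    for _mechanism, steps in routes.values():
        counts.update(steps)
    return counts


def reactions_using(model, routes):
    """List the reactions whose route includes the given process model."""
    return sorted(
        reaction
        for reaction, (_mechanism, steps) in routes.items()
        if model in steps
    )
```

Under these assumptions, `reuse_counts(ROUTES)` contains only two keys, “make-bond” and “break-bond”, which is the point of the section: the reactions differ in the order of activation and in the view pairs supplied, not in the models themselves.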

4.6.2 Model Reuse Scenario

The automated QPT models can be used to reproduce the behaviour of reactions such as

“A1 + B1”, “A2 + B2”, …, “An + Bn”. In other words, no matter what A’s and B’s are,

provided that they belong to the same class of nucleophiles and electrophiles, the same

make/break bond processes can be used (Figure 4.14).


[Figure 4.14: flowchart. Components shown: the inputs “A1, A2, …, AN + B1, B2, …, BM”; the Input Recognizer; the constituents (classified as either nucleophile or electrophile); a “Determine suitable process” stage; qualitative modelling of processes, with “write to” and “retrieve from” links for reuse; the stores “View pairs and associated processes”, “Collection of substrates and their constituents” and “Make-bond & Break-bond Models”; and the Predicted Results.]

Figure 4.14 Model reuse scenario for the simulation of organic reactions.

It is the nature of the qualitative reasoning approach that supports the reusability of

models, in that it solves problems by laying down the conceptual domain

knowledge rather than finding different factors to tackle each problem. The two

approaches to solving problems are depicted in Figure 4.15. Qualitative reasoning uses

the approach in part (a) of the diagram, in which “concepts” are akin to “processes” and

“problems” are akin to the different reaction mechanisms.


[Figure 4.15: (a) a single “Concepts” node linked to Problem1, Problem2, Problem3 and Problem4; (b) a single “Problem” node linked to Factor-1, Factor-2, Factor-3 and Factor-4.]

Figure 4.15 (a) A problem solving method that uses concepts to tackle multiple problems (b) A precoded KB of an expert system in solving a specific problem.

4.7 Qualitative Explanation Manifestation

An algorithm determines what the behaviour is; it does not explain it. An explanation of

system behaviour may take many forms. An example is “causality” (causal accounts).

A causal account is a kind of explanation that is consistent with our intuitions of how

systems function. One of the objectives of this work is to prepare and generate

explanations in a language and format understandable to the learners, and the feedback

solicited earlier from chemistry students indicated that causal accounts are helpful and meaningful to

them. Thus, causal graphs (state diagrams) are used to explain and justify solutions that

are returned by the system. Our approach emphasizes causal theories. Such an

approach can generate explanation in a variety of forms, as will be discussed in Section

4.7.1 (generating a causal graph) and Section 4.7.3 (interpreting a causal graph).


4.7.1 Generating a Causal Graph

The formalism of QPT which makes causality explicit is of great value in explaining

chemistry phenomena for teaching purposes. A chain of effect propagation represented

as functional dependencies among chemical parameters will be constructed at

runtime by the QR algorithms (specifically the QSA module; refer to Chapter 5 for the

algorithms), which serve as an embedded intelligence module in the tool to produce the

causal graphs. A causal graph depicts the set of causal relationships between quantities

occurring in the simulation. One such cause-effect relationship is depicted in Figure

4.16 (a sketch for illustration purposes). The computer generated version can be found

in Appendix D.10 and Appendix D.11. The necessity of constructing such a graph is

first discussed while interpretation of the causal graph is provided in the following

subsection.


Step 1: A "make-bond" process
  Individuals: H+ (hydrogen ion, a charged electrophile); (CH3)3C-OH (alcohol oxygen, a nucleophile)
  no-of-bond(H) increased; charge(H) decreased (label c)
  no-of-bond(O) increased; lone-pair-electron(O) decreased (label a); charge(O) increased (label b)
  The "make-bond" process produces (CH3)3COHH+.

Step 2: A "break-bond" process
  Individuals: (CH3)3C (the C is an electrophile - delta plus); OHH+ (the O serves as a nucleophile - delta minus)
  no-of-bond(C) decreased; charge(C) increased
  no-of-bond(O) decreased; charge(O) decreased; lone-pair-electron(O) increased
  The "break-bond" process produces the carbocation intermediate (CH3)3C+.

Step 3: A "make-bond" process
  Individuals: Cl- (the chloride ion serves as a nucleophile); C+ (the carbocation serves as an electrophile)
  no-of-bond(C) increased; charge(C) decreased
  no-of-bond(Cl) increased; lone-pair-electron(Cl) decreased; charges(Cl) increased
  This is the last reaction step in the simulation. It produces a stable product, (CH3)3CCl.

Labels: a: Line 12; b: Line 13; c: Line 15

Figure 4.16 A causal graph showing cause-effect relationship of chemical parameters during the simulation of “(CH3)3C–OH + HCl” reaction.

For cross-checking purposes, label “a” in Step 1 of Figure 4.16 represents Line 12

(lone-pair-electron(O) $P^{+}_{-}$ no-of-bond(O)) of the “make-bond” process depicted in

Figure 4.3. Similarly, from the same graph, label “b” is derived from Line 13

(charges(O) $P^{-}_{+}$ lone-pair-electron(O)) and label “c” represents Line 15

(charges(H) $P^{+}_{-}$ no-of-bond(H)) from the same model.


4.7.2 Design of Causality

Basically, the behaviour of a chemistry system can be described as a sequence of

qualitative states occurring over a particular span of time. Our approach generates

explanation by examining the functional dependency statements in the QPT model. In

this work, apart from showing the qualitative state change of each atom involving the

conversion of the substrate to its final product, explanation is also supported by the

presentation of causal graphs. Causality can be used to impose order upon the world.

For example, when given “X causes Y”, we believe that if we want to obtain Y we

would create X. Likewise, when we observe Y we will think that X might be the reason

for it. Qualitative proportionality (the P’s) helps propagate the effects of change caused by

the process quantity. For example, given two proportionalities (qp1 and qp2):

lone-pair-electron(O) $P^{+}_{-}$ no-of-bond(O) … qp1

charge(O) $P^{-}_{+}$ lone-pair-electron(O) … qp2

A question that can be asked (or derived) from the above two statements could be:

“How would the above qualitative proportionalities explain why the “O” atom has a positive

charge?” A causal explanation that could be generated is: The number of lone pair

electrons will decrease when more covalent bonds are made on the “O” atom, via the

inverse proportionality defined in qp1. In qp2, when the lone pair electrons on “O”

decrease, the charge on it will increase. The above functional dependencies can explain

why the charge on “O” is now positive. Simply, it donated electrons to form a covalent

bond. Another advantage of representing chemical behaviour using the qualitative

proportionality of QPT is that the state of the chemical system can be tracked over time.

Tracking of the state of a chemical system helps explain the underlying cause-effect

chain which is implicit in the qualitative models. The causal graph produced by QRiOM


can be used to capture such dependency at runtime and then be used to generate

explanation on-the-fly (screenshots are presented in Chapter 5).
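The propagation described for qp1 and qp2 can be written as a short sign-propagation routine. This is a minimal sketch under the assumption that every qualitative proportionality can be reduced to a single sign (here −1, since both qp1 and qp2 are described as inverse relationships); the names `QP` and `propagate` are hypothetical and do not come from the QSA module.

```python
# (effect quantity, cause quantity, sign); -1 means the effect moves in the
# opposite direction to the cause.
QP = [
    ("lone-pair-electron(O)", "no-of-bond(O)", -1),  # qp1
    ("charge(O)", "lone-pair-electron(O)", -1),      # qp2
]


def propagate(direct, relations):
    """Derive the direction of change of every dependent quantity.

    `direct` maps a quantity to +1 (increasing) or -1 (decreasing), as set
    by the direct influences of the active process.
    """
    changes = dict(direct)
    progressed = True
    while progressed:
        progressed = False
        for effect, cause, sign in relations:
            if cause in changes and effect not in changes:
                changes[effect] = sign * changes[cause]
                progressed = True
    return changes
```

Starting from `{"no-of-bond(O)": +1}`, the routine derives that the lone pair count on “O” decreases and that its charge increases, which is the chain of reasoning given in the text for why “O” ends up positive.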

Figure 4.17 – Figure 4.19 are causal graphs that represent the minimal yet essential set

of properties abstracted from each reaction step presented in Figure 3.2 (page 68). Each

figure is discussed in turn. Figure 4.17 represents the cause-effect notion in the

“protonation” process. Legends used in these figures are: I = Influences and P =

Proportionalities.

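[Figure 4.17: causal graph. Entry condition (above the dotted line): lone-pair-electron(O) >= min-electron-pair(O). Process: protonation-activity. Direct influences shown: I+, I+, I−. Quantities shown: no-of-bond(O), bond-activity(O), charge(H), lone-pair-electron(O), no-of-bond(H) and charge(O), linked by qualitative proportionalities (P).]

Figure 4.17 Causal graph for the “protonation” process. The inequality above the dotted line is the entry condition to the process.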

In Figure 4.17, the inequality statement shown above the dotted line represents the

quantity-condition that must be true for the protonation process to start. Effects are then

propagated via the direct (I) and indirect (P) influences. The protonation process

requires a proton (H+) from the HX acid. Since “H+” is electron poor, it will seek

electrons. The other individual is “O” from the alcohol oxygen, with the quantity lone-pair-electron

greater than or equal to min-electron-pair so that it can donate two electrons to H+

to form a covalent bond between them. The process’s quantity is bond-activity. This

quantity directly influences no-of-bond for “O” and “H”. In other words, after the

protonation process “O” will have an extra covalent bond. The effects will propagate to

other dependent quantities shown in the diagram. For example, the number of lone pair

electrons will decrease when more covalent bonds are made at the “O” atom, via the inverse

qualitative proportionality defined in qp3 (derived from the left branch of the graph in

Figure 4.17). In qp4, when the lone-pair-electron on “O” decreases, the charge on “O”

will increase. In addition, qp4 explains why “O” is positively charged (it loses electrons

to make a covalent bond).

lone-pair-electron(O) $P^{+}_{-}$ no-of-bond(O) ... qp3

charge(O) $P^{-}_{+}$ lone-pair-electron(O) ... qp4

The dissociation behaviour is briefly shown in Figure 4.18. The process will start due

to the instability of the oxonium ion (the extra covalent bond on the oxygen atom in the

oxonium ion). This situation is indicated in the inequality above the dotted line.

Changes that propagate via functional dependencies among quantities are: The charge

on “C” will turn positive, giving “C+”, when one electron is transferred to the

oxonium ion to neutralize it; “C” is hence short of one covalent bond, via the relation

“no-of-bond(C) $P^{+}_{-}$ charge(C)” (derived from the left branch of the graph in Figure 4.18).


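[Figure 4.18: causal graph. Entry condition (above the dotted line): no-of-bond(O) > max-bond-allowed(O). Process: dissociation-activity. Direct influences shown: I+, I−, I+. Quantities shown: charge(C), no-of-bond(O), bond-activity(C, O), no-of-bond(C), lone-pair-electron(O) and charge(O), linked by qualitative proportionalities (P).]

Figure 4.18 Causal graph for the “dissociation” process. The process stops when the oxygen (“O”) regains its equilibrium state.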

Next, we shall examine the cause-effect interaction for the third process (Figure 4.19).

This process is caused by the incomplete octets of “C+” and “Cl−”, and it will stop when

both ions have completed their valences.

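[Figure 4.19: causal graph. Entry conditions (above the dotted line): no-of-bond(C) < max-bond-allowed(C) and no-of-bond(Cl) < max-bond-allowed(Cl). Process: formation-activity. Direct influences shown: I+, I+, I+. Quantities shown: no-of-bond(C), bond-activity(C, Cl), charge(Cl), charge(C), lone-pair-electron(Cl) and no-of-bond(Cl), linked by qualitative proportionalities (P).]

Figure 4.19 Causal graph for the “Capturing of carbocation by anion” process.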


Capturing of carbocation by anion (or formation of alkyl halide) describes how one

pair of non-bonded electrons from “Cl−” combines with “C+” to form a covalent bond,

thus neutralizing both charges (refer to qp5, qp6 and qp7).

charge(C) $P^{+}_{-}$ no-of-bond(C) ... qp5

lone-pair-electron(Cl) $P^{+}_{-}$ charge(Cl) ... qp6

no-of-bond(Cl) $P^{-}_{+}$ lone-pair-electron(Cl) ... qp7

4.7.3 Interpreting a Causal Graph

Figure 4.16 depicts the overall chemical change of the substrate during simulation of

equation 3.2. We will now interpret the graph given that the building blocks have just

been discussed in Section 4.7.2. An organic reaction is triggered by an electrophile and

a nucleophile. Step 1 (“make-bond” process) is activated by the <H+, O> pair. In this

case, the nucleophile is the “O” from the OH group which has extra lone pair electrons

to be donated to the proton (H+, electrophile). The direct effect is that a covalent bond

will be made between “O” and “H”. These effects are propagated to other quantities as

follows. The charge on “H” is decreased (from positive to neutral) and the lone pair on

“O” is also decreased (donated to the electrophile). Decreasing the lone pair on “O”

will cause its charge to increase. The charge of the oxygen atom now turns from

neutral to positive. Assigning quantity values to each reacting species is coordinated by

the QSA module. All values that are assigned to each parameter are retrieved from the

quantity spaces maintained in the chemical KB. The first reaction step produces

(CH3)3COH2+ (an intermediate). From the chemical KB, this intermediate has “C” and

“O+” as the reacting species for a “break-bond” process to be activated. The “break-

bond” process (refer to Step 2 in Figure 4.16) describes the cleavage of the carbon-


oxygen bond in tert-butyloxonium ion ((CH3)3COH2+) which is unstable since the “O”

is charged. The immediate cause of this process is that the bond between “O” and “C”

will break. State changes that propagate via functional dependencies among quantities

are: The acceptance of two electrons from the dissociation activity will neutralize the

“O” in “OH2+”. On the other hand, donation of electrons (lone pair decreases, as

indicated in the graph) will cause the charge on “C” to become positive (charge

increases). This propagation produces a tertiary carbocation ((CH3)3C+) for use in activating

the next process.

Atom “C” in the carbocation is now unstable and reactive. Since the carbocation

and the chloride ion (Cl−) remain in the VIS, another “make-bond” process (“capturing

of carbocation by anion”) can be initiated. This is the third reaction step in the entire

simulation. The start of this process can be explained by the incomplete octet of the

carbocation and the chloride ion. The following describes Step 3: When a covalent

bond is made between “Cl−” and “C+”, the chlorine’s lone pair electrons will decrease.

This effect is further propagated to change its charge (from the negative to the neutral state),

while the carbon’s charge is decreased (from positive to neutral). The entire process

ends here because both “Cl” and “C” are in neutral states (i.e. their valences have

been completed). Recall that, in this work, a process will stop when only one view pair is

left in the VIS. From the graph, the final products are alkyl chloride, which is very

stable, and a side product (a water molecule). The sequence of process activations is

protonation, dissociation, followed by capturing of carbocation by chloride ion. These

three steps can be used to explain the overall chemical change that occurred. The

presentation of chemical effect propagation as a cause-effect diagram would help one

appreciate the general chemical principles underlying the chemical phenomenon and

hence be useful in improving one’s conceptual understanding in the subject.


4.7.4 Deriving Explanation From a Causal Graph

The reasoning process which involves sequential changes of the substrate’s parameters

obeying chemistry theories will enable us to deduce how a particular chemical process

came about. A significant benefit of learning with QPT models is that they can serve as

a corrective feedback module in the entire learning endeavour. We will demonstrate how

the ontological modelling constructs of QPT can provide causal explanation about some

aspects of chemical system behaviour. Based on the causal graph in Figure 4.16, a set

of queries can be devised as shown in Table 4.1.

Table 4.1: A set of queries and explanations. The explanations are generated based on Step 1 in the causal graph presented in Figure 4.16.

Question: How would the cause-effect relationships explain that the charge on “O” is changed from neutral to positive?
Answer: The number of lone pair electrons will decrease when more covalent bonds are made on “O”, and this effect is propagated to cause the charge on “−OH2+” to increase (from neutral to positive, by referring to the quantity space).

Question: Where did the electrons come from to form the new O−H bond?
Answer: From the extra lone pair electrons on “O”.

Question: How would you explain a decrease in the lone pair electrons on “O”?
Answer: We know that the immediate cause of the process is that the number of covalent bonds on “O” will increase (i.e. a bond is made). This quantity will influence the lone pair electrons on it, and the influence is strictly decreasing through the inverse proportionality relationship.

Question: Why did the “make-bond” process occur?
Answer: The statements in the quantity-conditions are satisfied. Briefly, these state that “a proton is needed, together with an alcohol oxygen that has at least one pair of unshared electrons to donate to the proton, in order to make a bond”.

Question: Why did the “make-bond” process stop?
Answer: When the process begins, the “O” will have an extra covalent bond while “H” will be neutralized. When more covalent bonds are made on “O”, its number of lone pair electrons will decrease via the inverse qualitative proportionality. When the lone pair electrons of “O” decrease, its charge will increase. These relationships explain how the “O” donated an electron in order to make a bond. At this point in time, the quantity-condition has been violated. Therefore, the process stops.
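Answers of the kind shown in Table 4.1 can be assembled by walking the chain of proportionalities and emitting one sentence per link. The sketch below is illustrative only: the function name `explain`, the sentence templates and the tuple layout are hypothetical, and it follows a single chain (the first matching relation at each step) rather than a full causal graph.

```python
# (effect quantity, cause quantity, sign); -1 marks an inverse relationship.
RELATIONS = [
    ("lone-pair-electron(O)", "no-of-bond(O)", -1),
    ("charge(O)", "lone-pair-electron(O)", -1),
]

VERB = {1: "increases", -1: "decreases"}


def explain(start, direction, relations):
    """Return one sentence per causal link, starting from a direct influence."""
    lines = [f"{start} {VERB[direction]} as a direct influence of the process."]
    cause, cause_dir = start, direction
    visited = {start}
    while True:
        # Find the next quantity that depends on the current cause.
        link = next(
            ((e, s) for e, c, s in relations if c == cause and e not in visited),
            None,
        )
        if link is None:
            break
        effect, sign = link
        effect_dir = sign * cause_dir
        lines.append(
            f"Because {cause} {VERB[cause_dir]}, {effect} {VERB[effect_dir]}."
        )
        visited.add(effect)
        cause, cause_dir = effect, effect_dir
    return lines
```

For `explain("no-of-bond(O)", 1, RELATIONS)` the sketch returns three sentences that follow the same order as the third answer in Table 4.1: the bond count rises, so the lone pair count falls, so the charge rises. Because the sentences are built from the relations rather than stored, nothing in the output is precoded.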


Qualitative reasoning allows learners to access notions of how the behaviour of systems

evolves in time. The inspection of the cause-effect chain can help learners to develop

their chemical intuition so as to pick up the underlying concept better than merely

memorizing the reaction steps or basic facts. The explanation that is derived from the

qualitative model will encourage students to rationalize why a particular process occurred

and why it stopped. This type of explanation is not precoded. To validate this, the

“dissociation” process is presented. Table 4.2 shows a few queries based on the “break-

bond” behaviour (refer to Step 2 in Figure 4.16).

Table 4.2: A set of queries and explanations. The explanations are generated based on the second step of the causal graph presented in Figure 4.16.

Question: Why did the dissociation process occur?
Answer: The activation of the process is due to the extra number of bonds the alcohol oxygen possesses; its number of covalent bonds has exceeded the maximum number in the stable state. Refer to the inequality: no-of-bond(O) > max-bond-allowed(O).

Question: What are the factors that affect the reduction of covalent bonds on “C”?
Answer: One of the covalent bonds on “C” will break, and this is caused by the following factors. First, the dissociation process will directly influence the charge on “C”, and this is strictly increasing (neutral to positive). Next, an increase in the charge will decrease the number of covalent bonds on “C”.

Question: What happened to the second lone pair of electrons on the “O” after this reaction step?
Answer: When the bond between the “O” and the “C” is broken, both electrons in the C−O bond move onto the oxygen to restore a second lone pair of electrons, thus neutralizing the charge so that it can leave the organic compound as a neutral water molecule.

Question: Why was the dissociation process stopped?
Answer: This is because the entry-condition is no longer valid, i.e. the “O” has regained its stability.

4.8 Discussion

In this work, the time required for a reaction was not used as an influencing parameter.

This is because “time” does not affect the result. As such, it was decided to include only

the essential chemical parameters for the entire modelling and simulation. The essential

parameters are: (1) bonds (for changing the structure of a molecule), (2) lone pair

electrons, and (3) charge (for checking if an atom has filled up its valences). The

simulated results matched those given in

textbooks. Note that only one round of simulation is implemented in software, in that a

simulation is considered complete when the nucleophile (or leaving group) in an

organic compound (which serves as the substrate) has been replaced.

4.9 Conclusion

This chapter fulfilled three objectives. First, the use of QPT-based reasoning as the

simulation approach for organic reactions has been discussed. Second, a way of

generating explanation effectively via modelling constructs of the QPT has been

introduced. Third, the use of causal graph as a means to explain an organic process has

been presented. The procedures to generate and interpret causal graphs have also been

discussed. In particular, this chapter has answered two research questions: (1) “How

can qualitative reasoning be used to support organic reaction simulation?” (2) “How

can the modelling constructs of QPT be used to explain a chemical phenomenon?” In

this work, qualitative reasoning based on QPT ontology is used to predict the outcome

of “A + B”. The explanation and justification of a simulated result is achieved by

showing the “reaction mechanism” used. In this case, it is either the SN1 or SN2

mechanism. The suggested reaction mechanism will consist of the series of organic

processes used in the conversion of the reactants (“A + B”) to form the final product,

emphasizing the changes in chemical parameter states that have occurred. QPT

reasoning supports behavioural explanation generation in order to facilitate mastery of

organic reaction concepts. Explanation can be derived almost isomorphically from the

QPT constructs that are defined in the qualitative models. The explanation stresses

what is happening to the valence electrons in the molecule, such as their movement and

rearrangement during a reaction. The reusability of models in supporting the

reproduction of the behaviour of SN1 and SN2 has also been demonstrated. We shall

discuss the entire qualitative reasoning framework in the following chapter.


Chapter 5 Qualitative Reasoning Framework for Organic Reaction Simulation

5.1 Introduction

This chapter presents the entire reasoning framework for the simulation of organic

reactions. The roles played by each component in the framework are presented. This

chapter also discusses the results of the following three objectives stated in Chapter 1:

• To develop a reasoning framework for organic reaction simulation and explanation.

• To define the types and roles of chemical knowledge at different abstraction levels in order to facilitate effective use of the knowledge.

• To develop a small chemistry ontology called OntoRM for use with reaction mechanisms for knowledge validation purposes.

The organization of this chapter is as follows: Section 5.2 gives the workflow of the

qualitative reasoning framework. A schematic view of the framework architecture

components and the roles of associated software modules are also presented. The

algorithm development for each functional component to be implemented in QRiOM is

presented and discussed in Section 5.3. The components encompass all the software

modules. These include the two-tier architecture for the knowledge base, the design of

data structures, substrate recognizer, model constructor for organic processes, reasoning

engine for reaction simulation, causal model generator and the design of attributes and

methods for an atom used in simulation. Section 5.4 discusses the method used for

storing organic compounds in software. Section 5.5 gives the design of structuring the

chemical knowledge in terms of their types and roles. Section 5.6 provides the protocol

for interacting with the software. The simulation results are presented and discussed in


Section 5.7. Section 5.8 concludes the chapter with the fulfilment of the three

objectives mentioned above.

5.2 The Qualitative Reasoning Framework

The proposed framework uses QPT as the knowledge capture tool together with a set of

reasoning algorithms to systematically gather and reuse the chemical knowledge and

chemical theories to reproduce the behaviour of reaction mechanisms. The workflow of

the qualitative reasoning framework is presented as a collection of flowcharts in

Appendix B.1 – Appendix B.5. A schematic view of the reasoning framework is shown

in Figure 5.1.

5.2.1 Inputs

Inputs to the system are assumed to be noise-free, as the system only caters for alcohols

(primary, secondary and tertiary alcohols) and alkyl halides. This is because, in our

present work, only the basic facts and theories related to alcohols and alkyl halides are

included in the chemical KB. The number of known organic compounds is more than

10 million, so only two families of organic substrates were selected as initial inputs.

Nevertheless, more families of organic compounds will be included in our full system.

Chemists deal with a variety of structures and transformation which can usually be

decomposed into clearly identifiable entities. Along this line, the organic compounds

are decomposed into the hydrocarbon chains (e.g. “CH3CH3CHC” and

“CH3CH3CH3C”) and the attachments (e.g. the functional group “OH”). Therefore, the

design of the input is rather straightforward, in that the organic compound and their


decomposed units are stored as Prolog clauses. These basic facts will be retrieved and

populated on the Graphical User Interface (GUI).

5.2.2 Outputs

The simulator will return the following results: (1) final products, (2) intermediates

produced at each step, (3) sequence of processes used to reproduce the behaviour of the

proposed reaction mechanism, (4) overall structural change of the substrate, (5) QPT

model for organic processes, (6) causal graphs, (7) view pairs used in the simulation,

and (8) parameter state history for each atom that is involved in a reaction. Sample

results for (1) – (8) can be found in Appendix D.

5.2.3 Software Components

The reasoning framework consists of a number of software components. Figure 5.2

gives the main components – input recognizer, model constructor, reasoning engine,

explanation generator, knowledge validation module, OntoRM and chemical KB. Other

sub components are the Quantity Space Analyzer (QSA) and the Molecule Update

Routine (MUR).


[Figure 5.1: block diagram. Stage labels: Substrate and Reagent Selection; Modeling and Reasoning; Output and Explanation. Blocks: Input Module, Substrate Recognizer Module, QPT Model Constructor Module, Qualitative Simulation Module, Explanation Generator Module. Outputs: Final Products, Reaction Routes, Parameter Histories, etc. Knowledge sources: Chemical KB, QPT processes, OntoRM ontology.]

Figure 5.1 A schematic view of the qualitative reasoning framework described in terms of the input, process, output and the knowledge bases.


[Figure 5.2: component diagram. Numbered components: 0 Graphical User Interface; 1 Substrate Recognizer; 2 Qualitative Model Constructor; 3 Qualitative Simulator (Reasoning engine); 4 Simulated Results (Final products and the mechanism used); 5 Explanation Generator; 6 Causal Model Generator; 7 Molecule Update Routine (MUR); 8 QPT Process Models; 9 Knowledge Validation Routine; 10 OntoRM; 11 Chemical Knowledge Base. Unnumbered elements: Molecule Patterns Storage; QSA; Various Types of Explanation.]

Figure 5.2 Main software components of QRiOM.

The software components depicted in Figure 5.2 are used in the following sequence.

Briefly, a substrate recognizer checks inputs entered by the user. Then, the qualitative

model constructor composes QPT models for organic reactions based on the identities

and types of the inputs. The reasoning engine is then called upon to simulate the

chemical behaviour using the constructed models. The explanation generator is called

upon when a user needs an explanation or a justification for a simulated result. The


simulator will generate the following outputs and responses: causal diagrams,

qualitative models, predicted outcomes, suggested organic mechanisms, parameter state

histories, view instance structures (showing the use of reacting pairs in each reaction

step) and the entire reaction route of a simulation. Table 5.1 outlines the roles of each

module in the simulator.

Table 5.1: Main modules and their roles.

Module No.: Roles

Module 0 (Graphical User Interface)
• This module provides an interface for the learners to interact with the system.
• The module contains all the screen layouts and event driven components.

Module 1 (Substrate Recognizer)
• This module checks user selection and returns the “type” of the input as either a nucleophile or an electrophile. From here on, an organic process may be determined.
• It also initializes a number of tables (e.g. 2D arrays) to hold the running results of various chemical parameters during simulation.

Module 2 (Model Constructor)
• This module automates the construction of QPT models based on the identity of user input.
• It will generate the QPT model as depicted in Appendix D.9.

Module 3 (Reasoning Engine)
• This module does the actual reasoning and final product prediction. This is where qualitative simulation takes place.
• The main reasoning functions are handled by the QSA and MUR.

Module 4 (Simulated Results)
• This module will return simulated results.

Module 5 (Explanation Generator)
• This module will generate explanation to justify a simulated result.
• It will retrieve various data structures (produced by the prediction engine) in order to generate explanation on-the-fly.

Module 6 (Causal Model Generator)
• Causal ordering (Iwasaki and Simon, 1986) is best known for dependencies and causality. So, the simulator constructs causal graphs to produce accounts of behaviour based on causality.

Module 7 (Molecule Update Routine)
• This module keeps track of the structural change (pattern) of the substrate, from one organic reaction to another.
• It will display reaction route as shown in Appendix D.5 – Appendix D.7.

Module 8 (QPT Process Models)
• This data store contains qualitative models for organic processes.

Module 9 (Knowledge Validation Routine)
• This is a routine called up by the reasoning engine whenever it needs to use a piece of knowledge to make a decision or to return an output.

Module 10 (OntoRM)
• The reaction mechanism ontology that defines the basic chemical knowledge and chemical commonsense for SN1 and SN2.

Module 11 (Chemical Knowledge Base)
• This data store contains information such as chemical facts and theories that are needed to perform qualitative reasoning.

The operational relationship of the various modules is as follows. Given a chemical

equation in the form of “A (substrate) + B (reagent)” (through the GUI, module 0), the

substrate recognizer (module 1) will check with the KB (module 11) to see whether it is

a valid input. If it is, then individual views are identified (module 2). Next, pairing of

views is carried out in order to construct QPT models (module 2 – qualitative

modelling) to prepare the chemical processes/reaction step. When there are active

processes, the reasoning engine (module 3) will keep track of the changing qualitative

states of the affected reactive units until the entire reaction ends. Along the reasoning

route, changes made to each individual’s parameters are recorded. This is handled by

the molecule update routine (module 7). The entire reaction will end when there are no

more reactive units. When a reaction ends, outputs will be displayed, together with all

the bonding and their sequence of execution (module 4). These steps can then be used

to explain the overall chemical change that occurred. If a user needs an explanation for

the results or has a question regarding the behaviour of a quantity, then the explanation

module (module 5 and module 6) will be run. Our approach for answer justification is


based on cause-effect reasoning. The prediction engine calls up module 9 (knowledge

validation routine) as and when it is needed to disambiguate a situation and this task

relies on the OntoRM (module 10) to constrain the use of the knowledge. The main

modules and their associated use of knowledge are given in Table 5.2. Note that

“input” refers to the specific type of knowledge that serves as the required input to the

module while “output” refers to the new knowledge asserted to the knowledge base or

created as intermediate results.

Table 5.2: Software modules and the associated inputs and outputs.

Module 1 (Substrate Recognizer)
Inputs:
1. The inputs (reactants selected by the user).
2. Prolog clauses that contain constituent parts of substrates.
Outputs:
• View pairs
• Atom table
• Initial View Structure Array
• Initial atom table
• Bond table
• Initial 2D molecule table for the substrate

Module 2 (Model Constructor)
Inputs:
1. Java classes of either “make-bond” or “break-bond” (depending on the view pairs).
Outputs:
• View and process models such as “make-bond” and “break-bond”

Module 3 (Reasoning Engine)
Inputs:
1. Java class for the identified covalent bonding and quantity spaces for relevant views are retrieved.
2. 2D molecule table, bond table, atom table and atom property table.
Outputs:
• Intermediate-product array
• Updated View Structure Array
• Updated atom table
• Updated 2D molecule table and bond table

Module 4 (Simulated Results)
Inputs:
1. OntoRM
2. Atom table
3. 2D molecule table
4. View structure array
Outputs:
• Name of the mechanism used
• Final output

Module 5 (Explanation Generator)
Inputs:
1. QPT models
2. Atom property table
3. View structure array
4. 2D molecule table
Outputs:
• Reaction route
• Parameter history
• Reacting units used in each reaction step

Module 6 (Causal Model Generator)
Inputs:
1. QPT models
2. Quantity spaces
Outputs:
• Causal graphs

Module 7 (Molecule Update Routine)
Inputs:
1. Atom property table
2. Quantity spaces
Outputs:
• Bond table
• 2D molecule table for the changes made to the substrate

5.3 Component Design

The various components in the framework are implemented in software (the QRiOM

simulator). Code development is presented in Appendix E.

5.3.1 The Two-tier Architecture of Knowledge Base

The knowledge base has a two layer structure, as shown in Figure 5.3.


Layer 2 (Upper layer): OntoRM
A chemistry ontology for reaction mechanism simulation represented as Java classes. (It is accessed via QR algorithms; called upon during reasoning and result prediction.)
Elements include:
• List of possible end products
• Processes that are allowed in nucleophilic substitution reaction
• List of possible order of processes execution

Layer 1 (Lower layer): Chemical Data Instances
A pool of chemical facts and chemical theories coded in Prolog. (It provides basic facts of atoms and functional groups and their unchanged properties.)
Elements include:
• Valence electrons, covalent bond, etc.
• Nucleophilicity and carbocation stability, etc.
• Electro-negativity for nucleophiles and electrophiles
• Qualitative proportionality for chemical theories
• Direct influences for covalent bonding

Figure 5.3 Architectural design of the knowledge base.

The purpose of the lower layer is to provide basic facts for nucleophilic substitution use.

This layer is called “chemical instances” (or basic facts). Instances refer to chemical

elements and their chemical properties that do not change over time. Examples are

atomic weight, electro-negativity, valence electron and covalent bond (lowest normal

valence consistent with explicit bonds). The upper layer is the chemistry ontology for

reaction mechanisms simulation. This tier is called OntoRM; it is used as a tool to

define reaction mechanism in a formal way. The ontology defines the requirements and

constraints when suggesting a mechanism for a chemical equation simulation.

OntoRM provides only the “knowledge” and not how the knowledge is used. The

design of OntoRM is intentionally made to be “task neutral” in order to achieve two

objectives: (1) to conform to the definition of “ontology”, and (2) to promote ontology

reuse in other applications.


5.3.2 The Chemical Knowledge Base

The QR algorithms will “process” the chemical parameters of reacting species involved

in a simulation. There is a lot of information required to support this task. Chemical

information to be processed is coded in Prolog. Prolog has a powerful feature called

“backtracking” that enables the program to try another alternative if the previous

alternative fails. With this feature, Prolog automatically selects the facts

needed to solve a query. In this work, the pool of Prolog clauses representing the basic

facts and theories of reacting species is termed the chemical knowledge base (KB). In

the development of the knowledge base, significant analyses are required in the problem

domain to decide the most needed (essential) chemical facts to be represented. Physical

quantities and chemical theories in the form of qualitative proportionality are also stored

in the chemical KB; the reaction mechanism itself is not. New chemical facts can be

added to this KB since a GUI page is provided for the instructor to do so. Figure 5.4

shows an example of the chemical KB that supports the reasoning framework.

/*---- Chemical theories in Prolog syntax ----*/
qprop(make_bond, no_of_bond, lone-pair-electron, plus, minus).
qprop(make_bond, charge, lone-pair-electron, minus, plus).
% the lower the energy, the higher the stability
qprop(_, energy, stability, minus, plus).
% less stable means more reactive
qprop(_, stability, reactivity, minus, plus).

/*--- Chemical facts for reasoning use ---*/
covalent_bond('C', '4', stable).
covalent_bond('O', '2', stable).
lone_pair('O', '2', stable).
lone_pair('C', '0', stable).
charge('O', neutral).
charge('H+', pos).
charge('Br-', neg).

Figure 5.4 Examples of chemical facts and theories used in reaction simulation.
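To make the role of the qualitative proportionalities concrete, the following is a minimal Java sketch of how a reasoning routine might consult them. It is not the QRiOM implementation. It assumes that a fact qprop(P, Cause, Effect, D1, D2) reads as "during process P, when Cause changes in direction D1, Effect changes in direction D2", and that the relation is monotonic so the opposite change in the cause gives the opposite change in the effect. The names Sign, QProp, QPropTable and effectDirection are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Direction of a qualitative change; mirrors the plus/minus atoms in Figure 5.4. */
enum Sign { PLUS, MINUS }

/** One qprop fact: when `cause` moves in `causeDir`, `effect` moves in `effectDir` (assumed reading). */
final class QProp {
    final String process;   // "_" matches any process, as in the Prolog facts
    final String cause;
    final String effect;
    final Sign causeDir;
    final Sign effectDir;

    QProp(String process, String cause, String effect, Sign causeDir, Sign effectDir) {
        this.process = process;
        this.cause = cause;
        this.effect = effect;
        this.causeDir = causeDir;
        this.effectDir = effectDir;
    }
}

final class QPropTable {
    // The four qprop facts of Figure 5.4, transcribed as Java objects.
    static final List<QProp> FACTS = Arrays.asList(
        new QProp("make_bond", "no_of_bond", "lone-pair-electron", Sign.PLUS, Sign.MINUS),
        new QProp("make_bond", "charge", "lone-pair-electron", Sign.MINUS, Sign.PLUS),
        new QProp("_", "energy", "stability", Sign.MINUS, Sign.PLUS),
        new QProp("_", "stability", "reactivity", Sign.MINUS, Sign.PLUS));

    static Sign flip(Sign s) {
        return s == Sign.PLUS ? Sign.MINUS : Sign.PLUS;
    }

    /**
     * Returns the direction in which `effect` changes when `cause` is observed
     * to change in direction `observed`, or null when no fact applies.
     */
    static Sign effectDirection(String process, String cause, Sign observed, String effect) {
        for (QProp q : FACTS) {
            boolean processMatches = q.process.equals("_") || q.process.equals(process);
            if (processMatches && q.cause.equals(cause) && q.effect.equals(effect)) {
                // Assumption: monotonic relation, so the reverse change flips the effect.
                return observed == q.causeDir ? q.effectDir : flip(q.effectDir);
            }
        }
        return null;
    }
}
```

Under these assumptions, a call such as QPropTable.effectDirection("break_bond", "energy", Sign.MINUS, "stability") yields Sign.PLUS, matching the comment "the lower the energy, the higher the stability". In the thesis the lookup itself is done by Prolog unification and backtracking rather than by an explicit loop.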


5.3.3 OntoRM: Objectives and Motivations

OntoRM is an ontology defined specifically for the field of reaction mechanisms. The

concept of reaction mechanism includes the patterns of reactants and products and the

transformation operators (such as charge addition and subtraction, bond addition or

deletion). However, OntoRM is designed in such a way that the ontology does not

include any possible type of “final product” since the prediction of the final product is

handled by the suite of qualitative simulation algorithms. As shall be seen, only species

type (e.g. functional group) is included. There are no exact “patterns” for reactants

stored in the KB. Adding to that, the transformation operator is not fixed in the

definition of OntoRM. This is because the type of bonding is determined by the view

pair identification approach and as such OntoRM is used specifically for validating

whether the bonding is appropriate.

The representation format of OntoRM is frame-based. A “frame” consists of multiple

slots which are suitable for representing the attributes defined in OntoRM. OntoRM

guides the reasoning algorithm to make decisions by validating and constraining the use

of chemical knowledge. In the implementation part, OntoRM is represented as a set of

Java classes with only attributes and no processing. Its main objectives are:

1. It is used to describe knowledge, requirements and constraints (if any).

2. It is used for defining and handling special cases.

3. It is used as a reference during validation (to validate uses of the KB) and to

constrain their use. For example, it is used to reject a decision during reasoning or

to confirm a prediction before returning the final products.

4. By using OntoRM, the use of chemical knowledge is made clearer, which fills

a gap in previous work (as this was not found in QALSIC).


The design of the ontology is motivated by the following factors:

1. QPT is a representation of domain knowledge in qualitative terms, with no

definition or description of how this knowledge is used (such as in what

order/sequence).

2. Previous systems stored all simple facts (elements and their properties) and other

unchanged chemical properties that can be obtained from the periodic table such as

atoms and their basic properties (e.g. oxygen’s maximum covalent bond is two

when it is in neutral state) with no emphasis on the structuring and hierarchical

organization of the domain knowledge. Therefore, a logical grouping of knowledge

is not found. With OntoRM, chemical data can be accessed and used effectively.

OntoRM can achieve the aim of component portability and reusability in other

applications. It is made very general in order to achieve portability so that it is

shareable by other applications. This is seen as a contribution to “computer in

chemistry” research.

5.3.3.1 The Design of OntoRM

Ontologies are normally used to abstract knowledge of a domain in a way that can be

used by both humans and computers by providing an explicit representation of the

entities of interest and the relationships among them (Dolan and Blake, 2009). One of

the earlier works on the design of chemistry ontology was described in Angele et al.

(2003). OntoNova is one such system developed under the Halo Project

(http://www.projecthalo.com) that answers questions from the “Advanced Placement

Test: Chemistry”. Encoding knowledge by experts proved costly: roughly

one order of magnitude more costly than writing the natural language text itself. In


OntoNova, basic concepts of chemistry are represented in F-Logic. Properties of these

concepts and relations between these concepts are represented by methods. Complex

chemical relationships and axioms are represented by rules. This approach needs a large

number of specific cases, which amounts to resorting again to the traditional approach

of problem solving. An example of a rule written in OntoNova is as follows:

rule burnhydrocarbon: FORALL F,V1,V2,V3

burned(F):CombustionReaction[hasReactants->>{"O2",F};

hasProducts->>{"H2O","CO2"}] <-

burn(F) and hydrocarbon(F).

This rule states that if a formula F represents a hydrocarbon and is burned then the

reaction is identified as a combustion reaction with the reactants O2 and F and the

products H2O and CO2 of the reaction equation. The system relies on the OntoBroker

inference engine for solving equation balancing problems and generating explanation.

OntoBroker performs a mixture of forward and backward chaining based on the

dynamic filtering algorithm (Kifer and Lozinskii, 1986) to compute the subset of the

model for answering the query.

Hsu et al. (2006) developed an ontology for molecule patterns as well as reaction

mechanisms. The team has also developed a reaction network generator tool for

producing reaction networks based on the knowledge defined in their ontology. The

chemistry ontology defined by the research team consists of three different ontologies to

describe molecules/patterns, reaction mechanisms and reactions. The molecule/pattern

ontology defines elementary concepts such as atoms, bonds and patterns of atoms,

which are essential for describing any reaction, reaction mechanism or other chemical

phenomena. The reaction mechanism ontology defines concepts such as transformation


operators that enable the representation of reaction mechanisms at an abstract level so

that the expert can rapidly test the hypothesis by simply combining several instances

stored in the knowledge base. Web Ontology Language (OWL) is used to encode the

ontology. In their work, a reaction mechanism, the input/output patterns and the

transformation steps are fully described. The reaction network generator produces unimolecular

and bimolecular elementary reactions through an exhaustive search process.

Our approach, on the other hand, has no pattern or output stored in any part of the

knowledge base or the OntoRM ontology. The outputs and molecule patterns are

predicted solely during runtime (by the qualitative simulator). OntoRM defines

chemical knowledge specifically for reaction mechanisms. The qualitative simulator

refers to this module to determine what aspects of the domain knowledge should be

presented to the simulator. In other words, the module performs only knowledge

validation, unlike OntoNova. There is no prior work associated with the design of

reaction mechanism ontology for validation use. As such, the use of reaction

mechanism ontology purely for validation purposes (rather than as an execution tool) is

considered as a new addition and contribution.

Another significance of the OntoRM ontology is that it can provide commonsense

knowledge to the qualitative simulator since the simulation of reaction mechanism

requires knowledge beyond what is maintained in the chemical KB. Besides, it can also

be used to disambiguate a situation. OntoRM defines the following entities:

• General definition for reaction mechanisms and functional units

• SN1 definition

• SN2 definition


• Definition for Leaving Group (LG)

• Definition for Nucleophiles

• Definition for Electrophiles

The entities are arranged and organized as follows. Basic concepts of organic

mechanisms are organized as a hierarchy of IS-A (“is a”) relations. Properties of these

concepts are represented by Java classes having only attributes, which resembles a

frame-based representation scheme (slots consisting of parameters and data). Figure 5.5 –

Figure 5.7 define vocabularies that can be used to specify or determine a suitable

process. Table 5.3 gives the list of data types (together with the possible value sets)

used in defining the concept and properties of reaction mechanism while the

implementation formats are presented in Appendix E.

Functional Unit (FuncUnit)
FuncUnit :: root
Nucleophile :: FuncUnit
Electrophile :: FuncUnit
ChargedNu :: Nucleophile
NeutralNu :: Nucleophile
ChargedElec :: Electrophile
NeutralElec :: Electrophile
AlcoholOxygen :: NeutralNu
ChlorideIon :: ChargedNu
HydrogenIon :: ChargedElec
Carbocation :: ChargedElec

(a) IS-A for “functional unit”

Reaction Mechanism (RM)
RM :: root
NucleophilicSubstitution :: ReactionMechanism
Elimination :: ReactionMechanism
ElectrophilicAddition :: ReactionMechanism
Sn1 :: NucleophilicSubstitution
Sn2 :: NucleophilicSubstitution

(b) IS-A for “reaction mechanism”

Figure 5.5 Basic concepts in OntoRM ontology are hierarchically structured using the IS-A relation.
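The IS-A hierarchy for functional units maps directly onto Java inheritance. The sketch below is illustrative only and is not taken from Appendix E: the class names follow Figure 5.5(a), while the helper IsACheck.canPair is a hypothetical name for a check that a candidate view pair joins one nucleophile with one electrophile.

```java
/** Root of the "functional unit" hierarchy of Figure 5.5(a). */
abstract class FuncUnit { }

abstract class Nucleophile extends FuncUnit { }
abstract class Electrophile extends FuncUnit { }

abstract class ChargedNu extends Nucleophile { }
abstract class NeutralNu extends Nucleophile { }
abstract class ChargedElec extends Electrophile { }
abstract class NeutralElec extends Electrophile { }

final class AlcoholOxygen extends NeutralNu { }
final class ChlorideIon extends ChargedNu { }
final class HydrogenIon extends ChargedElec { }
final class Carbocation extends ChargedElec { }

/** Hypothetical helper: uses the IS-A relation to screen candidate view pairs. */
final class IsACheck {
    /** A pair can react only if it joins one nucleophile with one electrophile. */
    static boolean canPair(FuncUnit a, FuncUnit b) {
        return (a instanceof Nucleophile && b instanceof Electrophile)
            || (a instanceof Electrophile && b instanceof Nucleophile);
    }
}
```

With this structure, a pair of two nucleophiles (for example a halide ion and an alcohol oxygen) is rejected by type alone, before any chemical change is predicted.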


NucleophileView [
  hasName => STRING;
  hasNeutral => BOOLEAN;
  hasCharge => BOOLEAN;
  hasBond => NUMBER;
  hasRsDegree => NUMBER;
  hasCarbocationStability => BOOLEAN;
  hasLonePair => NUMBER;
  hasReactivity => REACTIVITY_VAL;
  hasElectroNegativity => GREATER_LESSER;
  hasChargeOperator => PLUS_MINUS;
  hasBondOperator => ADD_REMOVE;
  hasNucleophilicity => NU_SCALE;
]

(a) Properties of nucleophiles

ElectrophileView [
  hasName => STRING;
  hasNeutral => BOOLEAN;
  hasCharge => BOOLEAN;
  hasBond => NUMBER;
  hasRsDegree => NUMBER;
  hasCarbocationStability => BOOLEAN;
  hasLonePair => NUMBER;
  hasReactivity => REACTIVITY_VAL;
  hasElectroNegativity => GREATER_LESSER;
  hasChargeOperator => PLUS_MINUS;
  hasBondOperator => ADD_REMOVE;
]

(b) Properties of electrophiles

Leaving_Group [
  hasName => STRING;
  hasBond => NUMBER;
  hasBondType => BOND_TYPE;
  hasLonePair => NUMBER;
  hasDegreeSubstituent => DEGREES;
  hasElectroNegativity => GREATER_LESSER;
  hasNucleophilicity => NU_SCALE;
  hasAtomAttachmentType => ATOM_ATTACH_TYPE;
  hasReactivity => REACTIVITY_VAL;
  hasBaseStrength => BASE_STRENGTH;
]

(c) Properties of a leaving group

Substrate [
  hasFunctionalGroupName => STRING;
  hasFunctionalGroupType => GRP_STRING;
  hasCarbonDegMainChain => NUMBER;
]

(d) Properties of a substrate

Alcohol [
  hasName => STRING;
  hasBondType => BOND_TYPE;
  hasReactivity => BOOLEAN;
  hasDegreeSubstituent => DEGREES;
  hasBaseStrength => BASE_STRENGTH;
  hasStability => BOOLEAN;
  hasLGType => LG_STRING;
  hasLGName => STRING;
]

(e) Properties of an alcohol

Alkyl_Halide [
  hasName => STRING;
  hasBondType => BOND_TYPE;
  hasReactivity => BOOLEAN;
  hasDegreeSubstituent => DEGREES;
  hasBaseStrength => BASE_STRENGTH;
  hasStability => BOOLEAN;
  hasLGType => LG_STRING;
  hasLGName => STRING;
]

(f) Properties of an alkyl halide

Figure 5.6 Properties of basic concepts defined in the ontology are encapsulated in the format of a Java class.

ReactionMechanism Sn1 [
  hasAlias => STRING;
  hasReactantNames => STRING;
  hasProduct => PROD_STRING;
  hasProductNames => STRING;
  hasDegreeSubstituent => NUMBER;
  hasReactivity => REACTIVITY_VAL;
  hasRateDetermineStep => WHAT_STEP_STR;
  hasReactionRateDependentNo => NUMBER;
  hasReactionRateDependentUnit => STRING;
  hasReactionRateDependentFactor => FACTOR_STR;
  hasProcessOrder => PROCESS_ORDER_STR;
  hasViewsPairConstraint => SPECIES_TYPE;
  hasSpecialCause => SOLVENT_TYPE;
  hasAllowedDegreeOfCarbon => NUMBER;
]

(a) Sn1 reaction mechanism

ReactionMechanism Sn2 [
  hasAlias => STRING;
  hasReactantNames => STRING;
  hasProduct => PROD_STRING;
  hasProductNames => STRING;
  hasDegreeSubstituent => NUMBER;
  hasReactivity => REACTIVITY_VAL;
  hasRateDetermineStep => WHAT_STEP_STR;
  hasReactionRateDependentNo => NUMBER;
  hasReactionRateDependentUnit => STRING;
  hasReactionRateDependentFactor => FACTOR_STR;
  hasProcessOrder => PROCESS_ORDER_STR;
  hasViewsPairConstraint => SPECIES_TYPE;
  hasSpecialCause => SOLVENT_TYPE;
  hasAllowedDegreeOfCarbon => NUMBER;
]

(b) Sn2 reaction mechanism

Figure 5.7 Chemical properties of SN1 and SN2.


Table 5.3: Data types and the associated values.

Data Type            Values (Quantity Spaces)
ADD_REMOVE           [add, remove]
ATOM_ATTACH_TYPE     E.g. “Carbon”, “Oxygen”
BASE_STRENGTH        [weak, strong]
BOND_TYPE            [single, double, triple, ring]
BOOLEAN              [yes, no]
DEGREES              [primary, secondary, tertiary]
FACTOR_STR           E.g. “concentration”
GREATER_LESSER       [>, <]
GRP_STRING           E.g. “Halide ion”, “Hydroxyl”
LG_STRING            E.g. “OH”, “Halide”
NU_SCALE             [low, high]
NUMBER               E.g. [1, 2, 3]
PLUS_MINUS           [+, -]
PROCESS_ORDER_STR    E.g. “make-bond, break-bond, make-bond”, etc.
PROD_STRING          [“Water”, “Alkyl Halide”, “Alcohol”]
REACTIVITY_VAL       [low, high]
SPECIES_TYPE         [“Neutral_Nucleophile + Charged_Electrophile”, “…”]
SOLVENT_TYPE         [temperature, nucleophilicity, pH]
STRING               E.g. “Chloride ion”, “Proton”, “Hydroxide”, “Alcohol”, etc.
WHAT_STEP_STR        E.g. “The break bond process called dissociation is the...”, etc.
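Since OntoRM is described as a set of Java classes with attributes only, the quantity spaces of Table 5.3 and a frame of Figure 5.6 can be pictured as enums and a field-only class. The sketch below is a hedged illustration, not the code of Appendix E: it covers only three slots of the Leaving_Group frame, the class and enum names are assumptions, and the only slot values used are ones stated in this chapter (the C−OH bond is a single bond and “OH” has base strength “strong”).

```java
/** Quantity spaces from Table 5.3 written as enums (type names are illustrative). */
enum BaseStrength { WEAK, STRONG }
enum BondType { SINGLE, DOUBLE, TRIPLE, RING }

/**
 * Attribute-only frame loosely following Leaving_Group in Figure 5.6(c).
 * Only a subset of the slots is shown; there is deliberately no processing logic.
 */
final class LeavingGroupFrame {
    final String hasName;
    final BondType hasBondType;
    final BaseStrength hasBaseStrength;

    LeavingGroupFrame(String hasName, BondType hasBondType, BaseStrength hasBaseStrength) {
        this.hasName = hasName;
        this.hasBondType = hasBondType;
        this.hasBaseStrength = hasBaseStrength;
    }
}

/** Hypothetical factory holding one example instance. */
final class LeavingGroupExamples {
    /** "OH": single bond to carbon, base strength "strong" as stated in the text. */
    static LeavingGroupFrame hydroxyl() {
        return new LeavingGroupFrame("OH", BondType.SINGLE, BaseStrength.STRONG);
    }
}
```

Keeping the frame free of methods reflects the stated design goal that OntoRM supplies knowledge but not how it is used; any decision based on hasBaseStrength would sit in the knowledge validation routine.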

5.3.3.2 Validation Examples

This section demonstrates examples of how the OntoRM ontology can be used for

making sure that a correct piece of chemical data is passed to the simulator. What

needs to be validated? Earlier it was stated that only the “mechanisms” portion (e.g.

SN1, SN2, and the substrates) is considered. Why just reaction mechanisms? This is

because chemical processes and behaviour (e.g. proportionalities and direct influences)

are already handled by QPT. OntoRM also serves to constrain the use of some processes and

sequences that will not lead to a valid mechanism or final product so that wrong

reaction steps can be avoided. Basically, the semantic consistency for each small

reaction step in a proposed reaction mechanism is validated using the OntoRM.

Based on Figure 5.6 and Figure 5.7, the following validation examples are provided. In

each case, one of the definitions in OntoRM is retrieved:


1. With regard to leaving groups, the reaction rate of both the SN1 and SN2 is increased

if the leaving group is a stable ion and a weak base. For example, iodide is a better

leaving group than bromide and bromide is a better leaving group than chloride (I >

Br > Cl > F). Also, alkyl fluorides do not undergo nucleophilic substitution. To

correctly make a prediction, the simulator needs to check the name of the leaving

group before carrying out the prediction. This information can also be included in

the pre-conditions slot of a QPT model such as to put “exclude fluorides for SN1 and

SN2 simulation”.

2. When (CH3)3CBr reacts with H+Cl−, there is a substitution of bromine by chloride,

but not in the case of (CH3)3CF and H+Br−, i.e., no reaction between the two species

(since fluoride is highest in its nucleophilicity). In the framework, the

“hasNucleophilicity” attribute is used to determine whether the simulation will

proceed or simply return the message: “no reaction since the X2 is less stable than

X1 when leaves the compound”. An example of the Java implementation of this

validation case is given in Figure E.7 (Appendix E). This is another example of a

special case.

3. The “hasAllowedDegreeOfCarbon” in organic mechanism definition will check

whether an organic mechanism that is suggested by the simulator is acceptable.

This is possible since the attribute is linked to the allowable degree of carbon

attachment. For example, first degree carbon cannot undergo SN1. This can avoid

wrong simulation at the very beginning stage.

4. Alkyl groups release electron density better than the hydrogen, so the more alkyl

groups are attached to a positively charged carbon, the more stable the carbocation.

Such information (e.g. “hasStability” and “hasDegreeSubstituent” in alkyl_halide

definition) is included in the ontology to inform that “reaction can proceed if alkyl

groups are 2 or 3 degrees carbon”. Besides, the “hasDegreeSubstituent” field in the


alcohol and alkyl halide definition can also be used to check whether a “break-

bond” process should be initiated (even though there is a suitable pair of views

available in the VIS). Such checking helps avoid producing an unstable carbocation.

As an example, the system is expected to reject a suggestion to delete the bond

between the C and O of “CH3OH”, since this would result in a highly unstable

intermediate (the “CH3+” in this case). Only a tertiary carbon is guaranteed to be in a stable

state when the leaving group (LG) is departing from the main chain carbon. This is

achieved by referring to the “tertiary” value of “yes” for the “hasStability”.

Together, these checks ensure that the correct chemical process is activated. An example

of the Java implementation for this case is given in Figure E.9 (Appendix E).

5. The IS-A structuring of the reacting units can also help in pairing up the individual

views. Suppose that the <Br−, alcohol oxygen> pair is present in the VIS together

with the <Br−, C+> pair, there is no reaction for the former pair because both of

them are nucleophiles. If validation is performed, time is saved from predicting the

chemical changes that will not happen. An example of the Java implementation for

this particular case of validation is presented in Figure E.8 (Appendix E).

6. In equation 3.2 (page 68), there are competing pairs of individual views to activate

different processes, such as between <C, alcohol oxygen> and <H+, alcohol

oxygen>. OntoRM can be used to resolve the situation. In this case, the

“hasBaseStrength” attribute in leaving group definition is needed, where “OH” is

found to have the “strong” value for its base strength. From our chemical KB, it can

also be found that “OH” is a poor leaving group, so it is not likely to break bond.

Otherwise the “OH−” will be very reactive thus violating the chemical principle of

moving towards a more stable state. However, the <H+, alcohol oxygen> pair has

no such restriction (since based on the result of the view pair identification module,


it should be a “make-bond” process activation), so between the two candidate pairs,

the reasoning engine will suggest that the oxygen atom be protonated.

7. When a substrate is recognized as “alcohol”, the alcohol definition will be retrieved

and checked for its LG name and base strength. If the functional group to be

substituted is a “poor” leaving group, then a “make-bond” process will be

suggested. In OntoRM, the functional group “OH” (recognized through the parent’s

definition: “hasLGType”) possesses a chemical property of “hasBaseStrength”

which is determined to be “strong” (from the chemical KB), so the C−OH bond is

difficult to break. It has to first undergo a protonation process. If the checking is

not performed, the bond between C and O could break, since C is δ+ and O is δ−.

This would be an incorrect step if allowed to proceed.

8. In the definition of the substrate, there is an attribute called

“hasFunctionalGroupType” that connects it to a list of valid functional groups to be

used. This particular definition can be used to validate whether the initial compound

has completed a proper substitution of its nucleophile. For example, in Figure 3.3

(page 74), the “OH2+” that is attached to the main chain carbon is checked by

OntoRM. It was found that it is not a valid functional group, so it is not the final

product yet. Therefore, a further reaction step is recommended. After each reaction

simulation, the QR program will come to this part to check whether more reaction

steps are needed or the simulation is complete.

9. The “hasElectroNegativity” can also help to determine, in a “break-bond” process,

which atom's lone pair electrons are more available for donation. For instance,

when the C−Br bond breaks, a non-bonded electron pair on a less electronegative

atom (the C) is more available for donation than a non-bonded electron pair on a

more electronegative atom. Similarly, when the “C−OH2+” bond breaks, both electrons move onto the oxygen, restoring its second lone pair and neutralizing the charge. This is because the oxygen atom is more electronegative than the carbon atom and therefore has a greater share of the electrons in the bond.
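The checks described in items 7 and 8 can be illustrated with a minimal Python sketch. It is not the QRiOM implementation (which uses Prolog facts and Java); the names `BASE_STRENGTH`, `VALID_FUNCTIONAL_GROUPS`, `suggest_process` and `needs_further_step` are hypothetical, and the table entries are a small illustrative subset standing in for the OntoRM definitions.

```python
# Hypothetical stand-in for the "hasBaseStrength" property in OntoRM.
# A strong base is a poor leaving group; a weak base is a good one.
BASE_STRENGTH = {"OH": "strong", "OH2+": "weak", "Cl": "weak", "Br": "weak"}

# Hypothetical stand-in for the "hasFunctionalGroupType" list of a substrate.
VALID_FUNCTIONAL_GROUPS = {"OH", "Cl", "Br"}


def suggest_process(leaving_group):
    """Item 7: a poor leaving group must first undergo a make-bond step."""
    if BASE_STRENGTH[leaving_group] == "strong":
        return "make-bond"   # e.g. protonate the OH before anything else
    return "break-bond"      # a good leaving group may depart directly


def needs_further_step(attached_group):
    """Item 8: an invalid functional group means the product is not final."""
    return attached_group not in VALID_FUNCTIONAL_GROUPS
```

Under these assumptions, `suggest_process("OH")` proposes a “make-bond” step, while the protonated group “OH2+” is both allowed to leave and flagged as an unfinished product.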

OntoRM helps structure and organize the chemical knowledge used by the simulator,

where proper use of chemical knowledge during simulation is achieved via the

knowledge validation module. Common ontologies typically specify only some of the

formal constraints that hold over objects in the input and output in the domain of

discourse. Commitment to a common ontology is a guarantee of consistency but not completeness, and OntoRM is no exception. This means that the ontology

cannot be used to validate all organic compounds in the domain of discourse, but

consistency can be achieved each time the vocabulary of OntoRM is used for validation

purposes.

5.3.4 The Substrate Recognizer

The inputs are represented as basic facts in the format of predicate logic (Prolog). Each

input has an internal structure that represents its decomposed units. This means there is

a corresponding decomposed array for each substrate. This is so because the chemical

process to be used is determined by the view pairs consisting of the decomposed units

(identified as nucleophiles and electrophiles). The decomposed units will also be placed

in the view structure arrays. Processing will proceed with these decomposed

substructures.
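A minimal Python sketch of this decomposition step is given here for illustration only. The lookup tables `DECOMPOSED` and `ROLES` and the functions `recognize` and `view_pairs` are hypothetical; in QRiOM the corresponding data are held as Prolog facts, and the actual decomposition rules are not reproduced.

```python
from itertools import product

# Hypothetical decomposition table: each input maps to its decomposed units.
DECOMPOSED = {
    "CH3CH3CH3COH": ["CH3CH3CH3C", "OH"],
    "HCl": ["H+", "Cl-"],
}

# Hypothetical role assignment for the units that can react.
ROLES = {"OH": "nucleophile", "Cl-": "nucleophile", "H+": "electrophile"}


def recognize(formulas):
    """Decompose each input and file its reactive units by role."""
    vis = {"nucleophile": [], "electrophile": []}
    for formula in formulas:
        for unit in DECOMPOSED[formula]:
            role = ROLES.get(unit)
            if role is not None:      # units without a role are left out
                vis[role].append(unit)
    return vis


def view_pairs(vis):
    """Candidate view pairs: every electrophile with every nucleophile."""
    return list(product(vis["electrophile"], vis["nucleophile"]))


vis = recognize(["CH3CH3CH3COH", "HCl"])
pairs = view_pairs(vis)
```

Here `pairs` holds the candidate view pairs from which a process is later selected.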


5.3.5 The Model Constructor for Organic Processes

QR can work for a variety of domains and purposes. However, in the entire process,

“automatic model construction” is still a challenge, and it has not been satisfactorily addressed thus far. As mentioned by Bredeweg and Struss (2003),

“…automated model building, efficient algorithms and QR techniques to generate task-oriented models from generic ones are among the QR research challenges…”

As explained earlier, this work constructs QPT models for processes from the set of

view pairs available. This technique of automating model construction addresses one of

the issues of QR research – “model automation”.

QPT is used for modelling the domain knowledge. In particular, it is used to represent

the chemical theories that model the chemical intuition possessed by chemists when

solving organic reaction problems. Basically, a chemical process’s functional

characteristics are represented using QPT and its processing description is implemented

as a set of QR algorithms (since QPT does not describe how these models are used).

The top level design of the views and the process model automation steps is outlined in

Figure 5.8.


INDIVIDUAL VIEWS AND PROCESSES MODELLING ALGORITHM

Qualitative_Modelling(substrate, reagent, QPT_MODEL)
1. Examine user inputs
   1.1 Retrieve constituent parts of the substrate from chemical KB
2. Recognize structural units in substrates
   2.1 Assign units as either nucleophile or electrophile
   2.2 Store them in View Instance Structure (VIS)
3. Retrieve chemical facts and chemical properties of the nucleophile and electrophile
   3.1 Compose the four slots of a QPT view
4. Suggest a chemical process based on the view pair
5. Retrieve the chemical facts and chemical theories of the suggested organic process
   5.1 Assign process quantity to the direct-influence slot of the QPT model
   5.2 Compose the other three slots: [Individuals, Quantity-Cond, Relations]

Figure 5.8 The main steps in the model constructor module.
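Step 5 of the algorithm above can be pictured with the following Python sketch of a four-slot process model. This is an assumption-laden illustration: the class `QPTProcess`, the function `compose_model` and, in particular, the slot contents are placeholder strings invented here, not the statements that QRiOM actually generates.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class QPTProcess:
    """A process model with the four slots named in step 5."""
    name: str
    individuals: List[str] = field(default_factory=list)
    quantity_conditions: List[str] = field(default_factory=list)
    relations: List[str] = field(default_factory=list)
    influences: List[str] = field(default_factory=list)


def compose_model(process_name, electrophile, nucleophile, quantity):
    model = QPTProcess(name=process_name)
    # Step 5.1: the process quantity goes into the direct-influence slot.
    model.influences.append(f"I+({quantity}, Am[{process_name}-rate])")
    # Step 5.2: compose the other three slots (placeholder statements).
    model.individuals = [electrophile, nucleophile]
    model.quantity_conditions = [f"Am[lone-pair({nucleophile})] > zero"]
    model.relations = [f"charge({nucleophile}) Q+ {quantity}"]
    return model


model = compose_model("make-bond", "H+", "O", "no-of-bond(O)")
```

Keeping the slots as separate fields mirrors the way the model is later shown to the learner slot by slot.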

5.3.6 The Reasoning Engine for Reaction Simulation

This is the heart of the framework where the mental model of a chemist is reproduced

during runtime by qualitative reasoning. The QR algorithm can mimic/simulate a

chemist’s way of reasoning when trying to propose a mechanism or to explain the

proposed mechanism. In the following subsections, the algorithm for the reasoning

engine is presented. The set of algorithms works well with the qualitative data (e.g.

chemical theories) represented in QPT. This is novel as far as chemistry software is

concerned.

Changes are caused by continuous chemical processes which provide the notion of

mechanism for causality. These changes propagate through the system, indicating the causal relationships between quantities. A set of algorithms to “use” the knowledge has

been developed. Figure 5.9 gives the main steps of the reasoning algorithm

implemented in QRiOM.


QPT-BASED SIMULATION ALGORITHM

Q_Simulation(QPT_model, OUTPUT)
1. Perform qualitative reasoning on the constructed QPT model
   1.1 Store the process’s entry conditions
   1.2 Store the directly influenced process quantity
   1.3 Keep track of the state transition (handled by the QSA module)
2. IF process_stopping_condition = true THEN
      Store propagated effects in special purpose data structures
      Store new individuals in the VIS
      Update the VIS
   END_IF
3. Update the substrate’s molecular structure (handled by the MUR module)
4. IF VIS contains reactive individuals THEN
      Determine a suitable chemical process
      Go to step 1
   ELSE
      Retrieve final product from the VIS
      Call OntoRM to check for validity of the predicted product
      Call OntoRM to check for the possible order of process execution
      Write the final product and the proposed mechanism to OUTPUT
   END_IF
5. Return OUTPUT

Figure 5.9 The main steps of the simulation algorithm.
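The control flow of the simulation algorithm (repeat while the VIS still contains reactive individuals) can be sketched in Python as follows. The rule table inside `determine_process` and the species labels (“ROH”, “ROH2+”, “R+”, “X-”, “RX”) are toy assumptions chosen to reproduce a three-step SN1-like route; the qualitative reasoning, the QSA and MUR modules and the OntoRM checks are deliberately omitted.

```python
def determine_process(vis):
    """Toy rule set: return (process, consumed, produced) or None."""
    if "H+" in vis and "ROH" in vis:
        return "make-bond", ("H+", "ROH"), ("ROH2+",)
    if "ROH2+" in vis:
        return "break-bond", ("ROH2+",), ("R+", "H2O")
    if "R+" in vis and "X-" in vis:
        return "make-bond", ("R+", "X-"), ("RX",)
    return None  # no reactive individuals remain


def simulate(initial_species):
    vis = set(initial_species)
    mechanism = []
    while True:
        step = determine_process(vis)
        if step is None:
            break                      # step 4, ELSE branch
        process, consumed, produced = step
        vis -= set(consumed)           # individuals used up by the process
        vis |= set(produced)           # new individuals stored in the VIS
        mechanism.append(process)
    return vis, mechanism


final_species, mechanism = simulate({"ROH", "H+", "X-"})
```

With these toy rules the loop ends once only unreactive species are left, and `mechanism` records the order in which the processes fired.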

5.3.7 The Causal Model Generator

Generating explanations in software is neither easy nor straightforward. Nevertheless, when the behavioural aspect of the reaction problem is

described in qualitative terms, the “causality” concept is inherent in the model, where

functional dependencies can help explain the effects of propagation caused by a

process. Qualitative representation using the constructs of QPT provides us with a

simple means to capture the intuitive, especially causal aspects of human mental

models. The causal model presented in Figure 4.16 provides a medium for the students

to examine (and trace) graphically the dependency between chemical parameters. This

can help to scaffold the students’ reasoning ability. The inspection of cause-effect

chains can help a learner to pick up the underlying concept better than merely

memorizing the reaction steps or basic facts. QRiOM is able to provide this type of

explanation on demand. Appendix D.10 and Appendix D.11 show two screenshots of


the causal graphs generated by QRiOM for explaining respectively the SN1 and SN2

mechanisms in producing the outcomes of reactions. The component that handles this

task is the QSA. Figure 5.10 presents the algorithm for QSA that constantly keeps track

of the state transition of a molecule in order to generate the causal graph on demand.

The essential steps and data needed to generate a causal graph are shown in Figure 5.11.

QUANTITY SPACE ANALYZER

QA_Analyzer(quantity_name, initial states of affected quantities, quantity space)
1. Store initial states of each quantity in special purpose arrays
2. Perform qualitative arithmetic
   2.1 Examine the sign and direction of change (derivative) of the quantities
   2.2 Check the relevant quantity spaces for new values
   2.3 Update qualitative states
   2.4 Store propagated effects in special purpose data structures
3. Stop

Figure 5.10 Main steps in the QSA module.
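As a rough illustration of step 2, the following Python sketch moves a quantity through an ordered quantity space according to its direction of change. The functions `next_value` and `analyze` are hypothetical, and clamping at the ends of the quantity space is an assumption made here for simplicity.

```python
# Ordered quantity space for the "charge" quantity.
CHARGE_SPACE = ["neg", "neutral", "pos"]


def next_value(space, current, direction):
    """Move one landmark up (+1), down (-1) or stay (0) in the space."""
    index = space.index(current) + direction
    index = max(0, min(len(space) - 1, index))  # assumed: clamp at the ends
    return space[index]


def analyze(space, initial, directions):
    """Return the state history produced by a sequence of directions."""
    history = [initial]
    for direction in directions:
        history.append(next_value(space, history[-1], direction))
    return history


# Oxygen charge: neutral, positive after protonation, neutral after leaving.
oxygen_history = analyze(CHARGE_SPACE, "neutral", [+1, -1])
```

The stored history is the kind of record from which a causal graph or a parameter state history could later be displayed.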


Figure 5.11 The flowchart for generating a causal graph.

5.4 Storing Molecular Patterns in Software

Chemists are pioneers in building electronic databases for storing chemical compounds.

The huge amount of chemical information is difficult for humans to handle manually (Engle and Gasteiger, 2002). Therefore, chemists began storing information in electronic form quite early. Each year more than 6,000,000 new chemical

compounds are registered in the Chemical Abstract database

(http://www.cas.org/substance.html). In this work, chemical structures are represented

by “structure diagrams” which consist of the atoms of molecules and how these atoms


are connected by chemical bonds. Such a representation is called a connection table.

Examples are shown in the following section.

5.4.1 Design of Attributes and Methods for an Atom

The properties of atoms need to be updated from one reaction step to another. Some

essential attributes and methods associated with an atom are given in Table 5.4.

Table 5.4: Some attributes and methods associated with an atom.

Attributes (and the role of each attribute during an organic reaction simulation):

Molecular structure: The molecule to which the atom belongs.

Bond neighbours: The list of bonds which are connected to the atom.

Atom neighbours: The list of atoms which are only one bond away from the atom.

Own lone pair electrons: The number of electrons that are present in the valence shell and do not belong to a bond.

Own charge: The charge of an atom is determined by its own valence electrons and by the number and order of the bond neighbours.

Own no. of covalent bonds: The number of links (bonds) that connect the atom to other neighbouring atoms. Only the atoms that are connected one bond away are counted.

Methods (and the main role played by each method associated with an atom during an organic reaction simulation):

Connect to bond: Given a new bond that is to be attached to the atom, the method updates the list of bond neighbours and the lone pair attributes accordingly using the QPT indirect influence.

Connect to atom: This method will create a new bond to form a link to another atom. In this work, it is either a nucleophile or an electrophile.

Update own charge, lone pair and covalent bond total: This method further checks and updates other dependent parameters such as the charge of both the atom in question and the approaching ones.

Disconnect from bond: This method carries out the inverse procedure from the method that connects to a bond.

Disconnect from atom: This method carries out the inverse procedure from the method that connects to an atom.


In this work, the attributes and methods of atom and bond objects focus on representing

and modifying the connectivity of the molecular structure, in a way similar to the work

described in Mavrovouniotis and Forsythe Jr (1998). When a suitable chemical

bonding is determined, then the bonding process will be activated. Suppose it is a

“make-bond” process. The first task is to create a bond (through the “direct influence”

construct of the QPT) on the atom in question. The next task is to update the states of

other dependent chemical parameters, as well as the states of chemical parameters for

the incoming nucleophiles/electrophiles. The “bond neighbour” and “atom neighbour” attributes will be updated accordingly: one bond is added to the former, and the list of atoms that are one link away is extended for the latter. The extra atom attachment will also be used to update the overall molecular structure of the organic compound.
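The bookkeeping described above can be sketched with a small Python class. It is a simplified illustration rather than the atom object used in QRiOM: the class `Atom`, its method names and the helper `link` are assumptions, only single bonds are modelled, and the charge update follows ordinary formal-charge accounting for a donated or retained electron pair.

```python
class Atom:
    """Toy atom record; attribute names loosely follow Table 5.4."""

    def __init__(self, symbol, lone_pairs=0, charge=0):
        self.symbol = symbol
        self.lone_pairs = lone_pairs
        self.charge = charge
        self.atom_neighbours = []  # atoms exactly one bond away

    @property
    def covalent_bonds(self):
        return len(self.atom_neighbours)

    def link(self, other):
        """Record an existing bond without touching charge or lone pairs."""
        self.atom_neighbours.append(other)
        other.atom_neighbours.append(self)

    def connect_to_atom(self, electrophile):
        """Make-bond: self (the nucleophile) donates one lone pair."""
        if self.lone_pairs < 1:
            raise ValueError("no lone pair available for donation")
        self.link(electrophile)
        self.lone_pairs -= 1
        self.charge += 1
        electrophile.charge -= 1

    def disconnect_from_atom(self, other):
        """Break-bond: self keeps both bonding electrons."""
        self.atom_neighbours.remove(other)
        other.atom_neighbours.remove(self)
        self.lone_pairs += 1
        self.charge -= 1
        other.charge += 1


c, o, h1 = Atom("C"), Atom("O", lone_pairs=2), Atom("H")
c.link(o)
o.link(h1)
h2 = Atom("H", charge=1)           # the incoming proton
o.connect_to_atom(h2)              # protonation
after_make = (o.covalent_bonds, o.lone_pairs, o.charge, h2.charge)
o.disconnect_from_atom(c)          # the C-O bond breaks
after_break = (o.covalent_bonds, o.lone_pairs, o.charge, c.charge)
```

After protonation the oxygen carries three bonds and a positive charge; after the break it regains its second lone pair and becomes neutral, while the carbon becomes positive.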

5.4.2 Connection Table

A connection table has a 2D structure representation. Figure 5.12 shows the computer

representation of the “C–O–H” structure.

Atoms
Atom1    1
Atom2    2
Atom3    3
:        :
AtomN    N

(a) The individual atoms in a substrate’s functional group

         Atom1   Atom2   Atom3   ..   AtomN
Atom1    0
Atom2            0
Atom3                    0
:                                0
AtomN                                 0

(b) The connection table is presented as a 2D array

Figure 5.12 A substrate’s functional group represented as a connection table.


Bond order is not considered in our work since only single bond species are used. A

specific example is given in Figure 5.13, where the initial bond connection for the

functional group of the alcohol is shown. In the reasoning framework, the table entry is

updated as soon as the reasoning is performed on the structure during the “make-bond”

process (“protonation” in this case), as shown in Figure 5.14. These tables can then be

used to update the 2D molecule table.

     C   O   H   Remarks
C    0   1   0   “C” has 1 chemical bond with “O”.
O    1   0   1   “O” forms 2 chemical bonds.
H    0   1   0   “H” has 1 chemical bond which is connected to the “O”.

Figure 5.13 Connection table for initial structure of the substrate.

      C   O   H1  H2  Remarks
C     0   1   0   0   -
O     1   0   1   1   Three bonds on “O” are unstable
H1    0   1   0   0   -
H2    0   1   0   0   -

Figure 5.14 Connection table after the “protonation” process (“make-bond”). The digit “1” is filled in the correct entry based on the individuals that activate the process. “H2” indicates the newly added atom.
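The update from Figure 5.13 to Figure 5.14 can be reproduced with a short Python sketch, given here as an illustration under the assumption that the connection table is held as a list of lists. The function `add_atom` is a hypothetical helper, not a routine taken from QRiOM.

```python
def add_atom(labels, table, new_label, attach_to):
    """Return copies of labels and table extended by one bonded atom."""
    i = labels.index(attach_to)
    new_labels = labels + [new_label]
    new_table = [row + [0] for row in table]   # extra column for the new atom
    new_table.append([0] * len(new_labels))    # extra row for the new atom
    new_table[i][-1] = 1                       # bond: attach_to -> new atom
    new_table[-1][i] = 1                       # bond: new atom -> attach_to
    return new_labels, new_table


# Figure 5.13: initial C-O-H connectivity.
labels = ["C", "O", "H1"]
table = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]

# Protonation ("make-bond"): attach the new atom H2 to O.
labels2, table2 = add_atom(labels, table, "H2", "O")
bonds_on_oxygen = sum(table2[labels2.index("O")])
```

Summing a row gives the number of bonds on that atom, which is how the “three bonds on O” condition can be detected from the table alone.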

5.4.3 The Molecule Table

The structure of the substrate is constantly being updated from one reaction step to

another until the entire simulation is finished. The changing of a compound’s molecular

structure is recorded in special purpose data structures. These structures can be

retrieved at a later step for explaining the proposed mechanism by showing the reaction

steps that occurred. The component that handles this task is the MUR. Sample structures

will be shown in the following subsection. The output generated by the computer can


be found in Appendix D.5 – Appendix D.7. The main steps in updating the whole

substrate are presented in Figure 5.15.

Molecule_Update_Routine (MUR)
IF process_name = “make-bond” THEN
   Recognize incoming element and the index of the atom in the molecule to form a bond
   Add a bond to the atom to be attached to the incoming element
   Update the molecule_table
   Update charge and lone pair properties for both the affected atoms in atom_property_table
   Add the new element into row+1 and col+1 of the bond_table
   Insert “1” in the table entry of the newly inserted element in the bond_table
ELSE_IF process_name = “break-bond” THEN
   Recognize the position of the atom to be detached
   Remove a bond from the atom
   Store the removed element in VIS
   Update the molecule_table
   Update charge and lone pair properties for both the affected atoms in atom_property_table
   Remove all elements starting from the index of affected atom until lastRow/lastCol in the bond_table
END_IF

Figure 5.15 Algorithmic steps in the MUR module that updates the molecule table in order to prepare the reaction route of a chemical reaction.

Figure 5.16 shows the structure of the alcohol substrate before any reaction takes place

while the content of the molecule after the “H” has been added is shown in Figure 5.17

(observe the shaded entries):

     1     2     3     4     5     6     7
1                CH3
2                |
3    CH3         C           O           H
4                |
5                CH3

Figure 5.16 A molecule table is represented as a 2D array. This is the initial structure of the alcohol substrate.


     1     2     3     4     5     6     7
1                CH3         H
2                |           |
3    CH3         C           O           H
4                |
5                CH3

Figure 5.17 The “H” has been attached to the main compound. This is the effect of the generic “make-bond” process.

In Figure 5.17, to add a covalent bond to “O” (Column 5, Row 3), a link is formed at

(Column 5, Row 2) and an atom “H” is placed at (Column 5, Row 1). To remove a

covalent bond from “C” (Column 3, Row 3), the entries for “Column + 1” onwards will

be replaced by “ ” (blank). In other words, for each column, from 4 (column adjacent

to carbon) to 7 (last column), entries in rows from 1 to 5 are deleted.
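The two table operations just described can be sketched in Python as follows. The helper names are hypothetical, the table is assumed to be a 5-row by 7-column grid addressed with the 1-based (column, row) coordinates used in the text, and the placement of the left “CH3” in column 1 and the right “H” in column 7 is an assumption.

```python
ROWS, COLS = 5, 7


def empty_table():
    return [["" for _ in range(COLS)] for _ in range(ROWS)]


def put(table, col, row, value):
    """Write a cell using the 1-based (column, row) convention of the text."""
    table[row - 1][col - 1] = value


def make_bond_above(table, col, row, atom):
    """Add a link one row above (col, row) and the new atom above the link."""
    put(table, col, row - 1, "|")
    put(table, col, row - 2, atom)


def break_bond_right(table, col):
    """Blank every cell in the columns to the right of the given column."""
    for c in range(col + 1, COLS + 1):
        for r in range(1, ROWS + 1):
            put(table, c, r, "")


table = empty_table()
for col, row, value in [(3, 1, "CH3"), (3, 2, "|"), (1, 3, "CH3"), (3, 3, "C"),
                        (5, 3, "O"), (7, 3, "H"), (3, 4, "|"), (3, 5, "CH3")]:
    put(table, col, row, value)

make_bond_above(table, 5, 3, "H")      # protonate the O at (Column 5, Row 3)
new_atom, new_link = table[0][4], table[1][4]
break_bond_right(table, 3)             # detach everything right of the C
```

After the second call only the carbon and its three methyl groups remain in the table.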

A real challenge was faced in the design of the algorithms used for drawing the 2D

patterns for the substrates’ molecular structures. It was decided in the end to focus only

on the structural units (nucleophilic centre) that will undergo substitution. In this way,

most atoms (especially in the long chain of a compound) remain unaffected and there is

no need to carry out updating tasks on these atoms.

5.5 Knowledge Structuring

The intention of structuring the chemical knowledge used in the simulator is to

overcome what was lacking in QALSIC. The chemical KB is organized as having three

abstraction levels and four types. Each type of knowledge is stored in a different Prolog

file, so that whenever a set of related chemical facts or theories is needed, the correct

file will be retrieved. Besides, the order of use of the knowledge can be traced by

keeping a log of the different chemical KB files used during qualitative simulation.


Such a log file is useful when we want to find out “when” each type of the knowledge is

used in a particular reasoning route. As noted in Chapter 2, QALSIC has quite a

number of qualitative processes but how they are structured in the system for efficient

retrieval is not well-defined in the literature. To address this issue, the domain

knowledge is organized by grouping all chemical theories specific to an organic

reaction in the same data store. With this measure, fast retrieval for a set of related

chemical information for an identified chemical process can be achieved. Table 5.5

defines the meanings for the three levels of knowledge used in QRiOM, while the four knowledge types and their roles, together with their knowledge levels, are given in Table 5.6. The inclusion of knowledge typing and the different abstraction levels of knowledge is motivated by the work described in Bredeweg (2001).

Table 5.5: Three abstraction levels of knowledge for use in QRiOM.

Abstraction Level: Description

Domain Knowledge: This level of knowledge is task neutral (i.e. “how” they are used is not specified). The design is therefore looking for relations that can be used under different organic reaction mechanisms. Examples are the declarative facts and relations.

Inference Knowledge: It specifies how the domain knowledge can be used for qualitative reasoning. It points out the role the domain knowledge plays in the reasoning process. This level of knowledge can be represented as a model that describes the real world system in which the behaviour of that system does not change during a period of time. Examples are conceptual and conditional knowledge stored in the chemical KB that are used for composing QPT models for organic processes.

Strategic Knowledge: It controls the overall reasoning, e.g. how tasks can be selected for achieving goals. The steps defined in the QR algorithms are considered as strategic knowledge.


Table 5.6: Knowledge types, abstraction levels and roles for use in QRiOM.

Each entry below lists the knowledge type, its abstraction level(s), its roles, and examples specifically designed for organic reaction mechanisms.

Knowledge Type: Conceptual
Abstraction Level: Domain Knowledge
Roles:
• They are not tied up with any execution
• They are stored in chemical KB as facts and relations
Examples:
• reagent(‘HCl’)
• reagent(‘HBr’)
• nucleophile(‘Cl-’, charged)
• electrophile(‘H+’, charged)
• is_a(‘OH’, leaving-group)
• is_a(‘H’, proton)
• product_formed(water, ‘H2O’, stable)
• substrates(alcohol, ‘CH3OH’)
• functional_unit(nucleophile)
• functional_unit(electrophile)

Knowledge Type: Computational
Abstraction Level: Domain Knowledge; Inference Knowledge
Roles:
• Mainly used by the reasoning algorithm during simulation to compute and update the states of chemical parameters
Examples:
• covalent_bond(‘C’, 4, stable)
• covalent_bond(‘O’, 2, stable)
• covalent_bond(‘H’, 1, stable)
• lone_pair(‘C’, 0, stable)
• lone_pair(‘H’, 0, stable)
• lone_pair(‘O’, 2, stable)
• q_space(charge, ‘[neg, neutral, pos]’).

Knowledge Type: Conditional
Abstraction Level: Strategic Knowledge; Inference Knowledge
Roles:
• The knowledge used in partial model for “when” to apply the model fragment
• E.g. those statements defined in quantity-conditions slot of a QPT model
Examples:
The following two statements:
• electro-negativity(‘O’) > electro-negativity(‘C’)
• Am[no-of-bond(‘O’)] > Am[max-bond-allowed(‘O’)]
are represented in software, as follows:
• electronegativity(‘O’, ‘>’, ‘C’)
• greater(no_of_bond, covalent_bond)

Knowledge Type: Additional
Abstraction Level: Strategic Knowledge; Inference Knowledge
Roles:
• The “pre-condition” used in organic process models
• This type of knowledge is normally used to disambiguate situations
• The definitions can also support special cases reasoning
Examples:
• leaving_group(‘OH’, poor)
• leaving_group(‘Cl’, good)
• non_reactive_unit(‘H’, ‘CH4’)
• nucleophilicity(charged, ‘>’, neutral)
• nucleophilicity(charged, ‘>’, neutral)
• stable_ion(‘Cl-’)
• stable_ion(‘Br-’)
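To show how facts of this kind might be consulted, the following Python sketch stores a few of the example facts as tuples and evaluates the condition “Am[no-of-bond(‘O’)] > Am[max-bond-allowed(‘O’)]”. The tuple encoding and the functions `query` and `exceeds_stable_bonds` are assumptions made for illustration; QRiOM itself keeps these facts in Prolog files.

```python
# A few facts from Table 5.6, encoded as (predicate, argument, ...) tuples.
FACTS = {
    ("covalent_bond", "C", 4, "stable"),
    ("covalent_bond", "O", 2, "stable"),
    ("covalent_bond", "H", 1, "stable"),
    ("leaving_group", "OH", "poor"),
    ("leaving_group", "Cl", "good"),
    ("electronegativity", "O", ">", "C"),
}


def query(predicate, *pattern):
    """Return the matching facts; None in the pattern acts as a wildcard."""
    return [
        fact for fact in FACTS
        if fact[0] == predicate
        and len(fact) == len(pattern) + 1
        and all(p is None or p == v for p, v in zip(pattern, fact[1:]))
    ]


def exceeds_stable_bonds(atom, no_of_bond):
    """True when the atom carries more bonds than its stable number."""
    (_, _, allowed, _), = query("covalent_bond", atom, None, "stable")
    return no_of_bond > allowed
```

For example, `exceeds_stable_bonds("O", 3)` holds for the protonated oxygen, which is the kind of condition that can trigger a subsequent “break-bond” process.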


5.6 The Protocol for Interacting with QRiOM

Implementing the qualitative reasoning framework is one of the objectives of this work.

In order to substantiate our claim that QPT models can be automated (described in

Chapter 3) and the qualitative reasoning framework is realistic, QRiOM has been

developed.

Figure 5.18 shows the problem solving model of QRiOM (i.e. the protocol to interact

with the software tool). The user can repeat any function as many times as he/she desires.

The tool also provides explanation via special functions to emphasize certain chemical

theories and the general concept of a particular chemical process. In order to facilitate

user control over a simulation task, the navigational interface includes the following

functions:

• Moving forward and back one screen at a time within the same reaction simulation.

• A list of clickable buttons is provided that can be accessed in random sequence.

Such features enable users to compare the various forms of results generated at the

end of a simulation in the same text area.

• Multiple tables to display what has been changed in an atom from the start state

until the end of a simulation.

• Jumping to a particular explanation page.

• System exit can be done easily (from all the major GUI pages).

• Each button is annotated with brief function and uses. This enables users to read

ahead before a particular button is clicked.

• Some buttons are disabled to prevent a simulation from being run in the wrong sequence. For instance, the model construction function needs to be run before a simulation can be performed.

• Even though users can click on any button that appears on an interface page, there is always a button framed in green to draw the user’s attention. This “coloured” button indicates the correct button to click.

• A periodic table is provided to check physical properties of an atom used in the

simulation.

• A glossary of chemistry and QPT terms (and jargon) is also included.

• A printer-friendly function is provided to send narrative notes to a printer.

[Flowchart boxes: Select substrate and reagent; Build process model (Automate QPT model construction); Run simulation; View final products and the reaction mechanism used; Examine the entire reaction route; Inspect qualitative model (QPT models); Analyze causal graphs in explanation page; Study changes in atoms’ chemical parameters. The boxes are annotated “Corresponds to A”, “Corresponds to B”, “Corresponds to C, D & E”, “Corresponds to F”, “Corresponds to G”, “Corresponds to H” and “Corresponds to H”.]

Figure 5.18 Protocol in using the simulator (Labels A – H can be found in Figure 5.19).

In this work, the tool is designed in such a way that the user is taken through the

modelling, simulation and explanation pages step-by-step. A computer screenshot of

the main interface of QRiOM is given in Figure 5.19.

[Screenshot of the main QRiOM interface, with its regions labelled A to I.]

Figure 5.19 Main interface of the QRiOM software.

The navigation is achieved through careful menu layout, where the input, output and

more explanation are logically divided on the same interface page. The choice to move

back and forth is also provided through the “Previous” and “Next” buttons. Java

program snippets for the main software modules embedded in the prototype are

presented in Appendix E. In particular, the Java methods/classes that support the

simulation and explanation tasks are provided.

5.7 Simulation Results and Discussion

In this section, the simulated results are presented as a collection of computer

screenshots. The correctness of the outputs was verified by students and chemistry

instructors. At the end of a simulation, the simulator returns the final products formed,

as well as the following simulation results and explanations:


• The entire reaction route of a qualitative simulation. This output helps explain why a certain atom leaves (or approaches) a given organic compound. It permits learners to study how a substrate’s molecular pattern changes from one reaction step to another (Figure 5.20).

• The qualitative model representing the chemical process specification used in

predicting the behaviour of a chemical reaction (Figure 5.21).

• A causal graph that depicts the reacting species used, the intermediates produced,

and the cause-effect chain of chemical parameters in the simulation (Figure 5.22).

• The whole set of the parameter state histories (the parametric values) assigned to

each quantity in the reaction simulation. This is called a piece of “history” (Figure

5.23).

• The atom property table that contains the chemical states possessed by each reacting

unit during simulation (Figure 5.24).

• The whole set of view pairs used in the simulation (Figure 5.25).

As claimed, all the outputs are produced dynamically (on-the-fly). There is no precoded reaction route in the program, unlike in the traditional development approach for chemistry educational programs. Table 5.7 presents a summary of the computer screenshots together with the objectives they serve, and the questionnaires that test the achievement of the objectives. After analyzing all the computer outputs, each student is expected to undergo a mental shift.


Table 5.7: Computer screenshots, objectives and the questionnaires used to test them.

Computer screenshot: Reaction route of a qualitative simulation (Figure 5.20)
Educational objectives:
• Promote conceptual understanding
• Promote ability to articulate various aspects of a reaction
Questionnaires that test the achievement of the objectives:
• “Effectiveness of the explanation of QRiOM” survey form
• From interviews

Computer screenshot: QPT model for organic processes (Figure 5.21)
Educational objectives:
• Promote ability to articulate various aspects of a reaction
Questionnaires that test the achievement of the objectives:
• “Usefulness and Helpfulness of QRiOM” survey form
• From interviews

Computer screenshot: Causal graph (Figure 5.22)
Educational objectives:
• Promote conceptual understanding
• Promote ability to articulate various aspects of a reaction
Questionnaires that test the achievement of the objectives:
• “Effectiveness of the explanation of QRiOM” survey form
• From interviews

Computer screenshots: Parameter state histories (Figure 5.23); Atom property table (Figure 5.24); View pair used in each reaction step (Figure 5.25)
Educational objectives:
• Promote conceptual understanding
Questionnaires that test the achievement of the objectives:
• “Usefulness and Helpfulness of QRiOM” survey form

5.7.1 Reaction Route

In QRiOM, a substrate’s structural change is represented in 2D format, resulting in the

so-called “reaction route” of a simulation. Two examples of reaction routes generated

by QRiOM are depicted in Figure 5.20. When organic reactions are described in this

way, the product of an organic reaction can be readily predicted, without recourse to

memorization. The reaction route gives the step-by-step change of the molecular

structure of an organic substrate. The tool can explain not only the steps it takes during

the reasoning process, but also the reasons for following these steps. This kind of


explanation requires an explicit representation of the domain knowledge, described in

qualitative terms. Inspecting such a 2D representation can promote the conceptual

understanding of students.

(a) Step-by-step change of the molecular structure of an organic substrate (SN1 example)

(b) Step-by-step change of the molecular structure of an organic substrate (SN2 example)

Figure 5.20 Screenshots showing two reaction routes generated by QRiOM at the end of a simulation.


5.7.2 QPT Model

Students typically have problems in describing the chemical parameters needed to solve

the problem. This is due to lack of the necessary chemical intuition, especially on how

to relate the parameters within a situation. When inspecting a model, students have to

articulate relationships between entities and dependencies. This can help improve their

reasoning ability. A screenshot of model inspection page is shown in Figure 5.21.

Testing different substrates in the laboratory can be costly and sometimes unsafe. The simulator can address this problem, as the user can repeat a given reaction any number of times. Furthermore, students have trouble in thinking about the reasons for justifying

the activation of an organic reaction. The tool helps in a way that each constructed

model for a simulation task is presented for student inspection. The main goal of letting

students inspect the qualitative model is that they can articulate ideas behind the design

of the various slots in a QPT model. For example, the “quantity-condition” can be used

to justify why a process would start/stop.

Figure 5.21 A computer generated QPT model.


5.7.3 Causal Graph

Much of the explanation used by QRiOM is achieved by tracing the effect propagation

through ontological modelling constructs of QPT. For example, during each reaction

simulation, a causal graph (Figure 5.22) is generated that shows the use of the

qualitative proportionality statements in the QPT models. Inspecting parameter

dependency and their direction of change can help a learner to pick up the underlying

concepts much better than merely memorizing the reaction steps or formulas. Causal

models help learners to rationalize why a particular process occurred. This can lead to

a deeper understanding of chemical processes. The four domain experts (chemistry

lecturers at University of Tenaga Nasional) have commented that the representation of

causality in the model generated by QRiOM is acceptable and valid.

Figure 5.22 A causal graph generated by QRiOM that enables learners to examine the cause-effect relationships of chemical parameters during reasoning.


5.7.4 Parameter State History and Atom Property Tables

Learners can also browse the behavioural change of the parameters belonging to each reacting unit (Figure 5.23). All the units (nucleophiles and electrophiles) used in an organic reaction simulation are listed in a pull-down list. When a species is selected from the list, the full history of the selected species can be viewed. As an example, under the “Charge History” title, the “[neg] [nil] [nil] [neutral]” can be interpreted as “the initial value of Bromide ion is negative, it is not involved in the first and second reaction steps (hence “nil” is used), the ion is used in the last reaction step and this step will turn its charge to neutral”.
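The reading of such a history can be made mechanical, as in the following Python sketch. The function `describe_history` and the wording of its output are hypothetical; only the convention that the first entry is the initial value and that “nil” marks a step in which the species is not involved is taken from the example above.

```python
def describe_history(species, history):
    """Turn a state history into one readable line per reaction step."""
    initial, *steps = history
    lines = [f"initial value of {species} is {initial}"]
    for number, value in enumerate(steps, start=1):
        if value == "nil":
            lines.append(f"step {number}: not involved")
        else:
            lines.append(f"step {number}: charge becomes {value}")
    return lines


bromide_lines = describe_history("Br-", ["neg", "nil", "nil", "neutral"])
```

For the bromide example, the first line gives the initial negative value, the next two lines mark the steps in which the ion takes no part, and the last line records the change to neutral.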

Figure 5.23 The states of chemical parameter of each reacting species involved in a simulation task can be examined in greater detail.

The values assigned to the chemical parameters during simulation are recorded in

special-purpose data structures for future retrieval. One such structure is the atom

property table (Figure 5.24a). These results can then be used to generate the necessary

reaction route (Figure 5.24b). The structure of the final product can be easily drawn


from Figure 5.24a. For example, when the charge on “C” is positive (A1, Figure

5.24a), then a positive sign is assigned next to the “C” atom (B1, Figure 5.24b).

Likewise, at A2 of Figure 5.24a (under the “After step 3” heading), the “C” regained its

stability and this change is reflected at B2 of Figure 5.24b.

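[Screenshots: (a) the atom property table, with entries marked A1 and A2; (b) the reaction route, with positions marked B1 and B2.]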

Figure 5.24 (a) The chemical states possessed by each reacting unit during simulation are stored in the atom property table (b) A reaction route drawn from using the data values in the atom property table.

When the three outputs: (1) causal graph, (2) parameter state history, and (3) atom

property table are examined together, the students are expected to relate various aspects

in a reaction such that they are able to explain an organic reaction in a more elaborate

way. This will lead to an improvement in one’s overall conceptual understanding of the

subject.

5.7.5 List of Reacting Species (View Pairs)

Since the majority of chemistry students have difficulty identifying the right view
pairs for process activation, the tool also generates the whole set of view pairs

used in the simulation (Figure 5.25) thus informing the learner of the type of functional

units that activate a given process. For instance, “H+” and “CH3CH3CH3COH” are

reacting species (an electrophile and a nucleophile respectively) that activate the “make-

bond” process. The process also generates an intermediate called

“CH3CH3CH3CO+H2” (see “After Step 1” heading). Recall that this is the

intermediate product after the “First reaction step” as given in equation 3.1. This output

is seen as useful to the students (survey results are presented in Chapter 6).

Figure 5.25 The choice of reacting units for each reaction step and the intermediates produced are displayed for further inspection.
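A view pair listing of this kind can be pictured as one record per reaction step. The Python sketch below is a hypothetical representation, not QRiOM's internal structure: the class `ViewPair`, its field names, and the output format are assumptions, while the species names and the “make-bond” process are the ones quoted in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViewPair:
    """One reaction step: the two reacting species and what they produce."""
    step: int
    electrophile: str
    nucleophile: str
    process: str
    intermediate: str


def format_view_pair(vp: ViewPair) -> str:
    """Render a view pair as a single line for display to the learner."""
    return (
        f"Step {vp.step}: {vp.electrophile} (electrophile) + "
        f"{vp.nucleophile} (nucleophile) activate '{vp.process}' "
        f"-> {vp.intermediate}"
    )


first_step = ViewPair(1, "H+", "CH3CH3CH3COH", "make-bond", "CH3CH3CH3CO+H2")
print(format_view_pair(first_step))
```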

5.8 Conclusion

This chapter fulfilled three objectives. First, a qualitative reasoning framework for
organic reaction simulation and explanation has been developed. In particular, the
chapter presented the algorithms that enable model
automation, chemical process reasoning, and the generation of causal graphs. The

design of the internal structures for storing the atoms and molecules, together with their
associated methods, was also discussed. The reasoning framework
can be extended to support other organic mechanisms; the extension relies only on
additional facts being placed in the chemical knowledge base. With such a design, other
learning software can be developed with minimal modification to suit its own
needs or features. Second, the different types and roles of chemical knowledge at
different abstraction levels have been defined. Third, a small chemistry ontology
called OntoRM has been developed for use during simulation. Applying the
OntoRM ontology helps the reasoning engine make correct predictions, a feature
not found in the QALSIC software. In particular, this chapter answered two research
questions: (1) “How can the domain knowledge (represented in QPT) and the OntoRM
ontology be effectively used?” (2) “How can knowledge validation be carried out?”
The first research question was answered by presenting the two-tier architecture for
the knowledge base, with a clear division of functions and roles. Such a division is absent (or
at best unclear) in QALSIC. The second research question was answered with a number
of validation examples. This chapter also described the protocol for interacting with the
QRiOM simulator prototype, that is, how the system is used. The
development of QRiOM is based on the qualitative reasoning framework. The
simulation results presented in this chapter have been verified by chemists, who
commented that the results matched those written in textbooks. The
various forms of output serve as the “explanation” of the chemical reaction or
phenomenon being learned. We shall see in the following chapter that QRiOM can
help chemistry students learn organic reactions through the “explanation” pedagogy
embedded in the software.

Chapter 6 Evaluation of QRiOM

6.1 Introduction

The main objective of QRiOM is to help learners gain a better understanding of the

fundamentals of organic reaction concepts and to improve their reasoning ability by

analyzing the multiple forms of output generated by the software. To test the

achievement of this objective, a preliminary evaluation of QRiOM was conducted upon

its completion. This chapter presents the students’ feedback after using the software
tool. The survey comprised questionnaires, interviews and hands-on sessions with QRiOM. The
survey had three objectives: first, to gauge the students’ receptivity towards using
the tool; second, to find out how effective the explanations
generated by the QR/QPT approach are; and third, to learn whether the
interface design satisfies students from a chemistry background (with “user friendliness” as
the focus). Although this study was not intended to be highly prescriptive, it yielded
useful feedback from the students. Some evidence that QRiOM can improve
learning is also provided.

6.2 The Evaluation Context

QRiOM is neither courseware nor an ITS, but a qualitative simulator that has
embedded intelligence to explain its reasoning by tracing the functional dependency of
parameters governed by the modelling constructs of QPT. We focused on two main
criteria when designing the questionnaires, namely effectiveness and user-friendliness.
The former criterion collects general feedback from the respondents on how
effective the tool is in nurturing their conceptual understanding of the subject,
while the latter collects the students’ general view on whether it is a
user-friendly tool. The evaluation used both qualitative and quantitative
approaches. Murray (1993) stated that qualitative approaches provide information as a
function of personal interaction and perception. The most commonly used data-gathering
techniques in qualitative research are subject-based and include
questionnaires, observations, interviews, and focus groups (which combine elements of both
interviewing and observation). Two types of subject-based evaluation techniques were
used in this work, namely questionnaires and interviews. We also applied a
quantitative approach. Quantitative evaluation is mainly about identifying the
characteristics of a situation or setting (Shute and Regina, 1993). Questionnaires are the
most commonly used technique for data gathering when conducting quantitative
research. For example, we computed the total count of responses for each question in
a survey form (e.g. the total number of responses for “I find qualitative reasoning
easy”). Other techniques used in evaluating educational materials include pre-tests and
post-tests, which were also used in this study.
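Counting responses per question, as described above, amounts to a simple tally. The following Python sketch shows one way to do it, assuming the five-point agreement scale used in the survey forms; the response values are invented purely for illustration and are not the study's data.

```python
from collections import Counter
from typing import Dict, List

# Five-point agreement scale used in the survey forms.
SCALE = {
    1: "strongly disagree",
    2: "disagree",
    3: "neither agree nor disagree",
    4: "agree",
    5: "strongly agree",
}


def tally(responses: List[int]) -> Dict[str, int]:
    """Count how many respondents chose each point on the scale."""
    counts = Counter(responses)
    return {label: counts.get(score, 0) for score, label in SCALE.items()}


# Hypothetical responses to a single survey question (not real data).
example_responses = [4, 4, 5, 3, 4, 2]
print(tally(example_responses))
```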

Background of participants: The respondents included chemistry lecturers, IT lecturers
and undergraduate chemistry students (of different academic standings). They
were invited to serve as evaluators upon completion of QRiOM’s development. Only a
small group of chemistry students enrolled in an introductory chemistry class was
recruited, since QRiOM currently has the status of a prototype. We wanted to find
out whether they could learn and understand better when exposed to the tool and whether they would

like to see other features in the software. The IT lecturers were interviewed to provide

general comments on the system’s GUI design, technical contents, overall functionality

and user-friendliness of the software. The respondent categories are summarized as

follows:

1. Twenty chemistry students

• To collect feedback on the effectiveness of the explanation returned by QRiOM

• To collect feedback on the usefulness and helpfulness of QRiOM

• To provide comments on the GUI design

2. Supervisors (for overall objectives fulfilment)

• Two chemists and one AI expert

• To verify the prediction results produced by the tool

3. University colleagues

• Four other chemists
   - To collect general views and comments for enhancement
   - To see if there is anything seriously lacking in the software

• Three information technologists
   - To provide comments on the GUI design

This chapter only discusses the evaluation results from the chemistry students, as the

tool is intended to assist them in their learning.

6.3 Procedures Used for Conducting the Questionnaires

The evaluation included a lecture on the QPT ontology and a tutorial on QRiOM,
focusing in particular on some common ontological modelling constructs and the notion
of qualitative causal graphs. After the introduction to the modelling language and a
walkthrough of QRiOM, time-limited hands-on sessions began. At the end of each
session, students were given a survey form. Table 6.1 summarizes all the questionnaires
and the educational objectives that each addresses. Figure 6.1 shows the
procedure used in conducting the system evaluation. Sections 6.3.1 to 6.3.6
discuss the procedure for each questionnaire, the survey results with discussion, and
the achievement of the specific educational objective(s).

Table 6.1: Questionnaires and the fulfilment of the respective educational objectives.

Columns: Survey form; Educational objectives to achieve; Survey results; Snapshot and/or component in the framework that supports the fulfilment of the specific objective.

1. “Pre-Questionnaire” survey form and 2. “Post-Questionnaire” survey form
Educational objectives to achieve:
• Improve in conceptual understanding
• There is a positive mental change (high scores were given in post-test evaluation)
Survey results: Figure 6.6
Supporting snapshot and/or framework component:
• The framework component that is responsible for the achievement of this objective is the reasoning engine

3. “Effectiveness of the explanation of QRiOM” survey form
Educational objectives to achieve:
• Improve in conceptual understanding (especially in behavioural and causal aspects of a reaction)
• Able to articulate (can provide longer answers)
Survey results: Figure 6.8
Supporting snapshot and/or framework component:
• Reaction route (Figure 5.20)
• QPT model inspection page (Figure 5.21)
• Causal graph (Figure 5.22)
• Atom property table for the substrate (Figure 5.24)
• The framework component that is responsible for the achievement of this objective is the causal explanation generator

4. “Usefulness and Helpfulness of QRiOM” survey form
Educational objectives to achieve:
• There is a positive mental change (more confidence in attempting new problems)
• Able to articulate
Survey results: Figure 6.10
Supporting snapshot and/or framework component:
• Reaction route (Figure 5.20)
• Causal graph (Figure 5.22)
• Parameter history during simulation (Figure 5.23)
• View pairs choice list (Figure 5.25)
• The entire simulation engine is responsible for this objective

5. “Student understanding towards QPT ontology” survey form
Educational objectives to achieve:
• Students can understand the ontology
• Students can appreciate the ontology as a new knowledge capture tool
Survey results: Figure 6.4
Supporting snapshot and/or framework component: Not applicable

6. “Opinion about applying qualitative reasoning and modelling in chemistry” survey form
Educational objectives to achieve:
• Students find the technique appropriate for chemistry reaction simulation
• Students will explore this new technique of performing chemistry reaction simulation
Supporting snapshot and/or framework component: Not applicable

START

1. Survey forms are given to collect information about students’ knowledge in core areas of organic reactions (this is the Pre-Questionnaire).
2. QPT briefing is delivered, in order to understand some terms used in the tool.
3. Opinions about QPT and the qualitative reasoning approach are sought.
4. The QRiOM problem-solving model is explained to the students, i.e. how to interact with the software tool.
5. Students are given 20 minutes of hands-on use of the tool.
6. Survey forms are distributed to collect students’ opinions about the effectiveness of the explanation facility of QRiOM.
7. Perspectives on the user friendliness of the tool are collected (via an interview).
8. Survey forms are distributed to measure the usefulness and helpfulness of QRiOM.
9. Survey forms for the Pre-Questionnaire are given again to see if a student’s conceptual understanding in the core areas of organic reactions has improved (this is the Post-Questionnaire).

END

Figure 6.1 Flowchart of the QRiOM evaluation exercise.

6.3.1 Students’ Feedback on the Use of QPT and Qualitative Reasoning Approaches

Participants were given a brief introduction to QPT and the qualitative reasoning approach.
They then attempted the questions in Figure 6.2 (see also Figure C.1 in Appendix C) and
answered the questions in the survey form presented in Figure 6.3 (see also
Figure C.2 in Appendix C). Thirty minutes were allocated for this session.

The objective of this survey was to collect the students’ opinions and views on the
use of QPT-based reasoning, in particular on its
suitability as a new means of performing simulation and
prediction in the application domain. Examples of the survey questions are given in
Figure 6.2 and Figure 6.3.

Rating scale: 1 = Strongly Disagree, 2 = Disagree, 3 = Neither Agree Nor Disagree, 4 = Agree, 5 = Strongly Agree

Q1 The identification of quantities (parameters) helped me to establish the functional dependency among them.

Q2 The specification represented using QPT makes it easy to understand the organic processes (reaction steps) that are involved in a chemical reaction simulation.

Q3 The flow of the reasoning is more systematic when a specification that captures the chemical knowledge and intuition is there (like the one given in the attachment, a QPT model).

Q4 I still don’t know how to read the diagram in the attachment even though it is already taught.

Q5 The QPT specification describes almost exactly what I have in mind.

Q6 There are still many concepts implicit in the chemical reaction but I don’t seem to see them in the model.

Figure 6.2 Examples of survey questions used for measuring students’ understanding towards QPT.

Rating scale: 1 = Strongly Disagree, 2 = Disagree, 3 = Neither Agree Nor Disagree, 4 = Agree, 5 = Strongly Agree

Q1 The likelihood that I would read further about qualitative reasoning & modeling is high.

Q2 I don’t like the trouble of going through modeling and simulation before real experiment.

Q3 If I want to explain reaction mechanisms to my friends, this is the type of formal way that I’m looking for.

Figure 6.3 Sample questions in a survey form that collect students’ opinions about qualitative reasoning and modelling approaches.

Survey results and discussion: Following a short lecture (30 minutes) on qualitative
reasoning and modelling using the QPT technique for learning organic reactions,
respondents were asked to indicate their level of agreement on the scale given in the survey form. The
following levels were used: 5 = strongly agree, 4 = agree, 3 =
neither agree nor disagree, 2 = disagree, 1 = strongly disagree. For the survey
questions in Figure 6.2, the average score for Q1 – Q3 is 4 (“agree”). The score
indicates that the students found the modelling constructs of QPT helpful. In particular,
the use of the constructs in a QPT model helps promote a student’s understanding of the
basic behaviour of organic processes. On the other hand, the average score for Q4 – Q6
is 3 (“neither agree nor disagree”). This result suggests that the majority of the
students were still unclear about QPT’s various slots (i.e. not sure how to match their
mental states to the modelling constructs of QPT), since this was the first time the QPT
technique had been introduced to them. For the survey questions in Figure 6.3, the
average score for Q1 is 3 (a neutral response, indicating that they found the AI
approach new and were not sure whether they would explore it further); the
average score for Q2 is 2 (“disagree”, meaning that they do not mind trying out
modelling before a real experiment); and the average score for Q3 is 4 (“agree”,
indicating that the approach is somewhat promising).
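The step from individual ratings to a reported average and its verbal label can be made explicit with a short Python sketch. The rounding rule (nearest scale point, halves rounded up) and the sample ratings are assumptions for illustration; the thesis does not state how averages were rounded, and the numbers below are not the study's data.

```python
from typing import List, Tuple

# Verbal labels of the five-point scale given in the survey form.
LABELS = {
    1: "strongly disagree",
    2: "disagree",
    3: "neither agree nor disagree",
    4: "agree",
    5: "strongly agree",
}


def summarize(scores: List[int]) -> Tuple[float, str]:
    """Return the mean rating and the label of the nearest scale point."""
    if not scores:
        raise ValueError("at least one score is required")
    average = sum(scores) / len(scores)
    nearest = int(average + 0.5)  # assumed rule: round half up
    return average, LABELS[nearest]


# Hypothetical ratings for one question (not real data).
print(summarize([4, 4, 5, 3, 4]))
```

Note that negatively worded items (such as Q4 in Figure 6.2) read differently from positively worded ones, so a label such as “agree” has to be interpreted against the wording of each question.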

Figure 6.4 Students’ responses towards understanding QPT and qualitative reasoning approaches.

When the technology behind the reasoning engine of QRiOM was introduced (i.e. the
concept of qualitative reasoning based on QPT), 13 out of 20 students commented that
the technique is rather difficult to understand. Figure 6.4 likewise suggests that the
chemistry students found the reasoning technique difficult to understand,
although they said the results returned by the software are useful. This is why, when
designing the software, we hid all the complexities behind the GUIs (i.e. so that the
students do not “see” QPT reasoning at the forefront of the learning tool).

6.3.2 Assessment of Students’ Skills in Core Areas of Organic Reactions – The Pre-Questionnaire

The survey continued with the participants answering the Pre-Questionnaire. Ten
minutes were allocated for this session.

This questionnaire assesses student skill and knowledge in core areas of organic
reactions before using the tool. Sample questions are presented in Figure 6.5. This set
of questions was distributed twice to observe the pre- and post-exposure differences: once
before the students were exposed to the tool and once after they had the hands-on
session.

Skill-Set Area (each rated Poor / Fair / Good / Expert)

1. Fundamental principle of organic reactions

2. SN1 and SN2 mechanisms

3. “Make-bond” and “break-bond” processes

4. Parameters dependency in an organic reaction

5. Use of reacting species in “make-bond” and “break-bond” organic processes

6. Classifying structural units as nucleophiles or electrophiles

7. Chemical theories that support organic reactions

8. Use of rules of thumb in predicting final product(s)

Figure 6.5 The survey form for course competency assessment distributed before/after using the simulator.

6.3.3 Assessment of Students’ Skills in Core Areas of Organic Reactions – The Post-Questionnaire

Participants were then briefed on the problem-solving model of QRiOM. After that,
they used the tool and were then asked to rate their competencies in
the technical skills listed in the Post-Questionnaire. Twenty minutes were allocated for
this session.

This survey collects the students’ opinions to observe whether they experienced a
mental change/shift after using the tool, and whether they
can do better in solving new problems. The survey questions used for the
Pre-Questionnaire (Figure 6.5) were used again.

Survey results and discussion: The chemistry students’ responses indicate better
conceptual understanding of the reactions after using the tool. Based on this feedback, they
appear better able to solve new problems as a result of acquiring
skills in knowledge articulation (Figure 6.6).

Figure 6.6 Student pre-test and post-test responses to the core skills.

6.3.4 Assessment of Effectiveness of QRiOM’s Explanation Facility

After answering the “before-and-after” sets of questionnaires, the students were asked to
continue with the questions aimed at assessing the effectiveness of the explanation
facility of QRiOM. Ten minutes were allocated for this session.

This questionnaire solicits the students’ responses to the explanation
generation capability of QRiOM. It enables us to assess whether chemistry students
are able to articulate knowledge after analyzing the various ways of presenting the
results of a simulation. Figure 6.7 gives the survey questions used in this questionnaire.

Rating scale: Not at all / To a limited extent / To a moderate extent / To a great extent

Knowledge aspects

1. The conditions to start/stop a chemical process

2. The proper identification of nucleophile and electrophile to activate a chemical process

3. Cause-effect propagation among chemical parameters

4. Behavioural change of a substrate (in terms of its charge, lone pair changes)

5. The production of an intermediate: the why and how?

6. Fundamental concepts of SN1 and SN2

7. Fundamental concepts of “make-bond” and “break-bond” processes

Figure 6.7 Questions in the survey form for the measure of explanation-based learning in skills reinforcement.

Survey results and discussion: Figure 6.8 shows the overall results for each rating
of the items presented in Figure 6.7. In particular, 80% (16 out of 20) of the respondents felt
that their knowledge of the two aspects described in question 3 and question 4 had
improved to a great extent. That is, students found analyzing the
reaction route and the cause-effect relationships helpful in learning how an organic process takes place
and the overall changes undergone by the organic substrate. They had never thought
of using a causal graph or even the reaction route to express the overall behavioural
change of substrates. This suggests that the tool has potential in helping
students understand organic chemical reactions.

Figure 6.8 Students’ feedback on the extent to which the tool improves one’s knowledge in terms of skill reinforcement through explanation-based learning.

6.3.5 Assessment of the Usefulness and Helpfulness of QRiOM

After answering the questionnaire measuring the “effectiveness of explanation facility”,
the respondents were asked to continue with the questions aimed at assessing the
usefulness and helpfulness of the tool. Ten minutes were allocated for this session.

The objective of this questionnaire is to determine whether the tool is useful (e.g. students are
more confident in answering new questions) and helpful (e.g. the materials presented
motivate the student to learn). Sample questions are given in Figure 6.9.

Please respond to the following statements about your learning experience:

Rating scale: Strongly agree / Agree / Neither agree nor disagree / Disagree

1. I gained more confidence after using the simulator prototype (QRiOM) – for usefulness test

2. The reaction route and the use of pairs of nucleophile and electrophile helped me to understand better the essential reacting species used in the entire reaction – for usefulness test

3. The chemical properties modelled as qualitative proportionality helped me to understand basic concepts of a reaction – for usefulness test

4. General chemical behaviour and chemical knowledge represented in the QPT process model allowed me to acquire essential background knowledge before going to see the output and explanation – for helpfulness test (motivated?)

5. The cause-effect demonstration in tabulated form encouraged me to think more critically towards the problem task at hand – for helpfulness test

6. My comment to this statement:

“kalau saya dengar, saya lupa (If I hear, I forget) (≅ merely attending lectures)
kalau saya lihat, saya ingat (If I see, I remember) (≅ just doing experiments in lab)
kalau saya buat, saya tahu (If I do, I understand) (≅ simulation hands-on using QRiOM)” – for helpfulness test

Figure 6.9 Examples of the survey questions for the measure of usefulness and helpfulness of QRiOM in a student’s learning endeavour.

Survey results and discussion: Figure 6.10 shows that the majority of the students agreed
that the tool is useful (in terms of the confidence gained) in their learning process.
Helpfulness, on the other hand, covers the value of the materials presented as well as the
ease with which a user can operate the application. Students gave the tool a very high score
on this aspect. They found the tool helpful because it motivated them to
learn, for several reasons:

• They are allowed to choose different combinations of <substrate, reagent> pair.

• They can repeatedly run the same reaction.

• The tool offers a certain degree of interactivity.

• The tool provides adequate coaching.

In addition, when interviewed, 40% of the students
(8 out of 20) felt that they had undergone a change of reasoning, as the explanation
provided by the software reveals the chemical intuition needed to solve organic
reaction problems.

Figure 6.10 Students’ feedback on the helpfulness (motivation) and usefulness (gained confidence) of QRiOM.

6.3.6 Comments on Graphical User Interface Design

An interview was conducted after the students had been exposed to the tool. They
were asked to comment on several Graphical User Interface (GUI) design aspects,
such as clarity of interface (80%), interface consistency (70%), and meaning of
commands (60%); the percentages in brackets indicate the satisfaction level. Attitudes
towards using the software were also measured, including several affective
components, for example, “I like the tool” or “I dislike the tool”. The results showed
that students with a positive attitude outnumbered those with a negative attitude (14 out of
20). Most of them were very pleased to have had the chance to use the tool (16 out of
20), and the majority of the respondents felt that there was too much emphasis on the QPT
terms (which they described as difficult). They suggested that more lectures should be given to them
(17 out of 20). The responses collected are encouraging given that this was the
first exposure of the chemistry students to using and evaluating the computer-based learning
tool. We will take the students’ comments into consideration to
improve the software tool. One such effort is to convert most of the QPT terms
into student-friendly terms, so that they do not become a barrier.

6.4 Conclusion

The initial evaluation of QRiOM and its explanation facility has been carried out.
The evaluation results support the hypothesis that qualitative simulator tools can be
valuable aids for improving conceptual understanding of the basic principles of organic
chemistry processes. In general, chemistry problems presented in textbooks can be
difficult for students to understand because the diagrams and figures are in static form.
The educational benefits offered by QRiOM include the ability to take learners into
environments otherwise inaccessible through conventional face-to-face teaching and the
ability to create a dynamic and interactive environment for learning. A set of logically
related research questions has also been answered:

• Will the tool help improve a student’s conceptual understanding of the subject?
The answer is “yes”. Results are shown in Figure 6.6.

• Is the explanation generated by QRiOM effective in enhancing students’
understanding of the subject? From the feedback, it is effective. Results are shown
in Figure 6.8.

• How useful is the software in helping students to promote their
understanding and articulation of chemistry concepts in relation to an organic
reaction? The answer is positive, as most students said the software is useful and helpful.
Results are shown in Figure 6.10.

• Will the chemistry students undergo a mental change so that they are able to explain
chemical phenomena in a more elaborate way? The answer is that there was a change
in the students’ reasoning when interacting with the software, such that a given
organic reaction can be described better by the chemistry students.

QALSIC was never evaluated with actual student responses; the assessment
of QRiOM therefore contributes to the literature on QR-related systems.

Chapter 7 Conclusion

7.1 Thesis Summary

There are three main objectives of this thesis. The first and foremost is to develop
a conceptual framework of qualitative reasoning that is powerful enough to perform
organic reaction simulation and to reproduce the behaviour of selected organic
mechanisms. This objective has been accomplished. The second objective is to apply the
framework to the task of providing qualitative explanations for observed chemical
phenomena using the QPT formalism. This too has been accomplished. The third is to
implement a qualitative simulator that performs behaviour prediction and
explanation of help to chemistry students. The results of the evaluation indicate that the
simulator has achieved its objectives.

Overall, this thesis presents the work on the design of a qualitative reasoning framework

for the simulation of SN1 and SN2 mechanisms in organic reactions. The development

of a simulator prototype, QRiOM, which simulates organic chemical reactions for
learning purposes, has also been described. QRiOM can predict and explain the

formation of the final products, given an initial situation comprising an organic

substrate and a reagent. In particular, a principled approach for automating model

construction has been proposed. The OntoRM ontology has also been developed to

validate the use of chemical knowledge during simulation. The ontology can be

extended to cater for more descriptions of reaction mechanisms.

The fundamental assumption behind this research is that modelling and simulation
techniques based on the QPT ontology and qualitative reasoning provide an
effective approach that is capable of explaining phenomena in organic reactions in a
natural way (much like the way a human expert would explain them). The evaluation indicates that
the explanation generated by the software tool can help improve a student’s
understanding of organic reactions and mechanisms.

The simulator is similar in concept to some existing systems based on qualitative
reasoning. However, the software is supported by a two-tier knowledge base, namely
the OntoRM reaction mechanism ontology (used purely as a validation tool) and a
chemical knowledge base that stores the essential domain knowledge (the partial
knowledge needed by the QR approach), rather than the single layer reported in the
reviewed literature. All of the functional components presented in Chapter 5 have been
implemented in software using Java and Prolog.

Conventional teaching and learning of the subject faces several limitations, and
other approaches to program development have problems of their own. These
limitations and problems are summarized below:

1. Most students, when interviewed, said that they learn the subject by memorizing the
reaction steps and the entire chemical equations.

2. In classrooms, students are taught how to use arrows to move electrons around in
order to predict the outcome of a reaction. The overall change made in a reaction is
difficult to visualize at once (on the whiteboard).

3. In the conventional approach, reaction prediction is performed by finding a route
through searching the entire state space (the precoded routes in the knowledge
base). Consequently, the software cannot handle new problems that are not coded
in the program, and a large amount of storage is needed since all possible reaction routes
need to be stored in the KB for searching and retrieval.

4. Traditional chemistry educational software is inadequate in promoting understanding
or explaining its results because the traditional method does not link
“reasoning” to problems.

The approach used in this work is able to overcome the limitations of conventional
teaching methods. This work addressed all four of the aforesaid problems by providing
techniques, algorithms, a prototype, and the evaluation results of the prototype.

7.2 Results and Contributions

No previous work has been reported on solving organic reaction problems using
a qualitative reasoning approach. In this work, qualitative simulation based on QPT
models is used as a means of providing explanations to chemistry students. Prior to this
work, the domain had never been tested with QPT-based reasoning. This thesis started
with a critical review of the QPT ontology and then used it to represent chemical
knowledge qualitatively in order to model the behaviour of organic reactions and
mechanisms. Qualitative reasoning algorithms for the simulation of numerous organic
chemical reactions involving different organic substrates were then designed. All the
components in the reasoning framework have been implemented in QRiOM. QRiOM is
viewed as useful and effective by chemistry learners, consistent with the reported
improvement in students’ conceptual understanding. QRiOM is also the first chemistry
education software that can generate multiple forms of textual explanation (in order to
justify a simulated result). The explanation follows almost isomorphically from the
QPT model reasoning.

The contributions of this thesis concern different areas of research related to qualitative

modelling, model automation, qualitative simulation and prediction, interactive

explanation and chemistry educational software, each of which is discussed in detail in

the following subsections (Section 7.2.1 – Section 7.2.3).

7.2.1 Conceptual Framework Development

A new technique for modelling the behaviour of organic reactions has been explored.

The new technique referred to here is qualitative modelling of domain knowledge using

QPT ontology. The approach used to predict the outcome of “A + B” (A reacts with B) is

to perform qualitative reasoning based on the QPT models constructed for the two

generic processes. In this work, the formation of the final product is explained by the

“mechanism” that is used to accomplish the prediction task. The suggestion of a

suitable chemical process is determined by recognizing the nucleophilic and

electrophilic centres of the <substrate, reagent> input pair, while a predicted outcome is

explained by means of the specific mechanism suggested by the reasoning engine. An

organic mechanism used for the reaction consists of the series of organic processes

used in the conversion of the reactants to the final product, with emphasis on the state

changes of the chemical parameters that occurred.

The reasoning framework embodies a number of functional components for simulating a wide

range of organic processes and mechanisms. The components in the

framework include (1) Substrate Recognizer (for checking the user selection and returning

the “type” of the input as either a nucleophile or an electrophile), (2) Model Constructor

(for automating the construction of QPT models based on the identity of user input), (3)

Reasoning Engine (for actual simulation), (4) Causal Model Generator (for constructing


causal graphs to produce accounts of behaviour), (5) Explanation Generator (for

generating explanation to justify a simulated result), (6) Molecule Update Routine (for

keeping track of the structural change of the substrate, from one organic reaction to

another), (7) Knowledge Validation Routine (for ensuring that the correct chemical

data are used), (8) OntoRM ontology (for defining chemical knowledge related to

reaction mechanism), and (9) Chemical Knowledge Base (for storing information such

as chemical facts and theories).

7.2.1.1 QPT as the Knowledge Capture Tool

QPT was chosen because the modelling constructs of this particular QR ontology provide

good grounds for describing processes in conceptual terms with notions of causality

which can be used for explaining the behaviour of chemical systems. The ontology also

allows representation of chemical process elements at the finest level of granularity.

Problem characteristics and the behaviour of organic reactions and mechanisms were

sought and studied. Dialogues with chemists were conducted to explore the possibility of

representing the required knowledge in qualitative terms using QPT. Collecting

intuitive and causal aspects of chemists’ mental models helped us in designing the

cognitive steps used in the reasoning algorithms. These types of knowledge enabled us

to establish the functional dependencies among chemical parameters during a reaction using the

modelling constructs of QPT (which also support cause-effect propagation via

direct and indirect influences).
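
As a minimal sketch of how cause-effect propagation through direct and indirect influences might be encoded, the following Python fragment carries the sign of change from one active process to the quantities that depend on it. The quantity names, the two influence tables, and the function `propagate` are illustrative assumptions and are not taken from QRiOM.

```python
from collections import deque

# Direct influences: an active process drives a quantity up (+1) or down (-1).
# Quantity names are hypothetical and chosen only for illustration.
DIRECT_INFLUENCES = {
    "make-bond": [("lone_pair_electrons(O)", -1), ("bond_count(C)", +1)],
}

# Indirect influences (qualitative proportionalities):
# (dependent, sign, independent) means the dependent quantity changes in the
# same direction as the independent one (sign +1) or the opposite one (sign -1).
PROPORTIONALITIES = [
    ("charge(O)", -1, "lone_pair_electrons(O)"),
    ("charge(C)", -1, "bond_count(C)"),
]


def propagate(active_process):
    """Return {quantity: sign of change} implied by one active process."""
    trend = {}
    queue = deque()
    # Step 1: apply the direct influences of the active process.
    for quantity, sign in DIRECT_INFLUENCES[active_process]:
        trend[quantity] = sign
        queue.append(quantity)
    # Step 2: follow the proportionalities breadth-first.
    while queue:
        source = queue.popleft()
        for dependent, sign, independent in PROPORTIONALITIES:
            if independent == source and dependent not in trend:
                trend[dependent] = sign * trend[source]
                queue.append(dependent)
    return trend
```

Under these assumed tables, activating "make-bond" marks the lone-pair count on O as decreasing and, through the proportionality, the charge on O as increasing.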


7.2.1.2 Model Automation

We classified chemical processes for a variety of substrates into a few organic

processes. Chemical processes needed in the simulation were identified as “make-bond”

and “break-bond”. This simple classification scheme of two processes enables the

assembly of general chemical behaviour and chemical theories needed to support model

automation. Multiple chemical equations under SN1 and SN2 mechanisms were studied

in order to collect their general behaviour. A set of rules has been formulated that

specifies how the chemical theories of organic processes can be represented using the

modelling constructs of QPT. When the logical steps were obtained, the algorithm that

enables model automation was developed. The generalization of such behaviour helps

promote model reusability. The model automation algorithm provided in this thesis

helps to partly solve the knowledge acquisition bottleneck.

In the course of developing the conceptual framework, we produced the following

results:

1. Reusable chemical processes (hence the models) were identified. The reusable

processes are “make-bond” and “break-bond”.

2. Model automation logic was formulated. Automating the construction of QPT

models is made possible by first identifying the type of the reacting species, then the

chemical process that can take place. Last, the common set of chemical theories

represented in QPT is retrieved for composing the process model.

3. The mental attributes of human chemists when solving organic reaction problems

were represented using the modelling constructs of QPT.
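
The three automation steps described in item 2 (identify the type of each reacting species, select the chemical process, retrieve the stored theories to compose the model) can be sketched roughly as follows. This is a deliberately simplified illustration: the species table, the type labels, the theory strings, and the names `ProcessModel`, `select_process` and `construct_model` are all assumptions and do not reproduce the thesis's Model Constructor or its chemical knowledge base.

```python
from dataclasses import dataclass, field

# Toy knowledge base. All entries are illustrative assumptions.
SPECIES_TYPE = {
    "(CH3)3C+": "electrophile",
    "H2O": "nucleophile",
    "(CH3)3C-Br": "substrate_with_leaving_group",
}

# Chemical theories are stored once per generic process and reused.
PROCESS_THEORIES = {
    "make-bond": [
        "the nucleophile donates an electron pair",
        "the electrophile accepts the electron pair",
    ],
    "break-bond": [
        "the leaving group departs with the bonding electron pair",
    ],
}


@dataclass
class ProcessModel:
    process: str
    individuals: list
    theories: list = field(default_factory=list)


def select_process(types):
    """Step 2: choose the generic process from the set of species types."""
    if "nucleophile" in types and "electrophile" in types:
        return "make-bond"
    if "substrate_with_leaving_group" in types:
        return "break-bond"
    raise ValueError(f"no process applies to types {sorted(types)}")


def construct_model(species):
    """Compose a process model for the given reacting species."""
    unknown = [s for s in species if s not in SPECIES_TYPE]
    if unknown:
        raise ValueError(f"unknown species: {unknown}")
    types = {SPECIES_TYPE[s] for s in species}        # step 1: identify types
    process = select_process(types)                   # step 2: pick process
    theories = list(PROCESS_THEORIES[process])        # step 3: retrieve theories
    return ProcessModel(process, list(species), theories)
```

Because the theories are keyed by process rather than by reaction, adding a new species in this sketch only requires a new entry in the species table.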


7.2.2 QRiOM – A Tool for Explaining Organic Reactions

The simulator prototype, QRiOM, integrates all of the components stated in Section

7.2.1, and it produces acceptable behaviour prediction for all the reactions stated in

Chapter 1 (course scope section). The simulation results and the textual and

diagrammatic explanations produced by QRiOM are not precoded; they are generated

via causal model tracing and interpretation. The simulation algorithms generate

explanations that follow the structure of the QPT-based reasoning, so no

complicated explanation technique is required. The work produced better explanations

than LHASA (which uses a collection of equations and presents very complex molecular

structures) and QALSIC (which provides no explanation), in that its presentation is more

natural and less technical. The prototype is able to handle new cases

since only general chemical principles of organic reactions are stored and not the

specific reaction routes that produce the final product. Furthermore, since reaction

routes are not precoded, the entire program takes up very little space.

Traditional chemistry learning software generates results without proper explanation.

QRiOM returns the following textual and diagrammatic outputs and explanations at the

end of a simulation task: (1) The final products of a reaction, (2) Suggested organic

mechanism used to predict the product, (3) The qualitative models used in predicting

the behaviour of a chemical reaction, (4) A causal graph that depicts the reacting

species used, the intermediates produced, and the cause-effect chain of chemical

parameters in the simulation, (5) The set of parameter values assigned to each

chemical quantity in the reaction simulation, (6) The entire reaction route of a

qualitative simulation, and (7) The updated list of reacting species (electrophiles,

nucleophiles and new intermediates) before and after each chemical process. Most of


these outputs are generated by two special purpose modules in the QRiOM simulator,

namely the Quantity Space Analyzer (QSA) and Molecule Update Routine (MUR).

Main results obtained from the QRiOM simulator prototype are:

1. The tool is able to make correct predictions for a large set of reactants, with no

precoded answers in the knowledge base. We have verified that new cases, such as new

reactants, can be handled without making any change to the chemical

KB. The main reason for this achievement is that only chemical principles and

chemical theories are stored in the KB, not the precoded reaction routes.

2. Interactive explanation is achieved by tracing and interpreting causal models that

are created during qualitative simulation.
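
To illustrate what tracing and interpreting a causal model could look like, the sketch below walks a small cause-effect graph depth-first and turns each edge into a sentence. The graph contents, the sentence template, and the function `explain` are hypothetical; the thesis's Causal Model Generator and Explanation Generator are not reproduced here.

```python
# Causal graph as an adjacency mapping: cause -> list of effects.
# The entries are illustrative assumptions for a single make-bond step.
CAUSAL_EDGES = {
    "donation of a lone pair by O": ["formation of the O-C bond"],
    "formation of the O-C bond": [
        "an increase in the charge on O",
        "a decrease in the charge on C",
    ],
}


def explain(cause, edges, depth=0, seen=None):
    """Trace the graph from `cause` and return indented explanation lines."""
    seen = set() if seen is None else seen
    lines = []
    for effect in edges.get(cause, []):
        if (cause, effect) in seen:
            continue  # guard against cycles in the graph
        seen.add((cause, effect))
        sentence = f"{cause[0].upper()}{cause[1:]} leads to {effect}."
        lines.append("  " * depth + sentence)
        # Each effect may itself be the cause of further changes.
        lines.extend(explain(effect, edges, depth + 1, seen))
    return lines


if __name__ == "__main__":
    for line in explain("donation of a lone pair by O", CAUSAL_EDGES):
        print(line)
```

Because the sentences are derived from the edges of the graph, nothing in the explanation text is stored in advance, which mirrors the point made above about explanations not being precoded.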

7.2.3 Evaluation Results of QRiOM

QRiOM helps nurture the conceptual understanding of a learner, especially in

the understanding of the behavioural and causal aspects of organic reactions. An

evaluation study has been carried out with twenty first-year undergraduate students.

The results of the evaluation suggest that QRiOM is effective in terms of its ability to

promote understanding in learning organic processes through the inspection of the

explanation generated by the tool. The setup included paper-based pre- and

post-assessments of the participants’ skills in a few core areas of the subject. A questionnaire

was used to gauge the participants’ responses about QRiOM in terms of the helpfulness

and usefulness of the explanation generated by the software. Results from interviews

showed that, after using the software, students were able to explain a chemical

phenomenon in a more elaborate way (i.e. providing a longer answer, as they were

more confident in solving the problem). The mental change of the chemistry students who


participated in the QRiOM evaluation was surveyed. Use of the tool can help a student

discover his/her own mental change, such as realizing or knowing his/her own reasoning

ability. The achievement of these learning objectives is due to the “explanation”

pedagogy embedded in QRiOM, which helps chemistry students learn organic

reactions through the study of functional dependencies of parameters and the causality

chain. As far as the application of the QPT-based reasoning is concerned, we achieved

positive results that met the educational objectives stated in Chapter 1. Major

accomplishments of applying the QR approach in solving the reaction simulation

problem for learning purposes are summarized in Figure 7.1.

Main results obtained from the evaluation of QRiOM are:

1. The tool has been evaluated in terms of its usefulness, helpfulness and effectiveness

in explaining chemical phenomena related to organic reactions. Overall, the results

are promising, i.e. the tool generally enhanced student knowledge.

2. QRiOM is viewed as useful and helpful, and most of the students underwent

mental change when exposed to the software.

3. The tool helps nurture the conceptual understanding of the learners, especially their

knowledge of the behavioural and causal aspects of organic processes.

4. The majority of the respondents agreed that the tool gave a good background on

solving organic reaction problems.


[Figure 7.1 labels: “Learning organic reaction using qualitative reasoning approach”; “Enhance reasoning ability”; “Improve understanding”; “Promote one’s learning via the “explanation” pedagogy”; “Overcome traditional program development problem”; “Explanation follows isomorphically from the underlying QPT reasoning”; “Cause-effect relationships can be examined”.]

Figure 7.1 Accomplishment of the QR approach when implemented in a tool for learning organic reactions.

Apart from the abovementioned achievements, we have also made the following

contributions:

• OntoRM – a scheme for organizing and structuring chemical knowledge for reaction

mechanisms has been developed. The effective use of the chemical knowledge base

is achieved by applying OntoRM during modelling and simulation stages.

• Knowledge validation was carried out to avoid wrong simulation steps in a reaction

through the use of OntoRM. As a result, more accurate and reliable predictions can

be obtained.

• As an essential part of the work, a scheme for hierarchical structuring of processes to

facilitate the effective use of knowledge was also developed. Such a scheme is absent from (or

at best unclear in) QALSIC.

• An analysis of the application of the QR approach in inorganic versus organic reaction

simulation was carried out. The main finding is that organic chemistry reactions are


relatively easier to model using QR approaches than inorganic

chemistry reactions.

• The QALSIC program was investigated and tested with numerous inorganic

reactions. We have identified the main reason why the software returns incorrect

predictions for a large number of inorganic reactions. The software has been tested

with a mixture of inorganic reaction experiments and the conclusion is that those

invalid outputs are due to the nature of the problem domain rather than to a

limitation in the underlying QPT reasoning. Simply put, inorganic chemical

reactions are more difficult to generalize than organic chemical reactions.

7.3 Limitations

Since this is the first attempt at testing the qualitative reasoning approach for solving

organic reaction problems and the tool is still in its infancy, there are several limitations

in this work. The limitations are summarized as follows:

• Our knowledge base contains only aliphatic compounds and a small set of reagents.

• 3D animation is not included.

• User modelling is not included.

• Users can only view the models but are not involved in building the model, i.e. the

entire modelling phase is without user participation.

7.4 Future Works

Several key challenges remain, such as expanding the chemistry process to include

more types of organic mechanisms and further investigating the SMILES format for

representing each organic compound as a line notation in the knowledge base to achieve


greater portability and reusability of the system components. The work will be

continued in a number of directions. These include generating 3D animated multimedia

output (currently, outputs are in plain 2D format) and the development of a protocol

converter to handle communication between the reasoning shell and the 3D output. A problem

ontology that handles user queries, much like the one presented in Pah et al. (2007), is

also a direction of our future work. The main purpose of having the problem ontology

is to deal more specifically and accurately with questions that may be asked by the

learners. QRiOM can also be improved by adding pedagogical elements (such as the

different learning styles) in the “technogogical” three-dimensional (technology, content

and the pedagogy) learning environment as proposed by Idrus (2008). Work is under way to extend

QRiOM into full-version learning software by embedding a user modelling module

and an assessment system. So far, the prototype does not support much

student-initiated exploration. A long-term research direction has also been charted,

which is to build a graphical language that links the qualitative simulator and a

graphic package. One aspect of QR research that, as far as we can see, has

not been addressed is building a graphical language between the reasoning engine (based on

some QR ontology) and the graphic package itself (e.g. Model Science software). Many

simulators can only generate textual explanation but not graphical animation.

Bredeweg’s VisiGarp is a visual representation of the qualitative processes, which is not

what we discuss here. If such a graphical language existed, it would become the protocol

through which these two worlds communicate: the reasoning engine and graphic packages. This may

help to push more QR-based systems into the commercial world.


7.5 Concluding Remarks

The qualitative reasoning approach based on qualitative process theory described in this

work had not previously been applied to the fields of organic chemistry and reaction

mechanisms. The reasoning framework (and the simulator prototype) is able to

generate outcomes similar to those produced by chemists. Evaluation results showed

that embedding a qualitative reasoning approach in educational software is useful for

nurturing a student’s various learning skills. To conclude, this work combines qualitative

reasoning and ontologies in a problem-solving system, and generates explanations for

learners from the problem-solving system. After developing and testing the prototype,

we anticipate a fully usable system that can assist chemistry students not only in

understanding the subject, but also in building simple models as a means

to acquire knowledge. This research provides a good foundation for future work on

applying the qualitative reasoning approach to other subfields of the organic chemistry

course.


References

Advanced Reasoning Group Homepage. Available at: http://www.aber.ac.uk/compsci/Research/mbsg/ (Retrieved on 27 February 2005) Amzi! Prolog Home Page. Available at: http://www.amzi.com (Retrieved on 3 Jan 2006) Angele, J., Moench, E., Oppermann, H., Staab, S. and Wenke, D. (2003). Ontology-

based query and answering in chemistry: OntoNova @ Project Halo. In Fensel, D. et

al. (Eds.), Lecture Notes on Computer Science. Springer-Verlag Berlin Heidelberg. Vol. 2870, pp. 913–928. Atkins, R.C. and Carey, F.A. (1977). Organic Chemistry: A Brief Course, 6th Edition, McGraw-Hill. Bailey-Kellogg, C. and Zhao, F. (2003). Qualitative Spatial Reasoning: Extracting and Reasoning with Spatial Aggregates. AI Magazine, 24:47–60. Bailey-Kellogg, C., Ramakrishnan, N. and Marathe, M. (2006). Spatial Data Mining to Support Pandemic Preparedness, ACM SIGKDD Explorations, 8:80-82. Bessa Machado, V. and Bredeweg, B. (2002). Investigating the Model Building Process

with HOMER. Proceedings of the International workshop at Intelligent Tutoring Systems, San Sebastian, Spain. Bessa Machado, V. and Bredeweg, B. (2003). Building Qualitative Models with

HOMER: A Study in Usability and Support. Proceedings of the 17th International Workshop on Qualitative Reasoning, Brasilia, Brazil, pp. 39-46. Bessa Machado, V. (2004). Supporting the Construction of Qualitative Models. PhD Thesis, University of Amsterdam, Amsterdam, The Netherlands. Bessa Machado, V., Groen, R., and Bredeweg, B. (2005). Towards Support in Building

Qualitative Knowledge Models. Proceedings of the 12th International Conference on Artificial Intelligence in Education: Supporting Learning through Intelligent and Socially Informed Technology, IOS Press, Amsterdam, The Netherlands, pp. 395-402. Biswas, G., Schwartz, D., Bransford, J., and The Teachable Agents Group at Vanderbilt (2001). Technology Support for Complex Problem Solving: From SAD Environments to AI. In Forbus, K.D. and Feltovich, P. (Eds.), Smart Machines in Education. Menlo Park Calif.: AAAI Press, pp. 71-97. Bobrow, D.G. (1985). Qualitative Reasoning about Physical Systems. Cambridge, Massachusetts: The MIT Press. Bouwer, A. and Bredeweg, B. (2001). VisiGarp: Graphical Representation of Qualitative Simulation Models. In Moore, J.D. Luckhardt Redfield, G. and Johnson, J.L. (Eds.), Artificial Intelligence in Education: AI-ED in the Wired and Wireless

Future, IOS Press/Ohmsha, Japan, Osaka, pp. 294–305.


Bouwer, A. and Bredeweg, B. (2002). Aggregation of Qualitative Simulations for

Explanation. Proceedings of the International Workshop at Intelligent Tutoring Systems, June, San Sebastian, Spain. Bouwer, A. and Bredeweg, B. (2005). Generating Structured Explanations of System Behaviour Using Qualitative Simulations. In Looi, C.K. McCalla, G., Bredeweg, B. and Breuker, B. (Eds.), Artificial Intelligence in Education: Supporting Learning through

Intelligent and Socially Informed Technology, IOS press, Amsterdam, pp. 756–758. Bouwer, A. (2005). Explaining Behaviour: Using Qualitative Simulation in Interactive

Learning Environments. PhD Thesis, University of Amsterdam, Amsterdam, The Netherlands. Bratko, I. and Šuc, D. (2002). Qualitative Explanation of Controllers. Proceedings of the 16th International Workshop on Qualitative Reasoning, Barcelona, Spain, pp. 1-2. Bratko, I. and Šuc, D. (2003a). Learning Qualitative Models. AI Magazine, 24(4): 107–119. Bratko, I. and Šuc, D. (2003b). Qualitative Data Mining and its Applications. CIT

Journal. Computing and Info. Technology, Vol. 11, No. 3, pp. 145–150. Bredeweg, B. (1992). Expertise in Qualitative Prediction of Behaviour. PhD thesis, University of Amsterdam, Amsterdam, The Netherlands. Bredeweg, B. and Winkels, R. (1998). Qualitative Models in Interactive Learning Environments: An Introduction. Interactive Learning Environments, 5(1-2): 1–18. Bredeweg, B. and Forbus, K.D. (2003). Qualitative Modelling in Education. AI

Magazine 24(4): 35–46. Bredeweg, B. and Struss, P. (2003). Current Topics in Qualitative Reasoning. AI

Magazine (special issue), Vol. 24, No. 4, pp. 13–130. Bredeweg, B., Salles, P. and Neumann, M. (2006). Ecological Applications of Qualitative Reasoning. In Recknagel, F. (Ed.), Ecological Informatics, Scope,

Techniques and Applications, 2nd Edition, Springer, Berlin, pp. 15-47. Bredeweg, B., Bouwer, A., Jellema, J., Bertels, D., Linnebank, F. and Liem, J. (2007). Garp3–A New Workbench for Qualitative Reasoning and Modelling. Proceedings of the 4th International Conference on Knowledge Capture (K-CAP), Whistler, BC, Canada, pp. 183-184. Brown, J.S., Burton, R.R. and Zdybel, F. (1973). A Model-Driven Question Answering System for Mixed-initiative Computer Assisted Instruction. IEEE Transactions on

Systems, Man and Cybernetics, SMC-3. Brown, J.S. and Burton, D. (1982). Pedagogical, natural language and knowledge engineering techniques in SOPHIE I, II and III. In Sleeman, D. and Brown, J.S. (Eds.), Intelligent Tutoring Systems, Academic Press. Cartwright, H.M. (1993). Application of Artificial intelligence in Chemistry, New York: Oxford University Press Inc., Oxford Science Publications.


Chi, M.T.H., Feltovich, P.J., and Glaser, R. (1981). Characterization and Representation of Physics Problems by Experts and Novices. Cognitive Science, 5, 121–152. Daylight Chemical Information Systems Inc. Home Page. Available at: http://www.daylight.com/smiles/ (Retrieved on 23 April 2006) Dehghani, M. and Forbus, K.D. (2009). QCM: A QP-based Concept Map System. Proceedings of the 23rd International Workshop on Qualitative Reasoning, Ljubljana, Slovenia, pp. 16-21. de Koning, K. and Bredeweg, B. (1994). A Framework for Teaching Qualitative

Models. In Cohn, A.G. (Ed), Proceedings of the 11th European Conference on Artificial Intelligence, John Wiley & Sons, pp. 197-202. de Kleer, J. and Brown, J.S. (1984). Qualitative Physics Based on Confluences. Artificial Intelligence Journal, 24: 7–83. de Kleer, J. and Brown, J.S. (1992). Model-based diagnosis in SOPHIE III. In Hamscher, W., de Kleer, J. and Console, L. (Eds.), Readings in Model-Based

Diagnosis. Morgan Kaufmann. de Koning, K. and Bredeweg, B. (1998). Qualitative Reasoning in Tutoring Interactions. Interactive Learning Environments, Vol. 5, Number 1-2, pp. 65–80. de Koning, K., Bredeweg, B., Breuker, J. and Wielinga, B. (2000). Model-based Reasoning about Learner Behaviour. Artificial Intelligence, 117:173–229. Dolan, M.E. and Blake, J.A. (2009). Using Ontology Visualization to Facilitate Access to Knowledge about Human Disease Genes. Applied Ontology, Vol. 4, Number 1, IOS Press, pp. 35–49. Dolata, D.P. (1998). Artificial Intelligence in Chemistry. In Schleyer, P. v.R., Allinger, N.L., Clark, T., Gasteiger, J., Kollman, P.A., Schaefer III, H.F. and Schreiner, P.R. (Eds.). Encyclopedia of Computational Chemistry, Volume 1. John Wiley & Sons. (p. 44-63) D’Souza, A., Rickel, J., Herreros, B. and Johnson, W. L. (2001). An Automated Lab

Instructor for Simulated Science Experiments. In Moore, J.D., Luckhardt Redfield, G. and Johnson, J.L. (Eds.), Artificial Intelligence in Education: AI-ED in the Wired and Wireless Future, Amsterdam, The IOS-Press, Netherlands, pp. 65–76. Engel T. and Gasteiger, J. (2002). Chemical Structure Representation for Information

Exchange. Online information review, Vol. 26, No. 3, MCB UP Limited, pp.139-145. Falkenhainer, B. and Forbus, K. (1991). Compositional Modelling: Finding the Right Model for the Job. Artificial Intelligence, 51(1-3), 95–143. Fessenden, R.J. and Fessenden, J.S. (1998). Organic Chemistry, 6th Edition, Brooks/Cole Publishing Company, ITP. Forbus, K.D. (1984). Qualitative Process Theory. Artificial Intelligence, 24: 85–168.


Forbus, K.D. (1993). Self Explanatory Simulators: Making Computers Partners in the

Modelling Process. In Carreté, N.P. and Singh, M.G. (Eds.), Qualitative Reasoning and Decision Technologies. Barcelona: CIMNE, pp. 3-13. Forbus, K.D. and Whalley, P.B. (1994). Using Qualitative Physics to build Articulate

Software for Thermodynamics Education. Proceedings of the 12th National Conference on Artificial Intelligence, pp. 1175-1182. Forbus, K.D. (1996a). Self-explanatory Simulators for Middle-school Science

Education: A Progress Report. In Farquhar, A. and Iwasaki, Y. (Eds.), Proceedings of the International Workshop on Qualitative Reasoning, Stanford, California, pp. 52-56. Forbus, K.D. (1996b). Qualitative Reasoning. In Tucker, A.B. (Ed.), The Computer

Science and Engineering Handbook, 715-733. Boca Raton, Fla.:CRC. Forbus, K.D. (1997). Using Qualitative Physics to Create Articulate Educational Software. IEEE Expert, May/June, 32–41. Forbus, K.D. and Kuehne, S.E. (1998). RoboTA: An Agent Colony Architecture for

Supporting Education. Proceedings of the 2nd International Conference on Autonomous Agents. Minneapolis/St. Paul, MN, pp. 455-456. Forbus, K.D., Everett, J., Ureel, L., Brokowski, M., Baher, J., and Kuehne, S. (1998). Distributed Coaching for an Intelligent Learning Environment. In Proceedings of International Workshop on Qualitative Reasoning, Cape Cod. Forbus, K.D., Whalley, P., Everett, J., Ureel, L., Brokowski, M., Baher, J. and Kuehne, S. (1999). CyclePad: An Articulate Virtual Laboratory for Engineering Thermodynamics. Artificial Intelligence Journal, 114: 297–347. Forbus, K.D. (2001). Articulate Software for Science and Engineering Education. In Forbus, K.D., Feltovich, P. and Canas, A. (Eds.), Smart Machines in Education: The Coming Revolution in Educational Technology, AAAI Press. Forbus, K.D., Carney, K., Harris, R. and Sherin, B.L. (2001). A Qualitative Modelling

Environment for Middle-school Students: A Progress Report. In Biswas, G. (Ed.), The 15th International Workshop on Qualitative Reasoning, St. Mary’s University, San Antonio, Texas, pp. 65-72. Forbus, K.D., Carney, K., Sherin, B. and Ureel, L. (2004a). VModel: A Visual

Qualitative Modelling Environment for Middle-school Students. Proceedings of the 16th Innovative Applications of Artificial Intelligence Conference, San Jose, California. Forbus, K.D., Carney, K., Sherin, B. and Ureel, L. (2004b). Qualitative Modelling for

Middle-school Students. Proceedings of the 18th International Qualitative Reasoning Workshop, Evanston, Illinois, August. Forbus, K.D. and Gentner, D. (2009). Dark Knowledge in Qualitative Reasoning: A

Call to Arms. Proceedings of the 23rd International Workshop on Qualitative Reasoning, Ljubljana, Slovenia.


Frederiksen, J.R. and White, B.Y. (2002). Conceptualizing and Constructing Linked

Models: Creating Coherence in Complex Knowledge Systems. In Brna, P. and Baker, M., Stenning, K. and Tiberghein, A. (Eds.), The Role of Communication in Learning to Model, Mahwah, N.J.: Lawrence Erlbaum, 69-96. Gentner, D. and Stevens, A. (Eds.) (1983). Mental Models, IEA Associates. Goddijn, F., Bouwer, A. and Bredeweg, B. (2003). Automatically Generating Tutoring

Questions for Qualitative Simulations. Proceedings of the 17th International Workshop on Qualitative Reasoning, Brasilia, Brazil, pp. 87-94. Goldberg, D.E. (1998). Fundamental of Chemistry, 2nd Edition, McGraw-Hill. Groutas, W.C. (2000). Organic Reaction Mechanisms – Selected Problems and

Solutions, New York: John Wiley & Sons, Inc. Guerrin, F. (1991). Qualitative Reasoning about an Ecological Process: Interpretation in Hydroecology. Ecological Modelling, 59:165–201. Guerrin, F. (1992). Model-based Interpretation of Measurement, Analysis and Observations of an Ecological Process. AI Applications, 6(3):89–101. Hinrichs, T.R., Nichols, N.D. and Forbus, K.D. (2006). Using Qualitative Reasoning in

Learning Strategy Games: A Preliminary Report. Proceedings of the 20th International Workshop on Qualitative Reasoning, Hanover, NH, USA, pp. 91-96. Hirashima, T., Horiguchi, T., Kashihara, A. and Toyoda, J. (1998). Error-based Simulation for Error-visualization and its Management, International Journal of

Artificial Intelligence in Education, Vol. 9, pp. 17–31. Hirashima, T. and Horiguchi, T. (2001). Evaluation of Error-Based Simulation by

Using Qualitative Reasoning Techniques. Proceedings of the 20th International Workshop on Qualitative Reasoning, pp. 128-133. Horiguchi, T. and Hirashima, T. (2009). Intelligent Authoring of “Graph of

Microworlds” for Adaptive Learning with Microworlds. Proceedings of the 23rd International Workshop on Qualitative Reasoning, Slovenia, pp. 49-53.

Horiguchi, T., Hirashima, T. and Okamoto, M. (2005). Conceptual Changes in

Learning Mechanics by Error-based Simulation. In Looi, C.K., Jonassen, D. and Ikeda, M. (Eds.), Proceedings of the International Conference on Computers in Education (ICCE), Singapore, pp. 138-145. Horiguchi, T. and Hirashima, T. (2005). Graph of Microworlds: A Framework for

Assisting Progressive Knowledge Acquisition in Simulation-based Learning

Environments. Proceedings of the 11th International Conference on Artificial Intelligence in Education, pp. 670-677. Horiguchi, T. and Hirashima, T. (2006). Robust Simulator, a Method of Simulating

Learners' Erroneous Equations for Making Error-Based Simulation. Proceedings of the Intelligent Tutoring Systems, pp. 655-665.


Horiguchi, T., Imai, I., Toumoto, T. and Hirashima, T. (2007). A Classroom Practice of

Error-Based Simulation as Counterexample to Students' Misunderstanding of

Mechanics. Proceedings of the International Conference on Computers in Education (ICCE), Taiwan, pp. 519-526. Horiguchi, T. and Hirashima, T. (2008). Domain-Independent Error-Based Simulation

for Error-Awareness and Its Preliminary Evaluation. Proceedings of the PRICAI Conference, pp. 951-958. Hsu, S.H., Krishnamurthy, B., Rao, P., Zhao, C., Jagannathan, S., Caruthers, J. and Venkatasubramanian, V. (2006). A Systematic Approach for Automated Reaction

Network Generation. In Marquardt, W. and Pantelides, C. (Eds.), Proceedings of the 16th European Symposium on Computer Aided Process Engineering and the 9th International Symposium on Process Systems Engineering. Elsevier B.V., pp. 973–978. Idrus, R.M. (2008). Transforming Engineering Learning via Technogogy. Proceedings of the 5th WSEAS/IASME International Conference on Engineering Education, Heraklion, Greece, pp. 33-38. Iwasaki, Y. and Simon, H. (1986). Causality in Device Behaviour, Artificial

Intelligence, Vol. 29, No. 1. Iwasaki, Y. (1997). Real-world Applications of Qualitative Reasoning. IEEE Expert, pp. 16–21. Kifer, M. and Lozinskii, E. (1986). A Framework for an Efficient Implementation of

Deductive Databases. Proceedings of the 6th Advanced Database Symposium, Tokyo, pp. 109–116. Kuipers B.J. (1986). Qualitative Simulation. Artificial Intelligence Journal, 29: 289– 338. Kuipers, B.J. (1993). Qualitative Simulation: Then and Now. Artificial Intelligence 59: 133–140. Kuipers, B.J. (1994). Qualitative Reasoning – Modelling and Simulation with

Incomplete Knowledge, MIT Press, Cambridge, Massachusetts.

Kumar, A.N. (2002). Model-Based Reasoning for Domain Modelling in a Web-Based

Intelligent Tutoring System to Help Students Learn to Debug C++ Programs. Proceedings of the 6th International Conference on Intelligent Tutoring Systems, Biarritz, France, pp. 792-801. Laraba, M.E.H. (2006). A Dialogical Agent-based Framework for Explaining the

Results of QSIM Algorithm to the End-User. Proceedings of the 20th International Workshop on Qualitative Reasoning, Hannover, USA, pp. 169-174. Laraba, M.E.H and Brezillon, P. (2009). Using Contextual Graphs for Supporting

Qualitative Simulation Explanation. Proceedings of the 23rd International Workshop on Qualitative Reasoning, Slovenia, pp. 62-67.


Leelawong, K., Wang, Y., Biswas, G., Vye, N., Bransford, J. and Schwartz, D. (2001). Qualitative Reasoning Techniques to Support Learning by Teaching: The Teachable Agents Project. Proceedings of the 15th International Workshop on Qualitative Reasoning, San Antonio, Texas, AAAI Press.

Leelawong, K., Viswanath, K., Davis, J., Biswas, G., Vye, N.J., Belynne, K. and Bransford, J.B. (2003). Teachable Agents: Learning by Teaching Environments for Science Domain. Proceedings of the 15th Innovative Applications of Artificial Intelligence Conference, Acapulco, Mexico, AAAI Press.

LHASA group home page. Available at: http://derek.harvard.edu/ (Retrieved on 13 September 2005)

Mavrovouniotis, M.L. and Forsythe Jr, R.G. (1998). Object-oriented Programming. In Schleyer, P. v.R., Allinger, N.L., Clark, T., Gasteiger, J., Kollman, P.A., Schaefer III, H.F. and Schreiner, P.R. (Eds.). Encyclopedia of Computational Chemistry, Volume 3. John Wiley & Sons. (p. 1948-1960)

Murov, S. and Stedjee, B. (1997). Experiments in Basic Chemistry, 4th Edition, New York: John Wiley & Sons, Inc., 111–122.

Murray, T. (1993). Formative Qualitative Evaluation for “Exploratory”. Journal of Artificial Intelligence in Education, 4(2), 179–207.

Neuper, W.A. (2001). Reactive User-Guidance by an Autonomous Engine Doing High-School Math. PhD Thesis, TU Graz, IICM – Software Technology.

Neuper, W.A. and Wotawa, F. (2002). Model-based Reasoning in Mathematical Tutoring Systems – Preliminary Report. Proceedings of the 6th International Workshop on Intelligent Tutoring Systems, San Sebastian, Spain.

Pah, I., Maniu, I., Maniu, G. and Damian, S. (2007). A Conceptual Framework based on Ontologies for Knowledge Management in E-learning Systems. Proceedings of the 6th WSEAS International Conference on Education and Educational Technology, Italy, pp. 283–286.

Pang, J.S., Syed Mustapha, S.M.F.D. and Zain, S.M. (2001). Preliminary Studies on Embedding Qualitative Reasoning into Qualitative Analysis and Simulation in the Laboratory. Proceedings of the Pacific Asian Conference on Intelligent Systems, Seoul, Korea, pp. 230–236.

Patrick, G.L. (1997a). Beginning Organic Chemistry I, New York: Oxford University Press.

Patrick, G.L. (1997b). Beginning Organic Chemistry 2, New York: Oxford University Press.

Patrick, G.L. (2000). Organic Chemistry, New York: BIOS Scientific Publishers Limited.

Peller, J.R. (1997). Exploring Chemistry, Prentice Hall, pp. 71-82.


Peller, J.R. (2003). Exploring Chemistry Laboratory Experiments in General, Organic and Biological Chemistry, 2nd Edition, Prentice Hall.

Ricardo, M.P. de Alcântara., Germana M. da Nóbrega., Salles, P. (2006). Proceedings of the 20th International Workshop on Qualitative Reasoning, Hanover, NH, USA, pp. 37-46.

Rose, J.R. (1998). Machine Learning Techniques in Chemistry. In Schleyer, P. v.R., Allinger, N.L., Clark, T., Gasteiger, J., Kollman, P.A., Schaefer III, H.F. and Schreiner, P.R. (Eds.). Encyclopedia of Computational Chemistry, Volume 3. John Wiley & Sons. (p. 1521-1525)

Salles, P.S.B.A., Pain, H. and Muetzelfeldt, R.I. (1996). Qualitative Ecological Models for Tutoring Systems: A Comparative Study, AAAI Technical Report WS-96-01.

Salles, P. and Bredeweg, B. (1997). Building Qualitative Models in Ecology. Proceedings of the 11th International Workshop on Qualitative Reasoning, Cortona, Italy, pp. 155-164.

Salles, P., Bredeweg, B. and Winkels, R. (1997). Deriving Explanations from Qualitative Models. In du Boulay, B. and Mizoguchi, R. (Eds.), Artificial Intelligence in Education: Knowledge and Media in Learning Systems, IOS-Press/Ohmsha, Japan, Osaka, pp. 474-481.

Salles, P. and Bredeweg, B. (2001). Constructing Progressive Learning Routes through Qualitative Simulation Models in Ecology. Proceedings of the International Workshop on Qualitative Reasoning, San Antonio, Texas, pp. 82-89.

Salles, P. and Bredeweg, B. (2002). A Case Study of Collaborative Modelling: Building Qualitative Models in Ecology. Proceedings of the International Workshop on Model-based Systems and Qualitative Reasoning for Intelligent Tutoring Systems, San Sebastian, Spain, pp. 75-84.

Salles, P. and Bredeweg, B. (2003). Qualitative Reasoning about Population and Community Ecology. AI Magazine, Volume 24, Number 4, pp. 77-90.

Salles, P., Bredeweg, B. and Araujo, S. (2003). Qualitative Models of Stream Ecosystem Recovery: Exploratory Studies. Proceedings of the 17th International Workshop on Qualitative Reasoning, Brasilia, Brazil, pp. 155-162.

Salles, P. and Bredeweg, B. (2006). Modelling Population and Community Dynamics with Qualitative Reasoning. Ecological Modelling, Volume 195, Issues 1-2, pp. 114-128.

Salles, P., Bredeweg, B. and Araujo, S. (2006). Qualitative Models about Stream Ecosystem Recovery: Exploratory Studies. Ecological Modelling, Volume 194, Issues 1-3, pp. 80-89.

Shute, V.J. and Regina, J.W. (1993). Principles for Evaluating Intelligent Tutoring Systems. Journal of Artificial Intelligence in Education, 4(3), 245–271.


Sime, J.A. and Leitch, R.R. (1992). Multiple Models in Intelligent Training. Proceedings of the First International Conference on Intelligent Systems Engineering (ISE '92), Edinburgh, Scotland, 19-21.

Sime, J.A. (1995). Supporting the use of Qualitative Models in Intelligent Training Systems. Proceedings of the AIED’95 Workshop, Amsterdam, The Netherlands.

Sime, J.A. (1996). An Investigation into Teaching and Assessment of Qualitative Knowledge in Engineering. Proceedings of the European Conference on Artificial Intelligence in Education, Lisbon, Portugal.

Sime, J.A. (1998). Model Switching in a Learning Environment Based on Multiple Models. Interactive Learning Environments, 5(1-2): 109–124.

Sime, J.A. (2002). Learning with Qualitative Models and Cognitive Support Tool: The Learners’ Experiences. Proceedings of the International Workshop on Model-based Systems and Qualitative Reasoning for Intelligent Tutoring Systems, San Sebastian, Spain, pp. 85-95.

Struss, P. and Price, C. (2004). Model-Based Systems in the Automotive Industry. AI Magazine, 24(4): 17–34.

Syed Mustapha, S.M.F.D., Pang, J.S. and Zain, S.M. (2002). Application of Qualitative Process Theory to Qualitative Simulation and Analysis of Inorganic Chemical Reaction. Proceedings of the Sixteenth International Workshop of Qualitative Reasoning, Barcelona, Spain.

Syed Mustapha, S.M.F.D., Pang, J.S. and Zain, S.M. (2005). QALSIC: Towards Building an Articulate Educational Software using Qualitative Process Theory Approach in Inorganic Chemistry for High School Level, International Journal of Artificial Intelligence in Education, 15(3): 229–257.

Tang, A.Y.C. and Syed Mustapha, S.M.F.D. (2006). Representing SN1 Reaction Mechanism using the Qualitative Process Theory. In Bailey-Kellogg, C. and Kuipers, B. (Eds.), Proceedings of the 20th International Workshop on Qualitative Reasoning, Hanover, USA, pp. 137-147.

Toppano, E. (2002). MMforTED: A Cognitive Tool Fostering the Acquisition of Conceptual Knowledge about Artifacts. Proceedings of the International Workshop on Model-based Systems and Qualitative Reasoning for Intelligent Tutoring Systems, San Sebastian, Spain.

Turban, E. (1999). Expert Systems and Applied Artificial Intelligence, Macmillan, pp. 119.

Ubiquity, Volume 4, Issue 45, January 13-19, 2004, Making Sense of Common Sense Knowledge. Available at: http://www.acm.org/ubiquity/interviews/v4i45_kuipers.html (Retrieved on 29 November 2005)

Ureel, L. and Carney, K. (2003). Design of Computational Supports for Students in Visual Modelling Tasks. In Wasson B., Baggetun, R., Hoppe, U. and Ludvigsen, S. (Eds.) International Conference on Computer Support for Collaborative Learning, CSCL 2003, Community events, Communication and Interaction. Bergen, Norway, University of Bergen Press, pp. 98-100.

Vadillo, J.A. and Diaz de Ilarraza, A. (1995). Domain Representation in Intelligent Tutor Systems for Training using Causal Models. Proceedings of the AIED’95 Workshop.

Valley, K. (1992). Explanation in Expert System Shells: A Tool for Exploration and Learning. In Frasson, C., Gauthier, G. and McCalla, G.I. (Eds.), Proceedings of the 2nd International Conference on Intelligent Tutoring Systems, Montreal, Canada, pp. 601-614.

Weld, D.S. (1988). Comparative Analysis, Artificial Intelligence, 36, 333–374.

Weld, D.S. and de Kleer, J.H. (Eds.) (1990). Readings in Qualitative Reasoning about Physical Systems. Morgan Kaufmann.

White, B. and Frederiksen, J.R. (1990). Causal Model Progressions as a Foundation for Intelligent Learning Environments. Artificial Intelligence 42(1): 99-157.

Yang, S.Y. (2007). An Ontological Template-supported Interface Agent for FAQ Service. Proceedings of the 6th WSEAS International Conference on Applied Computer Science, Hangzhou, China, pp. 98-103.

Zupan, J. (1998). Neural Networks in Chemistry. In Schleyer, P. v.R., Allinger, N.L., Clark, T., Gasteiger, J., Kollman, P.A., Schaefer III, H.F. and Schreiner, P.R. (Eds.). Encyclopedia of Computational Chemistry, Volume 3. John Wiley & Sons. (p. 1813-1827)


Appendix A

A Summary of Systems Related to Qualitative Reasoning


Table A.1: Examples of educational software employing QR approaches.

Columns: Application Area / Name of System; Researchers / Authors; Brief Description.

System: QSIM
Researchers: Benjamin Kuipers

• QSIM is the first qualitative simulation program; developed by Kuipers. The ontology used is constraint-based (Kuipers, 1986).

• The approach starts from a set of constraints abstracted from a differential equation. Kuipers proved that the QSIM algorithm is guaranteed to produce a qualitative behaviour corresponding to any solution of the original equation.

• His work also showed that any qualitative simulation algorithm will sometimes produce spurious qualitative behaviours: ones which do not correspond to any mechanism satisfying the given constraints.

• These observations suggest specific types of care that must be taken in designing applications of qualitative causal reasoning systems, and in constructing and validating a knowledge base of mechanism descriptions (Kuipers, 1993).

System: Intelligent Tutoring Environment for Industrial Training Operators (ITTs)
Researchers: J.A. Vadillo & Díaz de Ilarraza

• The work by Vadillo & de Ilarraza (1995) describes an extension of the INTZA system, a tutoring environment for industrial training operators.

• The work concentrates on the potential use of qualitative models for generating explanations that help learners master a domain. The paradigms applied are component-based qualitative simulation (de Kleer and Brown, 1984) and Qualitative Process Theory (Forbus, 1984).

• In order to provide good behavioural explanations for simulations in INTZA, they extended the domain models with a qualitative causal viewpoint. The causal model is obtained by applying causal ordering (Iwasaki and Simon, 1986) to the set of differential equations that describes the system.

• The work concluded that qualitative causal models are useful for generating explanations in ITTs.

System: MS-PRODS / CPRODS for Learning Complex Physical System; Intelligent Training System
Researchers: Julie-Ann Sime

• Sime (1995) presents the rationale behind the design of a simulation-based learning environment, the Model Switching PROcessing Demonstration System (MS-PRODS).

• MS-PRODS is a learner-centred learning environment based on multiple qualitative models, intended to promote better understanding of a process. Three qualitative and three quantitative simulations of the behaviour of the physical system have been implemented using the ITSIE tools.

• Emphasis is on the use of different domain models, both quantitative and qualitative to achieve understanding of a process.

• The work introduced seven dimensions to classify the different domain models and a mechanism to progress through the models, based on these dimensions. The system could use several strategies for model progression.

• In CPRODS (Sime, 1998), six qualitative and quantitative models were used. A trainee can solve problems or observe the expert demonstrate problem solving using multiple models, switching between them as and when necessary.

• One of its strengths is that instructional design is provided. However, there is no reflection on the learning process.

• The design of the learning environment stresses how the models are to be used to promote learning, in particular examining the model switching mechanism. This mechanism determines how to select and sequence the presentation of models to the learner during guided explorations of the domain.


• Sime claims that an intelligent training system requires not just the development of tools and techniques for modelling and simulation but also some guidance on how to use qualitative models in learning environments.

• The work uses Cognitive Flexibility Theory, which assumes that learning is greatly facilitated by guided, non-linear, multi-dimensional explorations of the content domain.

System: VModel; RoboTA
Researchers: Kenneth D. Forbus et al.

• VModel software (Forbus, Carney, Harris & Sherin, 2001) was developed for middle school students, with the aim of developing their modelling skills.

• VModel uses a visual representation of modelling conventions similar to concept maps.

• Ureel & Carney (2003) presented a design of computational supports for students in visual modelling tasks. In this work, a visual representation language was developed for use with middle-school students, because predicate-calculus-based formalisms are a barrier to use by children.

• Their approach was to provide students with a software-based conceptual modelling environment that supports the articulation and qualitative simulation of their knowledge of physical systems.

• To guide students through the modelling process they have implemented coaching supports in response to student questions.

• Many graphical external representations have been created to aid students in articulating their understandings of phenomena. They can be grouped into three families, namely concept map notation, dynamical systems notations (STELLA, Model-It), and argumentation environments. The vocabulary for causal maps is drawn from QPT.

• It was reported that it is extremely difficult to create software that detects whether or not arguments and models are well-formed.

• RoboTA (Forbus et al., 2004a; Forbus et al., 2004b) is an architecture for a colony of distributed agents aimed at supporting instructional tasks.

• RoboTA is in use for providing assistance and feedback to the users of CyclePad (Forbus et al., 1999). The built-in email facility allows users of CyclePad to send in their design and get feedback from a design coach (the CyclePad Guru) that runs as a RoboTA agent.

• A RoboTA colony has two kinds of agents: A central server process (the PostOffice) and course or application-specific agents (TA agents).

System: CyclePad for Science and Engineering Education
Researchers: Kenneth D. Forbus et al.

• CyclePad is an Articulate Virtual Laboratory (AVL) for learning engineering thermodynamics by design. It is intended to scaffold the design task, relieving the student of the burden of the design.

• CyclePad is a new kind of software called “articulate software” (Forbus, 1997). Articulate software is meant to be fluent, supportive, generative, and customizable.

• The domain knowledge is represented using techniques from qualitative physics (Forbus, 1984) and compositional modelling (Falkenhainer and Forbus, 1991).

• Its domain theory includes: Physical and conceptual entities, structural knowledge, qualitative knowledge, quantitative knowledge, modelling assumptions, assumption classes, and economic model (Forbus and Whalley, 1994; Forbus et al., 1998; Forbus et al., 1999; Forbus, 2001).


System: Qualitative Models in Ecology
Researchers: Paulo Salles, Helen Pain & Robert I. Muetzelfeldt

• Salles et al. (1996) explored different approaches to modelling qualitatively the vegetation dynamics of the Brazilian cerrado, in order to assess their suitability for providing the domain-specific knowledge in tutoring systems. The ultimate goal is to predict and explain the behaviour of ecological systems in qualitative terms.

• Two formalisms, the System of Interpretation of Measurements, Analysis and Observations (SIMAO) and the Qualitative Process Theory (QPT), are compared. The comparison aspects are: (1) capacity for making predictions about the behaviour of a plant population, and (2) the generation of explanations from encoded knowledge.

• Both SIMAO and QPT-based models can produce predictions similar to those obtained with a numerical model of the same problem. SIMAO provides a useful qualitative algebra for calculations with heterogeneous variables. However, the SIMAO-based model can neither incorporate descriptions of the ecological components nor run dynamic simulations.

• On the other hand, QPT allows the encoding of qualitative knowledge and building more detailed models, but does not provide a qualitative algebra for combining empirical values of variables.

• They also discussed the role of different organisational levels and scales of space and time in explaining the behaviour of ecological systems. A combined approach was said to be advantageous in building tutoring systems.

System: Ecological Modelling
Researchers: Paulo Salles, Bert Bredeweg, S. Araújo & M. Neumann

• In the work reported by Salles and Bredeweg (1997), GARP was used as the qualitative simulation tool for modelling cerrado vegetation and community. Later, they went on to build ecological models of nutrient cycling, stream ecosystem recovery, and community ecology.

• Initially, they hand-coded qualitative models for running in GARP. Then, Salles & Bredeweg (2001) investigated how to decompose a large qualitative simulation into a progressive sequence of smaller simulations, useful for teaching purposes, in the domain of ecology. The work discusses progressive learning routes through large qualitative simulation models of ecological systems using ideas on model dimensions from Causal Model Progression (CMP), the Genetic Graph (GG), and the Didactic Goal Generator (DGG).

• The group regards modelling as a learning activity and an important educational activity (Salles and Bredeweg, 2002; Salles and Bredeweg, 2003; Salles and Bredeweg, 2006; Salles et al., 2003; Salles et al., 2006).

System: Qualitative Reasoning in Interactive Learning Environments
Researchers: Paulo Salles, Bert Bredeweg & Radboud Winkels

The work presented insights into the following aspects:

• Finding the minimum set of model fragments needed to simulate the behaviour of a system and to answer a specific question adequately.

• Generating specific trajectories of possible behaviours rather than returning a full prediction of all possible behaviours of a system.

• Providing the system with fault models that reflect common misconceptions.


System: Qualitative Reasoning in Tutoring Interactions
Researchers: Kees de Koning, Bert Bredeweg & D.S. Weld

• The research done by de Koning and Bredeweg (1994, 1998) presented an experimental study that examines to what extent existing qualitative reasoning representations and techniques are sufficient for modelling the interaction between a student and a teacher when discussing the (qualitative) behaviour of physical devices.

• They investigated the usefulness of these models in actual teaching situations.

• Qualitative models are claimed to be beneficial for teaching systems.

• They claimed that QR should be viewed not only as a means but also as a goal, and that the knowledge representations used in qualitative reasoning are largely adequate, whereas the reasoning techniques need adaptation for teaching.

• They also highlighted that two important aspects were missing from early intelligent tutoring systems: (1) a representation of causality, and (2) a representation of physical structure (topological structure). Both are important for explanation.

System: Tutoring System Dealing with Physics
Researchers: Kees de Koning et al.

• The main results and findings, described in de Koning and Bredeweg (1994, 1998) and de Koning et al. (2000), are as follows:

• A framework is presented that defines a key role for qualitative models as interactive simulations of the subject matter. The framework focuses on automating the diagnosis of learner behaviour. Automated handling of tutoring and training functions in educational systems requires the availability of articulate domain models.

• They showed how a qualitative simulation model of the subject matter can be reformulated to fit the requirements of general diagnostic engines such as GDE.

• A set of procedures is presented that automatically maps detailed simulation models into a hierarchy of aggregated models by hiding non-essential details and chunking chains of causal dependencies. The result is a highly structured subject matter model that enables the diagnosis of learner behaviour by means of an adapted version of the GDE algorithm.

• An experiment was conducted that shows the viability of the approach: given the output of a qualitative simulator, the procedures they developed automatically generate a structured subject matter model and subsequently use this model to diagnose learner behaviour successfully.

System: GARP
Researchers: Bert Bredeweg

• GARP is a domain-independent qualitative reasoning engine implemented in SWI-Prolog, developed as the PhD work of Bert Bredeweg (Bredeweg, 1992).

• In GARP, models are built using a text editor, and the interface is also text-based. The output of GARP consists of statements in predicate logic format, which are not easy for novices to understand.

• The reasoning engine follows a compositional approach and implements the characteristics of QR technology described by de Kleer & Brown (1984) and Forbus (1984).

System: HOMER
Researchers: Vania Bessa Machado & Bert Bredeweg

• HOMER (Bessa Machado and Bredeweg, 2002; Bessa Machado and Bredeweg, 2003) provides a graphically oriented model-building environment for GARP.

• In HOMER, the task of building a qualitative model is to create a set of model fragments (stored in a library) and specify one or more scenarios. A scenario refers to a structural description of the system.

• Models constructed with HOMER can be exported as a set of files which can be used as input for GARP.

• When the simulator is called, it uses the model fragments to predict the behaviour of the system defined in the selected scenario.

• There is a support module that can guide the users through the model building process.

System: QUAGS
Researchers: F. Goddijn, Anders Bouwer & Bert Bredeweg

• QUAGS stands for Questions about GARP Simulations.

• QUAGS software generates questions based on simulations produced by GARP (Goddijn et al., 2003).

System: VisiGarp; WiziGarp
Researchers: Anders Bouwer & Bert Bredeweg

• VisiGarp (Bouwer, 2005; Bouwer and Bredeweg, 2005; Bouwer and Bredeweg, 2002; Bouwer and Bredeweg, 2001) implemented a graphical interface to GARP.

• The system allows users to inspect qualitative simulation models by interacting with automatically generated visualizations.

• The work investigates how explanations of dynamic phenomena can be generated using qualitative simulations and how these can be utilized in a domain independent interactive learning environment.

• The potential of aggregation principles to reduce the complexity of qualitative simulations has been explored.

• A prototype interactive learning environment has been implemented, called WiziGarp, which incorporates the aggregation mechanisms and expands the communicative functions of VisiGarp (Bouwer, 2005).

System: TCME; EBS
Researchers: Tsukasa Hirashima & Tomoya Horiguchi

• Hirashima & Horiguchi (2009) proposed a method for semi-automating the description of a graph of microworlds using the compositional modelling mechanism.

• The Tiny Compositional Modelling Engine (TCME) generates the model of a given situation. The models are then used in a simulation-based learning environment. The aim of this research is to support adaptive learning with microworlds.

• In another work, Error-Based Simulation (EBS) is used as a method to visualize an erroneous equation in a mechanics problem (in physics). The approach was evaluated using QSIM (Kuipers, 1994) and DQ-analysis (Weld, 1988). First, the EBS-manager predicts the qualitative behaviour of the EBS by using qualitative simulation and compares it to a normal simulation. When a qualitative difference is found, the EBS-manager judges that the EBS is effective for error visualization. The EBS-manager also uses comparative analysis to find parameters whose perturbation causes qualitative differences between the EBS and a normal situation. After deriving the sequence of qualitative states from an erroneous equation by QSIM, the EBS-manager uses DQ-analysis to derive the corresponding sequence of qualitative directions under perturbation of a parameter.

System: The Teachable Agents; Betty’s Brain
Researchers: K. Leelawong et al.

• The Teachable Agents project at Vanderbilt University (Biswas et al., 2001) showed a good example of how qualitative modelling can be useful for students.

• Their Betty’s Brain system uses qualitative representations expressed in concept maps to foster learning.

• The work extended intelligent learning environments with teachable agents to enhance learning. Their qualitative modelling framework uses qualitative mathematics, with tables for composing discrete values to provide qualitative simulation.

• The task they use is to “teach” Betty (their software) by building concept maps so that Betty can produce explanations. This system turned out to be intensely motivating for students (Leelawong et al., 2001; Leelawong et al., 2003; Biswas et al., 2001).

System: ALI (Automated Lab Instructor)
Researchers: Aaron A. D’Souza et al.

• ALI (D’Souza et al., 2001) is a tool that uses qualitative representations to coach students while they interact with a virtual laboratory. ALI is based on Qualitative Process Theory (Forbus, 1984) and uses visual representations of direct and indirect influences.

• Course authors provide ALI with the qualitative knowledge relevant to a specific quantitative model. When the quantitative model is simulated, ALI automatically infers the applicable causal dependencies and uses them to interact with the learner, both in terms of asking questions and showing graphics. A pilot study suggests that ALI does provide important guidance during discovery learning.

• ALI has a representation of the key relationships in the simulation model that the student should learn and it uses this knowledge to interleave its teaching opportunistically with the student's own discovery learning. Specifically, it can recognize learning opportunities in a student's experiments, test the student's resulting understanding, and gently guide the student towards these learning opportunities when necessary. ALI is claimed to be domain independent so that it can be attached to any quantitative simulation.

System: SOPHIE Project
Researchers: J.S. Brown, D. Burton & Johan de Kleer

• SOPHIE (de Kleer and Brown, 1992; Brown and Burton, 1982) was a large project centred on three different SOPHIE systems (I, II and III).

• SOPHIE focused on troubleshooting a DC power supply. It can be seen as a pioneering landmark in having computers communicate knowledge about system behaviours to students.

• SOPHIE I and SOPHIE II included numerical simulation capabilities to create an artificial lab, or reactive learning environment, in which a learner can perform experiments safely and easily and receive informed feedback, in the domain of troubleshooting electronic circuits.

• With SOPHIE III, its designers incorporated a qualitative simulator, in an attempt to move towards more humanlike reasoning and explanation capabilities. A remarkable feature of the SOPHIE systems was the robust natural language interface, which could handle a broad range of queries from a user.

System: A Framework for High School Level Mathematics
Researchers: Walther Neuper & Franz Wotawa

• Neuper (2001) and Neuper and Wotawa (2002) presented a framework for knowledge handling based on Model-Based Reasoning (MBR).

• The work constructs a mathematical model from a textual description. The basic element used in the framework is a “mathematical concept”, whose semantics is based on first-order logic.

• However, not all mathematical examples can be expressed within the framework.

• Moreover, no guidance is provided in the modelling phase, and explanations can be generated only in the modelling phase.
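The branching behaviour noted in the QSIM entry of Table A.1 can be illustrated with qualitative sign arithmetic. The following is a minimal Python sketch and is not taken from QSIM or QRiOM; the function name `q_add` is hypothetical. It shows that when two opposing qualitative influences are added, the result cannot be resolved from signs alone, which is one commonly cited reason why a qualitative simulator must consider several candidate behaviours, some of which may turn out to be spurious.

```python
def q_add(a, b):
    """Return the set of possible signs of a + b, given only the signs of a and b.

    Signs are the strings "+", "0" and "-" (hypothetical encoding).
    """
    if a == "0":
        return {b}
    if b == "0":
        return {a}
    if a == b:
        return {a}
    # Opposite signs: magnitudes are unknown, so every outcome remains possible.
    return {"+", "0", "-"}


print(q_add("+", "+"))  # a single, unambiguous sign
print(q_add("+", "-"))  # three candidate signs, so the simulation has to branch
```

A simulator that meets the second case has to follow all three possibilities unless extra knowledge (for example, an ordering between the two magnitudes) rules some of them out.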


Appendix B

Collection of Flowcharts for the Qualitative Reasoning Framework


Figure B.1 Workflow of the QPT-based modelling, reasoning and explanation framework.


[Flowchart: Substrate Recognizer (Function No. 1). Steps shown: Enter; Recognize structural units in organic substrate; Identify structural units as nucleophiles or electrophiles; Prepare view pairs based on recognized types for each individual; Exit. Data elements shown: Chemical KB; Pairs of individuals; View Instance Structure (VIS).]

Figure B.2 The task performed by the “Substrate Recognizer”.
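The three steps named in Figure B.2 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the QRiOM implementation: the dictionary `CHEMICAL_KB`, the unit labels, and the function names are hypothetical, and the "view pairs" are modelled simply as every nucleophile paired with every electrophile.

```python
from itertools import product

# Hypothetical miniature "chemical KB": structural unit -> reactive role.
CHEMICAL_KB = {
    "OH-": "nucleophile",
    "H2O": "nucleophile",
    "carbocation": "electrophile",
    "C-Cl carbon": "electrophile",
}


def recognize_units(units):
    """Steps 1 and 2: keep the units known to the KB and tag each with its role."""
    return [(unit, CHEMICAL_KB[unit]) for unit in units if unit in CHEMICAL_KB]


def prepare_view_pairs(units):
    """Step 3: pair every recognized nucleophile with every recognized electrophile."""
    tagged = recognize_units(units)
    nucleophiles = [u for u, role in tagged if role == "nucleophile"]
    electrophiles = [u for u, role in tagged if role == "electrophile"]
    return list(product(nucleophiles, electrophiles))


# "Na+" is not in the KB, so it is ignored as a spectator.
print(prepare_view_pairs(["C-Cl carbon", "OH-", "Na+"]))
# prints [('OH-', 'C-Cl carbon')]
```

In this sketch the returned list plays the role of the pairs of individuals stored in the VIS.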


[Flowchart: Model Constructor (Function No. 2). Steps shown: Enter; Retrieve the suggested chemical process; Retrieve the chemical data and theories implicit in the identified bond activity; Store the chemical theories in special purpose arrays (in the format of QPT slot); Exit. Decision nodes shown: "Make-bond?" and "Break-bond?", each with yes/no branches.]

Figure B.3 Workflow for automating QPT model for organic processes.
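As a rough illustration of the workflow in Figure B.3, the sketch below stores the theory retrieved for a bond activity in a record with the five slots of a QPT process (individuals, preconditions, quantity conditions, relations, influences; Forbus, 1984). Everything else is an assumption made for the example: the class `QPTProcess`, the table `THEORIES`, the function `construct_model`, and the slot contents are hypothetical placeholders, not the entries used in QRiOM.

```python
from dataclasses import dataclass, field


@dataclass
class QPTProcess:
    """One process description using the five QPT slots."""
    name: str
    individuals: list = field(default_factory=list)
    preconditions: list = field(default_factory=list)
    quantity_conditions: list = field(default_factory=list)
    relations: list = field(default_factory=list)
    influences: list = field(default_factory=list)


# Hypothetical theory table; the strings are illustrative placeholders only.
THEORIES = {
    "make-bond": {
        "individuals": ["nucleophile", "electrophile"],
        "preconditions": ["nucleophile and electrophile form a reactive pair"],
        "quantity_conditions": ["lone_pairs(nucleophile) >= 1"],
        "relations": ["charge(nucleophile) rises as lone_pairs(nucleophile) falls"],
        "influences": ["lone_pairs(nucleophile) decreases"],
    },
    "break-bond": {
        "individuals": ["carbon centre", "leaving group"],
        "preconditions": ["carbon centre is bonded to the leaving group"],
        "quantity_conditions": ["bonds(carbon centre) >= 1"],
        "relations": ["charge(carbon centre) rises as bonds(carbon centre) falls"],
        "influences": ["bonds(carbon centre) decreases"],
    },
}


def construct_model(bond_activity):
    """Return a QPT process record for a make-bond or break-bond activity, else None."""
    theory = THEORIES.get(bond_activity)
    if theory is None:  # neither "Make-bond?" nor "Break-bond?" answered yes
        return None
    return QPTProcess(name=bond_activity, **theory)


model = construct_model("break-bond")
print(model.individuals)  # prints ['carbon centre', 'leaving group']
```

Keeping the slots as separate lists mirrors the "special purpose arrays" mentioned in the figure, so that later stages can read one slot at a time.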


[Flowchart: Qualitative Simulator (Function No. 3), with the QSA routine expanded.

Main flow, steps shown: Enter; Check process's quantity; QSA; decision "Process entry_conditions violated?" (Yes/No); Exit. Annotations: quantities that are directly influenced by a process are termed "direct influence" in QPT; the entry-condition check is made so that the current process may stop and the next process may start, if there are reactive individuals left in the VIS.

The QSA routine, steps shown: Enter; Store direct influenced quantity in special purpose array; Check qualitative proportionalities in "Relation" slot of the process; Examine indirect influences and their effect propagation; Store propagated effects in special purpose arrays; Store new individuals in VIS; Exit. Annotations: "This step helps provide causal graph generation"; "This step helps answering the how? why? what? types of question regarding a process simulation and behaviour"; new individuals are, for example, the intermediates produced in each small reaction step.]

Figure B.4 Workflow of the QPT-based simulation and the micro steps in the QSA module.
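The propagation step of the QSA routine in Figure B.4 can be sketched as follows. This is a simplified illustration, not the QRiOM code: the function `propagate`, the encoding of trends as +1 (increasing) and -1 (decreasing), and the quantity names are hypothetical, and conflicting influences on the same quantity are ignored (the first trend derived is kept).

```python
def propagate(direct, proportionalities):
    """Propagate direct influences through qualitative proportionalities.

    direct: {quantity: +1 or -1}, the trends set directly by the process.
    proportionalities: list of (source, target, sign) triples, where sign is
        +1 if target changes in the same direction as source, -1 if opposite.
    Returns {quantity: trend} for every quantity reached.
    """
    trends = dict(direct)
    frontier = list(direct)
    while frontier:
        source = frontier.pop()
        for src, target, sign in proportionalities:
            if src == source and target not in trends:
                trends[target] = trends[source] * sign
                frontier.append(target)
    return trends


# Hypothetical example: a process directly decreases the lone pairs on oxygen.
direct = {"lone_pairs(O)": -1}
proportionalities = [
    ("lone_pairs(O)", "charge(O)", -1),   # fewer lone pairs, higher charge
    ("charge(O)", "stability(O)", -1),    # higher charge, lower stability
]
print(propagate(direct, proportionalities))
# prints {'lone_pairs(O)': -1, 'charge(O)': 1, 'stability(O)': -1}
```

The order in which quantities enter the result also gives a simple causal chain, which is the kind of record an explanation component could later read back.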


[Flowchart: Explanation Generator (Function No. 4). Flow shown: Enter; Select a question from the pull-down list; one decision node (yes/no) per question type; Display answers accordingly; Exit.

Question types shown: "Why does a process start/stop?"; "What are the causes for a particular quantity to assign a new value?"; "Why some structural units left the main compound?"; "How a particular type of intermediate product is obtained?".

Checks shown: Check Qty-Cond to see which entry requirement is violated; Check all functional dependencies starting from the direct influence slot of a QPT process; Check quantity spaces for affected individuals; Check what new individuals are created during a process reasoning.]

Figure B.5 Workflow of the technique used in handling and generating an explanation.
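The question-driven dispatch in Figure B.5 can be sketched as a lookup from the selected question to a routine that inspects the simulation record. The sketch is illustrative only: the `TRACE` record, the handler functions, and the wording of the answers are hypothetical, only two of the four question types are covered, and the pairing of each question with a particular check is our reading of the flowchart rather than something the figure states explicitly.

```python
# Hypothetical record kept by the simulator for one process.
TRACE = {
    "process": "make-bond",
    "violated_conditions": ["lone_pairs(nucleophile) >= 1"],
    "new_individuals": ["protonated alcohol"],
}


def why_stopped(trace):
    """Answer from the quantity conditions that no longer hold."""
    reasons = ", ".join(trace["violated_conditions"])
    return f"Process {trace['process']} stopped because this condition no longer holds: {reasons}."


def how_obtained(trace):
    """Answer from the individuals created while reasoning about the process."""
    products = ", ".join(trace["new_individuals"])
    return f"Process {trace['process']} created: {products}."


# Question text (as in the pull-down list) -> handler.
HANDLERS = {
    "Why does a process start/stop?": why_stopped,
    "How a particular type of intermediate product is obtained?": how_obtained,
}


def explain(question, trace):
    """Dispatch the selected question to its handler and return the answer text."""
    handler = HANDLERS.get(question)
    if handler is None:
        return "No explanation is available for this question."
    return handler(trace)


print(explain("Why does a process start/stop?", TRACE))
```

Adding a further question type then only requires one more handler and one more entry in `HANDLERS`.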


Appendix C

Questionnaires Used for Collecting Students’ Feedback on the use of QPT and Qualitative Reasoning Approaches


PART A: Your understanding towards the QPT.

Directions:

To answer this questionnaire, take your time to really get into the mood of the situation. When rating the statement below, please give your opinions based on what you actually ‘experienced’ while studying the QPT way of modelling the organic reaction and SN1 mechanism. Indicate the extent to which you agree or disagree with each statement by circling the number that best express your opinion.

Scale: 1 = Strongly Disagree, 2 = Disagree, 3 = Neither Agree Nor Disagree, 4 = Agree, 5 = Strongly Agree

1. The identification of quantities (parameters) helped me to establish the functional dependency among them.
1 2 3 4 5

2. The specification represented using QPT makes it easy to understand the organic processes (reaction steps) that are involved in a chemical reaction simulation.
1 2 3 4 5

3. The various slots in a QPT process make me confused.
1 2 3 4 5

4. The flow of the reasoning is more systematic when a QPT specification that captures the chemical knowledge and intuition is given.
1 2 3 4 5

5. I still don’t know how to read a QPT model even though it is already taught.
1 2 3 4 5

6. The specification describes almost exactly what I have in mind.
1 2 3 4 5

7. There are still many concepts implicit in the reaction formula but I don’t seem to see them in the model.
1 2 3 4 5

Figure C.1 Questionnaire to assess students’ understanding on QPT.


PART B: Your opinion about applying qualitative modelling and reasoning in the teaching and learning organic reaction mechanisms.

Directions:

Based on the short lecture about qualitative reasoning and modelling using QPT for learning SN1 reaction, please indicate (by circling) on the scale below, your opinion about the appropriateness and effectiveness of applying qualitative in the said domain.

Scale: 1 = Strongly Disagree, 2 = Disagree, 3 = Neither Agree Nor Disagree, 4 = Agree, 5 = Strongly Agree

1. The likelihood that I would read further about qualitative reasoning & modelling is high. 1 2 3 4 5

2. I don’t like the trouble of going through modelling and simulation before real experiment. 1 2 3 4 5

3. If I want to explain reaction mechanisms to my friends, this is the type of formal way that I’m looking for. 1 2 3 4 5

Please indicate, in general, how favourable you are with the qualitative way of

modelling and explaining the reactions and its mechanism.

Scale: 1 = Strongly Favourable, 2 = Favourable, 3 = Neither Favourable Nor Unfavourable, 4 = Unfavourable, 5 = Strongly Unfavourable

1 2 3 4 5

If you wish to make any suggestions regarding any of the points covered in this

questionnaire, please use the space provided.

--------------------------------------------------------------------------------------------------------- --------------------------------------------------------------------------------------------------------- --------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------

Figure C.2 Questionnaire to collect students’ opinions on qualitative modelling and reasoning approaches of problem solving for organic chemistry.


Appendix D

Selected Computer Screenshots


Figure D.1 Login page.

Figure D.2 Front page of the QRiOM qualitative simulator.


Figure D.3 Main interface of QRiOM.

Figure D.4 More learning activities and explanation can be viewed by clicking A, B and C buttons.


Figure D.5 Reaction route for the simulation of “CH3Cl + HO−”.

Figure D.6 Reaction route for the simulation of “CH3CH3CH3Br + 2H2O”.


Figure D.7 Reaction route for “CH3CH3CH2Cl + HO−”.

Figure D.8 QPT model inspection page.

(Annotation in Figure D.8: Learners may inspect the automated models.)


Figure D.9 A “make-bond” process described in QPT terms (between a charged nucleophile and a charged electrophile).

Figure D.10 A causal graph showing the cause and effect relationships of the various chemical parameters during qualitative reasoning.


Figure D.11 Causal graph inspection page with annotation.

Figure D.12 Brief explanation of each slot in a QPT model is provided.


Figure D.13 More explanation for the various modelling constructs of QPT.

Figure D.14 Contents of the View Instance Structure (VIS) give the pairs of reacting species used in each small reaction step.


Figure D.15 A snapshot of the contents of the VIS during the simulation of “CH3CH3CH3COH + HBr”.

Figure D.16 Each chemical state change (parameter state history) is recorded for further examination.

(Annotation in Figure D.16: Learners can select any reacting species (views) to study its parameter history.)


Figure D.17 Chemical states for “HO−” are retrieved and displayed.

Figure D.18 Contents in the “substrate table” showing the functional units involved in a reaction.


Figure D.19 The screenshot for a specific case where QRiOM is unable to predict the output, where the reason is displayed via a pop-up window.

Figure D.20 A screenshot of “no reasoning” for an input pair of <CH3Cl, HF>, where the system simply returns a short message.


Figure D.21 A QPT learning corner is included in the software.

Figure D.22 A “terminology help window” that provides quick notes for important organic chemistry terms used in simulation and explanation.


Figure D.23 The main interface for “model building” by the students – for future expansion of the simulator.

Figure D.24 Knowledge base Editor – for adding/deleting chemical facts and theories.


Appendix E

Program Snippets for the Main Software Modules in QRiOM


Program Snippets for Views and Processes Constructors

Figure E.1 shows the Java instructions that retrieve the chemical facts from the KB (stored as Prolog clauses) and prepare the slots of a QPT model. This chemical information is used to compose QPT processes. Note that text after the “//” sign is a comment.

:
amzi.ls.LogicServer ls = new amzi.ls.LogicServer();
long term3, term4, term5;
globalDataEG gd = new globalDataEG();
step3_Nu = TwoViewStructureArr[1];                 // E.g. Cl-
String tempElectrophile = TwoViewStructureArr[0];  // E.g. C+
try {
    ls.Init("");
    ls.Load("chemkb.xpl");  // Connect to Prolog backend file
    // Retrieve the required data for a nucleophile
    term3 = ls.ExecStr("nucleophile('"+step3_Nu+"', Q)");
    if (term3 == 0) {
        ls.Close();
        return;
    }
    gd.TypeList1[trying] = nuStart[trying] = ls.GetStrArg(term3, 2);
    term4 = ls.ExecStr("find_reacting_unit(intermediateProd, tempElectrophile, P , _)");
    if (term4 == 0) {
        ls.Close();
        return;
    }
    step3_Elec = ls.GetStrArg(term4, 3);
    // Retrieve the required data for an electrophile
    term5 = ls.ExecStr("electrophile('"+step3_Elec+"', R)");
    if (term5 == 0) {
        ls.Close();
        return;
    }
    gd.TypeList2[trying] = electroReagent[trying] = ls.GetStrArg(term5, 2);
    ls.Close();
} // end try
catch (Throwable t) {
    t.printStackTrace();
}

(a) Retrieving essential chemical information of the individuals involved in a chemical process


public void viewModel_actionPerformed(ActionEvent e) {
    amzi.ls.LogicServer ls = new amzi.ls.LogicServer();
    :
    String pn = gd.ProName;
    String ind1 = gd.View1;
    String Ind1Type = gd.Type1;
    String Ind2Type = gd.Type2;
    String ind2 = gd.View2;
    :
    jTextArea2.setFont(new java.awt.Font("Dialog", Font.BOLD, 12));
    jTextArea2.append("Process Activated: " + gd.ProNameList[0] + "\n");
    jTextArea2.append("\n" + "Individuals (The reacting units in this process/step)" + "\n");
    jTextArea2.setFont(new java.awt.Font("Dialog", Font.PLAIN, 10));
    jTextArea2.append(" " + gd.ViewList1_sn2[0] + "\t" + gd.ViewList2_sn2[0] + "\n");
    jTextArea2.setFont(new java.awt.Font("Dialog", Font.BOLD, 12));
    jTextArea2.append("\n" + "Quantity-Condition (Entry requirements to activate the process)" + "\n");
    jTextArea2.setFont(new java.awt.Font("Dialog", Font.PLAIN, 10));
    :
    ls.Load("bondKB.xpl");
    term1 = ls.CallStr("qty_cond("+gd.ProNameList[0]+", "+gd.TypeList1[0]+", X, Y, Z).");
    :
    do {
        jTextArea2.append(" " + ls.GetStrArg(term1, 3) + "(" + ls.GetStrArg(term1, 4) + ") "
            + ls.GetStrArg(term1, 5) + "\n" );
    } while (ls.Redo());
    term2 = ls.CallStr("qty_cond("+gd.ProNameList[0]+", "+gd.TypeList2[0]+", X, Y, Z).");
    :
    do {
        jTextArea2.append(" " + ls.GetStrArg(term2, 3) + "(" + ls.GetStrArg(term2, 4) + ") "
            + ls.GetStrArg(term2, 5)+ "\n" );
    } while (ls.Redo());
    jTextArea2.setFont(new java.awt.Font("Dialog", Font.BOLD, 12));
    jTextArea2.append("\n" + "Influences (Direct effect caused by the process)" + "\n");
    jTextArea2.setFont(new java.awt.Font("Dialog", Font.PLAIN, 10));
    if (gd.ProNameList[0].equals("make_bond"))
        jTextArea2.append(" " + "A covalent bond is added (formed)" + "\n");
    else
        jTextArea2.append(" " + "A covalent bond is removed (cleaved)" + "\n");
    jTextArea2.setFont(new java.awt.Font("Dialog", Font.BOLD, 12));
    jTextArea2.append("\n" + "Parameters dependency (Effects propagation)");
    jTextArea2.setFont(new java.awt.Font("Dialog", Font.PLAIN, 10));
    :
    do {
        jTextArea2.append(" " + ls.GetStrArg(term3, 3) + "(" + ls.GetStrArg(term3, 5) + ") followed by "
            + ls.GetStrArg(term3, 4) + "(" + ls.GetStrArg(term3, 6) + ")" + "\n");
    } while (ls.Redo());
    jTextArea2.append("\n" + gd.ViewList2[0] + " [" + gd.TypeList2[0] +"]" + ":" + "\n");
    term3 = ls.CallStr("process_relations(make_bond, "+gd.TypeList2[0]+", P, Q, R, S).");
    :
    do {
        jTextArea2.append(" " + ls.GetStrArg(term3, 3) + "(" + ls.GetStrArg(term3, 5) + ") followed by "
            + ls.GetStrArg(term3, 4) + "(" + ls.GetStrArg(term3, 6) + ")" + "\n");
    } while (ls.Redo());
    :
}

Annotations in the figure:

Get ready the individuals for the chemical process

Based on the view’s type, general set of chemical theories are retrieved from the KB

Display the effect propagation caused by the process. These are the indirect influences of a QPT model

Prepare the headings for the QPT model, and the direct influence of the process

Prepare the individuals for the chemical process

(b) Preparing QPT slots for displaying on the GUI

Figure E.1 The Java code for retrieving chemical facts of reacting species and for constructing a QPT process.


Program Snippets for Prediction Engine

The Quantity Space Analyzer (QSA) is one of the most important software modules in the reasoning engine of QRiOM. The module is called to update and maintain multiple data structures whenever an organic process is determined from the view pairs. The Java code for the “view structure updating” and “atom properties updating” submodules of the QSA is shown in the next few subsections. Together, these two submodules help predict the final product of a reaction simulation.

View structure updating: Figure E.2 shows the main processing steps in keeping track

of the updated status of the VIS. The contents of the array can be used to suggest the

next chemical process for reasoning.

:
if (Subst_1_ChargedHistory[4].equals("pos")) {
    appendedCharge1 = StartMaterialTable[0].concat("+");  // Check its charge’s state
    theStr1 = "CH3CH3CH3".concat(appendedCharge1);
}
else if (Subst_1_ChargedHistory[4].equals("neg")) {
    appendedCharge1 = StartMaterialTable[0].concat("-");
    theStr1 = "CH3CH3CH3".concat(appendedCharge1);
}
else
    theStr1 = "CH3CH3CH3".concat(StartMaterialTable[0]);
if (Agent_2_ChargedHistory[4].equals("pos")) {
    appendedCharge1 = StartMaterialTable[1].concat("+");
    theStr2 = theStr1.concat(appendedCharge1);
}
else if (Agent_2_ChargedHistory[4].equals("neg")) {
    appendedCharge1 = StartMaterialTable[1].concat("-");
    theStr2 = theStr1.concat(appendedCharge1);
}
else
    theStr2 = theStr1.concat(StartMaterialTable[1]);
ViewStructureArr[0] = theStr2;  // Update the contents of VIS
:

Figure E.2 The associated Java statements for updating the VIS in order to suggest the next organic process in the qualitative simulation environment.
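The two conditional chains in Figure E.2 apply the same charge-suffix rule to each reacting unit. As a minimal sketch only (not part of the QRiOM source), that rule could be factored into one helper. The names ViewLabel, withCharge and compose are hypothetical; the state names "pos" and "neg" follow the listing.

```java
// Illustrative sketch only; ViewLabel, withCharge and compose are hypothetical names.
final class ViewLabel {

    // Appends "+" or "-" to a unit name according to its qualitative charge state.
    // Any other state (e.g. "neutral") leaves the name unchanged.
    static String withCharge(String unit, String chargeState) {
        if ("pos".equals(chargeState)) return unit + "+";
        if ("neg".equals(chargeState)) return unit + "-";
        return unit;
    }

    // Builds a VIS entry from the carbon skeleton and two charge-annotated units.
    static String compose(String skeleton, String unit1, String state1,
                          String unit2, String state2) {
        return skeleton + withCharge(unit1, state1) + withCharge(unit2, state2);
    }

    public static void main(String[] args) {
        System.out.println(compose("CH3CH3CH3", "C", "pos", "Cl", "neg"));
    }
}
```

With such a helper, each branch pair in the listing would reduce to a single call per reacting unit.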


Atom property updating: Figure E.3 presents the code for updating the chemical states of atoms during simulation. The contents of the special tables are used to generate causal graphs, display the atom property table and produce the parameter history table.

:
ls.Init("");
ls.Load("chemkb.xpl");
// Prepare to retrieve chemical theories
t1 = ls.CallStr("qpropAPTable(make_bond, chargedElec, P3, Q3, R3, S3)");
if (t1 == 0) {
    :
    ls.Close();
    return;
}
do {
    if (n1 == 1) {
        qtyArrTemp[0] = ls.GetStrArg(t1, 3);
        qtyArrTemp[1] = ls.GetStrArg(t1, 4);
        signArrTemp[0] = ls.GetStrArg(t1, 5);
        signArrTemp[1] = ls.GetStrArg(t1, 6);
    }
    if (n1 == 2) {
        qtyArrTemp[2] = ls.GetStrArg(t1, 3);
        qtyArrTemp[3] = ls.GetStrArg(t1, 4);
        signArrTemp[2] = ls.GetStrArg(t1, 5);
        signArrTemp[3] = ls.GetStrArg(t1, 6);
    }
    n1++;
} while (ls.Redo());
if ((individual_2_type.equals("chargedElec")) && (bActivity.equals("make_bond"))) {
    if ((signArrTemp[0].equals("plus")) && (qtyArrTemp[0].equals("charge"))) {
        for (int t=0; t<=2; t++)
            if (charge[t].equals(smt_0_ChargeVal)) index = t;
        gdat.Sub_1_ChargedHistory[4] = Subst_1_ChargedHistory[4] = charge[++index];
        smt_0_ChargeVal = Subst_1_ChargedHistory[4];
    }
    else if ((signArrTemp[0].equals("minus")) && (qtyArrTemp[0].equals("charge"))) {
        for (int t=0; t<=2; t++)
            if (charge[t].equals(smt_0_ChargeVal)) index = t;
        gdat.Sub_1_ChargedHistory[4] = Subst_1_ChargedHistory[4] = charge[--index];
        smt_0_ChargeVal = Subst_1_ChargedHistory[4];
    }
    if ((signArrTemp[0].equals("plus")) && (qtyArrTemp[0].equals("lone_pair_electron"))) {
        for (int t=0; t<=4; t++)
            if (String.valueOf(lone_pair[t]).equals(smt_0_LPVal)) index = t;
        gdat.Sub_1_LonePairHistory[4] = Subst_1_LonePairHistory[4] = String.valueOf(lone_pair[++index]);
        smt_0_LPVal = Subst_1_LonePairHistory[4];
    }
    else if ((signArrTemp[0].equals("minus")) && (qtyArrTemp[0].equals("lone_pair_electron"))) {
        for (int t=0; t<=4; t++)
            if (String.valueOf(lone_pair[t]).equals(smt_0_LPVal)) index = t;
        gdat.Sub_1_LonePairHistory[4] = Subst_1_LonePairHistory[4] = String.valueOf(lone_pair[--index]);
        smt_0_LPVal = Subst_1_LonePairHistory[4];
    }
    if ((signArrTemp[0].equals("plus")) && (qtyArrTemp[0].equals("no_of_bond"))) {
        for (int t=0; t<=4; t++)
            if (String.valueOf(bond[t]).equals(smt_0_BondVal)) index = t;
        gdat.Sub_1_BondHistory[3] = Subst_1_BondHistory[3] = String.valueOf(bond[++index]);
        smt_1_BondVal = Subst_2_BondHistory[4];
    }
    else if ((signArrTemp[0].equals("minus")) && (qtyArrTemp[0].equals("no_of_bond"))) {
        for (int t=0; t<=4; t++)
            if (String.valueOf(bond[t]).equals(smt_0_BondVal)) index = t;
        gdat.Sub_2_BondHistory[4] = Subst_2_BondHistory[4] = String.valueOf(bond[--index]);
        smt_0_BondVal = Subst_1_BondHistory[4];
    }
    :
    :
}

Annotations in the figure:

The chemical theories (stored as qprop) for the identified organic process in the chemical KB are retrieved and stored in temporary arrays

The states of the chemical parameter are updated based on the qualitative proportionalities retrieved earlier.

Figure E.3 The Java code for updating the chemical parameters’ states of each atom during simulation.
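Each branch in Figure E.3 performs the same operation: locate the current value in an ordered quantity space, then move one position up for a "plus" sign or one position down for a "minus" sign. The following is a minimal sketch of that single step, not the QRiOM implementation. The class name QuantitySpaceStep, the method step and the ordering neg < neutral < pos are assumptions made for illustration; only the value names themselves appear in the listings.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; the ordering of the quantity space is an assumption.
final class QuantitySpaceStep {

    // Assumed ordered quantity space for the "charge" parameter.
    static final List<String> CHARGE = Arrays.asList("neg", "neutral", "pos");

    // Moves a value one position along the quantity space.
    // "plus" moves up, "minus" moves down, any other sign leaves the value unchanged.
    static String step(List<String> space, String current, String sign) {
        int index = space.indexOf(current);
        if (index < 0) {
            throw new IllegalArgumentException("Unknown value: " + current);
        }
        if ("plus".equals(sign)) {
            index++;
        } else if ("minus".equals(sign)) {
            index--;
        }
        // Unlike the listing, this sketch rejects a step past either end explicitly.
        if (index < 0 || index >= space.size()) {
            throw new IllegalStateException("Value would leave the quantity space");
        }
        return space.get(index);
    }

    public static void main(String[] args) {
        System.out.println(step(CHARGE, "neutral", "plus"));
        System.out.println(step(CHARGE, "pos", "minus"));
    }
}
```

The same method would serve the lone-pair and bond-count parameters if their quantity spaces were passed in as lists.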


Program Snippets for Causal Graph Generator

Figure E.4 shows the code fragment that generates a causal graph by using the entries in

various special purpose data structures discussed in Chapter 5.

public void causalGraph_actionPerformed(ActionEvent e) {
    globalDataEG gd = new globalDataEG();
    jCG.setText(" ");
    jCG.append("\nThis is the causal diagram for the reaction formula you just selected\n");
    if (gd.procInvolved[0].equals("make_bond")){
        jCG.append("Step 1: Make-Bond Process\n");
        jCG.append("Nucleophile" + "(" + gd.smn + ")" + "\t\t\t" + "Electrophile" + "(" + gd.aat00 + ")" +"\n" );
        for(int y=0; y<=2; y++){
            if(!(gd.par1[y].equals("nil")))
                if (y==1)
                    jCG.append("\t\t" + gd.par1[y] + "(" + gd.sign1[y] + ")" );
                else
                    jCG.append("\t\t" + gd.par1[y] + "(" + gd.sign1[y] + ")" );
            else
                jCG.append("\t\t\t\t");
            if(!(gd.par2[y].equals("no change"))) {
                if (y==2)
                    jCG.append("\t\t\t" + gd.par2[y] + "(" + "no change" + ")" + "\n");
                else if (y==1)
                    jCG.append("\t" + gd.par2[y] + "(" + gd.sign2[y] + ")" + "\n");
                else
                    jCG.append("\t\t" + gd.par2[y] + "(" + gd.sign2[y] + ")" + "\n");
            }
            else
                jCG.append("\n");
        }
        :
        :
} // End Generate Causal Graph

Annotation in the figure: This code prepares the values taken by each main parameter during the entire reasoning and simulation. The set of values is then formatted for the appropriate graphical user interfaces.

Figure E.4 The Java statements for constructing a causal graph.

Parameter History Maintenance and Retrieval: The construction of causal models

and the production of 2D organic structure would require referencing the contents of the

so-called “parameter history” structure. The Java code for retrieving the parameter

history of a given reacting unit is presented in Figure E.5.


:
if (jComboBox1.getSelectedIndex() == 0){ // if (unitNameSelected.equals("gd.AAT[0]")){
    jTextArea2.append("\n" + "Charge History:" + "\n");
    jTextArea2.append("Initial State " + " State After Step 1 " + " State After Step 2 " + " State After Step 3 (Final State)" + "\n");
    for (int p=1; p<=4; p++) {
        if (gd.Agen_1_ChargedHistory[p].equals(" "))
            gd.Agen_1_ChargedHistory[p] = "nil";
        jTextArea2.append("[" + gd.Agen_1_ChargedHistory[p] + "]");
        jTextArea2.append(" ");
    }
    jTextArea2.append("\n\n" + "Covalent Bond History:" + "\n");
    jTextArea2.append("Initial State " + " State After Step 1 " + " State After Step 2 " + " State After Step 3 (Final State)" + "\n");
    for (int p=1; p<=4; p++) {
        if (gd.Agen_1_BondHistory[p].equals(" "))
            gd.Agen_1_BondHistory[p] = "nil";
        jTextArea2.append("[" + gd.Agen_1_BondHistory[p] + "]");
        jTextArea2.append(" ");
    }
    jTextArea2.append("\n\n" + "Lone Pair History:" + "\n");
    jTextArea2.append("Initial State " + " State After Step 1 " + " State After Step 2 " + " State After Step 3 (Final State)" + "\n");
    for (int p=1; p<=4; p++) {
        if (gd.Agen_1_LonePairHistory[p].equals(" "))
            gd.Agen_1_LonePairHistory[p] = "nil";
        jTextArea2.append("[" + gd.Agen_1_LonePairHistory[p] + "]");
        jTextArea2.append(" ");
    }
    jTextArea2.append("\n\n" +"Note: \"nil\" means \"no reaction\" in the step. ");
}

Figure E.5 The Java statements for retrieving the parameter history of a reacting unit.
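The three loops in Figure E.5 differ only in the history array they read. A minimal sketch of a shared formatter is given below, assuming (as in the listing) that index 0 of a history array is unused and that a single blank marks a step with no reaction. The class ParameterHistoryFormatter and its method format are hypothetical names, and unlike the listing this sketch does not overwrite the stored array.

```java
// Illustrative sketch only; ParameterHistoryFormatter and format are hypothetical names.
final class ParameterHistoryFormatter {

    // Formats entries 1..n of a history array as "[v1] [v2] ...",
    // showing "nil" for a blank entry (no reaction in that step).
    static String format(String title, String[] history) {
        StringBuilder sb = new StringBuilder(title).append(":\n");
        for (int p = 1; p < history.length; p++) {
            if (p > 1) {
                sb.append(' ');
            }
            String value = " ".equals(history[p]) ? "nil" : history[p];
            sb.append('[').append(value).append(']');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] chargeHistory = {"", "neutral", " ", "pos", "pos"};
        System.out.println(format("Charge History", chargeHistory));
    }
}
```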


Program Snippets for Knowledge Validation

A representative set of definitions in OntoRM is given in Figure E.6.

public class ontoRM {

    // Public data: the attributes that are common to all reacting units (nu/elec/LG)
    String hasNameRM = "";
    String hasTypeRM = "";
    int hasBondNumberRM =-1;
    String hasChargeStateRM = "";
    :
    String hasSuggestedMechanism;
    String [] processUsedRM = {"", "", ""};

    // For (NUCLEOPHILE) & (ELECTROPHILE)
    char hasElectroNegativityRM;            // GREATER_LESSER the neighboring atom
    String hasRsdegreeRM;                   // [primary, secondary, tertiary]
    String hasCarbocationStabilityRM;       // [no, yes]
    boolean hasAbilityAsLeavingGroupRM;     // [0, 1], for checking whether is a weak base (=stable ion)
    String hasNucleophilicity;              // [low, high]

    // For (LG)
    String hasDegreeSubstituentRM;          // [primary, secondary, tertiary]
    String hasAtomAttachmentTypeRM;         // ATOM_ATTACH_TYPE (carbon/oxygen)
    String hasBaseStrengthRM;               // [weak, strong], used by sn1/sn2 to see whether can break bond
    String hasBondTypeRM = "";              // [single, double, triple, ring]
    String hasElectroNegativityRM;          // [>, <]

    // For (Substrate)
    String hasFunctionalGroupNameRM;
    String hasFunctionalGroupTypeRM;
    int hasCarbonDegMainChainRM;

    // FOR ALCOHOL AND ALKYL HALIDE
    String hasBondTypeRM;                   // [single, double]
    String hasDegreeSubstituentBearingFuncUnitRM;  // [primary, secondary, tertiary]
    String hasLGTypeRM;                     // E.g. "OH" - ok, "Cl" - ok, "F" - no reaction (strong base, too reactive)
    String hasLGNameRM;

    // FOR SN1 AND SN2 MECHANISMS
    String [] hasPatternOfReactantsRM;      // E.g. how many bonds (at LG location), how many Rs (just 3 and 2 for sn1), what type of bonds
    String hasAliasRM;                      // full name of sn1 and sn2 respectively
    String [] hasReactantNamesRM;           // hydrogen chloride
    String [] hasSubstrateNamesRM;          // tertiary alcohol
    String [] hasEndProductsRM;             // water, alkyl halide, alcohol
    String [] hasRateDetermineStepRM;       // e.g. the dissociate step determines the reaction rate
    String [] hasProcessOrderRM;            // e.g. [make_bond, break_bond, make_bond]
    int hasAllowedDegreeOfCarbonRM;         // [1,2,3]
    int hasDegreeSubstituentMainCarbonRM;   // [1,2,3]
    :
    String [] [] hasViewPairConstraintRM = {{"make_bond", "neutralNu", "chargedElec"},
                                            {"make_bond", "chargedElec", "chargedNu"},
                                            {"make_bond", "chargedNu", "neutralElec"},
                                            {"break_bond", "neutralElec", "chargedNu"},
                                            {"break_bond", "neutralElec", "neutralNu"} };
    String [] hasSpecialCauseRM = {"nucleophilicity", "temperature", "pH", "Solvent Type", "LG type", "Alkyl halide used"};

    // Default Constructor
    :
    // Other Constructor
    public ontoRM(String theViewName, String theViewType, int theBondNo, String theChargeState) {
        hasNameRM = theViewName;
        hasTypeRM = theViewType;
        hasBondNumberRM = theBondNo;
        hasChargeStateRM = theChargeState;
    }
    :
}

(a) Representative set of attributes for constituents of organic compounds and the definitions of general chemical properties of reaction mechanisms.


:
functional_unit(nucleophile).
functional_unit(electrophile).
rm(organic_mechanism).
is_a(chargedNu, nucleophile).
is_a(neutralNu, nucleophile).
is_a(chargedElec, electrophile).
is_a(neutralElec, electrophile).
is_a(nucleophilic_substitution, organic_mechanism).
is_a(elimination, organic_mechanism).
is_a(electrophilic_addition, organic_mechanism).
example_of(chloride_ion, chargedNu).
example_of(alcohol_oxygen, neutralNu).
example_of(carbon, neutralElec).
example_of(hydrogen_ion, chargedElec).
example_of(carbocation, chargedElec).
example_of(sn1, nucleophilic_substitution).
example_of(sn2, nucleophilic_substitution).
:

(b) Hierarchical structuring of the basic concepts of reaction mechanisms (in Prolog syntax)

Figure E.6 A sample set of definitions for nucleophiles, electrophiles and the basic concepts of organic mechanisms.
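The is_a and example_of facts in Figure E.6(b) form a small concept hierarchy that can be walked upwards to decide whether a species belongs to a broader category. The sketch below shows one way this lookup might be mirrored in Java. It is an illustration only: it treats both relations as plain parent links and assumes each concept has a single parent, and the names ConceptHierarchy, addLink, isKindOf and sample are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; treats is_a and example_of alike as child-to-parent links.
final class ConceptHierarchy {

    private final Map<String, String> parentOf = new HashMap<>();

    void addLink(String child, String parent) {
        parentOf.put(child, parent);
    }

    // Walks up the parent links; the hop limit guards against accidental cycles.
    boolean isKindOf(String concept, String ancestor) {
        String current = concept;
        for (int hops = 0; current != null && hops <= parentOf.size(); hops++) {
            if (current.equals(ancestor)) {
                return true;
            }
            current = parentOf.get(current);
        }
        return false;
    }

    // Loads a few of the facts listed in Figure E.6(b).
    static ConceptHierarchy sample() {
        ConceptHierarchy h = new ConceptHierarchy();
        h.addLink("chargedNu", "nucleophile");
        h.addLink("chargedElec", "electrophile");
        h.addLink("nucleophilic_substitution", "organic_mechanism");
        h.addLink("chloride_ion", "chargedNu");
        h.addLink("carbocation", "chargedElec");
        h.addLink("sn1", "nucleophilic_substitution");
        return h;
    }

    public static void main(String[] args) {
        ConceptHierarchy h = sample();
        System.out.println(h.isKindOf("chloride_ion", "nucleophile"));
        System.out.println(h.isKindOf("carbocation", "nucleophile"));
    }
}
```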


Knowledge Validation Implemented as Java Methods

Figure E.7 shows the Java method that checks whether two nucleophiles are from the same group in the periodic table. If so, there will be a nucleophilic substitution. Otherwise, a “no reaction” message is suggested. This prevents the qualitative simulator from carrying out reasoning when none is actually required. Note that the checkNucleophilicity method makes use of the “hasNucleophilicity” attribute of a NucleophileView concept in OntoRM. Figure E.8 validates whether a substrate can undergo SN1 or SN2 at all. A pair of views and its suggested covalent bonding also undergo validation to make sure they form a valid organic process, as presented in Figure E.9.

public int checkNucleophilicity(String theComingNu, String theCompoundNu){
    int nu1Scale= -2, nu2Scale = -2, nuFlag = 0;
    String globalStr1="", globalStr2="", globalStr3 = "";
    globalDataEG gNu = new globalDataEG();
    :
    if (theCompoundNu.equals("F")) nu2Scale = nucleophilicity[0];
    else if (theCompoundNu.equals("Cl")) nu2Scale = nucleophilicity[1];
    else if (theCompoundNu.equals("Br")) nu2Scale = nucleophilicity[2];
    else if (theCompoundNu.equals("I")) nu2Scale = nucleophilicity[3];

    char[] tempNuArr = theComingNu.toCharArray();
    tempNuArr[0] = ' ';
    String theComingNuTran = String.valueOf(tempNuArr);
    :
    if (nu1Scale > nu2Scale) {
        gNu.gStr1 = globalStr1 = "Since the nucleophilicity of " + concatNuStr + " is higher than " + theCompoundNu + " (resides in the compound)" ;
        gNu.gStr2 = globalStr2 = "Hence, " + concatNuStr + " (a better nucleophile) will replace " + theCompoundNu ;
        gNu.gStr3 = globalStr3 = "The product is : " + concatFinalProd;
        nuFlag = -3;
    }
    else {
        gNu.gStr2 = globalStr2 = "Since the nucleophilicity of " + concatNuStr + " is lower than " + theCompoundNu + " (the nucleophilic center of the compound)" ;
        gNu.gStr3 = globalStr3 = "And, the "+ theCompoundNu + " is a poor LG, so the reaction is extremely slow.";
        gNu.gStr1 = globalStr1 = "The bond between the C and the " + theCompoundNu + " is difficult to break.";
        nuFlag = -4;
    }
    return nuFlag;
}

Figure E.7 A Java method that checks the nucleophilic reactivity for a pair of nucleophiles for possible substitution.
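The core decision in Figure E.7 is a comparison of two positions on a nucleophilicity scale. The sketch below isolates that comparison for the four halides named in the listing. It is a simplified illustration, not the QRiOM method: the numeric scale values used by the simulator are not shown in the excerpt, so a purely ordinal ranking F < Cl < Br < I is assumed (the usual trend in polar protic solvents), and the names HalideNucleophilicity, ORDER and canReplace are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; the ordinal ranking is an assumption, weakest nucleophile first.
final class HalideNucleophilicity {

    static final List<String> ORDER = Arrays.asList("F", "Cl", "Br", "I");

    // Returns true when the incoming halide ranks higher than the one already
    // bonded in the compound, i.e. when substitution would be suggested.
    static boolean canReplace(String incoming, String resident) {
        int in = ORDER.indexOf(incoming);
        int res = ORDER.indexOf(resident);
        if (in < 0 || res < 0) {
            throw new IllegalArgumentException("Only F, Cl, Br and I are ranked");
        }
        return in > res;
    }

    public static void main(String[] args) {
        System.out.println(canReplace("I", "Cl"));
        System.out.println(canReplace("F", "Br"));
    }
}
```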


public String noReactionCheck(String subst){
    long term1;
    try {
        amzi.ls.LogicServer ls = new amzi.ls.LogicServer();
        ls.Init("");
        ls.Load("ontology.xpl");
        term1 = ls.CallStr("no_reaction(X,'"+subst+"' ,Y)");
        if (term1 == 0) {
            JOptionPane.showMessageDialog(null, subst + " not found!\n", "Substrate's reactivity check", JOptionPane.PLAIN_MESSAGE);
            ls.Close();
            return "nil";
        }
        ls.Close();
    }
    catch (Throwable t) {
        t.printStackTrace();
    }
    return "hasreaction";
}

The Java method will require information from the chemical knowledge base, such as the following:

no_reaction(sn2, 'CH3CH3CH3COH', 'Reason: Crowded').
no_reaction(sn1, 'CH3OH', 'Reason: Intermediate that produced is not stable').
no_reaction(sn1, 'CH3CH3CH3CF', 'Reason: Unstable ion when left the compound').
non_reactive_substrate('CH4').
:

Figure E.8 A Java method that checks whether a substrate can undergo SN1 or SN2.

public String sameViewTypeCheck(String View1, String View2){
    long term1, term2;
    String sv1="", sv2="";
    try {
        amzi.ls.LogicServer ls = new amzi.ls.LogicServer();
        ls.Init("");
        ls.Load("ontology.xpl");
        term1 = ls.CallStr("has_view_type('"+View1+"', X)");
        if (term1 == 0) {
            ... );
            :
        }
        term2 = ls.CallStr("has_view_type('"+View2+"', Y)");
        if (term2 == 0) {
            JOptionPane.showMessageDialog(null, View2 + " is not found!\n", "View's type check", JOptionPane.PLAIN_MESSAGE);
            ls.Close();
            return "false";
        }
        sv1 = ls.GetStrArg(term1, 2);
        sv2 = ls.GetStrArg(term2, 2);
        ls.Close();
    }
    catch (Throwable t) {
        t.printStackTrace();
    }
    if (sv1.equals(sv2))
        return "no_bond_activity";
    else
        return "proceed_bond_activity";
}

Figure E.9 A Java method that checks the types of individual views in order to recommend a suitable chemical process.
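The view-type check of Figure E.9 can be combined with the hasViewPairConstraintRM table from Figure E.6(a) to validate a proposed process against a pair of view types. The following is a rough sketch of such a check, not the QRiOM code. The table rows are copied from the listing; treating a pair as order-insensitive and comparing subtypes such as chargedNu directly are assumptions, and the names ViewPairValidator, CONSTRAINTS and isValid are hypothetical.

```java
// Illustrative sketch only; order-insensitive matching is an assumption.
final class ViewPairValidator {

    // Rows copied from hasViewPairConstraintRM in Figure E.6(a):
    // {process, view type, view type}.
    static final String[][] CONSTRAINTS = {
        {"make_bond", "neutralNu", "chargedElec"},
        {"make_bond", "chargedElec", "chargedNu"},
        {"make_bond", "chargedNu", "neutralElec"},
        {"break_bond", "neutralElec", "chargedNu"},
        {"break_bond", "neutralElec", "neutralNu"}
    };

    static boolean isValid(String process, String type1, String type2) {
        // Two views of the same type give no bond activity (compare Figure E.9).
        if (type1.equals(type2)) {
            return false;
        }
        for (String[] row : CONSTRAINTS) {
            boolean sameOrder = row[1].equals(type1) && row[2].equals(type2);
            boolean swapped = row[1].equals(type2) && row[2].equals(type1);
            if (row[0].equals(process) && (sameOrder || swapped)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isValid("make_bond", "chargedElec", "neutralNu"));
        System.out.println(isValid("break_bond", "chargedNu", "chargedElec"));
    }
}
```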


Control Scheme to Stop the Entire Simulation

Figure E.10 gives the statements to end a simulation. Figure E.11 contains statements

to generate the sequence of organic processes used in the prediction of the outputs, and

Figure E.12 gives the statements to generate the final products.

:
while((count != 1) && (flag == -1)) {
    if (first_process == 'm'){
        viewPair vp = new viewPair(sn1_sequence1[proc_step]);
        gdat.ProNameList[step] = processInvolved[step] = sn1_sequence1[proc_step];
        vp.quantitySpaceAnalyzer();
        proc_step++;
        step = step + 1;
    }
    else if (first_process == 'b'){
        viewPair vp = new viewPair(sn1_sequence2[proc_step]);
        processInvolved[step] = sn1_sequence2[proc_step];
        vp.quantitySpaceAnalyzer();
        proc_step++;
        step = step + 1;
    }
    if (proc_step == 1) {
        gdat.OneVSA[0] = OneViewStructureArr[0] = ViewStructureArr[0];
        gdat.OneVSA[1] = OneViewStructureArr[1]= ViewStructureArr[1];
        gdat.OneVSA[2] = OneViewStructureArr[2]= ViewStructureArr[2];
    }
    else if (proc_step == 2) {
        gdat.TwoVSA[0] = TwoViewStructureArr[0]= ViewStructureArr[0];
        gdat.TwoVSA[1] = TwoViewStructureArr[1]= ViewStructureArr[1];
        gdat.TwoVSA[2] = TwoViewStructureArr[2]= ViewStructureArr[2];
    }
    else if (proc_step == 3) {
        gdat.ThreeVSA[0] = ThreeViewStructureArr[0]= ViewStructureArr[0];
        gdat.ThreeVSA[1] = ThreeViewStructureArr[1]= ViewStructureArr[1];
        gdat.ThreeVSA[2] = ThreeViewStructureArr[2]= ViewStructureArr[2];
        // Count the non-blank entries; equals() compares string contents rather than references
        for (int v=0; v<=ViewStructureArr.length-1; v++)
            if (!ViewStructureArr[v].equals(" ")) count++;
    }
    trying++;
    :
    :
    }// end else if
    if ((Agent_2_ChargedHistory[4].equals("neutral")) && (Subst_1_ChargedHistory[4].equals("neutral"))) {flag = 1;}
} // while-loop

Annotations in the figure:

This line checks whether the substituted nucleophile and the ultimate product are in their stable states (e.g. no charge around the atoms).

This line checks whether the VSA has just one element (i.e. count = 1); if so, the entire reaction may stop since the top-most loop condition is violated.

Figure E.10 The associated Java statements to stop the entire reaction simulation.
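The loop in Figure E.10 ends under either of two conditions: the view structure holds a single remaining entry, or both reacting units have returned to a neutral charge state. A minimal sketch of that combined test is shown below. It simplifies the listing (which counts entries only after the third step), and the names StopCondition, countNonBlank and shouldStop are hypothetical.

```java
// Illustrative sketch only; a simplified restatement of the loop's exit test.
final class StopCondition {

    // Counts entries that are neither null nor blank.
    static int countNonBlank(String[] viewStructure) {
        int count = 0;
        for (String entry : viewStructure) {
            if (entry != null && !entry.trim().isEmpty()) {
                count++;
            }
        }
        return count;
    }

    // Stops when only one view remains or when both units are neutral.
    static boolean shouldStop(String[] viewStructure,
                              String substrateCharge, String agentCharge) {
        boolean singleProduct = countNonBlank(viewStructure) == 1;
        boolean bothNeutral = "neutral".equals(substrateCharge)
                && "neutral".equals(agentCharge);
        return singleProduct || bothNeutral;
    }

    public static void main(String[] args) {
        String[] views = {"CH3CH3CH3CCl", " ", " "};
        System.out.println(shouldStop(views, "pos", "neg"));
    }
}
```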


for(int m=0; m<=2; m++)
    jTextArea1.append(" " + processInvolved[m]);

Annotation in the figure: This array is updated each time a new organic process is recommended.

Figure E.11 The Java statements for displaying the organic processes in their order of occurrence.

jTextArea3.append(" " + ThreeViewStructureArr[0] + " and ");
for (i = 0; i <= IntermediateArr.length -1; i++)
    jTextArea3.append(IntermediateArr[i]);

Annotation in the figure: The bottom-most element of the VSA stores the final product, while the side products are stored in the intermediate array.

Figure E.12 The Java statements to display the final product.