SWE321- Data Centered Design

Data-Centered Software Architecture

Dewan Tanvir Ahmed, Ph.D.

King Saud University


Overview

Data-centered software architecture is characterized by

A centralized data store that is shared by all surrounding software components

Consists of two types of components

Data store

Software component agents

Overview

The connection between the data module and the software components is implemented by either

Explicit method invocation or implicit method invocation

Software components do not communicate with each other directly

Only via the data store

The shared data module provides

Insertion

Deletion

Update and

Retrieval

Classification

Classified into two categories (based on flow control strategy)

Repository

Blackboard

Repository

Data store is passive

Clients of the data store are active: the clients take control of the flow logic

Client may access the repository

Interactively

Batch transaction

Examples

Database management system

Library Information system

Interface repository in CORBA

UDDI registry for Web Services

CASE tools (e.g., IBM Rational Rose)

IDE (Integrated Development Environment)

Blackboard

Data store is active

Clients are inactive/passive

Flow of logic is determined by the current data status

The clients of a blackboard system are called

Knowledge sources

Listeners or subscribers

A new data change may trigger events so that the knowledge sources take actions to respond to these events. These actions may result in new data and change in logic flow

Examples

Knowledge-based AI system

Voice and image recognition system

Security system

Business resource management system

Repository Style

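[Figure: box-line diagram of the repository style, with components labeled Bootstrap, Agent 1, Agent 2, ..., Agent n, and Repository]

Dashed lines mean that the clients have full control over the logic flow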

Each agent might have different interface, functionality, privilege

Repository Architecture

Example: Student Record DB

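[Figure: class diagram and database table for the student record example]

Student class: attributes SSN, Name (private); operations Student(), getSSN(), setSSN(), getName(), setName() (public)

Students table (rows r0, r1, ..., rn), reached through database connectivity:

123 Solomon

234 Anderson

987 Smith

Students class: attribute Vector or ArrayList (private); operations Students(), add(), remove(), next(), first() (public)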

Client: Students class, has a list of Student instances (in Vector)

Store: Students table (student_id, student_name)

Follows Model-View-Controller model

It describes the static relationships between data classes and their backing database tables

It describes the static relationships between data classes and their collection classes

A programming-oriented view of the repository design architecture

An instance of the Student class represents one specific record in the database table
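The client side of the student record example can be sketched in Java as follows. This is a minimal sketch, not the course's implementation: an in-memory list stands in for the Students table, the database connectivity layer is omitted, and the two-argument constructor, the cursor field, and the demo class are assumptions added for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// One Student object mirrors one row of the Students table (student_id, student_name).
class Student {
    private String ssn;
    private String name;

    // Assumption: the diagram shows Student(); arguments are added here for brevity.
    Student(String ssn, String name) {
        this.ssn = ssn;
        this.name = name;
    }

    String getSSN() { return ssn; }
    void setSSN(String ssn) { this.ssn = ssn; }
    String getName() { return name; }
    void setName(String name) { this.name = name; }
}

// Collection class: holds the Student instances and a simple cursor.
class Students {
    private final List<Student> rows = new ArrayList<>();
    private int cursor = -1; // hypothetical helper field, not in the diagram

    void add(Student s) { rows.add(s); }

    // Removes every record with the given SSN and resets the cursor.
    boolean remove(String ssn) {
        cursor = -1;
        return rows.removeIf(s -> s.getSSN().equals(ssn));
    }

    // Moves the cursor to the first record; returns null if there is none.
    Student first() {
        cursor = rows.isEmpty() ? -1 : 0;
        return cursor < 0 ? null : rows.get(0);
    }

    // Advances the cursor; returns null when no records are left.
    Student next() {
        if (cursor + 1 >= rows.size()) return null;
        cursor++;
        return rows.get(cursor);
    }
}

class StudentRepositoryDemo {
    public static void main(String[] args) {
        Students students = new Students();
        students.add(new Student("123", "Solomon"));
        students.add(new Student("234", "Anderson"));
        students.add(new Student("987", "Smith"));
        // The client drives the control flow; the store only answers requests.
        for (Student s = students.first(); s != null; s = students.next()) {
            System.out.println(s.getSSN() + " " + s.getName());
        }
    }
}
```

The loop in main reflects the passive role of the data store: nothing happens until the client asks for the first or next record.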

The relational database management system is a typical design domain for the repository architecture.

The data store of the repository maintains all types of data including schema (metadata), data tables, and index files for data tables.

Many tools are available to develop applications on the database stored in the database management system.

These include design, development, maintenance, and documentation tools.

Example 2: CASE Tool

[Figure: CASE tool built around a central data repository]

Labels around the repository: diagram drawing; fill-in forms (specifications); code for reverse engineering; diagrams; reports (text specifications); generated diagram from reverse engineering; graphic files; text report files; text code files; commands (program scripts, menu)

A Computer Aided Software Engineering (CASE) system is another popular application domain for the repository software architecture.

There are many CASE tools surrounding the data store

A user of CASE tools can draw a UML design diagram, such as a class diagram, collaboration diagram, or sequence diagram, and store the design blueprints in the CASE data store.

These UML diagrams can then be converted from one format to another.

Java or C++ skeleton code can also be generated from these UML diagrams. If there is code without the original design diagram, the UML diagram can still be regenerated by a reverse engineering tool.

There are many other input formats available for design and many output formats, as well.

Compiler construction is another good example of the repository architecture design.

Every compiler system has its own reserved keyword table, identifier symbol table, constant table generated after lexical analysis, and syntax and semantics trees generated by syntax and semantics analysis.

These tables' data structure in memory is shared by all phases of the compilation.

Each phase will generate new data or update the existing data in the data repository.

The control flow is directed by a driver program that takes source code as its input, goes through each phase step by step, and finally produces the target code, which is either executable or interpretable (for example, Java bytecode). In other words, the agents in a repository system are not necessarily completely independent.

There is a logical order in the executions of all compilation phases. There may still be some communication between individual agents.

For example, the lexical analysis may find some unacceptable characters, so that the compilation must be abandoned and compile errors must be reported.

Example 3: Compiler Construction

[Figure: compiler tool in which the scanner, parser, and code generator share data in memory]

Source code: int x, y; x= y + 1;

Symbol table: x int 0; y int 0

Parse tree: Statement with children Var, exp, and ";"; leaves x, y, +, 1, annotated with type (Int)

Target code: mov ax, [y]; add ax, 1; mov [x], ax

[x] is the address of variable x in the symbol table

Type check is done by the semantic parser

Question: why not batch sequence?

The data in memory are shared by all agents and the agents don't pass on data to each other directly.

The scanner takes two lines as input: an int variable declaration and an assignment statement.

The lexical analyzer (scanner) tokenizes all input entities and puts them in the symbol table used by the syntax analyzer (parser) to build a syntax tree based on the grammar.

This syntax tree is checked again by the semantics analyzer (not shown) and is also used by the code generator to produce the target code.

We can see that the data in memory are shared by all agents and that the agents don't pass on data to each other directly.
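The compiler example can be sketched in Java to show how the phases share data without calling each other. This is a toy sketch under stated assumptions: all class names are hypothetical, the "parser" only records int declarations, and the code generator only recognizes an assignment of the form variable = variable + constant.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Shared in-memory data store: every phase reads from and writes to it.
class CompilerRepository {
    final List<String> tokens = new ArrayList<>();
    final Map<String, String> symbolTable = new LinkedHashMap<>(); // name -> type
    final List<String> targetCode = new ArrayList<>();
}

// Phase 1: tokenizes the source text and stores the tokens.
class LexicalAnalyzer {
    private static final Pattern TOKEN = Pattern.compile("[A-Za-z_]\\w*|\\d+|[=+;,]");

    void run(String source, CompilerRepository repo) {
        Matcher m = TOKEN.matcher(source);
        while (m.find()) repo.tokens.add(m.group());
    }
}

// Phase 2 (stand-in for the parser): records declared int variables.
class DeclarationAnalyzer {
    void run(CompilerRepository repo) {
        List<String> t = repo.tokens;
        for (int i = 0; i < t.size(); i++) {
            if (!t.get(i).equals("int")) continue;
            for (i++; i < t.size() && !t.get(i).equals(";"); i++) {
                if (!t.get(i).equals(",")) repo.symbolTable.put(t.get(i), "int");
            }
        }
    }
}

// Phase 3: emits code for statements of the form "a = b + c ;".
class CodeGenerator {
    void run(CompilerRepository repo) {
        List<String> t = repo.tokens;
        for (int i = 0; i + 5 < t.size(); i++) {
            boolean assignment = t.get(i + 1).equals("=")
                    && t.get(i + 3).equals("+") && t.get(i + 5).equals(";");
            if (!assignment) continue;
            String target = t.get(i), operand = t.get(i + 2), constant = t.get(i + 4);
            // Both variables must already be in the shared symbol table.
            if (!repo.symbolTable.containsKey(target) || !repo.symbolTable.containsKey(operand)) {
                throw new IllegalStateException("undeclared variable in assignment");
            }
            repo.targetCode.add("mov ax, [" + operand + "]");
            repo.targetCode.add("add ax, " + constant);
            repo.targetCode.add("mov [" + target + "], ax");
        }
    }
}

class CompilerDemo {
    public static void main(String[] args) {
        CompilerRepository repo = new CompilerRepository();
        // The driver sequences the phases; they communicate only through repo.
        new LexicalAnalyzer().run("int x, y; x = y + 1;", repo);
        new DeclarationAnalyzer().run(repo);
        new CodeGenerator().run(repo);
        repo.targetCode.forEach(System.out::println);
    }
}
```

Unlike a batch-sequential design, no phase hands its output to the next one; each phase looks up what it needs in the shared repository.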

Variants of Data Repository

Virtual repository

Built on top of multiple physical repositories

Most databases allow users to create views, which are virtual repositories because they do not exist physically.

Benefits

Reduces the overall complexity of the database structure

Security management: limits the scope of data that different users can manipulate

Variants of Data Repository

Distributed repository system (distributed database system)

All data are distributed over sites linked by a network

Data are replicated in order to improve reliability and local accessibility

Other issues and concerns

Vertical or horizontal partitions

Synchronization of duplicated data

Cost of data transmission

Collaboration (two-phase transaction commitment)

Summary: Data Repository

Application domain

Suitable for large, complex information systems

Data transactions drive the control flow

Benefits

Data integrity: easy to back up and restore

System scalability and reusability of agents

Reduced overhead of transient data between software components

Cons

Data store reliability and availability are critical

High dependency between the data structure of the data store and its agents

Overhead cost of moving data over the network

Blackboard Architecture Style

Introduction

Blackboard system:

A common knowledge base, the "blackboard", is iteratively updated by a diverse group of knowledge sources,

starting with a problem specification and ending with a solution.

The first blackboard architecture was developed in the 1970s

Mainly used for

Speech recognition

Weather forecasting

Motivated by classroom teaching

Blackboard

Teacher and students (agents) solve problems together

Agents can work collaboratively or independently

Overview

Blackboard: variation of data-centric

Consists of three partitions

Blackboard: stores data (hypotheses and facts)

Knowledge source: stores domain-specific knowledge

Controller: initiates the blackboard and the knowledge sources

How does it work?

Data driven: a change in the data stored in the blackboard triggers one or more knowledge sources, which might lead to more changes

The connections between the blackboard subsystem and knowledge sources are basically implicit invocations from the blackboard to specific knowledge sources, which are registered with the blackboard in advance.

Data changes in the blackboard trigger one or more matched knowledge source to continue processing.

Data changes may be caused by newly deduced information or by hypotheses produced by some knowledge sources.

This connection can be implemented in publish/subscribe mode.

Blackboard Arch. in Box-Line

Properties

Many domain-specific knowledge sources collaborate to solve a complex problem, such as pattern recognition or authentication in information security

Each knowledge source is relatively independent

They don't need to interact with each other

Only interact and respond to the blackboard subsystem

Each source works on a specific aspect of the problem and contributes a partial solution to the ultimate solution

Blackboard Arch. in Box-Line

[Figure: box-line diagram with Agent 1, Agent 2, ..., Agent n connected to a central Blackboard; the legend distinguishes data links from flow logic control]

Also called subscriber mode

Note the direction of control flow

Class Diagram of Blackboard Arch.

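[Figure: class diagram]

Blackboard: attribute facts (private); operations inspect(), update() (public)

Control: attributes Blackboard, KS, controller (private); operation execute() (public)

Knowledge Source (KS1): attribute rules (private); operations matchInspect(), update() (public)

Association multiplicities shown in the figure: 1 to 1 and 1 to n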

A UML diagram for rule-based blackboard software

As we can see, one blackboard may have many knowledge sources associated with it, working on given data and deduced data available in the blackboard subsystem.

Each knowledge source helps to solve problems in its expertise area. Knowledge can be stored in different knowledge representation formats depending on the reasoning strategy.

For example, in a rule-based expert system, a knowledge source stores all related rules and provides activation mechanisms for the blackboard to trigger.

Of course, knowledge sources must register themselves with the blackboard so that if any change takes place in the blackboard, they will be notified to fire up actions in the corresponding knowledge sources, which can deduce new facts and update the blackboard.

Each individual knowledge source may have its own problem solving strategy and use its own knowledge expertise to contribute to a partial solution which will lead to a final solution.

The blackboard class holds the current data state, and the final problem solution will be placed in the blackboard for the controller to pick up and use to generate a final report.

The rule-based strategy is one of many reasoning algorithms in use today.

There are many other problem-solving strategies that can be applied, including fuzzy set theory, probability and statistics, neural networks, data mining, and heuristic search.

Example: KBS

Since the blackboard architecture is basically a self-activated system, the controller subsystem only acts at the beginning of the process to initiate the blackboard and all knowledge sources. It also periodically inspects the current state of the blackboard to decide whether to terminate processing because the solution is acceptable or optimal enough.

KBS: Knowledge Based System

Knowledge: represented as production rules

Condition

Action

Example Setting

Here is a set of rules:

R1: IF animal gives milk THEN animal is mammal

R2: IF animal eats meat THEN animal is carnivore

R3: IF animal is mammal AND animal is carnivore AND animal has tawny color AND animal has black stripes THEN animal is tiger

The set of facts is

F1: animal eats meat

F2: animal gives milk

F3: animal has black stripes

F4: animal has tawny color

GOAL: Which animal is it?

How to Solve Problems

Two major approaches

Forward Reasoning

Backward Reasoning

Both can use the blackboard architecture

Facts in blackboard

Rules in agents

Example: forward reasoning

F1 matches R2 and produces new fact F5: animal is carnivore

F2 matches R1 and produces new fact F6: animal is mammal

F5, F6, F3, F4 match R3 and produce the result: animal is tiger
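The forward reasoning steps can be sketched in Java as a loop that keeps firing rules until no new fact is produced. This is a minimal sketch: the Rule and ForwardReasoner classes are hypothetical names, facts are plain strings, and the blackboard is reduced to a set of facts.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// A production rule: if all conditions hold, the conclusion becomes a new fact.
class Rule {
    final String name;
    final Set<String> conditions;
    final String conclusion;

    Rule(String name, String conclusion, String... conditions) {
        this.name = name;
        this.conclusion = conclusion;
        this.conditions = new LinkedHashSet<>(Arrays.asList(conditions));
    }
}

class ForwardReasoner {
    // Fires every rule whose conditions are all on the blackboard,
    // and repeats until a full pass adds no new fact.
    static Set<String> run(Set<String> initialFacts, List<Rule> rules) {
        Set<String> blackboard = new LinkedHashSet<>(initialFacts);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Rule r : rules) {
                if (blackboard.containsAll(r.conditions) && blackboard.add(r.conclusion)) {
                    changed = true;
                }
            }
        }
        return blackboard;
    }
}

class ForwardReasoningDemo {
    public static void main(String[] args) {
        List<Rule> rules = Arrays.asList(
            new Rule("R1", "animal is mammal", "animal gives milk"),
            new Rule("R2", "animal is carnivore", "animal eats meat"),
            new Rule("R3", "animal is tiger", "animal is mammal", "animal is carnivore",
                     "animal has tawny color", "animal has black stripes"));
        Set<String> facts = new LinkedHashSet<>(Arrays.asList(
            "animal eats meat", "animal gives milk",
            "animal has black stripes", "animal has tawny color"));
        Set<String> result = ForwardReasoner.run(facts, rules);
        System.out.println(result.contains("animal is tiger")); // prints true
    }
}
```

Backward reasoning would start from the goal "animal is tiger" and search for rules whose conclusions match it; that variant is not shown here.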

Matching Process

[Figure: sequence diagram with participants Control, Knowledge Source i, and Blackboard]

Messages, in order: initiate(), initiate(), matchInspect(), update(), matchInspect(), update(), inspect()

Dynamic interaction in a Blackboard system

How is the subscribing implemented?

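[Figure: the Blackboard (Fact i) exposes addEventListener() and EventTrigger(), and keeps the registered knowledge sources in a Vector V with slots 0, 1, ..., n-1]

Each knowledge source, KS 0 through KS (n-1), has its own event sink, and its handleEvent method executes the corresponding action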

Publish/subscriber relationship between blackboard and knowledge sources
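The subscription mechanism can be sketched in Java as follows. This is a sketch under stated assumptions rather than the design from the figure: the KnowledgeSource interface, the method signatures, and the duplicate-fact check are illustrative, and an ArrayList is used where the figure shows a Vector.

```java
import java.util.ArrayList;
import java.util.List;

// Event sink of a knowledge source: called whenever a new fact is posted.
interface KnowledgeSource {
    void handleEvent(Blackboard board, String fact);
}

class Blackboard {
    private final List<String> facts = new ArrayList<>();
    private final List<KnowledgeSource> listeners = new ArrayList<>();

    // Knowledge sources register themselves in advance.
    void addEventListener(KnowledgeSource ks) {
        listeners.add(ks);
    }

    // Stores a new fact and notifies every registered knowledge source.
    // Facts already present are ignored, so chains of updates terminate.
    void update(String fact) {
        if (facts.contains(fact)) return;
        facts.add(fact);
        for (KnowledgeSource ks : new ArrayList<>(listeners)) {
            ks.handleEvent(this, fact);
        }
    }

    // Returns a copy of the current data state.
    List<String> inspect() {
        return new ArrayList<>(facts);
    }
}

class BlackboardDemo {
    public static void main(String[] args) {
        Blackboard board = new Blackboard();
        // KS 0: deduces a new fact and writes it back to the blackboard.
        board.addEventListener((b, fact) -> {
            if (fact.equals("animal gives milk")) b.update("animal is mammal");
        });
        // KS 1: only observes the changes.
        board.addEventListener((b, fact) -> System.out.println("noticed: " + fact));
        board.update("animal gives milk");
        System.out.println(board.inspect());
    }
}
```

Note the direction of control: the blackboard calls the knowledge sources, and a knowledge source never calls another knowledge source directly.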

Another Example: Travel Reservation

Travel Consulting System

Process

A client submits a request

The system stores all the data in the blackboard

The blackboard makes a request to the air agent

Once the air reservation data is returned and stored in the blackboard, the change triggers the hotel, auto rental, and attraction agents to build travel plans within budget and time constraints

The client chooses one of the plans

The system triggers the billing process


Example: Travel Reservation

[Figure: Travel Reservation System. Components shown: client, control, air agent, hotel agent, auto agent, attraction agent, billing agent, and db]

Why not repository arch. for travel reservation?

Observation:

User requests come in a relatively fixed format

It is hard to integrate all knowledge (business logic) in a central store. Why?

Hotel discount rules change every day!

Many specialized hotel reservation services are available

Best solution: publish the contract data and let all of them bid.

Summary: Blackboard

Application domain

Suitable for solving immature and complex AI problems

The problem spans multiple disciplines, each of which has completely different knowledge expertise

An optimal, partial, or approximate solution is acceptable

Exhaustive search is impossible

Pros

Scalability: easy to add new knowledge sources

Concurrency: all knowledge sources can work in parallel

Reusability of knowledge source agents

Cons

Tight dependency between the blackboard and the knowledge sources

Synchronization of multiple agents is an issue

Debugging and testing of the system is a challenge
