Intelligent support for specifications transformation

S Y S T E M S


Jeffrey J.-P. Tsai, University of Illinois at Chicago
Joel C. Ridge, Motorola Communications

Because software development is knowledge-intensive, it makes sense for developers to use expert systems. The prototype specification-transformation system described here proves this.

During the past decade, many software tools and design techniques have been developed to help the process of software development.[1] These tools, while allowing more efficient use of computers, have been mostly limited to solving problems with algorithmic solutions.[2] The software-design process, which requires many heuristic guidelines, cannot be completely solved with this approach.

Because software development is a knowledge-intensive activity, the use of a knowledge-based software assistant should provide a programming methodology better suited to solving related problems.[3] This technique also allows incremental accumulation and timely updating of design expertise. It should result in a sizable reduction of costs for designs of succeeding software systems.

We have constructed such an expert system, the Specification-Transformation Expert System (STES), to automatically translate requirements specifications into design specifications during the development phase of the software life cycle.

Ridge was at the University of Illinois at Chicago when the work reported here was conducted.

0704-7459/88/0028/$01.00 © 1988 IEEE

STES accepts as input a software-requirements specification expressed in terms of dataflow diagrams. Dataflow diagrams have been widely used in several requirements techniques and tools, such as structured system analysis,[4] the Structured Analysis and Design Technique,[5] and the Requirements Statement Language and Requirements Engineering Validation System.[6] We represent the software-design specification with structure charts,[7,8] which represent the architectural design of the software system being designed.

Using rules that embody the structured-design methodology originated by Yourdon and Constantine, STES translates this specification into a template describing a structure chart.

IEEE Software

STES essentially consists of a knowledge base and an inference engine. The knowledge base contains information on the structured-design methodology and heuristic guidelines to help determine when certain methods should be applied. This knowledge is represented with a combination of production rules and compound data types. Given a target software system’s requirements specification, STES’s inference engine can perform intelligent decision making and determine a suitable architectural design specification for the software system being designed.

We originally implemented STES in OPS5 on a VAX 11/780 computer. We have since ported it to an Apollo DN3000 workstation and integrated it with a commercial CASE tool from Cadre Technologies. The new version of STES can incrementally accumulate design experience and facilitate software development.

Structured design

The structured-design methodology (also called “composite design” and “transform-centered design”) is designed to make up for the weakness in the functional-decomposition approach, which fails to tell if the decomposition of a function is good or bad. However, functional decomposition, which uses a divide-and-conquer strategy to decompose a complex software system into a set of simple software components to simplify the design process, remains an important technique for software development.

Although the structured-design approach has been available since the late 1970s, it has not been widely used because a relatively simple change in a functional requirement could require the manual updating of several hand-drawn diagrams. Even if the method appeared to offer many benefits, it was often too much work to use.

This is no longer true. The average software developer can now use inexpensive workstations running powerful CASE tools, removing the updating difficulties. The methodology is now used widely because it helps develop software systems that are reliable, modifiable, and efficient.

November 1988

Because of the benefits of the structured-design methodology and the availability of sufficient tools to support it, we chose to embed its design concept in STES.

Designing a software system with the structured-design approach involves three phases.

First, structured analysis is used to determine the “what” aspects of the problem. Creating a suitable specification of the problem is an iterative process that involves creating dataflow diagrams, conducting review sessions, and revising the dataflow diagrams based on feedback from the reviews.

Second, once a suitable specification has been obtained, the dataflow diagrams are transformed into structure charts. Structure charts let you specify the design’s “how” aspects.

Third, once the structure charts are available, they are used as a template for partitioning the system being designed into modules implemented with a standard language like Pascal, Cobol, or C.

The STES system

The STES system is divided into three components: data storage, knowledge base, and inference engine.

The data-storage component serves as a global database of symbols representing facts and assertions about the problem. In an OPS5-based expert system,[9] the data storage is called working memory and holds the knowledge that the entire system can access. Each unit of working memory, called a working memory element, consists of several attribute-value pairs. Any attribute not specifically assigned a value in working memory for a particular instance is given a default value of nil.
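As a rough illustration of the attribute-value idea (not the OPS5 implementation itself), a working memory element can be sketched in Python as a class name plus a mapping in which unassigned attributes read as nil. The names `WorkingMemoryElement` and `get`, the example attribute values, and the use of `None` to stand in for nil are all assumptions made for this sketch.

```python
class WorkingMemoryElement:
    """One unit of working memory: a class name plus attribute-value pairs."""

    def __init__(self, class_name, declared_attributes, **values):
        unknown = set(values) - set(declared_attributes)
        if unknown:
            raise ValueError(f"undeclared attributes: {sorted(unknown)}")
        self.class_name = class_name
        # Every declared attribute starts as None (standing in for nil).
        self.values = {name: None for name in declared_attributes}
        self.values.update(values)

    def get(self, attribute):
        return self.values[attribute]


# Hypothetical element with three declared attributes, one left unassigned.
wme = WorkingMemoryElement("bubble", ["name", "input", "output"],
                           name="validate", input="raw-record")
print(wme.get("name"), wme.get("output"))
```

Declaring the attribute list up front loosely mirrors the way an OPS5 program declares an element class before using it; the unassigned `output` attribute reads back as `None`.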

The knowledge base includes a set of actual production rules. Each rule has a condition part that describes the data configuration for which the rule is appropriate. Each rule also has an action part that gives instructions for changing the data configuration. Two parts make up a rule in OPS5: the left side and right side. The left side is a sequence of one or more condition elements; the right side is a sequence of actions. Each condition element specifies a pattern to be matched against working memory. The actions are imperative statements that are executed in sequence if the rule fires.

The inference engine is an executor: It determines which rules are relevant to a data memory configuration and chooses one to apply. The inference engine’s reasoning mechanism has three phases:

Match, which checks the left side of every rule against the facts in working memory to see which, if any, are satisfied.

Conflict resolution, which decides which rule will fire next if more than one rule is eligible. This decision is based on the simple concept that the production rule whose left-side elements were most recently placed into working memory will fire next. This means that, once the machine begins a subtask, it will not be distracted by earlier subtasks.

Act, which executes the right side of the production rule selected. This step may change the information in the working memory or in the knowledge base.
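The three phases can be sketched as a small loop. This is a simplified Python illustration under stated assumptions, not the OPS5 interpreter: facts are plain strings, each fact carries a time tag, a rule is a hypothetical `(name, condition_facts, facts_to_add)` tuple, and a rule stops being eligible once it has nothing new to add. Only the recency preference described above is modeled.

```python
def run(rules, working_memory, max_cycles=100):
    """Repeat match, conflict resolution, and act until no rule is satisfied.

    working_memory maps each fact to the time tag at which it was added.
    Each rule is a tuple (name, condition_facts, facts_to_add).
    """
    clock = max(working_memory.values(), default=0)
    fired = []
    for _ in range(max_cycles):
        # Match: a rule is satisfied when all its condition facts are present
        # and it would still add something new (so it cannot fire forever).
        candidates = [
            rule for rule in rules
            if all(fact in working_memory for fact in rule[1])
            and any(fact not in working_memory for fact in rule[2])
        ]
        if not candidates:
            break
        # Conflict resolution: prefer the rule matching the most recent fact.
        chosen = max(
            candidates,
            key=lambda rule: max((working_memory[f] for f in rule[1]), default=0),
        )
        # Act: execute the right side, here by adding new facts.
        for fact in chosen[2]:
            if fact not in working_memory:
                clock += 1
                working_memory[fact] = clock
        fired.append(chosen[0])
    return fired


# Hypothetical rules: two subtasks hang off one goal.
rules = [
    ("start-A", ["goal"], ["subtask-A"]),
    ("start-B", ["goal"], ["subtask-B"]),
    ("finish-A", ["subtask-A"], ["A-done"]),
]
fired = run(rules, {"goal": 1})
print(fired)
```

In this run, once `start-A` has fired, `finish-A` is chosen ahead of `start-B` because it matches the newest fact, which is the "not distracted by earlier subtasks" behavior the text describes.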

Supported heuristics. STES supports several textbook heuristics, including coupling, cohesion, fan-in, and span of control (fan-out).

Page-Jones[8] defines coupling as a method to partition modules so they are as independent as possible. Associated with a dataflow-diagram specification is an analysis tool that records the information content of all dataflows. This tool, called a data dictionary, can be applied so sufficient information is available to determine the degree of coupling between the modules in a structure chart.

Cohesion is a way to measure the degree of functional relation of activities in a module. Determining the amount of cohesion in a module is not easy for an expert system. A person can use a simple decision tree to determine cohesion with little effort. But until a computer can understand natural language rather than merely use recognition, the system must query the user for the information required to traverse the decision tree. A good design usually requires a strong cohesion for each software component.

A software system usually consists of a set of software components, and the relationship of calls among those components can be organized into a hierarchy. Fan-in represents the number of immediate superordinates of a component in the system’s hierarchy. (Immediate superordinates are components that directly call the current component.) In the design process, you must maximize fan-in. You do this by reusing software components instead of creating a new component with the same function.

Figure 1. The phases required to transform a dataflow-diagram requirements specification into a structure-chart design specification. (Diagram labels: extracting features, factoring, refining structure charts, representing structure charts, inference engine, knowledge base.)

The span of control of a software component is the number of immediate subordinates of the component. (Immediate subordinates are the components that your current component calls.) In terms of sizes of software components, very high or very low spans are possible indicators of poor design. A low span of control can be increased either by breaking the components into additional subordinate subcomponents or by compressing the component into its superordinate. A high span of control can be decreased by creating intermediate components.
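To make the two counts concrete, the following Python sketch derives fan-in and span of control from a list of direct calls. It is an illustration only: the function name `fan_in_and_span`, the `(caller, callee)` pair representation, and the component names are assumptions, not part of STES.

```python
from collections import defaultdict


def fan_in_and_span(calls):
    """calls: iterable of (caller, callee) pairs describing direct calls."""
    superordinates = defaultdict(set)  # callee -> components that call it
    subordinates = defaultdict(set)    # caller -> components it calls
    for caller, callee in calls:
        superordinates[callee].add(caller)
        subordinates[caller].add(callee)
    components = set(superordinates) | set(subordinates)
    return {
        name: {
            "fan_in": len(superordinates.get(name, ())),
            "span": len(subordinates.get(name, ())),
        }
        for name in sorted(components)
    }


# Hypothetical call hierarchy in which get_record is reused by two callers.
calls = [
    ("main", "get_record"),
    ("main", "update"),
    ("main", "put_record"),
    ("update", "get_record"),
]
metrics = fan_in_and_span(calls)
print(metrics["get_record"], metrics["main"])
```

Here the reused component `get_record` has a fan-in of 2, while `main` has a span of control of 3.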

Transformation procedure. STES addresses the second design step in the structured-design approach: the transformation of dataflow diagrams into structure charts. The procedure to transform structured analysis to structured design is:

1. Identify the flow of data in the problem domain and construct an accurate specification using dataflow diagrams.

2. Identify the afferent, efferent, and transform-centered components.

According to Yourdon’s definition,[7] afferent data components are those high-level components of data furthest removed from physical input that may still be considered as inputs of the system. By the same token, efferent data components are those high-level components of data furthest removed from physical outputs that may still constitute outputs of the system. The rest of the dataflow diagram contains the system’s transform-centered components. The transform centers contain the system’s essential functions, which are independent of any constraints imposed by a particular implementation.

3. Factor the afferent, efferent, and transform-centered branches to form a hierarchical program structure. This first-cut structure chart specifies a good design for the resulting system.

4. Refine and optimize the structure chart to improve the design generated in step 3. This step transforms a good design into an excellent design and relies on both textbook knowledge and the expertise gained through years of experience.

Figure 1 shows the phases required in STES to transform a dataflow-diagram requirements specification into a structure-chart design specification: extracting features, factoring, refining structure charts, and representing structure charts.

In the feature-extraction phase, STES analyzes the behavior in a dataflow diagram, extracts all salient features, and converts them into production rules. These rules, which contain information about the data paths in the system being designed and the degree of difference between input and output dataflows, let the other components of STES use the information extracted from the dataflow diagram.

In the factoring phase, STES applies inference to identify and transform the efferent, afferent, and transform-centered components of the dataflow diagram into a first-cut structure chart. The result of this step is a structure chart specifying a balanced system,[8] which is a good design that still has room for improvement.

In the structure-chart refining phase, several refinement criteria (such as coupling, cohesion, fan-in, and fan-out) are provided as feedback to the dataflow-diagram designer. The designer can trim the dataflow diagram and modify the input specification accordingly.

In the structure-chart representation phase, the system generates a template describing the structure chart and the criteria measurements from the internal OPS5 representations produced in the structure-chart refining phase. The resulting template is translated into a structure-chart diagram using the following procedure:

Draw a box that is the root of the structure chart and assign this module the project name.

Find structures that are the children of the root, place them under the root box, draw line segments connecting the root and children, place input and output going up or flowing down along the line segment, and put the verb and name in the box as the process name.

Choose a box found in the previous step, use it as a root, and apply the same procedure as in the previous step.

Repeat the previous step until all the bubbles in the dataflow diagram are converted.
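The drawing procedure above is a recursive walk over parent-child links. The Python sketch below renders such a template as indented text rather than boxes and line segments, and it omits the input and output labels. The dictionary keys `name`, `childof`, and `verb` echo attribute names of the STES structure data type, but the function `draw_chart`, the record layout, and the example modules are assumptions for illustration.

```python
def draw_chart(boxes, root_name, indent=0, lines=None):
    """Return the chart as indented text lines, one box per line.

    boxes: list of dicts with 'name', 'childof', and an optional 'verb'.
    """
    if lines is None:
        lines = []
    box = next(b for b in boxes if b["name"] == root_name)
    # Put the verb (if any) and the name in the box as the process name.
    label = f"{box['verb']} {box['name']}" if box.get("verb") else box["name"]
    lines.append("  " * indent + label)
    # Treat each child as the root of its own subtree and repeat.
    for child in boxes:
        if child["childof"] == root_name:
            draw_chart(boxes, child["name"], indent + 1, lines)
    return lines


# Hypothetical template for a small update program.
boxes = [
    {"name": "update-master", "childof": None},
    {"name": "valid-transaction", "childof": "update-master", "verb": "get"},
    {"name": "transaction", "childof": "valid-transaction", "verb": "get"},
    {"name": "new-master", "childof": "update-master", "verb": "put"},
]
chart = draw_chart(boxes, "update-master")
print("\n".join(chart))
```

Each level of indentation corresponds to one application of the "choose a box and use it as a root" step.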

Knowledge-base representations

To facilitate the reasoning process, the essence of the dataflow diagram must first be represented in a form that STES can use. A dataflow diagram is made up of four basic elements:

dataflows, symbolized by named vectors,

processes, symbolized by bubbles,

data stores (files), symbolized by a pair of straight lines, and

data sources and sinks, symbolized by boxes.

Figure 2 shows the four basic elements of a dataflow diagram.

Dataflow structure. In an OPS5-based expert system such as ours, the pertinent features of the diagram are depicted by creating a data structure, bubble, for each dataflow through a given process bubble. Each dataflow can be portrayed as follows:

(literalize bubble name input inputfrom output outputto difference used num)

This structure contains information about the name, the input and where it is from, the output and where it goes, the process number, and a flag that shows if the process has been used in the structure chart. It also contains an attribute called Difference, which shows the difference between the input and the output. An OPS5-based expert system creates a word table to record the degree (such as “same,” “similar,” or “different”) of the differences among different terminologies for a certain application domain. An OPS5-based expert system uses this attribute to locate the afferent, efferent, and transform-centered components of the dataflow diagram.
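One way to picture the word table and the Difference attribute is as a symmetric lookup keyed by pairs of dataflow names. The Python sketch below is an assumption-laden illustration, not the STES representation: the function `degree_of_difference`, the table contents, the dataflow names, and the choice to treat unlisted pairs as “different” are all hypothetical.

```python
def degree_of_difference(word_table, first, second):
    """Look up how much two dataflow names differ in meaning."""
    if first == second:
        return "same"
    # The table is symmetric, so each pair is stored once as a frozenset.
    # Assumption: a pair missing from the table counts as "different".
    return word_table.get(frozenset((first, second)), "different")


# Hypothetical word table for one application domain.
word_table = {
    frozenset(("raw-record", "valid-record")): "similar",
    frozenset(("valid-record", "updated-master")): "different",
}

# A bubble record with a subset of the attributes named in the literalize form.
bubble = {"name": "validate", "input": "raw-record", "output": "valid-record"}
bubble["difference"] = degree_of_difference(
    word_table, bubble["input"], bubble["output"])
print(bubble["difference"])
```

With a table like this, filling in the Difference attribute becomes a lookup per bubble, which is what lets the later rules classify components without understanding the names themselves.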

Structure-chart elements. Also required is a data type to represent the end product: the structure charts. A structure chart consists of two elements:

boxes, which represent the modules of the software system, and

couples (arrows), which represent the control and dataflow among the software modules.

Figure 2. Basic structure of a dataflow diagram.

Figure 3. Basic structure of four kinds of structure charts: (a) afferent flow, (b) efferent flow, (c) transform flow, and (d) coordinate flow.

Using these two basic elements, an OPS5 expert system can define four kinds of flow structures (afferent, efferent, transform, and coordinate flows) and specify a structure chart of the transformed system. Figure 3 shows the structure of four kinds of structure charts. In STES, a box in the structure chart is represented as

(literalize structure name childof upinput upoutput downinput downoutput repetition-bit fan-in fan-in-control-bit fan-out fan-out-control-bit verb num)

This data type contains information about the module’s name, parent, and dataflow to and from above and below. The attributes fan-in and fan-out serve as refinement criteria. Fan-in-control-bit and fan-out-control-bit are flags to control fan-in and fan-out, respectively. The attribute Verb identifies dataflow through the individual structure name. There are two verbs used: Get for afferents and Put for efferents; no verb is used for transformations.

(a)
RULE AFFERENT
IF    for the process X and the process Y it outputs data to,
      the input of X is similar to its output AND
      the input of Y is different from its output
THEN  we know X is an AFFERENT process

(b)
RULE EFFERENT
IF    for the process X and the process Y it receives data from,
      the input of X is similar to its output AND
      the input of Y is different from its output
THEN  we know X is an EFFERENT process

(c)
RULE TRANSFORM
IF    the input of the process X is different from its output
THEN  we know X is a TRANSFORM process

Figure 4. Production rules to determine (a) afferent, (b) efferent, and (c) transform components.

(a)
RULE REPETITION-INITIATION
IF    there exists a process AA derived from DFD whose redundancy bit has not been checked AND
      there is nothing in the redundancy list named by AA
THEN  set redundancy bit of AA AND
      create a new name AA in the redundancy list AND
      set AA's value to zero

RULE COUNT-REPETITION
IF    there exists a process AA derived from DFD whose redundancy bit has not been checked AND
      there exists a name AA in the redundancy list
THEN  set redundancy bit of AA AND
      add one to AA's value

(b)
RULE FAN-IN-INITIALIZATION
IF    there exists a process AA derived from DFD whose fan-in bit has not been checked AND
      there is no name AA in fan-in list
THEN  set fan-in bit of AA AND
      create a new name AA in fan-in list AND
      set AA's value to zero

RULE FAN-IN-COUNT
IF    there exists a process AA derived from DFD whose fan-in bit has not been checked AND
      the name AA is in fan-in list
THEN  set fan-in bit of AA AND
      add one to AA's value

RULE WRITE-FAN-IN
IF    all processes' fan-in bits have been checked AND
      process AA's fan-in has not been determined AND
      the value of AA in fan-in list is X AND
      the value of AA in redundancy list is Y
THEN  the fan-in of AA is X minus Y

Figure 5. Production rules to (a) check redundancy and (b) count fan-in.

Stack and control. In addition to the representation of dataflow diagrams and structure charts, an OPS5-based expert system's knowledge base uses a stack and several control conditions.

An OPS5-based expert system uses a data type, root, to keep track of which part of the structure chart is being transformed. It uses the stack to record this root information when another part of the dataflow diagram is being factored. Several operations (such as push, pop, and update) that use the data type stack-operation are associated with the stack.

An OPS5-based expert system represents the condition for controlling rules with an attribute called Status. When a production rule sets Status to lazy, the inference engine is notified that it should concentrate on the stack designated via the latest stack-operation. When Status is assigned as active, the inference engine ignores all stack operations.

Production rules. Figure 4 shows a set of rules in the heuristic design process to determine the afferent component, the efferent component, and the transform component. Figure 5 shows the rules to check redundancy and count fan-in.

Inference mechanism

The first step in the structured-design methodology is to identify the highest level afferent and efferent data components. People recognize these components based on the difference in meaning between their names. Expert systems must try to approximate this ability.

To simulate this process, you could impose the restriction that, for a particular problem domain, names from a standard set are used when naming the data components. Your expert system could then create a word table to record the degree of difference between these names.

Another approach is to let the software designers responsible for generating the dataflow diagrams specify the transform center when constructing the diagrams. Because these designers are close to the problem domain, they can categorize this area on the dataflow diagrams with little extra effort. The information is available at this time and should be captured for later use. Specifying the transform center, assigning all associated flows a difference of “different,” and giving remaining flows a difference of “similar” is sufficient for later identifying the type of dataflow. This is the approach that we used in the latest version of STES.

Figure 6. Example dataflow diagram of a simple master-file update program.

Figure 7. Interface of the CASE tool used by the expert system.

After STES has determined the difference between flows, it can find the types. First, to identify afferent components, it traces each data component along the input data path until it finds a component AC with a different meaning from its immediate predecessor. All the data components before AC along the input data path are then identified as afferent components. STES uses a similar approach to locate the efferent components. The rest of the components in the dataflow diagram are identified as transform-centered components.
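For a single linear data path, this tracing can be sketched in a few lines of Python. The sketch assumes each process already carries a difference of “similar” or “different” and that the path is a simple chain from input to output; real dataflow diagrams branch, and STES performs the classification with OPS5 production rules rather than a function like the hypothetical `classify` below. The process names are invented.

```python
def classify(path):
    """path: list of (process_name, difference) ordered from input to output."""
    flags = [difference == "different" for _, difference in path]
    if not any(flags):
        # No transform center is marked, so the trace has nothing to find.
        return {name: "unknown" for name, _ in path}
    first = flags.index(True)                         # first differing process
    last = len(flags) - 1 - flags[::-1].index(True)   # last differing process
    kinds = {}
    for position, (name, _) in enumerate(path):
        if position < first:
            kinds[name] = "afferent"      # before the first change in meaning
        elif position > last:
            kinds[name] = "efferent"      # after the last change in meaning
        else:
            kinds[name] = "transform"     # everything in between
    return kinds


# Hypothetical path with a two-process transform center.
path = [
    ("read", "similar"),
    ("validate", "similar"),
    ("match", "different"),
    ("update", "different"),
    ("format", "similar"),
    ("write", "similar"),
]
kinds = classify(path)
print(kinds)
```

The leading run of “similar” processes becomes the afferent branch, the trailing run becomes the efferent branch, and whatever lies between is treated as the transform center.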

Having factored out these first-level afferent, efferent, and transform-centered components, STES produces the structure chart by representing each component as a single box. The box at the root of the structure chart is denoted as a control component. This factoring process is then repeated for the next level of data components in the structure chart until all the bubbles in the dataflow diagram are represented.

After all the bubbles have been transformed into the structure chart, STES determines which areas of the design need further refinement. With feedback on criteria such as coupling, cohesion, fan-in, and span of control, you can trim your dataflow diagram and modify the input specification accordingly. By cycling through user dataflow diagramming, system feedback, user-modified dataflow diagramming, and system feedback, you specify your software system through stepwise refinement.

Example

The example dataflow diagram in Figure 6 models a simple master-file update program. We generated the diagram in this figure with a structured-analysis tool, one of Cadre Technologies’ Teamwork tools, that runs on an Apollo workstation. Figure 7 shows the tool’s user interface. Using a program written in C++ that used Cadre’s Access tool, we converted the diagram generated by Teamwork into an OPS5 production rule for input into STES. Access gives a conventional programming language direct access to the Teamwork tools’ internal representation of dataflow diagrams and structure charts.

Based on the dataflow-diagram designer’s knowledge, we gave the process bubbles associated with the transform center the Difference attribute value of “different.” We chose processes 6 and 7 as the transform center for the diagram in Figure 6. We then loaded the resulting file, which contains a production rule named Init, into STES to provide the information extracted from the dataflow diagram.

Now that the dataflow diagram was represented, STES could use its knowledge of the structured-design methodology to produce the structure chart shown in Figure 8.

We believe a system using our approach can provide a framework for the incremental accumulation of design experience and automate part of software development.

One area that needs improvement is the efficient use of the information in the requirements specification. We must begin using other components of the specification, such as the data dictionary, to determine the weak spots in the latest refinement of the design. We believe that the structured-analysis methodology will require extensions such as control-flow diagrams to be useful in the design of real-time systems.

In its current state, STES is useful as a prototype demonstrating the feasibility of our approach. It could be used as a training tool for novice software engineers learning structured design or perhaps to provide a software engineer a rough guess about the final structure of a system being designed.

As the production rules are updated to incorporate new design heuristics, a tool like ours can evolve and provide true expertise in the translation of requirements specifications into design specifications.

Acknowledgments

We thank the other members of our research team: Alan Liu and Kavin Hsueh. We also thank Murat Tanik and the anonymous referees for their comments on earlier versions of this article. The effort of those involved in reviewing the article, including Glenn Kacmr and Dean Vogler, is also appreciated.

This work is supported in part by the National Science Foundation under grant CCR-8809381 and by the research board of the University of Illinois at Chicago.

References

1. S.S. Yau and J.J.-P. Tsai, “A Survey of Software Design Techniques,” IEEE Trans. Software Eng., June 1986, pp. 713-721.

2. D. Sriram and M.D. Rychener, “Expert Systems for Engineering Applications,” IEEE Software, March 1986, pp. 3-5.

3. R. Balzer, T. Cheatham, and C. Green, “Software Technology in the 1990s: Using a New Paradigm,” Computer, Nov. 1983, pp. 39-45.

4. T. DeMarco, Structured Analysis and System Specification, Yourdon Press, New York, 1978.

5. D. Ross, “Structured Analysis: A Language for Communicating Ideas,” IEEE Trans. Software Eng., Jan. 1977, pp. 6-15.

6. M. Alford, “A Requirements Engineering Methodology for Real-Time Processing Requirements,” IEEE Trans. Software Eng., Jan. 1977, pp. 60-69.

7. E. Yourdon and L. Constantine, Structured Design: Fundamentals of a Discipline of Computer Program and Systems Design, Prentice-Hall, Englewood Cliffs, N.J., 1979.

8. M. Page-Jones, The Practical Guide to Structured Systems Design, Yourdon Press, New York, 1980.

9. C.L. Forgy, “OPS5 User’s Manual,” Tech. Report CMU-CS-81-135, Computer Science Dept., Carnegie Mellon Univ., Pittsburgh, July 1981.

Jeffrey J.-P. Tsai is an assistant professor of computer science at the University of Illinois at Chicago. Previously, he worked as a systems engineer for Digital Equipment Corp. His research interests include knowledge-based software systems, distributed software systems, real-time software testing, and debugging techniques.

Tsai received a PhD in computer science from Northwestern University. He has been a member of the program committee of the IEEE International Computer Software and Applications Conference since 1987. He is also on the editorial review board of the computer-science book series published by Slawson Communication. He is a member of AAAI, ACM, IEEE, and Upsilon Pi Epsilon.

Joel C. Ridge is a senior software engineer at Motorola Communications. His research interests include object-oriented design, programming methodologies, and artificial intelligence.

Ridge received a BS in electrical and computer engineering from the University of Cincinnati and an MS in electrical engineering and computer science from the University of Illinois at Chicago. He is a member of the IEEE Computer Society.

Address questions about this article to Tsai at EECS Dept., University of Illinois, Chicago, IL 60680.
