BIT 3105-LECTURE 4
Software Re-engineeringRef: Somerville 6th Ed Chapter 28
1
So what is S/W Re-engineering?
• We have so far seen that s/w re-engineering is the process of restructuring and reorganizing the software to facilitate future changes without adding new functionality.
• It is applicable where some but not all sub-systems of a larger system require frequent maintenance
• Re-engineering systems involves adding effort to make them easier to maintain. The system may be re-structured and re-documented
2
Advantages of S/w Re-engineering
• S/w re-engineering has two advantages over more radical approaches to systems evolution
– Reduced risk. There is a high risk in new software development. There may be development problems, staffing problems and specification problems. Planning is difficult and delays are expensive
– Reduced cost. The cost of reengineering is often significantly less than the costs of developing new software.
3
When Should Re-engineering be Done?
• When system changes are mostly confined to part of the system then re-engineer that part
• When hardware or software support becomes obsolete
• When tools to support re-structuring are available
4
S/W Re-engineering Vs Conventional Development
• Conventional development starts with requirements, and proceeds through the other usual phases.
• In re-engineering the old system acts as a specification for the new system.
5
Factors affecting s/w re-engineering costs?
• The cost of re-engineering a s/w can be affected by;– The quality of the software to be re-engineered
– The tool support available for re-engineering
– The extent of data conversion required
– The availability of expert staff• This can be a problem with old systems based on technology
that is no longer widely used
6
So what is involved in S/W Re-engineering?
• S/w re-engineering may involve the following;
– Source code translation
• Converting the code to a new language
– Reverse engineering
• Analyzing the program to understand it
– Program structure improvement
• Restructuring the program automatically for understandability
– Program modularization
• Re-organizing the program structure
– Data re-engineering
• Cleanup and restructure the system data
Note: We cover these in detail in the next slides 7
The S/w Re-engineering Process
Reverseengineering
Programdocumentation
Datareengineering
Original data
Programstructure
improvement
Programmodularisation
Structuredprogram
Reengineereddata
ModularisedprogramOriginal
program
Source codetranslation
8
1. Source code Translation
• Involves converting the code from one language (or language version) to another e.g. FORTRAN to C
• May be necessary because of:
– Hardware platform update
– Staff skill shortages
– Organisational policy changes
• Only realistic if an automatic translator is available
9
The Program Translation Process
10
Automaticallytranslate code
Design translatorinstructions
Identify sourcecode differences
Manuallytranslate code
System to bere-engineered
System to bere-engineered
Re-engineeredsystem
2. Reverse engineering
• This involves analysing a software in order to understand its design and specification
• It may be part of a re-engineering process but may also be used to re-specify a system for re-implementation
• It builds a program database and generates information from this database
• Program understanding tools (like browsers, cross-reference generators, etc.) may be used in this process
• Reverse engineering often precedes re-engineering but is sometimes worthwhile in its own right
– The design and specification of a system may be reverse engineered so that they can be an input to the requirements specification process for the system’s replacement
– The design and specification may be reverse engineered to support program maintenance
11
The Reverse Engineering Process
12
Data stucturediagrams
Program stucturediagrams
Traceabilitymatrices
Documentgeneration
Systeminformation
store
Automatedanalysis
Manualannotation
System to bere-engineered
Note: A traceability matrix is a document, usually in the form of a table, that correlates any two base-lined documents that require a many to many relationship to determine the completeness of the relationship. It is often used with high-level requirements (these often consist of marketing requirements) and detailed requirements of the software product to the matching parts of high-level design, detailed design, test plan, and test cases.
3. Program Structure Improvement
• Maintenance tends to corrupt the structure of a program. It becomes harder and harder to understand
• The program may be automatically restructured to remove unconditional branches
• Conditions may be simplified to make them more readable
• See example on next slide 13
Example: Code Simplification
• Suppose during maintenance, a certain condition was changed to this one below;
if not (A > B and (C < D or not ( E > F) ) )...
• During program structure improvement, the above complex condition could be simplified as shown below;
if (A <= B and (C>= D or E > F))...
14
Automatic Program Restructuring
15
Graphrepresentation
Programgenerator
Restructuredprogram
Analyser andgraph builder
Program to berestructured
Problems with Program Structure Improvement
• Problems with re-structuring are:
– Loss of comments
– Loss of documentation
– Heavy computational demands
• Restructuring doesn’t help with poor modularisation where related components are dispersed throughout the code
• The understandability of data-driven programs may not be improved by re-structuring
16
4. Program Modularization
• This is the process of re-organising a program so that related program parts are collected together in a single module
• Usually a manual process that is carried out by program inspection and re-organisation
• To modularize a program, you have to identify relationships between components and work out what these components do. Browsing and visualization tools help but it is impossible to automate this process completely.
17
Module Types
• Data abstractions
– Abstract data types where data structures and associated operations are grouped
• Hardware modules
– All functions required to interface with a hardware unit. They are closely related to data abstractions and collect together all of the functions which are used to control a particular hardware device.
• Functional modules
– Modules containing functions that carry out closely related tasks
• Process support modules
– Modules where the functions support a business process or process fragment
• Read pages 632-633 of Somerville 6th Edition for more on this.18
Recovering Data Abstractions
• Many legacy systems use shared tables and global data to save memory space
• Causes problems because changes have a wide impact in the system
• Shared global data may be converted to objects or ADTs
– Analyse common data areas to identify logical abstractions. It will often be the case that several abstractions are combined in a single shared data area.
– Create an ADT or object for these abstractions. If the programming language does not have data hiding facilities, simulate an abstract data type by providing functions to update and access all fields of the data.
– Use a program browsing system or a cross-reference generator to find all references to data. Replace these with calls to the appropriate functions.
19
Data Abstraction Recovery
• Analyse common data areas to identify logical abstractions
• Create an abstract data type or object class for each of these abstractions. That ADT will help to represent a collection of those common data areas.
• Provide functions to access and update each field of the data abstraction
• Use a program browser to find calls to these data abstractions and replace these with the new defined functions
20
5. Data Re-engineering
• This involves analysing and reorganising the data structures (and sometimes the data values) in a program
• May be part of the process of migrating from a file-based system to a DBMS-based system or changing from one DBMS to another
• Data re-engineering may not be necessary if the functionality of the system is unchanged
• The objective is to create a managed data environment
• Read pages 634-638 of Somerville 6th Edition for more on data re-engineering.
21
Data Re-engineering Approaches
22
Data Problems
• End-users want data on their desktop machines rather than in a file system. They need to be able to download this data from a DBMS
• Systems may have to process much more data than was originally intended by their designers
• Redundant data may be stored in different formats in different places in the system
23
Example of re-organizing data
24
Databasemanagement
system
Logical andphysical
data models
describes
File 1 File 2 File 3 File 4 File 5 File 6
Program 2 Program 3
Program 4 Program 5 Program 6 Program 7
Program 1
Program 3 Program 4 Program 5 Program 6
Program 2 Program 7
Program 1
Becomes
Examples of Data Problems
• Data naming problems
– Names may be hard to understand. The same data may have different names in different programs
• Field length problems
– The same item may be assigned different lengths in different programs
• Record organisation problems
– Records representing the same entity may be organised differently in different programs
• Hard-coded literals
• No data dictionary25
Data Conversion
• Data re-engineering may involve changing the data structure organisation without changing the data values
• Data value conversion is very expensive. Special-purpose programs have to be written to carry out the conversion
26
The Data Re-engineering Process
27
Entity namemodification
Literalreplacement
Data definitionre-ordering
Datare-formattingDefault valueconversion
Validation rulemodification
Dataanalysis
Dataconversion
Dataanalysis
Modifieddata
Program to be re-engineered
Change summary tables
Stage 1 Stage 2 Stage 3
Summary
• The objective of re-engineering is to improve the system structure to make it easier to understand and maintain
• The re-engineering process involves source code translation, reverse engineering, program structure improvement, program modularisation and data re-engineering
• Source code translation is the automatic conversion of program in one language to another
28
Summary- Cont’d
• Reverse engineering is the process of deriving the system design and specification from its source code
• Program structure improvement replaces unstructured control constructs with while loops and simple conditionals
• Program modularisation involves reorganisation to group related items
• Data re-engineering may be necessary because of inconsistent data management
29