1
Software Reliability Engineering: A Roadmap
Michael R. Lyu
Dept. of Computer Science & Engineering The Chinese University of Hong Kong
Future of Software Engineering, ICSE’2007
Minneapolis, Minnesota, May 24, 2007
2
Introduction
Software reliability is the probability of failure-free operation with respect to execution time and environment.
Software reliability engineering (SRE) is the quantitative study of the operational behavior of software-based systems with respect to user requirements concerning reliability.
SRE has been adopted by more than 50 companies as standards or best current practices.
Creditable software reliability techniques are still urgently needed.
3
Historical SRE Techniques: Fault Lifecycle
Fault prevention: to avoid, by construction, fault occurrences.
Fault removal: to detect, by verification and validation, the existence of faults and eliminate them.
Fault tolerance: to provide, by redundancy and diversity, service complying with the specification in spite of manifested faults.
Fault/failure forecasting: to estimate, by statistical modeling, the presence of faults and occurrence of failures.
4
Fault Lifecycle Technique
Fault Manifestation and Modeling Process
Reliability
Fault Prevention
Fault Removal
Fault Tolerance
Fault/Failure Forecasting
5
Fault Lifecycle Technique
Fault Manifestation and Modeling Process
Reliability, Availability, Safety, Security
Fault Prevention
Fault Removal
Fault Tolerance
Fault/Failure Forecasting
6
Software Reliability Modeling
[Figure: failure rate plotted against execution time and testing time, with markers for Present, Objective, and Additional Time.]
R = e^{-λt}
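The exponential relation on this slide, R = e^{-λt}, can be illustrated with a small sketch. It assumes a constant failure rate λ over the interval of interest, which is a simplification. The function names and the numeric values are hypothetical and do not come from the talk.

```python
import math


def reliability(failure_rate: float, exec_time: float) -> float:
    """Probability of failure-free operation over exec_time.

    Assumes a constant failure rate (failures per unit of execution time).
    """
    if failure_rate < 0 or exec_time < 0:
        raise ValueError("failure_rate and exec_time must be non-negative")
    return math.exp(-failure_rate * exec_time)


def max_failure_rate(target_reliability: float, exec_time: float) -> float:
    """Largest constant failure rate that still meets the reliability target."""
    if not 0 < target_reliability <= 1 or exec_time <= 0:
        raise ValueError("need 0 < target_reliability <= 1 and exec_time > 0")
    return -math.log(target_reliability) / exec_time


if __name__ == "__main__":
    present_rate = 0.01   # hypothetical: failures per hour observed now
    mission_time = 10.0   # hypothetical: hours of execution
    print("present reliability:", reliability(present_rate, mission_time))
    print("rate needed for R = 0.99:", max_failure_rate(0.99, mission_time))
```

Inverting the formula, as `max_failure_rate` does, turns a reliability objective into a failure-rate objective, which is the quantity a plot of failure rate against testing time tracks.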
7
Current SRE Process Overview
8
Current Trends and Problems
The theoretical foundation of software reliability comes from hardware reliability techniques.
Software failures do not happen independently.
Software failures seldom repeat in exactly the same or predictable pattern.
Failure mode and effect analysis (FMEA) for software is still controversial and incomplete.
There is currently a need for a creditable end-to-end software reliability paradigm that can be directly linked to reliability prediction from the very beginning.
9
Future Direction 1: Reliability-Centric Software Architectures
The product view: achieve failure-resilient software architecture
  Fault prevention
  Fault tolerance
The process view: explore component-based software engineering
  Component identification, construction, protection, integration and interaction
  Reliability modeling based on software structure
10
Future Direction 2: Design for Reliability Achievement
Fault confinement
Fault detection
Diagnosis
Reconfiguration
Recovery
Restart
Repair
Reintegration
[Figure: flow diagram with the stages Fault Confinement, Fault Detection, Failover, Diagnosis, Reconfiguration, Recovery, Restart, Repair, and Reintegration, with Online and Offline branches.]
12
Future Direction 3: Testing for Reliability Assessment
Establish the link between software testing and reliability
Study the effect of code coverage on fault coverage
Evaluate the impact of various testing metrics on reliability
Assess competing testing schemes quantitatively
13
Positive vs. negative evidence for coverage-based software testing
Evidence | Resources | Findings
Positive | Frankl (1988), Horgan (1994), Weyuker (1988) | High code coverage brings high software reliability and low failure rate
Positive | Chen (1992) | A correlation between code coverage and software reliability is observed
Positive | Wong (1994) | The correlation between test effectiveness and block coverage is higher than that between test effectiveness and the size of the test set
Positive | Frate (1995) | An increase in reliability comes with an increase in at least one code coverage measure
Positive | Cai (2005) | Code coverage contributes to a noticeable amount of fault coverage
Negative | Briand (2000) | The testing results on published data did not support a causal dependency between code coverage and defect coverage
14
RSDIMU test cases description
[Figure: test cases divided into regions I through VI.]
15
The correlation: various test regions
Linear modeling fitness in various test case regions
Linear regression relationship between block coverage and fault coverage in the whole test set
16
The correlation: normal operational testing vs. exceptional testing
Normal operational testing: very weak correlation
Exceptional testing: strong correlation
Testing profile (size) | R-square
Whole test set (1200) | 0.781
Normal testing (827) | 0.045
Exceptional testing (373) | 0.944
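For readers unfamiliar with the R-square figures above, the following sketch shows how the coefficient of determination of a simple linear regression (one predictor plus an intercept) can be computed. It is an illustration only: the function name and the sample coverage values are hypothetical and are not the RSDIMU data.

```python
from typing import Sequence


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """R-square of the least-squares line y = a + b*x.

    For a single predictor with an intercept, this equals the squared
    Pearson correlation between x and y.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    return (sxy * sxy) / (sxx * syy)


if __name__ == "__main__":
    # Hypothetical per-test-case measurements, as fractions.
    block_coverage = [0.30, 0.42, 0.48, 0.55, 0.70]
    fault_coverage = [0.10, 0.35, 0.30, 0.50, 0.80]
    print("R-square:", r_squared(block_coverage, fault_coverage))
```

A value near 1 means a straight line explains most of the variation in fault coverage, and a value near 0 means it explains almost none.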
17
The correlation: normal operational testing vs. exceptional testing
Normal testing: small coverage range (48%-52%)
Exceptional testing: two main clusters
[Figure: two plots, each with a Fault Coverage axis.]
18
The Spectrum in Software Testing and Reliability
Software Reliability Growth Models | New Model | Coverage-Based Analysis
• A new model is needed to combine execution time and testing coverage
Time-Based Models | Coverage-Based Testing
user oriented | tester oriented
more physical meaning | less physical meaning
abundant models | lack of models
easy data collection | hard data collection
less relevance to testing | more relevance to testing
19
A New Coverage-Based Reliability Model
λ(t,c): joint failure intensity function
λ1(t): failure intensity function with respect to time
λ2(c): failure intensity function with respect to coverage
α1, γ1, α2, γ2: parameters, with the constraint α1 + α2 = 1
[Figure: the model equation with callouts for the joint failure intensity function, the failure intensity function with time, the failure intensity function with coverage, and the dependency factors.]
20
Estimation Accuracy
21
Future Direction 4: Metrics for Reliability Prediction
New models (e.g., Bayesian belief networks, BBN) to explore rich software metrics
Data mining approaches
Machine learning techniques
Bridging the gap of the one-way function: feedback to building reliable software
Continuous industrial data collection efforts: demonstration of cost-effectiveness
22
Future Direction 5: Reliability for Emerging Software Applications
“The Internet changes everything”
On-demand customizable software
Service oriented architecture, composition, integration
Customization by middleware: from metadata to metacode
A common infrastructure delivers reliability to all customers
23
[Figure: architecture diagram showing a Replication Manager (Web service selection algorithm, WatchDog), a UDDI Registry, WSDL, three replicated Web services (each with IIS, Application, and Database), and a Client (Port, Application, Database).]
1. Create Web services
2. Select primary Web service (PWS)
3. Register
4. Look up
5. Get WSDL
6. Invoke Web service
7. Keep checking the availability of the PWS
8. If the PWS fails, reselect the PWS.
9. Update the WSDL
A Paradigm for Reliable Web Service
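The nine steps can be sketched as a small in-memory simulation. This is an illustrative reading of the slide, not the implementation behind it. Every class and function name is hypothetical, the registry is a plain dictionary standing in for UDDI and WSDL, and the selection algorithm is assumed to be "first live replica", which the slide does not specify.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Replica:
    """A replicated Web service instance (hypothetical stand-in)."""
    name: str
    alive: bool = True

    def invoke(self, request: str) -> str:
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} handled {request}"


class Registry:
    """In-memory stand-in for the UDDI registry and published WSDL."""

    def __init__(self) -> None:
        self._entries: Dict[str, str] = {}

    def register(self, service: str, endpoint: str) -> None:
        self._entries[service] = endpoint        # steps 3 and 9

    def look_up(self, service: str) -> str:
        return self._entries[service]            # steps 4 and 5


class ReplicationManager:
    def __init__(self, service: str, replicas: List[Replica],
                 registry: Registry) -> None:
        self.service = service
        self.replicas = {r.name: r for r in replicas}   # step 1
        self.registry = registry
        self.primary: Optional[Replica] = None
        self.reselect()                                  # steps 2 and 3

    def reselect(self) -> None:
        """Assumed selection rule: pick the first live replica."""
        for replica in self.replicas.values():
            if replica.alive:
                self.primary = replica
                self.registry.register(self.service, replica.name)
                return
        raise RuntimeError("no live replica available")

    def watchdog_check(self) -> None:
        """Steps 7 to 9: reselect and republish if the primary has failed."""
        if self.primary is None or not self.primary.alive:
            self.reselect()


def client_call(registry: Registry, manager: ReplicationManager,
                service: str, request: str) -> str:
    endpoint = registry.look_up(service)                 # steps 4 and 5
    return manager.replicas[endpoint].invoke(request)    # step 6


if __name__ == "__main__":
    replicas = [Replica("ws-a"), Replica("ws-b"), Replica("ws-c")]
    registry = Registry()
    manager = ReplicationManager("demo", replicas, registry)
    print(client_call(registry, manager, "demo", "ping"))
    replicas[0].alive = False        # simulate failure of the primary
    manager.watchdog_check()
    print(client_call(registry, manager, "demo", "ping"))
```

In this sketch the client only ever talks to whatever endpoint the registry currently publishes, so a failover is invisible to it apart from the window between the failure and the next watchdog check.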
24
Conclusions
Software reliability is receiving higher attention as it becomes an important economic consideration for businesses.
New SRE paradigms need to consider software architectures, testing techniques, data analyses, and creditable reliability modeling procedures.
Domain specific approaches on emerging software applications are worthy of investigation.
Still a long way to go, but the directions are clear.