

94 CANADIAN JOURNAL OF ELECTRICAL AND COMPUTER ENGINEERING, VOL. 37, NO. 2, SPRING 2014

Reliability Estimation for Component-Based Software Product Lines

Estimation fiable de lignes de produits logiciels à base de composantes

Ebrahim Bagheri, Senior Member, IEEE, and Faezeh Ensan

Abstract: The objective of the software product line engineering paradigm is to enhance the large-scale reuse of common core assets within a target domain. Reuse is facilitated by systematically organizing and modeling the core assets and the relationships between them. One of the main core assets of a domain is the model for representing the available functional aspects, often known as features, within structured forms such as feature models. The selection and composition of the most suitable or desirable set of features for a given purpose allows the rapid development of new final products from the software product line. Product developers are, in most cases, not only interested in building applications that possess certain functional characteristics but are also concerned with nonfunctional properties of the final product, such as reliability. To this end, we propose a component-based software product line reliability estimation model that is able to provide lower and upper reliability bound guarantees for a software product line feature model, its specializations, and its configurations. Our model builds on top of the reliability of the individual features that are present in the product line and provides best- and worst-case estimates. Our work is based on an essential and widely used assumption that features are implemented using self-contained software components or services whose reliability can be determined independently. We also propose reliability-aware configuration methods that ensure the satisfaction of both functional and reliability requirements during the application development process. We offer our observations and insight into the performance of our reliability estimation model and provide analysis of its advantages and shortcomings.

Résumé: The objective of the software product line engineering paradigm is to improve the large-scale reuse of common core assets within a target domain. Reuse is facilitated by the interplay between the systematic organization and the modeling of the core assets. One of the main core assets of a domain is the model for representing the available functional aspects, often known as features, within structured forms such as feature models. The selection and composition of the most suitable or desirable set of features for a given purpose allows the rapid development of new final products from the software product line. Product developers are, in most cases, not only interested in designing applications with certain functional characteristics but are also concerned with nonfunctional properties of the final product, such as reliability. To this end, we propose a component-based reliability estimation model for software product lines that is able to provide lower and upper reliability bound guarantees for a software product line feature model, as well as for its specializations and configurations. Our model builds on the reliability of the individual features present in the product line and provides best- and worst-case estimates. Our work is based on an essential and widely used assumption that features are implemented using self-contained software components or services whose reliability can be determined independently. We also propose reliability-aware configuration methods that ensure the satisfaction of functional and reliability requirements during the application development process. We also present in this article our observations and an overview of the performance of our reliability estimation model, as well as an analysis of its advantages and drawbacks.

Index Terms: Automated configuration, feature models, nonfunctional properties, reliability estimation, software product lines.

Manuscript received August 22, 2013; revised November 6, 2013; accepted December 24, 2013. Date of current version August 15, 2014.

E. Bagheri is with Ryerson University, Toronto, ON M5B 2K3, Canada (e-mail: [email protected]).

F. Ensan is with Athabasca University, Athabasca, AB T9S 3A3, Canada (e-mail: [email protected]).

Associate Editor managing this paper’s review: Vahid Garousi.

Digital Object Identifier 10.1109/CJECE.2014.2323958

0840-8688 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

I. INTRODUCTION

Software product line engineering is a paradigm that facilitates systematic reuse through variability modeling [1] and by the adoption of the concepts of mass production and mass customization for the software development domain [2]. A software product line is in essence an efficiently managed set of shared software assets, often referred to as core assets [3], that comprehensively describe the functionality of a target domain. It enables software developers to rapidly model and build new software applications for a target domain through the reuse of core assets. It is widely believed that effective implementation of software product lines for complex domains can result in considerable cost and time savings in the software development process [4]. An example of this is the syngo.via platform developed by Siemens Healthcare. This platform supports many different functionalities of medical imaging devices and has already been configured into more than 70 diagnostic applications.

One of the common and concrete ways to model and manage software product lines is to use feature-oriented software development [5], through which unique incremental functionalities of the domain are represented by so-called features. The collection of all features for a domain, i.e., the feature space, depicts the possible and available functionalities that can be incorporated into an application for that domain. Thus, a software application is represented as the composition of a selected set of features from the feature space. The feature space is often restricted through structural and integrity constraints between the features, thereby describing the restrictions determining a valid feature composition [6]. The constraints governing feature relationships within the feature space can be represented through a hierarchical tree-like structure, known as a feature model [7]. The root of a feature model represents the domain that is being modeled and the other nodes denote domain features.

The development process of software product lines consists of the dual lifecycle of domain engineering and application engineering [8]. The domain engineering lifecycle consists of activities for exploring, understanding, and documenting the various aspects of a domain. This could include the identification and formalization of domain features and the determination of their interdependencies. More specifically, the goal of this lifecycle is to develop a unique model for describing the domain in terms of its functional features, the reference architecture for product development, the detailed design or procurement of suitable components that can realize the domain features, and testing strategies that would allow for the validation and verification of the product line. The outcomes of this lifecycle will be in the form of a set of core assets that capture the variabilities and commonalities of all possible domain applications [9].

The domain engineering lifecycle builds a platform on top of which the application engineering lifecycle can rapidly build new domain applications. The process of building a new product involves understanding the specific stakeholders’ needs, binding the variable points through the selection of the most desirable features, and composing them into a unique product. The gradual selection of the desirable features for a product restricts the product line feature space by binding the variable points and is often referred to as the specialization process [10]. A fully specialized product with no possible further refinements and no remaining unbound variability points is called a configuration, representing a final domain application. A configuration, along with the product line reference architecture and the provided implementation details for the features, allows the application developers to ultimately realize a functional domain application at the end of the application engineering lifecycle.

While the development of applications through the software product line paradigm offers numerous advantages, the identification of defects and failures within the product line is exponentially harder compared with individual applications. The main reason for this is the presence of variability and feature relationships within a product line [11]. In other words, given $n$ features, variability allows up to $2^n$ applications to be developed, requiring much more extensive testing for identifying all possible failures and defects, i.e., $O(2^n \times t) \gg O(t)$, where $O(t)$ is the complexity of testing one individual application. Various testing strategies and test case generation techniques such as T-wise testing [12], grammar-based [13] and evolutionary-based [14] test case generation have been proposed to tame this exponential complexity.
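To make the growth concrete, the following minimal sketch tabulates the $2^n$ upper bound and the corresponding exhaustive testing effort. It assumes, purely for illustration, that every feature is an independent on/off choice and that testing any single product costs the same amount `cost_per_product`; both function names are hypothetical and not taken from the paper.

```python
def max_configurations(n_features: int) -> int:
    """Upper bound on the number of products when each of the n features
    is treated as an independent include/exclude choice (assumption)."""
    return 2 ** n_features


def exhaustive_test_cost(n_features: int, cost_per_product: float) -> float:
    """Effort of testing every possible product, assuming a uniform
    per-product testing cost (illustrative simplification)."""
    return max_configurations(n_features) * cost_per_product


if __name__ == "__main__":
    # The per-product cost is fixed at 1.0 unit; only n changes.
    for n in (10, 20, 30):
        print(n, max_configurations(n), exhaustive_test_cost(n, 1.0))
```

Real feature models admit fewer products than this bound because structural and integrity constraints rule out many feature combinations, but the exponential trend remains.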

The main goal of testing approaches in software product lines has been to identify a subset of the product line configuration space, i.e., a limited number of configurations, that if tested would reveal an acceptable number of failures and defects within the product line. Such techniques will ensure that a reasonable number of product line defects are detected; however, there is no guarantee that all of the product line configurations would receive the same treatment. Simply put, the objective is to maximize the number of identified defects and failures across all the configurations and not necessarily on an individual configuration basis. For this reason, one might encounter a case where defects of one configuration are uncovered as a result of testing the software product line, while some defects of another configuration remain unnoticed. Therefore, while the overall quality of the software product line increases as a result of testing, no upper bound guarantee on the number of possible defects can be given on a per-configuration basis.

Along the same lines, the idea of considering nonfunctional properties during product line configuration stems from the need to satisfy at least a minimum degree of quality of service by the derived final application. For this purpose, nonfunctional properties are often captured and represented along with the functional aspects of the feature space. However, in many of the proposed techniques, such as our own work on the fuzzy representation of nonfunctional properties [15], functional requirements are considered to be hard constraints that should be satisfied while nonfunctional properties are viewed as soft constraints whose satisfaction is only desirable. In this view, the satisfaction of nonfunctional properties is good to have but not a necessary condition. Furthermore, many nonfunctional properties are:

1) subjective and nonquantifiable in nature;

2) nonmeasurable before runtime and hence not computable during the configuration process in the application engineering lifecycle.

For these reasons, while it is desirable to satisfy nonfunctional requirements during product configuration, it becomes quite difficult to ensure at least a lower bound for the nonfunctional properties of a given configuration. The most appropriate configuration process would be one that maximally satisfies both functional and nonfunctional requirements and is able to provide guarantees in terms of an upper bound on the number of possible defects for a developed configuration.


To be able to model and analyze lower and upper software program reliability bounds, software reliability prediction models have traditionally been developed and employed to estimate the ability of a software system to perform its expected tasks under certain conditions for a specific amount of time [16]. Reliability models are mostly probabilistic in nature and represent the probability of failure for a software system, calculated based on prior execution observations made under similar conditions for that system [17]. Such models are developed either at:

1) the application level, where the application reliability is monitored as a whole;

2) the component level, where each component of the application is monitored individually and application-level reliability is modeled through the composition of the reliability of its constituent components.

An important aspect of many reliability models is that they operate based on the assumption that the software application structure is deterministic and well defined and will not change either at compile time or at runtime. With this key assumption, these models are able to turn prior application or component reliability observations into reliability prediction models [18]. However, any changes in the application architecture or any of its components will impact the dependability of the developed reliability model. Within the context of software product lines, where variability plays a key role in application development, such reliability models cannot be easily adopted. The lower and upper bounds of the reliability of a software product line depend on the reliability of each of the features, their interactions, the structural and integrity constraints over the feature space, as well as the employed variability patterns [9]. Furthermore, the reliability of a configured application depends on the structure of the composition of the selected features and would only be known once a full configuration is developed. Hence, an appropriate reliability model for the product line paradigm would need to consider the unique characteristics of software product lines, including feature relationships and variability, and allow for the estimation of application reliability during the specialization process in the application engineering lifecycle.

As discussed in the following section, in this paper, we will focus on aspects of reliability estimation for the software product line engineering paradigm and offer concrete models to estimate reliability in the face of variability. Our work will ensure that reliability guarantees (both lower and upper bounds) can be offered throughout the application configuration process. The rest of this paper is organized as follows. Section II will outline the specific problem statement and an overview of our contributions. Section III will introduce variability and composition patterns and how they pertain to reliability estimation. The foundation and details of our reliability model are introduced in Sections IV and V. Section VI discusses how reliability-aware product line configuration can be performed. Some observations on the behavior of the reliability model are given in Section VII. This paper proceeds with discussions and related work and is then concluded in Section X.

II. PROBLEM STATEMENT

An important consideration prior to software delivery is whether the developed software application is able to meet the expected operational requirements or not. In other words, does the software application satisfy the minimum acceptable level of reliability in practice and under specific conditions? In reality, there is no guarantee that a fully tested and functional software application will behave as it did under stress test conditions; however, the only reliable source of information for deciding about the reliability of a software application is either its prior execution history or the results of the testing process. This information can be used to generalize operational characteristics of the software application to build probabilistic models that would estimate software reliability in a given period of time.

While the development of reliability models for general software applications has its own set of challenges, reliability models for software product lines are even more challenging because of issues related to software variability and the logical separation between the domain engineering and application engineering phases. Here, we outline some of the main characteristics of software product lines that set them apart from typical software applications.

1) Reliability-sensitive applications are often developed with reliability considerations in mind from the early stages of development. Therefore, the issue of satisfying expected operational requirements is dealt with during the various stages of software development. However, given that software product lines are meant to be designed in a way that represents many, if not all, applications of a given domain, and that each of these applications has its own reliability requirements, it is extremely hard, if at all possible, to holistically incorporate reliability concerns in the product line models. In other words, the reliability demands of the applications derivable from the same product line can be so different that they prevent the product line engineers from uniformly representing them. For instance, one application configured from a product line may be safety-critical and require high reliability guarantees while another product from that same product line could be more flexible with regard to reliability concerns.

2) Besides the variety of applications that can be derived from the product line, the separation of the domain engineering and application engineering phases is another challenge. In complex domains, these two phases are not necessarily performed with any organized sequential consistency. Furthermore, the engineers undertaking domain engineering and application engineering roles might not necessarily be the same people. Therefore, reliability concerns during the application engineering process might not be known, communicated, or considered during the domain engineering process.

3) The reliability of an application is dependent on the choice of features that are selected and composed. In other words, the feature composition adopted in the configuration process will determine the basis on which reliability should be modeled. The challenge here is that the configuration process is often performed in stages, referred to as staged configuration [19], at each stage of which a subset of the desirable features is selected by the stakeholders or engineers; therefore, the selection of features could be suboptimal with regard to reliability in the configuration process.

4) The reliability of a final application derived from a software product line is also dependent on the adopted composition patterns [20] that determine how the selected features are combined and put together to operationalize the application. While product lines often offer a reference architecture, its details are quite dependent on the features that are selected and hence will only be determined in detail once the final product feature configuration is known. For this reason, it is quite difficult to determine reliability for a final product during the configuration process of the application engineering phase.

In light of these challenges, we are interested in developing a reliability estimation model for the component-based software product line domain that is able to: 1) estimate the lower and upper bound reliability for the least and most reliable products derivable from a given product line; 2) provide reliability predictions during the staged configuration process and hence assist in offering reliability guarantees for the products that are developed; and 3) develop an optimal configuration with respect to reliability given a set of feature selections. We offer the following concrete contributions in our reliability estimation model.

1) We formalize a reliability estimation model that considers variability as its central deciding factor. The estimation model is able to compute lower and upper bound reliability values for all unbound, semispecialized, and fully configured component-based product line models. Our reliability estimation model allows product designers and application engineers to make informed reliability-aware decisions with regard to feature selection during the application engineering phase.

2) We propose a configuration process that not only considers functional requirements, but also takes lower and upper bound requirements on the reliability of the final product into account. The configuration process is independent of composition patterns and considers the selection of the worst and best composition patterns between the selected features for calculating the lower and upper bound reliability values.

With the explicit consideration of reliability in the configuration process, our work allows engineers to set the lowest acceptable reliability value for an application. In this way, while an application is being developed in the configuration process, those configurations whose lower bound reliability falls below the specified threshold will be automatically discarded. The benefit is that a guaranteed lower bound reliability can be ensured for the result of the configuration process. The additional advantage of our work is that it builds the reliability calculations on top of the reliability of each individual feature; therefore, it can easily be integrated with traditional reliability models that can estimate the reliability of each individual feature, e.g., reliability models for software components [21], [22].
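The thresholding step described above can be sketched as a simple filter. The configuration names and bound values below are made-up placeholders, the function name is hypothetical, and the bounds are taken as given inputs; how the lower and upper bounds are actually computed is the subject of the reliability model itself and is not reproduced here.

```python
from typing import Dict, Tuple

# (lower bound, upper bound) on the reliability of a candidate configuration
Bounds = Tuple[float, float]


def discard_unreliable(candidates: Dict[str, Bounds], threshold: float) -> Dict[str, Bounds]:
    """Keep only candidates whose lower-bound reliability meets the threshold."""
    return {name: bounds for name, bounds in candidates.items() if bounds[0] >= threshold}


# Placeholder values for illustration only; they are not results from the paper.
candidates = {
    "config_a": (0.97, 0.99),
    "config_b": (0.88, 0.995),
    "config_c": (0.95, 0.96),
}

kept = discard_unreliable(candidates, threshold=0.95)
print(sorted(kept))
```

Note that `config_b` is discarded even though it has the highest upper bound: the guarantee is driven by the worst case, so only the lower bound is compared against the threshold.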

III. VARIABILITY AND COMPOSITION PATTERNS

In this paper, we make two basic assumptions. The first assumption is that the interaction between the features within the software product line feature space is captured through feature models [7]; thereby, any references to a software product line or product line model in this paper refer to a feature model representation of a product line. The main role of a feature model is to depict the commonalities and variabilities of the products of a domain, which is effectively done through the use of variability patterns. This first assumption entails that the only feature interaction patterns that are supported in our reliability model are those explicitly represented in feature models, i.e., through integrity constraints such as excludes and includes as well as structural parent-child and sibling relationships. The second assumption pertains to the operationalization of a configuration. We assume that there is at least one software component or service¹ for each feature that can perform the functions intended by that feature. The components related to the selected features in a product configuration will be integrated through the use of composition patterns (also known as workflow patterns) [20].

A. Variability Patterns

Variability patterns are essentially used to depict the interrelationship between the features of the software product line and show how they can be potentially configured within the final application. These patterns provide the possibility to represent unbound alternative points between features, thereby resulting in a variety of final applications. Variability patterns are typically classified as follows.

1) Mandatory: The feature must be included in the final product if its parent feature is already included.

2) Optional: The inclusion or exclusion of this feature from an application is at the discretion of the application engineer.

3) Alternative Feature Group (AltFG): One and only one of the features from the AltFG can be included in the final application if the parent feature of the group is already selected.

4) Or Feature Group (OFG): One or more of the features from the OFG can be included in the final application if the parent feature of the group is already selected.

5) And Feature Group (AFG): All of the features from the AFG need to be included in the final application if the parent feature of the group is already included.
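The semantics of the five patterns listed above can be sketched as a small validity check on the children of an already selected parent. The `Pattern` enum and the function `group_selection_valid` are hypothetical names introduced here for illustration, and a mandatory or optional relation is treated as a group with a single child.

```python
from enum import Enum
from typing import Iterable, Set


class Pattern(Enum):
    MANDATORY = "mandatory"
    OPTIONAL = "optional"
    ALTERNATIVE = "alternative"  # AltFG: exactly one child
    OR = "or"                    # OFG: one or more children
    AND = "and"                  # AFG: all children


def group_selection_valid(pattern: Pattern, children: Iterable[str], selected: Set[str]) -> bool:
    """Check the children of a parent that is assumed to be already selected."""
    children = list(children)
    chosen = [c for c in children if c in selected]
    if pattern in (Pattern.MANDATORY, Pattern.AND):
        return len(chosen) == len(children)   # every child must be present
    if pattern is Pattern.OPTIONAL:
        return True                           # presence is left to the engineer
    if pattern is Pattern.ALTERNATIVE:
        return len(chosen) == 1               # one and only one
    if pattern is Pattern.OR:
        return len(chosen) >= 1               # at least one
    raise ValueError(f"unknown pattern: {pattern}")


if __name__ == "__main__":
    print(group_selection_valid(Pattern.ALTERNATIVE, ["a", "b"], {"a"}))
    print(group_selection_valid(Pattern.ALTERNATIVE, ["a", "b"], {"a", "b"}))
```

The check covers only a single parent and its direct children; cross-tree integrity constraints such as includes and excludes would need a separate pass.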

Definition 1 (Feature Model): A feature model $FM = G_{FM}(V, E)$ is a directed acyclic graph consisting of a set of vertices $V$ representing features and edges $E \subseteq V \times V \times \mathit{Patterns}$ representing the parent-child relations between the features, such that $\mathit{Patterns} = \{\overset{\bullet}{f}, \overset{\circ}{f}, f_{OR}, f_{XOR}, f_{AND}\}$, where $\overset{\bullet}{f}$ and $\overset{\circ}{f}$ denote mandatory or optional parent-child feature relations, respectively; and $f_{AND}$, $f_{XOR}$, and $f_{OR}$ denote AND, Alternative, and OR group relations between the parent-child features with common parents, respectively.

¹In this paper, we will use the terms component and service interchangeably.

Fig. 1. Variability patterns in the feature modeling notation.

Fig. 2. Two generic composition patterns.

For instance, a feature model with only three features $f_1$, $f_2$, and $f_3$, where $f_2$ is a mandatory child of $f_1$ and $f_3$ is an optional child of $f_1$, would be represented as $V = \{f_1, f_2, f_3\}$ and $E = \{(f_1, f_2, \overset{\bullet}{f}), (f_1, f_3, \overset{\circ}{f})\}$. Fig. 1 shows the graphical notation used in feature models to depict the variability patterns.
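As an illustration of Definition 1, the three-feature example can be written down as a small data structure. This is only a sketch: the names `Pattern`, `FeatureModel`, `add_edge`, and `children` are hypothetical and are not part of the original formalization.

```python
from dataclasses import dataclass, field
from enum import Enum


class Pattern(Enum):
    """Edge labels of Definition 1 (names are illustrative)."""
    MANDATORY = "mandatory"
    OPTIONAL = "optional"
    OR = "or"
    XOR = "xor"
    AND = "and"


@dataclass
class FeatureModel:
    """A feature model as a set of vertices V and labeled edges E."""
    vertices: set = field(default_factory=set)
    edges: set = field(default_factory=set)  # triples (parent, child, Pattern)

    def add_edge(self, parent, child, pattern):
        # Adding an edge also registers both end points as features.
        self.vertices.update((parent, child))
        self.edges.add((parent, child, pattern))

    def children(self, parent):
        # All (child, pattern) pairs directly below the given parent.
        return {(c, p) for (u, c, p) in self.edges if u == parent}


# The example from the text: f2 is a mandatory and f3 an optional child of f1.
fm = FeatureModel()
fm.add_edge("f1", "f2", Pattern.MANDATORY)
fm.add_edge("f1", "f3", Pattern.OPTIONAL)
```

The sketch does not check acyclicity; a fuller implementation would need to reject edges that introduce cycles.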

B. Composition Patterns

Composition patterns have their roots in workflow management and are employed to show how the various tasks within a workflow are assembled to accomplish an objective [20]. In the context of software product lines, composition patterns allow the specification of the application behavior by specifying how the selected features are put together and composed in terms of execution sequence and control flow to form a final operationalized application. Given the context of our work, we only consider behavioral composition patterns. Composition patterns can be roughly divided into two main groups.

1) Sequential Pattern: The execution of one component or task can only begin after the completion of the execution of the preceding component or task.

2) Parallel Patterns: Two or more components or tasks can be executed simultaneously, or the execution can be split into multiple branches to achieve a specific goal.

It should be noted that there are many variants of parallel patterns. We group them into one class for the sake of simplicity. Fig. 2 shows the representation of these two patterns in simplified business process modeling notation (BPMN), where the diamonds represent gateways, rectangles denote activities, and circles show start/end events.

Based on our second assumption that there is at least one component that operationalizes each feature in the feature model, we can now define operational semantics for an application configured from a software product line as follows.

Definition 2 (Feature Operationalization): Let $FM = G_{FM}(V, E)$ be a feature model and $f \in V$ represent any one leaf node in the feature model; then we define $S(f)$ as the set of available components that provide a suitable implementation for feature $f$.

Definition 3 (Configuration Operationalization): Let $V^*$ be the set of features selected in a configuration process from the feature model $FM = G_{FM}(V, E)$ and $C_{V^*}$ be a feature operationalization for $V^*$ such that $C_{V^*} = \{c_1, \ldots, c_{|V^*|}\}$, $c_i \in S(f_i)$, and $f_i = V^*[i]$. An operational application for $C_{V^*}$, denoted $OP_{C_{V^*}} = (V', E')$, is a directed acyclic graph, where $V' = C_{V^*} \cup \{s, e\}$ ($s$ and $e$ being the start and end states) and $E' \subseteq V' \times V' \times Patterns$ denotes the control-flow transitions between components, where $Patterns = \{P_s, P_p\}$. $P_s$ and $P_p$ denote sequential and parallel patterns, respectively.

For instance, given a configuration consisting of only two features $f_1$ and $f_2$ that can be operationalized using components $c_1$ and $c_2$, the configuration operationalization for a situation where $c_1$ and $c_2$ are composed using the sequential pattern would be represented by $V' = \{s, e, c_1, c_2\}$ and $E' = \{(c_1, c_2, P_s)\}$.

The above definition shows how a configuration derived from the software product line can be operationalized using the composition patterns. Simply put, a configuration is operationalized by selecting one component for each of the selected features from their set of available components ($S(f)$) and then composing these components into an execution sequence using either sequential or parallel composition patterns. In other words, each feature in the configuration will be represented by one of the components that implements it. These components will then be connected through the use of the composition patterns. The end result is a graph ($OP_{C_{V^*}}$) whose nodes are the components representing the selected features and whose edges are of composition pattern type, connecting the components together. $OP_{C_{V^*}}$ provides the execution sequence for a configuration, turning it into an executable application.
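The operationalization step of Definition 3 can be sketched as follows. The function name `operationalize`, the `choose` callback, and the `flow` argument (the control flow given as feature-level triples) are assumptions made for illustration; the definition itself does not prescribe how components are picked or how the flow is supplied.

```python
def operationalize(selected, S, choose, flow):
    """Build (V', E') for a configuration.

    selected: list of selected features (V*)
    S:        dict mapping each feature to its list of candidate components
    choose:   hypothetical callback that picks one component from a list
    flow:     iterable of (feature_a, feature_b, pattern) control-flow triples,
              with pattern being "Ps" (sequential) or "Pp" (parallel)
    """
    component_of = {}
    for f in selected:
        candidates = S.get(f, [])
        if not candidates:
            # Definition 2 requires at least one component per feature.
            raise ValueError(f"no component operationalizes feature {f!r}")
        component_of[f] = choose(candidates)

    vertices = set(component_of.values()) | {"s", "e"}  # add start/end states
    edges = {(component_of[a], component_of[b], p) for (a, b, p) in flow}
    return vertices, edges


# The two-feature example from the text, composed sequentially.
S = {"f1": ["c1"], "f2": ["c2"]}
V_prime, E_prime = operationalize(
    ["f1", "f2"], S, choose=lambda cs: cs[0], flow=[("f1", "f2", "Ps")]
)
```

As in the example in the text, transitions from the start state and to the end state are left implicit.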

Now, in the context of the variability and composition patterns, it is worth noting that there are two decisions, among others, that impact the reliability of an operationalized configuration.

1) The selection of the set of features included in the configuration.

2) The configuration operationalization in terms of the placement of components and their interaction through composition patterns.


In practice, such information is only gradually known and may not even be fully known during some stages of the configuration process. Therefore, in our work, we do not necessarily need a fully configured model to be able to estimate lower and upper bound reliability. Furthermore, our work is independent of the adopted composition patterns and finds lower and upper bound reliability based on the worst- and best-case adoptions of composition patterns for a given configuration.

IV. FOUNDATION OF RELIABILITY MODEL

The base of our reliability model is the reliabilities of the components that will eventually operationalize the software product line features. Therefore, we will first clearly define the reliability model that we will employ for each individual component. Based on Definition 2, there is at least one component for each feature that provides a suitable implementation for the expected functions of that feature. Formally speaking, $\forall f \in V, |S(f)| \geq 1$. With this component-based model for feature operationalization, we base our reliability model on the reliability of components. The reliability of a component is often modeled based on the number of observed failures during an operational or component testing period.

Definition 4 (True Reliability): Let $F_i^c$ denote failure of component $c$ on its $i$th invocation such that

$$F_i^c = \begin{cases} 0, & c \text{ succeeds on attempt } i \\ 1, & c \text{ fails on attempt } i. \end{cases} \tag{1}$$

We can measure the true reliability of $c$, $TR_c$, as follows:

$$TR_c = 1 - \left( \lim_{i \to \infty} \frac{1}{i} \sum_{i} F_i^c \right). \tag{2}$$

Assuming that each execution of component $c$ can be completed independently of the previous invocations of $c$ and that the component failure probability does not change between executions (unless some change is made in the component), then according to the weak law of large numbers [23], as the number of executions tends to infinity, the estimate in (2) converges toward the true reliability of component $c$. However, this means that the true reliability of a component is only computable based on an infinite number of component execution observations. Given that we cannot execute or test a component infinitely, instead of computing the true component reliability, we need to measure the observed component reliability based on limited prior execution intervals.
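The convergence argument can be illustrated with a small simulation. This is a sketch under the stated assumptions only (independent invocations, constant failure probability); the function name `empirical_reliability` and the use of a seeded pseudo-random generator are illustrative choices, not part of the model.

```python
import random


def empirical_reliability(failure_prob, n, seed=0):
    """Estimate reliability as 1 - (observed failures / n) over n invocations.

    Each invocation fails independently with probability `failure_prob`,
    mirroring the assumptions under which the law of large numbers applies.
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    failures = sum(1 for _ in range(n) if rng.random() < failure_prob)
    return 1 - failures / n


# With more invocations, the estimate tends to move closer to 1 - failure_prob.
for n in (10, 1_000, 100_000):
    print(n, empirical_reliability(0.1, n))
```

Any finite run only yields an estimate, which is why the observed reliability over bounded intervals is introduced next.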

Definition 5 (Observed Reliability): Let $[T_k, T_l]$ be a time interval during which a component is executed $T_l - T_k$ times. The observed reliability of a component, $OR_c^{[T_k,T_l]}$, can be defined as

$$OR_c^{[T_k,T_l]} = 1 - \frac{\sum_{i=T_k}^{T_l} F_i^c}{T_l - T_k} \tag{3}$$

such that with enough intervals, the predicted reliability of the component in the next interval ($R_c^{[T_n,T_{n+1}]}$) can be estimated based on the observed reliabilities in prior intervals

$$R_c^{[T_n,T_{n+1}]} = f\left(OR_c^{[T_1,T_2]}, OR_c^{[T_2,T_3]}, \ldots, OR_c^{[T_{n-1},T_n]}\right) \tag{4}$$

where $f$ is a predictive model such as logistic regression, naive Bayes, or a support vector machine.

With the above definition, we estimate the reliability of each component in its next execution interval based on its reliability in the previous execution intervals. This can be achieved using a predictive model ($f$) that computes the likelihood of failure in the next interval based on the observed failure rates in the previous execution history. In this paper, we will not address the selection of function $f$. We will make the assumption that any existing predictive model that can offer an accurate enough estimate of the reliability of a component in a future time interval can be used. For the sake of simplicity, we denote $R_c^{[T_n,T_{n+1}]}$ as $R_c$ in the rest of this paper, which provides an estimate for the reliability of component $c$ in its next execution interval.
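A minimal sketch of (3) and (4) follows. The observed reliability is computed from a list of per-invocation failure indicators. Since the choice of the predictive model $f$ is left open, the sketch substitutes a plain moving average as a stand-in; this is an assumption for illustration only and is not one of the models named above. The function names are hypothetical.

```python
def observed_reliability(failures):
    """Observed reliability over one interval, in the spirit of (3).

    failures: list of 0/1 indicators, one per invocation in the interval
              (1 means the invocation failed).
    """
    if not failures:
        raise ValueError("the interval must contain at least one invocation")
    return 1 - sum(failures) / len(failures)


def predict_next(observed, window=3):
    """Stand-in for the predictive model f in (4): mean of the last
    `window` observed reliabilities. Any real predictor could replace it."""
    if not observed:
        raise ValueError("at least one prior interval is required")
    recent = observed[-window:]
    return sum(recent) / len(recent)


# Three prior intervals of four invocations each.
history = [
    observed_reliability([0, 0, 1, 0]),
    observed_reliability([0, 0, 0, 0]),
    observed_reliability([1, 0, 1, 0]),
]
R_c = predict_next(history)
```

Swapping `predict_next` for a trained model leaves the rest of the pipeline unchanged, since later definitions only consume the resulting $R_c$.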

V. PROPOSED RELIABILITY ESTIMATION MODEL

Our proposed reliability model is introduced gradually in this section by first introducing the lower and upper bound reliability for a single feature, then extending that to composed features, and further generalizing the reliability model to a complete specialization.

A. Model Premises

It is important to note the premises and assumptions behind the model that we will propose in the following sections. These premises clarify the restrictions of our model and strengthen the usability of our work by ensuring that the proposed model is appropriately deployed in practice. These premises are as follows.

1) We base our reliability model on the seminal work in [24], which assumes that the reliabilities of separate software modules are independent. Given that our software product line model uses a component-based implementation platform for its realization, the assumption regarding the separation of modules, i.e., software components, is a realistic one. Furthermore, the independence of modules (components) implies that they do not compensate for each other's failures; hence, a failure in one module (component) will not be amended in a subsequent module.

2) Based on the work in [24] and [25], we assume that the transfer of control between components is Markovian. Research has already shown that this is a realistic assumption at the macroscopic level [21]. Furthermore, we will consider that for a stable user profile and a nonevolving operational execution environment/profile, the Markovian transition probabilities will stay consistent.

3) For the calculation of reliability at any point in time, we will follow Musa's model [26], which suggests that calculation at a specific time requires the software system to be frozen and the operational profile to be stationary. This will require that many reliability calculations be performed at subsequent time intervals to paint a trustworthy picture of software reliability.


4) In light of the fact that we adopt a component-based software platform for the realization of software product lines, glueware is required to compose the components into working software. Deficiencies in glueware can impact the reliability of the model. Models have already been proposed in the literature, such as those presented in [27], that measure the reliability of component-based software with attention to glueware. Our model in this paper will assume that the employed glueware is completely reliable. The focus of this paper is the reliability estimation of variability-rich software product lines; therefore, the reliability of the glueware will not be considered.

These basic premises should be noted when considering the proposed reliability model for component-based software product lines.

B. Reliability of a Feature

As mentioned earlier, to be able to operationalize a feature, suitable components that implement the functionalities represented by that feature are either built or procured; therefore, the operationalization of a feature such as $f$ according to Definition 2 is accomplished by selecting one component $c$ from the set of available components ($c \in S(f)$) for that feature. Product line designers will need to make sure that there is at least one component per feature that provides the required implementation; otherwise, configurations cannot be operationalized. Based on this, we define the lower and upper reliability of a feature as follows:

Definition 6 (Feature Reliability Bounds): Let $f \in V$ be a feature and $S(f)$, with $|S(f)| > 0$, be the nonempty set of components that operationalize $f$. We define the feature reliability set (FRS) as the set containing the reliabilities of all components in $S(f)$

$$FRS(f) = \{R_c \mid c \in S(f)\}. \tag{5}$$

The feature reliability bounds can be defined using FRS as

$$R^L_f = \min_{R_c \in FRS(f)} R_c \tag{6}$$

$$R^U_f = \max_{R_c \in FRS(f)} R_c. \tag{7}$$

Based on this definition, the reliability bounds for a feature such as $f$ can be defined as $[R^L_f, R^U_f]$, where the lower bound is the reliability of the least reliable component implementing the feature and the upper bound is the reliability of the most reliable component for feature $f$. The rationale behind this is that a feature operationalization can be most reliable when the component with the highest reliability is selected to operationalize it and least reliable when the component with the lowest reliability is chosen.
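Definition 6 translates directly into code. In the sketch below, the function name `feature_bounds` and the representation of $S(f)$ as a dictionary from component names to predicted reliabilities $R_c$ are assumptions for illustration.

```python
def feature_bounds(component_reliability):
    """Return (R_f^L, R_f^U) for one feature.

    component_reliability: dict mapping each component in S(f) to its
                           predicted reliability R_c.
    """
    if not component_reliability:
        # Definition 6 requires S(f) to be nonempty.
        raise ValueError("a feature needs at least one component")
    frs = component_reliability.values()  # the feature reliability set FRS(f)
    return min(frs), max(frs)


# Three hypothetical components implementing the same feature.
bounds = feature_bounds({"c1": 0.7, "c2": 0.3, "c3": 0.5})
```

Here `bounds` is `(0.3, 0.7)`: the least and the most reliable component determine the interval.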

C. Reliability of Composed Features

The reliability of composed features, e.g., the reliability of two interconnected features, is dependent on the composition pattern that is used for interconnecting them. For this reason, the lower and upper reliability bounds for composed features will change as different composition patterns are used between the features. In general, two features can be composed using the two generic composition patterns introduced earlier, as shown in Fig. 3. It should be noted that the parallel pattern shown in Fig. 3(b) can have many different variations depending on the type of the chosen gateway (resulting in over 30 different patterns), but those variations in the parallel pattern will not impact reliability calculations.

Fig. 3. Composition of two features f1 and f2: (a) sequential; (b) parallel.

Now, depending on the composition pattern that is adopted, the calculation of the lower and upper bound reliability for composed features will be different. Following previously proposed component-based reliability models, we measure the reliability of two composed features, when they are interconnected using the sequential pattern shown in Fig. 3(a), as follows:

Definition 7 (Reliability of Sequential Composition): Let $f_1$ and $f_2$ be two features and $f_1 \circ f_2$ be the sequential composition of these two features; the reliability bounds of the composed features are defined as follows:

$$R^L_{(f_1 \circ f_2)} = \prod\left(R^L_{f_1}, R^L_{f_2}\right) \tag{8}$$

$$R^U_{(f_1 \circ f_2)} = \prod\left(R^U_{f_1}, R^U_{f_2}\right). \tag{9}$$

In this pattern, given that features will be executed one after the other, the overall reliability of the composition will be the product of the individual feature reliabilities. In other words, if two features $f_1$ and $f_2$ are composed using a sequential pattern, the upper reliability bound of this new composition is the product of the upper reliability bounds of both features $f_1$ and $f_2$. The same applies to the lower reliability bound of the composition of these two features. For instance, if the reliability of $f_1$ is $R_{f_1} = (0.3, 0.7)$ and the reliability of $f_2$ is $R_{f_2} = (0.4, 0.9)$, then the reliability bounds for the sequential composition of these two features would be $R_{(f_1 \circ f_2)} = (0.12, 0.63)$. We further specify the reliability of two features composed through parallel patterns, as shown in Fig. 3(b), in the following.

Definition 8 (Reliability of Parallel Composition): Let $f_1$ and $f_2$ be two features and $f_1 \bullet f_2$ be the parallel composition of these two features; the reliability bounds of the composed features are defined as follows:

$$R^L_{(f_1 \bullet f_2)} = \min\left(R^L_{f_1}, R^L_{f_2}\right) \tag{10}$$

$$R^U_{(f_1 \bullet f_2)} = \max\left(R^U_{f_1}, R^U_{f_2}\right). \tag{11}$$

The lower reliability bound of the composition, $R^L_{(f_1 \bullet f_2)}$, is due to the possibility of the gateway being an AND gateway, in which case the reliability will be equivalent to the reliability of the weakest feature. Therefore, for two features $f_1$ and $f_2$ that have been composed through a parallel composition pattern, the lower reliability bound will be equal to the lower reliability bound of the feature with the weaker lower bound. The upper bound, $R^U_{(f_1 \bullet f_2)}$, pertains to a situation where an XOR gateway is placed between the two parallel features. In this situation, the upper bound for the reliability of the composition will correspond to the reliability of the most reliable component in the composition. In other words, the upper reliability bound of the composition will be equal to the upper reliability bound of the feature with the higher upper reliability bound. For the earlier example, the reliability bounds for the parallel composition of the two features would be $R_{(f_1 \bullet f_2)} = (0.3, 0.9)$.
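Definitions 7 and 8 can be checked against the running example with a short sketch. Bounds are represented as `(lower, upper)` tuples; the function names `sequential` and `parallel` are illustrative.

```python
def sequential(b1, b2):
    """Bounds of f1 ∘ f2 per (8) and (9): multiply lower with lower,
    upper with upper."""
    return b1[0] * b2[0], b1[1] * b2[1]


def parallel(b1, b2):
    """Bounds of f1 • f2 per (10) and (11): weakest lower bound,
    strongest upper bound."""
    return min(b1[0], b2[0]), max(b1[1], b2[1])


R_f1 = (0.3, 0.7)
R_f2 = (0.4, 0.9)
seq_bounds = sequential(R_f1, R_f2)  # approximately (0.12, 0.63)
par_bounds = parallel(R_f1, R_f2)    # (0.3, 0.9)
```

Because the sequential bounds are products of values in [0, 1], they can only shrink as more features are chained, whereas the parallel bounds always coincide with the bounds of one of the composed features.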

It is now possible to calculate the reliability of several features in the presence of variability patterns using Definitions 7 and 8. We assume, based on existing product line composition work such as AHEAD [28], that all compositions will be done in the form of functional compositions on top of a base feature that we will call $B$. The calculations for variability patterns can be done as follows.

1) AFGs: The calculation of reliability for the composition of an AND feature group with $B$ is an extension of the calculations provided in Definitions 7 and 8. If we assume that $f_1, \ldots, f_n$ are the features in an AFG, the lower and upper bounds for the composition under the sequential pattern will be as follows:

$$R^L_{(B \circ AFG)} = \prod\left(R^L_B, \prod_{i=1..n} R^L_{f_i}\right) \tag{12}$$

$$R^U_{(B \circ AFG)} = \prod\left(R^U_B, \prod_{i=1..n} R^U_{f_i}\right). \tag{13}$$

The above equations show that if a group of features that must all be composed together in a final configuration is sequentially composed with the base, then the lower bound of the composition is the product of the lower bound of the base and the lower bounds of the features in the AFG, and the upper bound is the product of the corresponding upper bounds. For instance, assume that the reliability bounds of the base are $R_B = (0.8, 0.9)$ and the AFG consists of two features $f_1$ and $f_2$ with reliability bounds $(0.4, 0.6)$ and $(0.5, 0.8)$, respectively. The reliability bounds for the sequential composition of the features in this AFG with the base would be $R_{(B \circ AFG)} = (0.8 \times 0.4 \times 0.5, 0.9 \times 0.6 \times 0.8) = (0.16, 0.432)$. Similarly, for the parallel composition pattern, we define

$$R^L_{(B \bullet AFG)} = \min\left(R^L_B, \min_{i=1..n} R^L_{f_i}\right) \tag{14}$$

$$R^U_{(B \bullet AFG)} = \max\left(R^U_B, \max_{i=1..n} R^U_{f_i}\right). \tag{15}$$

Now, if the features in the AFG were composed in parallel with the base, their lower and upper reliability bounds would be computed by finding the minimum of the lower bounds and the maximum of the upper bounds. For the above example, the reliability bounds for the parallel composition of the features in this AFG with the base would be $R_{(B \bullet AFG)} = (\min(0.8, \min(0.4, 0.5)), \max(0.9, \max(0.6, 0.8))) = (0.4, 0.9)$.

We note that the reliability for a mandatory feature is a special case of the AFG where the group consists of only one feature.

2) AltFGs: The restriction that only one feature can be selected from within an AltFG impacts the reliability calculations. Unlike for an AFG, there is no need to take all the features of the group into account in the computation of the upper and lower reliability bounds. Only the most and least reliable features need to be considered. Assuming $f_1, \ldots, f_n$ are the features in an AltFG, the reliability bounds for a sequential composition of the features with the base feature $B$ are

$$R^L_{(B \circ AltFG)} = \prod\left(R^L_B, \min_{i=1..n} R^L_{f_i}\right) \tag{16}$$

$$R^U_{(B \circ AltFG)} = \prod\left(R^U_B, \max_{i=1..n} R^U_{f_i}\right). \tag{17}$$

The reliability bounds of parallel composition for AltFGs will be exactly the same as those of the parallel composition for AFGs, as shown in (14) and (15).

3) OFGs: Unlike an AltFG, an OFG can have more than one of its features selected. Therefore, multiple features might need to be considered when computing the reliability bounds for this variability pattern. Let $f_1, \ldots, f_n$ be the features in the OFG; then the reliability bounds for the sequential composition of these features along with the base feature, $B$, can be formulated as follows:

$$R^L_{(B \circ OFG)} = \prod\left(R^L_B, \prod_{i=1..n} R^L_{f_i}\right) \tag{18}$$

$$R^U_{(B \circ OFG)} = \prod\left(R^U_B, \max_{i=1..n} R^U_{f_i}\right). \tag{19}$$

The underlying reasons for defining the reliability bounds as shown in (18) and (19) can be explained as follows. The lower bound reliability for an OFG is reached when all of the features in the OFG are selected. The OFG essentially acts as an AFG under this condition; therefore, the lower bound reliability for an OFG is equivalent to the lower bound reliability for an AFG. On the other hand, the upper bound reliability for an OFG is achieved when only the one feature with the highest reliability is selected. This case is similar to the upper bound reliability of an AltFG, as shown in (17).

4) Optional Features: It is possible to eliminate an optional feature to compute the upper reliability bound and to include it to compute the lower reliability bound. While the reliability bounds for an optional feature within a parallel composition pattern can be computed through (14) and (15), the reliability bounds for the sequential composition pattern need to be defined as follows:

$$R^L_{(B \circ f)} = \prod\left(R^L_B, R^L_f\right) \tag{20}$$

$$R^U_{(B \circ f)} = R^U_B. \tag{21}$$

As indicated above, the reliability calculation includes the optional feature to find the lower reliability bound, since its inclusion can only lower that bound, and excludes it to find the upper limit of reliability.

It should be noted with respect to optional features that while such features are either selected or eliminated, they can be composed with the rest of the selected features through either sequential or parallel patterns. For instance, assume that an optional feature $f$ has been selected for inclusion in a configuration. Once it has been decided to include $f$ in the configuration, it is no longer an optional feature with regard to this specific configuration and can be viewed as a mandatory feature for the current configuration. Therefore, since a mandatory feature can be considered a special case of the AFG, as mentioned earlier, (14) and (15) can be used to estimate reliability for such a feature in a parallel composition. However, the reliability estimations for a sequential composition require different treatment, given that the optional feature could be eliminated to yield different upper and lower bound reliabilities.
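The variability-pattern rules in (12)–(21) can be sketched in Python as follows. Bounds are `(lower, upper)` tuples and a feature group is a list of such tuples; all function names are hypothetical. `group_parallel` implements (14) and (15), which the text applies to AFGs, AltFGs, and selected optional features.

```python
from math import prod


def afg_sequential(base, feats):
    """(12), (13): every feature of the AND group multiplies into both bounds."""
    return (base[0] * prod(l for l, _ in feats),
            base[1] * prod(u for _, u in feats))


def group_parallel(base, feats):
    """(14), (15): minimum of the lower bounds, maximum of the upper bounds."""
    return (min(base[0], min(l for l, _ in feats)),
            max(base[1], max(u for _, u in feats)))


def altfg_sequential(base, feats):
    """(16), (17): only the least and the most reliable alternative matter."""
    return (base[0] * min(l for l, _ in feats),
            base[1] * max(u for _, u in feats))


def ofg_sequential(base, feats):
    """(18), (19): lower bound as for an AFG, upper bound as for an AltFG."""
    return (base[0] * prod(l for l, _ in feats),
            base[1] * max(u for _, u in feats))


def optional_sequential(base, feat):
    """(20), (21): include the feature for the lower bound, drop it for the upper."""
    return base[0] * feat[0], base[1]


# Running example: base B and a group of two features.
B = (0.8, 0.9)
group = [(0.4, 0.6), (0.5, 0.8)]
afg_seq = afg_sequential(B, group)  # approximately (0.16, 0.432)
afg_par = group_parallel(B, group)  # (0.4, 0.9)
```

The first two results reproduce the worked AFG example; the remaining functions can be applied to the same inputs to compare how the group type changes the interval.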

All the above equations allow us to find the upper and lower reliability for different variability patterns if the composition pattern is known. However, as mentioned earlier, the composition patterns are not known during the configuration process and hence cannot be used for computing the reliability bounds. For this reason, to ensure that we compute the largest upper and the smallest lower reliability bounds, we need to compute the final upper and lower bounds for each variability pattern as follows:

$$R^L = \min\left(R^L_{(\circ)}, R^L_{(\bullet)}\right) \tag{22}$$

$$R^U = \max\left(R^U_{(\circ)}, R^U_{(\bullet)}\right). \tag{23}$$

These two equations show that, given that we do not know which composition pattern will be used for some observed variability pattern, we compute the reliability bounds for both composition patterns and employ the minimum of the lower bounds and the maximum of the upper bounds to represent the final lower and upper bounds, respectively.
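A small sketch of (22) and (23), applied to the AFG example values $(0.16, 0.432)$ for the sequential pattern and $(0.4, 0.9)$ for the parallel pattern. The function name `pattern_independent_bounds` is an assumption for illustration.

```python
def pattern_independent_bounds(seq_bounds, par_bounds):
    """Combine sequential and parallel bounds per (22) and (23).

    Both arguments are (lower, upper) tuples; the result is the widest
    interval that covers either composition pattern.
    """
    lower = min(seq_bounds[0], par_bounds[0])
    upper = max(seq_bounds[1], par_bounds[1])
    return lower, upper


final = pattern_independent_bounds((0.16, 0.432), (0.4, 0.9))
```

For these inputs the result is `(0.16, 0.9)`: the lower bound comes from the sequential pattern and the upper bound from the parallel pattern.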

It is now possible to further extend the proposed reliability model to unbound, semispecialized, and fully configured product line models based on the lower and upper reliability bounds that have been provided here for both variability and composition patterns.

D. Reliability of Configurations

A configuration is a fully bound product line feature model that has no variation points. In other words, decisions regarding all variation points have been made and a final set of features has been selected. For this reason, the computation of the upper and lower reliability bounds is not dependent on variability patterns and depends only on the adopted composition patterns. It is straightforward to see that the upper reliability bound is equivalent to the reliability of the most reliable feature in the configuration, reached when the features are composed through parallel composition, while the lower reliability bound is the result of composing all features through the sequential pattern.

Definition 9 (Configuration Reliability Bounds): Let $V^* = \{f_1, \ldots, f_{|V^*|}\}$ be a configuration represented as the set of features selected in a configuration process from the feature model $FM = G_{FM}(V, E)$; the lower and upper reliability bounds for $V^*$ are determined as

$$R^L_{V^*} = \prod_{i=1..|V^*|} R^L_{f_i} \tag{24}$$

$$R^U_{V^*} = \max_{i=1..|V^*|} R^U_{f_i}. \tag{25}$$

The reliability bounds for a configuration are useful estimates for determining the worst and best possible applications that can potentially be developed using the available set of features within that configuration.
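Definition 9 can be sketched as follows. The function name `configuration_bounds` is hypothetical, and the input is assumed to be a list of per-feature `(lower, upper)` bounds, for example as obtained from Definition 6.

```python
from math import prod


def configuration_bounds(feature_bounds):
    """Return (R_V*^L, R_V*^U) per (24) and (25).

    feature_bounds: list of (lower, upper) tuples, one per selected feature.
    """
    if not feature_bounds:
        raise ValueError("a configuration needs at least one feature")
    lower = prod(l for l, _ in feature_bounds)  # all features in sequence
    upper = max(u for _, u in feature_bounds)   # most reliable feature
    return lower, upper


# A configuration of three features with hypothetical bounds.
config = [(0.8, 0.9), (0.4, 0.6), (0.5, 0.8)]
cfg_bounds = configuration_bounds(config)  # approximately (0.16, 0.9)
```

Note that the lower bound shrinks with every additional feature, which is what makes large configurations difficult to guarantee under a sequential worst case.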

E. Reliability of Specializations

The decisions regarding the features to be included in or excluded from the final configuration gradually restrict the feature space and limit the set of possible applications that can eventually be developed. A restricted feature space is known as a specialization. The reliability bounds of a specialization show, at each point of the staged configuration process, the maximum and minimum possible reliability that can be guaranteed based on the current feature selections. It is important to notice that such bounds should be determined in a way that further specializations will neither increase the upper reliability bound nor decrease the lower bound.

To compute the reliability bounds of a specialization, we need to find two configurations derived from that specialization: one with the highest possible reliability, representing the upper bound, and one with the lowest reliability, expressing the lower bound. Let us first focus on the lower reliability bound by attempting to find the least reliable configuration for a given specialization. The upper bound can be attained with a similar approach.

As already discussed in the literature, the problem of finding a maximally covering configuration based on a set of selected features is in general an NP-complete problem [29]; however, elegant approaches such as conversion into integer linear programs (ILPs) have been able to provide support for the process [15], [30]. The major challenge that we face here is that, unlike related work whose objective is to maximize the inclusion of more features given an initial set of constraints, our optimization objective is to find a configuration that will minimize the lower bound reliability as computed through (24) and (25). According to (24), the objective function to be minimized consists of a complex product term ($\prod_{i=1..|V^*|} R^L_{f_i}$), which cannot be converted into a corresponding ILP representation given its nonlinearity. For this reason, an alternative approach, other than those already proposed in the literature, which assume linearity of the objective function, is needed to find the lower bound.

Given that it is not feasible to solve the nonlinear optimization problem involving the product of $|V^*|$ variables ($\min \prod_{i=1..|V^*|} R^L_{f_i} \times f_i$), we will need to find and address the roots of the problem's complexity to efficiently find a lower reliability bound. By looking at the feature relationships within the feature space, it is possible to see that the reliability bounds of any specialization can be calculated using (12)–(21) iff that specialization does not contain excludes integrity constraints. For cases that include integrity constraints, it is not possible to simply compute reliability bounds using the above equations, as the integrity constraints lead to more complex feature relationships not supported by the above equations. For instance, the selection of one feature might prevent another feature within the same specialization from appearing in a final configuration of that specialization.

To address this problem, we propose a divide-and-conquer strategy to find the lower reliability bound. In this approach, we will not try to find the configuration with the lowest reliability, as the process is computationally intractable. Instead, we will benefit from the feasibility of computing reliability for specializations without excludes integrity constraints to calculate the reliability bounds. In this vein and based on dynamic programming, we partition the input space, i.e., the specialization, in such a way that multiple subspecializations of the initial specialization are created, where none of these subspecializations contains any excludes integrity constraints. The idea is to implicitly represent the excludes integrity constraints by distributing their impact over the developed subspecializations. Once the subspecializations are developed, given that they do not contain any excludes integrity constraints, their lower bound reliability can be calculated through the earlier equations. Once the lower bound reliability of each subspecialization is known, the lower reliability bound of the input specialization is calculated by finding the least reliable subspecialization. The steps involved in this proposed process are as follows.

1) For a given specialization ($S$), develop subspecializations ($S_i$) such that:

a) no $S_i$ contains excludes integrity constraints;

b) the union of derivable applications from all of the subspecializations ($\forall S_i$) is equivalent to the applications derivable from $S$ (♣).

2) Compute the reliability of each $S_i$ using (12)–(21).

3) Find the $S_i$ with the lowest lower reliability bound and set that as $R^L_S$.

Now, the most challenging step of this process is to efficiently find the subspecializations. We develop such subspecializations through graph coloring [31]. The central point in building the subspecializations is to eliminate the excludes integrity constraints. To achieve this objective, we propose to perform the following tasks over specialization $S$.

1) We let $C = \{(f_i, f_j) \mid f_i \text{ excludes } f_j\}$ be the set of all excludes integrity constraints in specialization $S$.

2) Initialize graph $G(V, E) = (\emptyset, \emptyset)$.

3) For each $c = (f_i, f_j) \in C$ do:

a) $V = V \cup H(f_i) \cup H(f_j)$; // H() computes horizontally dependent features;

b) $E = E \cup (H(f_i) \times H(f_j))$;

4) Color graph $G$ such that no two adjacent nodes have the same color.

5) $SubS = \emptyset$.

6) For each color $co$ in graph $G$:

a) remove features in $G$ not colored with $co$ from $S$;

b) add the resulting subspecialization to $SubS$.

7) Return $SubS$.

In the above process, we first identify all of the excludes integrity constraints that are present in specialization $S$. For each feature involved in an excludes integrity constraint, we find all features that are horizontally dependent on it through the H() function as defined in [32]. Each feature, along with its horizontally dependent features, is added to a graph as nodes. Edges are added to the graph between each two features involved in an excludes integrity constraint and between their horizontally dependent features. This way, the features that cannot appear together due to integrity constraints are connected to each other through an edge. Once the graph is completed, we use a graph coloring algorithm to color the graph in such a way that no two adjacent nodes have the same color. The coloring algorithm ensures that only features that can appear together in a configuration of $S$ have the same color. Therefore, it is possible to build subspecializations from $S$ by only allowing features in $S$ that are not in $G$, or features in $G$ with the same color, to appear in the subspecialization. This process allows us to build $n$ subspecializations from $S$, where $n$ is the number of colors used for coloring graph $G$. Fig. 4 shows this process over a small model.
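The steps above can be sketched in Python. Several choices here are assumptions rather than part of the described method: the function name `build_subspecializations`, the greedy coloring heuristic (the text only requires some proper coloring), the representation of a specialization as a plain set of feature names, and the default `H`, which treats a feature as having no horizontally dependent features. The actual H() function is the one defined in [32] and would have to be supplied by the caller.

```python
def build_subspecializations(features, excludes, H=lambda f: {f}):
    """Split a specialization into subspecializations via graph coloring.

    features: set of feature names in specialization S
    excludes: iterable of (fi, fj) pairs meaning "fi excludes fj"
    H:        hypothetical callable returning the features horizontally
              dependent on a feature; the feature itself is always added
    """
    nodes, adjacent = set(), {}
    for fi, fj in excludes:
        hi, hj = set(H(fi)) | {fi}, set(H(fj)) | {fj}
        nodes |= hi | hj
        for a in hi:                      # edges between the two groups
            for b in hj:
                if a != b:
                    adjacent.setdefault(a, set()).add(b)
                    adjacent.setdefault(b, set()).add(a)

    # Greedy proper coloring: each node gets the smallest color not used
    # by an already colored neighbor (not necessarily a minimal coloring).
    color = {}
    for n in sorted(nodes):
        used = {color[m] for m in adjacent.get(n, ()) if m in color}
        c = 0
        while c in used:
            c += 1
        color[n] = c

    subs = []
    for c in sorted(set(color.values())):
        removed = {n for n in nodes if color[n] != c}
        subs.append(set(features) - removed)  # keep uncolored + color c
    return subs


# Four features where "a" excludes "b".
subs = build_subspecializations({"a", "b", "c", "d"}, [("a", "b")])
```

In this small case the result contains two subspecializations, one without `b` and one without `a`; features not involved in any excludes constraint appear in both.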

We note that the main condition for the subspecializations, as mentioned in ♣, was that all possible applications derivable from the initial specialization $S$ need to be equivalent to the union of applications derivable from all the subspecializations. It is easy to show this for our graph coloring-based method, as a feature is either not present in an excludes integrity constraint, in which case it will appear in all subspecializations, or it is guaranteed to appear in at least one of the subspecializations. The subspecialization containing such a feature will only lack features that could not have appeared with the feature in a configuration due to the excludes integrity constraints. Therefore, it will still be possible to generate all feasible applications with such a feature through the subspecializations. Hence, the condition (♣) is satisfied by our graph coloring-based method.

With the development of the subspecializations SubS for S, it is now possible to compute the lower reliability bound for each of the subspecializations and then to obtain the lower reliability bound of S as the minimum lower reliability bound over its subspecializations. The computation of the upper reliability bound for a specialization is similar, with the slight difference that the maximum over the upper reliability bounds of the subspecializations is taken.
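As a small illustration of this aggregation step, the following sketch assumes that the (lower, upper) bounds of each subspecialization have already been computed by the estimation model; the function name specialization_bounds and the numeric values in the usage example are hypothetical.

```python
def specialization_bounds(sub_bounds):
    """Aggregate per-subspecialization (lower, upper) bounds for S.

    The lower bound of S is the minimum of the lower bounds and the
    upper bound of S is the maximum of the upper bounds.
    """
    sub_bounds = list(sub_bounds)
    if not sub_bounds:
        raise ValueError("at least one subspecialization is required")
    lower = min(lo for lo, _ in sub_bounds)
    upper = max(hi for _, hi in sub_bounds)
    return lower, upper


# Illustrative values only: two subspecializations of the same S.
print(specialization_bounds([(0.6, 0.9), (0.7, 0.95)]))  # (0.6, 0.95)
```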

VI. RELIABILITY-AWARE CONFIGURATION

One of the main benefits of reliability estimation is the ability to use this information in building a final application that guarantees reliability bounds for its users. In the previous section, we presented the mechanisms to estimate the lower and upper reliability bounds for a specialization or configuration; however, the mechanisms were only able to compute the reliability bounds when the specializations or configurations in question were already developed. In other words, once a specialization or configuration is developed, our proposed reliability estimation model is able to compute the reliability bounds for it. A useful next step is to devise a configuration technique that derives a final product line configuration in light of reliability constraints, i.e., the configuration algorithm would derive final configurations that not only satisfy the functional requirements but also guarantee a minimum upper and/or lower reliability bound on the final configuration. For instance, we could specify that we need a final configuration that satisfies a set of functional requirements and whose upper reliability bound is at least 90%.

Fig. 4. Building subspecializations through graph coloring.

To be able to build configurations that satisfy a minimum on their upper and lower reliability bounds, the configuration algorithm needs to factor (24) and (25) into its computation process. The reason is that these equations allow the computation of the bounds of a configuration and are therefore essential for finding configurations that respect the requested reliability bounds. Most of the existing configuration algorithms address this problem with some form of solver, such as SAT solvers, binary decision diagrams, ILP, and others [30], which require the optimization objective function to be linear. However, as mentioned earlier, our reliability objective functions [(24) and (25)] are by no means linear and cannot be converted into a linear format as they include too many product terms. We propose two ways of solving this problem.

The first approach is to initially ignore the reliability constraints and employ a traditional configuration algorithm to derive all possible configurations that satisfy the functional constraints. Once these configurations are derived, we then use the reliability estimation equations to calculate the lower and upper reliability bounds of each of the derived configurations. This way, only those configurations that satisfy the required reliability bounds are kept and the rest are discarded. The drawback of this approach is that if too many functionally acceptable configurations are derived in the first step, computing the reliability bounds for each and every configuration can be time-consuming.
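The two steps of this first approach can be sketched as follows. This is only an illustration under stated assumptions: the brute-force enumerate_configurations function is a stand-in for a SAT-based configurator and is only feasible for very small models, is_valid is a hypothetical predicate encoding the functional constraints, and the bounds function uses a simple product of per-feature reliabilities as a placeholder for (24) and (25), which are defined earlier in the paper.

```python
from itertools import product
from math import prod


def enumerate_configurations(features, is_valid):
    """Brute-force stand-in for a SAT-based configurator (tiny models only)."""
    for bits in product([False, True], repeat=len(features)):
        config = frozenset(f for f, on in zip(features, bits) if on)
        if is_valid(config):
            yield config


def bounds(config, lower, upper):
    """Placeholder for (24) and (25): product of the per-feature bounds."""
    return (prod(lower[f] for f in config),
            prod(upper[f] for f in config))


def filter_by_reliability(configs, lower, upper, min_lower=0.0, min_upper=0.0):
    """Step 2: keep only configurations that meet the requested bounds."""
    kept = []
    for config in configs:
        lo, hi = bounds(config, lower, upper)
        if lo >= min_lower and hi >= min_upper:
            kept.append(config)
    return kept


# Hypothetical toy model: 'a' is mandatory, 'b' and 'c' exclude each other.
features = ["a", "b", "c"]
lower = {"a": 0.9, "b": 0.9, "c": 0.5}
upper = {"a": 0.99, "b": 0.95, "c": 0.9}
valid = lambda cfg: "a" in cfg and not ("b" in cfg and "c" in cfg)

candidates = enumerate_configurations(features, valid)
kept = filter_by_reliability(candidates, lower, upper, min_lower=0.8)
```

In this toy model, the configurations {a} and {a, b} are kept, while {a, c} is discarded because its placeholder lower bound falls below 0.8.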

The second approach is to use metaheuristic search methods [33] to find the appropriate configurations by considering the functional and reliability constraints simultaneously. Genetic algorithms [34] are among the metaheuristic search methods that have already been used for configuring product line feature models [35]. Their goal is to optimize an objective function through several iterations over a randomly generated population of potential solutions. The advantage of genetic algorithms is that they are not restricted to certain forms of objective functions; therefore, our nonlinear reliability model can be used as a part of the objective function. In short, the genetic algorithm first generates a set of random configurations that satisfy the functional requirements. The reliability bounds of these configurations are computed; those with the most suitable reliability bounds are kept for the next iteration of the process and the rest of the configurations are discarded. In the next iteration, new configurations that are valid with regard to the functional requirements are generated from the previous configuration pool through mutation and crossover operations and added to the existing configuration pool. Again, the reliability bounds of the configurations are calculated and the process is repeated until the optimal configuration is found or a predefined number of iterations has been exhausted. The disadvantage of this approach is that it does not necessarily guarantee that the optimal configuration with regard to the objective function is derived.
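A minimal sketch of such a genetic search is given below. It is not the algorithm evaluated in this paper: the function genetic_configure, the is_valid predicate, the uniform crossover and bit-flip mutation operators, the keep-the-fitter-half selection rule, and the use of a simple product of lower bounds as a placeholder for (24) are all assumptions made for illustration. The sketch also assumes a population of at least two and that valid configurations are common enough for rejection sampling to terminate quickly.

```python
import random
from math import prod


def lower_bound(config, lower):
    """Placeholder for (24): product of the selected features' lower bounds."""
    return prod(lower[f] for f in config)


def genetic_configure(features, is_valid, lower, pop_size=20, generations=50,
                      crossover_rate=0.5, mutation_rate=0.05, seed=0):
    rng = random.Random(seed)

    def random_valid():
        # Rejection sampling; assumes valid configurations are not rare.
        while True:
            config = frozenset(f for f in features if rng.random() < 0.5)
            if is_valid(config):
                return config

    def crossover(p1, p2):
        # Uniform crossover: take each feature decision from p1 or p2.
        return frozenset(
            f for f in features
            if (f in p1 if rng.random() < crossover_rate else f in p2))

    def mutate(config):
        # Flip each feature decision with probability mutation_rate.
        return frozenset(
            f for f in features
            if (f in config) != (rng.random() < mutation_rate))

    population = [random_valid() for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the configurations with the best bounds, discard the rest.
        population.sort(key=lambda c: lower_bound(c, lower), reverse=True)
        survivors = population[:max(2, pop_size // 2)]
        children = []
        while len(survivors) + len(children) < pop_size:
            child = mutate(crossover(*rng.sample(survivors, 2)))
            # Invalid offspring are replaced by a copy of a survivor.
            children.append(child if is_valid(child) else rng.choice(survivors))
        population = survivors + children
    return max(population, key=lambda c: lower_bound(c, lower))
```

Because the fittest half of the pool is always carried over, the best lower bound in the pool never decreases between iterations; as noted above, however, nothing guarantees that the returned configuration is the global optimum.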

These two approaches for reliability-aware configuration are schematically shown in Fig. 5. The main advantage of such a process is that its outcome is a configuration whose operationalization necessarily guarantees lower and upper reliability bounds, which is ideal for deriving applications with strict reliability requirements from a software product line.

Let us now analyze the complexity of these two approaches. For the first approach, the process consists of initially performing a configuration process based on the functional requirements, which is performed using a variation of a SAT solver. Afterward, the reliability of each individual resulting configuration is calculated. Let us assume that the product line has n features, the configuration process is represented by �n, the number of valid configurations derived from the configuration process is p, and the process for calculating the reliability bounds of each individual configuration is ϕ. The complexity of the first approach would be

Fig. 5. Schematic overview of the two reliability-aware configuration approaches.

O(�n) + O(p × ϕ). (26)

We note that while the problem addressed by �n is NP-complete, useful heuristics have been developed that perform the optimization process in acceptable time; however, this has limitations as the size of the model, n, increases. The complexity of computing the reliability bounds of each configuration is constant time, as observed in (24) and (25). Therefore, the computational complexity of the first approach is dominated by O(�n), which is NP-complete in general but can be approximated more efficiently for smaller models.

Now, the complexity of the second approach is dependent on several parameters within the genetic algorithm. In essence, this approach evaluates the reliability of each configuration in the configuration pool and repeats this process for a number of generations until a good-enough solution is found. Assuming that the genetic algorithm is run for k iterations and there are m configurations within its solution population, the complexity of this approach is

O(k × m × ϕ). (27)

Based on the fact that the complexity of O(ϕ) is constant time, the complexity of the second approach is dependent on the population size and the number of iterations; hence, O(k × m). The comparison of the complexity of the two approaches is based on the relationship between O(�n) and O(k × m). It seems that, given the heuristics available for improving the performance of �n, the use of the first approach would be a better choice in product lines with a smaller number of features. However, as the number of features in the product line increases, the heuristics of the first approach could become less efficient and hence the employment of the second approach could be more reasonable. We will also empirically evaluate this later in this paper.

VII. RELIABILITY MODEL BEHAVIOR

In this section, we report on some of our observations regarding the proposed reliability model. The observations will be provided in the following three categories:

1) computational time for calculating reliability of product line specializations;

2) impact of variability patterns on the overall reliability bounds;

3) execution time of reliability-aware configuration under different conditions.

To perform the experiments, we have employed the SPLOT tool [36] for developing the required feature models with various feature sizes and integrity constraints (CTC). The rest of the parameters in SPLOT were set to default values. For each specification, 10 random models with that specification were generated and the average of the observation results over these 10 models was used for reporting. For instance, for a feature model specification with 1000 features and 5% CTC, 10 feature models satisfying these two conditions were generated and the reported observations are based on the average over these 10 different models.

A. Calculation of Specialization Reliability Bounds

One of the important aspects of the reliability calculation for a software product line is the time required to perform the calculation of the lower and upper reliability bounds of a specialization. This is mainly due to the various unbound variation points and the integrity constraints that can be present in a specialization. We already discussed our proposed approach for calculating the reliability bounds of a specialization in Section V-E. The main component of this process is to break a specialization down into multiple individual specializations in such a way that none of these resulting specializations, called subspecializations, contains excludes integrity constraints. For this purpose, we employ graph coloring to build the subspecializations.

Fig. 6. Time for calculating specialization reliability bounds.

To observe the execution time involved in computing the upper and lower reliability bounds, we randomly generated feature models with sizes of 1000–5000 features. For each feature model size, we developed feature models with a CTC of 5%–20%. Furthermore, each feature in the feature model was assigned random lower and upper bounds in (0, 1] in such a way that the lower bound was always less than or equal to the upper bound. We then used our proposed approach to compute the reliability bounds for these generated models. The results of our observations are reported in Fig. 6.
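One simple way to assign such bounds is sketched below. The paper does not state how the random values were drawn, so the uniform sampling and the function name random_bounds are assumptions made for illustration only.

```python
import random


def random_bounds(features, seed=None):
    """Assign each feature a random (lower, upper) pair with lower <= upper.

    Both values lie in (0, 1]: random() returns a value in [0, 1),
    so 1 - random() lies in (0, 1].
    """
    rng = random.Random(seed)
    bounds = {}
    for f in features:
        x, y = 1.0 - rng.random(), 1.0 - rng.random()
        bounds[f] = (min(x, y), max(x, y))
    return bounds


# Hypothetical usage for a model with 1000 features named f0 ... f999.
bounds = random_bounds([f"f{i}" for i in range(1000)], seed=42)
```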

As observed in this figure, the time required for computing the reliability of a specialization with 1000 features is around 2 s, which is quite reasonable. We should note that as the number of features and integrity constraints increases, the computation time also increases; however, the increase in time is at most linear in this case. Based on the results, we draw the following conclusions: 1) the trend in the calculation time for specializations with a low number of integrity constraints, i.e., CTC = 5%, shows that the increase in the number of features in specializations with small CTCs does not have a significant negative impact on the calculation time and 2) the near-linear increase of computation time in specializations with CTC = 20% could be an indication of the impact of the number of integrity constraints on the computation time. In other words, fewer integrity constraints result in faster reliability calculation on specializations. This can be explained by considering the details of our specialization reliability model. The most computationally expensive component of our reliability calculation model is the graph coloring algorithm. The graph that needs to be colored consists only of features that are involved in integrity constraints; therefore, the more integrity constraints there are, the larger the graph will be and hence the longer it will take to color it efficiently. However, if a specialization does not have any integrity constraints, then the graph will be empty and no coloring will be required.

It should be noted that while a large number of integrity constraints could considerably increase the computation time, in reality the number of integrity constraints is often small. For instance, on average the models in the SPLOT repository with more than 100 features only have a CTC of 8%. Considering this, the reliability bounds of a specialization with 5000 features and a CTC value of 8% can be calculated in less than 8 s. For the largest model available in the SPLOT repository, with 290 features and a CTC value of 11%, it only takes 1493 ms (∼1.5 s) to compute the reliability bounds.

B. Impact of Variability Patterns on Reliability Bounds

Besides the importance of estimating lower and upper reliability bounds, which gives developers the ability to effectively incorporate reliability concerns into their development process, it is also important to understand how it would be possible to improve the reliability bounds of a software product line once they are known. As mentioned earlier, our proposed reliability model is based on the reliability bounds of the components that implement each individual feature within the feature space. Therefore, an obvious strategy for improving the reliability bounds is to employ more reliable components for the product line features. However, for reasons such as limited availability of resources, it is not always possible to improve the reliability of all components. Hence, the main research question that we would like to consider is as follows: for software product line engineers faced with limited resources, improving the reliability of the features of which variation pattern would result in a more significant increase in the overall: 1) lower bound and 2) upper bound model reliability? The answer to this question would help software product line designers decide where to invest their limited resources to get the most reliability gains.

To address this question, we developed three categories of models:

1) models based on the OFG pattern;

2) models based on the AltFG pattern;

3) models based on the AFG pattern.

For each of these categories, we generated random feature models with 1000 features. Within each category, four types of models were generated:

1) models with 20% of their features allocated to the category pattern;

2) models with 30% of their features allocated to the category pattern;

3) models with 40% of their features allocated to the category pattern;

4) models with 50% of their features allocated to the category pattern.

For instance, for the AFG category, we generated four types of models, with 20%, 30%, 40%, and 50% of their features belonging to AFGs, respectively.

Once the feature models were generated, we increased the lower and upper reliability bounds of the features within the variation pattern of that category by 5%, 10%, 15%, and 20%, respectively. For instance, for the AltFG category, only the reliability bounds of features within AltFGs were increased. We computed the overall reliability bounds of the models prior to and after the change and measured the differences. In other words, we considered the change of the reliability bounds after the increase to be an indication of the impact of the changes on the reliability of the model. The results are reported in Figs. 7 and 8 for lower and upper reliability bounds, respectively.

Fig. 7. Impact of variability patterns on lower reliability bounds. (a) 5% reliability change. (b) 10% reliability change. (c) 15% reliability change. (d) 20% reliability change.

Fig. 8. Impact of variability patterns on upper reliability bounds. (a) CTC = 5%. (b) CTC = 10%. (c) CTC = 15%. (d) CTC = 20%.

The four diagrams in Figs. 7 and 8 show the results of change in reliability bounds based on different amounts of reliability change in the features. For instance, Fig. 7(a) shows the amount of overall lower reliability bound change when the reliability of features in different variation patterns is increased by 5%. It is possible to make two observations from Fig. 7. First, as the percentage of features in the variation pattern of each category increases (from 20% to 50%), the impact of that variation pattern becomes more predominant in the overall lower reliability bound. This is a direct result of the reliability model, as it is the reliability of the individual features within the variation patterns that shapes the overall reliability. Second, the same increase in the lower bound reliability of AltFG features has a more significant impact on the overall lower bound reliability of the model as compared with AFGs and OFGs. In other words, to increase the overall lower reliability bound of a software product line, it would be reasonable to invest in increasing the lower bound reliability of components engaged in the implementation of features in AltFGs, as they have a higher impact on the overall lower reliability bound compared with the features in the other variation patterns.

The other observation is with regard to the change in the upper reliability bound. As observed in Fig. 8, the reliability change pattern for all three variation patterns is quite similar; therefore, it is not possible to make a general recommendation on this issue. To answer the initial question, one could conclude that the best strategy to increase the overall lower reliability bound of a model is to increase the lower reliability bound of components that implement the features present in AltFGs. This could be achieved by replacing those components with more reliable components or by further testing and debugging the existing components. Our observations show that reliability bounds can be improved by between 30% and 85% of the initial reliability change, depending on the variation pattern distribution within the product line model.

C. Reliability-Aware Configuration Execution Time

An important benefit of being able to estimate product line reliability is that it enables developers to build configurations that guarantee desirable reliability bounds. While this is a highly desirable feature, our initial theoretical analysis revealed that the computation of an optimal configuration that respects both functional and reliability requirements can become computationally intractable. In light of this, the goal is to build configuration models that can find optimal configurations in a reasonable amount of time. We have proposed two approaches, as shown in Fig. 5, that attempt to address this challenge.

To empirically evaluate the execution time of these two approaches, we randomly generated feature models with sizes from 1000 to 10 000 features with randomly distributed reliability values. These models were then used to perform the reliability-aware configuration process. For Approach 1, the technique presented in [15] was employed for satisfying the functional requirements. For Approach 2, we have made observations using two variations of this approach: in configuration 1, we set the GA parameters as follows: GA population: 5% of the total number of features, crossover: 0.5, mutation: 0.001, and 500 epochs, while in configuration 2, we employ the same setting as the first configuration with the slight difference that the GA population is set to 10% of the total number of features. In addition, in all three cases, we define the functional and reliability requirements randomly as well.

Fig. 9 shows the configuration execution time for Approach 1 and both variations of Approach 2. Fig. 9(a)–(d) represents the configuration execution time for models with different numbers of integrity constraints. Our initial theoretical analysis of these configuration approaches led us to believe that Approach 1 is likely to work quite well with smaller models given its use of customized heuristics, while Approach 2 would be more efficient in larger models. Our empirical observations confirm our theoretical analysis. As can be observed in the figure, Approach 1 outperforms both variations of Approach 2 in terms of execution time in models that have up to 3000 features. For models with over 4000 features, however, Approach 1 becomes very slow compared with the two variations of Approach 2. It is worth mentioning that even for models smaller than 4000 features, Approach 1 only slightly outperforms the variations of Approach 2. Therefore, we believe that Approach 2 is generally a better approach for performing reliability-aware configuration.

We would like to point out that we have only made observations for two variants of Approach 2; therefore, there might be other variations of Approach 2 that could even outperform Approach 1 in smaller models, which is not the case at the moment. We conclude that if the use case for the configuration process is highly sensitive to execution time and the models are generally smaller than 3000 features, then the use of Approach 1 is recommended. Otherwise, the second approach, especially the two variations shown above, works quite well in general for the reliability-aware configuration of both small and large models.

D. Summary

We can briefly provide the following digest of our observations.

1) The execution time for computing the reliability bounds of a specialization is primarily dependent on the number of integrity constraints present. An increase in the number of features only has minor effects on the execution time.

2) To improve the overall reliability bounds, it was observed that increasing the lower reliability bound of features present in AltFGs results in a higher lower bound reliability improvement. However, no such observation could be made for upper reliability bounds.

3) Observations showed that our first proposed configuration process performs reasonably well on models with fewer than 3000 features, while the second proposed approach performs better on larger models but can generally be considered a reasonable configuration approach for both small and large models.

VIII. DISCUSSION

In this section, we discuss the ideas presented in this paper and how they could potentially be extended in the future. The first issue relates to the separation of variability representation and execution model within software product line artifacts, especially feature models. In other words, variability models, e.g., feature models, are mainly concerned with the proper presentation of functional variations and do not provide composition semantics for connecting the features. Therefore, in our work, we have considered two main composition patterns and built the details of our reliability model based on these two generic patterns. In this way, we consider best and worst case scenarios using these two composition patterns to identify the upper and lower reliability bounds. However, if the exact composition patterns to be used with the features were known in advance, it would be possible to compute the reliability bounds more accurately. There have been some proposals to connect variability models with models that provide composition and execution details, e.g., mapping techniques that connect features to tasks in business process models [37]. For such cases, it would be possible to employ our reliability model to compute finer grained reliability bounds by knowing the exact composition patterns that will be used for the operationalization of a configuration. Given that the exact composition patterns between the selected features will be known, there is no need to compute reliability bounds based on best and worst case scenarios; they can instead be calculated based on the actual possibilities.

Fig. 9. Comparison of the execution time of the proposed configuration approaches. (a) CTC = 5%. (b) CTC = 10%. (c) CTC = 15%. (d) CTC = 20%.

The second point that deserves some discussion is related to the reliability-aware configuration process. In this paper, we have assumed that a configuration is built based on a set of functional requirements and strict reliability bounds. While it is ideal to be able to build configurations that simultaneously satisfy both functional and reliability constraints, one can envision many cases where all constraints cannot be satisfied at the same time. For instance, the selection of a feature that would satisfy a functional requirement could result in low reliability for the overall configuration. For this reason, it seems that in some cases the configuration process can turn into decision making with regard to tradeoffs between functionality and reliability. Our reliability model not only provides the means to develop configurations with specific reliability requirements, but also allows the estimation of reliability bounds for specializations and hence supports a staged configuration process through which different alternatives can be explored. Application engineers can benefit from this to make informed decisions at each point regarding the tradeoff between a functional requirement and the reliability constraints. Some of our own prior work on dynamically building decision models based on gamble queries can be customized for this purpose to support the process of tradeoff decision making [32], [38].

The final point pertains to the relation between reliability models and other types of system quality characteristics. According to ISO/IEC 9126 [39], reliability is only one of the six dimensions of quality that need to be addressed within a software system; however, traditionally reliability models have received more attention due to their importance in safety-critical software applications. One of the main points of interest is to explore the applicability of reliability models to other quality characteristics. For instance, would it be possible to employ reliability models to build predictive models for maintainability or usability? To explore the possibility of such an extension, the basic properties of reliability models need to be explored. For instance, many reliability models are based on independence and nonadditivity assumptions. Therefore, a framework describing the properties of different quality characteristics would enable us to understand how different characteristics relate to each other and whether a comprehensive quality model could be built to uniformly address all quality characteristics. There has already been some work on quality characteristics in software product lines, including our own work on feature model metrics and the way they can be used for predicting maintainability [40]. The properties of such models in comparison with this paper on reliability could enable us to handle all quality characteristics as first class citizens within the configuration process.

IX. RELATED WORK

To the authors’ knowledge, there have been no works specifically dedicated to the area of reliability estimation for software product lines. Therefore, we will resort to reviewing the work that is closely related to the topics covered in this paper. Capturing and handling nonfunctional properties has attracted growing attention recently [41]. We have previously proposed a classification framework for categorizing the work on nonfunctional properties in the software product line engineering paradigm [42]. One of the earlier works in this area is the extended feature modeling notation proposed in [43]. Within this model, features are annotated with quality attributes, e.g., availability, cost, latency, and bandwidth requirement. In addition, for each attribute the domain of possible values is defined, which can be discrete or continuous. For instance, the power line feature can be annotated with deployment time and cost, specifying how this feature performs with respect to these two quality attributes. For higher level features, e.g., Internet connection, the feature’s quality attribute is the summation of its children’s associated quality attributes. Our proposed approach in the current paper is in essence similar to that work, as we compute the overall reliability as a function of the reliability of the features. The difference between our model and the work by Benavides et al. [43] is that we do not require explicit annotation and that our aggregation model is specific to reliability estimation. More recently, Roos–Frantz et al. [44] have built upon extended feature models and proposed OVMQ+φ, which adds quality information to orthogonal variability models [1]. The authors show how their work can be automatically translated into CSP but do not provide empirical analysis to show its practicality.

Zhang et al. [45] have developed a Bayesian belief network (BBN)-based approach to predict and assess the quality of software product lines. In this approach, a BBN is used to capture and identify the impact of variants on the set of quality attributes. The developed model can be used during the application engineering phase for assessing the quality of product line configurations by conducting qualitative analysis. Using the BBN, an application engineer can understand how selecting one feature can impact a specific quality attribute and so make more informed decisions in the configuration stage. One of the main challenges with that work is that the BBN needs to be derived separately for each target domain. The other challenge is that the belief network is developed based on subjective information from the developers and domain experts. To address this challenge, in our previous work, we have developed a set of measure-theoretic metrics for measuring feature model structural characteristics [40]. We have been able to systematically show that these metrics are correlated with three main nonfunctional requirements, i.e., analyzability, changeability, and understandability. The identified correlation allows the development of statistical machine learning models to predict these nonfunctional properties by just measuring structural metrics, avoiding subjectivity. In this paper, we also develop models for predicting one of the most important nonfunctional properties, reliability, by building a reliability model based on quantifiable feature reliabilities, again avoiding subjectivity.

Closer to our approach in this paper are the two recent works in [46] and [47]. In their work, the authors have focused on predicting the impact of feature selection on nonfunctional properties. They have proposed computational models for measuring the impact of feature selection and hence provide quantitative information regarding the impact of feature selection decisions on nonfunctional properties. Unlike our paper, which is developed for reliability estimation, the work by Siegmund et al. has mainly focused on measurable quality characteristics such as main-memory consumption and footprint, which essentially possess different properties as compared with reliability. For instance, the nonadditivity of reliability and the independence assumptions do not necessarily apply to the work presented in [47]. The authors propose to use a CSP solver for finding an optimal configuration with regard to nonfunctional properties, which is similar to our first reliability-aware configuration model. However, unlike our work, the authors do not provide an account of the execution time, performance, and applicability of their proposed configuration process.

In other work on nonfunctional properties for software product lines, Soltani et al. [48] introduce a framework for facilitating automated product derivation based on both functional and nonfunctional requirements. To address this problem, their work employs an artificial intelligence planning technique to automatically perform feature selection while satisfying both the functional and nonfunctional concerns of the stakeholders as well as their preferences and constraints. In this approach, the configuration problem is reduced to a hierarchical task network (HTN) planning problem [49]. The feature model and the stakeholders’ preferences, which are represented using conditional statements, are transformed into a planning problem, and the optimal product is derived using an HTN-based planning technique. This approach does not provide explicit techniques for the aggregation or computation of the nonfunctional properties of composed features. Furthermore, its main limitation is its restriction to feature models with a size of around 200 features, given the limitations of the HTN planners that are available. An interesting aspect of this work is that the conditional effects of nonfunctional properties are properly captured. It would be interesting to explore the concept of conditional reliability, where the lower and upper bounds of a configuration would depend not only on the individual features that are included in that configuration but also on the interaction between the features themselves.

Furthermore, some researchers have proposed the idea of prioritizing features based on their quality attributes during the domain engineering phase in a way that would result in higher satisfaction of the stakeholders and product designers [50]. The idea of prioritizing features based on quality attributes is interesting, but it poses new challenges, as feature priorities change based on the target stakeholders and the context in which the product line is being specialized and configured; therefore, priority values for features could be different under different circumstances. The stratified analytic hierarchy process (S-AHP) [51] and conditional S-AHP [52] models have been proposed to allow for the prioritization of software features based on stakeholder feedback and extracted quality attributes. In these frameworks, quality attributes are modeled as concerns, and the enumeration of concerns is shown using fuzzy variables called qualifier tags.

Work on software product line customization, and on ensuring the safe transition from the requirements engineering phase to the product development phase within software product lines, is abundant in the literature. Interested readers are encouraged to see [41] and [42] for more details.

X. CONCLUSION

Software product lines provide the means for systematic software reuse and the rapid development of new software applications for a domain of interest using existing core artifacts. Feature-oriented software development is one of the popular approaches within this area that views independent and complementary domain functionality as features that can be implemented and composed to realize a domain application. The feature space for a domain is often specified through models such as feature models. Given the focus of features on functional aspects, addressing nonfunctional requirements becomes a challenge within this area. Existing work by the community, including our own previous work, has addressed some aspects of nonfunctional properties and business concerns such as maintainability, implementation costs, performance, and speed. However, while software reliability, a significant and important nonfunctional property, has been the focus of much attention within the software community, it has not yet been directly addressed in the software product line paradigm.

In this paper, we have proposed a component-based software product line reliability estimation model that builds on the reliability of each individual feature and provides an estimate of the lower and upper reliability bounds. The estimation model is not only able to provide reliability bounds for a product line feature model, but is also able to provide reliability bounds for feature model specializations and fully configured applications. The basis of the reliability estimation model is the consideration of variability and composition patterns and how they can be least and most effectively integrated to result in lower and upper reliability bounds. In light of the fact that software product lines can be the potential source for the development of many different domain applications, such a reliability estimation model can provide lower bound reliability guarantees, especially for safety-critical applications, which have strict reliability requirements.
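The idea of deriving lower and upper bounds from individual feature reliabilities can be sketched as follows. This is a deliberately simplified illustration and not the estimation model of this paper: the `Feature` class, the example tree, and all values are hypothetical, and the sketch assumes that features fail independently, that every selected feature must work, and that the only variability patterns are mandatory, optional, and alternative (exactly one chosen) children.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Feature:
    name: str
    reliability: float  # reliability of the feature's own component
    mandatory: List["Feature"] = field(default_factory=list)
    optional: List["Feature"] = field(default_factory=list)
    alternatives: List["Feature"] = field(default_factory=list)  # exactly one is chosen


def bounds(f: Feature) -> Tuple[float, float]:
    """Return (lower, upper) reliability bounds over all valid configurations."""
    lo = hi = f.reliability
    for child in f.mandatory:
        c_lo, c_hi = bounds(child)
        lo *= c_lo
        hi *= c_hi
    for child in f.optional:
        # Worst case: the optional feature is selected (reliability <= 1
        # can only lower the product). Best case: it is left out.
        lo *= bounds(child)[0]
    if f.alternatives:
        alt = [bounds(c) for c in f.alternatives]
        lo *= min(b[0] for b in alt)  # least reliable alternative
        hi *= max(b[1] for b in alt)  # most reliable alternative
    return lo, hi


# Hypothetical feature model used only for illustration.
storage = Feature("storage", 0.98,
                  alternatives=[Feature("sql", 0.97), Feature("nosql", 0.95)])
root = Feature("app", 0.99, mandatory=[storage],
               optional=[Feature("cache", 0.90)])

lo, hi = bounds(root)
print(f"lower bound: {lo:.4f}, upper bound: {hi:.4f}")
```

Under these assumptions, any valid configuration of the hypothetical model has a reliability between the two printed values, which is the kind of guarantee that matters when a minimum reliability requirement has to hold for every derivable product.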

We have also proposed reliability-aware software product line configuration methods that consider both functional and reliability constraints during the configuration process. We offer two concrete configuration models based on:

1) SAT-based solvers;
2) genetic algorithms.

The two models were compared both theoretically and empirically to show their pros and cons under different conditions. It was observed that the SAT-based configuration model is most suitable for small-size models, while the genetic algorithm configuration process performs quite well over a spectrum of models, especially larger ones. The results of other practical observations of our reliability estimation model have been reported in this paper, offering insight into the possibilities provided by our work.

REFERENCES

[1] K. Pohl, G. Böckle, and F. J. van der Linden, Software Product Line Engineering: Foundations, Principles, and Techniques. New York, NY, USA: Springer-Verlag, 2005.

[2] I. B. Joseph Pine and S. Davis, Mass Customization: The New Frontier in Business Competition. Boston, MA, USA: Harvard Business Press, 1999.

[3] P. Clements and L. Northrop, Software Product Lines. Boston, MA, USA: Addison-Wesley, 2002.

[4] C. W. Krueger, “New methods in software product line practice,” Commun. ACM, vol. 49, no. 12, pp. 37–40, 2006.

[5] C. Kästner and S. Apel, “Feature-oriented software development,” in Generative and Transformational Techniques in Software Engineering IV. New York, NY, USA: Springer-Verlag, 2013, pp. 346–382.

[6] S. Thaker, D. Batory, D. Kitchin, and W. Cook, “Safe composition of product lines,” in Proc. 6th Int. Conf. Generat. Program. Compon. Eng., 2007, pp. 95–104.

[7] K. Lee, K. C. Kang, and J. Lee, “Concepts and guidelines of feature modeling for product line software engineering,” in Software Reuse: Methods, Techniques, and Tools. New York, NY, USA: Springer-Verlag, 2002, pp. 62–77.

[8] V. Sugumaran, S. Park, and K. C. Kang, “Software product line engineering,” Commun. ACM, vol. 49, no. 12, pp. 28–32, 2006.

[9] J. Bosch, G. Florijn, D. Greefhorst, J. Kuusela, J. H. Obbink, and K. Pohl, “Variability issues in software product lines,” in Software Product-Family Engineering. New York, NY, USA: Springer-Verlag, 2002, pp. 13–21.

[10] J. van Gurp, J. Bosch, and M. Svahnberg, “On the notion of variability in software product lines,” in Proc. Work. IEEE/IFIP Conf. Softw. Archit., Aug. 2001, pp. 45–54.

[11] K. Pohl and A. Metzger, “Software product line testing,” Commun. ACM, vol. 49, no. 12, pp. 78–81, 2006.

[12] G. Perrouin, S. Sen, J. Klein, B. Baudry, and Y. Le Traon, “Automated and scalable T-wise test case generation strategies for software product lines,” in Proc. 3rd ICST, 2010, pp. 459–468.

[13] E. Bagheri, F. Ensan, and D. Gasevic, “Grammar-based test generation for software product line feature models,” in Proc. Conf. Center Adv. Studies Collaborat. Res., 2012, pp. 87–101.

[14] F. Ensan, E. Bagheri, and D. Gaševic, “Evolutionary search-based test generation for software product line feature models,” in Advanced Information Systems Engineering. New York, NY, USA: Springer-Verlag, 2012, pp. 613–628.

[15] E. Bagheri, T. D. Noia, D. Gasevic, and A. Ragone, “Formalizing interactive staged feature model configuration,” J. Softw., Evol. Process, vol. 24, no. 4, pp. 375–400, 2012.

[16] M. R. Lyu, Handbook of Software Reliability Engineering, vol. 3. Los Alamitos, CA, USA: IEEE Computer Society Press, 1996.

[17] M. R. Lyu, “Software reliability engineering: A roadmap,” in Proc. Future Softw. Eng., 2007, pp. 153–170.

[18] C. Smidts, M. Stutzke, and R. W. Stoddard, “Software reliability models: An approach to early reliability prediction,” IEEE Trans. Rel., vol. 47, no. 3, pp. 268–278, Sep. 1998.

[19] K. Czarnecki, S. Helsen, and U. Eisenecker, “Staged configuration using feature models,” in Software Product Lines. New York, NY, USA: Springer-Verlag, 2004, pp. 266–283.

[20] W. M. van Der Aalst, A. H. Ter Hofstede, B. Kiepuszewski, and A. P. Barros, “Workflow patterns,” Distrib. Parallel Databases, vol. 14, no. 1, pp. 5–51, 2003.

[21] R. H. Reussner, H. W. Schmidt, and I. H. Poernomo, “Reliability prediction for component-based software architectures,” J. Syst. Softw., vol. 66, no. 3, pp. 241–252, 2003.

[22] S. Yacoub, B. Cukic, and H. H. Ammar, “A scenario-based reliability analysis approach for component-based software,” IEEE Trans. Rel., vol. 53, no. 4, pp. 465–480, Dec. 2004.

[23] V. V. Petrov, Limit Theorems of Probability Theory. Oxford, U.K.: Oxford Science Publications, 1995.

[24] R. C. Cheung, “A user-oriented software reliability model,” IEEE Trans. Softw. Eng., vol. SE-6, no. 2, pp. 118–125, Mar. 1980.

[25] J. Rajgopal and M. Mazumdar, “Modular operational test plans for inferences on software reliability based on a Markov model,” IEEE Trans. Softw. Eng., vol. 28, no. 4, pp. 358–363, Apr. 2002.

[26] J. D. Musa, “A theory of software reliability and its application,” IEEE Trans. Softw. Eng., vol. 1, no. 3, pp. 312–327, Sep. 1975.

[27] I. Crnkovic and M. P. H. Larsson, Building Reliable Component-Based Software Systems. Norwood, MA, USA: Artech House, 2002.

[28] D. Batory, “Feature-oriented programming and the AHEAD tool suite,” in Proc. 26th Int. Conf. Softw. Eng., 2004, pp. 702–703.

[29] M. Mendonca, A. Wasowski, and K. Czarnecki, “SAT-based analysis of feature models is easy,” in Proc. 13th Int. Softw. Product Line Conf., 2009, pp. 231–240.

[30] D. Benavides, S. Segura, and A. Ruiz-Cortés, “Automated analysis of feature models 20 years later: A literature review,” Inform. Syst., vol. 35, no. 6, pp. 615–636, 2010.

[31] T. R. Jensen and B. Toft, Graph Coloring Problems. Hoboken, NJ, USA: Wiley, 2011.

[32] E. Bagheri and F. Ensan, “Dynamic decision models for staged software product line configuration,” Require. Eng. J., vol. 19, no. 2, pp. 187–212, 2014.

[33] M. Harman and B. F. Jones, “Search-based software engineering,” Inform. Softw. Technol., vol. 43, no. 14, pp. 833–839, 2001.

[34] D. E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning. Boston, MA, USA: Addison-Wesley, 1989.

[35] J. Guo, J. White, G. Wang, J. Li, and Y. Wang, “A genetic algorithm for optimized feature selection with resource constraints in software product lines,” J. Syst. Softw., vol. 84, no. 12, pp. 2208–2221, 2011.

[36] M. Mendonca, M. Branco, and D. Cowan, “SPLOT: Software product lines online tools,” in Proc. 24th ACM SIGPLAN Conf. Companion Object Oriented Program. Syst. Lang. Appl., 2009, pp. 761–762.

[37] B. Mohabbati, D. Gaševic, M. Hatala, M. Asadi, E. Bagheri, and M. Boškovic, “A quality aggregation model for service-oriented software product lines based on variability and composition patterns,” in Service-Oriented Computing. New York, NY, USA: Springer-Verlag, 2011, pp. 436–451.

[38] M. Makki, E. Bagheri, and A. A. Ghorbani, “Automating architecture trade-off decision making through a complex multi-attribute decision process,” in Software Architecture. New York, NY, USA: Springer-Verlag, 2008, pp. 264–272.

[39] H. Al-Kilidar, K. Cox, and B. Kitchenham, “The use and usefulness of the ISO/IEC 9126 quality standard,” in Proc. Int. Symp. Empirical Softw. Eng., 2005, pp. 1–7.

[40] E. Bagheri and D. Gasevic, “Assessing the maintainability of software product line feature models using structural metrics,” Softw. Quality J., vol. 19, no. 3, pp. 579–612, 2011.

[41] S. Montagud, S. Abrahão, and E. Insfrán, “A systematic review of quality attributes and measures for software product lines,” Softw. Quality J., vol. 20, nos. 3–4, pp. 425–486, 2012.

[42] M. Noorian, E. Bagheri, and W. Du, “Non-functional properties in software product lines: A taxonomy for classification,” in Proc. SEKE, 2012, pp. 663–667.

[43] D. Benavides, P. Trinidad, and A. Ruiz-Cortés, “Automated reasoning on feature models,” in Advanced Information Systems Engineering. New York, NY, USA: Springer-Verlag, 2005, pp. 491–503.

[44] F. Roos-Frantz, D. Benavides, A. Ruiz-Cortés, A. Heuer, and K. Lauenroth, “Quality-aware analysis in product line engineering with the orthogonal variability model,” Softw. Quality J., vol. 20, nos. 3–4, pp. 519–565, 2012.

[45] H. Zhang, S. Jarzabek, and B. Yang, “Quality prediction and assessment for product lines,” in Advanced Information Systems Engineering. New York, NY, USA: Springer-Verlag, 2003, pp. 681–695.

[46] N. Siegmund, M. Rosenmüller, M. Kuhlemann, C. Kästner, S. Apel, and G. Saake, “SPL conqueror: Toward optimization of non-functional properties in software product lines,” Softw. Quality J., vol. 20, nos. 3–4, pp. 487–517, 2012.

[47] N. Siegmund, M. Rosenmüller, C. Kästner, P. G. Giarrusso, S. Apel, and S. S. Kolesnikov, “Scalable prediction of non-functional properties in software product lines: Footprint and memory consumption,” Inform. Softw. Technol., vol. 55, no. 3, pp. 491–507, 2013.

[48] S. Soltani, M. Asadi, D. Gasevic, M. Hatala, and E. Bagheri, “Automated planning for feature model configuration based on functional and non-functional requirements,” in Proc. 16th Int. SPLC, vol. 1, 2012, pp. 56–65.

[49] K. Erol, Hierarchical Task Network Planning: Formalization, Analysis, and Implementation. College Park, MD, USA: Univ. Maryland at College Park, 1996.

[50] K. Schmid, “A comprehensive product line scoping approach and its validation,” in Proc. 24th Int. Conf. Softw. Eng., 2002, pp. 593–603.

[51] E. Bagheri, M. Asadi, D. Gasevic, and S. Soltani, “Stratified analytic hierarchy process: Prioritization and selection of software features,” in Proc. 14th Int. SPLC, 2010, pp. 300–315.

[52] I. Ognjanovic, D. Gasevic, and E. Bagheri, “A stratified framework for handling conditional preferences: An extension of the analytic hierarchy process,” Expert Syst. Appl., vol. 40, no. 4, pp. 1094–1115, 2013.

Ebrahim Bagheri (M’12–SM’12) is an Assistant Professor and the Director of the Laboratory for Systems, Software and Semantics (LS3) at Ryerson University, Toronto, ON, Canada, and has been active in the areas of the Semantic Web and software engineering. His work on semantic-driven information extraction has resulted in two provisionally patented technologies, namely Denote and Derive. Denote is a semantic annotation platform based on Linked Open Data and Derive is an extensible architecture for unsupervised knowledge extraction and object (concept and property-value pair) population from the Web. He has been involved in projects that encompass the use of Semantic Web technologies in the areas of e-commerce and business process modeling funded by NSERC, AIF, and IBM.

Prof. Bagheri is an IBM Faculty Fellow.

Faezeh Ensan received the bachelor’s degree in computer engineering from the Faculty of Engineering, University of Tehran, Tehran, Iran, in 2004, the master’s degree in engineering from Ferdowsi University of Mashhad, Mashhad, Iran, and the Ph.D. degree in computer science from the Faculty of Computer Science, University of New Brunswick, NB, Canada.

She is a Natural Sciences and Engineering Research Council Industrial Postdoctoral Research and Development Fellow. She previously held a Postdoctoral Fellowship with the University of British Columbia, Vancouver, BC, Canada. Her research is focused on aspects of knowledge engineering, conceptual modeling, ontology development, modular ontologies, and belief revision.