
A Scalable Approach to Exact Model and Commonality Counting for Extended Feature Models

David Fernandez-Amoros, Ruben Heradio, Jose A. Cerrada, and Carlos Cerrada

Abstract—A software product line is an engineering approach to the efficient development of software product portfolios. Key to the success of the approach is to identify the common and variable features of the products and the interdependencies between them, which are usually modeled using feature models. Implicitly, such models also include valuable information that can be used by economic models to estimate the payoffs of a product line. Unfortunately, as product lines grow, analyzing large feature models manually becomes impracticable. This paper proposes an algorithm to compute the total number of products that a feature model represents and, for each feature, the number of products that implement it. The inference of both parameters is helpful to describe the standardization/parameterization balance of a product line, detect scope flaws, assess the product line's incremental development, and improve the accuracy of economic models. The paper reports experimental evidence that our algorithm has better runtime performance than existing alternative approaches.

Index Terms—Feature models, formal methods, economic models, software product lines

    1 INTRODUCTION

A software product line (SPL) [1], [2] is an engineering approach to the efficient development of whole portfolios of software products. The basis of the approach is that products are built from a core asset base (CAB): a collection of artifacts that have been designed specifically for use across the portfolio. To account for differences among the products, some adaptations of the core assets are usually required. These adaptations should both be planned before development and made easy for the product developers to use. In SPLs with large numbers of products and core assets, as well as requirements to make fine-grained adjustments, managing variability can become problematic very quickly. Mismanagement may result in adding unnecessary variability, implementing variation mechanisms more than once, selecting incompatible or awkward variation mechanisms, and missing required variations. As the product line grows and evolves, the need for variability increases, and managing the variability grows increasingly difficult [3].

To support the variability management of a SPL, the product similarities and differences are usually modeled by a feature model (FM) [4]. Implicitly, FMs include valuable information that can be used in economic models to estimate the payoffs of a SPL. This paper proposes an algorithm to compute the total number of products that a FM represents and, for each feature, the number of products that implement it. As will be shown, the inference of both magnitudes is helpful to describe the standardization/parameterization balance of a SPL, detect scope flaws, assess the SPL's incremental development, and improve the accuracy of economic models.

Several techniques have been proposed to compute the total number of products of a FM. They follow the general schema of translating the FM into a Boolean logic formula F, which is then processed using an off-the-shelf tool t, such as a SAT-solver or a Binary Decision Diagram (BDD) engine. Moreover, the number of products that implement a feature f can be computed by calling t with F ∧ f as input [5]. Unfortunately, this approach is computationally very expensive, since it requires as many calls to t as features the FM has. The approach proposed in this paper has better runtime performance because it requires just one call to our algorithm to compute both the total number of products of a FM and, for all the features, the number of products that implement them.

Research and industry have developed different FM languages. Despite the apparent language heterogeneity, Schobbens et al. [6], [7] have formally demonstrated that most FM notations can be generalized thanks to the group cardinality construct¹ and the use of textual constraints (i.e., additional constraints among features specified in propositional logic).

D. Fernandez-Amoros is with the Department of Languages and Computer Systems, Spanish Open University (UNED), Madrid, Spain. E-mail: [email protected].

R. Heradio, J.A. Cerrada, and C. Cerrada are with the Department of Software Engineering and Computer Systems, Spanish Open University (UNED), Madrid, Spain. E-mail: {rheradio, jcerrada, ccerrada}@issi.uned.es.

Manuscript received 8 Sept. 2011; revised 28 May 2014; accepted 11 June 2014. Date of publication 15 June 2014; date of current version 18 Sept. 2014. Recommended for acceptance by P. Heymans. Digital Object Identifier no. 10.1109/TSE.2014.2331073

1. Group cardinality should not be confused with the concept of feature cardinality proposed by Czarnecki et al. [8], [9]; whereas the former describes, for a group of features, how many of them have to be included in a product, the latter specifies how many instances of a particular feature have to be included in a product. Our algorithm just supports group cardinality.


To make our approach valid for most FM languages, it deals with both group cardinality and textual constraints.

    The target audience of this paper is:

1) SPL tool developers, who are interested in improving their support to decision makers. For instance, Gears² is a prominent software tool to develop SPLs. It includes a statistics reporting tool that collects data from a SPL and then formulates vital statistics to report on the state, growth, and health of the SPL. One of those statistics is an upper approximation of the total number of products modeled by a FM, which does not take into account textual constraints.

2) Researchers in the field of automated analysis of FMs. Benavides et al. [10] have surveyed 53 papers focused on computing 30 analyses from FMs. Our proposal supports the computation of 8 of them: (i) dead features (those that, because of their excluding dependencies on other features, cannot be included in any product), (ii) void models (FMs that are inconsistent and thus do not model any valid product), (iii) the total number of products modeled by a FM, (iv) feature commonality (defined in Section 3), (v) core features (those that are part of all the products), (vi) variant features (those that do not appear in all the products), (vii) SPL homogeneity (defined in Section 3), and (viii) variability factor (the ratio between the number of products and 2^n, where n is the number of features).

3) Researchers in SPL economic models, who are interested in the information that can be retrieved automatically from a FM to improve their estimations.

4) Domain analysts, who are interested in scoping a SPL adequately and estimating its payoffs.

The remainder of the paper is structured as follows: Section 2 introduces FMs. Section 3 motivates our work. Section 4 reviews in detail several approaches to the product and commonality counting problem. Section 5 presents our algorithm. Next, Section 6 describes the computational complexity of our approach and experimentally compares its runtime performance with several other algorithms. Finally, some concluding remarks are provided in Section 7.

    2 FEATURE MODELS

Since the first FM notation was proposed by the FODA methodology in 1990 [4], a number of extensions and alternative languages have been devised to model variability in families of related systems. Fortunately, Schobbens et al. [6], [7] have demonstrated that most of the FM notations can be easily and efficiently translated into a pivot language called Varied Feature Diagram (VFD). Schobbens et al. also showed that the automated analysis of a VFD model can be optimized whenever the model is structured as a tree. Since VFD models are, in general, single-rooted directed acyclic graphs (DAGs), we restrict the input of our algorithm to a VFD subset named neutral feature tree (NFT) [11], where models are required to be trees.

From now on, the hypothetical FM in Fig. 1 will be used as a running example. It may be part of a larger model about mobile phones. A NFT model is a hierarchically arranged set of features depicted by a tree. Each non-leaf feature has a group cardinality label [low..high] that indicates that every product includes at least low and at most high of its children. For instance, feature bluetooth is labeled with [2..3]. So, any product that includes bluetooth has to include one of the following combinations of features: {headset, hands free}, {headset, remote control}, {hands free, remote control} or {headset, hands free, remote control}. In addition, the scope of a FM can be narrowed by adding textual constraints written in propositional logic. For instance, Fig. 1 has two textual constraints: 802.11n → HSDPA and 802.11n → HSDPU. Although a product with features {connectivity, bluetooth, headset, remote control, wifi, 802.11n} would satisfy the tree structure, it is not valid because it violates the textual constraints.
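To make the [low..high] semantics concrete, the following Python sketch (ours, not part of the paper's algorithm) enumerates the child combinations a group cardinality label admits; running it for bluetooth reproduces the four combinations listed above.

    from itertools import combinations

    def admissible_combinations(children, low, high):
        # Every subset of children whose size lies within [low..high]
        for k in range(low, high + 1):
            for combo in combinations(children, k):
                yield set(combo)

    # bluetooth is labeled [2..3] over its three children:
    for combo in admissible_combinations(
            ["headset", "hands free", "remote control"], 2, 3):
        print(combo)  # prints the four admissible combinations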

It should be noted that NFT and VFD are fully equivalent, i.e., any VFD model can be converted into a NFT one by using the following transformation: whenever a VFD model has a feature f with parents f₁, f₂, …, fₙ, its NFT representation just keeps the edge f₁ → f and encodes the remaining ones (f₂ → f, …, fₙ → f) as textual constraints (see Section 6.2.1 for an explanation of how to encode edges in propositional logic). Hence NFT is a generalization of most FM notations that supports:

1) The group cardinality label, which generalizes the FODA cardinalities (i.e., mandatory [1..1], optional [0..1], and [n..n], or [1..n], and xor [1..1]). For simplicity, the term group cardinality will be reserved for those cases not covered by standard FODA.

2) The textual constraints, which can be any propositional logic formula and thus generalize the require and exclude dependencies proposed by Pohl [1].

Be warned that, depending on the context, the software characteristics in a SPL are often called features or nodes (when referring to the FM tree structure) or variables (for the textual constraints, or if the FM has been translated to Boolean logic).

    3 MOTIVATION

From here on, the following notational conventions will be used:

(i) F, (ii) P and (iii) Pf denote respectively the sets of (i) features, (ii) products and (iii) products that include a particular feature f.

#S is the cardinal of a set S. So, #P is the total number of products represented by a FM and #Pf is the number of products that implement a feature f. In the literature, #Pf is often used in relative terms under the name of commonality [12], [13]; i.e., the commonality of a feature f is #Pf/#P.

    Fig. 1. NFT representation of a FM for a mobile phone product line.

    2. http://www.biglever.com/.


From an input FM written in NFT, our algorithm computes #P and #Pf. This section summarizes the usefulness of #P and #Pf to properly scope a SPL and estimate its payoffs.

3.1 Descriptive Statistics to Account for the Standardization/Parameterization Balance of a FM

Domain analysis consists largely in exploring the decisions in a domain to determine which features should be common to all products and which ones variable. As Cleaveland points out in [14], this determination is not a scientific process of discovery but one of design and engineering, and it involves trade-offs among many objectives. Assigning high commonality to features means standardizing, which increases the common code, reduces the family size and simplifies the SPL architecture. Assigning low commonality to features means parameterizing, which reduces the common code, increases the family size and increases the SPL complexity. Domain analysis attempts to achieve a good balance between these two objectives.³

Since a FM models the common and variable features in a SPL, it also models the standardization and parametrization levels. However, getting such levels directly from the FM is not trivial. A naive approach to calculate them might be comparing the number of terminal features⁴ that are mandatory and variable (i.e., #features with cardinality label [n..n] versus #features with any other cardinality label). For instance, let us consider the electronic shopping FM⁵, proposed in [15] to model a SPL of ecommerce systems. It has 192 terminal features: 10 are mandatory and 182 are variable. As 94.79 percent of the features are variable, we might conclude that the SPL is highly skewed to parametrization. However, this is not the case. The causes of such an erroneous conclusion are: (i) the FM has 21 textual constraints that have been ignored, and (ii) using just a binary distinction between mandatory and variable is too simplistic.

The commonality computation proposed in this paper supports getting a reliable quantification of the FM standardization/parametrization balance, since (i) it takes into account textual constraints, and (ii) it measures commonality using a continuous range that goes from 0 (i.e., dead features) to 1 (i.e., features common to all products). Furthermore, it also supports the usage of descriptive statistics to analyze FMs. For instance, Fig. 2a is a histogram⁶ that represents how commonality is distributed in the electronic shopping example. The histogram, as opposed to the naive approach results, shows that the standardization/parametrization levels are quite balanced: 42 terminal features have commonality between [0.4-0.5), 98 between [0.5-0.6), and 19 between [0.6-0.7), i.e., 82.8 percent of the features have commonality between [0.4-0.7).

In the SPL literature, we can also find FMs in which commonality is not distributed around a middle value. For example, Figs. 2b and 2c show the commonality histograms of the FMs Graph Product Line [16] and Home Integration System (HIS) [17], which are skewed to parametrization and standardization, respectively.

The Structured Intuitive Model for Product Line Economics (SIMPLE) [18], [19] defines a measure, named homogeneity, that characterizes how similar the products are. Its aim is detecting domains where products are prohibitively dissimilar from each other and thus the SPL approach does not pay off. The metric varies from 0 to 1, where 0 indicates that the products are all totally unique and 1 indicates that there is only one product. SIMPLE proposes Equation (1) to compute homogeneity, where #UF is the number of unique features (i.e., features implemented by only one product). Unfortunately, this measure may produce erroneous results in some scenarios. For example, consider a SPL with 200 products and 50 features, where every feature is included in just two products; although the SPL is clearly quite heterogeneous, Equation (1) says that the SPL is totally homogeneous (i.e., Homogeneity = 1 − 0/50 = 1). Alternatively, SIMPLE proposes the more reliable Equation (2), which requires knowing #Pf and #P. Using it for the former scenario, we check that the SPL is certainly heterogeneous (i.e., Homogeneity = (Σ_{i=1}^{50} 2/200)/50 = 0.01):

    Fig. 2. Examples of commonality histograms.

3. This also applies when feature binding is delayed to runtime and, instead of a family of programs, there is a single program whose features can be changed during its execution. In such a case, a high parametrization level means high runtime configurability.

4. Since the configuration of non-terminal features is entirely determined by the configuration of the terminal ones, they do not influence the number of products and as such can be ignored for model counting.

5. All the FMs referred to in this paper are freely available at the SPLOT website: http://www.splot-research.org/.

6. The left-end inclusion convention has been adopted in the histograms in Fig. 2, which stipulates that a class interval contains its left-end but not its right-end boundary point.


Homogeneity = 1 − #UF/#F    (1)

Homogeneity = (Σ_{f∈F} #Pf/#P) / #F    (2)
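As a minimal sketch of both metrics, the following code (function names are ours) evaluates Equations (1) and (2) on the hypothetical 200-product, 50-feature scenario discussed above:

    def homogeneity_eq1(num_unique_features, num_features):
        # Equation (1): 1 - #UF/#F
        return 1 - num_unique_features / num_features

    def homogeneity_eq2(products_per_feature, num_products):
        # Equation (2): the mean of #Pf/#P over all features
        commonalities = [pf / num_products for pf in products_per_feature]
        return sum(commonalities) / len(commonalities)

    # Every one of the 50 features is implemented by exactly 2 of the 200
    # products, so no feature is unique (#UF = 0):
    print(homogeneity_eq1(0, 50))          # 1.0  -> wrongly "totally homogeneous"
    print(homogeneity_eq2([2] * 50, 200))  # 0.01 -> correctly heterogeneous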

Looking carefully at Equation (2), we realize that, in fact, SIMPLE computes homogeneity as the commonality mean (i.e., (Σ_{f∈F} #Pf/#P)/#F = (Σ_{f∈F} Commonality(f))/#F = mean commonality).

Thanks to our proposal, and developing such an idea, other measures of central tendency, such as the commonality median and mode, can be used to estimate the SPL homogeneity. Moreover, dispersion statistics, such as the commonality standard deviation, can be used to understand the standardization/parametrization balance.

    3.2 Detection of Problematic Features

According to Czarnecki and Eisenecker [20], feature modeling helps to avoid two serious problems: (i) relevant features and variation points are not included in the reusable software, and (ii) many features and variation points are included but never used and thus cause unnecessary complexity and increase development and maintenance costs. Commonality helps to identify features with low reusability by looking at #Pf in relative terms. For instance, in the electronic shopping example there is a feature named enable profile update on checkout with #Pf = 3.93 × 10^48. It seems a core feature that will be reused throughout most of the domain products. Nevertheless, computing its commonality (3.93 × 10^48 / 2.26 × 10^49 ≈ 0.17), we realize its reusability is actually low.

In addition, it is possible to detect scope flaws in the early stages of the SPL development by enriching #Pf with economic information. Economic models proposed in [21], [22], [23], [24], [25] use two abstractions, called the Relative Cost of Reuse (RCR) and the Relative Cost of Writing for Reuse (RCWR). RCR represents the proportion of the effort that it takes to reuse software compared to the cost normally incurred to develop it for one-time use. For instance, a feature has RCR = 0.2 if it can be reused for only 20 percent of the cost of implementing it. Of course, we need to pay that extra 20 percent every time we reuse it. On the other hand, RCWR represents the proportion of the effort that it takes to develop reusable software compared to the cost of writing it for one-time use. For instance, if it costs an additional 50 percent effort to create a feature for reuse (i.e., it is necessary to have a more generic design, additional documentation, etc.), then RCWR = 1.5. Note that RCWR is always greater than or equal to 1. Poulin [24] defines a metric called payoff threshold, which shows how many times a feature has to be reused before the investment made to develop the feature is recovered. The payoff threshold of a feature f is calculated by Equation (3). Therefore, a feature f causes a scope problem whenever Inequality (4) is satisfied:

Payoff Threshold(f) = RCWR(f) / (1 − RCR(f))    (3)

#Pf < Payoff Threshold(f)    (4)
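A small sketch of Equations (3) and (4), reusing the illustrative RCR/RCWR figures from the paragraph above (the function names are ours):

    def payoff_threshold(rcwr, rcr):
        # Equation (3): RCWR(f) / (1 - RCR(f))
        return rcwr / (1 - rcr)

    def causes_scope_problem(num_products_with_f, rcwr, rcr):
        # Inequality (4): f is implemented by fewer products than its threshold
        return num_products_with_f < payoff_threshold(rcwr, rcr)

    # RCWR = 1.5 (50% extra to write for reuse), RCR = 0.2 (20% per reuse):
    print(payoff_threshold(1.5, 0.2))         # 1.875 -> pays off from 2 reuses on
    print(causes_scope_problem(1, 1.5, 0.2))  # True: a feature in 1 product is a flaw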

    3.3 Assessment for SPL Incremental Development

Many managers favor an incremental approach to product line adoption, one that first tackles areas of highest and most readily available commonality, earning payback early in the adoption cycle. Under this approach, the organization plans from the beginning to develop a SPL. It develops part of the core asset base, including the architecture, and then builds one or more products. In the next increment, it develops a portion of the rest of the CAB and builds additional products. Over time, it evolves more of the CAB in parallel with new product development. In order to quantify the reuse improvement achieved in each development increment, Cohen [26] proposes a measure called degree of reuse (DOR), which is the portion of a complete product that is made by reusing the CAB; e.g., a DOR of 0.25 means that the core assets are used in the development of 25 percent of the software of a typical product. Although Cohen just expresses DOR in English, such a concept can be formalized with Equation (5), which requires knowing #Pf, where:

(1) Size(f) is the size of the software that implements the feature f (a number of techniques to estimate software size are presented in [27]).

(2) The denominator is the size of all the software encompassed by the SPL, i.e., the size of all the products. Such size is calculated indirectly by multiplying the size of the software that implements every feature (Size(f)) by the number of times that software is reused (#Pf).

(3) The numerator is the size of all the software that is made by reusing core assets:

DOR = (Σ_{f∈CAB} Size(f) · #Pf) / (Σ_{f∈F} Size(f) · #Pf)    (5)
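The following sketch evaluates Equation (5) on made-up inputs (the feature names and numbers are purely illustrative; Size(f) would come from an estimation technique such as those in [27]):

    def degree_of_reuse(size, products_per_feature, cab):
        # size and products_per_feature are dicts indexed by feature;
        # cab is the set of features implemented by core assets.
        total = sum(size[f] * products_per_feature[f] for f in size)
        reused = sum(size[f] * products_per_feature[f] for f in cab)
        return reused / total  # Equation (5)

    size = {"a": 100, "b": 40, "c": 60}  # hypothetical feature sizes
    pf   = {"a": 8,   "b": 4,  "c": 2}   # hypothetical #Pf values
    print(degree_of_reuse(size, pf, cab={"a"}))  # 800/1080 = 0.74...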

    3.4 Improving the Accuracy of ROI Estimations

The computation of #Pf may also support increasing the accuracy of existing economic models for SPLs, such as the COnstructive Product Line Investment MOdel (COPLIMO) [21], [28], which estimates the SPL Return On Investment (ROI) by analogy. That is, COPLIMO starts by estimating the development costs of a particular product, i.e., a domain stereotype. Then, supposing that all products in the scope of a SPL are quite similar, it extrapolates the costs of the stereotype to compare the costs of building all the products under a SPL approach versus building them one by one. Unfortunately, whenever the products are not extremely homogeneous, COPLIMO's simplifying assumptions produce problematic distortions in the estimates. As Appendix B, which can be found on the Computer Society Digital Library at http://doi.ieeecomputersociety.org/10.1109/TSE.2014.2331073, shows, such assumptions are unnecessary when #Pf is known. In that case, ROI can be more accurately estimated by composition, i.e., estimating the costs for each feature and then calculating the costs of each product by adding the costs of the features that it includes.


4 RELATED WORK ON PRODUCT COUNTING

There are two main approaches to perform automatic analysis of FMs:

1) A purely logic approach, where the whole FM is translated to Boolean logic and then some standard procedure is applied, such as a SAT solver or a Constraint Satisfaction Problem (CSP) solver.

A SAT solver is a program that tries to determine if a Boolean formula is satisfiable. A #SAT model counter is a program that tries to determine how many models (i.e., how many satisfying assignments) a formula has.⁷ While the SAT problem is known to be NP-complete [29], it is widely believed that the #SAT problem is even harder [30]. Section 6 experimentally compares our approach versus two prominent #SAT counters: relsat [31] and cachet⁸ [33], [34]. It is worth mentioning that although [35] claims that SAT-based analysis of FMs is easy, #SAT-based analysis of FMs is much harder, and commonality-based analysis of FMs forcing features one by one is highly impractical, as will be shown in the experimental evaluation section.

CSP solvers are also commonly used for FM analysis [36], [37]. The CSP problem involves a set of variables over a domain and a set of constraints over these variables. CSP solvers work by searching the solution space, performing constraint satisfaction and backtracking. There is no advantage to using a CSP solver for a #SAT problem, since the CSP techniques restricted to the Boolean case are equivalent to those in SAT, and there is an obvious performance overhead for CSP solvers. For that reason they will not be considered for comparison purposes.

2) A hybrid approach, which manipulates the textual constraints and then runs some ad-hoc treatment for the tree structure of the FM. These systems rely on the tree structure for good performance, so they are not usually designed to gracefully handle the textual constraints. Some preprocessing may help, though. Czarnecki and Wasowski [38] propose to follow the path of translation in the opposite direction, that is, to integrate the textual constraints into the tree. The algorithm is not always able to integrate all the constraints, but a partial result may be of help to a hybrid system in some cases. Another remedy is proposed in [39], where it is proven that all textual constraints can be eliminated at the price of building a new set of features. The approach is theoretical, but the bases for an implementation are laid out. In [13], Mendonca presents a system named Feature Model Reasoning System (FMRS) that depends on a reasoning engine to solve the textual constraints while another module works on the FM. There are several similarities between our proposal and FMRS, since we also apply a reasoning engine and separately process the feature tree. There are also several differences. Our prototype, which we have named treecount, uses DPLL [40] as a reasoning engine, while FMRS relies on a general constraint solver. FMRS performs constraint propagation inside the tree and also relies on a system of saving and restoring states to help with the backtracking. Our treatment of the FM is more akin to a tree traversal, whilst an efficient cache system exploits the bigger size of the model w.r.t. the textual constraints to avoid repeating the same computations over and over, as will be explained in the next section. We also perform Unit Propagation (UP⁹) [30] over a translation of the FM, so that treecount can finish the search faster. Also, FMRS does not support commonality calculations. The analysis tool in the SPLOT [41] portal uses Reduced Ordered Binary Decision Diagrams (ROBDDs) [42], which are just a special case of Binary Decision Diagrams, to compile information about the FM. These decision diagrams hinge on a fixed order in which the variables can be inspected. The BDD represents both the tree and the textual constraints, following an ordering of the variables that has been previously determined by a complex heuristic [43]. Ordering heuristics can take remarkably long, and it is known that finding an ordering that produces an optimal BDD (i.e., a BDD with the minimal number of nodes) is an NP-complete problem [44]. The BDD is then traversed to compute the number of products. As far as efficiency is concerned, Mendonca et al. make a big selling point of the space gains of the BDD versus DPLL. However, DPLL being a backtracking search, it only keeps in memory the current stack of search nodes. The main reason for the efficiency of BDD counting is not only that exploration starts from the true node (and thus avoids exploring unsatisfiable branches) but also, and importantly so, the use of dynamic programming, which exploits the sharing of nodes, so that the search paths are not explored separately. As far as commonality is concerned, SPLOT does not support its computation, although it is certainly possible to adapt the traversal algorithm in [42] to do so.

In previous work [11], we presented an algorithm to calculate #P and #Pf. It is noteworthy that the algorithm is always exponential in the number of clauses of the textual constraints, not just in the worst case. So, the treatment of the textual constraints renders the algorithm impractical for all but laboratory-sized FMs.

7. A simple SAT solver can be easily modified to act as a model counter, and even as an explicit model generator. Nevertheless, as SAT solver techniques grow increasingly specialized, they become useless for these other problems, which demand their own techniques. For instance, it is now customary for solvers to implement timed restarts; if no answer to the SAT problem is found, then the search is interrupted and continued elsewhere. For satisfiable cases, it suffices to find one model, so the technique seems to speed up the process. However, it does not carry over to efficient counting or enumerating of the models.

8. There is also a variation of cachet called sharpSAT [32], but since the changes affect only cache management and not the search process, it will not be addressed in this paper.

9. UP can be applied to an unsatisfied clause in which there is exactly one unassigned literal, to derive a satisfying assignment for the variable involved. For instance, if there is a partial truth assignment a where x = true and y = false, and a clause ¬x ∨ y ∨ ¬z, then UP applied to the clause yields z = false, thus satisfying the clause. UP is a relatively inexpensive way of saving branching decisions during DPLL search, and so both techniques are often employed together.


As far as the strengths and weaknesses of the approaches are concerned, component-based model counters such as cachet and relsat are expected to be very efficient when the textual constraints present independent clauses (clauses with no variables in common), since DPLL can efficiently cut the translated formula into many different connected components, and very inefficient when the FM heavily exploits proper group cardinality, since the translation to Boolean logic suffers from a combinatorial explosion. This may also be true for SPLOT, since these independent clauses should produce an additive-like growth in the number of nodes rather than an exponential one. The concept of branch width [45], introduced by Bacchus et al., accurately captures this behavior, since they proved that there exists a model counting algorithm which runs in time exponential in the branch width. Comparatively, the rest of the hybrid approaches (FMRS and treecount) are expected to perform poorly in relative terms when textual constraints are mostly independent (i.e., the formula has a low branch width). In our case, it is because the DPLL search will produce the same subsequences of branching variables over and over. The same problem is expected to happen to FMRS, since enumerating the solution sets can effectively cancel out the benefits of a compact BDD.

As a quick recap, there are several working approaches to perform model counting, but commonality computing has been more of a theoretical issue so far. Any model counting scheme can be used to compute commonalities by forcing the presence of a particular feature. This can be an effective approach for a single feature, but when all the individual commonalities are needed, this naive approach can impose a considerable overhead.

    5 FAST COMMONALITY COUNTING

This section presents our algorithm, treecount, which takes a FM as input and computes the number of products and the commonality of each feature as output. A pseudocode description of the algorithm is included in Appendix A, available in the online supplemental material.

    5.1 Solving the Textual Constraints

The first step of our algorithm is solving the textual constraints of the FM, which must be written in Conjunctive Normal Form (CNF¹⁰), that is, a conjunction of disjunctions of literals (a literal is a proposition or its negation). Such disjunctions are called clauses, so a CNF formula is a conjunction of clauses. We use a variation of the standard DPLL procedure, as depicted in Algorithm 1 (Appendix A, available in the online supplemental material). The core of our contribution is the function computeProducts, which is described in Table 1 (Appendix A, available in the online supplemental material) and presented in a top-down fashion throughout this section. DPLL has been thoroughly researched by the SAT-solver and model-counting community [31], [33], [34], [47], [48], [49], [50], [51], so we have adopted it as is. Our heuristic for choosing the next variable to be conditioned (i.e., to continue the backtracking) is to take the feature appearing most often in the residual formula, with a preference for smaller clauses to break ties. This is known as the MOMS heuristic (Maximum Occurrences in clauses of Minimal Size) [52], and it has proven to be very effective among the different alternatives [53]. It is noteworthy that slight changes in this heuristic may dramatically affect the number of solution sets produced. Of course, efficiency-wise, the fewer solution sets, the better. Unlike SPLOT, which first computes the whole BDD, we build the solution sets iteratively.

Both cachet and relsat also employ a typical SAT solver technique called conflict-driven clause learning. Whenever a conflict is detected, and before backtracking is performed, a new clause is added to the original formula. Since UP is not resolution-complete, it is possible that a series of decisions about branching variables produces a conflict much later in the DPLL search. This series of decisions may be repeated over and over during the search. To avoid this, a new clause that captures the essence of the conflict is added to the original formula, so that if the series of decisions that led to the conflict is taken again, the conflict is immediately discovered. We have decided not to include clause learning in our algorithm proposal, due to the fact that in real FMs (as opposed to randomly generated ones) there are hardly any conflicts. The reason is that the tree structure of a FM is always satisfiable, so any conflict must come from the textual constraints, of which there are usually too few to cause it. This is in stark contrast with SAT-solver test-beds, which are often orders of magnitude larger than FMs, and where conflicts are commonplace.

First we will see how to compute #P, to show some common problems and their possible solutions. Let us consider the example in Fig. 1. The two textual constraints translate into CNF as (¬802.11n ∨ HSDPA) ∧ (¬802.11n ∨ HSDPU). DPLL is used to build a set of solutions. Since 802.11n is the most common feature in the textual constraints, it is chosen to start the backtracking. We assign 802.11n the value false. Now, since both constraints are satisfied, we backtrack and flip the value of 802.11n, which is now true. Then, by UP in the first clause, HSDPA must be true, and by UP in the second clause, HSDPU must also be true. All the clauses are satisfied. So, the solution sets for this example are {¬802.11n} and {802.11n, HSDPA, HSDPU}. To summarize the solution sets, the following notation is used: a literal with a negation sign in front of it represents a feature whose truth value is false, the ones without it are assigned true, and the rest of the features, not shown, are unassigned. The unassigned value means that either true or false complies with the constraints.
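The following didactic sketch mimics this process: a tiny DPLL with unit propagation that enumerates disjoint solution sets of a CNF formula. It is our illustration of the idea, not the paper's treecount implementation (no MOMS heuristic, no caching); literals are encoded as signed integers.

    def unit_propagate(clauses, assignment):
        # Repeatedly satisfy clauses with exactly one unassigned literal left
        changed = True
        while changed:
            changed = False
            for clause in clauses:
                if any(l in assignment for l in clause):
                    continue  # clause already satisfied
                open_lits = [l for l in clause
                             if l not in assignment and -l not in assignment]
                if not open_lits:
                    return None  # every literal falsified: conflict
                if len(open_lits) == 1:
                    assignment = assignment | {open_lits[0]}
                    changed = True
        return assignment

    def solution_sets(clauses, assignment=frozenset()):
        assignment = unit_propagate(clauses, assignment)
        if assignment is None:
            return
        open_clauses = [c for c in clauses
                        if not any(l in assignment for l in c)]
        if not open_clauses:
            yield assignment  # a (partial) solution set
            return
        var = abs(next(l for c in open_clauses for l in c
                       if l not in assignment and -l not in assignment))
        yield from solution_sets(clauses, assignment | {-var})
        yield from solution_sets(clauses, assignment | {var})

    # (not 802.11n or HSDPA) and (not 802.11n or HSDPU),
    # with 1 = 802.11n, 2 = HSDPA, 3 = HSDPU:
    for s in solution_sets([(-1, 2), (-1, 3)]):
        print(sorted(s))  # [-1] and [1, 2, 3], as in the text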

    5.2 Classifying the Nodes

The structure of the tree plus a particular solution set may impose that a particular node (i) be present in all the corresponding products (present), (ii) be absent from all the products (absent), (iii) be in some products and not in others (potential), or (iv) be and not be in all the products at the same time (contradicting). The type of a node can be defined in terms of two other concepts: selected and deselected. A node will be selected if it is necessary that the node be present in all the products to comply with the solution set plus the textual constraints, and it will be deselected if it is necessary that the node be absent from all the products. Combining these values, all four possibilities are covered. Let us focus mostly on present and potential nodes, since absent and contradicting nodes play a small part. Provided that there are no contradicting children, the Inner Variability (IV) of a node (i.e., the number of different valid configurations for its subtree) will depend on the present and potential nodes. The present nodes will provide a present factor (pref) and the potential nodes a potential factor. The product of both factors will yield the IV of the node.

10. CNF is the standard form that Boolean formulas must follow to be processed by SAT-solvers. It is always possible, though not necessarily efficient, to translate a formula into CNF [46].

Formally, the solution set can be considered as a function a : A → {true, false, unassigned}, where A is the set of nodes. So, consider a node n, with s (possibly zero) children n₁, n₂, …, n_s and a cardinality relation [low..high], meaning that at least low children and at most high children must belong to a product. Suppose the children have already been classified, and thus the number of present, potential and contradicting nodes, denoted respectively by |pre(n, a)|, |pot(n, a)| and |con(n, a)|, are available. Then

sel(n, a) ≡ a(n) = true ∨ ⋁_{i=1}^{s} sel(nᵢ, a), and

desel(n, a) ≡ a(n) = false ∨ |pre(n, a)| + |pot(n, a)| < low ∨ |pre(n, a)| > high ∨ |con(n, a)| > 0.

It is worth noting that the definition is not circular; the type of a node can be determined by Algorithm 2 (Appendix A, available in the online supplemental material), and the needed attributes are computed in a bottom-up fashion. In the NFT syntax (which was introduced in Section 2), the sel property has a strictly logical interpretation: if sel(n, a), then n is logically implied by the solution set and the tree structure (since the children imply the parent). For desel, the question is more complex, since the cardinalities are also involved. Essentially, desel(n, a) means that the node cannot be present in this solution set, be it because it is explicitly negated or because it is impossible for it to comply with the cardinality restrictions. For each node n, an attribute m will be used, set to true iff the node is a mandatory feature w.r.t. its parent. The reason is that mandatory nodes are always selected. Let us consider now the computeType function in Algorithm 2. The way to call it for a node n would be computeType(n.sel ∨ n.m, n.desel).

Back to the example: in the first solution set, 802.11n is absent and everything else is potential. In the second set, 802.11n, HSDPA and HSDPU are directly present; wifi, modem and connectivity are also present, given that the sel attribute is synthesized upwards. The rest are still potential.

    5.3 Computing Variability

The IV of a node n (under a solution set), denoted as IV(n), is the number of different valid configurations in the subtree rooted at n, according to the tree structure. Thus, the total number of products represented by a FM for a particular solution set is IV(r), where r is the root. For a leaf node l, IV(l) = 1, as long as the node is not negated in the solution set. For a non-leaf node n with s children n₁, n₂, …, n_s, the formulas to compute IV(n) for the FODA types of features are:

1) mandatory/optional: IV(n) = Π_{nᵢ is optional} (IV(nᵢ) + 1) · Π_{nⱼ is mandatory} IV(nⱼ)

2) or (i.e., card [1..s]): IV(n) = Π_{i=1}^{s} (IV(nᵢ) + 1) − 1

3) xor (i.e., card [1..1]): IV(n) = Σ_{i=1}^{s} IV(nᵢ)

In general, when a node n has s children and a [low..high] cardinality, IV(n) is given by Equation (6), where S_k is the variability of choosing any combination of k children from s. For the sake of clarity, let us denote IV(n₁), IV(n₂), …, IV(n_s) as iv₁, iv₂, …, iv_s. In a straightforward approach, S_k can be calculated by summing the variabilities of all possible k-combinations (see Equation (7)). Unfortunately, this calculation has exponential complexity:

IV(n) = Σ_{k=low}^{high} S_k    (6)

S_k = Σ_{1 ≤ i₁ < i₂ < ⋯ < i_k ≤ s} iv_{i₁} · iv_{i₂} ⋯ iv_{i_k}    (7)

Fortunately, S_k can be computed in polynomial time through the recurrence in Equation (8), where P_i = Σ_{j=1}^{s} iv_j^i is the i-th power sum of the children variabilities and S₀ = 1:

S_k = (1/k) Σ_{i=1}^{k} (−1)^{i−1} S_{k−i} · P_i    (8)

Instead of applying Equation (8) right away, and to see how much effort it takes, we will first do the computation for the running example in the inefficient way. Table 1 summarizes the possibilities for choosing children for each node: 4 for bluetooth, 7 for modem and 1 for wifi (since 802.11n is false). It is easy to see that if the number of children is big, the equivalent of Table 1 can get really long.

We now compute the powers of the number of products from the children of connectivity and their sum (see Table 2). Now, S₀ = 1 by definition; S₁ = 12, as it is the sum of the children variabilities; S₂ = 1/2 (12·12 − 66) = 39, following the general formula (8); and S₃ = 1/3 (39·12 − 12·66 + 408) = 28. Adding up S₂ and S₃, we get 67.

In the second solution set, nothing has changed in the bluetooth subtree, so the old variability values can be used again. modem and wifi are now present nodes, with variability 2 and 1 respectively, so the present factor is 2, and the potential factor consists of taking bluetooth [0..1] times, which gives us 5. The product of both factors yields the variability of connectivity for this set, 10. Adding up both sets, there are 77 products in this product line.
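The following sketch computes the S vector with the power-sum recurrence of Equation (8) and reproduces the numbers just derived for connectivity in the first solution set (iv = 4, 7, 1 for bluetooth, modem and wifi):

    def s_vector(ivs):
        # P[i] is the i-th power sum of the children variabilities
        s = len(ivs)
        P = [sum(iv ** i for iv in ivs) for i in range(s + 1)]
        S = [1] + [0] * s  # S[0] = 1 by definition
        for k in range(1, s + 1):
            # Equation (8): S_k = (1/k) * sum_{i=1..k} (-1)^(i-1) S_{k-i} P_i
            # (the sum is always an exact multiple of k for integer ivs)
            S[k] = sum((-1) ** (i - 1) * S[k - i] * P[i]
                       for i in range(1, k + 1)) // k
        return S

    S = s_vector([4, 7, 1])
    print(S)            # [1, 12, 39, 28]
    print(S[2] + S[3])  # 67: the IV of connectivity ([2..3]) in this set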

    5.4 Propagating Contextual Variability

IV is enough to compute #P, since #P = IV(r), where r is the root feature. To compute #Pf it is necessary to introduce the concept of Contextual Variability (CV). Back to the FM in Fig. 1, suppose that #P and #Pf have already been computed. Now imagine that a new root node called mobile phone, with connectivity as a mandatory child and another optional child called USB, is added. Since the new nodes do not occur in the textual constraints, it is obvious that the new FM has 77 × 2 = 154 products. That is because for each product in the original FM, there is a new one with mobile phone added, and another one with mobile phone and USB added. The CV of connectivity is precisely 2, because each product in the connectivity subtree is expanded into 2 different products in the bigger FM. Even more, the number of products each feature appears in, which we keep in order to compute the commonality of each feature, has also doubled. Intuitively, the CV of connectivity propagates down its whole subtree. One way to compute this CV is to mentally delete connectivity and compute what the IV of its parent would be. In general, it is a process that has to be performed at the solution-set level.

Given a solution set, the CV of a particular node n is the number of different configurations for the SPL ignoring the subtree rooted in n. Clearly, every product can be decomposed into the part that depends on the features in the subtree of n and the part that depends on the rest of the features. So, the number of products that n appears in, for a particular solution set, is the product of its IV and its CV. CV is a global attribute that depends on the whole outside context of the node. Obviously, CV for the root feature is 1, since there is no context. In the general case, we take advantage of the fact that CV propagates downwards rather easily. For a non-root node n with parent p, given a solution set, CV(n) = CV(p) · IV(p′), where p′ is a copy of p in which the subtree under n has been deleted. We could also say that IV(p′) is the Sibling Variability (SV) of n, SV(n), as we call it in Algorithm 6 (Appendix A, available in the online supplemental material). Instead of computing IV(p′) from scratch, which would take time quadratic in the number of siblings, some previous results are reused to obtain it in linear time.

Let n be a node with s children whose inner variabilities are respectively iv₁, iv₂, …, iv_s, and let us suppose IV(n) has already been computed using Equation (8). This calculation provides us with the vector S. What would happen if we were to add a new child n_{s+1} with variability iv_{s+1}? A new vector S′ could be computed using the general Equation (8), but it is possible to derive S′ᵢ from Sᵢ directly, for any suitable i. Obviously, S′ᵢ will contain all the possibilities in Sᵢ, since all of them are valid combinations of i children of n. These are the combinations in S′ᵢ which do not include the new node. The combinations including the new child amount to iv_{s+1} · S_{i−1}. So, S′ᵢ = Sᵢ + iv_{s+1} · S_{i−1}.

To calculate the CV of a child m of n, what we really want to do is exactly the opposite, i.e., having computed S, take out m and compute the vector S′ with

S′₀ = 1 and S′ᵢ = Sᵢ − iv_m · S′_{i−1}.

Going back to our previous example, say we want to take out feature wifi in order to compute its CV. Now S′₀ = 1 by definition, S′₁ = 12 − 1·1 = 11, S′₂ = 39 − 1·11 = 28 and S′₃ = 28 − 1·28 = 0 (as expected, since there are only two siblings left). Connectivity is the root node, so its CV is 1. This means that wifi appears in 39 × 1 = 39 products in this solution set. As 802.11n is false and 802.11g is a leaf node, initially IV(802.11g) = 1. If the CV for wifi is propagated, then CV(802.11g) is 39, so 802.11g appears in 39 products in this solution set.
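A short sketch of this take-one-out step, reproducing the numbers above (the function name is ours):

    def take_one_out(S, iv_m):
        # S'_0 = 1 and S'_i = S_i - iv_m * S'_{i-1}
        S_prime = [1]
        for i in range(1, len(S)):
            S_prime.append(S[i] - iv_m * S_prime[i - 1])
        return S_prime

    S = [1, 12, 39, 28]           # connectivity, first solution set
    S_prime = take_one_out(S, 1)  # take out wifi (iv = 1)
    print(S_prime)                # [1, 11, 28, 0]
    # wifi needs 1 or 2 chosen siblings to meet connectivity's [2..3] label:
    print(S_prime[1] + S_prime[2])  # 39 = CV(wifi), since CV(connectivity) = 1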

    5.5 Processing a Solution Set

Let us discuss the algorithm formally. The attribute grammar formalism will be used to avoid the control-flow overhead of traditional pseudocode (i.e., to keep the algorithm clear of code regarding the FM tree traversal). Table 1 in Appendix A, available in the online supplemental material, summarizes the productions of the context-free grammar, together with

TABLE 1
Cardinality Restrictions for the Mobile Phone Example

Feature        Admissible children combinations for non-leaf nodes
connectivity   {bluetooth, modem}, {bluetooth, wifi}, {modem, wifi}, {bluetooth, modem, wifi}
bluetooth      {headset, hands free}, {headset, remote control}, {hands free, remote control}, {headset, hands free, remote control}
modem          {GPRS}, {HSDPA}, {HSDPU}, {GPRS, HSDPA}, {GPRS, HSDPU}, {HSDPA, HSDPU}, {GPRS, HSDPA, HSDPU}
wifi           {802.11g}, {802.11n}, {802.11g, 802.11n}

TABLE 2
Variability Powers from Connectivity Children and Their Sum

power   bluetooth   modem   wifi   sum
1       4           7       1      12
2       16          49      1      66
3       64          343     1      408


the semantic rules. The terminal symbols are presented in boldface. A syntactic tree of this grammar corresponds exactly to a FM, and each node has a set of attributes whose values produce the needed information. But before we explain how to get it, let us briefly discuss how to encode FMs in the different syntactic alternatives to suit the grammar.

In NFT there are two types of nodes: group nodes and leaf nodes. Encoding a NFT FM into a string produced by this grammar is straightforward; all that is necessary is to ignore the mandatory construction. Regular FODA does not use group cardinality. Suppose there is a node n with s children of the optional/mandatory type; it can be encoded as a group node with cardinality [0..s], using the mandatory attribute appropriately for the children. If n is an or node, then it would be a group node with cardinality [1..s] and no mandatory children. Likewise, a xor node would require cardinality [1..1] and no mandatory children. The SPLOT metamodel uses group cardinality and the optional/mandatory construction, so it also follows the former encoding.

The first thing to notice when using the grammar is the traversal order for the nodes. Although most attributes are synthesized, which implies that a bottom-up traversal of the syntactic tree is necessary, there are also some inherited attributes, so another top-down traversal will also be necessary. The good thing is that there is no circularity in the definitions. For the sake of clarity, before each definition of the attributes, we have appended an upwards arrow for synthesized attributes and a downwards arrow for inherited ones.

Let us describe how the evaluation would take place. In the first, ascending phase, the IV is computed for every node. In attribute grammars, the notation for an attribute a of a node n is n.a, so IV(n) would be written as n.iv. So, we count how many children of each type a node has with the c attribute. Then the sel and desel attributes are computed. With these values, together with the mandatory attribute, the node type is computed out of the four possibilities. All the present children variabilities are integrated in the present factor, and the variabilities of potential children are added to the list attribute pot. Then, its potential factor is computed via gCard, which also provides vector S. An attribute called extra is used to hold the values of the number of products of F, the cardinality bounds and also the vector S, since all these may be used as input parameters for Algorithm 6. Finally, the IV is computed. This is done for each node in a bottom-up fashion.

In the descending phase, the CV is computed for each node using the CV of the parent and taking the node out. If the node was present, the process is as simple as dividing the variability of the parent by the variability of the child. If the node was potential, we employ the vector S and the node variability to call Algorithm 6. Either way, we can now compute the number of products the node appears in, and this amount is added to an external array, adding up the subtotals along the solution sets. The CV attribute is also used in a slightly subtle way: when a node is deselected, none of its descendants participates in any product, so we set the CV to zero and just let it propagate downhill, so all the descendants end up with zero products.

Finally, a small optimization has been added, consisting of caching the results of the computation for each node, so that a node whose attributes are already cached is just visited once. For each node n that is visited, a key is hashed consisting of the name of the node plus the values of the variables in the textual constraints that are also descendants of n. Typically the key is very small, since the ratio of variables in the textual constraints versus the total number of variables in the FM is usually low, as we will discuss in Section 6, so this allows us to reduce the runtime of treecount.
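The cache key just described could look as follows (a hypothetical sketch; the helper and its arguments are ours, not treecount's actual code):

    def cache_key(node_name, constraint_vars_below, assignment):
        # A node's cached attributes are reusable whenever the textual-constraint
        # variables below it carry the same truth values in the solution set.
        values = tuple(sorted((v, assignment.get(v))
                              for v in constraint_vars_below))
        return (node_name, values)

    # wifi has one textual-constraint variable below it (802.11n), so all
    # solution sets agreeing on 802.11n share wifi's cached results:
    k1 = cache_key("wifi", {"802.11n"}, {"802.11n": False})
    k2 = cache_key("wifi", {"802.11n"}, {"802.11n": False, "HSDPA": False})
    print(k1 == k2)  # True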

6 COMPUTATIONAL COMPLEXITY AND EXPERIMENTAL EVALUATION

    6.1 Computational Complexity

This section shows that Algorithm 1 is quadratic in the number of features and exponential in the number of variables in the textual constraints. So, if N is the number of features in the tree and M is the number of features in the textual constraints, the complexity is in O(N²·2^M).

For practical purposes it is convenient to introduce the concepts of Extra Constraint Representativeness (ECR) and Clause Density (CD), introduced in [13]. ECR is the number of variables in the textual constraints divided by the total number of features. In the mobile phone example, this would be 3/12 = 0.25. CD is the number of clauses in the textual constraints divided by the number of variables in the textual constraints. For the mobile phone example, it is 2/3 ≈ 0.67. The DPLL process is exponential by nature, so in the worst case it may have to choose and flip almost every variable in the textual constraints to get all the solution sets.

DPLL backtracking search can be seen as a binary tree, with 2^M leaf nodes and 2^M − 1 internal nodes. Following Algorithm 1, the leaf nodes correspond to the case where the textual constraints are satisfied and computeProducts is called (Table 1 in Appendix A, available in the online supplemental material). The internal nodes generally need to perform UP and then choose a new variable to continue the backtracking. Of these elements, it is clear that UP and branch selection via the MOMS heuristic can be completed in quadratic time. We now prove that processing each solution set, that is, running computeProducts, takes only O(N²) (in fact, O(N) if only standard FODA models are allowed). For clarity's sake, it may help to consider the operations step by step. For a call to #gCard (Algorithm 3) with a node n with s children, the result is in O(s²), since there are two nested loops upper-bounded by s. When #gCard is called for all the nodes, as is done in computeProducts, it takes O(N²), where N is the total number of nodes. This can easily be proven by means of structural induction: the leaf nodes of the FM are the base case of the induction, and they take constant time to be processed, so the condition holds trivially. Let now n be the root of the tree, with children n₁, n₂, …, n_s having N₁, N₂, …, N_s nodes in their respective subtrees (excluding the root). Now, N = Σ_{i=1}^{s} Nᵢ + s + 1. Trivially, Σ_{i=1}^{s} xᵢ² ≤ (Σ_{i=1}^{s} xᵢ)² for any natural numbers xᵢ. The induction hypothesis is that #gCard(nᵢ) ∈ O(Nᵢ²). So, the time spent computing #gCard for all the nodes is delimited by Equation (9):

(Σ_{i=1}^{s} Nᵢ²) + s² ≤ ((Σ_{i=1}^{s} Nᵢ) + s)² ≤ N² ∈ O(N²)    (9)

    FERNANDEZ-AMOROS ET AL.: A SCALABLE APPROACH TO EXACT MODEL AND COMMONALITY COUNTING FOR EXTENDED FEATURE MODELS 903

The rest of the computations (sel, desel, …) are linear. Therefore, the bottom-up sequence of computeProducts is quadratic. Finally, let us consider the top-down sequence. The call to TakeOneOut for some node nᵢ takes time in O(Nᵢ²) (O(Nᵢ) for regular FODA). Applying again the argument expressed in Equation (9), this top-down sequence is O(N²). Therefore, the sequence of the operations in computeProducts is O(N²).

The total cost of Algorithm 1 includes calling computeProducts (O(N²)) once for each DPLL search leaf node, of which there are 2^M, and performing in sequence UP (O(N²)) and branch selection (also O(N²)) for each of the 2^M − 1 internal nodes. That is, O(2^M·N² + (2^M − 1)·(N² + N²)) = O(N²·2^M). Calculation of the commonalities just takes computing the products for each feature and then traversing all the features to perform a division by the total number of products, so the complexity for commonality computing is the same as in Algorithm 1.

Model counting via strict DPLL as performed by cachet and relsat belongs to the complexity class N^O(1)·O(2^N). The reason is that every feature appears as a variable in the corresponding formula, regardless of the ECR, so every variable counts for the exponential part. The branching heuristics can also be computationally expensive. The complexities for the forcing versions would be the same, since basically they consist in calling the tool again N times, once for each feature, to force it to be present. SPLOT's complexity is in the class O(2^N), because the branching strategy is defined by the ordering constraints instead of being reconsidered at each new step as cachet and relsat do. For commonality counting, it is O(N·2^N), because the BDD is traversed once for each feature, and the BDD may have as many as 2^N nodes. So, as far as worst-case complexity is concerned, our approach moves a great deal of the complexity from the exponential side to the quadratic one. The different complexities are summarized in Table 3.

    6.2 Experimental Evaluation

In addition to the theoretical evaluation that computational complexity provides, five experiments have been devised to evaluate the validity and scalability of the different approaches on datasets with varying characteristics: with and without group cardinality, real academic models and computer-generated ones, with and without textual constraints, and some others with different clause density values. These experiments will show how treecount¹¹ behaves w.r.t. the other baselines: the propositional-logic exact model counters cachet [33] and relsat [31], and the open-source version of SPLOT [41]. It is important to note that cachet, relsat and SPLOT needed some tweaking in order to perform commonality counting, since only treecount does so natively. For cachet and relsat, commonality computing requires one run for each feature with an additional constraint to force the presence of that feature. The number of products containing the feature is then divided by the number of products containing the root feature to obtain the commonality. Cachet [54] presents pseudocode to compute the marginals, but the memory requirements multiply those of the counting version, which were already exponential: where the counting version caches a floating point number, the marginality-computing one stores a whole vector of those. The authors claim computing marginals is 10-40 percent slower than counting, when the problem fits in memory. Lacking evidence to the contrary, it is reasonable to assume that only small models do fit in memory (and those that do not fit in memory quickly degrade into good old DPLL search). This version of cachet is not available, so, as with relsat, cachet was used as a black box to force each feature in turn. With SPLOT, the original steps of computing a variable order and building the BDD were followed. After that, the BDD graph was traversed once for each feature to force its presence and thus compute its commonality.

The test machine was an Intel Core i7 @ 3.5 GHz with 16 GB of RAM running Mac OS X.

    6.2.1 First Experiment: Group Cardinality #1

The first experiment seeks to test the ability of the different approaches to deal with a sample of group cardinality FMs. The FM scheme is summarized in Fig. 3. The FMs consist of a root node n, with s terminal children and cardinality [h..h+1], where h is the integer division of s by 2. The results of the experiment are summarized in Table 4.

    Since the input to cachet and relsat are logic formulas inCNF, the FM translation to CNF is now sketched, followingthe directions in [55]. The tree structure is dealt with s 1clauses: there is a clause to express that node n is true. Also,each child ni implies the parent node n. For instance, if s 4, the tree structure is encoded by:

$n \wedge (n_1 \rightarrow n) \wedge (n_2 \rightarrow n) \wedge (n_3 \rightarrow n) \wedge (n_4 \rightarrow n) \;\equiv\; n \wedge (\neg n_1 \vee n) \wedge (\neg n_2 \vee n) \wedge (\neg n_3 \vee n) \wedge (\neg n_4 \vee n).$

For the cardinality restriction, the low and the high limits are treated separately. Saying that at least low children have to be present in a product is equivalent to saying that at most $s - low$ children can be excluded (i.e., in the logical formula, no more than $s - low$ children can be false).

TABLE 3
Complexity Classes for a FM with N Features, M of Which Appear in the Textual Constraints (M < N)

              Product counting          Commonality counting
  cachet      $N^{O(1)} \cdot O(2^N)$   $N^{O(1)} \cdot O(2^N)$
  relsat      $N^{O(1)} \cdot O(2^N)$   $N^{O(1)} \cdot O(2^N)$
  treecount   $O(N^2 \cdot 2^M)$        $O(N^2 \cdot 2^M)$
  SPLOT       $O(2^N)$                  $O(N \cdot 2^N)$

Fig. 3. Hard group cardinality model.



This means that as soon as $s - low + 1$ children are selected, at least one of them must be true (this constraint is a clause). So, the low restriction is equivalent to the conjunction of all possible clauses obtained by choosing $s - low + 1$ children of n. In the example with s = 4, the low limit is encoded by: $(n_1 \vee n_2 \vee n_3) \wedge (n_1 \vee n_2 \vee n_4) \wedge (n_1 \vee n_3 \vee n_4) \wedge (n_2 \vee n_3 \vee n_4)$.

This mechanism produces $\binom{2h}{h+1}$ clauses. The high restriction is somewhat easier: since in any set of high + 1 children at least one of them has to be false, one may just compute all the sets of children of size high + 1 and add a clause with all the set members negated. For the example, this gives: $\neg (n_1 \wedge n_2 \wedge n_3 \wedge n_4) \equiv \neg n_1 \vee \neg n_2 \vee \neg n_3 \vee \neg n_4$,

which produces $\binom{2h}{h+2}$ clauses. To sum up, the s = 4 example is encoded by the following formula with 10 clauses: $n \wedge (\neg n_1 \vee n) \wedge (\neg n_2 \vee n) \wedge (\neg n_3 \vee n) \wedge (\neg n_4 \vee n) \wedge (n_1 \vee n_2 \vee n_3) \wedge (n_1 \vee n_2 \vee n_4) \wedge (n_1 \vee n_3 \vee n_4) \wedge (n_2 \vee n_3 \vee n_4) \wedge (\neg n_1 \vee \neg n_2 \vee \neg n_3 \vee \neg n_4)$.
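Since the translation is mechanical, a short Python sketch may be clearer; the function name and the variable numbering (1 for the root, 2..s+1 for the children) are our own choices, not part of any of the tools compared here:

    from itertools import combinations

    def cardinality_cnf(s):
        """CNF for a root (variable 1) with s children and cardinality [h..h+1]."""
        h = s // 2
        children = range(2, s + 2)
        clauses = [[1]] + [[-c, 1] for c in children]  # tree structure, s+1 clauses
        # Low limit: every set of s - h + 1 children must contain a selected one.
        clauses += [list(combo) for combo in combinations(children, s - h + 1)]
        # High limit: every set of h + 2 children must contain an excluded one.
        clauses += [[-c for c in combo] for combo in combinations(children, h + 2)]
        return clauses

    assert len(cardinality_cnf(4)) == 10  # the worked example above
    for s in range(4, 16, 2):             # watch the clause count explode
        print(s, len(cardinality_cnf(s)))

The growth of the clause count with s is precisely the point of the experiment, as quantified next.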

Since $\binom{2h}{h+1} \approx \binom{2h}{h} \approx 2^{2h}$ up to a polynomial factor, the number of clauses roughly doubles with each added terminal node. The intent is to test the ability of logic model counters to cope with this kind of input. As shown by Table 4, treecount completes each test case in 0.02 milliseconds or less, with a modest rate of growth. SPLOT comes second, between two and three orders of magnitude slower than treecount. Cachet takes twice as long as SPLOT for the bigger models, and finally relsat times out at size 15 (the timeout was set at 1 minute).

Using purely logic tools such as cachet and relsat does not provide a scalable solution to the problem of commonality counting in the presence of group cardinality, because of their reliance on DPLL. In this experiment, the number of clauses grows exponentially with the number of nodes in the input FM. The number of branching decisions taken during DPLL search is included as an alternative informative measure, as is the size of the BDD for SPLOT. The differences between relsat and cachet are interesting. Cachet's caching strategy imposes a time overhead w.r.t. relsat, but it really pays off in terms of branching decisions, considering that the input (number of clauses) grows exponentially. Relsat, being relatively simpler, performs faster on the first exemplars, but at size 10 it is already lagging behind cachet, and its number of decisions doubles with each new model. For treecount, the quadratic time growth of the algorithm is negligible at these input sizes.

For SPLOT, the size of the resulting BDD is included in the dec column. It turns out that building the BDD takes up a lot of time, as expected given the exponentially growing number of clauses, but the resulting graph is in fact quite compact. Moreover, there is a clear pattern in its growth: the successive size increments are 1, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, ... It stands to reason that there should be a relation between the DAGs, maybe a recurrence relation similar to the ones described in Section 5. If there were a fast way to compute the graph for the proper group cardinality nodes, it should be reasonably easy to apply it to the rest of the BDD, especially since such a cardinality graph should be agnostic as to the ordering of the variables (i.e., the nodes can be renamed in any way). So, with some effort, BDDs could become an adequate tool for managing group cardinality in FMs.

Admittedly, real FMs are not likely to display such a complex structure, but then again, extended cardinality could not hitherto be processed efficiently. Even a slight use of group cardinality may slow down the response of a design framework relying on model counters.

    6.2.2 Second Experiment: Real Models

In the second experiment, the same four approaches are tested on real-world FMs, as collected in the SPLOT repository. The 30 biggest models were chosen, excluding repetitions. Interestingly, none of these models makes use of group cardinality, so the outcome will help establish whether treecount, which supports group cardinality, is penalized in its absence.

TABLE 4
Group Cardinality Test

terminal  clauses   cachet             relsat            treecount     SPLOT
nodes               time      dec      time      dec     time   dec    time      BDD size
1         4         9.49      0        0.59      2       0.01   0      48.69     2
2         6         12.27     0        0.63      6       0.01   0      51.99     3
3         8         15.38     2        0.86      16      0.01   0      55.16     6
4         12        18.75     10       1.15      38      0.01   0      58.31     9
5         18        22.06     27       1.63      83      0.01   0      62.93     13
6         30        24.68     54       2.55      182     0.01   0      72.04     17
7         52        28.70     89       5.05      376     0.01   0      82.60     22
8         95        33.64     138      12.12     802     0.01   0      107.38    27
9         180       38.25     197      38.15     1,627   0.01   0      143.61    33
10        343       44.65     274      132.13    3,432   0.01   0      231.03    39
11        674       55.10     363      507.20    6,917   0.01   0      448.66    46
12        1,302     78.20     474      2115.55   14,506  0.01   0      693       53
13        2,590     124.67    599      8740.52   29,157  0.01   0      811.45    61
14        5,022     217.41    750      37934.01  60,902  0.01   0      865.53    69
15        10,028    456.22    917      timeout   -       0.02   0      1041.23   78
16        19,467    1003.94   1,114    timeout   -       0.02   0      1685.67   87
17        38,916    2376.30   1,329    timeout   -       0.01   0      2452.95   97
18        75,603    5645.29   1,578    timeout   -       0.02   0      4114.78   107
19        151,186   14596.77  1,847    timeout   -       0.02   0      6618.15   118
20        293,953   39313.78  2,154    timeout   -       0.02   0      15847.63  129

Times in milliseconds; relsat timed out from size 15 onward.


As before, the number of branching decisions/BDD size is included where applicable. The BDD size is an average over the different runs, which explains why it is sometimes a fractional number: the ordering heuristic is not deterministic, so different runs of the same test may produce differently-sized BDDs.

The results are shown in Table 5. As before, raw running times should not fool us. Cachet's infrastructure makes it slower than relsat on most exemplars; however, its number of branching decisions is consistently below that of relsat, and cachet eventually takes over on the biggest exemplar. Treecount is the fastest system in all cases, again by several orders of magnitude. It always makes fewer branching decisions than relsat and, in the vast majority of cases, fewer than cachet. The difference is particularly striking in those exemplars without textual constraints: for those, treecount makes no branching decisions at all, while cachet and relsat still have to embark on DPLL search to decompose the connected components of the FM. SPLOT does a good job when the BDD is small but struggles to complete the traversals otherwise. The results seem to confirm that treecount is a good choice when the ECR is low. The Electronic Shopping exemplar deserves particular attention. For all four systems, it marks the peak of both branching decisions and time. At first glance, it does not seem a very hard test. With 326 features, it is the biggest model; however, its ECR is 0.11, not particularly high (the DELL model has 0.69), and the number of features present in the textual constraints is 35, less than the 40 features sported by an easy model like Arcade Game. The average number of literals per clause is barely 2.05. The problem is that 27 features occur only once in the textual constraints, and nine clauses in the textual constraints are formed from these variables and so are independent: the worst-case scenario for the hybrid systems. For SPLOT, the BDD size increases by two orders of magnitude. While SPLOT's ordering heuristic (Pre-CL-MinSpan) is specifically tuned for use with FMs, its BDD construction techniques are generic, so there is probably room for improvement. This exemplar is difficult for the logic systems and for SPLOT because it is the biggest, and it is difficult for treecount because the textual constraints are largely independent of one another.

6.2.3 Third Experiment: Computer-Generated Models without Textual Constraints

The four approaches have been measured against computer-generated random models. Fig. 4 displays the results of running the four programs on random FMs with no textual constraints, ranging from 50 to 5,000 features, with a branching factor of 6. Running times for cachet and relsat grow exponentially; both hit the 1-minute timeout rather quickly, at sizes 3,400 and 1,300 respectively. Treecount (hardly visible in the graphic) is the clear winner, with 1.43 milliseconds at size 5,000 against 23,790 for SPLOT.

TABLE 5
SPLOT Repository Sample Results

Name                  #F   ECR   CD    cachet            relsat             treecount      SPLOT
                                       time     dec      time     dec       time   dec     time      BDD
Web Portal            49   0.24  0.50  180.60   2,171    49.22    10,124    0.21   31      118.66    191.03
Thread                51   0     -     167.92   571      27.61    1,010     0.03   0       110.18    88
Doc Generation        53   0.25  0.62  184.62   1,344    68.62    19,416    0.28   31      123.84    340
Android SPL           54   0.17  0.56  177.34   689      31.51    1,705     0.08   5       115.80    81
DELL                  54   0.69  2.97  182.46   976      57.64    1,924     0.49   27      168.62    275.04
Experimentation PL    56   0.16  0.44  178.34   0        26.87    1,100     0.11   9       115.61    32
Letovanje             56   0.05  0.67  190.90   1,205    38.91    2,101     0.05   1       114.18    99.08
Face Animator         59   0.20  1.42  195.22   820      38.04    2,967     0.10   4       122.70    90.34
Hotel PL              62   0     -     200.93   589      34.60    2,539     0.03   0       115.14    44
Electronic Drum       63   0     -     212.04   1,488    42.91    1,488     0.03   0       118.45    77
Product Family Test   64   1     1.41  207.35   150      54.69    378       0.71   70      171       180.05
OW2 FraSCAti          65   0.68  1.05  219      1,019    56.51    7,954     1.95   257     192.15    1196.83
Phone Meeting         66   0     -     215.85   809      43.39    3,610     0.03   0       121.48    66
Smart Home            66   0.06  0.50  217.84   854      41.27    3,862     0.06   3       129.39    96
Arcade Game           70   0.57  0.85  238.14   1,093    119.15   34,334    0.25   7       166.23    483.81
HIS                   73   0.11  0.50  232.95   753      46.26    1,173     0.12   11      147.34    157.63
SimulES               73   0     -     240.62   1,130    48.30    2,534     0.04   0       126.30    71
Car Crisis Mgmt       74   0.04  0.67  240.75   903      50.26    2,434     0.05   1       144.44    86
Video Player          76   0     -     244.01   282      64.11    11,359    0.03   0       151.65    79
Database Tools        77   0.08  0.33  252.19   769      70.39    13,308    0.06   4       151.70    103.75
E-Science             77   0     -     278.35   2,935    73.68    6,174     0.04   0       135.86    181
Eclipse Extensions    79   0.03  0.50  271.11   2,327    59.22    4,456     0.05   1       154.98    120
Web architectures     88   0     -     293.79   1,855    67.23    3,934     0.04   0       156.97    109
Billing               90   0.76  0.87  287.28   348      67.18    3,907     48.73  5,735   205.03    185.46
Car Selection         91   0.30  0.78  313.53   2,829    87.50    4,773     4.14   767     174.40    137.65
Ecologic Car          110  0.04  0.50  373.82   2,816    112.44   4,113     0.08   3       174.70    138
Model Transf          113  0     -     395.17   3,144    124.09   7,613     0.05   0       167.13    146
xtext                 137  0.01  0.50  470.30   4,401    159.55   17,001    0.05   1       204.10    205
Printers              200  0     -     687.80   5,259    443.07   63,826    0.05   0       256.19    228
Electronic Shopping   326  0.11  0.60  1350.38  24,900   1382.11  262,389   65.22  6,527   1479.53   23344.18

Times in milliseconds.


The forcing procedure clearly does not scale for cachet and relsat. It does for SPLOT, although the absolute timings are much worse than those of treecount.

6.2.4 Fourth Experiment: Computer-Generated Models with Textual Constraints

The next experiment tests satisfiable exemplars (ten of each) of sizes between 50 and 1,000, with ECR = 0.2 and clause density values ranging from 0.1 to 0.5, created using the SPLOT generator with a branching factor of 6. We set a timeout of 60 seconds for each test in this run of experiments and only show results when all ten exemplars were processed in time. Table 6 shows the results.

All four systems eventually fail to finish the tests, and every increase in clause density makes matters worse. Relsat starts out faster than cachet, but the situation quickly reverses; in fact, relsat times out at size 300 even for CD = 0.1. Cachet lasts a bit longer but times out at size 600 for CD = 0.1. Treecount is again the clear winner, and even so, it does not scale gracefully. SPLOT comes in second place. It is noteworthy that over-constraining the exemplars does not help any system; quite the contrary, increasing the CD is a killer for all the approaches.

6.2.5 Fifth Experiment: Computer-Generated Models with Textual Constraints and Group Cardinality #2

The final experiment consisted in modifying the models generated for the previous experiment to include group cardinality. Whenever there was a group node, we changed it randomly to a legal configuration, i.e., a [low..high] cardinality where low ≤ high and high ≤ the number of children of the node.

    Fig. 4. Performance without textual constraints.

TABLE 6
Performance in Milliseconds with ECR = 0.2 and Various CD Values

Size  System     CD = 0.1  CD = 0.2  CD = 0.3  CD = 0.4  CD = 0.5
100   cachet     399.56    419.18    479.04    440.12    437.32
      relsat     182.12    312.40    677.27    343.80    553.62
      treecount  0.12      0.18      0.35      0.52      0.81
      SPLOT      182.11    185.78    190.11    191.97    199.08
200   cachet     1022.03   1253.37   2714.42   3212.03   3196.15
      relsat     1486.73   7633.24   timeout   timeout   timeout
      treecount  0.33      1.64      6.62      25.25     67.10
      SPLOT      325.84    376.18    445.28    568.00    744.36
300   cachet     2084.60   3775.07   9427.70   18874.79  timeout
      relsat     timeout   timeout   timeout   timeout   timeout
      treecount  1.73      43.53     523.72    1368.78   6806.57
      SPLOT      473.95    779.60    1799.65   2548.33   timeout
400   cachet     4247.83   14217.84  timeout   timeout   timeout
      relsat     timeout   timeout   timeout   timeout   timeout
      treecount  11.49     583.02    9165.96   timeout   timeout
      SPLOT      793.49    2309.78   17081.91  timeout   timeout
500   cachet     9960.27   timeout   timeout   timeout   timeout
      relsat     timeout   timeout   timeout   timeout   timeout
      treecount  24.26     7872.75   timeout   timeout   timeout
      SPLOT      1269.74   timeout   timeout   timeout   timeout
600   cachet     timeout   timeout   timeout   timeout   timeout
      relsat     timeout   timeout   timeout   timeout   timeout
      treecount  233.15    timeout   timeout   timeout   timeout
      SPLOT      3004.35   timeout   timeout   timeout   timeout
700   cachet     timeout   timeout   timeout   timeout   timeout
      relsat     timeout   timeout   timeout   timeout   timeout
      treecount  1751.34   timeout   timeout   timeout   timeout
      SPLOT      8051.32   timeout   timeout   timeout   timeout
800   cachet     timeout   timeout   timeout   timeout   timeout
      relsat     timeout   timeout   timeout   timeout   timeout
      treecount  6072.60   timeout   timeout   timeout   timeout
      SPLOT      timeout   timeout   timeout   timeout   timeout


The resulting cardinality could be either a FODA construct or a proper group cardinality configuration.

Run time increased by 61 percent for cachet, 118 percent for relsat, 0.32 percent for treecount, and 0.5 percent for SPLOT.

The experiments indicate that group cardinality is a burden for cachet and relsat, even though the small branching factor limited the impact on these systems. For SPLOT, building the BDD takes a little longer, but the bulk of the work, traversing the BDD, remains very similar. Treecount performs well, as expected.

    6.3 Threats to Validity

    Let us discuss threats to validity of the experimentalevaluation:

1) Choice of examples. In the design of the experiments, it was of paramount importance to address the performance of our approach and the baselines on large FMs, since any method would do for small cases. As there are almost no publicly available industrial-sized FMs, this led to the use of random computer-generated examples within certain parameters. With these, the question is always whether the generated models can be considered realistic. We concur on this point with Mendonca's assertion: "From our experience in examining several FMs in the literature, we noticed that feature trees are frequently orders of magnitude larger than the extra constraints in terms of number of relations" [13]. For this reason, we generated examples with no textual constraints in experiment 3, and also examples with ECR = 0.2 and CD between 0.1 and 0.5 in experiments 4 and 5.

In [56], Berger et al. study the configuration models for the Linux (6,320 features) and eCos (1,244 features) kernels, which are examples of real, big variability modeling systems, claiming an ECR of 0.86 in both cases, somewhat contradicting Mendonca's assumption. It is worth mentioning that these systems go well beyond the usual FODA notation and its extensions: they include feature attributes whose values may take a default and then change, conditionally or unconditionally, depending on other features' attribute values, with arithmetic expressions for numbers and equality tests for strings. Another point of interest is that Kconfig (the Linux kernel configurator) allows illegal configurations to be selected, and CDL (the eCos configuration language) is supported by an inference engine that is correct but not complete, that is, it proposes correct configurations to complete user selections, but not all the compatible ones. The authors also report that the maximum number of siblings is 158 and 29, respectively. These FMs resemble those described in [38], which integrate as many of the textual constraints as possible into a tree structure. Considering the differences, and the fact that none of the algorithms used in this paper could process those FMs, we stick to FMs with low ECR in the experiments, while duly acknowledging that future work should address these bigger models.

To further validate the results of experiments 3-5, we included experiment 2, which processes the FMs in the SPLOT repository; these are all real models (mostly from the academic world), albeit usually small ones. Since the results of all those experiments point in the same direction, we consider the threat averted.

2) Accurate and fair timing of the approaches. In experiment 1, since some of the running times are so small and obviously prone to measurement errors, we averaged the times over 10 runs. In experiments 2 and 3, we used 100 runs for each model. For experiments 4 and 5, we averaged times over the 10 exemplars for each combination of number of features and CD. To level the playing field, we edited the source code to measure only processing time, not I/O; otherwise, running times would rise exponentially when the input grows exponentially, as for cachet, relsat and SPLOT in experiment 1. Also, cachet and relsat heavily use temporary files to force the presence of the individual features.

3) Comparison of C, C++ and Java implementations. Cachet, relsat and treecount are C++ programs, while SPLOT is written in Java. This suggests that SPLOT timings could not be directly compared to those of the other approaches. The reality, however, is more nuanced. Cachet is built on top of zchaff, so it contains quite a bit of C code wrapped up as C++. Relsat and treecount are written in pure C++, while SPLOT makes heavy use of JavaBDD, which in this case is linked to BuDDy, a pure C library with optional C++ wrappers. Only the ordering heuristic (which typically takes up less than 1 percent of the runtime) is written in Java, so language-wise all four approaches seem very comparable.

    7 CONCLUSIONS

Effective deployment of large-scale software product lines demands efficient support for the automated analysis of expressively rich FMs. In particular, this paper has shown that computing the number of products that implement each feature is helpful to describe the standardization/parameterization balance, detect flaws in the scope, assess incremental development, and improve the accuracy of economic models for product lines. Whereas there are alternative proposals for computing the total number of products, this paper contributes an innovative algorithm to compute the feature commonalities.

Our algorithm is applicable to most FM notations, since it takes into account textual constraints and the group cardinality construct. Theoretical considerations suggested that FM translation to propositional logic would be ineffective in the face of widespread use of group cardinality; these considerations have been empirically tested and confirmed. As regards the group cardinality computation, we presented in Sections 5.3 and 5.4 a quadratic algorithm for a naturally exponential problem. The same quadratic complexity applies to the case in which no textual constraints are used. Furthermore, we have shown that commonality computation is possible with just a small overhead after computing the total number of products.


We have reported experimental evidence that the forcing technique to compute commonalities using model counters has serious scalability problems, especially for logic systems such as cachet and relsat. BDD-based approaches such as SPLOT show greater potential in that respect, although they remain far behind treecount. None of the approaches is well suited for models with high clause density, a reminder of the difficulty of the task. Treecount's performance was superior in all test cases, often by several orders of magnitude, over a variety of models, both real and computer-generated, making it a sound tool for the automated analysis of FMs.

    ACKNOWLEDGMENTS

The authors are grateful to the anonymous reviewers for their insightful feedback. They also thank Roberto Lopez Herrejon and Alexander Egyed at the Institute for Software Engineering and Automation (Johannes Kepler University of Linz, Austria) for their advice. This work has been supported by the Spanish Government under the CICYT project DPI-2013-44776-R, and by the Comunidad de Madrid under the RoboCity2030-II excellence research network S2009DPI-1559.

REFERENCES

[1] K. Pohl, G. Bockle, and F. Linden, Software Product Line Engineering: Foundations, Principles and Techniques. New York, NY, USA: Springer, 2005.
[2] P. Clements and L. Northrop, Software Product Lines: Practices and Patterns. Reading, MA, USA: Addison-Wesley, 2001.
[3] F. Bachmann and P. C. Clements, "Variability in software product lines," Softw. Eng. Inst., Pittsburgh, PA, USA, Tech. Rep. CMU/SEI-2005-TR-012, 2005.
[4] K. Kang, S. Cohen, J. Hess, W. Novak, and S. Peterson, "Feature-oriented domain analysis (FODA) feasibility study," Softw. Eng. Inst., Pittsburgh, PA, USA, Tech. Rep. CMU/SEI-90-TR-21, 1990.
[5] A. Kubler, C. Zengler, and W. Kuchlin, "Model counting in product configuration," in Proc. 1st Int. Workshop Logics Component Configuration, Edinburgh, U.K., Jul. 2010, pp. 44-53.
[6] P.-Y. Schobbens, P. Heymans, J.-C. Trigaux, and Y. Bontemps, "Generic semantics of feature diagrams," Comput. Netw., vol. 51, no. 2, pp. 456-479, 2007.
[7] A. Metzger, P. Heymans, K. Pohl, P.-Y. Schobbens, and G. Saval, "Disambiguating the documentation of variability in software product lines: A separation of concerns, formalization and automated analysis," in Proc. 15th IEEE Int. Requirements Eng. Conf., 2007, pp. 243-253.
[8] K. Czarnecki, S. Helsen, and U. W. Eisenecker, "Formalizing cardinality-based feature models and their specialization," Softw. Process: Improv. Practice, vol. 10, no. 1, pp. 7-29, 2005.
[9] K. Czarnecki, S. Helsen, and U. W. Eisenecker, "Staged configuration using feature models," in Proc. Softw. Product Line Conf., 2004, vol. 3154, pp. 266-283.
[10] D. Benavides, S. Segura, and A. Ruiz-Cortes, "Automated analysis of feature models 20 years later: A literature review," Inf. Syst., vol. 35, no. 6, pp. 615-636, 2010.
[11] D. Fernandez-Amoros, R. Heradio, and J. Cerrada, "Inferring information from feature diagrams to product line economic models," in Proc. 13th Int. Conf. Softw. Product Lines, 2009, pp. 41-50.
[12] D. Benavides, "On the automated analysis of software product lines using feature models. A framework for developing automated tool support," Ph.D. dissertation, Dept. Comput. Lang. Syst., School Comput. Sci., Univ. Seville, Sevilla, Spain, Jun. 2007.
[13] M. Mendonca, "Efficient reasoning techniques for large scale feature models," Ph.D. dissertation, School Comput. Sci., Univ. Waterloo, Waterloo, ON, Canada, Jan. 2009.
[14] J. C. Cleaveland, Program Generators with XML and Java. Englewood Cliffs, NJ, USA: Prentice Hall, 2001.
[15] S. Q. Lau, "Domain analysis of e-commerce systems using feature-based model templates," Master's thesis, Dept. Elect. Comput. Eng., Univ. Waterloo, Waterloo, ON, Canada, 2006.
[16] D. Batory, "Feature models, grammars, and propositional formulas," in Proc. 9th Int. Conf. Softw. Product Lines, 2005, pp. 7-20.
[17] K. C. Kang, J. Lee, and P. Donohoe, "Feature-oriented product line engineering," IEEE Softw., vol. 19, no. 4, pp. 58-65, Jul./Aug. 2002.
[18] G. Bockle, P. Clements, J. McGregor, D. Muthig, and K. Schmid, "Calculating ROI for software product lines," IEEE Softw., vol. 21, no. 3, pp. 23-31, May/Jun. 2004.
[19] P. Clements, J. McGregor, and S. Cohen, "The structured intuitive model for product line economics," Softw. Eng. Inst., Pittsburgh, PA, USA, Tech. Rep. CMU/SEI-2005-TR-003, 2005.
[20] K. Czarnecki and U. Eisenecker, Generative Programming: Methods, Tools and Applications. Reading, MA, USA: Addison-Wesley, 2000.
[21] B. Boehm, A. W. Brown, R. Madachy, and Y. Yang, "A software product line life cycle cost estimation model," in Proc. Int. Symp. Empirical Softw. Eng., 2004, pp. 156-164.
[22] S. B. A. B. Lamine, L. L. Jilani, and H. H. B. Ghezala, "Cost estimation for product line engineering using COTS components," in Proc. Softw. Product Line Conf., 2005, vol. 3714, pp. 113-123.
[23] J. P. Nobrega, E. S. de Almeida, and S. R. de Lemos Meira, "Income: Integrated cost model for product line engineering," in Proc. 34th Euromicro Conf. Softw. Eng. Adv. Appl., 2008, pp. 27-34.
[24] J. S. Poulin, "The economics of software product lines," Int. J. Appl. Softw. Technol., vol. 3, no. 1, pp. 20-34, Mar. 1997.
[25] J. H. Wesselius, "Strategic scenario-based valuation of product line roadmaps," in Software Product Lines: Research Issues in Engineering and Management. Berlin, Germany: Springer, 2006, pp. 53-89.
[26] S. Cohen, "Predicting when product line investment pays," Softw. Eng. Inst., Pittsburgh, PA, USA, Tech. Rep. CMU/SEI-2003-TN-017, 2003.
[27] D. D. Galorath and M. W. Evans, Software Sizing, Estimation, and Risk Management: When Performance Is Measured Performance Improves. New York, NY, USA: Auerbach, 2006.
[28] H. P. In, J. Baik, S. Kim, Y. Yang, and B. Boehm, "A quality-based cost estimation model for the product line life cycle," Commun. ACM, vol. 49, no. 12, pp. 85-88, 2006.
[29] S. A. Cook, "The complexity of theorem-proving procedures," in Proc. 3rd Annu. ACM Symp. Theory Comput., 1971, pp. 151-158.
[30] A. Biere, M. Heule, H. van Maaren, and T. Walsh, Handbook of Satisfiability, ser. Frontiers in Artificial Intelligence and Applications, vol. 185. Amsterdam, The Netherlands: IOS Press, 2009.
[31] R. Bayardo and J. Pehoushek, "Counting models using connected components," in Proc. Nat. Conf. Artif. Intell., 2000, pp. 157-162.
[32] M. Thurley, "sharpSAT: Counting models with advanced component caching and implicit BCP," in Proc. 9th Int. Conf. Theory Appl. Satisfiability Testing, 2006, pp. 424-429.
[33] T. Sang, F. Bacchus, P. Beame, H. Kautz, and T. Pitassi, "Combining component caching and clause learning for effective model counting," in Proc. 7th Int. Conf. Theory Appl. Satisfiability Testing, 2004, pp. 20-28.
[34] T. Sang, P. Beame, and H. A. Kautz, "Heuristics for fast exact model counting," in Proc. 8th Int. Conf. Theory Appl. Satisfiability Testing, 2005, pp. 226-240.
[35] M. Mendonca, A. Wasowski, and K. Czarnecki, "SAT-based analysis of feature models is easy," in Proc. 13th Int. Softw. Product Line Conf., 2009, pp. 231-240.
[36] D. Benavides, A. Ruiz-Cortes, and P. Trinidad, "Automatic reasoning on feature models," in Proc. 17th Int. Conf. Adv. Inf. Syst. Eng., 2005, pp. 491-503.
[37] D. Benavides, S. Segura, and P. Trinidad, "Using Java CSP solvers in the automated analyses of feature models," Tech. Softw., vol. 1, pp. 399-408, 2006.
[38] K. Czarnecki and A. Wasowski, "Feature diagrams and logics: There and back again," in Proc. 11th Int. Softw. Product Line Conf., 2007, pp. 23-34.
[39] Y. Gil and S. Kremer-Davidson, "Sans constraints? Feature diagrams vs. feature models," in Proc. 14th Int. Conf. Softw. Product Lines: Going Beyond, 2010, vol. 6287, pp. 271-285.
[40] M. Davis, G. Logemann, and D. Loveland, "A machine program for theorem-proving," Commun. ACM, vol. 5, pp. 394-397, Jul. 1962.
[41] M. Mendonca, M. Branco, and D. Cowan, "S.P.L.O.T.: Software product lines online tools," in Proc. 24th ACM SIGPLAN Conf. Object-Oriented Program. Syst. Lang. Appl. (OOPSLA Companion), Oct. 2009, p. 761.
[42] R. E. Bryant, "Graph-based algorithms for Boolean function manipulation," IEEE Trans. Comput., vol. C-35, no. 8, pp. 677-691, Aug. 1986.


[43] M. Mendonca, A. Wasowski, K. Czarnecki, and D. Cowan, "Efficient compilation techniques for large scale feature models," in Proc. 7th Int. Conf. Generative Program. Component Eng., 2008, pp. 13-22.
[44] B. Bollig and I. Wegener, "Improving the variable ordering of OBDDs is NP-complete," IEEE Trans. Comput., vol. 45, no. 9, pp. 993-1002, Sep. 1996.
[45] F. Bacchus, S. Dalmao, and T. Pitassi, "Algorithms and complexity results for #SAT and Bayesian inference," in Proc. 44th Annu. IEEE Symp. Found. Comput. Sci., 2003, pp. 340-351.
[46] G. Tseitin, "On the complexity of derivation in propositional calculus," in Structures in Constructive Mathematics and Mathematical Logic, Part II, Seminars in Mathematics (translated from Russian). Moscow, Russia: Steklov Math. Inst., 1968, pp. 115-125.
[47] J. P. Marques-Silva and K. A. Sakallah, "GRASP: A new search algorithm for satisfiability," in Proc. IEEE/ACM Int. Conf. Comput.-Aided Des., 1996, pp. 220-227.
[48] N. Sorensson and N. Een, "MiniSat v1.13: A SAT solver with conflict-clause minimization," in Proc. Syst. Descriptions SAT Competition, 2005, p. 53.
[49] M. W. Moskewicz, C. F. Madigan, Y. Zhao, L. Zhang, and S. Malik, "Chaff: Engineering an efficient SAT solver," in Proc. 38th Annu. Des. Autom. Conf., 2001, pp. 530-535.
[50] K. Pipatsrisawat and A. Darwiche, "RSat 1.03: SAT solver description," Automated Reasoning Group, Comput. Sci. Dept., UCLA, Los Angeles, CA, USA, Tech. Rep. D152, 2006.
[51] Y. Hamadi, S. Jabbour, and L. Sais, "ManySAT: A parallel SAT solver," J. Satisfiability, Boolean Model. Comput., vol. 6, pp. 245-262, 2009.
[52] D. Pretolani, "Satisfiability and hypergraphs," Ph.D. dissertation, Dipt. di Informatica, Universita di Pisa, 1993.
[53] J. W. Freeman, "Improvements to propositional satisfiability search algorithms," Ph.D. dissertation, Comput. Inf. Sci., Univ. Pennsylvania, Philadelphia, PA, USA, 1995.
[54] T. Sang, P. Beame, and H. A. Kautz, "Solving Bayesian networks by weighted model counting," in Proc. 20th Nat. Conf. Artif. Intell., 2005, vol. 1, pp. 475-482.
[55] A. Biere, M. J. Heule, H. van Maaren, and T. Walsh, Handbook of Satisfiability, ser. Frontiers in Artificial Intelligence and Applications, vol. 185. Amsterdam, The Netherlands: IOS Press, Feb. 2009.
[56] T. Berger, S. She, and R. Lotufo, "Variability modeling in the real: A perspective from the operating systems domain," in Proc. IEEE/ACM Int. Conf. Automated Softw. Eng., 2010, pp. 73-82.

David Fernandez-Amoros received the MS degree in mathematics and the MS degree in philology from the Universidad Complutense de Madrid, Spain, in 1994 and 2009, respectively, and the PhD degree in computer science from the Universidad Nacional de Educacion a Distancia, Spain, in 2004. Since November 2001, he has been an assistant professor at the Universidad Nacional de Educacion a Distancia, Spain. His research and teaching interests include software engineering, natural language processing, compiler design, and concurrent programming.

    Ruben Heradio received the MS degree incomputer s