Generating Diverse Solutions in SAT


Alexander Nadel, Intel Israel

IBM; Haifa, Israel
October 31, 2011

Agenda
- Introduction
- Analysis
- Polarity-based Algorithms
- Variable-based Algorithms
  - Local Algorithms
  - Global Algorithms
- Conclusion

DiversekSet in SAT
- Given a propositional formula in CNF, generate a number of solutions that are as diverse as possible.
- A solution is a satisfying assignment.
- The threshold on the number of solutions is provided by the user.

DiversekSet: Brief History
- DiversekSet in CSP has been studied since 2005.
  - See the paper for references.
- DiversekSet in SAT
  - The first work is our FMCAD'10 paper on semi-formal FPV.
    - Semi-formal FPV finds bugs in hardware that cannot be identified by other methods.
    - DiversekSet is the prime reasoning engine.
  - The problem has a number of additional applications at Intel.
  - This work is the first full-blown paper.

Algorithms for DiversekSet in SAT in a Glance
- The idea: adapt a modern CDCL SAT solver for DiversekSet.
  - Make minimal changes to remain efficient.
- Compact algorithms:
  - Invoke the SAT solver once to generate all the solutions.
  - Restart after a solution is generated.
- This work: diversity is achieved by modifying the polarity and variable selection heuristics.

Diversification Quality as the Average Hamming Distance
Quality: the average Hamming distance between the solutions, normalized to [0, 1].

Solutions (rows) and variables (columns):
     a  b  c
  1  0  0  0
  2  1  1  0
  3  0  1  1
  4  1  0  0

Hamming distances matrix:
     1  2  3  4
  1     2  2  1
  2        2  1
  3           3
  4
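The Hamming-distance definition of quality can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the function names are hypothetical, and the normalization (dividing the average pairwise distance by the number of variables) is an assumption chosen so that the result lies in [0, 1].

```python
from itertools import combinations

def hamming(s, t):
    """Number of positions in which two equal-length assignments differ."""
    return sum(x != y for x, y in zip(s, t))

def quality_hamming(solutions):
    """Average pairwise Hamming distance divided by the number of variables.

    Assumed normalization: each pairwise distance is at most the number of
    variables, so the result lies in [0, 1].
    """
    pairs = list(combinations(solutions, 2))
    n_vars = len(solutions[0])
    return sum(hamming(s, t) for s, t in pairs) / (len(pairs) * n_vars)

# The four solutions over (a, b, c) from the slide.
SOLUTIONS = [(0, 0, 0), (1, 1, 0), (0, 1, 1), (1, 0, 0)]
# Distances in the order (1,2), (1,3), (1,4), (2,3), (2,4), (3,4).
DISTANCES = [hamming(s, t) for s, t in combinations(SOLUTIONS, 2)]
```

Under this assumed normalization the six distances sum to 11, so the example has quality 11 / (6 * 3).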

Quality via Variable Contribution
- We formulate an alternative definition for the same notion of quality.
- It induces quality-efficient polarity and variable selection strategies by:
  - allowing one to estimate the contribution of each variable to quality online;
  - inducing methods to improve the quality online.

Variable Quality
Variable quality: the number of different pairs, that is {0,1} or {1,0}, amongst distinct pairs of values assigned to the variable.

     a  b  c
  1  0  0  0
  2  1  1  0
  3  0  1  1
  4  1  0  0

Variable quality for a: start counting with $S_a = 0$ and compare each pair of values in column a. If the two values are different, update $S_a$; if they are equal, do not update $S_a$. For this example the count ends at $S_a = 4$.

A Simple Way to Calculate Variable Quality
$S_v = p_v \cdot n_v$, where $p_v$ is the number of 1s assigned to v and $n_v$ is the number of 0s assigned to v.
In the example, $S_a = 4$.

Quality via Variable Quality
Quality: the average variable quality, normalized to [0, 1].
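The product rule for variable quality can be checked on the example with a short Python sketch. The function names are hypothetical, and the divisor used for normalization (number of variables times number of solution pairs) is an assumption, since the slide's normalization formula did not survive extraction.

```python
def variable_quality(column):
    """S_v = p_v * n_v for the values assigned to one variable."""
    p_v = sum(column)            # number of 1s assigned to v
    n_v = len(column) - p_v      # number of 0s assigned to v
    return p_v * n_v

def quality_via_variables(solutions):
    """Sum of variable qualities over (number of variables * number of pairs).

    The divisor is an assumed normalization: a variable can differ in at most
    every one of the m*(m-1)/2 pairs of solutions.
    """
    m, n_vars = len(solutions), len(solutions[0])
    columns = list(zip(*solutions))          # one tuple of values per variable
    total = sum(variable_quality(col) for col in columns)
    return total / (n_vars * m * (m - 1) / 2)

SOLUTIONS = [(0, 0, 0), (1, 1, 0), (0, 1, 1), (1, 0, 0)]
PER_VARIABLE = [variable_quality(col) for col in zip(*SOLUTIONS)]
```

Here a and b each have two 1s and two 0s, while c has one 1 and three 0s, which is where the per-variable values come from.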

Why Are the Definitions Identical?
Both definitions of quality sum up the number of different pairs of values assigned to variables.
- The Hamming distance-based definition counts the number of different pairs for each pair of rows.
- The variable quality-based definition counts the number of different pairs per variable, that is, column by column.

     a  b  c
  1  0  0  0
  2  1  1  0
  3  0  1  1
  4  1  0  0
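The row-wise versus column-wise argument amounts to swapping the order of two sums. A short derivation, using notation that is assumed here rather than taken from the slides ($s_i(v)$ for the value of variable $v$ in solution $i$, $d_H$ for the Hamming distance, $m$ for the number of solutions, and $[\cdot]$ for the indicator that equals 1 when the condition holds):

```latex
\begin{align*}
\sum_{1 \le i < j \le m} d_H(s_i, s_j)
  &= \sum_{1 \le i < j \le m} \; \sum_{v} [\, s_i(v) \ne s_j(v) \,] \\
  &= \sum_{v} \; \sum_{1 \le i < j \le m} [\, s_i(v) \ne s_j(v) \,] \\
  &= \sum_{v} p_v \, n_v \;=\; \sum_{v} S_v .
\end{align*}
```

The last step holds because a pair of values of $v$ differs exactly when one of them is among the $p_v$ ones and the other is among the $n_v$ zeros. For the four-solution example both sides equal 11 (distances 2 + 2 + 1 + 2 + 1 + 3, variable qualities 4 + 4 + 3).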

Quality via Variable Quality
Quality: the average variable quality, normalized to [0, 1].
This definition induces DiversekSet algorithms that improve quality online, e.g.:
- Polarity-based algorithm:
  - For a decision variable, pick a polarity that maximizes variable quality after the assignment.
  - The current partial assignment is taken into account as a solution when calculating quality online.
  - A simple way to keep track of which polarity maximizes variable quality is provided next.

Variable Potential
Variable potential (denoted here by $\phi_v$): the difference between the number of 1s and 0s assigned to a variable:

$\phi_v = p_v - n_v$

             a   b   c
  1          0   0   0
  2          1   1   0
  3          0   1   1
  4          1   0   0
  potential  0   0  -2

Relation between Potential and Quality
The closer the variable potential is to 0, the higher the variable quality. With m solutions, $n_v = m - p_v$, so:

$\phi_v = p_v - n_v = p_v - (m - p_v) = 2 p_v - m$
$S_v = p_v n_v = p_v (m - p_v) = m p_v - p_v^2$

$S_v$ is maximal when $\phi_v = 0$.
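The two identities for potential and quality can be combined to express variable quality directly in terms of the potential. The derivation below is a worked step added for illustration; it writes $\phi_v$ for the potential $p_v - n_v$ (a symbol chosen here) and $m = p_v + n_v$ for the number of solutions.

```latex
\begin{align*}
p_v &= \frac{m + \phi_v}{2}, \qquad n_v = \frac{m - \phi_v}{2}, \\
S_v &= p_v \, n_v = \frac{(m + \phi_v)(m - \phi_v)}{4} = \frac{m^2 - \phi_v^2}{4}.
\end{align*}
```

For a fixed number of solutions, $S_v$ therefore depends only on $|\phi_v|$. In the four-solution example, $\phi_a = 0$ gives $S_a = 16/4 = 4$ and $\phi_c = -2$ gives $S_c = (16 - 4)/4 = 3$.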

As the absolute potential moves away from 0, variable quality drops.

pGuide: a Dedicated Compact Polarity-based Algorithm
- Compactness:
  - The SAT solver is invoked once.
  - The solver restarts upon each new solution.
  - Only the polarity selection heuristic is modified.
- pGuide's polarity selection heuristic:
  - If the potential is positive, pick 0.
  - If the potential is negative, pick 1.
  - If the potential is 0, pick a random value.
- Properties:
  - Always prefers the value that yields better quality (if such a value exists).
    - The potential moves closer to 0, so variable quality improves, so overall quality improves.
  - Yields the best possible quality given a tautological formula.
    - Can be easily proven by induction on the number of solutions.

pRand: a Randomized Compact Polarity-based Algorithm
- Chooses the polarity randomly.
- Unlike pGuide, might choose a polarity that yields worse quality than the other polarity.

pGuide vs. pRand on Tautological Formulas
- pGuide picks the values r0, ¬r0, r1, ¬r1, r2, ¬r2, ... for every variable, where each r_i is a randomly chosen value.
- pRand picks random values.
- For 2 solutions, the quality is:
  - 1 for pGuide: {0, 1} or {1, 0};
  - 0.5 for pRand: {0, 0}, {0, 1}, {1, 0} or {1, 1} with equal probability.
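The pGuide polarity rule can be sketched as follows. This is a simplified illustration rather than solver code: the function and variable names are hypothetical, potentials are kept in a plain dictionary, and the formula is assumed to be a tautology so that every pick of the heuristic immediately becomes a solution (the online treatment of the current partial assignment is left out).

```python
import random

def pguide_polarity(potential, rng=random):
    """Pick the value that moves the variable's potential toward 0."""
    if potential > 0:      # more 1s than 0s so far: pick 0
        return 0
    if potential < 0:      # more 0s than 1s so far: pick 1
        return 1
    return rng.randint(0, 1)   # balanced: either value is equally good

def update_potentials(potentials, solution):
    """Add a new solution: +1 for every variable set to 1, -1 for every 0."""
    for var, value in solution.items():
        potentials[var] = potentials.get(var, 0) + (1 if value else -1)

# Tautological formula over three variables: every assignment is a solution,
# so each solution is simply whatever the polarity heuristic picks.
rng = random.Random(0)
potentials = {}
solutions = []
for _ in range(4):
    solution = {v: pguide_polarity(potentials.get(v, 0), rng) for v in "abc"}
    update_potentials(potentials, solution)
    solutions.append(solution)
```

After an even number of solutions every potential is back at 0, which is the alternating pattern of a random value followed by its negation described for tautological formulas.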

pGuide vs. pRand on Tautological Formulas
[Formulas: quality of pGuide for even m and for odd m, and quality of pRand]
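The quality values on tautological formulas can be recomputed from the definitions. The sketch below is a derivation added for illustration, not the slide's own formulas: it assumes quality is the sum of variable qualities divided by (number of variables times number of solution pairs), with m the number of solutions, and all function names are hypothetical. Under that assumption pGuide gives m / (2(m - 1)) for even m and (m + 1) / (2m) for odd m, while the expected quality of pRand is 1/2.

```python
from fractions import Fraction

def quality_from_ones(ones_per_variable, m):
    """Quality from p_v (number of 1s) per variable, over m >= 2 solutions."""
    pairs = Fraction(m * (m - 1), 2)
    total = sum(p * (m - p) for p in ones_per_variable)   # sum of S_v
    return total / (pairs * len(ones_per_variable))

def pguide_quality(m):
    """pGuide on a tautology: values alternate, so p_v is m // 2 or (m + 1) // 2.

    Either split gives the same product p_v * n_v, so one variable suffices.
    """
    return quality_from_ones([m // 2], m)

def prand_expected_quality(m):
    """pRand on a tautology: each pair of values differs with probability 1/2."""
    pairs = Fraction(m * (m - 1), 2)
    expected_s_v = pairs / 2
    return expected_s_v / pairs

def pguide_closed_form(m):
    """Derived closed form: m / (2(m-1)) for even m, (m+1) / (2m) for odd m."""
    return Fraction(m, 2 * (m - 1)) if m % 2 == 0 else Fraction(m + 1, 2 * m)
```

For m = 2 this reproduces the values stated for two solutions: 1 for pGuide and 0.5 for pRand. The pGuide value decreases toward 1/2 as m grows, so the gap between the two is largest for a small number of solutions.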

Our Experimental Setup
- 66 instances from semi-formal FPV
- Stats:
  - Variables: 213,047 to 910,868
  - Clauses: 738,862 to 3,251,382
  - All the instances are available by email.
- Machines:
  - Intel Xeon, 4 GHz CPU frequency
  - 32 GB memory

Quality Comparison between pGuide and pRand
- pGuide is clearly preferable to pRand, especially for a small number of solutions.
- There is a resemblance between the quality function on tautological and real-life formulas.
- The quality for real-life formulas is significantly lower.
- How can one improve the quality on real-life formulas?

pBCPGuide: a BCP-aware Compact Polarity-based Algorithm
- The idea: take constraints into account by considering the impact of Boolean Constraint Propagation (BCP).
- For a newly selected decision variable:
  - For each polarity in {0, 1}:
    1. Pick the polarity.
    2. Propagate with BCP.
    3. Write down the quality Q.
    4. Undo.
  - Pick the polarity with the larger Q.
- It is sufficient to measure the delta in the absolute value of the variable potentials for step (3): the lower the delta, the better the quality.
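The pBCPGuide steps above can be sketched with a naive unit-propagation loop. Everything here is an assumption made for illustration: clauses are lists of DIMACS-style integer literals, assignments and potentials are dictionaries, propagation works on a copy instead of undoing, a tie goes to the polarity tried first, and the function names are hypothetical. A real CDCL solver would use watched literals and its own trail.

```python
def propagate(clauses, assignment):
    """Naive BCP: extend a copy of `assignment` with all unit implications.

    Returns the extended assignment, or None if some clause is falsified.
    """
    assignment = dict(assignment)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            unassigned = []
            satisfied = False
            for lit in clause:
                var, want = abs(lit), lit > 0
                if var not in assignment:
                    unassigned.append(lit)
                elif assignment[var] == want:
                    satisfied = True
                    break
            if satisfied:
                continue
            if not unassigned:
                return None                     # conflict
            if len(unassigned) == 1:            # unit clause: value is forced
                lit = unassigned[0]
                assignment[abs(lit)] = lit > 0
                changed = True
    return assignment

def potential_delta(potentials, before, after):
    """Change in the sum of |potential| caused by the newly assigned variables."""
    delta = 0
    for var in after.keys() - before.keys():
        old = potentials.get(var, 0)
        new = old + (1 if after[var] else -1)
        delta += abs(new) - abs(old)
    return delta

def pbcpguide_polarity(clauses, assignment, potentials, var):
    """Try both polarities with BCP and keep the one with the lower delta."""
    best = None
    for value in (False, True):
        trial = propagate(clauses, {**assignment, var: value})
        if trial is None:
            continue                            # this polarity conflicts immediately
        delta = potential_delta(potentials, assignment, trial)
        if best is None or delta < best[0]:
            best = (delta, value)
    return None if best is None else best[1]
```

For example, with clauses `[[-1, 2], [-1, 3], [1, 4]]` and potentials `{2: -2, 3: -2, 4: 1}`, setting variable 1 to true forces variables 2 and 3 to true, which pulls both of their potentials toward 0, so that polarity is preferred even though the potential of variable 1 itself is neutral.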

pBCPGuide: More
- Plain pBCPGuide is too costly in terms of performance.
  - It performs BCP 2 or 3 times per decision.
- Optimizations:
  - Continue with the second polarity if it yields better quality (instead of undoing) to save a BCP.
  - The first polarity should be the inferior one in terms of the impact on the variable quality of the decision variable.
    - This increases the chances that the second polarity yields better quality, in which case only 2 BCPs are required.
- pBCPGuide_T
  - Use pBCPGuide until T conflicts are encountered from the moment when either:
    - the search is started, or
    - a new solution is discovered.
  - Switch to pGuide after the threshold is reached.

Polarity-based Algorithms: Empirical Comparison Summary
- pGuide is preferable to pRand in terms of both quality and run-time.
- pBCPGuide_T is preferable to pGuide in terms of quality, but pays a fee in terms of performance.
  - T regulates the trade-off between quality and run-time.
  - pBCPGuide_100 seems to achieve an attractive balance between run-time and quality.
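The switching rule of pBCPGuide_T can be sketched as a small counter that a solver would consult before each decision. The class and method names are hypothetical, and the exact boundary (switching once the conflict count reaches T) is an assumption about how "until T conflicts are encountered" is meant.

```python
class PolarityModeSwitch:
    """Tracks whether pBCPGuide or pGuide should choose the next polarity."""

    def __init__(self, threshold):
        self.threshold = threshold      # T in pBCPGuide_T
        self.conflicts = 0              # conflicts since start or last solution

    def on_conflict(self):
        self.conflicts += 1

    def on_solution(self):
        self.conflicts = 0              # a new solution re-enables pBCPGuide

    def mode(self):
        return "pBCPGuide" if self.conflicts < self.threshold else "pGuide"
```

With `PolarityModeSwitch(100)` the expensive BCP-based choice is used only during the first 100 conflicts after the start of the search or after each new solution, which corresponds to the pBCPGuide_100 configuration.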

Variable-based Algorithms
- The idea: change the variable ordering to improve the quality.
- Trade-off between run-time and quality:
  - The better the quality, the worse the run-time.
  - We provide ways to control the trade-off.
- Our variable-based algorithms are built on top of polarity-based algorithms.

Background: CBH Decision Heuristic
- All the clauses are organized in a list.
  - Both initial and conflict clauses.
- The next decision variable is picked from the top-most unsatisfied clause.
  - VSIDS is used to decide which variable to pick.
- Upon conflict, the following clauses are placed at the top:
  - the new conflict clause, followed by
  - the clauses that participated in generating it by resolution.

Classification of Variable-based Methods
- Local vs. Global: with respect to the default variable ordering.
- Guided vs. Randomized: guiding the solver away from previous solutions vs. using randomization over variables.
- pGuide vs. pBCPGuide_100: the underlying polarity-based algorithm.

Classification of Variable-based Methods
[Diagram: classification grid with the labels Guided, Random., Local, Global, pGuide and pBCPGuide_100]

Pick the variable with the highest absolute value of potential from the top-most non-satisfied clause.
- Guided: the higher the absolute potential, the better the quality after the assignment.
- Local: makes only a minimal modification to the default decision heuristic.
  - Picks the variable from the same clause as the original CBH, but does not follow VSIDS within that clause.
- pGuide: uses pGuide for polarity selection.

Pick a variable at random from the top-most non-satisfied clause.
- Randomized: picks the variable at random.
- Local: picks the variable from the same clause as the original CBH, but does not follow VSIDS within that clause.
- pGuide: uses pGuide for polarity selection.
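The two local variable-selection rules can be sketched on top of a CBH-style clause list. This is an illustrative simplification: the clause list is a plain Python list ordered from the top, literals are DIMACS-style integers, and all function names are hypothetical. It also assumes the functions are called after propagation, so an unsatisfied clause normally still has unassigned variables.

```python
import random

def is_satisfied(clause, assignment):
    """A clause is satisfied if any of its literals is true under `assignment`."""
    return any(assignment.get(abs(lit)) == (lit > 0) for lit in clause)

def topmost_unsatisfied(clause_list, assignment):
    """First clause in the CBH-style list that is not yet satisfied."""
    for clause in clause_list:
        if not is_satisfied(clause, assignment):
            return clause
    return None

def unassigned_vars(clause, assignment):
    """Variables of the clause that have no value yet."""
    return [abs(lit) for lit in clause if abs(lit) not in assignment]

def local_guided(clause_list, assignment, potentials):
    """Unassigned variable with the highest |potential| in the top-most clause."""
    clause = topmost_unsatisfied(clause_list, assignment)
    candidates = unassigned_vars(clause, assignment) if clause else []
    if not candidates:
        return None
    return max(candidates, key=lambda v: abs(potentials.get(v, 0)))

def local_random(clause_list, assignment, rng=random):
    """Random unassigned variable from the top-most unsatisfied clause."""
    clause = topmost_unsatisfied(clause_list, assignment)
    candidates = unassigned_vars(clause, assignment) if clause else []
    return rng.choice(candidates) if candidates else None
```

Both functions look at the same clause that the default heuristic would use and differ only in how they choose within it, which is what keeps the modification local. Polarity for the chosen variable would then come from the pGuide rule.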

How to integrate pBCPGuide_T?
- Run BCP for each polarity for a number of variables and pick the best?
  - Too expensive!
- Use pBCPGuide_T as the polarity selection heuristic?
  - Yields a slight deterioration in quality as compared to plain pBCPGuide_T!
- The solution: run plain pBCPGuide_T until the threshold, then switch to a variable-based strategy + pGuide.

Local guided variable-based vs. plain polarity-based
- Moderate improvement in quality for a marginal run-time fee.

Guided vs. randomized variable-based
- The guided method yields faster run-time and better quality for a small number of solutions.
- The randomized method yields better quality for a large number of solutions.
  - Breaking dependencies pays off when the problem is well-explored.

Global methods: trade run-time for quality by getting away from the default decision heuristic.

Pick the variable with the highest potential from the N top-most clauses, including satisfied clauses.
- Guided: the higher the potential, the better the quality after the assignment.
- Global: makes a substantial modification to the default decision heuristic.
- Tie-breaking strategies target better performance by trying to get closer to the original CBH:
  - Prefer clauses closer to the top-most non-satisfied clause.
  - Prefer variables with higher VSIDS scores.
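The global guided rule with its tie-breaking can be sketched as a single ranking over the first N clauses. Several points are assumptions made for this illustration: potential is compared by absolute value (as in the local guided method), clause proximity is applied before the VSIDS score, the `vsids` dictionary stands in for the solver's activity scores, and the function name is hypothetical.

```python
def global_guided(clause_list, assignment, potentials, vsids, n_top):
    """Pick an unassigned variable from the first `n_top` clauses.

    Ranking: highest |potential| first; ties go to the clause nearest the
    top-most unsatisfied clause, then to the higher VSIDS score.
    """
    def satisfied(clause):
        return any(assignment.get(abs(lit)) == (lit > 0) for lit in clause)

    window = clause_list[:n_top]
    # Index of the top-most unsatisfied clause inside the window (0 if none).
    anchor = next((i for i, c in enumerate(window) if not satisfied(c)), 0)

    best_var, best_key = None, None
    for index, clause in enumerate(window):
        for lit in clause:
            var = abs(lit)
            if var in assignment:
                continue
            key = (abs(potentials.get(var, 0)),     # guided criterion
                   -abs(index - anchor),            # closer to the anchor clause
                   vsids.get(var, 0.0))             # higher VSIDS score
            if best_key is None or key > best_key:
                best_var, best_key = var, key
    return best_var
```

Increasing `n_top` widens the set of candidate variables, which is the knob the guided global strategy offers for trading run-time against quality.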

Pick an unassigned variable at random in T% of the cases.
- Randomized: the variable choice is randomized.
- Global: makes a substantial modification to the default decision heuristic.

Global Variable-based Strategies: Results Summary
- Local vs. Global:
  - Run-time: local came out best.
  - Quality: global won.
- Guided vs. Randomized:
  - Run-time: guided came out best.
  - Quality: guided won for a low number of solutions; randomized won for a high number of solutions.
- Trade-off between quality and run-time:
  - Thresholds can be used to regulate the trade-off between quality and run-time for global strategies:
    - the number of clauses to consider, for guided strategies;
    - the percentage of randomized decisions, for randomized strategies.

Recommended Strategies

Conclusion
- DiversekSet in SAT: generate a user-given number of diverse solutions.
  - Used in a number of prominent FV flows at Intel.
- Compact algorithms:
  - The SAT solver is invoked once.
  - The solver restarts upon each new solution.
- pGuide, the fundamental polarity-based algorithm:
  - Only the polarity selection strategy is modified.
  - The goal is to balance the number of 1s and 0s assigned to each variable in the solutions.
  - Yields optimal diversification quality given a tautological formula.
- On real-world formulas, quality can be improved by taking BCP into account and adapting the variable ordering.
- One can trade run-time for quality.
  - We demonstrated how to control the trade-off.

Thanks!
