Levels of Abstraction in Probabilistic Modeling and Sampling

Moshe Looks
November 18th, 2005





Outline

• Global Optimization with Graphical Models

• Where Do Our Random Variables Come From?

• Incorporating Levels of Abstraction

• Results

• Conclusions


Global Optimization

• User Determines
– How instances are represented
• E.g., fixed-length bit-strings drawn from {0,1}^n
– How instance quality is evaluated
• E.g., a fitness function from {0,1}^n to R
– May be expensive to compute
• Assumes a Black Box
– No additional problem knowledge


Approaches

• Blind Search
– Generate and test random instances
• Local Search (hill climbing, annealing, etc.)
– Search from a single (best) instance seen
• Population-Based Search
– Search from a collection of good instances seen


Population-Based Search

1. Generate a population of random instances

2. Recombine promising instances in the population to create new instances

3. Remove the worst instances from the population

4. Go to step 2
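A minimal sketch of this loop in Python; the truncation selection and pairwise recombination policies here are our assumptions, not prescriptions from the slides:

import random

def population_search(fitness, random_instance, recombine,
                      pop_size=100, generations=50):
    # 1. Generate a population of random instances
    pop = [random_instance() for _ in range(pop_size)]
    for _ in range(generations):
        # 2. Recombine promising instances (here: the top half) into new ones
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]
        children = [recombine(*random.sample(parents, 2))
                    for _ in range(pop_size - len(parents))]
        # 3. Remove the worst instances by replacing them with the children
        pop = parents + children  # 4. loop back to step 2
    return max(pop, key=fitness)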


Population-Based Search (continued)

• How to Generate New Instances?

• Genetic Algorithms
– Crossover + mutation
• Estimation of Distribution Algorithms (EDAs)
– Generate instances by sampling from a probability distribution reflecting the good instances
– How should the distribution be represented?


Probability-Vector EDA (Bit-String Case)

• Each position in the bit-string corresponds to a random variable in the model; X = {X1,X2,…,Xn}

• Assume independence
– For the population 001, 111, and 101:
• P(X1=0) = 1/3, P(X1=1) = 2/3
• P(X2=0) = 2/3, P(X2=1) = 1/3
• P(X3=0) = 0, P(X3=1) = 1
– E.g., P(011) = P(X1=0) · P(X2=1) · P(X3=1) = 1/3 · 1/3 · 1 = 1/9
– Instances can then be generated according to the distribution

• Population-Based Incremental Learning (PBIL)
– Baluja, 1995
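In code, the probability-vector model amounts to independent per-position frequencies; a minimal sketch reproducing the slide's numbers:

import random

def learn_marginals(population):
    # Estimate P(Xi = 1) independently for each position
    n = len(population[0])
    return [sum(x[i] for x in population) / len(population) for i in range(n)]

def sample(p):
    # Draw one instance from the fully factorized distribution
    return [1 if random.random() < pi else 0 for pi in p]

pop = [[0, 0, 1], [1, 1, 1], [1, 0, 1]]  # the population 001, 111, 101
print(learn_marginals(pop))              # [2/3, 1/3, 1.0], as above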


Graphical Models

• Probability + graph theory
– Nodes are random variables
– Graph structure encodes variable dependencies
• Great for handling
– Uncertainty
– Complexity
– Learning


The Bayesian Optimization Algorithm (Pelikan, Goldberg, and Cantú-Paz, 1999)

• Dynamically learn dependencies between variables (nodes in a Bayesian network)

• Without dependencies:
P(X1X2X3X4) = P(X1) · P(X2) · P(X3) · P(X4)
• With dependencies:
P(X1X2X3X4) = P(X1) · P(X2 | X1) · P(X3) · P(X4 | X1, X3)
– Dependencies (edges) correspond to partitions of the target node's distribution based on the source node's distribution
• Variables must now be sampled in topological order
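To make the sampling-order point concrete, here is ancestral sampling for the factorization above; the conditional probability values are invented for illustration (BOA learns both the structure and these tables from the population):

import random

# Each variable maps to (parents, CPT); the CPT maps a parent assignment
# to P(variable = 1). This encodes P(X1)·P(X2|X1)·P(X3)·P(X4|X1,X3).
model = {
    "X1": ((), {(): 0.7}),
    "X2": (("X1",), {(0,): 0.2, (1,): 0.9}),
    "X3": ((), {(): 0.5}),
    "X4": (("X1", "X3"), {(0, 0): 0.1, (0, 1): 0.4,
                          (1, 0): 0.6, (1, 1): 0.95}),
}

def sample_instance(model, order=("X1", "X2", "X3", "X4")):
    # `order` must be topological: parents are assigned before children
    x = {}
    for var in order:
        parents, cpt = model[var]
        p_one = cpt[tuple(x[p] for p in parents)]
        x[var] = 1 if random.random() < p_one else 0
    return x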


Where Do Our Random Variables Come From?

• The real world is a mess!
– Boundaries are fuzzy and ambiguous
– Even in discrete domains, ambiguity remains:
• E.g., in DNA, a gene's position is sometimes critical and sometimes irrelevant
• Consider abstracted features, defined in terms of "base-level variables"
– E.g., contains a prime number of ones
– E.g., does not contain the substring AATGC


Features

• Features are predicates over instances, describing the presence or absence of some pattern

• Base-level variables Xi, 1 ≤ i ≤ n, are a special case (i.e., "Xi = 1" is a feature)

• Any well-defined feature (f) may be introduced as a node in the graph for probabilistic modeling

• What about model-based instance generation?
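Concretely, a feature is just a Boolean predicate over instances; a minimal sketch (the function names are ours):

def contains_11(x):
    # Feature: the bit-string instance contains the substring 11
    return "11" in "".join(map(str, x))

def base_var(i):
    # The special case: "Xi = 1" as a feature (1-indexed, as in the slides)
    return lambda x: x[i - 1] == 1

x = [1, 0, 1, 1, 0]
print(contains_11(x), base_var(1)(x))  # True True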


Definitions for Feature-Based Sampling

• Given a set of base variables X = {X1, X2, …, Xn} and an instance x = (x1x2…xn):
– S ⊆ X is sufficient for fx if f is true for every solution with the same assignment as x for the variables in S
– S is minimally sufficient if none of its subsets are sufficient
– The unique grounding of fx is the union of all minimally sufficient sets
• If f is not present in x, the grounding of fx is the empty set
• E.g., for f = "contains the substring 11", the grounding of f10110111 is {X3, X4, X6, X7, X8}
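For substring features, each occurrence of the pattern yields one minimally sufficient set (the positions it covers), so the grounding is the union of the positions of all occurrences. A sketch reproducing the slide's example:

def grounding(pattern, x):
    # Union of (1-indexed) positions covered by any occurrence of `pattern`
    s = "".join(map(str, x))
    positions = set()
    start = s.find(pattern)
    while start != -1:
        positions.update(range(start + 1, start + len(pattern) + 1))
        start = s.find(pattern, start + 1)  # overlapping occurrences count
    return positions

print(grounding("11", [1, 0, 1, 1, 0, 1, 1, 1]))  # {3, 4, 6, 7, 8}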


Generalizing Variable Assignment in Instance Generation

• Assigning fx to a new instance means:
– When the feature is present in x:
• Partitioning the distribution to include the grounding of fx
– When the feature is absent from x:
• Partitioning the distribution to exclude the groundings of f for all instances in the population


Generalizing Variable Assignment in Instance Generation (continued)

• Consider assignment with f = "contains the substring 11"
• We may now generate instances as before, although some assignments may fail (e.g., when features' groundings overlap)

[Figure: assignment with f110 (i1) and assignment with f010 (i3)]
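A sketch of the present-feature case: copy x's values on the grounding into a partially built instance, failing when a position was already fixed to a conflicting value, which is how overlapping features can make an assignment fail. The absent-feature case (excluding all population groundings of f) is omitted, and the names and conflict policy are our assumptions:

def try_assign(partial, grounding, x):
    # partial: dict from (1-indexed) positions to already-fixed bit values
    if any(partial.get(i, x[i - 1]) != x[i - 1] for i in grounding):
        return False  # conflicts with an earlier feature's grounding
    for i in grounding:
        partial[i] = x[i - 1]
    return True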


Feature-Based BOA with Motifs

• Acceptance criterion for a motif f:

• F is the set of existing features

• count(A,B) is the number of instances in the population with all features in A and none in B

• spread(f) is the relative frequency of f across possible substring positions (more possibilities for short strings)
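The acceptance formula itself is not reproduced in this transcript, but its ingredients can be sketched; spread below follows one plausible reading of "relative frequency across possible substring positions" and should be treated as an assumption:

def count(A, B, population):
    # Instances with all features in A and none of the features in B
    return sum(all(f(x) for f in A) and not any(f(x) for f in B)
               for x in population)

def spread(pattern, population, n):
    # Fraction of the n - len(pattern) + 1 possible start positions at
    # which the motif actually occurs somewhere in the population
    starts = {i for x in population
              for i in range(n - len(pattern) + 1)
              if "".join(map(str, x))[i:i + len(pattern)] == pattern}
    return len(starts) / (n - len(pattern) + 1)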


Feature-Based BOA with Motifs (continued)

• N (the population size) random substrings are tested as possible motifs

• c = 0.4 was chosen based on ad hoc experimentation with small (n = 30, 60) OneMax instances

• Motif learning is O(n²·N), assuming a fixed upper bound on the number of motifs

• The general complexity of BOA modeling is O(n³ + n²·N)
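One way to realize "N random substrings are tested": draw a random population member, then a random substring of it; the length range and sampling scheme are our assumptions:

import random

def candidate_motifs(population, min_len=2):
    # Draw N candidates, where N is the population size (per this slide)
    n = len(population[0])
    motifs = []
    for _ in range(len(population)):
        x = "".join(map(str, random.choice(population)))
        length = random.randint(min_len, n)
        start = random.randrange(n - length + 1)
        motifs.append(x[start:start + length])
    return motifs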


Test Problems

• OneMax should be very easy for fbBOA
– Features are strings of all ones
• TwoMax (a.k.a. Twin Peaks)
– Global optima at 0^n and 1^n
– Features are strings of all ones and strings of all zeros
– Requires dependency learning
• 3-deceptive
– Hard because of traps (local optima)
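For reference, the three objectives in Python; the 3-deceptive block values follow the convention common in the BOA literature and are an assumption here:

def onemax(x):
    # Count of ones; global optimum at 1^n
    return sum(x)

def twomax(x):
    # Twin Peaks: global optima at 0^n and 1^n
    return max(sum(x), len(x) - sum(x))

def deceptive3(x):
    # Concatenated 3-bit traps (n divisible by 3); within a block, fitness
    # decreases with the number of ones except at the all-ones optimum
    table = {0: 0.9, 1: 0.8, 2: 0.0, 3: 1.0}
    return sum(table[sum(x[i:i + 3])] for i in range(0, len(x), 3))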


TwoMax ~ Results

[Figure: TwoMax results]

Qualitatively similar results were obtained for OneMax and 3-deceptive


More Test Problems

• One-Dimensional Ising Model
– Penalizes every transition between 0 and 1
– Large flat regions in the fitness landscape
• Hierarchical IFF (H-IFF) and Hierarchical XOR (H-XOR)
– Adjacent pairs of variables are grouped recursively (i.e., the problem size is 2^k)
– Global optima are achieved when all levels are synchronized (0^n and 1^n for H-IFF; 01101001 and 10010110 for H-XOR at n = 8)

[Figure: H-IFF-64's fitness landscape (cross-section)]
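Sketches of these fitness functions under their usual definitions (open-chain Ising; Watson-style H-IFF and H-XOR); treat the details as assumptions:

def ising_1d(x):
    # Reward equal neighbors, i.e., penalize every 0/1 transition
    # (open chain here; the original may use periodic boundaries)
    return sum(a == b for a, b in zip(x, x[1:]))

def hiff(x):
    # A block of size 2^k scores its length when all its bits agree;
    # recurse on the two halves. Global optima at 0^n and 1^n.
    if len(x) == 1:
        return 1
    h = len(x) // 2
    bonus = len(x) if len(set(x)) == 1 else 0
    return hiff(x[:h]) + hiff(x[h:]) + bonus

def hxor(x):
    # A block scores when its right half is the bitwise complement of its
    # left half; at n = 8 the optima are 01101001 and 10010110
    if len(x) == 1:
        return 1
    h = len(x) // 2
    left, right = x[:h], x[h:]
    bonus = len(x) if all(a != b for a, b in zip(left, right)) else 0
    return hxor(left) + hxor(right) + bonus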


Results ~ Ising Model

[Figure: Ising model results]


Results ~ H-IFF

[Figure: H-IFF results]


Results ~ H-XOR

[Figure: H-XOR results]


Conclusions

• Generalized probabilistic modeling and sampling with an additional level of abstraction: features

• Constraining instance generation on an abstract level can speed up the discovery of optima by an Estimation of Distribution Algorithm

• Current/Future Work: Strings to Trees
– Dynamically rewrite trees
• Identify meaningful variables
– New kinds of features
• Semantics rather than syntax
