Chapter 6 - Basic Similarity Topics Case-based reasoning

Introduction

• Similarity is a common term in everyday language: two objects are usually considered similar if they look or sound alike

• Similarity is a core concept within CBR

• From a CBR perspective: «Two problems are similar if they have similar solutions»

• Not as clearly defined as the term equality

• It is generally accepted that similarity is subjective and requires approximate rather than exact reasoning

Similarity and case representation

• Similarity measures are defined to compare objects (cases)

• The measures operate on the case representation

• Similarity is the essential function used for retrieval and the link between case representation and retrieval

• Only attribute-value case representations and attribute-based similarity measures are considered here

The mathematics of similarity

• Two influencing factors:

- Fuzzy sets offer a background for modeling inexact expressions. They do not deal with classical yes-or-no answers, but with ones that have a vague character

- Metrics are used in mathematics whenever approximations (rather than exact solutions) are involved. This makes them suitable for modeling similarity

• Similarity measures may inherit and benefit from properties of these two factors. Examples of such properties are symmetry, transitivity, etc.

Two mathematical models of similarity

• Similarity as a relation:

- Qualitative measure comparing different similarities

- Example: two objects are more similar to each other than two other objects

R(x,y,z) ⇔ «x is at least as similar to y as x is to z»

- Allows the definition of the nearest-neighbor concept

➡ The nearest neighbor of x is the y for which R(x,y,z) holds for all z

[Figure: example of k-NN where k = 3]
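The relational definition of the nearest neighbor can be sketched directly in code. This is a minimal illustration, not a definitive implementation: the function name `nearest_neighbor` and the example relation `closer_on_line` (closeness of numbers on the number line) are assumptions introduced here and do not come from the slides.

```python
from typing import Callable, Sequence


def nearest_neighbor(
    x: float,
    candidates: Sequence[float],
    R: Callable[[float, float, float], bool],
) -> float:
    """Return a y such that R(x, y, z) holds for every z among the candidates."""
    for y in candidates:
        if all(R(x, y, z) for z in candidates):
            return y
    raise ValueError("no candidate satisfies R(x, y, z) for all z")


# Hypothetical relation (an assumption for illustration):
# "x is at least as similar to y as x is to z" means y is at least as close to x as z is.
def closer_on_line(x: float, y: float, z: float) -> bool:
    return abs(x - y) <= abs(x - z)


print(nearest_neighbor(5.0, [1.0, 4.0, 9.0], closer_on_line))  # 4.0
```

Only the qualitative relation R is needed here; no numeric degree of similarity is ever computed.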

Two mathematical models of similarity

• Similarity as a function:

- Makes similarity quantitative by expressing how similar two objects are

- Assigns a number (a degree of similarity) to each pair of objects

- Def.: A similarity measure for a problem space P is a function

sim: P × P → [0,1]

- Example of how similarity values may be compared

sim(x,y) ≥ sim(x,z) ⇔ «x is at least as similar to y as x is to z»
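A similarity function with values in [0,1] makes it possible to rank cases and pick the k most similar ones. The sketch below assumes a one-dimensional numeric problem space and a linear measure scaled by a hypothetical `value_range`; both the measure and the sample values are illustrative assumptions.

```python
from typing import List


def sim(x: float, y: float, value_range: float = 10.0) -> float:
    """Assumed linear measure: 1.0 for identical values, 0.0 at or beyond value_range."""
    return 1.0 - min(abs(x - y), value_range) / value_range


def k_nearest(query: float, cases: List[float], k: int = 3) -> List[float]:
    """Return the k cases with the highest similarity to the query."""
    return sorted(cases, key=lambda case: sim(query, case), reverse=True)[:k]


cases = [1.0, 4.0, 6.5, 9.0, 5.5]
print(k_nearest(5.0, cases))  # [5.5, 4.0, 6.5]
```

Comparing `sim(query, a) >= sim(query, b)` recovers the relational statement that the query is at least as similar to a as it is to b.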

Distances

• Distances are a proxy for similarities; both look at the same objects from different points of view

• In most situations we can freely choose between distances and similarities

• It is possible to convert between similarities and distances. However, such a transformation does not necessarily preserve the exact numerical similarity/distance values
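The conversion can be illustrated with two common transformations. Both formulas below are assumptions chosen for illustration; the slides do not prescribe a particular one. They agree on the ordering of the cases but assign different numerical values to the same distance.

```python
def sim_reciprocal(d: float) -> float:
    """Assumed transformation: sim = 1 / (1 + d), for distances d >= 0."""
    return 1.0 / (1.0 + d)


def sim_linear(d: float, d_max: float) -> float:
    """Assumed transformation: sim = 1 - d / d_max, clipped at 0 for d > d_max."""
    return 1.0 - min(d, d_max) / d_max


distances = [0.0, 1.0, 3.0]
print([sim_reciprocal(d) for d in distances])   # [1.0, 0.5, 0.25]
print([sim_linear(d, 4.0) for d in distances])  # [1.0, 0.75, 0.25]
```

A smaller distance always maps to a larger similarity under both transformations, so the nearest neighbors stay the same even though the numbers differ.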

Types of similarity measures

• Counting similarities

• Metric similarities

• Transformation similarities

• Structure-oriented similarities

• Information-oriented similarities

• Relevance-oriented similarities

• Dynamic-oriented similarities

Types of similarity measures

• Counting similarities

‣ Measure similarity by counting certain occurrences in the representation

➡ Count the number of family members for tax purposes

‣ Example: Hamming measures
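A Hamming-style measure counts the positions at which two equally long attribute vectors agree. The sketch below normalizes the count to [0,1]; the normalization and the function name `hamming_similarity` are assumptions made for illustration.

```python
from typing import Sequence


def hamming_similarity(a: Sequence, b: Sequence) -> float:
    """Fraction of positions at which the two sequences hold the same value."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if len(a) == 0:
        return 1.0  # two empty representations are treated as identical
    matches = sum(1 for u, v in zip(a, b) if u == v)
    return matches / len(a)


print(hamming_similarity("10110", "10011"))  # 0.6
```

The corresponding Hamming distance is the number of mismatching positions, here 2 out of 5.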

Types of similarity measures

• Metric similarities

‣ Applicable to attributes with numerical values

‣ Arise as variations of Euclidean metrics

➡ Typically distance functions that represent a travel view
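One way to see the "variations of Euclidean metrics" is the Minkowski family, which is used here as an illustrative assumption rather than as the slides' own definition. With p = 2 it gives the Euclidean distance (straight-line travel), with p = 1 the city-block distance (travel along a grid).

```python
from typing import Sequence


def minkowski_distance(a: Sequence[float], b: Sequence[float], p: float = 2.0) -> float:
    """Minkowski distance of order p >= 1 between two numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    if p < 1:
        raise ValueError("p must be at least 1 for a metric")
    return sum(abs(u - v) ** p for u, v in zip(a, b)) ** (1.0 / p)


origin, point = (0.0, 0.0), (3.0, 4.0)
print(minkowski_distance(origin, point, p=2))  # 5.0 (Euclidean)
print(minkowski_distance(origin, point, p=1))  # 7.0 (city-block)
```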

Types of similarity measures

• Transformation similarities

‣ The measure counts the number of operations required to transform one object into another

‣ Example: Levenshtein distance. It uses insertion, deletion and modification as possible change actions and counts the number of changes required
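The Levenshtein distance can be computed with the standard dynamic-programming scheme. The sketch below keeps only two rows of the table and charges a cost of 1 for every insertion, deletion and modification; the unit costs are the usual convention and are assumed here.

```python
def levenshtein(s: str, t: str) -> int:
    """Minimum number of insertions, deletions and modifications turning s into t."""
    # previous[j] holds the distance between the processed prefix of s and t[:j]
    previous = list(range(len(t) + 1))
    for i, char_s in enumerate(s, start=1):
        current = [i]  # turning s[:i] into the empty string takes i deletions
        for j, char_t in enumerate(t, start=1):
            cost = 0 if char_s == char_t else 1
            current.append(
                min(
                    previous[j] + 1,         # delete char_s
                    current[j - 1] + 1,      # insert char_t
                    previous[j - 1] + cost,  # keep or modify the character
                )
            )
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```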

Types of similarity measures

• Structure-oriented similarities

‣ The structure in which the knowledge is presented plays a role, e.g. object-oriented representations

‣ Refers mainly to attributes that have symbolic attribute values from which the attribute-based structure is built

Types of similarity measures

• Information-oriented similarities

‣ Information and knowledge play an essential role

‣ Often used for texts, which are considered similar if they provide similar information to the user

Types of similarity measures

• Relevance-oriented similarities

‣ Weight the importance of different aspects contributing to similarity

‣ Not a type in itself; rather, it may be used in combination with the other types

Types of similarity measures

• Dynamic-oriented similarities

‣ Consider and compare dynamic processes

Local-global principle of similarity

• Useful when dealing with complex structures

• The principle: each object is constructed from atomic parts by some construction process

• The atomic parts can be compared using local measures before the more complex structure is compared

• Determine the influence each local part should have on the global measure by assigning a weight to each part

• Determining the weights is a difficult problem
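The local-global principle is often realized as a weighted average of local similarities. The sketch below assumes that form; the attribute names, value ranges and weights are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from typing import Any, Callable, Dict


def numeric_local(value_range: float) -> Callable[[float, float], float]:
    """Build a local measure for numeric attributes, scaled to [0, 1]."""
    return lambda a, b: 1.0 - min(abs(a - b), value_range) / value_range


def symbolic_local(a: Any, b: Any) -> float:
    """Local measure for symbolic attributes: exact match or nothing."""
    return 1.0 if a == b else 0.0


def global_similarity(
    query: Dict[str, Any],
    case: Dict[str, Any],
    local_measures: Dict[str, Callable[[Any, Any], float]],
    weights: Dict[str, float],
) -> float:
    """Weighted average of the local similarities (assumed aggregation)."""
    total_weight = sum(weights.values())
    weighted = sum(
        weights[attr] * local_measures[attr](query[attr], case[attr])
        for attr in weights
    )
    return weighted / total_weight


# Hypothetical attributes and weights, for illustration only
local_measures = {"price": numeric_local(8000.0), "color": symbolic_local}
weights = {"price": 3.0, "color": 1.0}
query = {"price": 10000.0, "color": "red"}
case = {"price": 12000.0, "color": "blue"}

print(global_similarity(query, case, local_measures, weights))  # 0.5625
```

The local price similarity is 0.75 and the local color similarity is 0, so the global value is (3 · 0.75 + 1 · 0) / 4. Changing the weights changes the ranking of cases, which is why choosing them is the hard part.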

Virtual attributes

• A problem with the local-global principle arises when there are dependencies between the attributes that influence similarity

• Example: bank loans

Reliability for getting a loan depends on both income and spending

• Assigning weights to the attributes independently of each other then makes little sense

• Introduce additional attributes that reflect the dependencies explicitly

• Such attributes are defined in terms of the given attributes and are called virtual attributes

• Virtual attributes allow simpler similarity measures
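The bank loan example can be sketched with a virtual attribute that combines income and spending. The attribute name `disposable`, its definition as income minus spending, and all amounts are assumptions introduced for illustration.

```python
from typing import Dict


def numeric_similarity(a: float, b: float, value_range: float) -> float:
    """Linear local similarity in [0, 1] for numeric values."""
    return 1.0 - min(abs(a - b), value_range) / value_range


def add_virtual_attributes(case: Dict[str, float]) -> Dict[str, float]:
    """Return a copy of the case extended with the assumed virtual attribute."""
    extended = dict(case)
    extended["disposable"] = case["income"] - case["spending"]
    return extended


# Two hypothetical applicants: very different income and spending,
# but the same amount left over each month.
a = add_virtual_attributes({"income": 5000.0, "spending": 4500.0})
b = add_virtual_attributes({"income": 2000.0, "spending": 1500.0})

# Equal weights on the raw attributes treat them as independent
sim_raw = 0.5 * numeric_similarity(a["income"], b["income"], 5000.0) \
        + 0.5 * numeric_similarity(a["spending"], b["spending"], 5000.0)

# A single local measure on the virtual attribute captures the dependency
sim_virtual = numeric_similarity(a["disposable"], b["disposable"], 5000.0)

print(sim_raw)      # approximately 0.4
print(sim_virtual)  # 1.0
```

With respect to loan reliability the two applicants look alike, which only the measure on the virtual attribute reflects.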

Which similarity measure should be used?

• Some influencing factors for the choice are:

- Case representation

- Size of case base

- Efficiency needed for retrieval

- Number of values in the domain of the attributes

• Useful guidelines:

- Try to ensure compatibility between case representation and the similarity measure

- If possible, apply the local-global principle for complex structures

Summary

• Similarity is the link between case representation and retrieval

• There is no clear definition of the concept, and a variety of different types of measures exist

• Similarity measures are heavily influenced by mathematics. Two mathematical ways to represent similarity are as a function and as a relation

• The local-global principle may also apply to similarity measures

• Which type of similarity measure should be used depends on the objects to be compared

Comments

• Few comparisons; an overview of the differences between the types of similarity measures is missing

- The presentation is mainly descriptive, making it difficult to distinguish between the different measures

• What are the implications of choosing one type of measure over another?

- In a later chapter?