1 APPROACHES TO IMPLEMENT SEMANTIC SEARCH Johannes Peter Product Owner / Architect for Search

APPROACHES TO IMPLEMENT SEMANTIC SEARCHmices.co/...Peters_Approaches-to-semantic-search.pdf · SEARCH Johannes Peter Product Owner / Architect for Search. 2 WHAT IS SEMANTIC SEARCH



APPROACHES TO IMPLEMENT SEMANTIC SEARCH

Johannes Peter, Product Owner / Architect for Search


WHAT IS SEMANTIC SEARCH?


Success of search
• Interface of shops to brains of customers
• Wide range of usage
• Success depends on a proper understanding


Simple keyword search

title                     | type       | description             | attribute
MyMobile 7                | Smartphone | ... with a contract ... | contract
MyMobile 7                | Smartphone | ...                     |
Marriage without contract | DVD        | ...                     |
MyMobile 6                | Smartphone | ...                     |
Sitcom season 7           | DVD        | ... 7 …                 |
MyMobile 6                | Smartphone | ... with a contract ... | contract

Query: mymobile 7 without contract


Identifying entities

Query: mymobile 7 without contract
Interpretation: product / product group without certain attribute

Entity                               | Example
Products                             | mymobile 7
Attributes                           | contract
Product with / without attribute     | mymobile 7 without contract
Product group with approximate price | mymobile under 300 euro


Semantic search

title      | type       | description             | attribute
MyMobile 7 | smartphone | ...                     |
MyMobile 6 | smartphone | ...                     |
MyMobile 7 | smartphone | ... with a contract ... | contract
MyMobile 6 | smartphone | ... with a contract ... | contract

Query: mymobile 7 without contract
Interpretation: product / product group without certain attribute


Core benefits
• Better precision
• Better recommendation
• Facilitated search management


Future perspectives
• Voice search
• Sophisticated sales advisors
• Chat bots


APPROACHES


ONTOLOGIES & RULE COLLECTIONS


Ontologies & rule collections

Query: mymobile 7 without contract

Step                              | Example
Identify entities                 | product(mymobile 7) without attribute(contract)
Execute rules to combine entities | product(mymobile 7) not(attribute(contract))
Translate into search query       | title:("mymobile 7") AND NOT flag:(contract)
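The three steps for the query "mymobile 7 without contract" can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation from the talk: the dictionary `ONTOLOGY`, the helper names and the `not_attribute` marker are hypothetical, and the field names `title` and `flag` are taken from the example query on the slide.

```python
import re

# Hypothetical mini-ontology: surface form -> entity type (illustrative only).
ONTOLOGY = {
    "mymobile 7": "product",
    "mymobile 6": "product",
    "contract": "attribute",
}


def identify_entities(query):
    """Step 1: annotate known entities, trying the longest surface forms first."""
    tokens = []
    rest = query.lower().strip()
    surface_forms = sorted(ONTOLOGY, key=len, reverse=True)
    while rest:
        for form in surface_forms:
            if re.match(re.escape(form) + r"(\s|$)", rest):
                tokens.append((ONTOLOGY[form], form))
                rest = rest[len(form):].lstrip()
                break
        else:
            # No entity starts here: keep the next word as a plain term.
            word, _, rest = rest.partition(" ")
            tokens.append(("term", word))
            rest = rest.lstrip()
    return tokens


def apply_rules(tokens):
    """Step 2: 'without' between a product and an attribute negates the attribute."""
    result = []
    i = 0
    while i < len(tokens):
        if (
            tokens[i] == ("term", "without")
            and result and result[-1][0] == "product"
            and i + 1 < len(tokens) and tokens[i + 1][0] == "attribute"
        ):
            result.append(("not_attribute", tokens[i + 1][1]))
            i += 2
        else:
            result.append(tokens[i])
            i += 1
    return result


def to_query(entities):
    """Step 3: translate the annotated entities into a Lucene-style query string."""
    clauses = []
    for kind, value in entities:
        if kind == "product":
            clauses.append(f'title:("{value}")')
        elif kind == "attribute":
            clauses.append(f"flag:({value})")
        elif kind == "not_attribute":
            clauses.append(f"NOT flag:({value})")
        else:
            clauses.append(value)
    return " AND ".join(clauses)


print(to_query(apply_rules(identify_entities("mymobile 7 without contract"))))
```

For the example query, the sketch produces the same query string as the last step on the slide.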


Ontologies
• Hierarchies of entities
• Products, attributes and relations

product
  mymobile
    mymobile 6
    mymobile 7

attribute
  color
    black
    white
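One simple way to hold such a hierarchy in code is a nested dictionary. The sketch below is an assumption for illustration only (the talk describes a Lucene-based service, which is not reproduced here); the names `ONTOLOGY`, `find_path` and `subtree` are hypothetical, and the tree layout follows the example entities on the slide.

```python
# Hypothetical nested-dict ontology built from the example entities.
ONTOLOGY = {
    "product": {"mymobile": {"mymobile 6": {}, "mymobile 7": {}}},
    "attribute": {"color": {"black": {}, "white": {}}},
}


def find_path(name, tree=ONTOLOGY, path=()):
    """Return the path from the root to the entity, or None if it is unknown."""
    for key, children in tree.items():
        current = path + (key,)
        if key == name:
            return current
        found = find_path(name, children, current)
        if found:
            return found
    return None


def subtree(name, tree=ONTOLOGY):
    """Return the children of an entity (an empty dict for leaves), or None."""
    for key, children in tree.items():
        if key == name:
            return children
        found = subtree(name, children)
        if found is not None:
            return found
    return None


print(find_path("mymobile 7"))      # root type first, entity last
print(sorted(subtree("mymobile")))  # members of the product group
```

The first element of the returned path gives the entity type (product or attribute), which is what the entity identification step needs.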


Rule collections

Query: mymobile 7 without contract
• Condition: There is the term "without" between a product and an attribute
• Action: Negate the attribute

Query: pink dvd
• Pink: color or artist? → Disambiguation
• Condition: The term "pink" appears together with entities related to music or movies
• Action: Annotate the term "pink" as artist
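A condition/action rule of this kind can be sketched as a pair of functions. Everything named here is a hypothetical illustration: the vocabulary `MEDIA_TERMS`, the rule list `RULES`, and the choice to treat "pink" as a color by default are assumptions, not part of the talk.

```python
# Hypothetical vocabulary of terms related to music or movies.
MEDIA_TERMS = {"dvd", "cd", "album"}


def pink_with_media(terms):
    """Condition: 'pink' appears together with a music or movie term."""
    return "pink" in terms and any(t in MEDIA_TERMS for t in terms)


def annotate_pink_as_artist(annotations):
    """Action: annotate the term 'pink' as artist."""
    annotations["pink"] = "artist"


# A rule collection is an ordered list of (condition, action) pairs.
RULES = [(pink_with_media, annotate_pink_as_artist)]


def annotate(query):
    terms = query.lower().split()
    # Assumed default: 'pink' is a color unless a rule says otherwise.
    annotations = {"pink": "color"} if "pink" in terms else {}
    for condition, action in RULES:
        if condition(terms):
            action(annotations)
    return annotations


print(annotate("pink dvd"))
print(annotate("pink mymobile"))
```

Keeping conditions and actions as separate entries in a list is one way to make the ruleset configurable: rules can be added or reordered without touching `annotate`.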


Implementation

• Two parts of implementation
  - Development of the application
  - Information extraction part (creation of ontologies & rule collections)

• Service for ontology extraction
  - Solr and Elasticsearch are not suitable
  - Highly scalable and performant solution with Spring Boot & Apache Lucene (using term vectors as payloads)

• Rule engine
  - Configurable rulesets
  - Routing concept


Implementation

• Well suited for agile development
• Pieces of information can be extracted fairly independently

Sprint(s): Extract prices | Ontology for products
Sprint(s): Combinations of products & prices | Rules for products


Implementation
• More complex cases
  - Extract information out of product descriptions
  - Understanding of natural language

Developers Analysts / Linguists

• Requires maintenance for ontologies and rule collections


MACHINE LEARNING


Machine learning

training data model new query


Machine learning

training data model new query

term: mymobile | 7 | without | contract
part of speech: noun | digit | preposition | noun
relation head: mymobile | contract | mymobile
chunks: noun phrase | noun with negation
entity: product with negated attribute
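The flow from training data to a model that annotates a new query can be illustrated with a deliberately simple stand-in. The sketch below is a most-frequent-tag baseline for the part-of-speech row only; it is not the approach from the talk and not a real NLP model. The labelled queries in `TRAINING_DATA`, the function names and the `default` tag for unknown terms are all assumptions made for this example.

```python
from collections import Counter, defaultdict

# Hypothetical hand-labelled training data: (term, part of speech) per query.
TRAINING_DATA = [
    [("mymobile", "noun"), ("7", "digit"), ("without", "preposition"), ("contract", "noun")],
    [("mymobile", "noun"), ("6", "digit"), ("with", "preposition"), ("contract", "noun")],
    [("sitcom", "noun"), ("season", "noun"), ("7", "digit")],
]


def train(tagged_queries):
    """Build the 'model': for every term, remember its most frequent tag."""
    counts = defaultdict(Counter)
    for query in tagged_queries:
        for term, tag in query:
            counts[term][tag] += 1
    return {term: tags.most_common(1)[0][0] for term, tags in counts.items()}


def predict(model, query, default="noun"):
    """Annotate a new query; unseen terms fall back to the assumed default tag."""
    return [(term, model.get(term, default)) for term in query.lower().split()]


model = train(TRAINING_DATA)
print(predict(model, "mymobile 6 without contract"))
```

A baseline like this cannot disambiguate a term that takes different tags in different contexts, which is where trained statistical models and libraries come in.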


Machine learning – NLP

• How natural is the language used for queries?

• Considering grammatical information can be complicated

• Disambiguation is very difficult for some cases

term: pink | mymobile
part of speech: adjective | noun

term: pink | dvd
part of speech: proper noun | noun

• Natural language processing:

- "The label saw potential in Pink and offered her a contract."


Implementation
• Established procedures from the area of natural language processing
• Libraries (e.g. spaCy) providing
  - Functionalities fairly easy to use
  - High performance
  - Customizations

• All discussed steps require their own model (training + evaluation data)

• Still highly experimental
  - Fail early?
  - Continuous delivery?


TERM CO-OCCURRENCES


Term co-occurrences

• Enrich documents by contextual information

• Using collaborative filters (recommendation)

• Which terms / attributes appear in the context of a product?


Term co-occurrences

title           | category | color | description
MyMobile 7      | MyMobile | black | Smartphone MyMobile 7 black with 128 gb
MyMobile 7      | MyMobile | white | New smartphone MyMobile 7, 64 gb, white
Sitcom season 7 | DVD      |       | Season number 7 of the sitcom …
MyMobile 6      | MyMobile | black | MyMobile 6 – smartphone – 32 gb – black
MyMobile 6      | MyMobile | white | MyMobile 6, smartphone black with 128 gb

• Co-occurring terms for category MyMobile:
  → Term "smartphone": 7, black, white

Query: mymobile 7
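Counting which terms appear in the descriptions of a category takes only a few lines. The following is a minimal sketch under assumptions: the rows are copied from the example table (the truncated DVD description stays truncated), while the tokenization with `\w+` and the names `DOCS` and `cooccurring_terms` are choices made for this illustration.

```python
import re
from collections import Counter, defaultdict

# Example rows from the table above (category and description only).
DOCS = [
    {"category": "MyMobile", "description": "Smartphone MyMobile 7 black with 128 gb"},
    {"category": "MyMobile", "description": "New smartphone MyMobile 7, 64 gb, white"},
    {"category": "DVD", "description": "Season number 7 of the sitcom"},
    {"category": "MyMobile", "description": "MyMobile 6 – smartphone – 32 gb – black"},
    {"category": "MyMobile", "description": "MyMobile 6, smartphone black with 128 gb"},
]


def cooccurring_terms(docs):
    """Per category, count in how many descriptions each term appears."""
    counts = defaultdict(Counter)
    for doc in docs:
        # A set, so that a term counts at most once per document.
        terms = set(re.findall(r"\w+", doc["description"].lower()))
        counts[doc["category"]].update(terms)
    return counts


counts = cooccurring_terms(DOCS)
print(counts["MyMobile"].most_common())
```

Terms that appear in most documents of a category (here "smartphone") are candidates for enriching every document of that category. Frequent but uninformative terms such as "gb" or "with" show up as well, which hints at the side effects and data quality requirements of this approach.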


Term co-occurrences

title           | category | color | context
MyMobile 7      | MyMobile | black | 6, white
MyMobile 7      | MyMobile | white | 6, black
MyMobile 6      | MyMobile | black | 7, white
MyMobile 6      | MyMobile | white | 7, black
Sitcom season 7 | DVD      |       | …

Query: mymobile 7
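One possible reading of the context column is "values that occur in the same category but not on the document itself". The sketch below implements that reading for the four MyMobile rows; it is an interpretation, not the method from the talk. The assumption that the last title token is the model number, and the names `own_values` and `add_context`, are hypothetical. The DVD row is left out because its context is elided on the slide.

```python
from collections import defaultdict

# The four MyMobile rows of the example table.
DOCS = [
    {"title": "MyMobile 7", "category": "MyMobile", "color": "black"},
    {"title": "MyMobile 7", "category": "MyMobile", "color": "white"},
    {"title": "MyMobile 6", "category": "MyMobile", "color": "black"},
    {"title": "MyMobile 6", "category": "MyMobile", "color": "white"},
]


def own_values(doc):
    """Values the document already carries (assumed: model number and color)."""
    return {doc["title"].split()[-1], doc["color"]}


def add_context(docs):
    """Add a 'context' field with the category's values the document lacks."""
    values_by_category = defaultdict(set)
    for doc in docs:
        values_by_category[doc["category"]] |= own_values(doc)
    return [
        {**doc, "context": sorted(values_by_category[doc["category"]] - own_values(doc))}
        for doc in docs
    ]


for doc in add_context(DOCS):
    print(doc["title"], doc["color"], doc["context"])
```

With a context field like this, a search engine can be configured to treat matches in `context` differently from matches in `title`, so that the query "mymobile 7" no longer ranks documents highly just because they mention a 7 somewhere.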


Implementation
• Fairly easy to implement
• Generic

• Produces side effects
• Requires high data quality

• Only partially solves problems related to semantic search
• Not suitable for complex cases


Conclusion

                           | Term co-occurrences | Ontologies + rules | Machine learning
Effort                     | moderate            | high               | high
Holistic solution          | no                  | yes                | yes
Suitable for complex cases | no                  | yes                | yes
Maintenance effort         | low                 | high               | low

Success factors
• Term co-occurrences: High data quality, Agile development
• Ontologies + rules: Ability of linguists, Quality of rules, Agile development
• Machine learning: Ability of data scientists, Quality of training data

Risk factors
• Term co-occurrences: Side effects
• Ontologies + rules: Never-ending rule-building
• Machine learning: Never-ending generation of training data, Too high expectations


THANK YOU !!

BTW: We are hiring …[email protected]