Anton Martynenko on Using Elastic Search Outside Full-Text Search on behalf of Delodi UG

Using Elastic Search Outside Full-Text Search




Full-Text Search

“An Introduction to Information Retrieval”

Written by Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze

http://nlp.stanford.edu/IR-book/


Elastic Search

● Based on Lucene

● Provides full-text search

● Scalable, clusterable, distributable

● Has REST API

● Free and Open-Source

● Has the FOSElasticaBundle for Symfony2

● Has the Marvel plugin (with the Sense console)


FOSElasticaBundle

https://github.com/FriendsOfSymfony/FOSElasticaBundle

● Integrates the Elastica library into a Symfony2 environment

● Automatically generates mappings using a serializer

● Provides listeners for Doctrine events for automatic indexing


Getting Started with FOSElasticaBundle

● Define indexes and create a mapping for each entity

● Run fos:elastica:populate

● Make queries


NetzShopping.de

● 2.5 million products in 8,000 categories

● 630 shops

● Daily updates

● 6 separate servers:

– Varnish

– Web (nginx)

– MySQL

– Elastic

– Cron

– RabbitMQ


Filter Shadows Concept

A precalculated set of filters with a URL


Filter Shadow Flow

● GET /damenmode-jeans-reduziert-rosa-shop-breuninger

● Query Elastic, get FilterShadow by URL

● Get FilterArray:

– category: Damenmode > Jeans

– reduced: true

– color: rose

– shop: Breuninger

● Query Elastic, get product IDs by filter values

● Query MySQL, get product details
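The flow above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the NetzShopping implementation: the in-memory `FILTER_SHADOWS` dict stands in for the Elastic lookup by URL, the field names, the `product` table and the function names are hypothetical, and the `bool`/`filter` query shape assumes a reasonably recent Elasticsearch version.

```python
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical stand-in for the FilterShadow index: URL path -> precalculated filters.
FILTER_SHADOWS: Dict[str, Dict[str, Any]] = {
    "/damenmode-jeans-reduziert-rosa-shop-breuninger": {
        "category": "Damenmode > Jeans",
        "reduced": True,
        "color": "rose",
        "shop": "Breuninger",
    }
}


def find_filter_shadow(url_path: str) -> Optional[Dict[str, Any]]:
    """Stand-in for 'Query Elastic, get FilterShadow by URL'."""
    return FILTER_SHADOWS.get(url_path)


def build_product_id_query(filters: Dict[str, Any], size: int = 50) -> Dict[str, Any]:
    """Search body for 'get product IDs by filter values'.

    One exact-match term clause per filter; _source is disabled because
    only the document IDs are needed from Elastic.
    """
    clauses = [{"term": {field: value}} for field, value in sorted(filters.items())]
    return {"query": {"bool": {"filter": clauses}}, "_source": False, "size": size}


def build_details_sql(product_ids: List[int]) -> Tuple[str, List[int]]:
    """Parameterized statement for 'Query MySQL, get product details'."""
    if not product_ids:
        raise ValueError("no product IDs to look up")
    placeholders = ", ".join(["%s"] * len(product_ids))
    return f"SELECT * FROM product WHERE id IN ({placeholders})", list(product_ids)
```

A request handler would call `find_filter_shadow` with the request path, send the body from `build_product_id_query` to the product index, and pass the returned IDs to `build_details_sql`. Because the URL maps to a precalculated filter set, nothing has to be parsed out of the URL itself.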


Why Elastic

● MySQL:

– Slow LIKE search

– Long JOINs on many-to-many relations

● Elastic:

– Ability to map and index a whole entity

– Lots of analyzers and filters

– Fast full-text search on text fields
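To make the contrast concrete, here is a small sketch of the two kinds of search side by side. The `product` table, the `name` field and both function names are assumptions made for illustration; the deck does not show the real schema.

```python
from typing import Any, Dict, List, Tuple


def like_search_sql(term: str) -> Tuple[str, List[str]]:
    """MySQL substring search.

    The leading % wildcard prevents MySQL from using an ordinary B-tree
    index on the column, so every row has to be scanned.
    """
    return "SELECT id FROM product WHERE name LIKE %s", [f"%{term}%"]


def match_search_body(term: str, field: str = "name") -> Dict[str, Any]:
    """Elasticsearch full-text search.

    The match query runs the term through the field's analyzer and looks
    the resulting tokens up in the inverted index; operator 'and' requires
    every token to be present.
    """
    return {"query": {"match": {field: {"query": term, "operator": "and"}}}}
```

The `match` body would be sent to the index's `_search` endpoint; which analyzer handles the text depends on the field's mapping.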


Aggregations
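A terms aggregation is one common kind of aggregation: it counts the documents per distinct value of a field, which suits filter facets. The following is a hedged sketch and not the query from the talk; the field names `color` and `shop`, the function names and the bucket limit are assumptions.

```python
from typing import Any, Dict, List


def facet_counts_query(filters: Dict[str, Any], facet_fields: List[str],
                       bucket_limit: int = 20) -> Dict[str, Any]:
    """Search body that returns no hits (size 0), only per-field value counts."""
    clauses = [{"term": {field: value}} for field, value in sorted(filters.items())]
    return {
        "size": 0,
        "query": {"bool": {"filter": clauses}},
        "aggs": {
            field: {"terms": {"field": field, "size": bucket_limit}}
            for field in facet_fields
        },
    }


def parse_bucket_counts(response: Dict[str, Any]) -> Dict[str, Dict[Any, int]]:
    """Flatten the 'aggregations' part of a search response to {field: {value: count}}."""
    return {
        name: {bucket["key"]: bucket["doc_count"] for bucket in agg["buckets"]}
        for name, agg in response["aggregations"].items()
    }
```

For example, `facet_counts_query({"reduced": True}, ["color", "shop"])` asks how many reduced products exist per color and per shop, and `parse_bucket_counts` turns the response buckets into plain dictionaries.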



Beware

● Resource-hungry:

– 10 GB of indexes for a dev instance of NetzShopping

– Sudden spikes of heavy CPU load

– High memory consumption

– High disk I/O

● Better to use a separate server