Copyright © 2014 Intridea Inc. All rights reserved. Elasticsearch with Rails July 10, 2014 Tom Zeng Director of Engineering [email protected] @tomzeng www.linkedin.com/in/tomzeng

Using elasticsearch with rails


Page 1: Using elasticsearch with rails

Elasticsearch with Rails

July 10, 2014

Tom Zeng Director of Engineering

[email protected] @tomzeng

www.linkedin.com/in/tomzeng

Page 2: Using elasticsearch with rails

What is Elasticsearch

Elasticsearch is a “flexible and powerful open source, distributed real-time search and analytics engine for the cloud”

More than just full text search: it has powerful analytics capability and can be used as a NoSQL data store

Easy to set up, easy to use, easy to scale, easy to maintain

Suitable for projects of any size (large or small, cloud or non-cloud) that need full text search and/or analytics

It’s our preferred search engine for Rails apps

Page 3: Using elasticsearch with rails

Elasticsearch Quick Live Demo

Use curl to add some data (local Elasticsearch instance at port 9200)

Clean up the index and start from scratch:

curl -X DELETE "http://localhost:9200/todos"

Add four tasks:

curl -X POST "http://localhost:9200/todos/task/1" -d '{"title" : "Learn Elasticsearch", "due_date" : "20140710T00:00:00", "done" : true, "tags" : ["search","backend"]}'
curl -X POST "http://localhost:9200/todos/task/2" -d '{"title" : "Learn D3 and Backbone", "due_date" : "20140720T00:00:00", "done" : true, "tags" : ["frontend","javascript","visualization"]}'
curl -X POST "http://localhost:9200/todos/task/3" -d '{"title" : "Learn Rails 4", "due_date" : "20140830T00:00:00", "done" : false, "tags" : ["backend","ruby","rails"]}'
curl -X POST "http://localhost:9200/todos/task/4" -d '{"title" : "Learn Backbone Marionette", "due_date" : "20140715T00:00:00", "done" : true, "tags" : ["frontend","javascript"]}'
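For a Rails audience, the same indexing call can be sketched with Ruby's standard library. This is a minimal illustration, not how the gems listed later in the deck do it: the helper name index_task_request is hypothetical, and the request is only built and printed, so nothing is sent unless the commented-out lines at the end are enabled.

```ruby
require "json"
require "net/http"

# Hypothetical helper: builds (but does not send) the POST request that
# indexes one task document under /todos/task/<id>, like the curl calls above.
def index_task_request(id, doc)
  request = Net::HTTP::Post.new("/todos/task/#{id}", "Content-Type" => "application/json")
  request.body = JSON.generate(doc)
  request
end

request = index_task_request(
  3,
  { title: "Learn Rails 4", due_date: "20140830T00:00:00", done: false, tags: %w[backend ruby rails] }
)

puts "#{request.method} #{request.path}"
puts request.body

# To actually send it, assuming a node is listening on localhost:9200:
# Net::HTTP.start("localhost", 9200) { |http| puts http.request(request).body }
```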

Page 4: Using elasticsearch with rails

Elasticsearch Quick Live Demo

Use curl to query data (quote the URLs so the shell does not interpret & and *)

curl "http://localhost:9200/todos/task/1" (use as K/V store)
curl "http://localhost:9200/todos/_search?pretty&q=done:false"
curl "http://localhost:9200/todos/_search?pretty&q=tags:backend"
curl "http://localhost:9200/todos/_search?pretty&q=title:back*"
curl -X POST "http://localhost:9200/todos/task/_search?pretty" -d '{ "query" : { "range" : { "due_date" : { "from" : "20140701", "to" : "20140715" } } } }'
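The two query styles on this slide (the q= URI search and the JSON request body) can be sketched in Ruby as follows. Both helper names are hypothetical and the code only builds the URL and the body; it does not contact Elasticsearch.

```ruby
require "json"
require "uri"

# Hypothetical helper: URL for the q= style of search used in the curl examples.
def uri_search_url(index, q, base = "http://localhost:9200")
  "#{base}/#{index}/_search?" + URI.encode_www_form(pretty: true, q: q)
end

# Hypothetical helper: request body for the range search on due_date,
# using the same from/to keys as the curl example.
def due_date_range_query(from, to)
  { query: { range: { due_date: { from: from, to: to } } } }
end

puts uri_search_url("todos", "title:back*")
puts JSON.generate(due_date_range_query("20140701", "20140715"))
```

Building the URL through URI.encode_www_form escapes characters such as the colon, which avoids the quoting problems that hand-written query strings tend to run into.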

Page 5: Using elasticsearch with rails

Elasticsearch Quick Live Demo

Page 6: Using elasticsearch with rails

What is Elasticsearch

http://www.elasticsearch.org

Page 7: Using elasticsearch with rails

What is Elasticsearch

http://www.elasticsearch.org

Page 8: Using elasticsearch with rails

What is Elasticsearch

http://www.elasticsearch.org

Page 9: Using elasticsearch with rails

Who uses Elasticsearch

http://www.elasticsearch.org

Page 10: Using elasticsearch with rails

Who uses Elasticsearch - Github

Sort

Highlight

Facets

Filters

Fulltext Search

Pagination

Page 11: Using elasticsearch with rails

Elasticsearch Key Concepts

Cluster – A cluster consists of one or more nodes which share the same cluster name. Each cluster has a single master node which is chosen automatically by the cluster and which can be replaced if the current master node fails.

Node – A node is a running instance of elasticsearch which belongs to a cluster. Multiple nodes can be started on a single server for testing purposes, but usually you should have one node per server. At startup, a node will use unicast (or multicast, if specified) to discover an existing cluster with the same cluster name and will try to join that cluster.

Index – An index is like a ‘database’ in a relational database. It has a mapping which defines multiple types. An index is a logical namespace which maps to one or more primary shards and can have zero or more replica shards.

Type – A type is like a ‘table’ in a relational database. Each type has a list of fields that can be specified for documents of that type. The mapping defines how each field in the document is analyzed.

http://www.elasticsearch.org/guide/reference/glossary

Page 12: Using elasticsearch with rails

Elasticsearch Key Concepts

Document – A document is a JSON document which is stored in elasticsearch. It is like a row in a table in a relational database. Each document is stored in an index and has a type and an id. A document is a JSON object (also known in other languages as a hash / hashmap / associative array) which contains zero or more fields, or key-value pairs. The original JSON document that is indexed will be stored in the _source field, which is returned by default when getting or searching for a document.

Field – A document contains a list of fields, or key-value pairs. The value can be a simple (scalar) value (e.g. a string, integer, date), or a nested structure like an array or an object. A field is similar to a column in a table in a relational database. The mapping for each field has a field ‘type’ (not to be confused with document type) which indicates the type of data that can be stored in that field, e.g. integer, string, object. The mapping also allows you to define (amongst other things) how the value for a field should be analyzed.

Mapping – A mapping is like a ‘schema definition’ in a relational database. Each index has a mapping, which defines each type within the index, plus a number of index-wide settings. A mapping can either be defined explicitly, or it will be generated automatically when a document is indexed.

http://www.elasticsearch.org/guide/reference/glossary
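As a rough illustration of an explicit mapping, the Ruby sketch below builds a body that could be sent with PUT to http://localhost:9200/todos to create the demo index. It assumes the 1.x-era field syntax (string, not_analyzed). The helper name, the settings values, and the choice to leave due_date to automatic mapping are all assumptions made for the example.

```ruby
require "json"

# Hypothetical helper: index definition for the demo "todos" index.
# Assumes Elasticsearch 1.x mapping syntax; later versions differ.
def todos_index_definition
  {
    settings: { number_of_shards: 5, number_of_replicas: 1 },
    mappings: {
      task: {                      # the document type, like a table
        properties: {              # the fields, like columns
          title: { type: "string" },                         # analyzed for full text search
          done:  { type: "boolean" },
          tags:  { type: "string", index: "not_analyzed" }   # kept as exact values
        }
      }
    }
  }
end

puts JSON.pretty_generate(todos_index_definition)
```

Any field not listed here, such as due_date, would fall back to the automatic mapping described above.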

Page 13: Using elasticsearch with rails

Elasticsearch Key Concepts

Shard – A shard is a single Lucene instance. It is a low-level “worker” unit which is managed automatically by elasticsearch. An index is a logical namespace which points to primary and replica shards. Elasticsearch distributes shards amongst all nodes in the cluster, and can move shards automatically from one node to another in the case of node failure, or the addition of new nodes.

Primary Shard – Each document is stored in a single primary shard. When you index a document, it is indexed first on the primary shard, then on all replicas of the primary shard. By default, an index has 5 primary shards. You can specify fewer or more primary shards to scale the number of documents that your index can handle.

Replica Shard – Each primary shard can have zero or more replicas. A replica is a copy of the primary shard, and has two purposes: 1) increase failover: a replica shard can be promoted to a primary shard if the primary fails. 2) increase performance: get and search requests can be handled by primary or replica shards.

http://www.elasticsearch.org/guide/reference/glossary
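How a document ends up in "a single primary shard" can be sketched as a hash of the routing value (by default the document id) taken modulo the number of primary shards. In the Ruby sketch below, CRC32 is only a stand-in for Elasticsearch's internal hash function, so the shard numbers it prints will not match a real cluster; the helper name is hypothetical.

```ruby
require "zlib"

# Hypothetical helper: picks a primary shard for a routing value.
# Assumption: Zlib.crc32 stands in for Elasticsearch's own routing hash.
def shard_for(routing, primary_shards = 5)
  Zlib.crc32(routing.to_s) % primary_shards
end

%w[1 2 3 4].each do |id|
  puts "todos/task/#{id} -> primary shard #{shard_for(id)}"
end
```

The same id always maps to the same shard, which is what lets a get request for todos/task/1 go straight to the shard (or one of its replicas) that holds the document.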

Page 14: Using elasticsearch with rails

Elasticsearch Key Concepts

Elasticsearch => SQL

Index => Database

Type => Table

Document => Row

Field => Column

Mapping => Schema

Shard => Partition


Page 15: Using elasticsearch with rails

Elasticsearch Installation

OS X

brew install elasticsearch

Ubuntu

wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.1.1.deb
sudo dpkg -i elasticsearch-1.1.1.deb

CentOS

wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.1.1.noarch.rpm
sudo yum install elasticsearch-1.1.1.noarch.rpm

Page 16: Using elasticsearch with rails

Elasticsearch Status Check

Page 17: Using elasticsearch with rails

Elasticsearch Cluster Status

Page 18: Using elasticsearch with rails

Elasticsearch Monitoring

elasticsearch-head - https://github.com/mobz/elasticsearch-head


Marvel - http://www.elasticsearch.org/guide/en/marvel/current/#_marvel_8217_s_dashboards

Paramedic - https://github.com/karmi/elasticsearch-paramedic

Bigdesk - https://github.com/lukas-vlcek/bigdesk/

Page 19: Using elasticsearch with rails

Elasticsearch APIs

Page 20: Using elasticsearch with rails

Elasticsearch API Examples

Use curl to run the query and facet APIs

curl -X POST "http://localhost:9200/todos/_search?pretty=true" -d '
{
  "query" : { "query_string" : {"query" : "Learn*"} },
  "facets" : { "tags" : { "terms" : {"field" : "tags"} } }
}'

Facets – todos tagged with keywords

javascript: 2
frontend: 2
backend: 2
visualization: 1
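What the terms facet returns can be illustrated without a running node: it is a count of documents per distinct tag. The Ruby sketch below recomputes those counts in memory from the four demo tasks. The constant and helper names are hypothetical, and the first task's tag is spelled "search" here.

```ruby
# The four demo tasks, reduced to the fields the facet needs.
TODOS = [
  { title: "Learn Elasticsearch",       tags: %w[search backend] },
  { title: "Learn D3 and Backbone",     tags: %w[frontend javascript visualization] },
  { title: "Learn Rails 4",             tags: %w[backend ruby rails] },
  { title: "Learn Backbone Marionette", tags: %w[frontend javascript] }
].freeze

# Hypothetical helper: counts documents per distinct value of a field,
# most frequent first (ties broken alphabetically).
def terms_facet(docs, field)
  counts = Hash.new(0)
  docs.each { |doc| Array(doc[field]).each { |term| counts[term] += 1 } }
  counts.sort_by { |term, count| [-count, term] }.to_h
end

terms_facet(TODOS, :tags).each { |term, count| puts "#{term}: #{count}" }
```

The counts for javascript, frontend, backend, and visualization agree with the facet results listed on the slide; the remaining tags each appear once.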


Page 21: Using elasticsearch with rails

Elasticsearch Query DSL

http://www.elasticsearch.org

Page 22: Using elasticsearch with rails

Elasticsearch Query DSL Examples

http://www.elasticsearch.org

Page 23: Using elasticsearch with rails

Elasticsearch Query DSL Examples

Page 24: Using elasticsearch with rails

Elasticsearch Query DSL Examples

http://www.elasticsearch.org

Page 25: Using elasticsearch with rails

Elasticsearch Plugins and Rivers

Use plugins to extend Elasticsearch functionality

elasticsearch-head, paramedic, and bigdesk are all plugins

Rivers are pluggable services that pull and index data into Elasticsearch

Rivers are available for MongoDB, CouchDB, RabbitMQ, Twitter, Wikipedia, MySQL, etc.

Page 26: Using elasticsearch with rails

Elasticsearch and Hadoop

Create an external Hive table using ES query q=china

Page 27: Using elasticsearch with rails

Elasticsearch and Hadoop

External Hive table data – wiki articles that reference the word 'china'

Page 28: Using elasticsearch with rails

Elasticsearch and Rails

Well supported with the following gems:

elasticsearch-rails https://github.com/elasticsearch/elasticsearch-rails

elasticsearch-ruby https://github.com/elasticsearch/elasticsearch-ruby

searchkick https://github.com/ankane/searchkick

tire (retire) https://github.com/karmi/retire

Page 29: Using elasticsearch with rails

Elasticsearch and Rails/Ruby

Page 30: Using elasticsearch with rails

Elasticsearch vs Solr

Feature parity between Elasticsearch & Solr http://solr-vs-elasticsearch.com/

Elasticsearch is easier to use and maintain

Built from the ground up for scale (for all features)

Solr - not all features are available in SolrCloud

Page 31: Using elasticsearch with rails

Advanced Features of Elasticsearch

Fuzzy and Proximity Search

Autocomplete (term, phrase, completion, and context suggesters)

Suggest API

Geospatial Search (point, bounding box, polygon)

Plugins to extend functionality

Scripting in JavaScript, Python, Groovy, and Java

Page 32: Using elasticsearch with rails

Advanced Features of Elasticsearch

Aggregation (more dimensions than Facets)

Related Image Search using LIRE (search similar images based on criteria)

Percolator (index queries and match them against incoming data; useful for event alerts, e.g. back-in-stock notifications)

Re-scoring on query results

Polymorphic Search
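The percolator idea (store the queries, then ask which of them match a new document) can be illustrated with a toy in-memory version. This Ruby sketch is only a model of the concept and does not use the Elasticsearch percolator API; the class, method, and alert names are all hypothetical.

```ruby
# Toy in-memory model of percolation; not the Elasticsearch API.
class ToyPercolator
  def initialize
    @queries = {}
  end

  # Register a named query as a predicate over a document hash.
  def register(name, &predicate)
    @queries[name] = predicate
  end

  # Return the names of all registered queries that match the document.
  def percolate(doc)
    @queries.select { |_name, predicate| predicate.call(doc) }.keys
  end
end

# Hypothetical back-in-stock alerts, one stored query per product.
def back_in_stock_alerts
  percolator = ToyPercolator.new
  percolator.register("alert-widget") { |doc| doc[:sku] == "widget" && doc[:in_stock] }
  percolator.register("alert-gadget") { |doc| doc[:sku] == "gadget" && doc[:in_stock] }
  percolator
end

p back_in_stock_alerts.percolate({ sku: "widget", in_stock: true })
```

This reverses the usual direction of search: the queries are the stored data, and each incoming document is what gets matched against them.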


Page 33: Using elasticsearch with rails

Elasticsearch: the ELK Stack

Combining the massively popular Elasticsearch, Logstash and Kibana

End-to-end stack that delivers actionable insights in real-time from almost any type of structured and unstructured data source

Page 34: Using elasticsearch with rails

Spree + Elasticsearch - you do e-commerce?

Elasticsearch complements or replaces Spree’s built-in search (ActiveRecord with the ransack gem)

Two existing gems: spree_elasticsearch and spree_elastic

spree_elasticsearch - replaces built-in search - e.g. Product.search, etc.

spree_elastic - complements built-in search - e.g. Product.elasticsearch

Both use model decorators to add the ES search capabilities

You can build on top of one of them, or use elasticsearch-rails directly

Page 35: Using elasticsearch with rails

Elasticsearch and Spree - spree_elastic gem

Page 36: Using elasticsearch with rails

Elasticsearch and Spree - spree_elasticsearch gem

Page 37: Using elasticsearch with rails

Elasticsearch Resources

http://www.elasticsearch.org/overview/

http://www.elasticsearch.org/guide/

https://github.com/elasticsearch/elasticsearch-hadoop

https://github.com/mobz/elasticsearch-head

http://railscasts.com/episodes/306-elasticsearch-part-1

http://railscasts.com/episodes/307-elasticsearch-part-2

Page 38: Using elasticsearch with rails

9,816 hours saved by working remotely

31 employees across 4 countries

Founded & started in 2007, Washington D.C.

We Make.

Designers, developers and project managers; this is who we are.

Building, breaking and solving; this is what we do.

Page 39: Using elasticsearch with rails

Gracias

Merci ありがとう

Danke 谢谢

Thank You
