Prototyping Apps with Elasticsearch and Heroku October 29th 2015

Prototyping applications with heroku and elasticsearch


Page 1: Prototyping applications with heroku and elasticsearch

Prototyping Apps with Elasticsearch and Heroku October 29th 2015

Page 2: Prototyping applications with heroku and elasticsearch

Protofy Martin and Mike: Prototyping with Heroku and Elasticsearch

Page 3: Prototyping applications with heroku and elasticsearch

Protofy builds Prototypes.

Salad-Delivery-Service: From idea to first shipped salad -> 4 days.

- Validation of the concept in a prototype
- Live-Launch in March 2015
- Continuous prototyping while quickly growing key KPIs
- Seed-Financing in October, November

Page 4: Prototyping applications with heroku and elasticsearch

Protofy builds Prototypes.

Voicefile transcoding and indexing for callcenters

client-a.com:
- Validation of the concept in a prototype
- Extensive usage of Elasticsearch as main database

=> THIS is the FIRST project we will look at in more depth.

Page 5: Prototyping applications with heroku and elasticsearch

Protofy builds Prototypes.

Automatic news aggregation based on a given list of keywords and synonyms.

client-b.com:
- Validation of the concept in a prototype
- Extensive usage of Elasticsearch to filter feed items and merge them

=> THIS is the SECOND project we will look at in more depth.

Page 6: Prototyping applications with heroku and elasticsearch

And big infrastructure.

Education Community Framework with Log-Everything strategy.

PokerStrategy.com:
- 7 million members (2007-2013)
- up to 1 billion pageviews/year
- sold in mid-2013

After that, two companies were founded: DECK36 and Feelgood. Both merged in early 2015 to form PROTOFY.

Page 7: Prototyping applications with heroku and elasticsearch

Prototyping with Heroku.

Focus on the app.

Page 8: Prototyping applications with heroku and elasticsearch

Heroku: Platform as a service.

Prebuilt VMs for different programming languages
- deployment via git
- customizable with build-packs and add-ons
- easily scalable
- full logging of each part of the app and process
- releases: Easy rollback on errors
- heroku toolbelt to support local execution

Page 9: Prototyping applications with heroku and elasticsearch

Heroku: Prepare your app

- Apps using other infrastructure services like MongoDB or Redis need to be aware of environment variables.
  For example, the Elasticsearch service BONSAI provides: BONSAI_URL=https://user:[email protected]

- Use environment variables for everything that depends on the environment.

- In general, rely on the (Attention. Buzzword.) http://12factor.net/ methodology.
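A minimal sketch of the environment-variable approach, assuming a Python app. BONSAI_URL is the variable named above; the helper name `elasticsearch_config`, the localhost fallback, and the shape of the returned dictionary are illustrative assumptions, not part of the add-on.

```python
import os
from urllib.parse import urlsplit


def elasticsearch_config(env=os.environ):
    """Read the Elasticsearch endpoint from the environment (12-factor style)."""
    # BONSAI_URL is set by the add-on on Heroku; the fallback is an assumed
    # local development default, not something the add-on provides.
    url = env.get("BONSAI_URL", "http://localhost:9200")
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port or (443 if parts.scheme == "https" else 80),
        "username": parts.username,
        "password": parts.password,
    }
```

Because the function only reads a mapping, the same code runs unchanged on Heroku and locally; only the environment differs.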

Page 10: Prototyping applications with heroku and elasticsearch

Buzzwording: 12 Factors

I. Codebase: One codebase tracked in revision control, many deploys
II. Dependencies: Explicitly declare and isolate dependencies
III. Config: Store config in the environment
IV. Backing Services: Treat backing services as attached resources
V. Build, release, run: Strictly separate build and run stages
VI. Processes: Execute the app as one or more stateless processes
VII. Port binding: Export services via port binding
VIII. Concurrency: Scale out via the process model
IX. Disposability: Maximize robustness with fast startup and graceful shutdown
X. Dev/prod parity: Keep development, staging, and production as similar as possible
XI. Logs: Treat logs as event streams
XII. Admin processes: Run admin/management tasks as one-off processes

Page 11: Prototyping applications with heroku and elasticsearch
Page 12: Prototyping applications with heroku and elasticsearch

Heroku: add-ons

- Logentries

- NewRelic

- Bonsai-Elasticsearch

- MongoLab

- Scheduler

- SSL

TIP: Take care of backups yourself! Even if the provider promises to do them.

Page 13: Prototyping applications with heroku and elasticsearch

Heroku: Test before deploy

CONTINUOUS DEPLOYMENT using codeship.io (or others)

git -> bitbucket -> codeship (test) -> heroku

Page 14: Prototyping applications with heroku and elasticsearch

Terraforming.

Infrastructure as code.

Page 15: Prototyping applications with heroku and elasticsearch

Heroku: Infrastructure as a service.

- deployment via git

- a lot of add ons

- individual scaling of parts of the app

- process isolation

- full logging of each part of the app and process

- easy-to-use command line tools

- supports several languages (NodeJS, PHP, Rails, etc.)

- releases: Easy rollback on errors.

- heroku toolbelt to support local execution like it would be on heroku with Foreman

TERRAFORM

Build, Combine, and Launch Infrastructure

Page 16: Prototyping applications with heroku and elasticsearch

from automatic provisioning of servers ...

Configuration as Code

Page 17: Prototyping applications with heroku and elasticsearch

Infrastructure as Code

… to automatic provisioning of services.

Page 18: Prototyping applications with heroku and elasticsearch

Why do we need that?

As with Configuration Management:

-Replace “click-paths” with source code

-Reproducible Environment

-Versioning in SCM

-Specification and Documentation

Page 19: Prototyping applications with heroku and elasticsearch

What does it do?

Configuration Language for Services

Actions:

-Plan

-Apply

-Refresh

-Destroy

Page 20: Prototyping applications with heroku and elasticsearch

What does it manage?

Providers:

- Google Cloud
- AWS
- Azure
- Heroku
- DNSMadeEasy
- …

Resources:

- aws_instance
- aws_vpc
- azure_instance
- heroku_app
- …

Provisioners:

- chef
- file
- exec

Page 21: Prototyping applications with heroku and elasticsearch

Example (part 1)

### AWS Setup
provider "aws" {
  access_key = "${var.aws_access_key}"
  secret_key = "${var.aws_secret_key}"
  region     = "${var.aws_region}"
}

# Queue between importer and analyzer
resource "aws_sqs_queue" "importqueue" {
  name = "${var.app_name}-${var.app_env}-import-queue"
}

resource "aws_s3_bucket" "importdisk" {
  bucket = "${var.app_name}-${var.app_env}-app-importer"
  acl    = "private"
}

Page 22: Prototyping applications with heroku and elasticsearch

Example (part 2)

### Heroku Setup
provider "heroku" {...}

# App EntityImporter
resource "heroku_app" "importer" {
  name = "${var.app_name}-${var.app_env}-importer"
  config_vars {
    SQS_REGION    = "${var.aws_region}"
    SQS_QUEUE_URL = "${aws_sqs_queue.importqueue.id}"
    S3_BUCKET     = "${aws_s3_bucket.importdisk.id}"
    NODE_ENV      = "${var.app_env}"
  }
}

resource "heroku_addon" "mongolab" {
  app  = "${heroku_app.importer.name}"
  plan = "mongolab:sandbox"
}

Page 23: Prototyping applications with heroku and elasticsearch

Graph
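The graph follows from the references between resources in the example: the Heroku app reads attributes of the SQS queue and the S3 bucket, and the add-on reads the app name. The sketch below is a rough illustration of that idea, not Terraform's actual implementation. It extracts `${type.name.attribute}` references with a regular expression and sorts the resources so that dependencies come first. The `RESOURCES` dictionary and the function name are assumptions made for this example; `graphlib` requires Python 3.9 or newer.

```python
import re
from graphlib import TopologicalSorter

# Resource bodies copied from the example, keyed by "type.name".
RESOURCES = {
    "aws_sqs_queue.importqueue": 'name = "${var.app_name}-${var.app_env}-import-queue"',
    "aws_s3_bucket.importdisk": 'bucket = "${var.app_name}-${var.app_env}-app-importer"',
    "heroku_app.importer": (
        'SQS_QUEUE_URL = "${aws_sqs_queue.importqueue.id}" '
        'S3_BUCKET = "${aws_s3_bucket.importdisk.id}"'
    ),
    "heroku_addon.mongolab": 'app = "${heroku_app.importer.name}"',
}

# Matches ${type.name.attribute}; ${var.x} has only two parts and is skipped.
REFERENCE = re.compile(r"\$\{(\w+\.\w+)\.\w+\}")


def dependency_order(resources):
    """Return resource names so that each one comes after what it references."""
    graph = {
        name: {ref for ref in REFERENCE.findall(body) if ref in resources}
        for name, body in resources.items()
    }
    return list(TopologicalSorter(graph).static_order())
```

Calling `dependency_order(RESOURCES)` places both AWS resources before `heroku_app.importer`, and the app before `heroku_addon.mongolab`.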

Page 24: Prototyping applications with heroku and elasticsearch

Live-Demo

Launch application

terraform plan

terraform apply

terraform show

terraform destroy
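The four commands of the demo can also be scripted. The following is a hedged sketch, assuming the `terraform` binary is on the PATH and that variables are passed with `-var name=value`; the function names and the `dry_run` switch are made up for this example.

```python
import subprocess


def terraform_command(action, variables=None):
    """Build the argument list for one of the four actions from the demo."""
    if action not in ("plan", "apply", "show", "destroy"):
        raise ValueError(f"unsupported action: {action}")
    argv = ["terraform", action]
    # `show` only prints the recorded state, so variables are not passed to it.
    if action != "show":
        for name, value in sorted((variables or {}).items()):
            argv += ["-var", f"{name}={value}"]
    return argv


def run_demo(variables, dry_run=True):
    """Walk through plan, apply, show, destroy; only print when dry_run is set."""
    for action in ("plan", "apply", "show", "destroy"):
        argv = terraform_command(action, variables)
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)
```

With `dry_run=True` nothing is executed, which makes it safe to check the command lines first. Depending on the Terraform version, `apply` and `destroy` may ask for interactive confirmation.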

Page 25: Prototyping applications with heroku and elasticsearch

Comparable Software

– AWS CloudFormation

– HEAT, OpenStack orchestration

– boto, Python AWS library

– fog, Ruby cloud abstraction library

Page 26: Prototyping applications with heroku and elasticsearch

Problems

– Version 0.6

– Still a few bugs

– Provider coverage

– Modules too simple

– Lacking syntactic sugar

Page 27: Prototyping applications with heroku and elasticsearch

Software as a service. Elasticsearch.

Page 28: Prototyping applications with heroku and elasticsearch

Elasticsearch Service

Let others do the dirty work.

- Relatively complex setup with Shards and Replicas is maintained by specialists.

- Backups and version upgrades are done by these specialists, too.

- But 1: If version upgrades are announced YOU have to take action.

- But 2: Backups SHOULD be done by the specialists. In some cases they cannot provide consistent backups, and that can lead to data loss. => Take care of backups yourself.

- But 3: If you need plugins: in the non-dedicated plans you cannot install them.

Decide carefully whether to use a service or to run it yourself.

Page 29: Prototyping applications with heroku and elasticsearch

The projects. Short overview.

Page 30: Prototyping applications with heroku and elasticsearch

client-a.com

Voicefile transcoding and indexing for callcenters

- Make telephone calls searchable
- Access management per callcenter and customer
- Fast responses and results
- Mobile
- Be able to white label

Page 31: Prototyping applications with heroku and elasticsearch

Callcenter

Page 32: Prototyping applications with heroku and elasticsearch

client-b.com

Automatic content aggregation based on the editor's given input.

- Have up to 250,000 news items/day related to a topic from blogs, Twitter/Facebook/Instagram and other configurable sources.
- Have automatic sorting and merging of similar items into stories.
- Be near real-time
- Make editing of main stories possible
- Mobile first

Page 33: Prototyping applications with heroku and elasticsearch
Page 34: Prototyping applications with heroku and elasticsearch

Elasticsearch. Some magic for the app.

Page 35: Prototyping applications with heroku and elasticsearch

Elasticsearch: General

Search server based on Lucene, providing a RESTful web interface for JSON documents.

- Near real-time search.

- Sophisticated mapping configuration options. => Where the magic comes from.

- Highly scalable and available.

- Conflict management with optimistic version control to avoid data loss during concurrent write operations.

- Supporting Plugins for different areas (Like Filters, Queries, Analyzers etc.)
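To illustrate the optimistic version control mentioned above, here is a small in-memory model. It is a stand-in written for this example, not the Elasticsearch client or its API: every write may state the version it was based on, and the store rejects the write if the document has changed in the meantime. The class names and the document states used in the usage note are hypothetical.

```python
class VersionConflict(Exception):
    """Raised when a write is based on an outdated document version."""


class VersionedStore:
    """In-memory stand-in for a document store with optimistic version checks."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, body)

    def get(self, doc_id):
        return self._docs[doc_id]

    def put(self, doc_id, body, expected_version=None):
        current_version = self._docs.get(doc_id, (0, None))[0]
        # Reject the write if someone else updated the document in between.
        if expected_version is not None and expected_version != current_version:
            raise VersionConflict(
                f"{doc_id}: expected v{expected_version}, found v{current_version}"
            )
        new_version = current_version + 1
        self._docs[doc_id] = (new_version, body)
        return new_version
```

A writer that read version 1 and tries to save after another writer has already produced version 2 gets a `VersionConflict` instead of silently overwriting the newer state, for example when two workers move the same call recording from "uploaded" to "transcoded".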

Page 36: Prototyping applications with heroku and elasticsearch

Elasticsearch: client-a.com

Elasticsearch as main database

- Provide several states of a document based on its processing state. Documents are always findable and restricted by ACLs.

How to achieve that?

Page 37: Prototyping applications with heroku and elasticsearch

Elasticsearch: client-a.com

Restrict access by ACLs for "normal" search

1. Check whether the user is allowed to access the groups for which documents are requested.

2. If yes: Build a query with a filter restricting results to customers and callcenters based on the ACL.
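The two steps can be sketched as a single function. This is an assumption-laden illustration, not the project's code: the function name and the way the ACL is passed in (a plain list of allowed callcenters) are invented here, only callcenters are covered, and the `filtered` query mirrors the older Elasticsearch syntax used in this deck.

```python
def build_acl_query(text, requested_callcenters, allowed_callcenters):
    """Return a filtered query limited to call centers the user may access."""
    # Step 1: refuse the request if it names a group outside the user's ACL.
    denied = set(requested_callcenters) - set(allowed_callcenters)
    if denied:
        raise PermissionError(f"no access to: {sorted(denied)}")
    # Step 2: restrict results with one term filter per requested call center.
    acl_filter = {
        "bool": {
            "should": [
                {"term": {"source.callcenter": name}}
                for name in requested_callcenters
            ]
        }
    }
    return {
        "query": {
            "filtered": {
                "query": {"query_string": {"query": text, "default_operator": "AND"}},
                "filter": {"bool": {"must": [acl_filter]}},
            }
        }
    }
```

Keeping the ACL in the filter rather than in the query string means the restriction cannot be bypassed by whatever the user types into the search box.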

Page 38: Prototyping applications with heroku and elasticsearch

Find documents related to callcenter1 and callcenter2

{
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "default_operator": "AND",
          "minimum_should_match": "55%",
          "auto_generate_phrase_queries": true,
          "phrase_slop": 3,
          "fields": [ "tags^2", "transscript" ],
          "query": "*"
        }
      },
      "filter": {
        "bool": {
          "must": [
            { "range": { "lastUpdated": { "gte": "now-24h", "lte": "2015-10-25T17:34:24+00:00" } } },
            { "range": { "lastUpdated": { "gte": "2015-08-30T21:04:08+00:00" } } },
            { "bool": { "should": [
              { "term": { "source.callcenter": "callcenter1" } },
              { "term": { "source.callcenter": "callcenter2" } }
            ] } }
          ]
        }
      }
    }
  }
}

The slide shows this query a second time; that copy differs only in searching the field "transscript.texts.contents" instead of "transscript".

Page 39: Prototyping applications with heroku and elasticsearch

Elasticsearch: client-a.com

Restrict access for suggests

1. Completion suggesters are a special mechanism for really fast autocompletion:
   https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-completion.html

2. How to make suggestions context (ACL) aware?

Page 40: Prototyping applications with heroku and elasticsearch

Find suggestions related to context

Query

{
  "body": {
    "suggest": {
      "text": "Agent 007",
      "completion": {
        "field": "agent.suggest",
        "size": 20,
        "fuzzy": false,
        "context": {
          "customer": [ "customer1", "customer4" ],
          "callcenter": [ "callcenter1", "callcenter2" ]
        }
      }
    }
  }
}

Mapping

{
  "agent": {
    "type": "multi_field",
    "fields": {
      "agent": { "type": "string", "copy_to": "autocompletion" },
      "autocompletion": { "type": "string", "index_analyzer": "edgeNGram_analyzer_suggest" },
      "suggest": {
        "type": "completion",
        "index_analyzer": "nGram_analyzer_suggest2",
        "search_analyzer": "whitespace_analyzer",
        "max_input_length": 20,
        "context": {
          "customer": { "type": "category", "path": "source.customer_lowercase" },
          "callcenter": { "type": "category", "path": "source.callcenter_lowercase" }
        }
      }
    },
    "include_in_all": false
  }
}
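A sketch of how the suggest request could be assembled from a user's ACL. The function name is an assumption. Lowercasing the context values is also an assumption, inferred only from the `_lowercase` suffix of the context paths in the mapping. The field name `agent.suggest` and the request shape follow the query shown here, without the outer "body" wrapper.

```python
def build_suggest_body(text, customers, callcenters, size=20):
    """Build a completion-suggest request limited to the user's ACL context."""
    return {
        "suggest": {
            "text": text,
            "completion": {
                "field": "agent.suggest",
                "size": size,
                "fuzzy": False,
                # Category contexts: only suggestions indexed with one of
                # these customer / call center values are returned.
                "context": {
                    "customer": sorted(c.lower() for c in customers),
                    "callcenter": sorted(c.lower() for c in callcenters),
                },
            },
        }
    }
```

The context values come from the server-side ACL, never from the client, so autocompletion cannot leak agent names from other customers.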

Page 41: Prototyping applications with heroku and elasticsearch

Elasticsearch: client-b.com

Elasticsearch to find similar articles and match them to stories

- Index stories and automatically find entities within the article's text

- Match similar articles to at least one story (based on entities and context)

How to do that?

Page 42: Prototyping applications with heroku and elasticsearch

Elasticsearch: client-b.com

Entity matching by list of keep words and aliases

1. Create a list of synonyms and keep words to be used in filters.
   https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-keep-words-tokenfilter.html
   https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-synonym-tokenfilter.html

2. Index the document a first time to find entities based on keep words and synonyms.

3. Take the document enriched with entities and build a query from it to match against the set of documents and find similar articles.

4. Combine them into a story.
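Step 3 could look roughly like the following sketch. It assumes the entities found in step 2 are stored in a field called `entities`; that field name, the function name, and the `min_shared` threshold are hypothetical and not taken from the project.

```python
def build_similarity_query(entities, min_shared=1):
    """Build a query that matches articles sharing entities with a document."""
    unique = sorted(set(entities))
    if not unique:
        raise ValueError("document has no entities to match on")
    return {
        "query": {
            "bool": {
                # One optional clause per entity; more shared entities score higher.
                "should": [{"term": {"entities": name}} for name in unique],
                "minimum_should_match": min_shared,
            }
        }
    }
```

Raising `min_shared` makes the matching stricter, which trades fewer wrongly merged articles against more stories that stay split.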

Page 43: Prototyping applications with heroku and elasticsearch

Setting for matching entities

"settings": {
  "analysis": {
    "filter": {},
    "analyzer": {
      "entity_analyzer": {
        "tokenizer": "whitespace",
        "filter": [ "german_stop", "shingle", "entity_synonym", "shingle", "entity_keepwords" ]
      }
    }
  }
},

Page 44: Prototyping applications with heroku and elasticsearch

Live-Demo

Check how entities are matched in a text

1. ./load_entities_list

2. curl -XGET "localhost:9200/talk/_analyze?analyzer=entity_analyzer&pretty=true" -d "Text"

=> The document is indexed with the entities found at indexing time. The analysis process works like operating on a stream.
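For readers without a running cluster, the idea behind the entity analyzer can be imitated in plain Python. This is a simplified model, not Elasticsearch's analysis chain: it applies whitespace tokenization, stop-word removal, one shingle step (the real analyzer has two), synonym mapping, and a keep-words filter, and it additionally removes duplicates. The word lists in the usage note are invented example data.

```python
def shingles(tokens, max_size=2):
    """Return the tokens plus adjacent word groups up to max_size words."""
    out = []
    for start in range(len(tokens)):
        for size in range(1, max_size + 1):
            if start + size <= len(tokens):
                out.append(" ".join(tokens[start:start + size]))
    return out


def find_entities(text, stop_words, synonyms, keep_words):
    """Rough imitation of: whitespace, stop, shingle, synonym, keep."""
    tokens = [t for t in text.split() if t not in stop_words]
    candidates = shingles(tokens)
    # Map aliases to their canonical entity name, then drop everything
    # that is not on the keep-words list.
    canonical = [synonyms.get(c, c) for c in candidates]
    found = []
    for name in canonical:
        if name in keep_words and name not in found:
            found.append(name)
    return found
```

For example, with the stop word "die", the alias "Kanzlerin" mapped to "Angela Merkel", and the keep words "Angela Merkel" and "Hamburg", the text "die Kanzlerin Angela Merkel besucht Hamburg" yields the two entities "Angela Merkel" and "Hamburg".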

Page 45: Prototyping applications with heroku and elasticsearch

Martin Schütte and Mike Lohmann

Protofy
Kaiser-Wilhelm-Straße 85
20355 Hamburg

[email protected]@protofy.com