Deploying Immutable infrastructures with RabbitMQ and Solr

Deploying Immutable Infrastructures

RabbitMQ and Solr

@jordillonch@eloipoch

March 2017

Xavier Sanchez

≥

Who are we?

Eloi Poch Jordi LlonchXavier Sanchez

Agenda

• What is an immutable infrastructure?

• How do we deploy a RabbitMQ cluster?

• How do we deploy a Solr cluster?

• Conclusions

What is an immutable infrastructure?

ChangeChange

Change

Change

Change

Change

Change

once you instantiate something you never

change it

replace instances with other ones with the desired changes already applied

State A

State B

State E

State D

State F

State C

Key benefits• no differences between servers (eventually)

• lower failures & easier troubleshooting

• no downtime

• more confident update (up, running & tested)

• simple update & rollback

...but this is a simplification

we should be aware that infrastructure is divided into “data”

and “everything else”

How do we deploy a RabbitMQ cluster?

What is RabbitMQ?

M M

M

M

M

M

M M

MMM

M

M M M

M

M

RabbitMQ @ Wallapop

Domain Event Bus

• preferred system of communication between services

• avoids coupling between origin (publisher) and destiny/destinies (subscribers)

• failure tolerant architecture

Characteristics

• transient data

• no control over the publishers or consumers

• sync publish and async consumption

• soft/near realtime

How do we migrate to the new cluster?

Initial status

0

Great Service

Bored ServiceWTF Service

RabbitMQ Cluster

Crazy Service

Suicidal Service

Mainstream Service

Create new cluster

1

Great ServiceBored ServiceWTF Service

RabbitMQ Cluster

Crazy ServiceSuicidal Service Mainstream Service

RabbitMQ Cluster

new

Copy definitions

2


RabbitMQ Cluster


RabbitMQ Cluster

newconfigconfig

Create federated queues

3

Queue A

Queue B

Queue C

Queue A

Queue B

Queue C

message flow

message flow

message flow

Upstream Downstream

Only when a downstream queue has idle consumers

pull messages

pull messages

pull messages


RabbitMQ Cluster


RabbitMQ Cluster

newconfig

Federated Queues

config

Change DNS

4


RabbitMQ Cluster


RabbitMQ Cluster

new

Federated Queues

configconfig

Close connections

5


RabbitMQ Cluster


RabbitMQ Cluster

new

Federated Queues

configconfig


RabbitMQ Cluster


RabbitMQ ClusterFederated Queues

oldconfigconfig

Delete federated queues

6


RabbitMQ Cluster


RabbitMQ Cluster

oldconfigconfig

Federated Queues

Delete old cluster

7

Great Service

Bored ServiceWTF Service

RabbitMQ Cluster

Crazy Service

Suicidal Service

Mainstream Service

How do we deploy a Solr cluster?

What is Apache Solr?

• Solr is an open source enterprise search platform built on Apache Lucene providing advanced full-text search capabilities

• distributed search

• faceted search

• real-time indexing

• index replication

• hit highlighting

• suggesters

• ...

Why Solr?• NRT indexing

• Fast replication

• Handle large volumes of data (>25 millions docs)

• Handle high traffic rates

• Easy to scale

• Geospatial search

• ...

Solr @ Wallapop

Multiple uses• Wall index

• General Search index

• Specialized verticals indexes

• Search suggesters

• Upload suggesters

• Filter suggesters

• ...

Characteristics

• Non-volatile data

• Total control over the publishers and consumers

• Sync/async publish and sync consumption

• Near real-time

How do we deploy a Solr cluster?

• Combining live (sync) indexing and async reindexing

• Approach used in our NRT clusters

• Search service is not affected by the process

• Fast and easy roll-back

Why reindexing?• Changes in the index format and schema require reindexing the

whole corpus!!!

• New features might require changes in schema

• Changes in data type

• Changes in tokenization and analysis chain

• Add new fields or update fields to already indexed documents

• Solr/Lucene version upgrades require reindexing

• We need seamless tools for reindexing

Initial status

0

Create new cluster

1

Activate double NRT indexing

2

Reindex old data

3

Switch search traffic

4

Deactivate double NRT indexing

5

now we can kill our old cluster...

... or maybe we can use both!

Search traffic doubling: Functional testing

• Duplicate search requests

• Send sync request to main cluster

• Send async request to alternative cluster

• Check that everything works correctly

• use real traffic without serving to real users.

Traffic segmentation• A/B testing

• relevance tests

• new ranking algorithms

• changes in tokenization/analysis chain

• new search components

• Performance testing

• Changes in machine configuration

• Changes in Solr configuration

• Changes in java configuration

• Changes in index/schema format

Much more than a traffic split

• DNS traffic splitting has some limitations

• Unable to send different search queries to each cluster

• Splitting traffic at the API level allows us to change Solr queries

• Removing barriers for testing new Solr features.

Conclusions

• there is no silver bullet

• actually know your service

• you are not the only one

• repeat it (even when not necessary)

Thanks!We are hiring!

Mobile Test Automation Engineer Web Test Automation Engineer

Functional QA Software Engineer Junior - Mobile iOS

Android Engineer

Technology

Deploying Immutable infrastructures with RabbitMQ and Solr