QUEUE IN THE CLOUD WITH MONGODB MONGODB LA 2013 NURI HALPERIN


DESCRIPTION

Leveraging MongoDB as a queue infrastructure. This talk describes a low-overhead approach to building a queue as a service on top of MongoDB, covering durability, long-running processes and distributed processing.


Page 1: Queue in the cloud with mongo db

QUEUE IN THE CLOUD WITH MONGODB

MONGODB LA 2013

NURI HALPERIN

QUEUE

USAGE

Ordered execution

Buffering consumer/producer

Work distribution

GOALS OF PROJECT

Leverage Mongo

• Reduce ops overhead by reusing infrastructure
• Map queue semantics to Mongo’s strengths

Reliable

• Durable: supports long-running processes
• Resilient to machine failure
• Narrows the window of failure / data loss

Centralized, distributed:

• Multiple producers
• Multiple consumers

ITERATION 0

Capped collection – not the perfect choice

• Tailing the queue seems attractive, but…
• Need external sync to avoid double-consume
• Secondary indexes and updates are an anti-pattern on capped collections

Relaxing FIFO is OK

• No guarantee that first-popped is first done
• Multi-client is negated if they have to sync on execution order
• Race condition for queue insertion has same effect

Conclusion: the project does not use a capped collection and relaxes FIFO.

PARANOID BY DESIGN

• Network dies
• Process dies
• DB dies
• Machine dies
• Poison letter
• Dead letter

ITERATION 1

db.q4foo.save({v:{f:1}}) // enqueue

db.q4foo.findAndModify({query: {}, sort: {_id:1}, remove: true}) // dequeue: fetch and remove the oldest item in one step

Hot: quick and simple

Not: if the client dies, or the item is lost in transit after removal, the item is gone and leaves no trace.
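
The "Not" case can be illustrated without a database. The sketch below is a plain JavaScript model, not MongoDB code: an array stands in for the q4foo collection, and destructivePop and crashingWorker are hypothetical names introduced only for this illustration.

```javascript
// In-memory stand-in for the collection; NOT a MongoDB call.
var queue = [{ _id: 1, v: { f: 1 } }];

// Hypothetical model of findAndModify({..., remove: true}):
// the oldest item is returned and removed in one step.
function destructivePop(q) {
  return q.length > 0 ? q.shift() : null;
}

// A worker that crashes while processing the item.
function crashingWorker(item) {
  throw new Error("worker died while processing item " + item._id);
}

var lostItem = destructivePop(queue);
var workerFailed = false;
try {
  crashingWorker(lostItem);
} catch (e) {
  workerFailed = true;
}

// The item left the queue at pop time, so nothing is left to retry.
var remaining = queue.length;
```

Because removal happens before processing, a failure after the pop leaves no record that the item ever existed.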

ARE WE THERE YET?

• Network dies
• Process dies
• DB dies
• Machine dies
• Poison letter
• Dead letter

QUEUE SEMANTICS

Local / Memory      Distributed

Push                Put

Pop                 Get << visibility >>

<< exception >>     Release << retry >>

                    Delete

                    << exception >>
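
The distributed column can be read as a four-call interface. The sketch below is an in-memory JavaScript model of that reading, not MongoDB code; MemoryQueue, heldUntil and the method names are hypothetical, and "visibility" is interpreted here as "a fetched item is hidden from other consumers until it is released or deleted".

```javascript
// In-memory model of Put / Get / Release / Delete; NOT MongoDB code.
function MemoryQueue() {
  this.items = [];
  this.nextId = 1;
}

// Put: add an item that is immediately visible (heldUntil is null).
MemoryQueue.prototype.put = function (value) {
  var item = { id: this.nextId++, value: value, heldUntil: null };
  this.items.push(item);
  return item.id;
};

// Get: return the oldest visible item and mark it as held,
// instead of removing it from the queue.
MemoryQueue.prototype.get = function (now, holdMs) {
  for (var i = 0; i < this.items.length; i++) {
    if (this.items[i].heldUntil === null) {
      this.items[i].heldUntil = now + holdMs;
      return this.items[i];
    }
  }
  return null;
};

// Release: make a held item visible again so it can be retried.
MemoryQueue.prototype.release = function (id) {
  this.items.forEach(function (item) {
    if (item.id === id) { item.heldUntil = null; }
  });
};

// Delete: remove the item once processing has succeeded.
MemoryQueue.prototype.remove = function (id) {
  this.items = this.items.filter(function (item) { return item.id !== id; });
};
```

A consumer under this model calls get, processes the item, then calls remove on success or release on failure.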

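ITERATION 2

db.q4foo.save({v:{f:1}, dq: null}) // enqueue; dq: null marks the item as not yet dequeued

db.q4foo.findAndModify({query: {dq: null}, sort: {_id:1}, update: {$set: {dq: later(60)}}}) // consume; later() is a helper not shown on the slide

… If processing succeeded => delete the item.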

Hot: If client dies, item remains in queue. Data not lost.

Not: the index on _id is less useful at high volume.
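
The slides call later(60) without defining it. One plausible reading is a helper that returns a timestamp the given number of seconds ahead; the implementation below is an assumption for illustration, including the optional second parameter, and is not taken from the talk.

```javascript
// Hypothetical implementation of later(); the slides do not define it.
// Returns a Date that is `seconds` seconds after `from` (defaults to now).
function later(seconds, from) {
  var base = from instanceof Date ? from.getTime() : Date.now();
  return new Date(base + seconds * 1000);
}
```

With this reading, dq holds the time until which the item is considered taken by a consumer.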

ARE WE THERE YET?

• Network dies
• Process dies
• DB dies
• Machine dies
• Poison letter
• Dead letter

ITERATION 3

db.q4foo.save({v:{f:1}, dq: null, pc: 0}) // enqueue

db.q4foo.findAndModify({query: { dq: null, pc:{$lt:3}}, sort: {_id:1}, update:{$set:{dq:later(60)},$inc:{pc:1}}}) // consume

db.q4foo.findAndModify({query: {_id:"..."}, update:{$set:{dq: null}}}) // release

Hot: An item can be retried automatically (counted in pc) after it is released. An exhausted item remains in the queue.

Not: Not strict FIFO.
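
The consume and release calls differ only in the document passed to findAndModify, so they can be built by small functions. This is a minimal sketch: consumeCommand, releaseCommand and the body of later are hypothetical helpers; only the field names dq and pc and the document shapes come from the slide.

```javascript
// Assumed helper: a Date `seconds` seconds from now (not defined on the slides).
function later(seconds) {
  return new Date(Date.now() + seconds * 1000);
}

// Build the "consume" argument: pick the oldest item that is not held
// and has been tried fewer than maxTries times; mark it held and count the try.
function consumeCommand(maxTries, holdSeconds) {
  return {
    query: { dq: null, pc: { $lt: maxTries } },
    sort: { _id: 1 },
    update: { $set: { dq: later(holdSeconds) }, $inc: { pc: 1 } }
  };
}

// Build the "release" argument: clear dq so the item can be consumed again.
function releaseCommand(id) {
  return { query: { _id: id }, update: { $set: { dq: null } } };
}
```

In the shell these would be used as db.q4foo.findAndModify(consumeCommand(3, 60)) and db.q4foo.findAndModify(releaseCommand(item._id)). An item whose pc has reached the limit no longer matches the consume query, which is how a poison letter stops being retried.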

ARE WE THERE? YES.

• Network dies
• Process dies
• DB dies
• Machine dies
• Poison letter
• Dead letter

ITERATION 4

Ensure your queue writes use applicable durability

• db.q4foo.save() + getLastError(…)
• db.q4foo.findAndModify() + getLastError(…)

Replica sets for durability only. No capacity or speed gain.
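
getLastError(…) was the shell idiom at the time of the talk. More recent MongoDB versions generally express the same requirement as a write concern attached to the operation itself. The sketch below assumes a hypothetical helper, withDurability, and a write concern of {w: "majority", j: true} chosen purely for illustration; the right level depends on the deployment.

```javascript
// Hypothetical helper: copy a findAndModify argument and attach a write concern,
// leaving the original object untouched.
function withDurability(command, writeConcern) {
  var copy = {};
  Object.keys(command).forEach(function (key) { copy[key] = command[key]; });
  // Assumed default: acknowledged by a majority and journaled.
  copy.writeConcern = writeConcern || { w: "majority", j: true };
  return copy;
}

// Example: the Iteration 2 consume document with a write concern added.
var consume = withDurability({
  query: { dq: null },
  sort: { _id: 1 },
  update: { $set: { dq: new Date(Date.now() + 60 * 1000) } }
});
```

On shell versions where findAndModify accepts a writeConcern field, the result can be passed directly as db.q4foo.findAndModify(consume).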

OTHER THOUGHTS

Create admin jobs to monitor queues:

• Growth
• Retries exhausted
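
A monitoring job mostly needs the right query filters. The sketch below builds them from the dq and pc fields used in Iteration 3; the function names and the choice of what counts as "exhausted" or "stuck" are assumptions for illustration.

```javascript
// Items waiting to be consumed: not held and with tries remaining.
// Counting these over time gives a measure of queue growth.
function pendingFilter(maxTries) {
  return { dq: null, pc: { $lt: maxTries } };
}

// Items that have used up their tries, whether or not they were released.
function exhaustedFilter(maxTries) {
  return { pc: { $gte: maxTries } };
}

// Items still marked as held although their dq time has passed,
// for example because the client died before releasing them.
function expiredHoldFilter(now) {
  return { dq: { $lt: now } };
}
```

An admin job could pass these to a count or find on the queue collection, for example db.q4foo.countDocuments(exhaustedFilter(3)) in shells that provide countDocuments.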

Consider TTL risks (e.g. client failure before calling Release())

Consider idempotent operations when possible

Design clients to back off polling

Separate queue vs. extra “topic” field

Consider dedicated DB for write-lock scope

Capped vs. regular collection – capped collections can now have _id and in-place updates.
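
For the point about backing off polling, one common approach is an exponential delay with a cap. The function below is a sketch under that assumption; nextDelayMs and its parameters are hypothetical and the talk does not prescribe a particular schedule.

```javascript
// Delay before the next poll: doubles after each consecutive empty poll,
// starting at baseMs and never exceeding maxMs.
function nextDelayMs(emptyPolls, baseMs, maxMs) {
  var delay = baseMs * Math.pow(2, emptyPolls);
  return Math.min(delay, maxMs);
}
```

A client would reset emptyPolls to 0 whenever a poll returns an item, so a busy queue is polled at the base rate and an idle one at the capped rate.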

Q&A

Nuri Halperin

[email protected]

Thank you!