Quaestor · Typical Speedup: 15x Impact of Global Baqend Caching T R Y T H I S benchmark.baqend.com 0,2s 1,5s 1,6s 2,7s 3,8s 4,4s California 0,3s 2,1s 1,8s 2,6s 4,2s 0,8s 0,3s 1,6s

QuaestorQuery Web Caching for DBaaS ProvidersFelix Gessert, Michael Schaarschmidt, Wolfram Wingerath, Erik Witt, Eiko Zoneki, Norbert Ritter

Aug 29, VLDB 2017, Munich

Who we are

Backend-as-a-Service Startup since 2014Research Project since 2010

Challenges and Main Contributions

Industry Application

Introduction Main Part Conclusion

Motivation and High-Level Overview

Presentationis loading

Average: 9,3s

The Latency Problem

Loading…

-1% Revenue

-9% Visitors

-20% Traffic

State of the ArtThree Bottlenecks: Frontend, Latency and Processing

High Latency

Processing Overhead

Frontend

NetzwerkThroughput vs. Latency

I. Grigorik, High performance browser networking. O’Reilly Media, 2013.

NetzwerkBandbreite vs. Latenz

2× Throughput = Same Load Time

½ Lantecy ≈ ½ Load Time

Solution: Global CachingFresh Data From Distributed Web Caches

Low Latency

Less Processing

New Caching AlgorithmsSolve Consistency Problem

1 0 11 0 0 10

Typical Speedup: 15xImpact of Global Baqend Caching

T R Y T H I S

benchmark.baqend.com

0,2s1,5s1,6s

2,7s3,8s

4,4s

California

0,3s2,1s

1,8s2,6s

4,2s0,8s

0,3s1,6s

1,9s3,2s

5,3s6,6s

0,3s6,3s

4,7s6,5s

10,0s9,3s

BaqendAzureParse

FirebaseKinvey

Apiomat

Baqend

Baqend

Baqend

Frankfurt

Tokyo

Sydney

https://benchmark.baqend.com/





ChallengesHow to achieve globally low latency

Cache CoherenceHow keep dynamic data up-to-date in expiration-based caches?

Invalidation DetectionHow detect when queryresults change?

TTL EstimationWhat data to cache forhow long?

1 2

3

1 4 020

purge(obj)

hashB(oid)hashA(oid)

31 1 110Flat(Counting Bloomfilter)

hashB(oid)hashA(oid)

BrowserCache

CDN

1

𝑓 ≈ 1 − 𝑒−𝑘𝑛𝑚

𝑘

𝑘 = ln 2 ⋅ (𝑛

𝑚)

False-Positive

Rate:

Hash-

Functions:

With 20.000 entries and a 5% false positive rate: 11 Kbyte

Consistency: Δ-Atomicity, Read-Your-Writes, Monotonic Reads,

Monotonic Writes, Causal Consistency

Cache CoherenceDynamic Caching in Detail

Has Time-to-Live (expiration)





1 2

3

InvaliDB on Storm

Server

CreateUpdateDelete

Pub-Sub Pub-Sub

1 0 11 0 0 10 1 1

Fresh Bloom filter

Real-TimeQueries

(Websockets)

Fresh Caches

How to detect changes toquery results:„Give me the most popularproducts that are in stock.“

Add

Change

Remove

Invalidation DetectionMatching Queries and Updates

UpdateUpdate Database

Match!

Two-dimensional partitioning:• by Query• by Object→ scales with queries and writes

Implementation:• Apache Storm & Java• MongoDB query language• Pluggable engine

Subscription

Write op

17

InvaliDBFilter Queries: Distributed Query Matching

var query = DB.Tweet.find().matches('text', /my filter/).descending('createdAt').offset(20).limit(10);

query.resultList(result => ...);

query.resultStream(result => ...);

Static Query

Real-Time Query

Programming Real-Time QueriesJavaScript API





1 2

3

Learning RepresentationsDetermining optimal query result representation

Setting: query results can either be represented as references (id-list) or fullresults (object-lists)

Approach: Cost-based decision model that weighs expected round-trips vsexpected invalidations

Ongoing Research: Reinforcement learning of decisions

[𝑖𝑑1, 𝑖𝑑2, 𝑖𝑑3]

Object-ListsId-Lists

[ 𝑖𝑑: 1, 𝑣𝑎𝑙: ′𝑎′ , 𝑖𝑑: 2, 𝑣𝑎𝑙: ′𝑏′ ,{𝑖𝑑: 3, 𝑣𝑎𝑙: ′𝑐′}]

Less Invalidations Less Round-Trips

Problem: if TTL ≫ time to next write, then it is contained in Cache Sketch unnecessarily long

TTL Estimator: finds „best“ TTL and decides cacheability

Trade-Offs:

TTL EstimationDetermining the best TTL and cacheability

Longer TTLsShorter TTLs

• higher cache-hit rates• more invalidations

• less invalidations• less stale reads

Verfahren: Zeit bis zum nächsten Write 𝐸[𝑇𝑤] schätzen (Alex Protocol, EWMA, Deep Reinforcement Learning)

Gute TTLs optimaler Bloomfilter

TTL < TTLmin write-heavy Daten nicht cachen





Content-Delivery-Network

Scalable DatabasesBackend-as-a-Service API:Data, Queries, User Login, etc.Inclusion of all Web CachesAccess through HTTP

Backend ArchitectureBaqend Cloud

Content-Delivery-NetworkContent-Delivery-Network

Baqend Cloud

on

CDN

on

Backend ArchitectureBaqend Cloud

DevelopmentOn Baqend

Dashboard

Create schema, configure, browse data, etc.

CLI

Develop, deploy and testfrontend und backend code

REST & SDK

Website logic: load site, get data, login, etc.

Baqend Speed KitApplying Our Caching to Existing Websites

Redirects requests to Baqend forfaster delivery by including a snippet

Also available as a WordPress-Plugin Will be released in August 2017

Cache

Bloom filter

SW.js

Websites Internet & CDN Baqend Legacy System

LegacyWebsite

H2

Files, API

Purge Refresh Handler

Media

TextPull

Domain Experts

Config: what to intercept?

VLDB Performance with Speedkit

Ziel mit InnoRampUpFor a web without loading times.

[email protected]@Baqendcom

>10xFaster Loads

AutomaticScaling

Faster Development

Contact us:

mailto:[email protected]

https://www.baqend.com/

https://twitter.com/Baqendcom

Documents

Quaestor · Typical Speedup: 15x Impact of Global Baqend Caching T R Y T H I S benchmark.baqend.com 0,2s 1,5s 1,6s 2,7s 3,8s 4,4s California 0,3s 2,1s 1,8s 2,6s 4,2s 0,8s 0,3s 1,6s