HTTP cache @ PUG Rome 03-29-2011

A short presentation about HTTP caching at the Rome PHP User Group.

HTTP cache

Alessandro Nadalin

March 29th, 2011

The Web is inconsistent

get over it

Why?

Because it scales to billions of users

WWW data traffic would not be possible without being inconsistent

How does it scale?

Cache

What is caching?

Storing data so that future requests for that data can be served faster

( general purpose )
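
A minimal sketch of the general-purpose definition, using Python's standard `functools.lru_cache`. The function `slow_square` and the `calls` counter are hypothetical names, made up only to show when the real work happens.

```python
from functools import lru_cache

# Hypothetical counter: it only exists to show how often the real work runs.
calls = {"count": 0}


@lru_cache(maxsize=128)
def slow_square(n):
    """Pretend this is expensive (a database query, a remote call, ...)."""
    calls["count"] += 1
    return n * n


slow_square(12)  # computed: the function body runs
slow_square(12)  # served from the cache: the function body does not run again
```

The second call costs a dictionary lookup instead of the "expensive" computation, which is the whole idea of storing data so that future requests are served faster.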

The process of storing copies of your most-accessed resources close to one or more clients

( the WWW )

Caching is not saving a webpage, or a fragment of it, on the disk, to serve it to the next client

This is a caching mechanism

Disk-persisted or in-memory

Can be distributed

Can be distributed with a NoSQL database

Implementing a cache can be really fun

Don't reinvent the wheel

Goals

Work less

evolve

loose coupling

Evolve: because you want your platform to be extensible

Loose coupling: because you want it to be easy to integrate with, evolve, plug in and maintain

Work less: because every LoC is bug-prone and our man-days are a cost that is hard to scale

enters our Hero #1

enters our Hero #2

http://www.lullabot.com/articles/a-beginners-guide-to-caching-data

Goals

Work less

evolve

loose coupling

everything is done for us!

:)

but....

tmp files, cache tables, procedural crap...

mmmmh....

gotta be something better

Frameworks

Cache is used for compiling routes, autoloading, ...

...but also for storing the view

Goals

Work less

evolve

loose coupling

at least because we use a framework

but, probably

PHP sucks

Let's call a hero, from the East

on Rails

Goals

Work less

evolve

loose coupling

gosh :-(

hey, what about the latest products?

both Symfony2 and Ruby's Rack embrace HTTP

(here comes the nice part)

Local

Shared/proxy

Shared/reverse proxy

on the server side

Proxy Reverse proxy

http://www.codeproject.com/KB/aspnet/ExploringCaching.aspx

There is something cool about caching software:

free as in free beer

...but the server should take advantage of them

( the boring part where you need to write code )

and here are a few ways to do so, using

expiration

validation

invalidation (we will see why)

Expiration

GET / HTTP/1.1
Host: www.example.com

HTTP/1.1 200 OK
Expires: 0

HTTP/1.1 200 OK
Expires: Tue, 15 Nov 1994 01:00:00 GMT

HTTP/1.1 200 OK
Cache-Control: max-age=60, public

max-age=60: cacheable for 60 seconds

public: cacheable by both local and shared caches
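
A rough sketch of how a cache could decide whether a stored response is still fresh from its `Cache-Control` header. The function names `parse_cache_control` and `is_fresh` are assumptions made for this example, and the parsing is deliberately simplified (it ignores `s-maxage`, `Expires` and heuristic freshness).

```python
def parse_cache_control(header):
    """Turn 'max-age=60, public' into {'max-age': '60', 'public': True}."""
    directives = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        # Directives without a value (public, no-store, ...) are stored as True.
        directives[name.lower()] = value.strip('"') if sep else True
    return directives


def is_fresh(cache_control, age_seconds):
    """Return True while a stored response may be served without asking the origin."""
    directives = parse_cache_control(cache_control)
    if "no-store" in directives or "no-cache" in directives:
        return False
    max_age = directives.get("max-age")
    if not isinstance(max_age, str) or not max_age.isdigit():
        return False  # no usable max-age: treat the response as stale
    return age_seconds < int(max_age)
```

With the header from the slide, `is_fresh("max-age=60, public", 30)` is true and `is_fresh("max-age=60, public", 60)` is false: after 60 seconds the cache has to go back to the origin.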

HTTP/1.1 200 OK
Cache-Control: stale-if-error=600, stale-while-revalidate=600

fault-tolerant

stale-if-error: available during downtime

stale-while-revalidate: available during revalidation
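
The two stale directives can be read as "how long past expiry a cache may keep serving the old copy". Below is a small sketch of that decision; `directive_seconds` and `may_serve_stale` are hypothetical helpers written for this example, not the logic of any particular proxy.

```python
import re


def directive_seconds(cache_control, name):
    """Read the numeric value of a directive such as stale-if-error=600 (0 if absent)."""
    pattern = r"(?:^|[,\s])" + re.escape(name) + r"=(\d+)"
    match = re.search(pattern, cache_control)
    return int(match.group(1)) if match else 0


def may_serve_stale(cache_control, staleness_seconds, origin_failed):
    """Decide whether a stale stored response may still be served.

    staleness_seconds: how long ago the response stopped being fresh.
    origin_failed: True when the origin is unreachable or answered with a 5xx.
    """
    if origin_failed:
        window = directive_seconds(cache_control, "stale-if-error")
    else:
        # The origin is fine: serve the stale copy while revalidating in the background.
        window = directive_seconds(cache_control, "stale-while-revalidate")
    return staleness_seconds < window
```

With the header above, a response that expired 30 seconds ago can still be served whether the origin is down or merely being revalidated; one that expired 700 seconds ago cannot.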

Validation

HTTP/1.1 200 OK
ETag: "1234"

an identifier for your response

GET / HTTP/1.1
Host: www.example.com
If-None-Match: "1234"

the browser asks you whether it has been modified

Conditional requests

Relax

Calculating an ETag is cheaper than generating a full MVC response

HTTP/1.1 304 Not Modified
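
A sketch of the ETag round trip on the server side, assuming the ETag can be derived from cheap metadata (here a resource id and a version number, both hypothetical) so that the expensive `render` callable only runs when the client's copy is outdated.

```python
import hashlib


def make_etag(resource_id, version):
    """Derive a quoted ETag from cheap metadata instead of the rendered page."""
    raw = f"{resource_id}:{version}".encode("utf-8")
    return '"' + hashlib.sha1(raw).hexdigest() + '"'


def _strip_weak(tag):
    # If-None-Match uses weak comparison, so a W/ prefix is ignored.
    return tag[2:] if tag.startswith("W/") else tag


def handle_get_with_etag(request_headers, resource_id, version, render):
    """Return (status, headers, body), answering 304 when the client's ETag still matches."""
    etag = make_etag(resource_id, version)
    header = request_headers.get("If-None-Match", "")
    candidates = [_strip_weak(tag.strip()) for tag in header.split(",")]
    if "*" in candidates or etag in candidates:
        return 304, {"ETag": etag}, b""  # render() is never called
    return 200, {"ETag": etag}, render()
```

The first request gets a 200 with the ETag; when the client sends it back in `If-None-Match` and the version has not changed, the answer is an empty 304 and the page is never generated.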

HTTP/1.1 200 OK
Last-Modified: Sat, 15 Jan 2011 12:00:00 GMT

tell the client about the latest change

GET / HTTP/1.1
Host: www.example.com
If-Modified-Since: Sat, 15 Jan 2011 12:00:00 GMT

the client asks you whether it has been modified since the last time

Conditional requests

Relax

Calculating a date is cheaper than retrieving an entire object

HTTP/1.1 304 Not Modified
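
The date-based variant can be sketched the same way, using the standard library's `email.utils` to write and read HTTP dates. The handler name and the `render` callable are assumptions for this example; `last_modified` is expected to be a timezone-aware UTC datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def handle_get_with_last_modified(request_headers, last_modified, render):
    """Return (status, headers, body), answering 304 if nothing changed since the client's copy."""
    # HTTP dates have one-second resolution, so drop microseconds before comparing.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    since = request_headers.get("If-Modified-Since")
    if since:
        try:
            since_dt = parsedate_to_datetime(since)
        except (TypeError, ValueError):
            since_dt = None  # unparseable date: ignore the condition
        if since_dt is not None and since_dt.tzinfo is not None and last_modified <= since_dt:
            return 304, headers, b""  # render() is never called
    return 200, headers, render()
```

Only the modification date has to be looked up to answer a conditional request; the object itself is retrieved and rendered only when the answer is a full 200.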

Invalidation

The web is not meant for invalidating data.

Servers should not keep clients' state, otherwise they won't scale well.

That's why long-polling and endless connections have not had much success when it comes to caching.

but hey, you say

HTTP's cache fails when dealing with really dynamic pages, because consumers will always have to hit the origin server, although a part of the page would be cacheable ( header and footer, for example )

Nope

ESI was built for that

http://www.w3.org/TR/esi-lang

and hey, Varnish is a reverse proxy implementing what you need of the ESI specification
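
To give an idea of what an ESI-aware reverse proxy does with a page, here is a toy sketch that resolves `<esi:include src="..."/>` tags. It is not how Varnish is implemented: `resolve_esi` and `fetch_fragment` are hypothetical names, and only the simplest form of the include tag is handled.

```python
import re

# Matches the simplest form of the tag: <esi:include src="..."/>
ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]+)"\s*/>')


def resolve_esi(page, fetch_fragment):
    """Replace every <esi:include/> tag with the fragment it points to.

    fetch_fragment is a hypothetical callable (url -> str); a real proxy would
    look the fragment up in its own cache and fall back to the origin server.
    """
    return ESI_INCLUDE.sub(lambda match: fetch_fragment(match.group(1)), page)


# Cached fragments (header and footer) around a dynamic body.
fragments = {
    "/header": "<header>PUG Rome</header>",
    "/footer": "<footer>bye</footer>",
}
page = '<esi:include src="/header"/><p>dynamic body</p><esi:include src="/footer" />'
assembled = resolve_esi(page, fragments.__getitem__)
```

Each fragment is an HTTP resource of its own, so the header and footer can carry a long `max-age` while the dynamic body is fetched from the origin on every request.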

take 2, pay for 1

So what is HTTP cache meant to solve?

Less work

because the hard work is delegated to the browser/proxy

http://www.flickr.com/photos/snakphotography/5004775320/sizes/o/in/photostream/

evolve

because cache is abstracted from the application

loose coupling

because caching is bound to the protocol, HTTP, not to your implementation ( Sf, RoR, Django )

This presentation is light on technical details

So here are a few resources for those who want documentation:

http://tomayko.com/writings/things-caches-do
http://www.slideshare.net/fabpot/caching-on-the-edge
http://www.odino.org/301/rest-better-http-cache
http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html

A deeper version of the same talk, about Sf2 cache internals and ESI

May 12-14, 2011 in Verona

Early bird ends on April 6th

Thanks!

Alessandro Nadalin
@_odino_
odino.org