INTERPOLIQUE (OR, THE ONLY GOOD DEFENSE IS THROUGH A BRUTAL OFFENSE) Dan Kaminsky, Chief Scientist Recursion Ventures [email protected]


Page 1: Interpolique

INTERPOLIQUE (OR, THE ONLY GOOD DEFENSE IS THROUGH A BRUTAL OFFENSE)

Dan Kaminsky, Chief Scientist

Recursion Ventures

[email protected]

Page 2: Interpolique

ANNOUNCEMENT

This is my new company. Woot.
  Recursion productizes significant research
  It’s time to do things a little differently
This talk isn’t a sales pitch for Recursion, but it’s an idea regarding its philosophy

Page 3: Interpolique

A STORY

Design flaw in SSL
  The server thought it was resuming, the client thought it was connecting
Project Mogul spawned to fix it
  Several months in deep secrecy
  Thousands of hours spent on IETF fix
The fix broke <1% of servers
  No big deal, right?

Page 4: Interpolique

REALITY

“Note that to benefit from the fix for CVE-2009-3555 added in nss-3.12.6, Firefox 3.6 users will need to set their security.ssl.require_safe_negotiation preference to true. In Mandriva the default setting is false due to problems with some common sites.” – Mandriva Patch Notes
They thought knocking out a few sites was acceptable for a remediation
  They were wrong

Page 5: Interpolique

THE BAD NEWS

We give bad advice
  Pen testers are very good at breaking things
  Our “remediation” advice tends towards myopia
    We consider only our own engineering requirements
    We assume tools are static, and bash the craftsman

Page 6: Interpolique

THE GOOD NEWS

We are the keys to there actually being good advice
  We are the one community that actually knows how things break
  We hold the knowledge to end the bugs we keep seeing

Page 7: Interpolique

SESSION MANAGEMENT

Page 8: Interpolique

A SIMPLE QUESTION

When I log into two SSH servers, do I need to worry about one accessing the other?
  No
When I log into two web sites, do I need to worry about one accessing the other?
  Yes
Why?
  Because SSH does not have totally broken session management

Page 9: Interpolique

SIMPLE THINGS, SIMPLY BROKEN

The web was never designed to have authenticated resources
  Auth was bolted on (because Basic/Digest never got fixed)
Normal Mechanism For Managing Credentials
  Password causes Set-Cookie
  Cookie sent with each query to target domain
  Cookie is sent even with requests caused by third party domains
    User’s credentials are mixed with attacker’s URL
This is why most XSS/XSRF attacks are dangerous
  Cross Site Scripting and Cross Site Request Forgery wouldn’t be nearly the big deal they are if they didn’t work cross site

Page 10: Interpolique

THE PEN TESTER REACTION: DEV, DO MORE WORK

XSRF Tokens
  Manually add a token to every authenticated URL
  Requires touching everything in a web app that generates a URL
How’s that working out for us?
  This seems to be a lot of work
  If/when we come back six months later, it’s not usually done, is it?
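
To make the cost concrete, here is a rough sketch of the token approach in Python (not tied to any web framework). The names SECRET_KEY, make_token, add_token and token_is_valid are hypothetical, and deriving the token as an HMAC of the session ID is an assumption, one common choice among several.

```python
import hashlib
import hmac
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical server-side secret; a real deployment would load it from configuration.
SECRET_KEY = b"replace-with-a-random-server-secret"

def make_token(session_id: str) -> str:
    """Derive a per-session token that a third-party site cannot guess."""
    return hmac.new(SECRET_KEY, session_id.encode(), hashlib.sha256).hexdigest()

def add_token(url: str, session_id: str) -> str:
    """Append the token to one URL; every generated URL needs this call."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query) + [("xsrf", make_token(session_id))]
    return urlunsplit(parts._replace(query=urlencode(query)))

def token_is_valid(url: str, session_id: str) -> bool:
    """Check the token on an incoming request using a constant-time compare."""
    supplied = dict(parse_qsl(urlsplit(url).query)).get("xsrf", "")
    return hmac.compare_digest(supplied, make_token(session_id))
```

The check itself is small. The work is in calling add_token at every place the application emits an authenticated URL, which is the part that tends not to get done.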

Page 11: Interpolique

A MODEST PROPOSAL

Couldn’t the tools be better?
The big debate: Should SVGs animate?
Unsaid: Shouldn’t it be possible to easily log into a web site without other sites being able to use your creds?

Page 12: Interpolique

AN ATTEMPT

A fix that requires no change to the browser is better
  So I tried to find one
    Server Side Referrer Checking
    Client Side Referrer Checking
    Window.Name Checking
    Window.SessionStorage Checking
      It says SessionStorage! Surely it’s perfect for Session Management!
They all failed
  Thank you Cstone, Kuza55, Amit Klein, David Ross, SirDarckcat

Page 13: Interpolique

WHEN FAILURE IS SUCCESS: OUR PROBLEM WITH LATENCY

My suggested defenses were defeated early in development
We, as a community, have a latency problem
  We don’t break during development
  We don’t break at release
  We don’t break when early adopters are deploying
  We break only when it gets really popular
    By then, it’s in customer hands, and the best we can do is give the customers really expensive advice on how to fix it
We need to close the feedback loop

Page 14: Interpolique

AT MINIMUM

Whatever’s going on with other defenses, I want mine to be thoroughly, even brutally audited as soon as possible
  Life is too short to back broken code!

Session Management will require modifications to the browser

Something else might not…

Page 15: Interpolique

ON LANGUAGES

"The bottom-line is that there just isn't a large measurable difference in the security postures from language to language or framework to framework -- specifically Microsoft ASP Classic, Microsoft .NET, Java, Cold Fusion, PHP, and Perl. Sure in theory one might be significantly more secure than the others, but when deployed on the Web it's just not the case."

--Jeremiah Grossman, CTO, White Hat Security (a guy who has audited a lot of web applications)

Question: Why aren’t the type safe languages safer against web attack than the type unsafe languages?

Page 16: Interpolique

WE AREN’T ACTUALLY USING THEM

Reality of web development
  HTML and JavaScript and CSS and XML and SQL and PHP and C# and…
  “On the web, every time you sneeze, you’re writing in a new language”
How do we communicate across all these languages?
  Strings
And how type safe are strings?
  Not at all

Page 17: Interpolique

ALL INJECTIONS ARE TYPE BUGS

select count(*) from foo where x='x' or '1'='1';
The C#/PHP/Java/Ruby sender thinks there’s a string there.
The SQL receiver thinks there’s a string, a concatenator, another string, a comparator, and another string there.
The challenge: Maintaining type safety across language boundaries
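
The type confusion can be reproduced in a few lines. This is a sketch in Python against an in-memory SQLite database (an assumption made for runnability; the slides target PHP and MySQL), with a hypothetical three-row table foo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table foo (x text)")
conn.executemany("insert into foo values (?)", [("a",), ("b",), ("c",)])

user_input = "x' or '1'='1"  # the sender believes this is one string

# Concatenation: the receiver parses string, OR, string, comparator, string.
concatenated = "select count(*) from foo where x='" + user_input + "'"
injected_count = conn.execute(concatenated).fetchone()[0]

# Bound parameter: the value crosses the boundary still typed as a string.
bound_count = conn.execute(
    "select count(*) from foo where x=?", (user_input,)
).fetchone()[0]
```

With concatenation the condition is true for every row, so injected_count covers the whole table. With the bound parameter no row has that literal value, so bound_count is zero.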

Page 18: Interpolique

ISN’T THIS A SOLVED PROBLEM?

Escaping?
Parameterized Queries?

Page 19: Interpolique

NO ESCAPE

$conn->query("select * from foo where x=\"$foo\";");
Is this secure or not?
  Who knows, depends on whether $foo has been escaped between when it first came in on the wire, and when it’s being passed into the DB
  This simple line of code is expensive to debug!
If somebody removes the escape(), the code still works
  “Fails open”

Page 20: Interpolique

ACCIDENTAL ESCAPE

What does it mean to escape?
  “Block Evil Characters”
Was very easy to determine evil characters when we just had ASCII
  Only 256 possible bytes
Unicode changes that
  Millions of characters
  All of which could mutate (“best fit match”) into one another
  All of which have multiple possible encodings, and representations within encodings
Escaping works by accident, without a solid contract
  Keeps getting updated
  escape(), encodeURI(), encodeURIComponent()

Page 21: Interpolique

WHAT ABOUT PARAMETERIZED QUERIES?

Which would you rather write?

$r = $m->query("SELECT * from foo where fname='$fname' and lname='$lname' and address='$address' and city='$city'");

$p->prepare("SELECT * from foo where fname=? and lname=? and address=? and city=?");
$p->set(1, $fname);
$p->set(2, $lname);
$p->set(3, $address);
$p->set(4, $city);
$r = $m->queryPrepared($p);

Page 22: Interpolique

REALITY OF PARAMETERIZED QUERIES

No developer has ever written a parameterized query without a gun to his head
  We should know
  We hold the gun

Page 23: Interpolique

POSITIONAL GENERATION ISN’T ANY BETTER (C/O MIKE SAMUEL)

Page 24: Interpolique

O(N) UI WORK FAILS (BEST CASE EYE TRACKING)

Page 25: Interpolique

HOW INJECTIONS HAPPEN / HOW DEVS LIKE TO WRITE CODE

String Interpolation:
  select count(*) from foo where x='$_GET["foo"]';
String Concatenation:
  "select count(*) from foo where x=\"" + $_GET["foo"] + "\";";
Why they write code this way
  Devs are thinking inline
  They want to be writing inline
    See: Fitts’ Law

Page 26: Interpolique

IS IT POSSIBLE…

…to let devs write inline code, without exposing the resultant strings to injections?
  Yes – by making String Interpolation smarter
RETAIN: The language still sees the boundary between the environment (“select * from…”) and the variable ($_GET…).
TRANSLATE: Given that metadata, the language can do smarter things than just slap unprocessed strings together
(This overlaps with, and extends, Mike Samuel’s excellent “Secure String Interpolation” work, seen at http://tinyurl.com/2lbrdy.)
  Working with Mike
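
One way to picture RETAIN and TRANSLATE is as two small functions. This is a Python sketch only, not Interpolique's implementation: the ^^name marker follows the syntax used later in the deck, and retain, translate and the handler argument are hypothetical names.

```python
import re

# Marker syntax modeled on the ^^name form used later in the slides.
MARKER = re.compile(r"\^\^([A-Za-z_][A-Za-z0-9_]*)")

def retain(template: str):
    """Split a template into ("code", text) and ("data", name) segments."""
    segments, pos = [], 0
    for match in MARKER.finditer(template):
        if match.start() > pos:
            segments.append(("code", template[pos:match.start()]))
        segments.append(("data", match.group(1)))
        pos = match.end()
    if pos < len(template):
        segments.append(("code", template[pos:]))
    return segments

def translate(segments, variables, handler):
    """Rebuild the string, passing every data value through the handler."""
    return "".join(
        text if kind == "code" else handler(variables[text])
        for kind, text in segments
    )
```

Because the code/data boundary survives as a list of tagged segments, the handler can be anything that is safe for the receiving language: a placeholder, an encoder, or a DOM call.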

Page 27: Interpolique

INTERPOLIQUE DEMO [0]

Page 28: Interpolique

INTERPOLIQUE DEMO[1]

Page 29: Interpolique

INTERPOLIQUE DEMO[3]

Page 30: Interpolique

INTERPOLIQUE DEMO[4]

Page 31: Interpolique

INTERPOLIQUE DEMO[5]

Submit
  if($_POST['action'] == "add"){
    $conn->query(eval(b('insert into posts values(^^_POST[author] , ^^_POST[content] );')));
  }

Return
  $r = $conn->query("select * from posts");
  while($row = $r->fetch_assoc()) {
    echo eval(sb('data: ^^row[author] ^^row[content]<br>\n'));
  }

Page 32: Interpolique

WHAT’S GOING ON

Language interpolators are blind – they just push strings into strings
So we write custom interpolators – the dev puts in what he wants, the compiler sees what it needs

Page 33: Interpolique

WHAT TO INTERPOLATE INTO

Parameterized Queries are an obvious target
Programmer writes:
  select * from table where fname=^^fname and country=^^country and x=^^x;
Interpolique expands:
  $statement = $conn->prepare("select * from table where fname=? and country=? and x=?");
  $statement->bind_param("sss", $fname, $country, $x);
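
The expansion above can be sketched in Python with the standard sqlite3 module standing in for the PHP/MySQL stack on the slide. The to_parameterized helper and the people table are hypothetical names used only for illustration.

```python
import re
import sqlite3

MARKER = re.compile(r"\^\^([A-Za-z_][A-Za-z0-9_]*)")

def to_parameterized(template: str, variables: dict):
    """Turn ^^name markers into ? placeholders plus an ordered value tuple."""
    names = MARKER.findall(template)
    sql = MARKER.sub("?", template)
    return sql, tuple(variables[name] for name in names)

conn = sqlite3.connect(":memory:")
conn.execute("create table people (fname text, country text)")
conn.execute("insert into people values ('alice', 'US')")

sql, params = to_parameterized(
    "select count(*) from people where fname=^^fname and country=^^country",
    {"fname": "alice", "country": "US"},
)
matches = conn.execute(sql, params).fetchone()[0]
```

The developer still writes the query inline; the placeholder bookkeeping that makes parameterized queries unpleasant is generated from the markers.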

Page 34: Interpolique

COULD DO ESCAPES…

…but no faith they actually work correctly

Page 35: Interpolique

BASE64: ESCAPING DONE RIGHT

Programmer writes:
  select * from table where fname=^^fname and country=^^country and x=^^x;
Interpolique expands:
  select * from table where fname=b64d("VEhJUyBJUyBUSEUgU1RPUlkgQUxMIEFCT1VUIEhPVyBNWSBMSUZFIEdPVCBUVVJORUQgVVBTSURFIERPV04=") and country=b64d("d2Fzc3Nzc3Nzc3Nzc3Nzc3NzdXA=") and x=b64d("eXl5eXk=") ;

Page 36: Interpolique

WHY THIS WORKS

Type safe going into b64d() function
  That’s never getting interpreted as anything but a string
Type safe coming out of b64d() function
  b64d() cast to return a string
  Not a subquery, not a conditional, not anything other than a string
b64d() is a MySQL UDF that’s already written, has no apparent time penalty, will be released with Interpolique
  Most other databases already have B64 support
  In a pinch, could use MySQL hex/unhex
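
The Base64 handler can be tried out without MySQL. The sketch below is Python with SQLite, where a Python function registered through create_function stands in for the b64d() UDF; the expand helper is a hypothetical name. It emits single-quoted literals because SQLite may read double-quoted tokens as identifiers, which differs from the MySQL form on the slide.

```python
import base64
import re
import sqlite3

MARKER = re.compile(r"\^\^([A-Za-z_][A-Za-z0-9_]*)")

def b64d(encoded: str) -> str:
    """Stand-in for the b64d() UDF: always hands back a string value."""
    return base64.b64decode(encoded, validate=True).decode("utf-8")

def expand(template: str, variables: dict) -> str:
    """Replace each ^^name with b64d('<base64 of the value>')."""
    def wrap(match):
        raw = str(variables[match.group(1)]).encode("utf-8")
        return "b64d('" + base64.b64encode(raw).decode("ascii") + "')"
    return MARKER.sub(wrap, template)

conn = sqlite3.connect(":memory:")
conn.create_function("b64d", 1, b64d)
conn.execute("create table foo (x text)")
conn.executemany("insert into foo values (?)", [("a",), ("b",)])

query = expand("select count(*) from foo where x=^^x", {"x": "x' or '1'='1"})
count = conn.execute(query).fetchone()[0]
```

The Base64 alphabet contains no quote characters, so the encoded value cannot close the literal it sits in, and whatever b64d() returns is compared as a plain string.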

Page 37: Interpolique

TWO MODES OF BASE64

Late Binding
  Interpolation inserts the Base64 handler
  Text is plain until right before it crosses the frontend/backend layer
  SQL looks like this:
    select * from foo where x=^^foo;
Early Binding
  Base64 the variable as soon as it comes in off the HTTP request
  SQL looks like this:
    select * from foo where x=b64d($foo);
Pen testers: If somebody fails to escape $foo, everything still works. If somebody fails to Base64 encode $foo, everything breaks immediately
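
The "breaks immediately" property of early binding can be illustrated with a strict decoder. This is a Python sketch; early_bind, b64d and decodes_cleanly are hypothetical names, and the strict validate=True decoding is an assumption about how a b64d() implementation would behave.

```python
import base64
import binascii

def early_bind(raw: str) -> str:
    """Encode a request value the moment it arrives (early binding)."""
    return base64.b64encode(raw.encode("utf-8")).decode("ascii")

def b64d(encoded: str) -> str:
    """Strict decoder: anything outside the Base64 alphabet is an error."""
    return base64.b64decode(encoded, validate=True).decode("utf-8")

def decodes_cleanly(value: str) -> bool:
    """Report whether a value survives the strict decoder."""
    try:
        b64d(value)
        return True
    except (binascii.Error, UnicodeDecodeError):
        return False
```

A value that skipped early_bind and still contains quotes or spaces is rejected by the decoder instead of being passed through. A raw value that happens to consist only of Base64 characters would not trip the alphabet check, so treat this as a strong tripwire rather than a guarantee.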

Page 38: Interpolique

STATIC ANALYSIS

You know what’s better than having a static analyzer?

Page 39: Interpolique

AHEM

Not needing a static analyzer

Page 40: Interpolique

BASE64 IN THE OTHER DIRECTION

<span id=3520750 b64text="Zm9v">___</span><script>do_decode(3520750)</script>
Create a SPAN with a random ID and a dynamic attribute that contains its base64’d content
Call do_decode with that ID, which can now look up the element in O(1) time
Use this construction to retain streamability
  Thank/Blame CP for this
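
On the server side, emitting that construction takes only a few lines. This is a Python sketch with a hypothetical render_span helper; the ID range and the use of the secrets module are assumptions, and do_decode is the client-side function named on the slide.

```python
import base64
import secrets

def render_span(text: str, element_id=None) -> str:
    """Emit a placeholder SPAN carrying the text as Base64, plus its decode call."""
    if element_id is None:
        element_id = secrets.randbelow(10**7)  # random numeric ID, as on the slide
    encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")
    return (
        f'<span id={element_id} b64text="{encoded}">___</span>'
        f"<script>do_decode({element_id})</script>"
    )
```

Whatever the text contains, the markup sent to the browser holds only Base64 characters inside the attribute, so the HTML parser never sees attacker-controlled tags.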

Page 41: Interpolique

DOM INTERACTION: SIMPLE

Push to textContent
  ob = document.getElementById(id);
  ob.textContent = Base64.decode(ob.getAttribute("b64text"));
We never go through the browser HTML parser

Page 42: Interpolique

DOM INTERACTION: COMPLEX

Push to appropriate createElements
  ob = document.getElementById(id);
  raw = Base64.decode(ob.getAttribute("b64text"));
  safeParse(raw, ob);

  HTMLParser(src, {
    start: function( tag, attrs, unary ) { …
      if(tag == "i" || tag == "b" || tag == "img" || tag == "a"){
        el = document.createElement(tag); …
Basic idea is to have a simple HTML parser that extracts what it can, creates elements according to whitelisted rules, and importantly, never goes through the browser HTML parser
See also: “Blueprint”, a system that moves all DOM generation to JS
  http://www.cs.uic.edu/~venkat/research/papers/blueprint-oakland09.pdf

Page 43: Interpolique

IMPORTANT NOTE

Security Is Quantized
  There’s a set of elements that can be safely exposed
  There’s a set that can’t
  The game is to expose only those tags and attributes that don’t expand to arbitrary JS
Either you have prevented wishing for more wishes, or you have not
  (We see this from the webmail attack surface)

Page 44: Interpolique

HOW THIS WORKS

Primary Mechanism: Eval
  Yes, there’s risk here, and yes we’re going to talk about that risk – we need this for scoping reasons
Programmer written query:
  select * from table where fname=^^fname and country=^^country and x=^^x;
To Eval:
  return ("select * from table where fname=b64d(\"" . base64_encode($fname) . "\") and country=b64d(\"" . base64_encode($country) . "\") and x=b64d(\"" . base64_encode($x) . "\") ;");
Eval Out:
  select * from table where fname=b64d("VEhJUyBJUyBUSEUgU1RPUlkgQUxMIEFCT1VUIEhPVyBNWSBMSUZFIEdPVCBUVVJORUQgVVBTSURFIERPV04=") and country=b64d("d2Fzc3Nzc3Nzc3Nzc3Nzc3NzdXA=") and x=b64d("eXl5eXk=")

Page 45: Interpolique

CAN WE OPERATE WITHOUT EVAL?

No Eval in Java or C#
One approach: Combine variable argument functions with string subclass tagging
  public bwrap w = new bwrap();
  w.s(w.c("select * from foo where x="), argument1, w.c("and y="), argument2);
  If you forget to mark the safe code, it breaks
Another approach:
  w.code("select * from foo where x=").data(argument1).code("and y=").data(argument2).toString()
  Similar to LINQ etc. but actually works for arbitrary grammars
  If you mismark code as data, or vice versa, it breaks
Both actually implemented! (Tiny HOPE Announce)
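
The second, fluent approach might look like the following in Python. This is an illustrative sketch, not the bwrap implementation mentioned above: QueryBuilder and its method names are hypothetical, and routing data through a b64d() literal is assumed from the earlier Base64 slides.

```python
import base64

class QueryBuilder:
    """Hypothetical fluent builder: every fragment is tagged as code or data."""

    def __init__(self):
        self._parts = []

    def code(self, fragment: str) -> "QueryBuilder":
        # Trusted text written by the developer, emitted verbatim.
        self._parts.append(fragment)
        return self

    def data(self, value) -> "QueryBuilder":
        # Untrusted value, emitted only as a Base64 literal inside b64d().
        encoded = base64.b64encode(str(value).encode("utf-8")).decode("ascii")
        self._parts.append('b64d("' + encoded + '")')
        return self

    def to_string(self) -> str:
        return "".join(self._parts)
```

Usage mirrors the slide: QueryBuilder().code("select * from foo where x=").data(argument1).to_string(). Marking code as data turns SQL keywords into a string literal, so the query visibly breaks. This sketch does not detect the opposite mistake of passing untrusted input to code().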

Page 46: Interpolique

THE STATUS QUO

We see this doesn’t work:
  String s = "select * from foo where x = \"" + escape(s) + "\";";
By doesn’t work: It is too similar to this:
  String s = "select * from foo where x = \"" + s + "\";";
Devs mess this up, but the code works anyway
  As a matter of principle, devs will do enough work to make the code function
  If it works, it should work securely
  If it isn’t working securely, it shouldn’t be working at all
The trick is to not make it easier to get around the security, than it is to do things right

Page 47: Interpolique

WHY CUSTOM INTERPOLATORS ARE HARD: THE ANCIENT SCOPE WAR

Lexical Scope: Scope Known At Compile Time
  Variables are “pushed” into child scopes
Dynamic Scope: Scope Determined At Run Time
  Variables are “pulled” by child scopes
Lexical scope has won, and has systematically removed methods that allow any code to access variables not explicitly pushed in
  This makes it rather difficult to write a function that sees ^^variable and thus dereferences that variable
  There are silly “superclass” or “parent” modifiers in some languages, but they’re all special case
  In Java and C#, they went so far as to leave local variables unnamed on the stack, so you couldn’t just hop into previous stack frames and dereference from there!
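
Some runtimes still leave a door open. As an illustration of "pulling" variables from the caller, CPython exposes the calling frame's locals. This Python sketch uses hypothetical names (names_from_caller, handler) and relies on sys._getframe, a CPython-specific facility rather than a portable language feature.

```python
import re
import sys

MARKER = re.compile(r"\^\^([A-Za-z_][A-Za-z0-9_]*)")

def names_from_caller(template: str) -> dict:
    """Resolve each ^^name against the local variables of the calling function."""
    caller_locals = sys._getframe(1).f_locals  # CPython-specific frame access
    return {name: caller_locals[name] for name in MARKER.findall(template)}

def handler():
    # These locals are never passed in explicitly; the callee pulls them.
    fname = "alice"
    country = "US"
    return names_from_caller(
        "select * from t where fname=^^fname and country=^^country"
    )
```

Here handler() never pushes fname or country into the helper, yet the helper can dereference them. That is the capability a custom interpolator needs, and the one eval supplies in the PHP version.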

Page 48: Interpolique

TO BE CLEAR

Yes, there is risk to eval, and we’ll be talking about it
Yes, there are very nice and very good reasons for lexical scope to be the default state
The fact that the vast majority of programming languages, type safe or not, are repeatedly found to expose injection flaws is a direct sign that something is wrong
  Put simply, language design needs to be informed by the bloody findings of pen testers
    It is informed by performance engineers
    It is informed by usability engineers
    Memory safety didn’t come from security engineers, it came from reliability engineers
  I think we need a way to write functions that execute in present scope

Page 49: Interpolique

YES, THIS MEANS

(LISP) (WAS) (RIGHT) (((NOT ABOUT EVERYTHING))) (((THEY ( HAD A POINT ( HERE ))))

Crazy theory
  JavaScript has been successful because it’s been able to mutate to absorb almost any language construct
  “More dialects of JavaScript than Chinese”

Page 50: Interpolique

RISKS

There are three things that can go wrong with any defensive technology
  It doesn’t work
    None of this mealy mouthed, “well, it depends on what your threat model is”
    Either it does what it says, or it doesn’t!
  It doesn’t work in the field
    Security: It is too easy to screw up
  It has side effects
    Fails other first class engineering requirements (too slow, unstable, hard to deploy, etc)
I am looking for destructive analysis on these techniques, and will accept criticism on any of the above fronts
  Here is what I know so far

Page 51: Interpolique

THE HANDLERS APPEAR RELATIVELY SOLID

No known SQL Injection bypasses for Base64 into a b64d() function
  Using a fast base64 decode – could be flaws here
  Could be databases that don’t type-lock return values
No known flaws when putting arbitrary text into a span.textContent field
  Well, except it doesn’t work in IE
    Will port to its wonky DOM
  Most testing is in Firefox -- Could be problems in Chrome/Safari, Opera, etc.
No known flaws when creating arbitrary DOM elements and populating them, rather than pushing HTML
  IE6 is apparently slow at this
  Need to enumerate the full set of tags which are safe to put into HTML

Page 52: Interpolique

EVAL ADDS SOME RISK

Don’t buy that a PHP server is safer if it isn’t running eval
  Month of PHP Bugs = PHP not safe against any arbitrary PHP, eval or not
Eval in this context can make programmer errors more severe
  Correct: eval(b("select * from foo where x='^^x'"));
  Incorrect: eval(b("select * from foo where x = '$x';"));
  Before we had SQLi. Now we potentially have front end code execution!
  This is why it’s now ^^foo instead of $!foo

Page 53: Interpolique

MANAGING RISK OF EVAL

b() can be smarter
  It can be aware of strings that break out of string-returner
  It can be aware of SQL grammar, to the point that in order to write a right hand variable, it must be ^^’d
    Select * from foo where x=^^x and y=safe(1);
  It can even be self-auditing – in PHP, it can use debug_backtrace() to find the line that called it, and validate that that line doesn’t have an unsafe language deref
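
A very rough version of the grammar-aware check can be written with a regular expression. This Python sketch is a heuristic, not a SQL parser: check_template, is_accepted and the rule that right-hand values must start with ^^ or safe( are assumptions drawn from the example line above.

```python
import re

# Hypothetical rule: whatever follows a comparison must be ^^name or safe(...).
RIGHT_HAND = re.compile(r"=\s*([^\s;]+)")

def check_template(template: str) -> str:
    """Reject templates whose right-hand values are not explicitly marked."""
    for value in RIGHT_HAND.findall(template):
        if not (value.startswith("^^") or value.startswith("safe(")):
            raise ValueError("unmarked right-hand value: " + value)
    return template

def is_accepted(template: str) -> bool:
    """Convenience wrapper that reports acceptance instead of raising."""
    try:
        check_template(template)
        return True
    except ValueError:
        return False
```

A template in which the host language has already interpolated a value arrives with a raw literal on the right-hand side and is refused before it reaches eval or the database.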

Page 54: Interpolique

WHAT ONLY SORT OF WORKS

“Requiring” Single Quotes
  In some languages, '$foo' doesn’t interpolate, while "$foo" does
  So, the thinking is, require eval(b('$foo'))
  This is a policy that cannot be enforced by present compilers or languages (both '$foo' and "$foo" turn into a string in the parse tree)
    Could be enforced by a preprocessor
    At large shops, significant improvements in security are won by blocking otherwise legal expressions as a coding policy
    Not convinced that smaller shops can/should absorb

Page 55: Interpolique

PERFORMANCE

Eval is slower than compiled code
Translating strings could be a major pain point in some languages
  Easy to cache the translation (because we retain the boundary, accessing the normalized query form is trivial)
  Could potentially parameterize/accelerate more, because it’s suddenly easy for the framework to autorecognize repeated queries
Base64 is fast
  Slight bandwidth increase, but nothing compared to URLEncoding
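
Caching the translation is straightforward because the template itself contains no user data. This is a Python sketch with hypothetical names (normalize, bind); functools.lru_cache stands in for whatever cache a real framework would use.

```python
import re
from functools import lru_cache

MARKER = re.compile(r"\^\^([A-Za-z_][A-Za-z0-9_]*)")

@lru_cache(maxsize=1024)
def normalize(template: str):
    """Translate a template once; the result holds no user data, so it is cacheable."""
    return MARKER.sub("?", template), tuple(MARKER.findall(template))

def bind(template: str, variables: dict):
    """Pair the cached normalized query with this request's values."""
    sql, names = normalize(template)
    return sql, tuple(variables[name] for name in names)
```

Repeated requests that use the same template hit the cache and pay only for the value lookup. The normalized form is also what a framework could key on to recognize repeated queries.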

Page 56: Interpolique

ANYTHING ELSE?

I don’t know.
  Hope: There’s about two months till Black Hat. Let’s find out!
This isn’t a recommendation yet
  Clearly what we are doing right now is not working
  Let’s find out the best things we can do with the present languages
  Let’s find out what we’d want from future languages
It’s time we got involved in the discussion of what software looks like