Protecting JavaScript source code using obfuscation Facts and Fiction Pedro Fortuna, Co-Founder and CTO AuditMark OWASP Europe Tour 2013 Lisbon - June 21st, 2013


Protecting JavaScript source code using obfuscation: Facts and Fiction

Pedro Fortuna, Co-Founder and CTO

AuditMark

OWASP Europe Tour 2013 Lisbon - June 21st, 2013

Outline

• Code Obfuscation concepts
• Code Obfuscation metrics
• Practical examples

PART 1 – OBFUSCATION CONCEPTS

WHAT IS CODE OBFUSCATION?

Code Obfuscation

Obfuscation “transforms a program into a form that is more difficult for an adversary to understand or change than the original code” [1], where more difficult means “requires more human time, more money, or more computing power to analyze than the original program.”

[1] Collberg, C., and Nagra, J., “Surreptitious Software: Obfuscation, Watermarking, and Tamperproofing for Software Protection”, Addison-Wesley Professional, 2010.

Code Obfuscation

• Lowers the code quality in terms of
– Readability
– Maintainability
• Delays program understanding
– Time required to reverse it > program useful lifetime
– Resources needed to reverse it > value obtained from reversing it
• Delays program modification
– Cost of reversing it > cost of developing it from scratch

Obfuscation != Encryption

[Diagram: inside the web application, executable JavaScript source code passes through an encryption algorithm (encryption key) and becomes non-executable encrypted code; a decryption algorithm (decryption key) must turn it back into executable JavaScript source code before the JS engine can run it.]

• Treating obfuscation as encryption is a common misconception
• Encrypted code is not executable by the browser or JS engine
• A decryption process is always needed

Obfuscation != Encryption

[Diagram: inside the web application, executable JavaScript source code passes through an obfuscation engine and comes out as executable JavaScript source code that the JS engine runs directly.]

• Obfuscated JavaScript code is still valid, ready-to-execute code
• It does not require a symmetric deobfuscation function

JavaScript Obfuscation Example #1

HTML5 Canvas example from mozilla.org

• Being JavaScript, this code is delivered to the browser as clear text, and as such it can be captured by anyone.

• This is the obfuscated version of the code.
• It can still be captured by anyone, but it is much harder to grasp and to change.

What is it good for?

Good
• Prevent code theft and reuse
– E.g. stop a competitor from using your code as a quick start to build its own
• Protect intellectual property
– Hide algorithms
– Hide data
– DRM (e.g. watermarks)
• Enforce license agreements
– E.g. domain-lock the code
• As an extra security layer
– Harder to find vulnerabilities in the client-side code
• Test the strength of security controls (IDS/IPS/WAFs/web filters)

Evil
• Test the strength of security controls (IDS/IPS/WAFs/web filters)
• Hide malicious code
• Make it look like harmless code

Why not rely on the Server?

• Question often raised: why not move security-sensitive code to the server and have JS request it whenever needed?
• Sometimes you can... and you should!
• But there are plenty of situations where you can’t:
– You may not have a server
  • Widgets
  • Mobile apps
  • Standalone, offline-playable games
  • Windows 8 apps made with WinJS
– You may not want to have a server
  • It may not be cost effective to do computations on a server (you have to guarantee 100% uptime and provide support teams)
  • Latency

PART 2 – OBFUSCATION METRICS

CODE OBFUSCATION METRICS

Measuring Obfuscation

• Potency
• Resilience
• Stealthiness
• Execution Cost
• Maintainability

Next:
• We’ll present each metric using simple examples
• This is intentional, to make the metrics easier to understand
• However, the examples do not show the full extent of what you can obtain by combining a large set of different obfuscation transformations

Measuring Obfuscation: Obfuscation Potency

Generates confusion

• Measure of confusion that a certain obfuscation adds
• Or “how much harder it gets to read and understand the new form when compared with the original”
• To the left you can see a simple example of a factorial function

Rename all + Comment removal

• On the right you see the result of renaming every symbol to a mix of lowercase and uppercase O’s. Function and variable names are very useful for understanding code. Not only have we lost that, but the new names are also easily confused with one another.
• Comments were also removed. They, too, are important for understanding a program.
• So we can definitely say that the obfuscation introduced a certain degree of confusion. It has added some potency.
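
The code images from the slides are not reproduced in this transcript. As a rough stand-in, the sketch below pairs a commented factorial function with a hand-written "rename all + comment removal" version. The names `factorial`, `OoO`, `oOo`, `Ooo` and `ooO` are illustrative assumptions, not the output of any particular tool.

```javascript
// Original: descriptive names and comments make the intent obvious.
function factorial(n) {
  // Multiply all integers from 2 up to n.
  let result = 1;
  for (let i = 2; i <= n; i++) {
    result *= i;
  }
  return result;
}

// After "rename all" + comment removal (hand-written illustration):
// same behaviour, but the names carry no meaning and are easy to mix up.
function OoO(oOo) {
  let Ooo = 1;
  for (let ooO = 2; ooO <= oOo; ooO++) {
    Ooo *= ooO;
  }
  return Ooo;
}

console.log(factorial(5), OoO(5));
```

Both functions compute the same values; only the information available to a human reader has changed.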

Whitespace removal

• Below, you can see the result of removing whitespace from the code. It becomes slightly more confusing, so we can say it is slightly more potent than the previous example.

Measuring Obfuscation: Obfuscation Resilience

Resistance to deobfuscation techniques, be it manual or automatic

• Represents the measure of the resistance that a certain obfuscation offers to deobfuscation techniques
• Or “how hard it is to undo the transformation and get back to the original form”
• To the left you can see the same example function as before

Rename all + Comment removal

• On the right you can see the result of applying the rename_all obfuscation.
• This is an example of an obfuscation that is 100% resilient: assuming you don’t have access to the original source code, it is impossible to tell what the original names were.
• The comment removal obfuscation is also 100% resilient, since you cannot know whether the original form had any comments, let alone recover them.

String splitting

• At the bottom, you see the result after applying string splitting.
• You can definitely see that it is more potent than the previous form, but if you look carefully, you can see that it’s not hard to revert to the previous form.
• So we can say that this version does not really add much resilience when compared with the previous form.
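
To make string splitting concrete, here is a minimal hand-written sketch. The functions `greet` and `greetSplit` and the literal they return are assumptions chosen for illustration; the slide applies the transformation to its own example code, which is not reproduced here.

```javascript
// Before string splitting: the literal is readable at a glance.
function greet() {
  return "Hello, world!";
}

// After string splitting (hand-written illustration): the literal is
// broken into fragments that are concatenated at run time.
function greetSplit() {
  return "He" + "ll" + "o, " + "wo" + "rl" + "d!";
}

console.log(greet() === greetSplit());
```

The fragments are all constants, which is why the transformation is easy to undo mechanically.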

Static Code Analysis for defeating obfuscation

One way of attacking obfuscation is using a Static Code Analyser

1. Parses the code
2. Transforms it to fulfill a purpose
– Usually to make it simpler => better performance
– Simpler code also serves the reverse-engineering purpose

• Example simplifications
– Constant propagation, constant folding
– Remove (some) dead code
• And most importantly, it is automatic!

Constant propagation:

x = 10;
y = 7 - x / 2;

becomes

x = 10;
y = 7 - 10 / 2;

Constant folding:

N = 12 + 4 - 2;

becomes

N = 14;
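
The same folding idea applies to split strings. The following is a deliberately simplified sketch that folds adjacent string literals with a regular expression; the function name `foldStringConcats` is hypothetical, and a real static analyser would operate on a parsed syntax tree rather than on raw text.

```javascript
// Toy constant folder: repeatedly rewrites "a" + "b" into "ab".
// Assumes double-quoted literals without escapes or embedded quotes.
function foldStringConcats(source) {
  const pair = /"([^"\\]*)"\s*\+\s*"([^"\\]*)"/;
  let previous;
  do {
    previous = source;
    // Without the g flag, only the first pair is folded per pass.
    source = source.replace(pair, '"$1$2"');
  } while (source !== previous);
  return source;
}

const split = 'return "He" + "ll" + "o, " + "wo" + "rl" + "d!";';
console.log(foldStringConcats(split));
```

Because no knowledge of the original program is needed, this kind of reversal can be fully automated, which is what makes low-resilience transformations weak on their own.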

• We used Google Closure Compiler, a Static Code Analyser, to simplify the code.
• The result is on the right; as you can see, it produced code that is much easier to read.

• If we compare the code on the right with the original code (on the left), we can see that they are not far apart.
• So the potency of the obfuscation is only apparent. The real potency, the one we should consider, is what remains after automated ways of reversing the code have been applied.
• This does not mean that the string-splitting obfuscation is useless. It has to be combined with other obfuscations that provide more resilience.

Dynamic Code Analysis for defeating obfuscation

• Another way of attacking obfuscation
• Analysis performed by executing the code
– Retrieves the control flow graph (CFG) of the code executed
– Retrieves values of some expressions
• How it can be used to defeat obfuscation
– Understand (one instance of) the program execution
  • You can focus on parts that you are sure are executed
– Retrieve values of some expressions
  • Aids code simplification
  • Find the needle in the haystack => e.g. retrieve an encryption key
– Bypasses dead code
– Not very good for automatic reversal of obfuscation
  • May not “see” all useful code
  • If you need to make sure the code will remain 100% functional, you cannot use this technique
– Gather knowledge for manual reverse engineering
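
As a small illustration of "retrieve values of some expressions", the sketch below wraps the functions of an object so that every call is logged with its arguments and return value. The helper `traceCalls` and the `program` object are hypothetical; real dynamic analysis tools instrument code at a much finer granularity.

```javascript
// Wraps every function-valued property of `target` so that each call
// is recorded with its arguments and return value.
function traceCalls(target, log) {
  for (const name of Object.keys(target)) {
    const original = target[name];
    if (typeof original !== "function") continue;
    target[name] = function (...args) {
      const result = original.apply(this, args);
      log.push({ name, args, result });
      return result;
    };
  }
  return target;
}

// Hypothetical program under analysis: the key is assembled at run time,
// so it never appears as a literal in the source.
const program = {
  buildKey() {
    return [115, 51, 99, 114, 51, 116].map((c) => String.fromCharCode(c)).join("");
  },
  run() {
    return this.buildKey().length;
  },
};

const log = [];
traceCalls(program, log).run();
console.log(log);
```

The log only covers the calls made during this one run, which mirrors the limitation noted above: code that is not executed is not seen.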

Measuring Obfuscation: Obfuscation Stealthiness

• How hard is it to spot?
– Or “how hard it is to spot the changes performed by the obfuscation”
– Or “how successful the obfuscation was in making the obfuscated targets look like other parts of the code”
• An obfuscation is more stealthy if it avoids common telltale indicators
– eval()
– unescape()
– Large blocks of meaningless text
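
A crude way to picture these telltale indicators is a scanner that counts them in source text. This is only a sketch: the function name `findTelltales` and the 200-character threshold for a "large block" are arbitrary assumptions, and text matching like this is easy to fool.

```javascript
// Counts a few telltale signs in a piece of JavaScript source text.
function findTelltales(source) {
  return {
    evalCalls: (source.match(/\beval\s*\(/g) || []).length,
    unescapeCalls: (source.match(/\bunescape\s*\(/g) || []).length,
    // Runs of 200+ characters without whitespace (threshold is an assumption).
    longOpaqueRuns: (source.match(/[^\s]{200,}/g) || []).length,
  };
}

const sample = 'eval(unescape("%61%6c%65%72%74%28%31%29"));';
console.log(findTelltales(sample));
```

A stealthy obfuscation would aim to keep all of these counts close to what ordinary, unobfuscated code produces.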

Measuring Obfuscation: Obfuscation Execution Cost

• Impact on performance
– Runs per second
– FPS (e.g. games)
– Obfuscation usually does not have a positive impact on performance, but it does not necessarily have a negative one. It depends on the mix of transformations chosen and on the nature of the original source code.
  • E.g. renaming symbols => same execution cost
• Impact on loading times
– Time before execution starts
– Usually a function of file size
– Obfuscation usually tends to grow the file size, but some obfuscation transformations make it smaller.
  • E.g. renaming symbols, again

Measuring Obfuscation: Obfuscation & Maintainability

Effect on maintainability = 1/potency (after static code analysis)

Lower maintainability => mitigates code theft and reuse

This is one of the most important concepts around obfuscation

PART 3 – JAVASCRIPT OBFUSCATION PRACTICAL EXAMPLES

Compression/Minification vs Obfuscation

• This first example aims to clarify one of the most common misconceptions around obfuscation: many people do not clearly understand the difference between compressing or minifying code and obfuscating it.
• This code is a portion of an MD5 function in JavaScript.

• This is a compressed version of it.
• It really seems to be more potent. No doubt about it.

• But look, it has an eval() in it. Not very stealthy.
• It is needed because the JavaScript has been encoded, and the result of decoding it must be evaluated at runtime.
• When encoding is used, there is always a decoding function somewhere.
• The real question is: is it resilient?

A simple trick will do it

eval(
(function(....))
);

becomes

document.write('<textarea>' +
(function(...)) +
'</textarea>');

• By replacing the eval() with a document.write (just one way to do it), you get access to the decoded source.
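
The same trick can be shown end to end on a tiny made-up packer. In this sketch, `decode` and `payload` are hypothetical stand-ins for the decoder function and encoded data that a compressor would emit; the point is only that whatever would be passed to eval() can be printed instead.

```javascript
// Hypothetical packed script: the payload is stored encoded and a small
// decoder rebuilds the source text that would be handed to eval().
const decode = (codes) => codes.map((c) => String.fromCharCode(c - 1)).join("");
const payload = [98, 109, 102, 115, 117, 41, 50, 42]; // encoded source text

// Packed form as shipped:   eval(decode(payload));
// Reverse-engineering step: print the decoded text instead of evaluating it.
const recovered = decode(payload);
console.log(recovered);
```

Outside a browser, console.log plays the role that document.write plays on the slide.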

[Figure panels: Reverse-engineered result; Original source]

• And that results in the code you see on the right. If you compare it with the original source code, you can see that it’s pretty much the same code.
• To many this isn’t surprising, but a lot of people use JavaScript compressors or minifiers with the intention of protecting their code.
• This is a perfect example of a code transformation that is very potent but has almost no resilience.
• Compressor/minifier tools do not aim at protecting the code. Their sole purpose is to make it smaller and faster.

Non alphanumeric Obfuscation

• First JavaScript version proposed by Yosuke Hasegawa (on sla.ckers.org, Jun 2009)
• Encoding method which uses strictly non-alphanumeric symbols
• Example: alert(1) (obfuscated version below)

How is that possible ?

• Using type coercion and browser quirks
• We can obtain alphanumeric characters indirectly

+[] -> 0
+!+[] -> 1
+!+[]+!+[] -> 2     Easy to get any number
+"1" -> 1           Type coercion to number
""+1 -> "1"         Type coercion to string

How to get letters?

+"a" -> NaN
+"a"+"" -> "NaN"
(+"a"+"")[0] -> "N"

Ok, but now without alphanumerics:

(+"a"+"")[+[]] -> "N"

How to get an "a" ?

![] -> false
![]+"" -> "false"
(![]+"")[1] -> "a"
(![]+"")[+!+[]] -> "a"

(+(![]+"")[+!+[]]+"")[+[]] -> "N"

eval( (![]+"")[+!+[]]+"lert(1)");
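
The coercion steps above can be checked directly in any JavaScript engine. The sketch below pairs each expression from the slide with the value it is expected to produce; the `checks` array is just scaffolding for this illustration.

```javascript
// Each entry pairs an expression from the slide with the value it yields.
const checks = [
  [+[], 0],
  [+!+[], 1],
  [+!+[] + !+[], 2],
  [+"1", 1],
  ["" + 1, "1"],
  [(+"a" + "")[+[]], "N"],
  [(![] + "")[+!+[]], "a"],
  [(+(![] + "")[+!+[]] + "")[+[]], "N"],
  [(![] + "")[+!+[]] + "lert(1)", "alert(1)"],
];

for (const [actual, expected] of checks) {
  console.log(JSON.stringify(actual), actual === expected);
}
```

The last entry builds the string "alert(1)" with its first letter obtained purely from coercion, which is the string the slide then passes to eval().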


Wait... What about the eval ?

• “eval” uses alphanumeric characters!
• eval() is not the only way to eval()!
• There are 4 or 5 more methods
• Examples
– Function("alert(1)")()
– Array.constructor("alert(1)")()
– []["sort"]["constructor"]("alert(1)")()
  • Subscript notation
  • Strings (we already know how to convert them)
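
These alternatives all reach the built-in Function constructor, which compiles a string into a function body. The sketch below uses a code string that returns a value instead of calling alert(), an assumption made so that it also runs outside a browser.

```javascript
// The code string returns a value so the sketch also runs outside a
// browser, where alert() is not defined.
const code = "return 6 * 7";

// Direct use of the Function constructor.
const viaFunction = Function(code)();
// Array is itself a function, so Array.constructor is Function.
const viaArrayConstructor = Array.constructor(code)();
// Subscript notation with strings only: [].sort is a function too.
const viaSubscript = []["sort"]["constructor"](code)();

console.log(viaFunction, viaArrayConstructor, viaSubscript);
```

The subscript form matters for non-alphanumeric encoding because property names given as strings can be assembled from coerced characters.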

Let me see that again!

Non alphanumeric Obfuscation

• 100% potent
• 0% stealthy (when you see it, you know someone is trying to hide something)
• High execution cost
– eval is a bit slower
– But worst of all, the file is much larger => slower loading times
• May not work in all browsers
• What about resilience?
– Unfortunately, not much (you can get a parser to simplify it back to the original source)
• Good for bypassing filters (e.g. WAFs)

Deadcode injection

[Figure panels: Original source code; Deadcode injection + Rename local]

Can you spot where the dead code is?

Deadcode injection

• Deadcode insertion is a natural way of adding confusion to source code, thus increasing the potency of the obfuscation.
• Being dead code, it isn’t really executed, so it has no impact on Execution Cost
• Would a Static Code Analyser remove this particular dead code?
• No, because it relies on opaque predicates
– Not removable using Static Code Analysis
– Predicates similar to ones found in the original source ( ++stealthiness )
• Randomly injected ( ++potent )
• Increases complexity of control flow ( ++potent )
• Dummy statements created out of the program’s own code ( ++potent & ++stealthiness )
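
To give a feel for an opaque predicate, here is a minimal hand-written sketch. The function `total` and the arithmetic predicate are assumptions for illustration; a predicate this simple could be resolved by a sufficiently capable analyser, and real obfuscators would use ones that are harder to reason about.

```javascript
// Hypothetical example of dead code guarded by an opaque predicate.
// i * (i + 1) is always even for an integer i, so the guarded branch never
// runs, but that is not evident without reasoning about the arithmetic.
function total(prices) {
  let sum = 0;
  for (let i = 0; i < prices.length; i++) {
    if ((i * (i + 1)) % 2 === 1) {
      // Dead code built from the function's own statements.
      sum -= prices[i];
      continue;
    }
    sum += prices[i];
  }
  return sum;
}

console.log(total([1, 2, 3, 4]));
```

Because the dummy statement reuses the function's own variables and operators, it blends in with the live code, which is the stealthiness point made above.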

All Together Now

HTML5 Canvas example from mozilla.org

• Up to now we have mostly seen no more than two or three obfuscation transformations working together.
• Let’s go back to the first example and see what happens when we mix a larger number of code obfuscation transformations together.

Transformations applied:

• remove comments
• dot notation
• rename local
• member enumeration
• literal hooking :low
• deadcode injection
• string splitting :high
• function reordering
• function outlining
• literal duplicates
• expiration date "2199-12-31 00:00:00"

• As you can see, you get a heavily obfuscated result.
• We intentionally didn’t use any encoding-based obfuscation in this example, to let you see the effect of these transformations together. Also, you are not seeing the whole code here.
• For the record, not all encoding transformations are easily reversed. We could, for instance, use a domain-lock encoding, which needs to get the correct information from the browser to decode properly.
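
The idea behind a domain-lock encoding can be sketched as follows. This is not how any particular product implements it: `xorWithHost`, the XOR scheme, and the host names are all assumptions, and the hostname is passed in as a parameter so that the sketch runs outside a browser (in a page it would typically come from location.hostname).

```javascript
// XORs each character code with a key derived from the hostname.
// Applying it twice with the same hostname restores the input.
function xorWithHost(text, hostname) {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    const k = hostname.charCodeAt(i % hostname.length);
    out += String.fromCharCode(text.charCodeAt(i) ^ k);
  }
  return out;
}

const licensedHost = "example.com"; // hypothetical licensed domain
const locked = xorWithHost("return 6 * 7", licensedHost);

// Decoding only succeeds when the environment reports the licensed host.
const onLicensedHost = xorWithHost(locked, "example.com");
const onOtherHost = xorWithHost(locked, "attacker.test");
console.log(onLicensedHost === "return 6 * 7", onOtherHost === "return 6 * 7");
```

The decoding key is never stored in the file; it has to be supplied by the environment, so simply printing what would be evaluated yields garbage on the wrong domain.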

After Closure Compiler

• And this is the result after running the code through Google Closure Compiler.
• It didn’t improve the readability much, because the obfuscation transformations offered a good degree of resilience.

Conclusion

• People often judge obfuscation based on its (apparent) potency
• It is resilience and the “real” potency that matter
– The potency that remains after applying automated tools to reverse it
• Evaluating resilience is not trivial
– Looking at simpler examples, it may be relatively easy to tell with the naked eye which of two obfuscations is more resilient
– But when comparing complicated obfuscated versions that use many code transformations, it’s not easy to say which version is more resilient
– This is a job for JavaScript obfuscators
  • They should offer not only potency, but also resilience
  • Make an effort to explain to their users what is best to protect their code
  • Avoid making available options that may reduce resilience

• Don’t forget execution cost
– And where the code is executed. A smartphone usually has fewer resources than a desktop computer. Obfuscation should be tuned to the platform where the code is executed.
• Obfuscation can be very effective as a way to prevent code theft and reuse, by
– Making it a real pain to understand the code
– Making it a real pain to change the code successfully
– Significantly lowering the value that an attacker can obtain from reversing the code


Contact Information

Pedro Fortuna

Owner & Co-Founder & CTO

[email protected]

Phone: +351 917331552

Porto - Headquarters: Edifício Central da UPTEC, Rua Alfredo Allen, 455, 4200-135 Porto, Portugal

Lisbon office: Startup Lisboa, Rua da Prata, 121 5A, 1100-415 Lisboa, Portugal