Presentation done at JingJS, Nov 10, 2013.
Implementing a JavaScript Engine
Krystal Mok (@rednaxelafx)
2013-11-10
Implementing a (Modern, High Performance?) JavaScript Engine
About Me
• Programming language and virtual machine enthusiast
• Worked on the HotSpot JVM at Taobao and Oracle
• Also worked on a JavaScript engine project
• Twitter / Sina Weibo: @rednaxelafx
• Blog: English / Chinese
Agenda
• Know the Heritage
• JavaScript Engine Overview
• Implementation Strategies and Tradeoffs
• A bit about Nashorn
KNOW THE HERITAGE
The roots of JavaScript language and modern JavaScript engines
Heritage of the Language
[Diagram: languages that JavaScript inherits from]
• Scheme: function closure
• Self: prototype-based OO
• Java: C-like syntax, built-in objects
• …
Language Comparison
Self
• Prototype-based OO
• Multiple Prototype
• Dynamically Typed
• Dynamically Extend Objects
• Mirror-based Reflection
• Block (closure)
• Support Non-local Return
• (pass an error handler to methods that might need one)
JavaScript (ECMAScript 5)
• Prototype-based OO
• Single Prototype
• Dynamically Typed
• Dynamically Extend Objects
• Reflection
• First-class Function (closure)
• (no non-local return)
• Exception Handling
Heritage of the Language

    function MyPoint(x, y) {
      this.x = x;
      this.y = y;
    }

    MyPoint.prototype.distance = function (p) {
      var xd = this.x - p.x,
          yd = this.y - p.y;
      return Math.sqrt(xd*xd + yd*yd);
    };

    var p = new MyPoint(2013, 11);
Heritage of the Language

    traits myPoint = (|
      parent* = traits clonable.
      initX: newX Y: newY = (x: newX. y: newY)
      distance: p = (| xd. yd |
        xd: x - p x.
        yd: y - p y.
        (xd squared + yd squared) squareRooted
      ).
    |).

    myPoint = (|
      parent* = traits myPoint.
      x <- 0.
      y <- 0
    |).

    p: myPoint copy initX: 2013 Y: 11
Heritage of the Language
on Self 4.4 / Mac OS X 10.7.5
Heritage of the VM
Self VM (Self)
Strongtalk VM (Smalltalk)
HotSpot VM (Java)
V8 (JavaScript)
CLDC-HI (Java)
What’s in common?
• Lars Bak!
VM Comparison
Self VM (3rd Generation)
• Fast Object w/Hidden Class
• Tiered Compilation
  – OSR and deoptimization
  – support for full-speed debugging
• Type Feedback
  – Polymorphic Inline Caching
• Type Inference
• Method/Block Inlining
• Method Customization
• Generational Scavenging
  – Scavenging + Mark-Compact
V8 (with Crankshaft)
• Fast Object w/Hidden Class
• Tiered Compilation
  – OSR and deoptimization
  – support for full-speed debugging
• Type Feedback
  – Polymorphic Inline Caching
• Type Inference
• Function Inlining
• Generational Scavenging
  – Scavenging + Mark-Sweep/Mark-Compact
To Implement a High Performance JavaScript Engine
• Learn from Self VM as a basis!
Themes
• Pay-as-you-go / Lazy
• Take advantage of runtime information
  – Type feedback
• Take advantage of actual code stability
  – Try to behave as static as possible
JAVASCRIPT ENGINE OVERVIEW
Outline of the main components
Components of a JavaScript Engine
• Parser
• Runtime
• Execution Engine
• Garbage Collector (GC)
• Foreign Function Interface (FFI)
• Debugger and Diagnostics
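Components of a JavaScript Engine
[Diagram: Source Code → Parser → AST → Execution Engine. The Execution Engine works on Memory (Runtime Data Areas), which holds JavaScript Objects and the Call Stack and is managed by the GC. The FFI connects the engine to the host / external library.]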
Parser
• Parse source code into internal representation
• Usually generates AST

    var z = x + y

[Diagram: the resulting AST]
VarDecl: z
  BinaryArith: +
    x
    y
Runtime
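• Value Representation
• Object Model
• Built-in Objects
• Misc.
[Diagram: an object with fields x = 2013 and y = 11 and a __proto__ link. The built-in Object and Function are shown with their __proto__ and prototype links; each prototype object has its own __proto__ (null at the end of the chain) and a constructor link.]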
Execution Engine
• Execute JavaScript Code
[Diagram: the AST is turned into machine code]
VarDecl: z
  BinaryArith: +
    x
    y
→ addl %rcx, %rax
Garbage Collector
• Collect memory from unused objects
Foreign Function Interface
• Handle interaction between JavaScript and “the outside world”
• JavaScript calls out to native function
• Native function calls into JavaScript function, or accesses JavaScript object
Debugger and Diagnostics
IMPLEMENTATION STRATEGIES AND TRADEOFFS
Parser
• LR
• LL
• Recursive Descent
• Operator Precedence
• Lazy Parsing / Deferred Parsing
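To make two of these strategies concrete, here is a minimal sketch that combines recursive descent (for primary expressions) with operator precedence parsing (precedence climbing, for binary operators). It only handles statements of the form `var name = expression`, and the node names `VarDecl` and `BinaryArith` are borrowed from the AST slide; the function names and the `Number` / `Identifier` node types are assumptions made for this illustration, not taken from any particular engine.

```javascript
// Tokenizer: identifiers, integer literals, and single-character operators.
function tokenize(src) {
  return src.match(/[A-Za-z_$][\w$]*|\d+|[-+*\/()=]/g) || [];
}

// Binary operator precedence table (higher binds tighter).
const PRECEDENCE = new Map([["+", 1], ["-", 1], ["*", 2], ["/", 2]]);

function parseVarDecl(src) {
  const tokens = tokenize(src);
  let pos = 0;
  const peek = () => tokens[pos];
  const next = () => tokens[pos++];
  function expect(tok) {
    if (next() !== tok) throw new SyntaxError("expected " + tok);
  }

  // Recursive descent: one function per grammar rule.
  function parsePrimary() {
    const tok = next();
    if (tok === undefined) throw new SyntaxError("unexpected end of input");
    if (tok === "(") {
      const inner = parseExpr(1);
      expect(")");
      return inner;
    }
    if (/^\d+$/.test(tok)) return { type: "Number", value: Number(tok) };
    if (/^[A-Za-z_$]/.test(tok)) return { type: "Identifier", name: tok };
    throw new SyntaxError("unexpected token " + tok);
  }

  // Precedence climbing for left-associative binary operators.
  function parseExpr(minPrec) {
    let left = parsePrimary();
    while (PRECEDENCE.get(peek()) >= minPrec) {
      const op = next();
      const right = parseExpr(PRECEDENCE.get(op) + 1);
      left = { type: "BinaryArith", op, left, right };
    }
    return left;
  }

  expect("var");
  const name = next();
  expect("=");
  const init = parseExpr(1);
  if (pos !== tokens.length) throw new SyntaxError("trailing tokens");
  return { type: "VarDecl", name, init };
}
```

Calling `parseVarDecl("var z = x + y")` yields a `VarDecl` node for `z` whose initializer is a `BinaryArith` node with operator `+`. A production parser would also need statements, error recovery, and source positions.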
Value Representation
• Pointers, and all values allocated on heap
• Discriminated Union
• Tagged Value / Tagged Pointer
Value Representation
• Pointers, and all values allocated on heap

    typedef Object* JSValue;

[Diagram: a JSValue pointing to a heap object that holds Tag_Int and 2013]
Value Representation
• Discriminated Union

    class JSValue {
      ObjectType ot;
      union {
        double n;
        bool b;
        Object* o;
        // …
      } u;
    };

[Diagram: a JSValue holding the tag Tag_Int next to the value 2013]
Tagged
• Tagged Pointer
  – Non-zero tag on pointer
  – Favor small integer arithmetic
• Tagged Value
  – Non-zero tag on non-pointer
  – Favor pointer access
• NaN-boxing
  – use special NaN value as box

Tagged Pointer tag bits:
small integer 00
pointer 01
Tagged Value tag bits:
small integer 01
pointer 00
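The claim that a zero tag on small integers favors integer arithmetic can be checked with a small sketch. It models a 32-bit machine word with a JavaScript number and uses the two-bit tags from the Tagged Pointer layout (integer `00`, pointer `01`). All function names are hypothetical, and the "pointer" is just a 4-byte-aligned number standing in for an address.

```javascript
const TAG_BITS = 2;
const TAG_MASK = 0b11;
const TAG_INT = 0b00; // small integer
const TAG_PTR = 0b01; // pointer

function tagInt(n) {
  // Two bits go to the tag, so only 30 bits of payload remain.
  if (!Number.isInteger(n) || n < -(2 ** 29) || n >= 2 ** 29) {
    throw new RangeError("not a small integer");
  }
  return (n << TAG_BITS) | TAG_INT;
}

function untagInt(word) {
  return word >> TAG_BITS; // arithmetic shift keeps the sign
}

function isInt(word) {
  return (word & TAG_MASK) === TAG_INT;
}

function tagPtr(address) {
  // Assumes the address is 4-byte aligned, so its low two bits are free.
  return address | TAG_PTR;
}

function untagPtr(word) {
  return word & ~TAG_MASK;
}

// Because the integer tag is zero, tagged integers can be added directly:
// (a << 2) + (b << 2) === (a + b) << 2. No untagging is needed.
function addTagged(a, b) {
  if (!isInt(a) || !isInt(b)) throw new TypeError("slow path not shown");
  return (a + b) | 0; // overflow check omitted in this sketch
}
```

With the opposite assignment (integer `01`, pointer `00`), pointers could be dereferenced without masking, but each integer addition would have to correct for the tag bits.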
NaN-boxing (using a special QNaN value as the box):
00000000 pointer
xxxxxxxx double
11111111 00000000 integer
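The following sketch illustrates the general idea of NaN-boxing: every slot is 64 bits wide, ordinary doubles are stored as themselves, and other types hide inside the payload of a quiet NaN. The tag value `0x7ff90000` and the class name `BoxedSlots` are assumptions chosen for this illustration; they do not reproduce the exact bit layout on the slide or the encoding of any particular engine.

```javascript
// Hypothetical tag: a quiet NaN bit pattern with one extra payload bit set
// in the high 32-bit word. No finite or infinite double has this high word.
const QNAN_INT32_TAG = 0x7ff90000;
const CANONICAL_NAN_HIGH = 0x7ff80000;

class BoxedSlots {
  constructor(count) {
    // DataView reads and writes big-endian by default, so the first four
    // bytes of each slot are always the high word.
    this.view = new DataView(new ArrayBuffer(count * 8));
  }

  setDouble(i, d) {
    if (Number.isNaN(d)) {
      // Canonicalize real NaNs so they can never collide with a tag.
      this.view.setUint32(i * 8, CANONICAL_NAN_HIGH);
      this.view.setUint32(i * 8 + 4, 0);
    } else {
      this.view.setFloat64(i * 8, d);
    }
  }

  setInt32(i, n) {
    this.view.setUint32(i * 8, QNAN_INT32_TAG); // high word: the tag
    this.view.setInt32(i * 8 + 4, n);           // low word: the payload
  }

  get(i) {
    const high = this.view.getUint32(i * 8);
    if (high === QNAN_INT32_TAG) {
      return { type: "int32", value: this.view.getInt32(i * 8 + 4) };
    }
    return { type: "double", value: this.view.getFloat64(i * 8) };
  }
}
```

The design choice is that doubles need no decoding at all, at the cost of one comparison on the high word to recognize boxed non-double values.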
Value Representation in Self
Numeric Tower
• Internal Numeric Tower
• Smi -> HeapDouble
• int -> long -> double
• unboxed number
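As a rough illustration of an internal numeric tower, the sketch below keeps a result in an int32 representation when it fits and widens to a double otherwise. The `rep` field and the function names are hypothetical; a real engine would track the representation in the value's tag or in compiler type information rather than in a wrapper object.

```javascript
// True if v can be represented as an int32 (negative zero cannot).
function isInt32(v) {
  return (v | 0) === v && !Object.is(v, -0);
}

function addNumbers(a, b) {
  if (isInt32(a) && isInt32(b)) {
    const sum = a + b; // exact: two int32 values always sum exactly in a double
    if (isInt32(sum)) return { rep: "int32", value: sum };
    return { rep: "double", value: sum }; // overflow: widen
  }
  return { rep: "double", value: a + b };
}
```

For example, `addNumbers(2147483647, 1)` leaves the int32 range, so its result is reported with the `double` representation.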
Object Model
• Hash based
  – “Dictionary Mode”
• Hidden Class based
  – “Fast Object”
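A hidden class based object model can be sketched as follows. Each object stores its values in a flat slot array and points to a shared hidden class that maps property names to slot indices; adding a property moves the object along a transition to another hidden class. The class names `HiddenClass` and `FastObject` are assumptions for this illustration and do not correspond to the internals of a specific engine.

```javascript
class HiddenClass {
  constructor(offsets = new Map()) {
    this.offsets = offsets;       // property name -> slot index
    this.transitions = new Map(); // property name -> next HiddenClass
  }

  // Follow (or create) the transition for adding a property. Objects that
  // add the same properties in the same order share one hidden class.
  withProperty(name) {
    let next = this.transitions.get(name);
    if (!next) {
      const offsets = new Map(this.offsets);
      offsets.set(name, offsets.size);
      next = new HiddenClass(offsets);
      this.transitions.set(name, next);
    }
    return next;
  }
}

const ROOT = new HiddenClass();

class FastObject {
  constructor() {
    this.hiddenClass = ROOT;
    this.slots = [];
  }

  set(name, value) {
    let index = this.hiddenClass.offsets.get(name);
    if (index === undefined) {
      this.hiddenClass = this.hiddenClass.withProperty(name);
      index = this.hiddenClass.offsets.get(name);
    }
    this.slots[index] = value;
  }

  get(name) {
    const index = this.hiddenClass.offsets.get(name);
    return index === undefined ? undefined : this.slots[index];
  }
}
```

Two objects built by setting `x` and then `y` end up with the same `hiddenClass`, while an object that sets `y` first gets a different one. Compared with a per-object hash table, the name-to-slot mapping is stored once per shape rather than once per object.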
Object Model
Example: behind Groovy’s “object literal”-ish syntax

Groovy code:

    obj = [ x: 2013, y: 42 ];
    i = obj.x;

Equivalent Java code:

    obj = new LinkedHashMap(2);
    obj.put("x", 2013);
    obj.put("y", 42);
    i = obj.get("x");
[Diagram: memory layout of the resulting LinkedHashMap]
LinkedHashMap: keySet null, values null, table, size 2, threshold 1, loadFactor 0.75, modCount 2, entrySet null, header, accessOrder false
table: slots 0 and 1
Entry (header): key null, value null, next null, hash -1, before, after
Entry “x”: value 2013, next null, hash 127, before header, after y
Entry “y”: value 42, next null, hash 126, before x, after header
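Nashorn Object Model
[Diagram: a Nashorn script object and its property map]
Object fields: map, __proto__, context …, flags 0, spill null, arrayData (EMPTY_ARRAY), L0 x (2013), L1 y (42), L2 (unused), L3 (unused)
The __proto__ field points to another object with its own map and __proto__.
Property map:
Key   Getter     Setter
“x”   x getter   x setter
“y”   y getter   y setter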
Let’s ignore some fields for now
[Diagram: the simplified object]
Object fields: map, L0 x (2013), L1 y (42)
Property map:
Key   Getter     Setter
“x”   x getter   x setter
“y”   y getter   y setter
… and we’ll get this
[Diagram: object layout with boxed fields]
Object fields: metadata, x (reference to 2013), y (reference to 42)
Key   Offset
“x”   +12
“y”   +16
looks just like a Java object … with boxed fields

    class Point {
      Object x;
      Object y;
    }
[Diagram: object layout with unboxed fields]
Object fields: metadata, x 2013, y 42
Key   Offset
“x”   +12
“y”   +16

would be even better if …

    class Point {
      int x;
      int y;
    }
but Nashorn doesn’t go this far yet
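[Diagram: a Nashorn object with more properties than in-object fields]
Object fields: map, __proto__, context …, flags 0, spill, arrayData, L0 x (1), L1 y (2), L2 z (3), L3 a (4)
spill: b (5)
arrayData: index 0 → 6, index 1 → 7
Property map:
Key   Getter     Setter
“x”   x getter   x setter
“y”   y getter   y setter
“z”   z getter   z setter
“a”   a getter   a setter
“b”   b getter   b setter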
Inline Cache
• Facilitated by use of hidden class
• Improve property access efficiency
• Collect type information for type feedback
  – later fed to JIT compilers for better optimization
• Works with both interpreted and compiled code
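The sketch below shows the simplest form of this idea, a monomorphic inline cache for one property-load site. Objects are modeled as `{ shape, slots }`, where `shape` plays the role of the hidden class; the names `makePropertyLoadSite`, `pointShape`, and `point3dShape` are hypothetical. A polymorphic inline cache would remember several shapes per site instead of one.

```javascript
// Hypothetical shapes (hidden classes): property name -> slot index.
const pointShape = { offsets: new Map([["x", 0], ["y", 1]]) };
const point3dShape = { offsets: new Map([["z", 0], ["x", 1], ["y", 2]]) };

// One cache per property-access site in the program.
function makePropertyLoadSite(name) {
  let cachedShape = null;
  let cachedIndex = -1;
  const stats = { hits: 0, misses: 0 };

  function load(obj) {
    // Fast path: same shape as last time, so the slot index is known.
    if (obj.shape === cachedShape) {
      stats.hits++;
      return obj.slots[cachedIndex];
    }
    // Slow path: full lookup, then remember the shape for next time.
    stats.misses++;
    const index = obj.shape.offsets.get(name);
    if (index === undefined) return undefined;
    cachedShape = obj.shape;
    cachedIndex = index;
    return obj.slots[index];
  }

  return { load, stats };
}
```

The shapes recorded at each site are also the type information that an optimizing compiler could later consume as type feedback.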
String
• Flat string
• Rope / ConsString / ConcatString
• Substring / Span
• Symbol / Atom
• External String
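A rope (cons string) defers the cost of concatenation: joining two strings only allocates a small node, and the characters are copied once when a flat string is actually needed. The sketch below is a minimal version of that idea; the class name `ConsString` matches the term on the slide, but the fields and the `flatten` method are assumptions for illustration.

```javascript
class ConsString {
  constructor(left, right) {
    this.left = left;   // a string or another ConsString
    this.right = right;
    this.length = left.length + right.length;
    this.flat = null;   // filled in on first flatten
  }

  flatten() {
    if (this.flat === null) {
      const parts = [];
      const stack = [this]; // explicit stack avoids deep recursion
      while (stack.length > 0) {
        const node = stack.pop();
        if (typeof node === "string") {
          parts.push(node);
        } else if (node.flat !== null) {
          parts.push(node.flat);
        } else {
          stack.push(node.right, node.left); // left is popped first
        }
      }
      this.flat = parts.join("");
      this.left = this.right = null; // drop the tree so it can be collected
    }
    return this.flat;
  }
}

function concat(a, b) {
  return new ConsString(a, b);
}
```

Repeated `concat` calls in a loop therefore stay cheap, and the single `flatten` at the end does one pass over all the pieces.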
RegExp
• NFA
• Optimize to DFA where profitable
• Interpreted
• JIT Compiled
Call Stack
• Native or separate?
• Native
  – fast
  – easier transition between execution modes
  – harder to implement
• Separate (aka “stack-less”)
  – easy to implement
  – slow
  – overhead when transitioning between exec modes
Execution Engine
• Interpreter
• Compiler
  – Ahead-of-Time Compiler
  – Just-in-Time Compiler
  – Dynamic / Adaptive Compiler
• Mixed-mode
• Tiered
Execution Engine in Self
Interpreter
• Line Interpreter
• AST Interpreter
• Stack-based Bytecode Interpreter
• Register-based Bytecode Interpreter
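As an illustration of the stack-based bytecode variant, here is a minimal interpreter loop. The instruction set (`LOAD_VAR`, `PUSH_CONST`, `ADD`, `STORE_VAR`, `HALT`) is invented for this sketch, and the sample program is a hand-written encoding of `var z = x + y` rather than the output of a real compiler.

```javascript
// Hypothetical opcodes for a tiny stack machine.
const Op = { LOAD_VAR: 0, PUSH_CONST: 1, ADD: 2, STORE_VAR: 3, HALT: 4 };

function run(code, env) {
  const stack = []; // operand stack
  let pc = 0;       // index of the next instruction
  for (;;) {
    switch (code[pc++]) {
      case Op.LOAD_VAR:
        stack.push(env[code[pc++]]);
        break;
      case Op.PUSH_CONST:
        stack.push(code[pc++]);
        break;
      case Op.ADD: {
        const b = stack.pop();
        const a = stack.pop();
        stack.push(a + b);
        break;
      }
      case Op.STORE_VAR:
        env[code[pc++]] = stack.pop();
        break;
      case Op.HALT:
        return env;
      default:
        throw new Error("bad opcode at " + (pc - 1));
    }
  }
}

// var z = x + y
const program = [
  Op.LOAD_VAR, "x",
  Op.LOAD_VAR, "y",
  Op.ADD,
  Op.STORE_VAR, "z",
  Op.HALT,
];
```

Running `run(program, { x: 2013, y: 11 })` stores the sum in `z`. A register-based design would instead name source and destination registers in each instruction, trading larger instructions for fewer of them.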
Interpreter
• Written in
  – C/C++
  – Assembler
  – others?
Compiler Concurrency
• Foreground/Blocking Compilation
• Background Compilation
• Parallel Compilation
Baseline Compiler
• Fast compilation, little optimization
• Should generate type-stable code
Optimizing Compiler
• Type Feedback
• Type Inference
• Function Inlining
On-stack Replacement
Garbage Collection
• Reference Counting?
  – not really used by any mainstream impl
• Tracing GC
  – mark-sweep
  – mark-compact
  – copying
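The mark-sweep variant of tracing GC can be sketched over a toy heap. Objects here are plain records with a `refs` array, and the `Heap` class, its `roots` set, and the `allocate` / `collect` methods are all assumptions for this illustration. A real collector works on raw memory and finds roots by scanning the call stack and globals.

```javascript
class Heap {
  constructor() {
    this.objects = new Set(); // every allocated object
    this.roots = new Set();   // objects reachable by definition
  }

  allocate(name) {
    const obj = { name, refs: [], marked: false };
    this.objects.add(obj);
    return obj;
  }

  collect() {
    // Mark phase: trace everything reachable from the roots.
    const worklist = [...this.roots];
    while (worklist.length > 0) {
      const obj = worklist.pop();
      if (obj.marked) continue;
      obj.marked = true;
      worklist.push(...obj.refs);
    }
    // Sweep phase: free unmarked objects and clear marks on survivors.
    let freed = 0;
    for (const obj of this.objects) {
      if (obj.marked) {
        obj.marked = false;
      } else {
        this.objects.delete(obj);
        freed++;
      }
    }
    return freed;
  }
}
```

Because reachability is decided by tracing from the roots, two unreachable objects that reference each other are both freed, which is the case that plain reference counting cannot handle on its own.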
GC Advances
• Generational GC
• Incremental GC
• Concurrent GC
• Parallel GC
GC Concurrency
[Timeline diagrams: for each scheme, how the application thread alternates between running JavaScript and doing GC work]
• Mark-Sweep: mark, sweep
• Mark-Compact: mark, compact
• Scavenging: scavenge
• Incremental Mark: incremental mark, sweep
• Lazy Sweep: mark, lazy sweep
• Incremental Mark + Lazy Sweep: incremental mark, lazy sweep
• Generational: Scavenging + (Incremental Mark + Lazy Sweep): incremental mark and scavenge, lazy sweep
• (Mostly) Concurrent Mark-Sweep, with a separate GC thread next to the application thread: initial mark, concurrent mark, remark, concurrent sweep, reset
A BIT ABOUT NASHORN
A new high performance JavaScript on top of the JVM
What is Nashorn?
• Oracle’s ECMAScript 5.1 implementation, on the JVM
• Clean code base, 100% Java
  – started from scratch; no code from Rhino
• An OpenJDK project
• GPLv2 licensed
Overview
What is Nashorn?
Origins of the “Nashorn” name:
• the Rhino book
• Mozilla Rhino
• the unofficial Nashorn logo
• my impression
Dynamic Languages on the JVM
• Can easily get to a sports-car-ish level
• Takes some effort to get to a decent sports car level
• Hard to achieve extremely good performance
Nashorn Execution Model
[Diagram: compilation pipeline]
JavaScript Source Code
→ Parser (Compiler Frontend): Lexical Analysis, Syntax Analysis
→ AST
→ Compiler Backend: Constant Folding, Control-flow Lowering, Type Annotating, Range Analysis (*), Code Splitting, Type Hardening, Bytecode Generation
→ Java Bytecode
* Not complete yet
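To give a feel for one of the backend passes, here is a sketch of constant folding over a small AST. It is not Nashorn's code (Nashorn is written in Java and works on its own IR); the node shapes `BinaryArith`, `Number`, and `Identifier` and the function name `foldConstants` are assumptions for this illustration.

```javascript
// Replace arithmetic on two constant operands with its result, bottom-up.
function foldConstants(node) {
  if (node.type !== "BinaryArith") return node;
  const left = foldConstants(node.left);
  const right = foldConstants(node.right);
  if (left.type === "Number" && right.type === "Number") {
    switch (node.op) {
      case "+": return { type: "Number", value: left.value + right.value };
      case "-": return { type: "Number", value: left.value - right.value };
      case "*": return { type: "Number", value: left.value * right.value };
      case "/": return { type: "Number", value: left.value / right.value };
    }
  }
  // Not foldable: keep the node, but with folded children.
  return { type: node.type, op: node.op, left, right };
}
```

For an expression such as `(2 * 3) + x`, the pass rewrites the left operand to the constant `6` and leaves the addition in place because `x` is only known at run time.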