LCU14 100-dalvik is dead long live dalvik


---------------------------------------------------
Speaker: Stuart Monteith
Date: September 15, 2014
---------------------------------------------------
★ Session Summary ★
The Dalvik virtual machine is the crucial part of Android responsible for executing platform independent code in Android apps. The upcoming L release of Android replaces "Dalvik" with a new implementation of the Dalvik virtual machine called the "Android RunTime" (ART). In this session you can learn about ART, Dalvik compatibility, and our experiences assisting with the 64-bit porting efforts on AOSP.
---------------------------------------------------
★ Resources ★
Zerista: http://lcu14.zerista.com/event/member/137702
Google Event: https://plus.google.com/u/0/events/c8qejrla7b95okb56apbn89d770
Video: https://www.youtube.com/watch?v=dR0FbB4uJC0&list=UUIVqQKxCyQLJS6xvSmfndLA
Etherpad: http://pad.linaro.org/p/lcu14-100
---------------------------------------------------
★ Event Details ★
Linaro Connect USA - #LCU14
September 15-19th, 2014
Hyatt Regency San Francisco Airport
---------------------------------------------------
http://www.linaro.org
http://connect.linaro.org


Dalvik is Dead, Long Live Dalvik! LCU14-September 2014

Stuart Monteith Systems & Software

Outline

▪ What is Dalvik™?
▪ Porting Dalvik onto AArch64
▪ ART
▪ Working on AOSP (Android™ Open Source Project)
▪ Q&A

What is Dalvik™?


Android™ & Dalvik™

[Figure: Android software stack, with layers labelled Applications, Application Framework, Libraries (including Bionic and SSL), Android™ Runtime (Core Libraries, Dalvik VM) and Kernel.]

What is Dalvik™?

▪ Dalvik is a virtual machine
▪ A managed runtime
▪ Interpreter
▪ Executes Java™ class files translated into Dalvik “dex” bytecode
▪ Exception handling
▪ Object oriented
▪ References rather than pointers
▪ Garbage collection
▪ Concurrency
▪ Platform independence

[Figure: blocks labelled Interpreter, Heap, Native Code, OS and Bytecode.]

Compiling for Dalvik

[Figure: build pipeline. Development: *.java → javac → *.class → dx → *.dex, packaged in the *.apk. Installation: dexopt produces the *.odex in /data/dalvik-cache/.]

Devices

[Chart: CPU, RAM and Pixels relative to a phone in 2008 (axis 0 to 30), for Phone '08, Phone '14, Tablet '10 and Tablet '14.]

Dalvik Evolution

▪ Just-In-Time (JIT) Compiler - Android™ 2.2 “Froyo”, May 2010
▪ Concurrent garbage collection - Android 2.3 “Gingerbread”, December 2010
▪ SMP (Symmetric MultiProcessing) support - Android 3.0 “Honeycomb”, February 2011

Tracing JIT

▪ Just-In-Time Compiler
  ▪ Compiles code at runtime
  ▪ Only code that is executed is compiled
  ▪ Only “Hot code” is compiled
▪ Interpreter executes bytecode instruction by instruction
  ▪ Profiles code
  ▪ Sends linear sequences of code to JIT
  ▪ Native code branched to from interpreter
▪ However…
  ▪ Code produced is not ideal
▪ Still, 5x faster than interpreter alone
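The profiling step described on the Tracing JIT slide can be sketched roughly as follows. This is a minimal illustration, not Dalvik's actual code: the class name `HotTraceProfiler`, the method `onTraceHead` and the threshold value are all assumptions made for the example, and "compiling" is reduced to recording a placeholder string.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of hot-trace detection in a tracing JIT; not Dalvik's real implementation. */
final class HotTraceProfiler {
    // Assumed threshold; the slides do not state the value Dalvik uses.
    static final int HOT_THRESHOLD = 3;

    private final Map<Integer, Integer> counters = new HashMap<>();
    private final Map<Integer, String> compiledTraces = new HashMap<>();

    /**
     * Called by the interpreter each time it reaches a potential trace head
     * (for example a branch target). Returns true once native code exists
     * for that address, so the interpreter can branch to it.
     */
    boolean onTraceHead(int bytecodeAddress) {
        if (compiledTraces.containsKey(bytecodeAddress)) {
            return true; // already compiled: run the native code
        }
        int count = counters.merge(bytecodeAddress, 1, Integer::sum);
        if (count >= HOT_THRESHOLD) {
            // Stand-in for sending the linear sequence of bytecode to the compiler.
            compiledTraces.put(bytecodeAddress, "native code for trace at " + bytecodeAddress);
        }
        return false; // this visit is still interpreted
    }

    int compiledTraceCount() {
        return compiledTraces.size();
    }

    public static void main(String[] args) {
        HotTraceProfiler profiler = new HotTraceProfiler();
        for (int visit = 0; visit < 5; visit++) {
            System.out.println("visit " + visit + " runs native code: " + profiler.onTraceHead(0x07));
        }
    }
}
```

Code that is never reached never gets a counter, and code reached only once or twice stays interpreted, which matches the slide's point that only executed, hot code is compiled.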

Traces

    0000: const-wide/16 v0, #int 0 // #0
    0002: const/4 v2, #int 0 // #0
    0003: move v6, v2
    0004: move-wide v2, v0
    0005: if-ge v6, v13, 001d // +0018
    0007: mul-double v4, v2, v2
    0009: mul-double v7, v0, v0
    000b: sub-double/2addr v4, v7
    000c: add-double/2addr v4, v9
    000d: const-wide/high16 v7, #long 4611686018427387904 // #4000
    000f: mul-double/2addr v2, v7
    0010: mul-double/2addr v0, v2
    0011: add-double/2addr v0, v11
    0012: mul-double v2, v4, v4
    0014: mul-double v7, v0, v0
    0016: add-double/2addr v2, v7
    0017: const-wide/high16 v7, #long 4616189618054758400 // #4010
    0019: cmpl-double v2, v2, v7
    001b: if-lez v2, 0026 // +000b
    001d: sget v0, LMandle;.threshold:I // field@0001
    001f: int-to-double v0, v0
    0020: int-to-double v2, v13
    0021: div-double/2addr v0, v2
    0022: int-to-double v2, v6
    0023: mul-double/2addr v0, v2
    0024: double-to-int v0, v0
    0025: return v0
    0026: add-int/lit8 v2, v6, #int 1 // #01
    0028: move v6, v2
    0029: move-wide v2, v4
    002a: goto 0005 // -0025
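For readers less used to dex disassembly, the listing above reads like a Mandelbrot-style escape loop. The Java below is a hypothetical reconstruction that is consistent with the bytecode; the slides show only the compiled form, so the method name `iterate`, the parameter and variable names, and the initial value of `threshold` are assumptions. Only the class name `Mandle` and the static int field `threshold` come from the listing.

```java
/** Hypothetical Java source consistent with the dex listing; the original source is not in the slides. */
final class Mandle {
    // Field read by "sget" at 001d. The value 256 is an assumption for the demo.
    static int threshold = 256;

    // Invented parameter names for the incoming registers v9, v11 and v13.
    static int iterate(double cx, double cy, int maxIterations) {
        double x = 0.0;
        double y = 0.0;
        int i = 0;
        while (i < maxIterations) {                // 0005: if-ge v6, v13, 001d
            double nextX = x * x - y * y + cx;     // 0007..000c
            y = y * (x * 2.0) + cy;                // 000d..0011 (constant #4000 is 2.0)
            if (nextX * nextX + y * y > 4.0) {     // 0012..001b (constant #4010 is 4.0)
                break;                             // falls through to 001d
            }
            i++;                                   // 0026..0028
            x = nextX;                             // 0029
        }
        // 001d..0025: scale the iteration count by threshold / maxIterations.
        return (int) (((double) threshold / (double) maxIterations) * (double) i);
    }

    public static void main(String[] args) {
        System.out.println(iterate(0.0, 0.0, 128)); // point that never escapes
        System.out.println(iterate(2.0, 2.0, 128)); // point that escapes on the first pass
    }
}
```

The loop body between 0007 and 001b, plus the back edge at 0026 to 002a, is the part that executes repeatedly, which is why it is the natural candidate for a compiled trace.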

Traces

    0000: const-wide/16 v0, 0
    0002: const/4 v2, #int 0
    0003: move v6, v2
    0004: move-wide v2, v0
    0005: if-ge v6, v13, 001d

Traces

    0007: mul-double v4, v2, v2
    0009: mul-double v7, v0, v0
    000b: sub-double/2addr v4, v7
     : : : : : :
    0017: const-wide/high16 v7, #long
    0019: cmpl-double v2, v2, v7
    001b: if-lez v2, 0026 // +000b

    0026: add-int/lit8 v2, v6, #int 1 // #01
    0028: move v6, v2
    0029: move-wide v2, v4
    002a: goto 0005 // -0025
    0005: if-ge v6, v13, 001d


Traces

[Diagram: trace T0 runs from 07 to 1b.]

Traces

[Diagram: trace T0 runs from 07 to 1b and trace T1 from 26 to 05; the label 1d appears twice alongside them.]

Traces

[Diagram: trace T0 runs from 07 to 1b, T1 from 26 to 05, T2 from 1d to 25 and T3 from 00 to 05.]

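Garbage Collection

[Diagram: thread stack and heap objects A, B, C, D, E and F.]

Garbage Collection (Mark)

[Diagram: thread stack and heap objects A, B, C, D, E and F.]

Garbage Collection (Sweep)

[Diagram: thread stack and heap objects A, B, C, D, E and F, with two objects crossed out (X).]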
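The mark and sweep phases pictured on these slides can be sketched in a few lines. This is a minimal illustration under stated assumptions: the slides do not say which of the objects A to F reference each other, so the graph below (the stack holds A and B, A points to C, B points to D, E points to F) is invented, and the class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Minimal mark-and-sweep sketch; the object graph is invented for illustration. */
final class MarkSweepDemo {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    /** Mark phase: collect everything reachable from the roots (the thread stack). */
    static Set<Obj> mark(List<Obj> roots) {
        Set<Obj> marked = new HashSet<>();
        Deque<Obj> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            Obj o = work.pop();
            if (marked.add(o)) {      // first visit: follow its references
                work.addAll(o.refs);
            }
        }
        return marked;
    }

    /** Sweep phase: remove unmarked objects from the heap and return their names. */
    static List<String> sweep(List<Obj> heap, Set<Obj> marked) {
        List<String> freed = new ArrayList<>();
        heap.removeIf(o -> {
            if (marked.contains(o)) {
                return false;
            }
            freed.add(o.name);
            return true;
        });
        return freed;
    }

    /** Builds the assumed heap A..F, where nothing on the stack reaches E or F, then collects. */
    static List<String> collectExample() {
        List<Obj> heap = new ArrayList<>();
        for (String n : new String[] {"A", "B", "C", "D", "E", "F"}) {
            heap.add(new Obj(n));
        }
        heap.get(0).refs.add(heap.get(2)); // A -> C
        heap.get(1).refs.add(heap.get(3)); // B -> D
        heap.get(4).refs.add(heap.get(5)); // E -> F, but no root reaches E
        List<Obj> roots = List.of(heap.get(0), heap.get(1)); // the stack holds A and B
        return sweep(heap, mark(roots));
    }

    public static void main(String[] args) {
        System.out.println("freed: " + collectExample());
    }
}
```

Note that F is freed even though E still points to it: reachability is measured from the roots, not from other heap objects.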

Porting Dalvik™ onto AArch64


ARM’s AArch64 Porting effort

▪ Model, kernel, bionic and shell below
  ▪ LCU14-411 From zero to booting Nano-Android
▪ Not just a recompile!
▪ Dalvik™ VM implementation
  ▪ Portable C interpreter, garbage collection, class loading, JNI
  ▪ Compressed references
▪ Java™ core libraries - platform/libcore:
  ▪ java.* classes
  ▪ int always 32-bit, long always 64-bit
  ▪ Java: int pointer; ➤ long pointer;
  ▪ C: jint pointer; ➤ jlong pointer;
  ▪ pointer = (jlong)(void*) nativeStructure;
▪ Build system
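The reason the core libraries had to widen their native handles from int to long can be shown with plain arithmetic. The sketch below is illustrative only: the class and method names are hypothetical and the two addresses are made-up values, chosen so that one fits in 32 bits and one does not.

```java
/** Sketch of why a 32-bit Java field cannot carry a 64-bit native pointer. */
final class NativeHandleWidth {
    /** What an int field keeps of a native address: the low 32 bits only. */
    static int storeInInt(long nativeAddress) {
        return (int) nativeAddress;
    }

    /** Reads the int back as an address, zero-extended to 64 bits. */
    static long loadFromInt(int field) {
        return field & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        long addr32 = 0x0000_0000_7654_3210L; // hypothetical address on a 32-bit system
        long addr64 = 0x0000_007F_7654_3210L; // hypothetical address on AArch64
        System.out.println("32-bit address survives: " + (loadFromInt(storeInInt(addr32)) == addr32));
        System.out.println("64-bit address survives: " + (loadFromInt(storeInInt(addr64)) == addr64));
    }
}
```

Because Java's int is 32-bit on every platform, code that stored a pointer in an int worked on 32-bit ARM and silently lost the upper bits on AArch64, which is why a recompile alone was not enough.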

ARM’s AArch64 Porting effort (2)

▪ Then:
  ▪ AArch64 assembler interpreter
  ▪ Slightly before with VIXL
  ▪ VIXL - library for simulating, assembling and disassembling ARMv8 A64 instructions.
  ▪ Just-In-Time compiler
  ▪ The rest of the Android™ libraries
▪ End result - Android with only 64-bit binaries
  ▪ Initially 64-bit Dalvik™ running on host in November 2012
  ▪ On ARM’s ARMv8 models on command line in February 2013
  ▪ Then AOSP on models from July 2013

Dalvik is Dead, Long Live ART!

▪ After porting AOSP to AArch64 - ART came along in October 2013
▪ Not all was lost:
  ▪ Able to boot AOSP from July 2013 (4.2/4.3) through to 4.4 on Dalvik™ for AArch64
  ▪ Demonstrated AOSP on Juno 2014

ART - Android™ Runtime


ART

▪ Introduced October 2013 as experimental runtime in Android™ 4.4 “KitKat”
▪ First release in Android “L”
▪ Less lag, more performance
▪ Productised through 2014
▪ ART also introduces 64-bit support into Android
  ▪ ARM contributed compiler backend components from Dalvik for AArch64
  ▪ + JNI compiler, glue, fixes, performance features/tweaks

Unchanged

▪ Dalvik™ Virtual Machine
  ▪ Java™ applications as before
  ▪ Garbage collection, class loading, object references, all as before
  ▪ It is not translating programs into C/C++
▪ Native code works as before (Java Native Interface - JNI)
  ▪ Eclipse + ADT or Android Studio, NDK are essentially unchanged
▪ Targeting the same platform - Android
▪ Debugging
▪ dalvikvm
▪ Zygote
  ▪ app_process - Android’s command for starting VMs.

Changed

▪ Garbage collection + allocation
  ▪ Parallel, fewer pauses
▪ C++ Interpreter
▪ Ahead-of-time compilation (AOT)
▪ Support for 64-bit execution
▪ Diagnostics
▪ Stricter JNI
▪ Stricter bytecode verification

Initialization

▪ No JITing during startup
  ▪ Compilation time spent at installation
▪ boot.art:
  ▪ Part of the heap stored on flash
  ▪ Built as part of firmware image
  ▪ Pre-initialized VM heap
  ▪ ~12 MB on AOSP
▪ Zygote as before
  ▪ Initialize
  ▪ Wait for binder request to fork new apps

Threads

▪ Stacks now unified - each thread has one stack
  ▪ Original Dalvik™ implementation had separately allocated VM stack
  ▪ ART has VM, interpreted, compiler and JNI frames all on same stack
  ▪ Stack characteristics may be different
▪ Stack overflow + null pointer exceptions detected through fault handlers
  ▪ Trap and handle
▪ Thread local allocation
  ▪ Threads can allocate objects without getting global heap lock
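The thread-local allocation bullet on the Threads slide can be illustrated with a bump-pointer sketch. This is not ART's allocator: the class name `TlabHeap`, the 1024-byte buffer size and the offset-based "heap" are all assumptions, chosen only to show that the global lock is taken when a thread's buffer runs out rather than on every allocation.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical bump-pointer sketch of thread-local allocation; not ART's real allocator. */
final class TlabHeap {
    static final int TLAB_SIZE = 1024; // assumed per-thread buffer size in bytes

    /** Per-thread buffer: offsets in [top, end) belong to the owning thread. */
    private static final class Tlab {
        long top;
        long end;
    }

    private final ReentrantLock heapLock = new ReentrantLock();
    private final ThreadLocal<Tlab> tlab = ThreadLocal.withInitial(Tlab::new);
    private long nextFree = 0;        // next unassigned heap offset, guarded by heapLock
    private int lockAcquisitions = 0; // guarded by heapLock

    /** Returns the heap offset of a new object of the given size. */
    long allocate(int size) {
        if (size <= 0 || size > TLAB_SIZE) {
            throw new IllegalArgumentException("unsupported size: " + size);
        }
        Tlab t = tlab.get();
        if (t.top + size > t.end) {
            refill(t); // slow path: needs the global heap lock
        }
        long offset = t.top;
        t.top += size; // fast path: a plain pointer bump, no lock
        return offset;
    }

    /** Hands the thread a fresh buffer; any unused tail of the old one is simply abandoned. */
    private void refill(Tlab t) {
        heapLock.lock();
        try {
            t.top = nextFree;
            t.end = nextFree + TLAB_SIZE;
            nextFree = t.end;
            lockAcquisitions++;
        } finally {
            heapLock.unlock();
        }
    }

    int lockAcquisitions() {
        heapLock.lock();
        try {
            return lockAcquisitions;
        } finally {
            heapLock.unlock();
        }
    }

    public static void main(String[] args) {
        TlabHeap heap = new TlabHeap();
        for (int i = 0; i < 64; i++) {
            heap.allocate(16);
        }
        System.out.println("lock acquisitions after 64 allocations: " + heap.lockAcquisitions());
    }
}
```

In this sketch, 64 allocations of 16 bytes fit in one buffer, so the lock is taken once instead of 64 times.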

Garbage Collection

▪ More pluggable
  ▪ Provisions in runtime for different GC schemes
▪ Parallel & Concurrent
  ▪ More threads doing the work
▪ Background collection
  ▪ More throughput, less responsive
▪ Mark & Sweep, semi-space, large objects, variations

64-bit Support

▪ Two Zygotes, one 32-bit, one 64-bit
  ▪ 64-bit is the default
  ▪ Files duplicated on flash - 32/64-bit.
▪ Compressed references
  ▪ 32-bit object references
  ▪ Mapped within bottom 4 GB of memory
  ▪ Heap size: 256 MB
▪ Hard-float ABI
  ▪ Parameters passed in floating-point registers
▪ JNI 64-bit libraries
  ▪ Apps with 32-bit JNI run by 32-bit Zygote

[Diagram: Zygote32 and Zygote64 each fork() app processes, 32-bit apps from Zygote32 and 64-bit apps from Zygote64.]
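The compressed references bullets on the 64-bit Support slide can be made concrete with a small sketch. It assumes, as the slide states, that the heap is mapped within the bottom 4 GB, so a 32-bit value can name any heap address; the class and method names are hypothetical and the encoding shown (plain truncation and zero-extension) is a simplification rather than a statement of ART's exact scheme.

```java
/** Sketch of 32-bit compressed references for a heap mapped below 4 GB. */
final class CompressedRef {
    static final long FOUR_GB = 1L << 32;

    /** Encodes a heap address as a 32-bit reference; rejects addresses outside the low 4 GB. */
    static int compress(long address) {
        if (address < 0 || address >= FOUR_GB) {
            throw new IllegalArgumentException("address outside the low 4 GB: " + address);
        }
        return (int) address;
    }

    /** Decodes by zero-extension, so references above 2 GB do not turn into negative addresses. */
    static long decompress(int ref) {
        return Integer.toUnsignedLong(ref);
    }

    public static void main(String[] args) {
        long address = 0xF000_0000L; // hypothetical heap address just below 4 GB
        int ref = compress(address);
        System.out.println("round trip ok: " + (decompress(ref) == address));
    }
}
```

The payoff is that object fields holding references stay 4 bytes wide in a 64-bit process, at the cost of constraining where the heap can live.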

Compiling for ART

[Figure: build pipeline. Development: *.java → javac → *.class → dx → *.dex, packaged in the *.apk. Installation: dex2oat produces the *.odex in /data/dalvik-cache/arm or /data/dalvik-cache/arm64.]

Compilation

▪ Compiler driver
  ▪ Portable compiler
  ▪ Sea of nodes IR
  ▪ Quick compiler
  ▪ Optimising compiler
▪ More platform independent code
  ▪ ARM, MIPS & x86 with 64-bit variants
▪ Performed at install time
  ▪ Compiler compiles with multiple threads in parallel
  ▪ Good code quality without onerous compile time

Working on AOSP


Working on AOSP

▪ Google working in the open with ART in AOSP
  ▪ ARM, MIPS, Intel & ARM partners contribute
  ▪ 64-bit porting work has been a proving ground for this approach
▪ Ideas are nice, but code is better
▪ Understand who is doing what
  ▪ Check, post to the Google groups (see android-platform, etc.)
▪ Important to test on more than just ARM platforms - check MIPS & x86
▪ Frequently unstable: reversions, build system restructuring
  ▪ Android is big, and components have interdependencies

Working on AOSP

▪ AOSP’s gerrit has useful features
  ▪ Use it to track new changes in projects through email
  ▪ Volume can be high - can filter on git fields
  ▪ esp. if you are working on a particular feature
▪ Keep a working branch - pull it forward as quickly as possible though
▪ Use and add to the unit tests
  ▪ art # mma test-art

Thank You

The trademarks featured in this presentation are registered and/or unregistered trademarks of ARM Limited (or its subsidiaries) in the EU and/or elsewhere. All rights reserved. Any other marks featured may be trademarks of their respective owners

Sessions

▪ Today:
  ▪ LCU14-104: Everything’s Done! Android™ for 64-bit ARMv8, What’s next?
    ◦ Next in this room
  ▪ LCU14-108: Panel: Faster, Better and more Open AOSP Support
    ◦ 12:10, this room
▪ Wednesday:
  ▪ LCU14-309: Introducing Android NDK for 64bit ARMv8 SOCs
    ◦ 12:10 Grand Peninsula A
▪ Thursday:
  ▪ LCU14-411: From zero to booting Nano-Android with 64bit support
    ◦ 12:10 Grand Peninsula C
▪ Friday:
  ▪ LCU14-502: Android User-Space Tests: Multimedia codec tests, Status and Open Discussions
    ◦ 09:15 Grand Peninsula B

References

▪ Introducing ART: https://source.android.com/devices/tech/dalvik/art.html
▪ ART compatibility: https://developer.android.com/guide/practices/verifying-apps-art.html
▪ Google I/O 2014, The ART Runtime: https://www.youtube.com/watch?v=EBlTzQsUoOw
▪ VIXL: https://github.com/armvixl/vixl
▪ ARM Juno: http://www.arm.com/products/tools/development-boards/versatile-express/juno-arm-development-platform.php
▪ AOSP Gerrit: https://android-review.googlesource.com/
▪ Linaro's Android team: https://wiki.linaro.org/Platform/Android
▪ Bug Reports: https://source.android.com/source/report-bugs.html
▪ ARM Connected Community: http://community.arm.com/groups/android-community
▪ ARMv8 Reference Manual: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0487a.c/index.html

Notices

▪ The Android™ robot is reproduced or modified from work created and shared by Google and used according to terms described in the Creative Commons 3.0 Attribution License.

Backup slides


Multi-lib: A 64bit 'primary' boot