Optimising large dynamic code bases.
2 28/07/2014
Who am I?
Duncan MacGregor
Lead Software Engineer on the Magik on Java project at General Electric in Cambridge
Aardvark179 on Twitter
What is Magik?
• Dynamic, weakly typed message passing based language
• Does everything "wrong", e.g. numerical type promotion, closures over mutable values, etc.
• Not that special in terms of the language.
So what does make it special?
• It's tightly integrated with a long-transaction, version-managed data store.
• There are large applications built using it.
• No, larger than that, I mean really large
So what is large?
• A typical application with an open database has 200,000 methods defined
• About 10% of those are record field accessors and other code generated by the ORM
It gets worse
• There are bigger data models out there, some with over 4000 tables
• Many customers will use third-party add-ons and their own customisations, which may be almost as large as the base apps
What are our main concerns?
• Performance must be good enough
• Need to start up and open the database as fast as possible
• Must not use ridiculous amounts of memory as the application is used
• Warmup time needs to be as short as possible
What I’m going to talk about today
• Memory usage
• Startup time
• Smoothing the path of moving from an old VM to a new one
Memory usage
Where we started on Java 8
First steps running on Java 8
• Measure startup speed relative to Java 7
• Examine memory relative to Java 7
• Take heap dumps as the application is used
• Project memory usage of fully exercised application
What's using the memory?
1. LambdaForms
2. LambdaForms
3. LambdaForms
4. Concurrent Collections (!)
5. Atomics
6. Everything else
LambdaForms!
• Binding followed by adapting generated a great many LambdaForms (LFs)
• Dropping arguments did likewise
• Together these made SwitchPoint guardWithTest (GWT) surprisingly inefficient
Concurrent collections?
• We want a runtime that is as lock-free as possible: a bad application shouldn't lock up core language infrastructure
• Concurrent collections are great in small numbers
• If every method has a little one to hold type-adapted versions, then the memory adds up
Atomics
• Again, we want things lock free if possible
• Methods have:
  • SwitchPoints to handle invalidation
  • Flags
  • Other metadata
• All held in atomics to support update by multiple threads
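One common way to keep this kind of per-method state lock-free without allocating a separate atomic object for every field is a shared field updater over a volatile field. The sketch below only illustrates that general technique; it is not taken from the Magik runtime, and `MethodInfo` and its flag constants are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicIntegerFieldUpdater;

// Hypothetical per-method metadata holder; all names are illustrative.
final class MethodInfo {
    static final int FLAG_PRIVATE = 1;
    static final int FLAG_ABSTRACT = 2;

    // One updater shared by every instance, instead of one AtomicInteger per method.
    private static final AtomicIntegerFieldUpdater<MethodInfo> FLAGS =
        AtomicIntegerFieldUpdater.newUpdater(MethodInfo.class, "flags");

    // The updater requires a volatile int field.
    private volatile int flags;

    /** Atomically sets the given flag bits; safe to call from many threads. */
    void setFlag(int flag) {
        int old;
        do {
            old = flags;
        } while (!FLAGS.compareAndSet(this, old, old | flag));
    }

    /** Returns true if any of the given flag bits are set. */
    boolean hasFlag(int flag) {
        return (flags & flag) != 0;
    }
}
```

With 200,000 methods, each atomic object that is replaced by a plain volatile field saves one object header and one reference per method.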
What is everything else?
• ANTLR's lexer generation produces large arrays of shorts
• Our globals are stored in a prefix tree; we need to move them to a more compact structure
• Reflection caches can be huge
• Many other small things, but they add up
Reducing memory usage
Prototyping / proof of concept
• Change MH adaption order
• Cache unbound adapters for CallSites
• Reduce the polymorphic inline cache (PIC) of each site to a single entry
• Write our own SwitchPoint class
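A minimal sketch of the first two ideas, assuming a generic fallback that takes the call site plus boxed arguments: the fallback is type-adapted once per call site type while still unbound, the result is cached, and binding to the individual site happens last. `FallbackAdapters`, `fallback` and `demo` are hypothetical names, and the fallback body is a stand-in for real relinking logic.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;
import java.util.concurrent.ConcurrentHashMap;

final class FallbackAdapters {
    // Stand-in fallback: a real runtime would look up the method and relink the site.
    static Object fallback(MutableCallSite site, Object[] args) {
        return site.type().parameterCount() + ":" + args.length;
    }

    private static final MethodHandle FALLBACK;
    static {
        try {
            FALLBACK = MethodHandles.lookup().findStatic(FallbackAdapters.class, "fallback",
                MethodType.methodType(Object.class, MutableCallSite.class, Object[].class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // One adapted, still unbound, handle per call site type.
    private static final ConcurrentHashMap<MethodType, MethodHandle> UNBOUND =
        new ConcurrentHashMap<>();

    static MethodHandle unboundAdapter(MethodType siteType) {
        MethodHandle cached = UNBOUND.get(siteType);
        if (cached == null) {
            // Collect the site's arguments into Object[] and match the site's types,
            // keeping the call site itself as the leading parameter.
            MethodHandle adapted = FALLBACK
                .asCollector(Object[].class, siteType.parameterCount())
                .asType(siteType.insertParameterTypes(0, MutableCallSite.class));
            MethodHandle raced = UNBOUND.putIfAbsent(siteType, adapted);
            cached = (raced != null) ? raced : adapted;
        }
        return cached;
    }

    // Binding is the last step, so the adaptation work is shared between sites.
    static MethodHandle fallbackFor(MutableCallSite site) {
        return unboundAdapter(site.type()).bindTo(site);
    }

    // Usage example: a site of type (Object, int)Object.
    static Object demo() {
        MutableCallSite site = new MutableCallSite(
            MethodType.methodType(Object.class, Object.class, int.class));
        try {
            return fallbackFor(site).invoke((Object) "x", 1);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

Every site with the same type shares one adapted handle; only the final `bindTo` is done per site.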
Old CallSite fallback adaption
[Diagram, boxes in reading order: Fallback, Bind to CallSite, Type adaption, Exact Invoker, Fold]
New CallSite fallback adaption
[Diagram, boxes in reading order: Fallback, Type adaption, Exact Invoker, Bind to CallSite, Fold, Drop argument]
GuardWithTest issue
[Diagram, boxes in reading order: Guard With Test, Drop arguments, Bind to Class, ClassCheck, Fallback, Target]
GuardWithTest solution
[Diagram, boxes in reading order: Guard With Test, Bind to Class, Drop arguments, ClassCheck, Fallback, Target]
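One reading of the GuardWithTest issue and solution diagrams is that arguments are dropped on the unbound class check first and the class is bound last, so the adapted check does not depend on the bound class. The sketch below follows that reading and is an assumption, not the project's code; `ClassGuards`, `classCheck`, `guard` and `demo` are hypothetical names.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class ClassGuards {
    // Exact class check on the receiver (the first call argument).
    static boolean classCheck(Class<?> expected, Object receiver) {
        return receiver != null && receiver.getClass() == expected;
    }

    private static final MethodHandle CLASS_CHECK;
    static {
        try {
            CLASS_CHECK = MethodHandles.lookup().findStatic(ClassGuards.class, "classCheck",
                MethodType.methodType(boolean.class, Class.class, Object.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // target and fallback must share a type with at least one (receiver) parameter.
    static MethodHandle guard(Class<?> receiverClass, MethodHandle target, MethodHandle fallback) {
        MethodType type = target.type();
        // Step 1: drop the non-receiver arguments on the unbound check.
        MethodHandle unbound = MethodHandles.dropArguments(
            CLASS_CHECK, 2, type.parameterList().subList(1, type.parameterCount()));
        // Step 2: bind the class last, then match the target's parameter types.
        MethodHandle test = unbound.bindTo(receiverClass)
            .asType(type.changeReturnType(boolean.class));
        return MethodHandles.guardWithTest(test, target, fallback);
    }

    // Usage example: String receivers take the target, everything else the fallback.
    static Object demo(Object receiver) {
        MethodHandle target = MethodHandles.dropArguments(
            MethodHandles.constant(Object.class, "fast"), 0, Object.class, Object.class);
        MethodHandle fallback = MethodHandles.dropArguments(
            MethodHandles.constant(Object.class, "slow"), 0, Object.class, Object.class);
        try {
            return guard(String.class, target, fallback).invoke(receiver, (Object) "arg");
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

In a runtime the result of step 1 could be cached per call site type, leaving only the `bindTo` as per-class work.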
SwitchPoint dynamicInvoker
[Diagram, boxes in reading order: GET_TARGET, Bind to CallSite, Type adaption, Exact Invoker, Fold]
SwitchPoint guardWithTest
[Diagram, boxes in reading order: Guard With Test, Drop Arguments, Invoker, Fallback, Target]
[Diagram, second version: Guard With Test, Bind to CallSite, Fold, Fallback, Target, Type adaption, GET_TARGET, ExactInvoker]
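For readers who have not used the JDK class these diagrams take apart, this is a small sketch of the public `java.lang.invoke.SwitchPoint` API: a guarded handle runs its target until the switch point is invalidated and its fallback afterwards. It shows the standard class only, not the custom replacement mentioned earlier; `SwitchPointDemo` is a hypothetical name.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.SwitchPoint;

final class SwitchPointDemo {
    // Returns the guarded handle's result before and after invalidation.
    static String[] demo() {
        MethodHandle target = MethodHandles.constant(String.class, "cached");
        MethodHandle fallback = MethodHandles.constant(String.class, "relink");

        SwitchPoint sp = new SwitchPoint();
        // Both handles have type ()String, and so does the guarded handle.
        MethodHandle guarded = sp.guardWithTest(target, fallback);
        try {
            String before = (String) guarded.invokeExact();
            // Invalidation is one-way: the guarded handle now always runs the fallback.
            SwitchPoint.invalidateAll(new SwitchPoint[] { sp });
            String after = (String) guarded.invokeExact();
            return new String[] { before, after };
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```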
Did it help?
• Yes! The number of LFs more than halved
• Applications feel faster when first started
• Memory usage significantly reduced
Putting it into production
• Started by tackling the easy parts: concurrent collections, atomics, etc.
• Then refactored call sites
• This work is still ongoing
CallSites: optimising for the common cases
• Method calls are complex
  • Normal, super, self…
  • How many results?
• All sites supported instrumentation
  • Added complexity to each instance
Refactor into many classes
                            Normal Call   Call To Self    Super Call
Single Result               Single        PrivateSingle   SuperPrivateSingle
Tuple Result                Tuple         PrivateTuple    SuperPrivateTuple
Unknown Number of Results                 Private         SuperPrivate
Move towards functional composition
• Hard to build a good class hierarchy without repetition
• Some functionality based on choices at runtime (instrumentation etc.)
• Allows megamorphic costs to be paid only by the sites that need them
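A minimal sketch of composing optional behaviour onto a call site's handle instead of building it into every call site class: the instrumentation step is added only when a counter is supplied, so uninstrumented sites keep the plain handle. `SiteComposer`, `compose` and `demo` are hypothetical names, and counting calls is just a stand-in for whatever instrumentation is needed.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.concurrent.atomic.LongAdder;

final class SiteComposer {
    private static final MethodHandle INCREMENT;
    static {
        try {
            INCREMENT = MethodHandles.lookup().findVirtual(
                LongAdder.class, "increment", MethodType.methodType(void.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Returns target unchanged when no counter is given; otherwise a handle of the
    // same type that bumps the counter and then calls target.
    static MethodHandle compose(MethodHandle target, LongAdder counter) {
        if (counter == null) {
            return target;
        }
        // A void combiner with no parameters runs first and leaves the arguments alone.
        return MethodHandles.foldArguments(target, INCREMENT.bindTo(counter));
    }

    // Usage example: two instrumented calls and one plain call.
    static long demo() {
        LongAdder calls = new LongAdder();
        MethodHandle plain = MethodHandles.identity(Object.class);
        MethodHandle counted = compose(plain, calls);
        try {
            counted.invoke((Object) "a");
            counted.invoke((Object) "b");
            compose(plain, null).invoke((Object) "c");
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
        return calls.sum();
    }
}
```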
Unexpected advantages
[Bar chart: "Number" per call site class, axis 0 to 45,000. Classes listed: MethodSendPrivateSingleCallSite, MethodSendSingleCallSite, MethodSendTupleCallSite, MethodSendPrivateTupleCallSite, MethodSendCallSite, MethodSendPrivateCallSite, MethodSendSuperPrivateSingleCallSite, MethodSendSuperPrivateCallSite, MethodSendSuperPrivateTupleCallSite]
Start up and database open
Bootstrapping the system
• Bootstrapping the system involves loading a lot of code that is only run once
• Loading a class looks up MethodHandle constants repeatedly
• Loading a module causes resources to be scanned on disk
Improving this
• Improving Java’s MethodHandle (MH) constant caching improves our startup time by over 10%
• Emitting fewer larger classes may also help
• Restructure our resource system to stop using the file system
• We may look at more radical solutions depending on where the time is spent
Opening the database
• Lots of meta-programming
  • Subclasses for record types
  • Field access methods
  • Join navigation
  • …
• Old VM allowed images to be saved after all this had been initialised
Opening the database… faster
• Creating new classes requires reflection
• Cache data on classes you know you’ll need again
• Be very careful about the reflection calls you make; some are faster than others
• Be very focused about invalidating CallSites
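One way to cache per-class data is the JDK's `ClassValue`, which computes a value lazily the first time a class is asked for and then reuses it. The sketch below caches a no-argument constructor handle so the reflective lookup happens only once per class; it illustrates the caching advice in general and is not the project's implementation. `ConstructorCache` and its methods are hypothetical names.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class ConstructorCache {
    // Computed at most once per class, on first request.
    private static final ClassValue<MethodHandle> NO_ARG_CONSTRUCTOR =
        new ClassValue<MethodHandle>() {
            @Override
            protected MethodHandle computeValue(Class<?> type) {
                try {
                    return MethodHandles.publicLookup()
                        .findConstructor(type, MethodType.methodType(void.class))
                        // Erase the return type so every cached handle is ()Object.
                        .asType(MethodType.methodType(Object.class));
                } catch (ReflectiveOperationException e) {
                    throw new IllegalArgumentException(
                        "No public no-arg constructor: " + type, e);
                }
            }
        };

    static MethodHandle constructorFor(Class<?> type) {
        return NO_ARG_CONSTRUCTOR.get(type);
    }

    // Creates an instance through the cached handle.
    static Object newInstance(Class<?> type) {
        try {
            return (Object) constructorFor(type).invokeExact();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```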
Why?
[Chart: Unoptimised class/method creation. Series: MagikSF, Alchemy (Java 7u25), Alchemy (Java 8). Axis 0 to 60]
[Chart: After performance work. Series: MagikSF, Alchemy (Java 7u25), Alchemy (Java 8). Axis 0 to 12]
Serialisation
• Need to turn the whole data dictionary into a series of blobs
• Some things are hard to serialise
• Want to restore database connections in parallel, which is a trial by fire for thread safety
• Additional work then needed to stitch everything back together
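A rough sketch of the overall shape, assuming plain Java serialisation: each piece becomes an independent blob, the blobs are restored on a thread pool, and a final single-threaded pass collects the results. Real data dictionary objects and the real stitching step are far more involved; `BlobRestore` and its methods are hypothetical names, and the "stitching" here is reduced to gathering results in their original order.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class BlobRestore {
    // Serialises one value into a self-contained blob.
    static byte[] toBlob(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    static Object fromBlob(byte[] blob) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(blob))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Restores every blob on a pool, then gathers the results in input order.
    static List<Object> restoreAll(List<byte[]> blobs, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Object>> pending = new ArrayList<>();
            for (byte[] blob : blobs) {
                Callable<Object> task = () -> fromBlob(blob);
                pending.add(pool.submit(task));
            }
            List<Object> restored = new ArrayList<>();
            for (Future<Object> future : pending) {
                restored.add(future.get());  // the single-threaded "stitching" point
            }
            return restored;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```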
How much faster is it?
[Bar chart: Time to open database. Bars: Without serialisation, With serialisation. Axis 0 to 80]
What’s still to do?
• Fix some concurrency bugs
• Test with more databases
Easing the transition
Need to help people moving from our old VM
• Used to running tests with some form of coverage analysis
• Not used to threads being truly concurrent
Coverage of non-Java languages
• JaCoCo can compile coverage stats without any problems
• We’ve run large test suites through it and nothing terrible happened
• Displaying those results is a different story
Coverage results
• Presenting coverage data for a large code base requires hierarchical display
• The packages for your classes need to be in a good hierarchy as well, or the results still won’t be manageable
• If the class file contains info about the source file then use that; don’t depend on Java source conventions
How much work is this?
• Prototyping changes to JaCoCo took an afternoon
• Haven’t turned those into proper patches yet
• Changes to our compiler still to be done
Threading
• We can instrument thread creation and use of atomic queues and locks
• That only catches the cases where people thought about thread safety at all
Summary
• MethodHandle caching and reuse is vital
• Serialisation and optimising meta-programming really help startup speeds
• No easy answers for finding concurrency issues