JVM performance options. How it works


This is a presentation for Cogniance Java Evening, 13.02.2013. My article on this topic (in Russian) is available here: http://habrahabr.ru/post/160049/


Dmitriy Dumanskiy

Cogniance, Velti project

Java Team Lead

-Xmx2048M -Xms2048M -XX:ParallelGCThreads=8 -Xincgc
-XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSIncrementalPacing
-XX:+AggressiveOpts -XX:+CMSParallelRemarkEnabled -XX:+DisableExplicitGC
-XX:MaxGCPauseMillis=500 -XX:SurvivorRatio=16 -XX:TargetSurvivorRatio=90
-XX:+UseAdaptiveGCBoundary -XX:-UseGCOverheadLimit -Xnoclassgc
-XX:UseSSE=3 -XX:PermSize=128m -XX:LargePageSizeInBytes=4m

Options may vary per

architecture / OS / JVM version

JVM 6 ~ 730 options

JVM 7 ~ 680 options

-X : non-standard options (not supported by every JVM)

-XX : unstable options (may change or disappear between releases)

Types

Boolean : -XX:+<option> or -XX:-<option>

Numeric : -XX:<option>=<number>

String : -XX:<option>=<string>
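The three syntaxes can be told apart mechanically. The following is a minimal sketch, not part of the original slides: the class `JvmOptionKind` and its `classify` method are hypothetical names, and treating values such as `128m` as numeric is an assumption made for illustration.

```java
class JvmOptionKind {
    enum Kind { BOOLEAN, NUMERIC, STRING, UNKNOWN }

    // Classifies a -XX option by its syntax only; it does not check
    // whether the JVM actually knows the option.
    static Kind classify(String option) {
        if (!option.startsWith("-XX:")) {
            return Kind.UNKNOWN;
        }
        String body = option.substring(4);
        if (body.startsWith("+") || body.startsWith("-")) {
            return Kind.BOOLEAN;            // -XX:+<option> or -XX:-<option>
        }
        int eq = body.indexOf('=');
        if (eq < 0) {
            return Kind.UNKNOWN;            // neither a switch nor a key=value pair
        }
        String value = body.substring(eq + 1);
        // Assumption: plain digits with an optional size suffix count as numeric.
        return value.matches("\\d+[kKmMgG]?") ? Kind.NUMERIC : Kind.STRING;
    }

    public static void main(String[] args) {
        System.out.println(classify("-XX:+UseParNewGC"));
        System.out.println(classify("-XX:SurvivorRatio=16"));
        System.out.println(classify("-XX:HeapDumpPath=/tmp/dump.hprof"));
    }
}
```

The options actually passed to a running JVM can be read with `ManagementFactory.getRuntimeMXBean().getInputArguments()` and fed to such a classifier.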

Categories

Behavioral options

Garbage Collection options

Performance tuning options

Debugging options

-XX:+DoEscapeAnalysis

Analysis: can objects be created on the stack? Are objects accessed from only one thread?

Analysis result:

GlobalEscape

ArgEscape

NoEscape

class Cursor {
    String icon;
    int x;

    public void create() {
        Cursor c = new Cursor(); //HEAP
        c.icon = null;           //HEAP
        c.x = 0;                 //HEAP
    }
}

NoEscape → scalar replacement

class Cursor {
    String icon;
    int x;

    public void create() {
        String icon = null; //ref on stack frame
        int x = 0;          //int on stack frame
    }
}


-XX:+DoEscapeAnalysis

~20-60% of locks eliminated

~15-20% performance improvement
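The three analysis results can be illustrated with one small class. This is a sketch under assumptions: `EscapeDemo`, `Point` and the method names are hypothetical, and whether the JIT really removes an allocation depends on inlining decisions and the JVM version.

```java
class EscapeDemo {
    static final class Point {
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // A static field is reachable from any thread.
    static Point shared;

    // NoEscape: p never leaves this method, so it is a candidate
    // for scalar replacement (x and y kept as plain locals).
    static int noEscape(int a, int b) {
        Point p = new Point(a, b);
        return p.x + p.y;
    }

    // ArgEscape: p is passed to a callee, but the callee does not store it.
    static int argEscape(int a, int b) {
        Point p = new Point(a, b);
        return sum(p);
    }

    static int sum(Point p) {
        return p.x + p.y;
    }

    // GlobalEscape: p is published through a static field,
    // so it has to live on the heap.
    static int globalEscape(int a, int b) {
        Point p = new Point(a, b);
        shared = p;
        return p.x + p.y;
    }

    public static void main(String[] args) {
        System.out.println(noEscape(2, 3));
        System.out.println(argEscape(2, 3));
        System.out.println(globalEscape(2, 3));
    }
}
```

All three methods return the same value; they differ only in how far the `Point` reference travels, which is what the analysis classifies.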


-XX:+AggressiveOpts

Option                       -AggressiveOpts   +AggressiveOpts
AutoBoxCacheMax              128               20000
BiasedLockingStartupDelay    4000              500
EliminateAutoBox             false             true
OptimizeFill                 false             true
OptimizeStringConcat         false             true

-XX:AutoBoxCacheMax=size

class Integer {
    public static Integer valueOf(int i) {
        if (i >= -128 && i <= IntegerCache.high)
            return IntegerCache.cache[i + 128];
        else
            return new Integer(i);
    }
}

-XX:AutoBoxCacheMax=size sets the value of IntegerCache.high.

new Integer(1) vs Integer.valueOf(1)

valueOf ~4 times faster
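The effect of the cache is visible through reference equality. A minimal sketch, assuming a hypothetical class name `BoxingDemo`; only the range -128..127 is guaranteed to be cached, and anything above it depends on the AutoBoxCacheMax setting of the running JVM.

```java
class BoxingDemo {
    // True when both calls return the same cached Integer instance.
    static boolean sameInstance(int value) {
        return Integer.valueOf(value) == Integer.valueOf(value);
    }

    public static void main(String[] args) {
        // Always cached: the language guarantees -128..127.
        System.out.println(sameInstance(127));
        // Depends on -XX:AutoBoxCacheMax; not cached with the default value of 128.
        System.out.println(sameInstance(1000));
    }
}
```

A constructor call such as `new Integer(1)` bypasses the cache and always allocates, which is why `valueOf` is the cheaper choice for small values.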

-XX:BiasedLockingStartupDelay=delay

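Lock states:

Biased: the lock is reserved for one thread, which re-acquires it without atomic operations

Thin: lightweight lock taken with compare-and-swap while there is no contention

Fat: inflated monitor with OS-level blocking, used under contention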

-XX:-OptimizeStringConcat

String twenty = "12345678901234567890";
String sb = twenty + twenty + twenty + twenty;

is compiled by javac as:

String twenty = "12345678901234567890";
String sb = new StringBuilder().append(twenty).append(twenty).append(twenty).append(twenty).toString();

-XX:-OptimizeStringConcat

String twenty = "12345678901234567890";
String sb = new StringBuilder()
    .append(twenty).append(twenty)
    .append(twenty).append(twenty).toString();

Without the optimization the internal buffer is reallocated as it fills:

new char[16];
new char[34];
new char[70];
new char[142];
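The sequence 16, 34, 70, 142 follows from the buffer growth rule. The sketch below only models that rule and does not inspect a real StringBuilder; the class `BufferGrowth` and its method are hypothetical, and the rule "new capacity = (old + 1) * 2" is the assumed growth policy of the JDK versions the slides target.

```java
import java.util.ArrayList;
import java.util.List;

class BufferGrowth {
    // Returns every buffer size allocated while appending `appends`
    // strings of `appendLength` characters to an initially empty buffer.
    static List<Integer> allocations(int initialCapacity, int appendLength, int appends) {
        List<Integer> sizes = new ArrayList<>();
        int capacity = initialCapacity;
        int length = 0;
        sizes.add(capacity);                      // the initial array
        for (int i = 0; i < appends; i++) {
            int needed = length + appendLength;
            if (needed > capacity) {
                int grown = (capacity + 1) * 2;   // assumed growth rule
                capacity = Math.max(grown, needed);
                sizes.add(capacity);              // a new, larger array
            }
            length = needed;
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(allocations(16, 20, 4)); // prints [16, 34, 70, 142]
    }
}
```

Four appends of 20 characters need only 80 characters in total, so three of the four arrays are garbage by the time `toString()` is called.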

-XX:+OptimizeStringConcat

String twenty = "12345678901234567890";
String sb = new StringBuilder()
    .append(twenty).append(twenty)
    .append(twenty).append(twenty).toString();

new char[80];

-XX:+OptimizeStringConcat

String twenty = "12345678901234567890";
StringBuilder sb1 = new StringBuilder();
sb1.append(new StringBuilder()
    .append(twenty).append(twenty)
    .append(twenty).append(twenty)
);

new char[80];

-XX:+OptimizeFill

Arrays.fill(), Arrays.copyOf() or code patterns such as:

for (int i = fromIndex; i < toIndex; i++) {
    a[i] = val;
}

are replaced with native machine instructions.
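The loop pattern and the library call are interchangeable from the program's point of view, which is what lets the JIT treat them the same way. A minimal sketch with hypothetical names (`FillDemo`, `fillWithLoop`, `fillWithLibrary`); it shows only the equivalence, not the generated machine code.

```java
import java.util.Arrays;

class FillDemo {
    // Hand-written fill loop: the code pattern the option recognizes.
    static int[] fillWithLoop(int length, int val) {
        int[] a = new int[length];
        for (int i = 0; i < a.length; i++) {
            a[i] = val;
        }
        return a;
    }

    // Library call with the same effect.
    static int[] fillWithLibrary(int length, int val) {
        int[] a = new int[length];
        Arrays.fill(a, val);
        return a;
    }

    public static void main(String[] args) {
        int[] byLoop = fillWithLoop(8, 7);
        int[] byLibrary = fillWithLibrary(8, 7);
        System.out.println(Arrays.equals(byLoop, byLibrary)); // prints true
    }
}
```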

-XX:+EliminateAutoBox

Removes unnecessary autoboxing operations

Works only for Integer values

-XX:+UseStringCache

Appears to be no longer used

-XX:+UseCompressedStrings

For ASCII characters:

char[] -> byte[]

-XX:+UseCompressedOops

Heap size up to 32 GB

Reference size 50% smaller

JVM performance boost of 2-10%

20-60% less memory consumption

[Chart for -XX:+UseCompressedOops: relative comparison of 32-bit, 64-bit and 64-bit with compressed oops; vertical axis from 0 to 1.2]

-XX:+EliminateLocks

synchronized (object) {
    //doSomething1
}
synchronized (object) {
    //doSomething2
}

becomes:

synchronized (object) {
    //doSomething1
    //doSomething2
}

-XX:+EliminateLocks

synchronized (object) {
    //doSomething1
}
//doSomething2
synchronized (object) {
    //doSomething3
}

becomes:

synchronized (object) {
    //doSomething1
    //doSomething2
    //doSomething3
}
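The merge is safe because both forms produce the same result; only the number of lock acquisitions changes. A minimal sketch, assuming a hypothetical class `CoarseningDemo`; the second method is written by hand to show what the merged code would look like, it is not output of the JIT.

```java
class CoarseningDemo {
    private final Object lock = new Object();
    private int counter;

    // Two adjacent synchronized blocks on the same object:
    // the lock is acquired and released twice.
    int twoBlocks() {
        synchronized (lock) {
            counter++;
        }
        synchronized (lock) {
            counter++;
            return counter;
        }
    }

    // Hand-merged equivalent: one acquisition, same result.
    int oneBlock() {
        synchronized (lock) {
            counter++;
            counter++;
            return counter;
        }
    }

    public static void main(String[] args) {
        System.out.println(new CoarseningDemo().twoBlocks()); // prints 2
        System.out.println(new CoarseningDemo().oneBlock());  // prints 2
    }
}
```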

-XX:+UseLargePages

The Translation-Lookaside Buffer (TLB) is a page translation cache that holds the most recently used virtual-to-physical address translations

-XX:CompileThreshold=n

Client mode n = 1500

Server mode n = 10000

More profile data means more optimizations

-XX:hashCode=n

Object.hashCode() - internal address of the object?

-XX:hashCode=n

n is :

0 – Park-Miller RNG (default)
1 – f (address, global state)
2 – const 1
3 – sequence counter
4 – object address
5 – Thread-local Xorshift

Recommended
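Whatever generator is selected, the identity hash of an object stays the same for its whole lifetime, so it cannot simply be the current address of an object that the garbage collector may move. A minimal sketch with hypothetical names (`HashCodeDemo`, `Key`); it uses only `System.identityHashCode`, which returns the value `Object.hashCode()` would give even when a class overrides `hashCode()`.

```java
class HashCodeDemo {
    // A class with its own hashCode(); the identity hash is unaffected by it.
    static final class Key {
        @Override
        public int hashCode() {
            return 42;
        }
    }

    // The identity hash is generated once and then reused for the object.
    static boolean identityHashIsStable(Object o) {
        int first = System.identityHashCode(o);
        int second = System.identityHashCode(o);
        return first == second;
    }

    public static void main(String[] args) {
        Key key = new Key();
        System.out.println(key.hashCode());               // prints 42
        System.out.println(identityHashIsStable(key));    // prints true
        // The actual value depends on -XX:hashCode and differs between runs.
        System.out.println(System.identityHashCode(key));
    }
}
```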