Thinking in C/C++, coding in Java

Arvind Jayaprakash, foss.in 2012

DESCRIPTION

foss.in 2012 talk (http://fossdotin2012.shdlr.com/conferences/talk/196)

Intent: There comes a time in every C/C++ programmer's life when they are looking at a smashed stack and a trashed heap, and wish that core dumps happened only when null pointers get dereferenced. This is the weak moment when people hang up their gdb boots and trade them for java.lang.NullPointerException. We shall be exploring how to use Java as a safer version of C without giving up too much control. A lot of big open source projects are starting to show up in Java for this very reason (e.g. Hadoop).

Overview: The Java programming language was considered too slow and too high level in its early days by performance junkies who believed that the only true way out was to code in C (and very reluctantly in C++). The language itself had made significant strides by the time it reached v5, and JVMs have also become quite good at what they do.

Page 1: Thinking in C/C++, coding in Java

Thinking in C/C++, coding in Java

foss.in 2012, Arvind Jayaprakash

Page 2: Thinking in C/C++, coding in Java

Audience

• Surely not for you if you’ve never done *nix system programming or bare C/C++

• Maybe for you if you’ve done a reasonable amount of the above and “hello world” Java

• Prime audience if you are being pushed into/want to explore Java as an option for moderately high performance applications

Page 3: Thinking in C/C++, coding in Java

whoami

Page 4: Thinking in C/C++, coding in Java

finger

Home
• anomalizer
• anomalizer
• http://anomalizer.net/

Work
• anomalizer
• http://inmobi.com/

Page 5: Thinking in C/C++, coding in Java

history/uname

Home
• MS-DOS in 1990
• Primarily Win98 & a little bit of RH7 in 2001
• Win7 for PPT and Gentoo for everything else in 2012 (fluxbox is my window manager, xterm is my favourite terminal)

Work
• 5 years of FreeBSD & 1 year of RHEL
• Chose the OS for current employer’s servers (Ubuntu since 2008)
• Gentoo/Win7 combo on my laptop

Page 6: Thinking in C/C++, coding in Java

Primer

Page 7: Thinking in C/C++, coding in Java

Survival tips

• Java language (J2SE) != J2EE
• J2SE 5 (also known as 1.5 or JLS5) is the lowest respectable version of the language
• Sun (now Oracle) JRE continues to remain the most popular free JRE+JDK
• Sun-JRE 1.6.0.22 is a good min version if you have 64 bit, x86_64, NUMA hardware running linux
• IDEs are a necessary evil; vim/emacs just don’t cut it

Page 8: Thinking in C/C++, coding in Java

Why Java 1.5?

• Extensive concurrency libs
• Generics
• Annotations
• Lint checks
• Enums (typesafe too!)
• Variable arguments
• foreach
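
As a rough illustration of the list above, the sketch below touches several of the Java 5 additions in one place. All names (`Java5Tour`, `Level`, `sum`, `describe`) are made up for this example and do not come from the talk.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical names throughout; only the language features matter.
class Java5Tour {
    // Enum: a real type, not an int in disguise.
    enum Level { LOW, HIGH }

    // Variable arguments: callers may pass any number of ints.
    static int sum(int... values) {
        int total = 0;
        for (int v : values) {          // foreach over an array
            total += v;
        }
        return total;
    }

    // Generics: the element type is checked at compile time.
    static String describe(List<Level> levels) {
        StringBuilder sb = new StringBuilder();
        for (Level l : levels) {        // foreach over a collection
            sb.append(l.name()).append(' ');
        }
        return sb.toString().trim();
    }

    @Override                           // annotation: the compiler verifies this is an override
    public String toString() {
        return "Java5Tour";
    }

    public static void main(String[] args) {
        List<Level> levels = new ArrayList<Level>();
        levels.add(Level.LOW);
        levels.add(Level.HIGH);

        // java.util.concurrent: a lock-free counter
        AtomicInteger counter = new AtomicInteger();
        counter.incrementAndGet();

        System.out.println(sum(1, 2, 3));
        System.out.println(describe(levels));
        System.out.println(counter.get());
    }
}
```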

Page 9: Thinking in C/C++, coding in Java

Let us get started now*

* usually means over-simplification that shall be clarified later

Page 10: Thinking in C/C++, coding in Java

Classes & Objects

Page 11: Thinking in C/C++, coding in Java

D’oh

Page 12: Thinking in C/C++, coding in Java

Primitives v/s objects

• Primitive data types, structs & classes play by the exact same set of rules in C/C++ in almost every context

• Java fundamentally drives a wedge between the two both at a language level and runtime level

• This is why there is a primitive int and a class Integer. These two are not interchangeable*

* Auto boxing is a deception
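
Two small cases where the int/Integer wedge shows through auto boxing, as a sketch. The class and method names are hypothetical. The first relies on an Integer being a reference that may be null; the second on `List.remove` having separate overloads for an `int` index and an `Object` value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not from the talk.
class BoxingPitfalls {
    // Unboxing a null Integer compiles fine but fails at runtime.
    static boolean unboxNullThrows() {
        Integer boxed = null;
        try {
            int raw = boxed;            // compiler inserts boxed.intValue()
            return false;               // not reached
        } catch (NullPointerException e) {
            return true;
        }
    }

    // remove(int) treats its argument as an index.
    static List<Integer> removeByIndex() {
        List<Integer> list = new ArrayList<Integer>(Arrays.asList(5, 6, 1));
        list.remove(1);                 // drops the element at index 1, i.e. the value 6
        return list;
    }

    // remove(Object) treats its argument as a value to search for.
    static List<Integer> removeByValue() {
        List<Integer> list = new ArrayList<Integer>(Arrays.asList(5, 6, 1));
        list.remove(Integer.valueOf(1)); // drops the value 1
        return list;
    }

    public static void main(String[] args) {
        System.out.println(unboxNullThrows());
        System.out.println(removeByIndex());
        System.out.println(removeByValue());
    }
}
```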

Page 13: Thinking in C/C++, coding in Java

The approximate analogy

Primitives
• Think of primitives as values that can reside on the stack
• Lifespan always tied to source scope for local variables

Composites (Objects)
• Think of objects (classes) as values that always* reside on heap
• Now it becomes obvious that you are always dealing with pointers/references
• It also becomes obvious that their true lifespan is not tied to source scope

* except for escape analysis implementations in some JVMs

Page 14: Thinking in C/C++, coding in Java

Nested structs/classes

Java:

class Point {
    public int x;
    public int y;
}

class Rect {
    public Point top_left;
    public Point bottom_right;
}

C++:

struct Point {
    int x;
    int y;
};

struct InlineRect {
    Point top_left;
    Point bottom_right;
};

struct IndirectRect {
    Point *top_left;
    Point *bottom_right;
};
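
The Java `Rect` behaves like `IndirectRect`, not `InlineRect`. A minimal sketch of what that means in practice follows; `NestedDemo` and its methods are hypothetical names added for illustration.

```java
class Point {
    public int x;
    public int y;
}

class Rect {
    public Point top_left;      // a reference, comparable to Point* in IndirectRect
    public Point bottom_right;
}

// Hypothetical demo class, not from the talk.
class NestedDemo {
    // new Rect() allocates only the two references; no Point exists yet.
    static boolean fieldsStartNull() {
        Rect r = new Rect();
        return r.top_left == null && r.bottom_right == null;
    }

    // Two rects can point at the same Point, just like two structs sharing a pointer.
    static int sharedPoint() {
        Point p = new Point();
        Rect a = new Rect();
        Rect b = new Rect();
        a.top_left = p;
        b.top_left = p;
        a.top_left.x = 7;       // visible through b as well
        return b.top_left.x;
    }

    public static void main(String[] args) {
        System.out.println(fieldsStartNull());
        System.out.println(sharedPoint());
    }
}
```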

Page 15: Thinking in C/C++, coding in Java

Null & void

• The notorious void* exists in Java; it is commonly referred to as the class named Object
  – Any object (reference) can be directly cast to Object
  – An object (reference) of type Object can be downcast to any type at compile time#

• null is not a type, however it is a language defined literal (like true & false)

# but can throw an error at runtime
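
A short sketch of the upcast/downcast behaviour described above. `ObjectCastDemo` is a hypothetical name; the runtime error in question is `java.lang.ClassCastException`.

```java
// Hypothetical demo class, not from the talk.
class ObjectCastDemo {
    static String downcastOk() {
        Object o = "hello";             // implicit upcast, like T* to void*
        String s = (String) o;          // explicit downcast, checked at runtime
        return s;
    }

    static boolean badDowncastThrows() {
        Object o = Integer.valueOf(42);
        try {
            String s = (String) o;      // compiles, but the object is not a String
            return false;               // not reached
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(downcastOk());
        System.out.println(badDowncastThrows());
    }
}
```

Unlike a bad cast from void* in C, the failure is detected at the cast itself and not at some later use of the pointer.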

Page 16: Thinking in C/C++, coding in Java

What are references in java?

Why it is like a C pointer
• Think of a reference as a C pointer
• Think of the dot operator in Java as C’s arrow operator
• null is NULL, dereferencing it is a bad idea
• Think of a final ref in Java as a const ptr (not to be confused with ptr to const)

Why it is not like a C++ reference
• j-refs are nullable (d’uh)
• C++ refs cannot be made to point to something else post declaration unlike Java refs
• == operator in Java has ptr equivalence semantics, not dereferenced object equivalence; use equals() for that
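
The points above can be sketched in a few lines. `RefDemo` and its method names are hypothetical; `String` and `StringBuilder` are used only because they are handy heap objects.

```java
// Hypothetical demo class, not from the talk.
class RefDemo {
    // == compares the "pointers"; equals() compares what they point to.
    static boolean[] identityVsEquality() {
        String a = new String("abc");
        String b = new String("abc");   // a second, distinct object
        return new boolean[] { a == b, a.equals(b) };
    }

    // final ref ~ const ptr: the ref cannot be reseated, the pointee can still change.
    static String finalRefMutablePointee() {
        final StringBuilder sb = new StringBuilder("a");
        sb.append("b");                 // fine: mutates the object
        // sb = new StringBuilder();    // would not compile: sb is final
        return sb.toString();
    }

    // The reference itself is passed by value, like passing a pointer in C.
    static void reseat(StringBuilder sb) { sb = new StringBuilder("new"); }
    static void mutate(StringBuilder sb) { sb.append("!"); }

    static String reseatThenMutate() {
        StringBuilder sb = new StringBuilder("x");
        reseat(sb);                     // caller's reference is unaffected
        mutate(sb);                     // caller's object is modified
        return sb.toString();
    }

    public static void main(String[] args) {
        boolean[] r = identityVsEquality();
        System.out.println(r[0] + " " + r[1]);
        System.out.println(finalRefMutablePointee());
        System.out.println(reseatThenMutate());
    }
}
```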

Page 17: Thinking in C/C++, coding in Java

vtables

• Every class inherits from Object class
• Every member function is virtual in Java; there is no opt-out
  – Hence, internally, every class has a vtable
  – And every object instance has an internal pointer/ref to the vtable of its actual type (for dynamic dispatch)
  – And a fn-call is via ptr-to-fn*
• RTTI (of C++ fame) comes at no additional cost as a side-effect & is guaranteed to be available

*Unless you do some class/method finalisation
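
A small sketch of dynamic dispatch and the always-on RTTI. The classes `Shape`, `Circle` and `DispatchDemo` are hypothetical and exist only for this illustration.

```java
// Hypothetical classes, not from the talk.
class Shape {
    String name() { return "shape"; }   // virtual by default, no keyword needed
}

class Circle extends Shape {
    @Override
    String name() { return "circle"; }
}

class DispatchDemo {
    // The call goes to the actual type's method, as with a C++ virtual function.
    static String viaBaseRef() {
        Shape s = new Circle();
        return s.name();
    }

    // RTTI: every object can report its actual class at runtime.
    static String runtimeType() {
        Shape s = new Circle();
        return s.getClass().getSimpleName();
    }

    static boolean isCircle(Object o) {
        return o instanceof Circle;
    }

    public static void main(String[] args) {
        System.out.println(viaBaseRef());
        System.out.println(runtimeType());
        System.out.println(isCircle(new Shape()));
    }
}
```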

Page 18: Thinking in C/C++, coding in Java

Other deceptive similarities

Page 19: Thinking in C/C++, coding in Java

Generics & templates aren’t the same

Java generics
• No support for primitives
• Single copy of code exists regardless of the number of type arguments a generic code is used with
• Generified code gets compiled as an entity in itself
• Bounded type parameters are possible; unbounded defaults to Object

C++ templates
• Supports all types
• One copy of object code for each template instantiation
• Glorified C style macros, compilation happens once for each expansion; some compilation errors crop up here
• No inheritance family based bounding of type parameters, only explicit specialization is possible
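
Two of the Java-side points can be checked directly: the single shared copy of code and bounded type parameters. This is a sketch with hypothetical names (`ErasureDemo`, `max`).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not from the talk.
class ErasureDemo {
    // One compiled class serves every type argument.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> ints = new ArrayList<Integer>();
        return strings.getClass() == ints.getClass();
    }

    // Bounded type parameter: T must be comparable to itself.
    static <T extends Comparable<T>> T max(T a, T b) {
        return a.compareTo(b) >= 0 ? a : b;
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
        System.out.println(max(3, 7));      // ints are boxed to Integer first
        System.out.println(max("a", "b"));
    }
}
```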

Page 20: Thinking in C/C++, coding in Java

casts

• Syntactically identical to C casts
• Let us speak in C++ terms for semantic clarity
  – static_cast is permitted
  – No const_cast as there are no consts to begin with
  – dynamic_cast permitted due to implicit RTTI support (hence Object objects can be cast to anything)
  – reinterpret_cast disallowed; convert & copy is the only way out
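
A sketch of the three cases that do have a Java counterpart. `CastDemo` is a hypothetical name; `Float.floatToIntBits` is one standard-library example of the "convert & copy" route.

```java
// Hypothetical demo class, not from the talk.
class CastDemo {
    // static_cast-like: numeric conversion, truncates toward zero.
    static int truncate(double d) {
        return (int) d;
    }

    // dynamic_cast-like: test the runtime type, then downcast.
    static String describe(Object o) {
        if (o instanceof String) {
            String s = (String) o;
            return "string of length " + s.length();
        }
        return "not a string";
    }

    // No reinterpret_cast: the bits are obtained by an explicit conversion call.
    static int bitsOf(float f) {
        return Float.floatToIntBits(f);
    }

    public static void main(String[] args) {
        System.out.println(truncate(3.9));
        System.out.println(describe("abc"));
        System.out.println(Integer.toHexString(bitsOf(1.0f)));
    }
}
```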

Page 21: Thinking in C/C++, coding in Java

Memory issues

Page 22: Thinking in C/C++, coding in Java

Auto-boxing woes

• Java 5 made it syntactically possible to use a primitive and its objectified version interchangeably (eg: Long & long)
• The costs however are very different
  – Indirection (ptr de-ref) to read value
  – Memory footprint is 2 ptrs (one to value, and the vptr inside object) + that of actually storing the primitive

Page 23: Thinking in C/C++, coding in Java

You don’t want to see this

Integer x;

for(int i = 0 ; i < 100; i++) { x = i * i; }
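
What makes the loop above unpleasant is the hidden boxing on every assignment. The sketch below spells out the boxing call next to the primitive version; `BoxingLoop` and its method names are hypothetical. `Integer.valueOf` may hand back a cached object for small values, but in general each call can mean a fresh heap object.

```java
// Hypothetical demo class, not from the talk.
class BoxingLoop {
    // The slide's loop with the boxing call written out explicitly.
    static Integer boxedLast() {
        Integer x = null;
        for (int i = 0; i < 100; i++) {
            x = Integer.valueOf(i * i); // what "x = i * i" turns into for an Integer x
        }
        return x;
    }

    // Same result with a primitive: no objects, no indirection.
    static int primitiveLast() {
        int x = 0;
        for (int i = 0; i < 100; i++) {
            x = i * i;
        }
        return x;
    }

    public static void main(String[] args) {
        System.out.println(boxedLast());
        System.out.println(primitiveLast());
    }
}
```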

Page 24: Thinking in C/C++, coding in Java

int[] v/s ArrayList<Integer>

• vector<int> & int[] have identical performance in C++, don’t carry that assumption into Java!

• Remember, generics only work with objects, so we can’t use an int with it

• And int is just not the same as an Integer

Page 25: Thinking in C/C++, coding in Java

In figures

int[]
[Array header | a0 | a1 | a2 | … | an-1] (values stored inline in the array)

ArrayList<Integer>
[Array header | ref | ref | ref | … | ref], where each ref points to a separate [ObjectHeader | ai] object

Page 26: Thinking in C/C++, coding in Java

In words

• On an un-tuned 64 bit JVM, pay at least 400% memory tax (it is still 200% on a tuned JVM)
• 100% apparent memory access cost
• Completely wreck your cache lines by simply iterating through the array (real tax can exceed 100%)

• And yes, there is copying involved when you expand beyond a certain limit

• And more work for GC …
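
The two loops below look nearly identical in source but walk memory very differently, which is where the access cost above comes from. This is only a sketch with hypothetical names (`SumDemo`); it shows the access pattern and does not measure anything.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not from the talk.
class SumDemo {
    // Reads ints that sit next to each other in one block of memory.
    static long sumPrimitive(int[] values) {
        long total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // Reads a reference per element, follows it to an Integer object, then unboxes.
    static long sumBoxed(List<Integer> values) {
        long total = 0;
        for (Integer v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 1000;
        int[] raw = new int[n];
        List<Integer> boxed = new ArrayList<Integer>(n);
        for (int i = 0; i < n; i++) {
            raw[i] = i;
            boxed.add(i);               // boxes each int into an Integer
        }
        System.out.println(sumPrimitive(raw) + " " + sumBoxed(boxed));
    }
}
```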

Page 27: Thinking in C/C++, coding in Java

The solution

• So what about collections of primitives?
  – What if you want an expandable array of ints?
  – What if you want a map of short to double?
• Use primitive collection libraries
  – trove4j solves the above problems
  – It is a GNU project & comes with an LGPL license too

• The larger point however is to understand the object model & memory layout
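
To make the memory layout point concrete, here is a hand-rolled expandable array of ints. It is a minimal sketch under its own assumptions and is NOT trove4j's API or implementation; `IntVector` and its methods are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical class for illustration; not trove4j.
class IntVector {
    private int[] data = new int[8];    // values stored inline, no per-element objects
    private int size = 0;

    void add(int value) {
        if (size == data.length) {
            // Grow by copying into a larger array, as the earlier slide warned.
            data = Arrays.copyOf(data, data.length * 2);
        }
        data[size++] = value;
    }

    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return data[index];
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        IntVector v = new IntVector();
        for (int i = 0; i < 100; i++) {
            v.add(i);
        }
        System.out.println(v.size() + " " + v.get(99));
    }
}
```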

Page 28: Thinking in C/C++, coding in Java

No reinterpret cast for you!

• Imagine trying to read values from byte streams such as files & sockets

• You have 3 choices
  – Bottom-up read, one primitive at a time (entire class chain must play nice for this)
  – Slurp the blob, break the blob and make a meaningful object by copying over the primitives in top-down fashion (a.k.a. memcpy)
  – Use Java serialization (disallows conditional parsing)
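
The second choice (slurp the blob, then copy primitives out) might look like the sketch below, using `java.nio.ByteBuffer`. The wire format here (2-byte type, 4-byte length, 8-byte timestamp, big-endian) and the names `RecordParser` and `Header` are assumptions made up for the example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical parser for a made-up 14-byte header.
class RecordParser {
    static final class Header {
        final short type;
        final int length;
        final long timestamp;

        Header(short type, int length, long timestamp) {
            this.type = type;
            this.length = length;
            this.timestamp = timestamp;
        }
    }

    // Where C would cast the buffer to a struct pointer, Java copies field by field.
    static Header parse(byte[] blob) {
        ByteBuffer buf = ByteBuffer.wrap(blob).order(ByteOrder.BIG_ENDIAN);
        short type = buf.getShort();
        int length = buf.getInt();
        long timestamp = buf.getLong();
        return new Header(type, length, timestamp);
    }

    // Inverse operation, used here only to produce test input.
    static byte[] encode(short type, int length, long timestamp) {
        ByteBuffer buf = ByteBuffer.allocate(14).order(ByteOrder.BIG_ENDIAN);
        buf.putShort(type).putInt(length).putLong(timestamp);
        return buf.array();
    }

    public static void main(String[] args) {
        Header h = parse(encode((short) 7, 1024, 123456789L));
        System.out.println(h.type + " " + h.length + " " + h.timestamp);
    }
}
```

Because the fields are read one at a time, a parser like this can branch on `type` before deciding what to read next.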

Page 29: Thinking in C/C++, coding in Java

I/O ops

Page 30: Thinking in C/C++, coding in Java

Dealing with slow parts (of any language)

• A common reason to fall back to “native” languages is when a large amount of I/O is involved

• I/O is dreaded as it usually translates to *nix syscalls

• A lot of syscalls exist specifically to optimize userspace/kernel space transition inefficiencies

• They also have OS idiosyncrasies

Page 31: Thinking in C/C++, coding in Java

Java & I/O

*nix & C feature | Java equivalent | Available since
Allocate char* | ByteBuffer.allocate() | 1.4
sendfile() | FileChannel.transfer{To|From} | 1.4
mmap() | FileChannel.map() | 1.4
epoll() | Selector.open() + SelectorProvider | API since 1.4, epoll as implementation since 1.6
readv()/writev() | Channel.read/write (ByteBuffer[]) | 1.4
chmod()/chown()/inotify()/stat()/copy()/symlink()/readdir/… | NIO2 file api | 1.7
SCTP | - | 1.7
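
Two rows of the table in action, as a sketch: a sendfile()-style copy with `FileChannel.transferTo` followed by an mmap()-style read with `FileChannel.map`. The class name, method names and temp file prefixes are hypothetical, and the try-with-resources syntax assumes Java 7 or later.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical demo class, not from the talk.
class NioFileDemo {
    // Writes payload to a temp file, copies it channel-to-channel, then maps the copy.
    static byte[] copyAndMap(byte[] payload) throws IOException {
        File src = File.createTempFile("nio-src", ".bin");
        File dst = File.createTempFile("nio-dst", ".bin");
        src.deleteOnExit();
        dst.deleteOnExit();

        try (RandomAccessFile in = new RandomAccessFile(src, "rw");
             RandomAccessFile out = new RandomAccessFile(dst, "rw")) {
            in.write(payload);
            FileChannel inCh = in.getChannel();
            FileChannel outCh = out.getChannel();

            // transferTo may move fewer bytes than requested, so loop until done.
            long pos = 0;
            long size = inCh.size();
            while (pos < size) {
                pos += inCh.transferTo(pos, size - pos, outCh);
            }

            // Map the copy into memory and read it back through the buffer.
            MappedByteBuffer map = outCh.map(FileChannel.MapMode.READ_ONLY, 0, outCh.size());
            byte[] result = new byte[map.remaining()];
            map.get(result);
            return result;
        }
    }

    // Unchecked wrapper, purely for calling convenience.
    static byte[] roundTrip(byte[] payload) {
        try {
            return copyAndMap(payload);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new byte[] {1, 2, 3, 4}).length);
    }
}
```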

Page 32: Thinking in C/C++, coding in Java

etc

Page 33: Thinking in C/C++, coding in Java

Not covered in the talk

• Reflection
  – Runtime inspection of types & dynamic code gen
• JIT
  – JRE profiles applications & recompiles code with optimizations mid-flight!
  – Discovers structural shortcuts possible in a given app & exploits them
• JNI
  – When you have to bridge your C code

Page 34: Thinking in C/C++, coding in Java

Go read about the following

• “maven” (awesome build mgmt tool)
• “Google guavas” (as important as boost for cpp, historically speaking)
• “Project lombok” (uses annotations to tuck away massive boilerplate coding)
• “slf4j” (log4j is so Java 1.2, never code against it)
• “netty” (the libevent of Java)

Page 35: Thinking in C/C++, coding in Java

And some more

• “testng” (unit & module testing system)
• “mockito” (helps in creating test mocks)
• “javassist” (create entire classes from strings at runtime!)
• “guice” & “Spring DI” (dependency injection)