2/14/01 RightOrder : Telegraph & Java 1 Telegraph Java Experiences Sam Madden UC Berkeley [email protected]

Telegraph Java Experiences


Page 1: Telegraph Java Experiences

2/14/01 RightOrder : Telegraph & Java 1

Telegraph Java Experiences

Sam Madden, UC Berkeley, [email protected]

Page 2: Telegraph Java Experiences

Telegraph Overview
• 100% Java
• In-memory database
• Query engine for alternative sources
    • Web
    • Sensors
• Testbed for adaptive query processing

Page 3: Telegraph Java Experiences

Telegraph & WWW: FFF
• Federated Facts and Figures
• Collect data on the election
• Based on Avnur and Hellerstein's SIGMOD '00 work: Eddies
    • Route tuples dynamically based on source loads and selectivities
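
The routing idea on this slide can be illustrated with a small sketch. It is a deliberate simplification, not the Telegraph code or the routing policy of the Eddies paper: the class names `Op` and `MiniEddy` are hypothetical, every operator is a plain filter over `int` tuples, and source load is ignored. The eddy re-decides the operator order for each tuple and tries the operator with the lowest observed pass rate first.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical operator wrapper: a filter plus running counts of tuples seen and passed.
final class Op {
    final String name;
    final IntPredicate filter;
    long seen = 0;
    long passed = 0;

    Op(String name, IntPredicate filter) {
        this.name = name;
        this.filter = filter;
    }

    // Observed selectivity; an operator that has seen nothing is assumed to pass everything.
    double selectivity() {
        return seen == 0 ? 1.0 : (double) passed / seen;
    }

    boolean apply(int tuple) {
        seen++;
        boolean ok = filter.test(tuple);
        if (ok) {
            passed++;
        }
        return ok;
    }
}

// Hypothetical eddy: chooses the operator order again for every tuple.
final class MiniEddy {
    private final List<Op> ops;

    MiniEddy(List<Op> ops) {
        this.ops = new ArrayList<>(ops);
    }

    // Returns true if the tuple survives all operators.
    boolean route(int tuple) {
        List<Op> order = new ArrayList<>(ops);
        // Lowest pass rate first, so tuples that will be dropped are dropped early.
        order.sort(Comparator.comparingDouble(Op::selectivity));
        for (Op op : order) {
            if (!op.apply(tuple)) {
                return false;
            }
        }
        return true;
    }
}
```

The result set does not depend on the order chosen; only the amount of work per tuple does, which is why the order can be adapted while the query runs.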

Page 4: Telegraph Java Experiences

fff.cs.berkeley.edu

Page 5: Telegraph Java Experiences

Architecture Overview
• Query Parser
    • JLex & CUP
• Preoptimizer
    • Chooses access paths
• Eddy
    • Routes tuples to modules

Page 6: Telegraph Java Experiences

Modules
• Doubly-pipelined hash joins
• Index joins
    • For probing into web pages
• Aggregates & group-bys
• Scans
    • Telegraph Screen Scraper: view web pages as relations
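
As an illustration of the first module type, here is a minimal sketch of a doubly-pipelined (symmetric) hash join. The class name `SymmetricHashJoin`, the `int` keys, and the `String` payloads are assumptions made for the example, not Telegraph's interfaces. Each arriving tuple is inserted into its own side's hash table and immediately probed against the other side's table, so matches are emitted as soon as both partners have arrived, whichever source is faster.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical symmetric hash join over (int key, String payload) tuples.
final class SymmetricHashJoin {
    private final Map<Integer, List<String>> left = new HashMap<>();
    private final Map<Integer, List<String>> right = new HashMap<>();

    // Insert a left tuple, then probe the right table; returns the new join results.
    List<String> insertLeft(int key, String value) {
        left.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        List<String> out = new ArrayList<>();
        for (String r : right.getOrDefault(key, Collections.<String>emptyList())) {
            out.add(value + "|" + r);
        }
        return out;
    }

    // Insert a right tuple, then probe the left table; returns the new join results.
    List<String> insertRight(int key, String value) {
        right.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        List<String> out = new ArrayList<>();
        for (String l : left.getOrDefault(key, Collections.<String>emptyList())) {
            out.add(l + "|" + value);
        }
        return out;
    }
}
```

Neither input has to be read completely before results appear, which suits slow or bursty web sources. The cost is that both hash tables stay in memory.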

Page 7: Telegraph Java Experiences

Execution Framework
• One thread per query
• Iterator model for queries
    • Experimented with thread per module
    • Linux threads are expensive
• Two memory management models
    • Java objects
    • Home-rolled byte arrays
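
The iterator model mentioned on this slide can be sketched in a few lines. The names `TupleIterator`, `ArrayScan`, and `Select` are hypothetical and tuples are reduced to `Integer` values; this only illustrates the pull-based style, in which a single query thread repeatedly calls `next()` on the root operator.

```java
import java.util.function.IntPredicate;

// Hypothetical pull-based operator interface: next() returns null when the input is exhausted.
interface TupleIterator {
    Integer next();
}

// Leaf operator: scans an in-memory array.
final class ArrayScan implements TupleIterator {
    private final int[] data;
    private int pos = 0;

    ArrayScan(int[] data) {
        this.data = data;
    }

    public Integer next() {
        if (pos < data.length) {
            return data[pos++];
        }
        return null;
    }
}

// Selection operator: pulls from its child until a tuple satisfies the predicate.
final class Select implements TupleIterator {
    private final TupleIterator child;
    private final IntPredicate pred;

    Select(TupleIterator child, IntPredicate pred) {
        this.child = child;
        this.pred = pred;
    }

    public Integer next() {
        Integer t;
        while ((t = child.next()) != null) {
            if (pred.test(t)) {
                return t;
            }
        }
        return null;
    }
}
```

Because every operator runs inside the caller's `next()` call, the whole plan needs only one thread, which avoids the per-module thread cost noted above.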

Page 8: Telegraph Java Experiences

Tuples as Java Objects
• Tuple data stored as a Java object
    • Each in a separate byte array
• Tuples copied on joins, aggregates
• Issues
    • Memory management between modules, queries
    • Garbage collector control
    • Allocation overhead
• Performance: 30,000 200-byte tuples/sec -> 5.9 MB/sec

Page 9: Telegraph Java Experiences

Tuples as Byte Array
• All tuples stored in the same byte array per query
• Surrogate Java objects

[Figure: surrogate objects, each holding an (offset, size) pair, shown with the byte array and a directory.]
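
A minimal sketch of this layout, under stated assumptions: `TupleRef` and `TupleArena` are hypothetical names, the arena has a fixed capacity, and the directory and compaction are left out. All tuple bytes for a query live in one array, and each surrogate object records only where its tuple starts and how long it is.

```java
import java.util.Arrays;

// Hypothetical surrogate: a small object that points into the shared byte array.
final class TupleRef {
    final int offset;
    final int size;

    TupleRef(int offset, int size) {
        this.offset = offset;
        this.size = size;
    }
}

// Hypothetical per-query arena: every tuple is stored in one fixed-size array.
final class TupleArena {
    private final byte[] buf;
    private int used = 0;

    TupleArena(int capacityBytes) {
        this.buf = new byte[capacityBytes];
    }

    // Copies the tuple into the arena; returns null when the query's budget is exhausted.
    TupleRef append(byte[] tuple) {
        if (used + tuple.length > buf.length) {
            return null;
        }
        System.arraycopy(tuple, 0, buf, used, tuple.length);
        TupleRef ref = new TupleRef(used, tuple.length);
        used += tuple.length;
        return ref;
    }

    // Returns a copy of the bytes the surrogate refers to.
    byte[] read(TupleRef ref) {
        return Arrays.copyOfRange(buf, ref.offset, ref.offset + ref.size);
    }

    int bytesUsed() {
        return used;
    }
}
```

The fixed capacity is the point of the design: a query can never hold more tuple data than its arena allows, which is hard to enforce when each tuple is an ordinary heap object.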

Page 10: Telegraph Java Experiences

Byte Array (cont.)
• Allows explicit control over memory per query (or module)
• Compaction eliminates garbage collection randomness
• Lower throughput: 15,000 tuples/sec
    • No surrogate object reuse
    • Synchronization costs

Page 11: Telegraph Java Experiences

Other System Pieces
• XML-based catalog
    • Java introspection helps
• Applet-based front end
• JDBC interface
• Fault tolerance / multiple servers
    • Via simple UNIX tools

Page 12: Telegraph Java Experiences

RightOrder Questions
• Performance vs. C
• JNI issues
• Garbage collection issues
• Serialization costs
• Lots of Java objects
• JDBC vs. ODI

Page 13: Telegraph Java Experiences

Performance vs. C
• JVM + JIT performance encouraging:
    • IBM JIT == 60% of Intel C compiler, faster than MSC for low-level benchmarks
    • IBM JIT 2x faster than HotSpot for Telegraph scans
    • Stability issues
• www.javalobby.org/features/jpr

Page 14: Telegraph Java Experiences

JIT Performance vs. C

[Chart: benchmark comparison of IBM JIT, optimized Intel, and optimized MS.]

Source: www.javalobby.org/features/jpr

Page 15: Telegraph Java Experiences

Performance Gotchas
• Synchronization
    • ~2x function call overhead in HotSpot
    • Used in libraries: Vector, StringBuffer
        • String allocation is the single most intensive operation in Telegraph
        • Mercator: 20% initial CPU cost
• Garbage collection
    • Java dumb about reuse
    • Mercator: 15% cost
    • OceanStore: 30 ms avg. latency, 1 s peak

Page 16: Telegraph Java Experiences

More Gotchas
• Finalization
    • Declaring methods final allows inlining
• Serialization
    • RMI, JNI use serialization
    • Philippsen & Haumacher show that it is slow

Page 17: Telegraph Java Experiences

Performance Tools
• Tools to address some issues
    • JAX, Jopt: make bytecode smaller, faster
        • www.alphaworks.ibm.com/tech/JAX
    • www.condensity.com
        • Bytecode optimizer
    • www.optimizeit.com
        • Good profiler, memory allocation and garbage collection monitor

Page 18: Telegraph Java Experiences

JNI Issues
• Not a part of Telegraph
• JNI overhead quite large (JDK 1.1.8, PII 300 MHz)

Source: Matt Welsh. A System Supporting High-Performance Communication and I/O in Java. Master's Thesis, UC Berkeley, 1999.

Page 19: Telegraph Java Experiences

More JNI
• But, this is being worked on
    • IBM JDK: 100,000 B copy in 5 ms, vs. 23 ms for 1.1.8 (500 MHz PIII)
• JNI allows synchronization (pin / unpin), thread management
    • See http://developer.java.sun.com/developer/onlineTraining/Programming/JDCBook/jni.html
• GCJ + CNI: access Java objects via C++ classes
    • http://gcc.gnu.org/java/

Page 20: Telegraph Java Experiences

Garbage Collection Performance
• Big problem: 1 s or longer to GC lots of objects
• Most Java GCs are blocking (not concurrent or multi-threaded)
• Unexpected latencies
    • OceanStore: network file server, 30 ms avg. latencies for network updates, 1000 ms peak due to GC
• In high-concurrency apps, such delays are disastrous

Page 21: Telegraph Java Experiences

Garbage Collection Cont.
• Limited control
    • Runtime.gc() only a hint
    • Runtime.freeMemory() unreliable
    • No way to disable
• No object reuse
    • Lots of unnecessary memory allocations
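
Since the runtime offers no built-in object reuse, one common workaround is an application-level pool. The sketch below is an illustration under assumptions, not something the slides describe Telegraph doing: `BufferPool` is a hypothetical name, it pools fixed-size byte buffers only, and it is not thread-safe.

```java
import java.util.ArrayDeque;

// Hypothetical pool of fixed-size byte buffers; single-threaded use only.
final class BufferPool {
    private final ArrayDeque<byte[]> free = new ArrayDeque<>();
    private final int bufferSize;
    private int allocations = 0;

    BufferPool(int bufferSize) {
        this.bufferSize = bufferSize;
    }

    // Reuses a released buffer when one is available, otherwise allocates a new one.
    byte[] acquire() {
        byte[] b = free.pollFirst();
        if (b == null) {
            allocations++;
            b = new byte[bufferSize];
        }
        return b;
    }

    // Returns a buffer to the pool; buffers of the wrong size are ignored.
    void release(byte[] b) {
        if (b.length == bufferSize) {
            free.addFirst(b);
        }
    }

    // Number of buffers actually allocated so far.
    int allocations() {
        return allocations;
    }
}
```

A loop that acquires and releases one buffer per tuple allocates only once, so it produces no garbage for the collector to find. Making the pool safe for concurrent modules would add the synchronization cost discussed earlier.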

Page 22: Telegraph Java Experiences

Serialization
• Not in Telegraph
• Philippsen and Haumacher, "More Efficient Object Serialization." International Workshop on Java for Parallel and Distributed Computing. San Juan, April 1999.
    • Serialization costs for RMI are 50% of total RMI time
    • Discard longevity for 7x speed up
• Sun serialization provides versioning
    • Complete class description stored with each serialized object
    • Most standard classes forward compatible (JDK docs note special cases)
    • See http://java.sun.com/products/jdk/1.2/docs/guide/serialization/spec/serialTOC.doc.html
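
To make the trade-off concrete, the sketch below encodes the same object twice: once with default Java serialization and once with a hand-written encoding that writes only the field values. It is not the technique from the cited paper; the class `Reading` and its fields are hypothetical. The compact form carries no class description, so it gives up versioning in exchange for size.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical value class used only for this comparison.
final class Reading implements Serializable {
    private static final long serialVersionUID = 1L;
    final int sensorId;
    final long timestamp;

    Reading(int sensorId, long timestamp) {
        this.sensorId = sensorId;
        this.timestamp = timestamp;
    }

    // Default Java serialization: stream header and class description are included.
    byte[] toDefaultBytes() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(this);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Hand-written encoding: only the two fields, 4 + 8 = 12 bytes.
    byte[] toCompactBytes() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(sensorId);
            out.writeLong(timestamp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the fields back in the order they were written.
    static Reading fromCompactBytes(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return new Reading(in.readInt(), in.readLong());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Any change to the field list breaks readers of the compact format, which is the kind of longevity the default mechanism pays for.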

Page 23: Telegraph Java Experiences

Lots of Objects
• GC issues serious
• Memory management
    • GC makes programmers allocate willy-nilly
    • Hard to partition memory space
        • Telegraph byte-array ugliness due to inability to limit usage of concurrent modules, queries

Page 24: Telegraph Java Experiences

Storage Overheads
• Java Object class is big:
    • Integer requires 23 bytes in JDK 1.3
    • int requires 4.3 bytes
• No way to circumvent object fields
• Use primitives or hand-written serialization whenever possible

Page 25: Telegraph Java Experiences

JDBC vs. ODI
• No experience with Oracle
• JDBC overheads are high, but we don't have specific performance numbers

Page 26: Telegraph Java Experiences

Bottom Line
• Java great for many reasons
    • GC, standard libraries, type safety, introspection, etc.
    • Significant reductions in development and debugging time
• Java performance isn't bad
    • Especially with some tuning
• Memory management an issue
• Lack of control over JVMs bad
    • When to garbage collect, how to serialize, etc.