Database Methods for Scientific Computing

David R. O’Hallaron

Associate Professor of CS and ECE

Carnegie Mellon University

(joint work with Tiankai Tu and Julio Lopez)

The Scientific Computing Process

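[Figure: the scientific computing pipeline. A physical model feeds mesh generation, which produces a mesh; the solver computes simulation results over time t; the results feed visualization.]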

The Euclid Project

Goal: Run large-scale physical simulations on PCs with limited physical memory.

Approach: Index and store the input and output datasets in databases, and compute on the databases directly.

Requires research at the intersection of scientific computing, algorithms, databases, and systems.

[Figure: the same pipeline with databases between the stages: physical model DB, mesh generation, mesh DBs, solver, simulation results DB (over time t), visualization.]

David O’Hallaron, Jacobo Bielak, Omar Ghattas (Carnegie Mellon)

Jonathan Shewchuk (UC Berkeley)

Steven Day (SD State)

Teora, Italy, 1980

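San Fernando Valley

[Figure: map of the San Fernando Valley region with corner labels lat. 34.08, long. -118.75 and lat. 34.38, long. -118.16; the epicenter (x) is at lat. 34.32, long. -118.48.]

San Fernando Valley (Top View)

[Figure: top view showing soft soil and hard rock regions, with the epicenter marked x.]

San Fernando Valley (Side View)

[Figure: side view showing soft soil and hard rock regions.]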

Node Distribution

Partitioned Unstructured Mesh

[Figure: a partitioned unstructured mesh, with an element and its nodes labeled.]

Simulation and Visualization

Scientific Computing with Euclid

Represent physical model, mesh, and simulation results on disk in spatial database structures called etrees (Euclid trees)

– Linear octree indexed by standard Morton-based locational codes.

– Disk pages indexed by standard B-tree indexing structure.

Perform entire process out-of-core by querying and updating the etrees.

[Figure: the pipeline expressed with etrees: physical model etree, mesh generation, mesh node and element etrees, solver, simulation results etree (over time t), visualization.]

Octree mesh generation

Balance requirement for meshes (2-to-1 constraint)

[Figure: an octree mesh with elements (octants) a through m; labels mark master nodes, slave nodes, and nodes h1 through h4.]

Octrees

[Figure: a domain with x and y axes running from 0 to 8, decomposed into octants a through m, alongside the corresponding octree.]

Linear Octrees

[Figure: octants a through m of the domain (axes from 0 to 8) stored in order in B-tree pages, beneath a B-tree index.]

Computing the locational code of octant d:

d’s left-lower corner: (2, 2)

Binary form: (010, 010)

Interleave the bits to obtain the Morton code: 001100

Append the level of d to obtain the locational code: 001100_11

Morton code: Maps n-dimensional points to one-dimensional scalars

Locational code: Appends an octant’s level to the Morton code of its left-lower corner
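As a minimal sketch of the two definitions above, the following C fragment computes a 2D Morton code by bit interleaving and then appends the level. The names morton2d and locational_code, the 3-bit coordinates, and the 2-bit level field are illustrative assumptions, not the etree library's implementation. Placing the x bit ahead of the y bit in each pair is inferred from the node keys on the Mesh Node Etree slide.

```c
#include <assert.h>
#include <stdint.h>

#define COORD_BITS 3  /* bits per coordinate: an 8 x 8 domain (assumption) */
#define LEVEL_BITS 2  /* bits used to store the level (assumption)        */

/* Interleave the low COORD_BITS bits of x and y, most significant first.
 * Within each bit pair the x bit comes first (assumed ordering). */
static uint32_t morton2d(uint32_t x, uint32_t y)
{
    assert(x < (1u << COORD_BITS) && y < (1u << COORD_BITS));
    uint32_t code = 0;
    for (int b = COORD_BITS - 1; b >= 0; b--) {
        code = (code << 1) | ((x >> b) & 1u);
        code = (code << 1) | ((y >> b) & 1u);
    }
    return code;
}

/* Append the octant's level to the Morton code of its left-lower corner. */
static uint32_t locational_code(uint32_t x, uint32_t y, uint32_t level)
{
    assert(level < (1u << LEVEL_BITS));
    return (morton2d(x, y) << LEVEL_BITS) | level;
}
```

For octant d at (2, 2) on level 3, morton2d(2, 2) yields binary 001100 and locational_code(2, 2, 3) yields binary 00110011, which matches 001100_11.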

Addressing Linear Octree Elements

An addressing scheme that clusters nearby octants

Finding an octant without knowing its locational code

The order imposed by the locational code is the same as a preorder traversal of the leaves of the octree

[Figure: octants a through m shown in the octree and in the domain, ordered by locational code.]

Nice Properties of Linear Octrees

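[Figure: the etree mesh generator pipeline. Application-specific input drives three steps, each built on the etree library: construct (produces an unbalanced octree), balance (produces a balanced octree), and transform (produces the element database and the node database).]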

Etree Mesh Generator

[Figure: components of the etree library, which sits beneath the application (e.g., construct, balance): etree API, linear octree, auto navigation, local balancing, and B-tree.]

Etree API: octant-level (insert) and octree-level (balance) operations.

Linear octree: well-known coding scheme that assigns keys to octants.

Auto navigation: new algorithm for constructing octrees automatically.

Local balancing: new algorithm that speeds up the balancing operation.

B-tree: well-known DB indexing structure.

Etree Library: A Framework In C for Manipulating Etrees on Disk

Mesh Element Etree

[Figure: a two-level octree. The root has four children (00, 01, 10, 11): leaf A, an interior octant whose children are leaves B, C, D, and E, leaf F, and leaf G.]

B-tree page (locational code keys):

Key      Octant
0000_01  A
0100_10  B
0101_10  C
0110_10  D
0111_10  E
1000_01  F
1100_01  G

Queries:

X: 0101_10 (exact hit)

Y: 1010_10 (aggregate hit)

KEY FACT: Leaf nodes and aggregated nodes can be located within a B-tree page with a fast binary search, without traversing the edges of the octree.
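The lookup described in the key fact can be sketched as a predecessor search followed by a containment test. This is an illustration under stated assumptions, not the etree library's code: keys are packed as (Morton code << 2) | level, the tree is 2D with a maximum level of 2 as in the slide's example, and the names page_search, predecessor, covers, and example_page are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_LEVEL  2  /* deepest level in the example tree (assumption) */
#define LEVEL_BITS 2  /* low bits of a key hold the level (assumption)  */
#define DIM        2  /* 2D example: two Morton bits per level          */

typedef enum { MISS, EXACT_HIT, AGGREGATE_HIT } hit_t;

/* The seven keys from the slide (A..G), written in hex. */
static const uint32_t example_page[7] =
    { 0x01, 0x12, 0x16, 0x1A, 0x1E, 0x21, 0x31 };

/* Binary search: index of the largest key <= query, or -1 if none. */
static int predecessor(const uint32_t *keys, size_t n, uint32_t query)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (keys[mid] <= query) lo = mid + 1;
        else hi = mid;
    }
    return (int)lo - 1;
}

/* Does the octant stored under `key` contain the octant `query`?
 * True when key is no deeper than query and both Morton codes agree
 * on the bits that identify key's region. */
static int covers(uint32_t key, uint32_t query)
{
    uint32_t klevel = key & ((1u << LEVEL_BITS) - 1);
    uint32_t qlevel = query & ((1u << LEVEL_BITS) - 1);
    assert(klevel <= MAX_LEVEL && qlevel <= MAX_LEVEL);
    if (klevel > qlevel) return 0;
    unsigned shift = DIM * (MAX_LEVEL - klevel);
    return ((key >> LEVEL_BITS) >> shift) == ((query >> LEVEL_BITS) >> shift);
}

/* Classify a lookup; *slot receives the index of the matching key. */
static hit_t page_search(const uint32_t *keys, size_t n,
                         uint32_t query, int *slot)
{
    *slot = predecessor(keys, n, query);
    if (*slot < 0) return MISS;
    if (keys[*slot] == query) return EXACT_HIT;
    return covers(keys[*slot], query) ? AGGREGATE_HIT : MISS;
}
```

With the seven keys from the slide, looking up X (0101_10, hex 0x16) reports an exact hit on C. Looking up Y (1010_10, hex 0x2A) reports an aggregate hit on F under this sketch's containment rule, because F is the level-1 octant whose region contains Y. No octree edges are followed in either case.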

Mesh Node Etree

B-tree leaf page 1 (Morton code keys):

Key     Node  Coordinates
000000  a     (0,0)
000100  b     (0,2)
000101  c     (0,3)
000110  d     (1,2)
000111  e     (1,3)
001000  f     (2,0)
001100  g     (2,2)

B-tree leaf page 2 (Morton code keys):

Key     Node  Coordinates
001101  h     (2,3)
010000  i     (0,4)
010010  j     (1,4)
011000  k     (2,4)
100000  l     (4,0)
100100  m     (4,2)
110000  n     (4,4)

Navigation octree

Guided by an application function

An in-memory pointer-based octree

Dynamically grows in depth-first fashion

Leaf octants are pruned and flushed to disk in preorder (in increasing locational code order)

Appends the octants to the etree database to avoid database search

[Figure legend: octants not yet processed (in memory); non-leaf octants being decomposed (in memory); leaf octants (flushed to database).]

Auto Navigation
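The idea can be sketched in a few lines of C. This is a simplified illustration, not the authors' algorithm: recursion stands in for the in-memory pointer-based navigation octree, an array stands in for the etree database, and should_refine is a hypothetical application function. The 2D domain, the key packing (Morton code << 2) | level, and all names are assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_LEVEL  2   /* finest level; the domain is 4 x 4 cells (assumption) */
#define LEVEL_BITS 2   /* low bits of a key hold the level (assumption)        */
#define MAX_LEAVES 64

/* Append-only stand-in for the etree database (hypothetical). */
static uint32_t flushed[MAX_LEAVES];
static int nflushed = 0;

/* 2D Morton code, x bit first in each pair (assumed ordering). */
static uint32_t morton2d(uint32_t x, uint32_t y)
{
    uint32_t code = 0;
    for (int b = MAX_LEVEL - 1; b >= 0; b--) {
        code = (code << 1) | ((x >> b) & 1u);
        code = (code << 1) | ((y >> b) & 1u);
    }
    return code;
}

/* Hypothetical application function: refine only the octants whose
 * left-lower corner is the origin, down to the finest level. */
static int should_refine(uint32_t x, uint32_t y, uint32_t level)
{
    return level < MAX_LEVEL && x == 0 && y == 0;
}

/* Grow the tree depth-first. A leaf is appended to the database as soon
 * as it is identified; children are visited in Morton order, so leaves
 * arrive in increasing locational code order and no search is needed. */
static void navigate(uint32_t x, uint32_t y, uint32_t level)
{
    if (!should_refine(x, y, level)) {
        assert(nflushed < MAX_LEAVES);
        flushed[nflushed++] = (morton2d(x, y) << LEVEL_BITS) | level;
        return;
    }
    uint32_t half = 1u << (MAX_LEVEL - level - 1);  /* child edge length */
    for (uint32_t c = 0; c < 4; c++)
        navigate(x + (c >> 1) * half, y + (c & 1u) * half, level + 1);
}
```

Calling navigate(0, 0, 0) appends seven leaves: four level-2 octants near the origin followed by three level-1 octants. Their keys are strictly increasing, which is why a plain append suffices.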

Operational steps

1. Partition the entire domain into equal-size blocks

2. Perform internal balancing to enforce the 2-to-1 constraint within each block (in a memory-resident blocking array)

3. Perform boundary balancing to resolve interactions between adjacent blocks

Local Balancing

Key Fact: Interactions between adjacent blocks are always absorbed by boundary octants and will not be propagated into the blocks.
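To make the 2-to-1 constraint concrete, the sketch below checks it inside one block. It only detects violations and does not perform the balancing itself. The layout of the blocking array is an assumption made for illustration: each finest-level cell stores the level of the leaf octant that covers it, and only edge-adjacent cells are compared. The names paint and is_balanced are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define N 8   /* the block is N x N finest-level cells (assumption) */

/* Record that the square octant with left-lower cell (x0, y0) and edge
 * length `size` cells is a leaf on level `lvl`. */
static void paint(int level[N][N], int x0, int y0, int size, int lvl)
{
    assert(x0 >= 0 && y0 >= 0 && x0 + size <= N && y0 + size <= N);
    for (int x = x0; x < x0 + size; x++)
        for (int y = y0; y < y0 + size; y++)
            level[x][y] = lvl;
}

/* Return 1 if every pair of edge-adjacent cells is covered by leaves
 * whose levels differ by at most one, and 0 otherwise. Neighbors are
 * found by array indexing rather than by tree traversal. */
static int is_balanced(int level[N][N])
{
    for (int x = 0; x < N; x++) {
        for (int y = 0; y < N; y++) {
            if (x + 1 < N && abs(level[x][y] - level[x + 1][y]) > 1) return 0;
            if (y + 1 < N && abs(level[x][y] - level[x][y + 1]) > 1) return 0;
        }
    }
    return 1;
}
```

For example, a block of four level-1 octants in which one quadrant is refined to level 2 passes the check. Refining one of those level-2 octants to level 3 next to a level-1 neighbor makes the check fail, which is the situation balancing has to repair.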

Is etree mesh generation feasible?

How does running time vary with the physical memory size?

What is the performance impact of auto navigation?

What is the performance impact of local balancing?

Some Evaluation Questions

Used etree mesh generator to build family of finite element meshes for San Fernando Valley earthquake ground motion simulations.

Mesh   Elements     Nodes        Slave nodes
SF10   7,940        12,118       4,432
SF5    76,330       105,886      34,858
SF2    1,838,524    2,213,035    407,336
SF1    13,579,124   15,097,365   1,649,855

Evaluation Methodology

SFx : A mesh of the 50 km x 50 km x 12 km San Fernando Valley that resolves seismic waves with periods of at most x seconds.

All experiments conducted on a PIII 1GHz machine running Linux 2.4.17.

Machine’s physical memory for the experiments ranged from 128 MB to 880 MB.

Before each experiment, two 1.5 GB files were sequentially scanned to ensure that the operating system’s buffer cache was flushed.

Evaluation Setup

Mesh   Elements     DB size (MB)   Time (sec)   Throughput (elem/s)
SF10   7,940        2.5            40           199
SF5    76,330       24             186          410
SF2    1,838,524    583            1,637        1,123
SF1    13,579,124   4,300          9,449        1,439

All experiments performed with 128 MB physical memory

Etree Feasibility

– Generating a mesh with 13.6 million elements and of size 4.3 GB in 2.6 hours seems reasonable

– The overall throughput increases with mesh size

– Memory size does not have a significant impact on the running time

– The etree method is not relying on the operating system’s internal caching mechanism to achieve its performance

[Figure: stacked bar chart of the running time breakdown (0% to 100%) by step (construct, balance, transform, query, findslave) for SF10, SF5, SF2, and SF1 at memory sizes of 128, 256, 512, and 880 MB.]

Impact of Physical Memory Size

[Figure: construction time in ms (log scale, 1.0E+00 to 1.0E+06) versus B-tree buffer size in KB (512 to 64,000) for sf1, sf2, sf5, and sf10.]

– Reducing B-tree buffer size does not increase the construction time

– Auto navigation is not sensitive to B-tree buffer size

Impact of Auto Navigation

[Figure: balance time in ms (log scale, 1.0E+02 to 1.0E+08) versus blocking array size in KB (0, 262, 2,097, 16,777) for sf1, sf2, sf5, and sf10.]

– Achieves speedups ranging from 8 (SF1) to 28 (SF10)

– Benefits from the one-time scan of the database and the efficient array-based neighbor finding algorithm

Impact of Local Balancing

General octree algorithms: Samet 90

Octree mesh: Shephard & Georges 91, Bern et al. 90, Young et al. 91, Wang 99

Out-of-core octree solver method: Salmon 97

Linear quadtree: Gargantini 82, Morton 66

Space filling curve: Orenstein 84, Orenstein 86, Faloutsos & Roseman 89

Large dataset processing: Freitag & Loy 99, Seamons & Winslett 96, Ferreira et al. 99, Kurc et al. 01, Choudhary et al. 99, Parashar & Browne 97

Some Related Work

Summary and Conclusions

Euclid project aims to recast entire scientific computing process in terms of database ops.

Incorporating existing database techniques (linear octree and B-tree) with new algorithms (auto navigation and local balancing) in a unified framework (the etree) can deliver new capabilities.

On the horizon:

– Caching and prefetching for etree solver

– Remote access and derived value caching for visualization

– Parallel visualization system based on etrees

– Unstructured tetrahedral mesh generation using R-trees.

Unix file I/O style, three levels of abstraction:

Initialization and cleanup. e.g.,

etree_t *etree_open(const char *path, int flag, …);

Octant-level operations. e.g.,

int etree_insert(etree_t *ep, location_t loc, void* value);

Octree-level operations. e.g.,

int etree_balance(etree_t *ep, decom_t *baldecom);

Etree API
