14
U.S. Department of Energy Contract No. DE-AC03-76SF00098 Retrieving Objects from Toriodal Mesh Data Using FastBit Technology – A Progress Report Outline Overview of FastBit technology Recent progresses John Wu Scientific Data Management, Berkeley Lab http://sdm.lbl.gov/fastbit

Retrieving Objects from Toriodal Mesh Data Using FastBit Technology – A Progress Report

  • Upload
    bardia

  • View
    43

  • Download
    0

Embed Size (px)

DESCRIPTION

Retrieving Objects from Toriodal Mesh Data Using FastBit Technology – A Progress Report. Outline Overview of FastBit technology Recent progresses. John Wu Scientific Data Management, Berkeley Lab. http://sdm.lbl.gov/fastbit. FastBit Started In a Big Smash. - PowerPoint PPT Presentation

Citation preview

Page 1: Retrieving Objects from Toriodal Mesh Data Using FastBit Technology – A Progress Report

U.S. Department of Energy Contract No. DE-AC03-76SF00098

Retrieving Objects from Toriodal Mesh Data Using FastBit Technology

– A Progress Report

OutlineOverview of FastBit technology

Recent progresses

John WuScientific Data Management, Berkeley Lab

http://sdm.lbl.gov/fastbit

Page 2: Retrieving Objects from Toriodal Mesh Data Using FastBit Technology – A Progress Report

FastBit 2

FastBit Started In a Big Smash

Searching for clues to Quark-Gluon Plasma in a large set of high-energy collisions High-Energy Physics

experiment STAR 600 participants / 50

institutions / 12 countries Data rate 200 MB/s Data collected 5 PB ~ 1 Billion collision events, 5

MB per event (equivalent to having millions of variables)

Challenge: finding 100 or so events with the best evidence of QGP

Page 3: Retrieving Objects from Toriodal Mesh Data Using FastBit Technology – A Progress Report

FastBit 3

FastBit 10x Faster than DBMS

Queries on 12 most queried attributes (2.2 million records) from STAR High-Energy Physics Experiment, average attribute cardinality 222,000

Experiments confirm that: WAH compressed indexes are 10X faster than bitmap indexes from a

DBMS, 5X faster than our own implementation of BBC Size of WAH compressed indexes is only 30% of raw data size (a B+-

tree index from a popular DBMS system is 3-4X)

2-D queries 5-D queries

[Wu, Otoo, Shoshani 2001]

Page 4: Retrieving Objects from Toriodal Mesh Data Using FastBit Technology – A Progress Report

FastBit 4

FastBit Grew with a Big Boom

Searching for a more fuel efficient combustion engine (Homogeneous-Charge Compression Ignition engine) Require detailed

numerical simulation with hundreds of variables

Simulation mesh: 1000 x 1000 x 1000

1000s time steps per simulation

Challenge: finding and tracking ignition kernels

Page 5: Retrieving Objects from Toriodal Mesh Data Using FastBit Technology – A Progress Report

FastBit 5

FastBit Finds Volumes Faster Than Best Isocontour Finder

FastBit finds volume of interest efficiently with compressed representation of the volume

FastBit identifies volumes of interest as efficient as the best algorithm that identify the surface only (isocontouring), in theory

FastBit is three times faster than the best isocontouring algorithm in VTK

0

0.1

0.2

0.3

Tim

e [

se

c]

vtkKitwareContourFilter DEX

3X

[Wu, Koegler, Chen, Shoshani 2003]

[Stockinger, Shalf, Bethel, Wu 2005]

Page 6: Retrieving Objects from Toriodal Mesh Data Using FastBit Technology – A Progress Report

FastBit

FastBit Milestones

2007/08: FastBit speed up drug discovery tool (first publication not involving any FastBit developers)

2007/08: First public release, version a0.7 2007/06: Physical design reviewed 2007/06: First PhD thesis involving FastBit completed 2006/03: Prove formal optimality 2006/02: Work on Enron data made headline at PRIMEUR 2005/05: Appeared in ACM TechNews 2005/05: Grid Collector wins ISC Award 2005/01: CRD news report on FastBit 2004/12: WAH patent issued

Page 7: Retrieving Objects from Toriodal Mesh Data Using FastBit Technology – A Progress Report

U.S. Department of Energy Contract No. DE-AC03-76SF00098

FastBit Progress Report

Two-level encoding Feature identification on toroidal mesh

http://sdm.lbl.gov/fastbit

Page 8: Retrieving Objects from Toriodal Mesh Data Using FastBit Technology – A Progress Report

FastBit

Two Levels Are Better Than One

Most commonly used bitmap index is one-level equality encoded (e1)

Multi-level encoding was postulated to possibly improve query performance [Wu, Otoo, Shoshoni, 2000] [Sinha, Winslett, 2007]

Through extensive analyses, we found the correct number of coarse level bins to use, and ensure that the two-level encoding always perform better [Wu, Stockinger, Shoshani]

bn = binary encodinge1 = one-level equalityee = equality-equalityre = range-equalityie = interval-equality

Page 9: Retrieving Objects from Toriodal Mesh Data Using FastBit Technology – A Progress Report

FastBit

Feature Identification on Toroidal Mesh

Defines connectivity based on the distances computed from (x, y, z) coordinates

Two ways to speed up the feature identification work with lines instead of points use an efficient connected component labeling algorithm

10 – 100 times faster than working with points [Sinha, Winslett, Wu]

Page 10: Retrieving Objects from Toriodal Mesh Data Using FastBit Technology – A Progress Report

FastBit

Better Approach – Redefine Connectivity

Redefine connectivity based on toroidal coordinates Node A is connect to B and C on the same circle To D and E on the circle just below the current one in the same plane To F and G on the circle of the same radius in the plane just before By symmetry, there are four more points on circles above and after A total of 10 neighbors for every node – more than previous approach

Advantages of such connectivity definition Neighbors of consecutive nodes on a circle, i.e., arc, also form arcs These neighboring arcs fall on four different circles Our labeling algorithm examines only two out of four circles

Page 11: Retrieving Objects from Toriodal Mesh Data Using FastBit Technology – A Progress Report

FastBit

New Connectivity Improves Region Finding

1.E-04

1.E-03

1.E-02

1.E-01

1.E+00

1.E+01

1.E+02

1.E+03

1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07

num of nodes

tim

e (

s)

XYZ torus - 1 torus - 2

Preliminary results Three different labeling methods shown

XYZ: a nearest-neighbor mesh constructed from (x, y, z) coordinates

Torus – 1: connectivity described on previous page, label nodes Torus – 2: connectivity described previously, label arcs

Speedup = ratio of total time used by two methods

SpeedupTorus – 1 v. XYZ 25Torus – 2 v. XYZ 150Torus – 2 v. Torus – 1 6

Page 12: Retrieving Objects from Toriodal Mesh Data Using FastBit Technology – A Progress Report

FastBit

New Approach Scales Well

Approach torus – 1 scales linearly with the number of nodes in the regions of interest

Approach torus – 2 scales linearly with the number of arcs in the regions on interests

Number of arcs <= number of nodes on the boundaries of the regions

O(|arcs|) O(|boundary|) For regions defined with simple

range conditions such “potential >= 1e-8”, where the boundaries of the regions are isocontours, approach torus – 2 scales as well as the best isocontouring algorithms

Need formal proof

torus - 1

y = 2E-06x

0

2

4

6

8

10

12

14

16

18

0.E+00 2.E+06 4.E+06 6.E+06 8.E+06 1.E+07

number of nodes

lab

el t

ime

torus - 2

y = 3E-06x

0

0.5

1

1.5

2

2.5

3

3.5

0.0E+00 2.0E+05 4.0E+05 6.0E+05 8.0E+05 1.0E+06 1.2E+06

number of arcs

lab

el t

ime

Page 13: Retrieving Objects from Toriodal Mesh Data Using FastBit Technology – A Progress Report

FastBit

Future Plans

GTC data Wrap up the current work on 3D GTC data Prepare for new 5D data Add visualization front-end Work with particles

FastBit software Python API?

Other applications Visualization ?

Page 14: Retrieving Objects from Toriodal Mesh Data Using FastBit Technology – A Progress Report

U.S. Department of Energy Contract No. DE-AC03-76SF00098

Contact Information

FastBit website http://sdm.lbl.gov/fastbit

John’s email [email protected]

Arie’s email [email protected]

FastBit is an efficient searching tool for data-driven science. Key techniques in FastBit have been extensively exercised. If you have an application that requires searching operations, feel

free to contact us.