Lecture 1 Introduction to Data Centric Computation

Lecture 1 - University of California, San Diego


Page 1: Lecture 1 - University of California, San Diego

Lecture 1
Introduction to Data Centric Computation

Page 2: Lecture 1 - University of California, San Diego

3/31/2009  Scott B. Baden / CSE 262 / SP '09

Welcome to CSE 262!

• Your instructor is Scott B. Baden, baden@ucsd.edu
• Office: room 3244 in EBU3B
• Office hours
  – Mon 2-4, or by appointment
• The class home page is http://www.cse.ucsd.edu/classes/sp09/cse262

Page 3: Lecture 1 - University of California, San Diego


Course Requirements

• Class participation and preparation: 20%
  – Be prepared to discuss the assigned readings
  – Write-ups due at start of class (12)
• In class presentation of readings: 10%
• Project: 70%
  – Includes proposal, progress report and in class presentation, final report

Page 4: Lecture 1 - University of California, San Diego


Policies

• Academic Integrity
  – Do your own work
  – Plagiarism and cheating will not be tolerated
• By taking this course, you implicitly agree to abide by the following course policies: http://www.cse.ucsd.edu/classes/sp09/cse262/Admin.html

Page 5: Lecture 1 - University of California, San Diego


Background

• Common Background
  – Programming and Data structures
• Differing backgrounds
  – Prior experience in parallel computation?
  – GPUs/Cell?
  – Numerical analysis?
  – C/C++, Java, Fortran? Python? Matlab?

Page 6: Lecture 1 - University of California, San Diego


Background Markers

• Navier Stokes Equations
• Sparse factorization
• TLB misses
• Multithreading
• RPC
• Abstract base class
• Amdahl’s law
• Relational or Spatial Join

$\nabla \cdot u = 0$

$\frac{D\rho}{Dt} + \rho(\nabla \cdot v) = 0$

$\nabla \times u$

$f(a) + \frac{f'(a)}{1!}(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \cdots$

Page 7: Lecture 1 - University of California, San Diego


Motivation

• Technical obstacles are limiting our ability to derive knowledge from numerical simulations to understand technologically important phenomena
• Exponential growth in datasets
  – Produced by numerical simulations
  – Data generated by instruments
  – Media
• Exponential increase in hardware capability
  – Multicore processors
  – Clusters
  – Networking

Page 8: Lecture 1 - University of California, San Diego


Observation

• The more we know about how the data was generated
• … the context in which the data will be used
• … the more we can employ that knowledge to improve productivity

Page 9: Lecture 1 - University of California, San Diego


Syllabus

• Techniques, tools and algorithms to improve our ability to derive knowledge from data or media
• Spatial query techniques and access methods
• Feature extraction and tracking
• Data compression
• Wavelets, image processing
• Hardware acceleration (i.e. Cell BE and GPUs)
• Applications
  – Simulation data sets
  – Media/observational

Page 10: Lecture 1 - University of California, San Diego


Scheduling

• Project Symposium
  – June 4th-5th
  – No class on 6/2
• Some lectures to be rescheduled
  – Monday 5/19 → Friday 5/8 (In class presentations of progress reports)
  – Tuesday 5/21: Friday 5/15
  – Tuesday 5/26 → 6/5

Page 11: Lecture 1 - University of California, San Diego


Hardware platforms

• Abe
  – Dell cluster at NCSA
  – Lincoln: attached NVIDIA Tesla
• Cell Broadband Engine (Georgia Tech)
• If you want to use Cell BE, fill out an account request form: https://cell-web.cc.gatech.edu/signup/index.pl

Page 12: Lecture 1 - University of California, San Diego


Motivating Applications

Page 13: Lecture 1 - University of California, San Diego


Computational fluid dynamics

• Direct Numerical Simulation of horizontal shear
• Sutanu Sarkar, UCSD MAE Department
• 80 Terabytes of data per simulation
  – 1024 Cray XT-4 processors for 48 hours
  – 2 × 10^9 unknowns, 10K time steps [4096×1024×512]
  – 1000 snapshots @ 80 × 10^9 bytes

Horizontal spanwise vorticity (mag) using DNS, [Basak & Sarkar, JFM 2006]

Page 14: Lecture 1 - University of California, San Diego


The Input Data

• Time dependent simulation
• Each snapshot
  – 5 components: velocity (vector), density, pressure
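The storage figures quoted for this simulation can be cross-checked with a little arithmetic. The sketch below assumes that each of the 5 components is stored as an 8-byte double-precision value at every grid point; the slides do not state the precision, so that is an assumption, and the variable names are illustrative only.

```python
# Rough cross-check of the DNS storage figures quoted on the slides.
# Assumption (not stated in the slides): 8 bytes per stored value.
NX, NY, NZ = 4096, 1024, 512      # grid dimensions from the slide
COMPONENTS = 5                    # velocity (3), density, pressure
BYTES_PER_VALUE = 8               # assumed double precision
SNAPSHOTS = 1000                  # snapshot count from the slide

grid_points = NX * NY * NZ
snapshot_bytes = grid_points * COMPONENTS * BYTES_PER_VALUE
total_bytes = snapshot_bytes * SNAPSHOTS

print(f"grid points:        {grid_points:.3e}")
print(f"bytes per snapshot: {snapshot_bytes:.3e}")
print(f"total bytes:        {total_bytes:.3e}")
```

With the grid size rounded to 2 × 10^9 points, the same product gives 80 × 10^9 bytes per snapshot and 80 terabytes per run, which agrees with the quoted figures under the stated assumption.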

Page 15: Lecture 1 - University of California, San Diego


The end-to-end application

• Identify vortex cores using the “∆ criterion” > 1.75×10^-4
• Compute ∆ directly on a compressed version of the dataset
• Compute quantities conditioned on ∆

F(x; condition)
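The slide gives a threshold for ∆ but not its definition. A commonly used form for incompressible flow, often attributed to Chong, Perry and Cantwell, builds ∆ from the invariants of the velocity gradient tensor A = ∇u: Q = -tr(A²)/2, R = det(A), and ∆ = (Q/3)³ + (R/2)². The sketch below assumes that this is the quantity meant; the helper names and the two sample tensors are hypothetical, and the normalization behind the slide's threshold is not known.

```python
# Sketch of a point-wise Delta criterion, assuming incompressible flow and
# the definition Delta = (Q/3)**3 + (R/2)**2 (an assumption; the slide
# gives only the threshold).

def matmul3(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def det3(a):
    """Determinant of a 3x3 matrix."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def delta(grad_u):
    """Delta for a velocity gradient tensor grad_u[i][j] = du_i/dx_j."""
    a2 = matmul3(grad_u, grad_u)
    q = -0.5 * (a2[0][0] + a2[1][1] + a2[2][2])   # second invariant
    r = det3(grad_u)                               # third invariant; its sign drops out when squared
    return (q / 3.0) ** 3 + (r / 2.0) ** 2

THRESHOLD = 1.75e-4  # threshold quoted on the slide

rotation = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # rigid rotation
strain = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 0.0]]    # pure strain

print(delta(rotation) > THRESHOLD)  # True: rotation-dominated point
print(delta(strain) > THRESHOLD)    # False: strain-dominated point
```

In a real pipeline the gradient tensor would be estimated at every grid point by finite differences, and the test would be applied point by point to mark candidate vortex-core cells.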

Page 16: Lecture 1 - University of California, San Diego


Working with sparse representations

• Controlled operations on irregular multiblock structures
• Geometric abstractions, e.g. KeLP, Chombo, Titanium
• TeraLab, Faisal Mir, KTH Sweden

With William Kerney, Peter Diamessis, and Keiko Nomura (UCSD) [2002]

mega58

Page 17: Lecture 1 - University of California, San Diego


Dynamics

• Continuation: the overlap between a feature and a descendant or an ancestor
• Creations and dissipations are not continuations

[Figure: features labeled A and B]
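One way to read these definitions is sketched below. It assumes that each feature is stored as a set of grid-cell indices at a given time step, and that two features in consecutive steps are linked by a continuation when their cell sets intersect. The slide does not say how features are represented, so the representation, the function name `classify` and the sample features are all hypothetical.

```python
# Sketch: classify features between two consecutive time steps by overlap.
# Assumption: a feature is a set of (i, j, k) grid cells; two features are
# linked by a continuation when their cell sets intersect.

def classify(prev_features, next_features):
    """Return (continuations, dissipations, creations) between two steps."""
    continuations = [
        (p, n)
        for p, p_cells in prev_features.items()
        for n, n_cells in next_features.items()
        if p_cells & n_cells                     # non-empty overlap
    ]
    has_descendant = {p for p, _ in continuations}
    has_ancestor = {n for _, n in continuations}
    # A feature with no descendant has dissipated; one with no ancestor is new.
    dissipations = sorted(set(prev_features) - has_descendant)
    creations = sorted(set(next_features) - has_ancestor)
    return continuations, dissipations, creations

step_t = {"A": {(0, 0, 0), (1, 0, 0)}, "B": {(5, 5, 0)}}
step_t1 = {"A": {(1, 0, 0), (2, 0, 0)}, "C": {(9, 9, 0)}}

links, died, born = classify(step_t, step_t1)
print(links)  # [('A', 'A')]
print(died)   # ['B']
print(born)   # ['C']
```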

Page 18: Lecture 1 - University of California, San Diego


How large can a dataset get?

• “TeraShake” Ground Fault Simulator
• Seismic waves generated by large earthquakes that occur on the southern San Andreas Fault
• 43 Terabytes of data
  – 240 IBM Power4 processors for 5 days
  – 2000 snapshots @ 21.1×10^9 bytes
  – Stored on tape in SDSC’s Storage Resource Broker (SRB) http://www.sdsc.edu/SCEC/
• Southern California Earthquake Center’s Community Modeling Environment Project (SCEC/CME), NSF http://www.scec.org/cme
  – USC, Scripps Inst. for Oceanography, San Diego Supercomputer Center (SDSC), SDSU ...

Page 19: Lecture 1 - University of California, San Diego


TeraShake

Simulates a magnitude 7.7 earthquake along the southern San Andreas fault close to LA using seismic, geophysical, and other data from the Southern California Earthquake Center

How it works:

1. Divide up Southern California into “blocks”
2. For each block, get all the data on ground surface composition, geological structures, fault information, etc.

Slide Courtesy of DataCentral@SDSC

Page 20: Lecture 1 - University of California, San Diego


Animation

3. Map the blocks onto processors of the supercomputer
4. Run the simulation using current information on fault activity and the physics of earthquakes

[Figure: SDSC Machine Room, SDSC’s DataStar]

Slide Courtesy of DataCentral@SDSC
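Steps 1 to 4 describe a block decomposition. A minimal sketch of the mapping in step 3 follows, assuming a one-dimensional split into contiguous, nearly equal blocks with one block per processor. The slides do not describe how TeraShake actually assigns blocks, so this is only an illustration, and `block_ranges` is a hypothetical helper.

```python
def block_ranges(n_points, n_procs):
    """Split range(n_points) into n_procs contiguous half-open ranges.

    The first (n_points % n_procs) blocks get one extra point, so block
    sizes differ by at most one.
    """
    base, extra = divmod(n_points, n_procs)
    ranges = []
    start = 0
    for rank in range(n_procs):
        size = base + (1 if rank < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges

print(block_ranges(10, 3))  # [(0, 4), (4, 7), (7, 10)]
```

A production code would normally split in all three dimensions to reduce the surface each block shares with its neighbors; the one-dimensional form is used here only because it is the shortest to show.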

Page 21: Lecture 1 - University of California, San Diego


Requirements

• 47 TB output data for 1.8 billion grid points
• Continuous I/O 2GB/sec
• 240 procs on SDSC Datastar, 5 days, 1 TB of main memory
• Parallel file system
• Data parking of 100s of TBs for many months
• Large memory Nodes with 256 GB of DS for pre-processing and post visualization
• 10-20 TB data archived a day

The next generation simulation will require even more resources: Researchers plan to double the temporal/spatial resolution of TeraShake

Slide Courtesy of DataCentral@SDSC

Page 22: Lecture 1 - University of California, San Diego


The Role of Compression

• If the information content of a dataset is low, compression can help us …
  – increase resolution for a given storage budget
  – discover knowledge hidden within the data

Diamessis, Kerney, Nomura, Baden [Para ‘02]

Page 23: Lecture 1 - University of California, San Diego


Data Compression

• Lossless
  – Perfect reconstruction, not more than 2:1 compression
  – RLE, Huffman coding, LZ77, DEFLATE (gzip)
• In practice: lossy
  – Wavelet Compression (WC)
  – Feature Extraction
  – Introduces artifacts into the data
  – Adaptive Compression (with Hromadka, Unat, Shafaat)
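To make the lossy case concrete, here is a minimal sketch of wavelet-style compression using a one-level Haar transform with a hard threshold on the detail coefficients. It is not the scheme used in the work cited on these slides; the function names, the sample signal and the threshold are all illustrative assumptions.

```python
# Sketch of lossy compression with a one-level Haar wavelet transform.
# Assumptions: an even-length 1-D signal and a hard threshold on the
# detail coefficients; real wavelet coders use several levels and then
# entropy-code the result.

def haar_forward(signal):
    """Return (averages, details) for consecutive pairs of samples."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    averages = [(a + b) / 2.0 for a, b in pairs]
    details = [(a - b) / 2.0 for a, b in pairs]
    return averages, details

def haar_inverse(averages, details):
    """Rebuild the signal from averages and details."""
    signal = []
    for avg, det in zip(averages, details):
        signal.extend([avg + det, avg - det])
    return signal

def compress(signal, threshold):
    """Zero out detail coefficients smaller in magnitude than threshold."""
    averages, details = haar_forward(signal)
    kept = [d if abs(d) >= threshold else 0.0 for d in details]
    return averages, kept

data = [4.0, 4.0, 5.0, 5.2, 9.0, 1.0]
averages, kept = compress(data, threshold=0.5)
restored = haar_inverse(averages, kept)
max_error = max(abs(x - y) for x, y in zip(data, restored))

print(kept)       # small details are dropped, the large one survives
print(restored)   # close to data but not identical: the loss is the artifact
print(max_error)
```

The zeroed coefficients are what make the representation cheap to store, and the nonzero reconstruction error is the kind of artifact the slide refers to. Without the threshold, `haar_inverse(*haar_forward(data))` recovers the original signal up to rounding.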

Page 24: Lecture 1 - University of California, San Diego


Technology

• Multicore processors may enable us to continue the trend of exponential growth in processing speeds
• Workstations or clusters with accelerators
  – Roadrunner: with STI Cell BE (IBM and Los Alamos)
  – Lincoln: with Nvidia Tesla (NCSA and Nvidia)
• Distributed computation
  – “Cloud” and “Grid” computing

http://computing.llnl.gov/tutorials/ibm_sp (IBM)

Page 25: Lecture 1 - University of California, San Diego


Today’s laptop would have been yesterday’s supercomputer

• Cray-1 Supercomputer
• 80 MHz processor
• 8 Megabytes memory
• Water cooled
• 6 feet H x 7 feet W
• 4 tons
• Over $10M in 1976

• MacBook
• 2.4GHz Intel Core 2 Duo
• 2 GBytes DDR2 SDRAM, 4 Megabyte shared cache
• Air cooled
• 1.1 × 12.8 × 8.9 inches
• 5.0 pounds (2.3 kg)
• $1299 in 2008

Page 26: Lecture 1 - University of California, San Diego


The processor-memory gap

[Figure: log-scale plot (10^0 to 10^5) versus year (1980 to 2005) with two curves: Processor and Memory (DRAM)]

Page 27: Lecture 1 - University of California, San Diego


An important universal: the locality principle

• Programs generally exhibit two forms of locality when accessing memory
  – Temporal locality (time)
  – Spatial locality (space)
• Often involves loops

    for t = 0 to T-1
        for i = 1 to N-2
            u[i] = (u[i-1] + u[i+1] - 8*h*h) / 2
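A runnable version of the loop above is sketched here. The slide does not give T, N, h or the initial contents of u, so the values below are placeholders, and the function name `relax` is hypothetical.

```python
# Runnable sketch of the slide's loop nest. T, N, h and the initial u are
# placeholder values, not taken from the slide.

def relax(u, h, steps):
    """Apply the slide's in-place update `steps` times to the interior of u."""
    n = len(u)
    for _ in range(steps):            # temporal locality: u is reused on every sweep
        for i in range(1, n - 1):     # spatial locality: u[i-1], u[i], u[i+1] are adjacent
            u[i] = (u[i - 1] + u[i + 1] - 8 * h * h) / 2
    return u

u = [0.0] * 8
u[-1] = 1.0            # arbitrary boundary value
relax(u, h=0.1, steps=3)
print(u)
```

Each sweep touches neighboring elements of `u` in order, and the outer loop revisits the same array on every time step, so the array benefits from staying in cache if it fits.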

Page 28: Lecture 1 - University of California, San Diego


Re-use

• Memory hierarchies rely on locality to improve memory access times through re-use
• Underlying principles for improving cache re-use apply to stored data, parallel memory hierarchies and hardware accelerators

Page 29: Lecture 1 - University of California, San Diego


STI Cell Broadband Engine

[Figure (IBM): Cell BE block diagram. One PPE (64-bit Power Architecture with VMX; PPU, PXU, L1, L2) and eight SPEs (each with SPU, SXU, LS and MFC) attached to the EIB (up to 96B/cycle), together with the MIC to Dual XDR(TM) memory and the BIC to FlexIO(TM). Link labels: 16B/cycle, 16B/cycle (2x), 32B/cycle.]

Page 30: Lecture 1 - University of California, San Diego


NVIDIA GeForce GTX 280

• 240 cores @ 1.296 GHz
• Parallel computing or graphics mode
• 1 GB memory
• 512 bit memory interface @ 141.7 GB/s