Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
Lec
ture
1
Intr
oduct
ion to D
ata
Cen
tric
Com
puta
tion
3/3
1/2
009
Sco
tt B
. B
aden
/ C
SE
262 / S
P '0
92
Wel
com
e to
CSE
262!
•Y
our
inst
ruct
or
is S
cott
B. B
aden
bad
en@
ucs
d.e
du
•O
ffic
e: r
oom
3244 in E
BU
3B
•O
ffic
e hours
–M
on 2
-4, or
by a
ppoin
tmen
t
•T
he
clas
s hom
e pag
e is
http://www.cse.ucsd.edu/classes/sp09/cse262
3/3
1/2
009
Sco
tt B
. B
aden
/ C
SE
262 / S
P '0
93
Cours
e R
equir
emen
ts
•C
lass
par
tici
pat
ion a
nd p
repar
atio
n: 20%
–B
e pre
par
ed to d
iscu
ss the
assi
gned
rea
din
gs
–W
rite
-ups
due
at s
tart
of
clas
s (1
2)
•In
cla
ss p
rese
nta
tion o
f re
adin
gs:10%
•Pro
ject
: 70%
–In
cludes
pro
posa
l, p
rogre
ss r
eport
and in c
lass
pre
senta
tion, fi
nal
rep
ort
3/3
1/2
009
Sco
tt B
. B
aden
/ C
SE
262 / S
P '0
94
Polici
es
•A
cadem
ic I
nte
gri
ty
–D
o y
ou o
wn w
ork
–P
lagia
rism
and c
hea
ting w
ill not be
tole
rate
d
•B
y tak
ing this
cours
e, y
ou im
plici
tly a
gre
e
to a
bid
e by the
follow
ing the
cours
e police
s:http://www.cse.ucsd.edu/classes/sp09/cse262/Admin.html
http://www.cse.ucsd.edu/classes/sp09/cse262/Admin.html
http://www.cse.ucsd.edu/classes/sp09/cse262/Admin.html
http://www.cse.ucsd.edu/classes/sp09/cse262/Admin.html
3/3
1/2
009
Sco
tt B
. B
aden
/ C
SE
262 / S
P '0
95
Bac
kgro
und
•C
om
mon B
ackgro
und
–P
rogra
mm
ing a
nd D
ata
stru
cture
s
•D
iffe
ring b
ackgro
unds
–P
rior
exper
ience
in p
aral
lel co
mputa
tion?
–G
PU
s/C
ell?
–N
um
eric
al a
nal
ysi
s?
–C
/C+
+Ja
va
Fort
ran?
Pyth
on?
Mat
lab?
3/3
1/2
009
Sco
tt B
. B
aden
/ C
SE
262 / S
P '0
96
Bac
kgro
und M
arker
s
•N
avie
rS
tokes
Equat
ions
•S
par
se f
acto
riza
tion
•T
LB
mis
ses
•M
ult
ithre
adin
g
•R
PC
•A
bst
ract
bas
e cl
ass
•A
mdah
l’s
law
•R
elat
ional
or
Spat
ial Jo
in
∇•u=
0
Dρ
Dt+ρ∇•v
()=
0
∇×u
f(a
)+
′ f
(a)
1!(x−a)+
′ ′ f
(a)
2!(x−a)2+
...
3/3
1/2
009
Sco
tt B
. B
aden
/ C
SE
262 / S
P '0
97
3/3
1/2
009
7
Motivat
ion
•T
echnic
al o
bst
acle
sar
e li
mit
ing o
ur
abil
ity to
der
ive
know
ledge
from
num
eric
al s
imula
tions
to
under
stan
dte
chnolo
gic
ally
im
port
antphen
om
ena
•E
xponen
tial
gro
wth
in
dat
aset
s
–Pro
duce
d b
y n
um
eric
al s
imula
tions
–D
ata
gen
erat
ed b
y inst
rum
ents
–M
edia
•E
xponen
tial
incr
ease
in h
ardw
are
capab
ility
–M
ultic
ore
pro
cess
ors
–C
lust
ers
–N
etw
ork
ing
3/3
1/2
009
Sco
tt B
. B
aden
/ C
SE
262 / S
P '0
98
3/3
1/2
009
8
Obse
rvat
ion
•T
he
more
we
know
about th
e how
the
dat
a w
as g
ener
ated
…
•…
the
conte
xt in
whic
h the
dat
a w
ill be
use
d
•…
the
more
we
can e
mplo
y that
know
ledge
to im
pro
ve
pro
duct
ivity
3/3
1/2
009
Sco
tt B
. B
aden
/ C
SE
262 / S
P '0
99
Syllab
us
•T
echniq
ues
, to
ols
and a
lgori
thm
s to
im
pro
ve
our
abil
ity to d
eriv
e know
ledge
from
dat
a or
med
ia
•S
pat
ial quer
y tec
hniq
ues
and a
cces
s m
ethods
•F
eatu
re e
xtr
acti
on a
nd tra
ckin
g
•D
ata
com
pre
ssio
n
•W
avel
ets,
im
age
pro
cess
ing
•H
ardw
are
acce
lera
tion
(i.e
. C
ell B
E a
nd G
PU
s)
•A
ppli
cations
–Sim
ula
tion d
ata
sets
–M
edia
/obse
rvat
ional
3/3
1/2
009
Sco
tt B
. B
aden
/ C
SE
262 / S
P '0
910
Sch
eduli
ng
•Pro
ject
Sym
posi
um
–Ju
ne
4th
-5th
–N
o c
lass
on 6
/2
•Som
e le
cture
s to
be
resc
hed
ule
d
–M
onday
5/1
9 →
Fri
day
5/8
(In c
lass
pre
senta
tions
of
pro
gre
ss r
eport
s)
–T
ues
day
5/2
1: F
riday
5/1
5
–T
ues
day
5/2
6 →
6/5
3/3
1/2
009
Sco
tt B
. B
aden
/ C
SE
262 / S
P '0
911
Har
dw
are
pla
tform
s
•A
be
–D
ell cl
ust
er a
t N
CS
A
–L
inco
ln: at
tach
ed N
VID
IA T
esla
•C
ell B
road
ban
d E
ngin
e (G
eorg
ia T
ech)
•If
you w
ant to
use
Cel
l B
E, fi
ll o
ut an
acco
unt re
ques
t fo
rmhttps://cell
https://cell
https://cell
https://cell- ---web.cc.gatech.edu/signup/index.pl
web.cc.gatech.edu/signup/index.pl
web.cc.gatech.edu/signup/index.pl
web.cc.gatech.edu/signup/index.pl
3/3
1/2
009
Sco
tt B
. B
aden
/ C
SE
262 / S
P '0
912
Motivat
ing A
pplica
tions
3/3
1/2
009
Sco
tt B
. B
aden
/ C
SE
262 / S
P '0
913
3/3
1/2
009
13
Com
puta
tional
flu
id d
ynam
ics
•D
irec
t N
um
eric
al S
imula
tion o
f hori
zonta
l sh
ear
•S
uta
nu
Sar
kar
, U
CS
D M
AE
Dep
artm
ent
•80 T
erab
yte
sof
dat
a per
sim
ula
tion
–1024 C
ray X
T-4
pro
cess
ors
for
48 h
ours
–2 ×
10
9unknow
ns,
10K
tim
e st
eps
[4096×1
024×5
12]
–1000 s
nap
shots
@ 8
0 ×
10
9byte
s
Horizontal spanwise vorticity (mag) using DNS, [Basak & Sarkar, JFM 2006]
3/3
1/2
009
Sco
tt B
. B
aden
/ C
SE
262 / S
P '0
914
3/3
1/2
009
14
The
Input D
ata
•T
ime
dep
enden
t si
mula
tion
•E
ach s
nap
shot
–5 c
om
ponen
ts: vel
oci
ty (
vec
tor)
, den
sity
, pre
ssure
3/3
1/2
009
Sco
tt B
. B
aden
/ C
SE
262 / S
P '0
915
3/3
1/2
009
15
The
end-t
o-e
nd a
pplica
tion
•Id
entify
vort
ex c
ore
s
usi
ng the
“∆cr
iter
ion”
∆
> 1
.75×10
-4
•C
om
pute
∆dir
ectly
on a
com
pre
ssed
ver
sion
of
the
dat
aset
•C
om
pute
quan
tities
conditio
ned
on ∆
F(x
; co
ndit
ion)
3/3
1/2
009
Sco
tt B
. B
aden
/ C
SE
262 / S
P '0
916
3/3
1/2
009
16
Work
ing w
ith s
par
se r
epre
senta
tions
•C
ontr
oll
ed o
per
atio
ns
on irr
egula
r m
ult
iblo
ckst
ruct
ure
s
•G
eom
etri
c ab
stra
ctio
ns,
e.g
. K
eLP
, C
hom
bo, T
itan
ium
•T
eraL
ab, F
aisa
l M
ir, K
TH
Sw
eden
Wit
h W
illiam
Ker
ney
, Pet
er D
iam
essi
s, a
nd K
eiko N
om
ura
(U
CSD
) [2
002]
mega58
3/3
1/2
009
Sco
tt B
. B
aden
/ C
SE
262 / S
P '0
917
3/3
1/2
009
17
Dynam
ics
•C
ontinuat
ion: th
e over
lap b
etw
een a
fea
ture
an
d a
des
cenden
t or
an a
nce
stor
•C
reat
ions
and d
issi
pat
ions
are
not
continuat
ions
B
A
B
A
3/3
1/2
009
Sco
tt B
. B
aden
/ C
SE
262 / S
P '0
918
3/3
1/2
009
18
How
lar
ge
can a
dat
aset
get
?•
“Ter
aShak
e”G
round F
ault S
imula
tor
•Sei
smic
wav
esgen
erat
edby lar
ge
eart
hquak
esth
atocc
ur
on
the
south
ern
San
Andre
as F
ault
•43 T
erab
yte
sof
dat
a–
240 I
BM
Pow
er4 p
roce
ssors
for
5 d
ays
–2000 s
nap
shots
@ 2
1.1
×10
9byte
s
–Sto
red o
n tap
e in
SD
SC
’sSto
rage
Res
earc
h B
roker
(SR
B)
http://www.sdsc.edu/SCEC/
•South
ern C
alif
orn
ia E
arth
quak
e C
ente
r’s
Com
munit
y M
odel
ing E
nvir
onm
ent Pro
ject
(SC
EC
/CM
E),
N
SF http://www.scec.org/cme
–U
SC
, Scr
ipps
Inst
. fo
r O
cean
ogra
phy,
San
Die
go S
uper
com
pute
r C
ente
r(SD
SC
), S
DSU
...
3/3
1/2
009
Sco
tt B
. B
aden
/ C
SE
262 / S
P '0
919
Ter
ashak
e
How
it w
ork
s:
1.
Div
ide
up S
outh
ern
Cal
iforn
ia into
“blo
cks”
2.
For
each
blo
ck, get
all
the
dat
aon g
round
surf
ace
com
posi
tion,
geo
logic
al s
truct
ure
s,
fault info
rmat
ion, et
c.
Slide Courtesy of DataCentral@
SDSC
Sim
ula
tes
a 7.7
ear
thquak
e al
ong
the
south
ern S
an A
ndre
as f
ault
close
to L
A u
sing s
eism
ic,
geo
physi
cal, a
nd o
ther
dat
a fr
om
the
South
ern C
alif
orn
ia
Ear
thquak
e C
ente
r
3/3
1/2
009
Sco
tt B
. B
aden
/ C
SE
262 / S
P '0
920
Anim
ation
3.
Map
the
blo
cks
on to
pro
cess
orsof the
supercomputer
4.
Run the
sim
ula
tion
using current
inform
ation on fault
activity and the physics
of earthquakes
SD
SC
Mac
hin
e
Room
SD
SC
’s D
ataS
tar
Slide Courtesy of DataCentral@
SDSC
3/3
1/2
009
Sco
tt B
. B
aden
/ C
SE
262 / S
P '0
921
47 TB output
data for 1.8
billion grid
points
Continuous
I/O 2GB/sec
240 procs on
SDSC Datastar,
5 days, 1 TB
of main memory
Data parking of 100s of TBs for
many months
Large memory Nodes with 256
GB of DS for pre-processing
and post visualization
10-20 TB data archived a day
The next generation sim
ulation will require even m
ore
resources: Researchers plan to double the
temporal/spatial resolution of TeraShake
Req
uir
emen
ts
Parallel
file system
Data
parking
Slide Courtesy of DataCentral@
SDSC
3/3
1/2
009
Sco
tt B
. B
aden
/ C
SE
262 / S
P '0
922
3/3
1/2
009
22
The
Role
of
Com
pre
ssio
n•
If the
info
rmat
ion c
onte
nt of
a dat
aset
is
low
, co
mpre
ssio
n c
an h
elp u
s …
–in
crea
se r
esolu
tion f
or
a giv
en s
tora
ge
budget
–dis
cover
know
ledge
hid
den
within
the
dat
a
Dia
mes
sis,
Ker
ney, N
om
ura
, B
aden
[P
ara
‘02]
3/3
1/2
009
Sco
tt B
. B
aden
/ C
SE
262 / S
P '0
924
3/3
1/2
009
24
Dat
a C
om
pre
ssio
n•
Loss
less
–Per
fect
rec
onst
ruct
ion, not m
ore
than
2:1
com
pre
ssio
n
–R
LE
, H
uff
man
codin
g, L
Z77, D
EFL
AT
E (
gzi
p)
•In
pra
ctic
e: loss
y
–W
avel
et C
om
pre
ssio
n (
WC
)
–Fea
ture
Extr
action
–In
troduce
s ar
tifa
cts
into
the
dat
a
–A
dap
tive
Com
pre
ssio
n (
wit
h H
rom
adka,
Unat
, Shaf
aat)
3/3
1/2
009
Sco
tt B
. B
aden
/ C
SE
262 / S
P '0
925
Tec
hnolo
gy
•M
ultic
ore
pro
cess
ors
may
enab
le u
s to
continue
the
tren
d
of
exponen
tial
gro
wth
in p
roce
ssin
g s
pee
ds
•W
ork
stat
ions
or
clust
ers
wit
h a
ccel
erat
ors
–R
oad
runner
: w
ith S
TI
Cel
l B
E (
IBM
and L
os
Ala
mos)
–L
inco
ln: w
ith N
vid
iaT
esla
(N
CSA
and N
vid
ia)
•D
istr
ibute
d c
om
puta
tion
–“C
loud”
and “
Gri
d”
com
puting
http://computing.llnl.gov/tutorials/ibm_sp(IBM)
3/3
1/2
009
Sco
tt B
. B
aden
/ C
SE
262 / S
P '0
926
Today
’s lap
top w
ould
hav
e bee
nyes
terd
ay’s
super
com
pute
r
•Cray-1 Supercomputer
•80 M
Hz processor
•8 M
egabytes mem
ory
•Water cooled
•6 feet H x 7 feet W
•4 tons
•Over $10M in 1976
•MacBook
•2.4GHz Intel Core 2 Duo
•2 GBytes DDR2 SDRAM
4 M
egabyte shared cache
•Air cooled
•1.1 × ×××12.8 × ×××8.9 inches
•5.0 pounds (2.3 kg)
•$1299 in 2008
3/3
1/2
009
Sco
tt B
. B
aden
/ C
SE
262 / S
P '0
927
1980
1985
1990
1995
2000
2005
100
101
102
103
104
105
The
pro
cess
or-
mem
ory
gap
Mem
ory
(DR
AM
)
Pro
cess
or
3/3
1/2
009
Sco
tt B
. B
aden
/ C
SE
262 / S
P '0
928
An im
port
ant univ
ersa
l:
the
loca
lity
pri
nci
ple
•Pro
gra
ms
gen
eral
ly e
xhib
it tw
o f
orm
s of
loca
lity
when
acce
ssin
g m
emory
, oft
en involv
ing loops
–T
empora
l lo
cality
(tim
e)
–Spat
ial lo
cality
(sp
ace)
•O
ften
involv
es loops
for t= 0 to T-1
for i= 1 to N-2
u[i]= (u[i-1] + u[i+1] -8*h*h
) / 2
3/3
1/2
009
Sco
tt B
. B
aden
/ C
SE
262 / S
P '0
929
Re-
use
•M
emory
hie
rarc
hie
s re
ly o
n loca
lity
to
impro
ve
mem
ory
acc
ess
tim
es thro
ugh r
e-use
•U
nder
lyin
g p
rinci
ple
s fo
r im
pro
vin
g c
ache
re-
use
apply
to s
tore
d d
ata,
par
alle
l m
emory
hie
rarc
hie
s an
d h
ardw
are
acce
lera
tors
3/3
1/2
009
Sco
tt B
. B
aden
/ C
SE
262 / S
P '0
931
ST
I C
ell B
road
ban
d E
ngin
e
16B/cycle (2x)
16B/cycle
BIC
FlexIO
TM
MIC
Dual
XDRTM
16B/cycle
EIB (up to 96B/cycle)
16B/cycle 64-bit Power Architecture with VMX
PPE
SPE
LS
SXU
SPU MFC
PXU
L1
PPU
16B/cycle
L2
32B/cycle
LS
SXU
SPU MFC
LS
SXU
SPU MFC
LS
SXU
SPU MFC
LS
SXU
SPU MFC
LS
SXU
SPU MFC
LS
SXU
SPU MFC
LS
SXU
SPU MFC
IBM
3/3
1/2
009
Sco
tt B
. B
aden
/ C
SE
262 / S
P '0
932
NV
IDIA
GeF
orc
eG
TX
280
•240 c
ore
s @
1.2
96 G
Hz
•Par
alle
l co
mputing o
r gra
phic
mode
•1 G
B m
emory
•512 b
it m
emory
inte
rfac
e @
141.7
GB
/s