Download pdf - Algorithms for Higher Order Spatial Statisticshipacc.ucsc.edu/Lecture Slides/Szapudi_foa.pdf · Algorithms for Higher Order Spatial Statistics István Szapudi Institute for Astronomy

IntroductionThree-point Algorithm

Summary

Algorithms for Higher Order Spatial Statistics

István Szapudi

Institute for AstronomyUniversity of Hawaii

Future of AstroComputing Conference, SDSC, Dec 16-17

I. Szapudi Algorithms for Higher Order Statistics


Summary

Outline

1 Introduction

2 Three-point Algorithm



Summary

Random Fields

DefinitionA random field is a spatial field with an associated probabilitymeasure: P(A)DA.

Random fields are abundant in Cosmology.The cosmic microwave background fluctuations constitutea random field on a sphere.Other examples: Dark Matter Distribution, GalaxyDistribution, etc.Astronomers measure particular realization of a randomfield (ergodicity helps but we cannot avoid “cosmic errors”)



Summary

Definitions

The ensemble average 〈A〉 corresponds to a functionalintegral over the probability measure.Physical meaning: average over independent realizations.Ergodicity: (we hope) ensemble average can be replacedwith spatial averaging.Symmetries: translation and rotation invariance

Joint Moments

F (N)(x1, . . . , xN) = 〈T (x1), . . . , T (xN)〉



Summary

Connected MomentsThese are the most frequently used spatial statistics

Typically we use fluctuation fields δ = T/〈T 〉 − 1

Connected moments are defined recursively

〈δ1, . . . , δN〉c = 〈δ1, . . . , δN〉 −∑

P

〈δ1 . . . ...δi〉c . . . 〈δj . . . δk 〉c . . .

With these the N-point correlation functions are

ξ(N)(1, . . . , N) = 〈δ1, . . . , δN〉c



Summary

Gaussian vs. Non-Gaussian distributionsThese two have the same two-point correlation function or P(k)

These have the same two-point correlation function!



Summary

Basic ObjectsThese are N-point correlation functions.

Special Cases

Two-point functions 〈δ1δ2〉Three-point functions 〈δ1δ2δ3〉Cumulants 〈δN

R 〉c = SN〈δ2R〉N−1

Cumulant Correlators 〈δN1 δM

2 〉cConditional Cumulants 〈δ(0)δN

R 〉c

In the above δR stands for the fluctuation field smoothed onscale R (different R’s could be used for each δ’s).Host of alternative statistics exist: e.g. Minkowskifunctions, void probability, minimal spanning trees, phasecorrelations, etc.



Summary

ComplexitiesCombinatorial explosion of terms

N-point quantities have a large configuration space:measurement, visualization, and interpretation becomecomplex.e.g, already for CMB three-point, the total number of binsscales as M3/2

CPU intensive measurement: MN scaling for N-pointstatistics of M objects.Theoretical estimationEstimating reliable covariance matrices



Summary

Algorithmic Scaling and Moore’s Law

Computational resources grow exponentially(Astronomical) data acquisition driven by the sametechnologyData grow with the same exponent

CorrolaryAny algorithm with a scaling worse then linear will becomeimpossible soon

Symmetries, hierarchical structures (kd-trees), MC,computational geometry, approximate methods



Summary

Example: Algorithm for 3ptOther algorithms use symmetries

!!4!4!3!2!1



Summary

Algorithm for 3pt Cont’d

Naively N3 calculations to find all triplets in the map:overwhelming (millions of CPU years for WMAP)Regrid CMB sky around each point according to theresolutionUse hierarchical algorithm for regridding: N log NCorrelate rings using FFT’s (total speed: 2minutes/cross-corr)The final scaling depends on resolutionN(log N + NθNα log Nα + NαNθ(Nθ + 1)/4)/2With another cos transform one and a double Hankeltransform one can get the bispectrumIn WMAP-I: 168 possible cross correlations, about 1.6million bins altogether.How to interpret such massive measurements?



Summary

3pt in WMAP



Summary

Recent Challanges

Processors becoming multicore (CPU and GPU)To take advantage of Moore’s law: parallelizationDisk sizes growing exponentially, but not the IO speedData size can become so large that reading mightdominate processingNot enough to just consider scaling



Summary

Alternative view of the algorithm: lossy compression

!!4!4!3!2!1



Summary

Compression

Compression can increase processing speed simply by theneed of reading less dataThe full compressed data set can be sent to all nodesThis enables parallelization in multicore or MapReduceframeworkFor any algorithm specific (lossy compression) is needed



Summary

Another pixellization as lossy compression



Summary

Summary

Fast algorithm for calculating 3pt functions with N log Nscaling instead of N3

Approximate algorithm with a specific lossy compressionphaseScaling with resolution and not with data elementsCompression in the algorithm enables multicore orMapReduce style parallelizationWith a different compression we have done approximatelikelihood analysis for CMB (Granett,PhD thesis)