17
Introduction Three-point Algorithm Summary Algorithms for Higher Order Spatial Statistics István Szapudi Institute for Astronomy University of Hawaii Future of AstroComputing Conference, SDSC, Dec 16-17 I. Szapudi Algorithms for Higher Order Statistics

Algorithms for Higher Order Spatial Statisticshipacc.ucsc.edu/Lecture Slides/Szapudi_foa.pdf · Algorithms for Higher Order Spatial Statistics István Szapudi Institute for Astronomy

  • Upload
    others

  • View
    13

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Algorithms for Higher Order Spatial Statisticshipacc.ucsc.edu/Lecture Slides/Szapudi_foa.pdf · Algorithms for Higher Order Spatial Statistics István Szapudi Institute for Astronomy

IntroductionThree-point Algorithm

Summary

Algorithms for Higher Order Spatial Statistics

István Szapudi

Institute for AstronomyUniversity of Hawaii

Future of AstroComputing Conference, SDSC, Dec 16-17

I. Szapudi Algorithms for Higher Order Statistics

Page 2: Algorithms for Higher Order Spatial Statisticshipacc.ucsc.edu/Lecture Slides/Szapudi_foa.pdf · Algorithms for Higher Order Spatial Statistics István Szapudi Institute for Astronomy

IntroductionThree-point Algorithm

Summary

Outline

1 Introduction

2 Three-point Algorithm

I. Szapudi Algorithms for Higher Order Statistics

Page 3: Algorithms for Higher Order Spatial Statisticshipacc.ucsc.edu/Lecture Slides/Szapudi_foa.pdf · Algorithms for Higher Order Spatial Statistics István Szapudi Institute for Astronomy

IntroductionThree-point Algorithm

Summary

Random Fields

DefinitionA random field is a spatial field with an associated probabilitymeasure: P(A)DA.

Random fields are abundant in Cosmology.The cosmic microwave background fluctuations constitutea random field on a sphere.Other examples: Dark Matter Distribution, GalaxyDistribution, etc.Astronomers measure particular realization of a randomfield (ergodicity helps but we cannot avoid “cosmic errors”)

I. Szapudi Algorithms for Higher Order Statistics

Page 4: Algorithms for Higher Order Spatial Statisticshipacc.ucsc.edu/Lecture Slides/Szapudi_foa.pdf · Algorithms for Higher Order Spatial Statistics István Szapudi Institute for Astronomy

IntroductionThree-point Algorithm

Summary

Definitions

The ensemble average 〈A〉 corresponds to a functionalintegral over the probability measure.Physical meaning: average over independent realizations.Ergodicity: (we hope) ensemble average can be replacedwith spatial averaging.Symmetries: translation and rotation invariance

Joint Moments

F (N)(x1, . . . , xN) = 〈T (x1), . . . , T (xN)〉

I. Szapudi Algorithms for Higher Order Statistics

Page 5: Algorithms for Higher Order Spatial Statisticshipacc.ucsc.edu/Lecture Slides/Szapudi_foa.pdf · Algorithms for Higher Order Spatial Statistics István Szapudi Institute for Astronomy

IntroductionThree-point Algorithm

Summary

Connected MomentsThese are the most frequently used spatial statistics

Typically we use fluctuation fields δ = T/〈T 〉 − 1

Connected moments are defined recursively

〈δ1, . . . , δN〉c = 〈δ1, . . . , δN〉 −∑

P

〈δ1 . . . ...δi〉c . . . 〈δj . . . δk 〉c . . .

With these the N-point correlation functions are

ξ(N)(1, . . . , N) = 〈δ1, . . . , δN〉c

I. Szapudi Algorithms for Higher Order Statistics

Page 6: Algorithms for Higher Order Spatial Statisticshipacc.ucsc.edu/Lecture Slides/Szapudi_foa.pdf · Algorithms for Higher Order Spatial Statistics István Szapudi Institute for Astronomy

IntroductionThree-point Algorithm

Summary

Gaussian vs. Non-Gaussian distributionsThese two have the same two-point correlation function or P(k)

These have the same two-point correlation function!

I. Szapudi Algorithms for Higher Order Statistics

Page 7: Algorithms for Higher Order Spatial Statisticshipacc.ucsc.edu/Lecture Slides/Szapudi_foa.pdf · Algorithms for Higher Order Spatial Statistics István Szapudi Institute for Astronomy

IntroductionThree-point Algorithm

Summary

Basic ObjectsThese are N-point correlation functions.

Special Cases

Two-point functions 〈δ1δ2〉Three-point functions 〈δ1δ2δ3〉Cumulants 〈δN

R 〉c = SN〈δ2R〉N−1

Cumulant Correlators 〈δN1 δM

2 〉cConditional Cumulants 〈δ(0)δN

R 〉c

In the above δR stands for the fluctuation field smoothed onscale R (different R’s could be used for each δ’s).Host of alternative statistics exist: e.g. Minkowskifunctions, void probability, minimal spanning trees, phasecorrelations, etc.

I. Szapudi Algorithms for Higher Order Statistics

Page 8: Algorithms for Higher Order Spatial Statisticshipacc.ucsc.edu/Lecture Slides/Szapudi_foa.pdf · Algorithms for Higher Order Spatial Statistics István Szapudi Institute for Astronomy

IntroductionThree-point Algorithm

Summary

ComplexitiesCombinatorial explosion of terms

N-point quantities have a large configuration space:measurement, visualization, and interpretation becomecomplex.e.g, already for CMB three-point, the total number of binsscales as M3/2

CPU intensive measurement: MN scaling for N-pointstatistics of M objects.Theoretical estimationEstimating reliable covariance matrices

I. Szapudi Algorithms for Higher Order Statistics

Page 9: Algorithms for Higher Order Spatial Statisticshipacc.ucsc.edu/Lecture Slides/Szapudi_foa.pdf · Algorithms for Higher Order Spatial Statistics István Szapudi Institute for Astronomy

IntroductionThree-point Algorithm

Summary

Algorithmic Scaling and Moore’s Law

Computational resources grow exponentially(Astronomical) data acquisition driven by the sametechnologyData grow with the same exponent

CorrolaryAny algorithm with a scaling worse then linear will becomeimpossible soon

Symmetries, hierarchical structures (kd-trees), MC,computational geometry, approximate methods

I. Szapudi Algorithms for Higher Order Statistics

Page 10: Algorithms for Higher Order Spatial Statisticshipacc.ucsc.edu/Lecture Slides/Szapudi_foa.pdf · Algorithms for Higher Order Spatial Statistics István Szapudi Institute for Astronomy

IntroductionThree-point Algorithm

Summary

Example: Algorithm for 3ptOther algorithms use symmetries

!!4!4!3!2!1

I. Szapudi Algorithms for Higher Order Statistics

Page 11: Algorithms for Higher Order Spatial Statisticshipacc.ucsc.edu/Lecture Slides/Szapudi_foa.pdf · Algorithms for Higher Order Spatial Statistics István Szapudi Institute for Astronomy

IntroductionThree-point Algorithm

Summary

Algorithm for 3pt Cont’d

Naively N3 calculations to find all triplets in the map:overwhelming (millions of CPU years for WMAP)Regrid CMB sky around each point according to theresolutionUse hierarchical algorithm for regridding: N log NCorrelate rings using FFT’s (total speed: 2minutes/cross-corr)The final scaling depends on resolutionN(log N + NθNα log Nα + NαNθ(Nθ + 1)/4)/2With another cos transform one and a double Hankeltransform one can get the bispectrumIn WMAP-I: 168 possible cross correlations, about 1.6million bins altogether.How to interpret such massive measurements?

I. Szapudi Algorithms for Higher Order Statistics

Page 12: Algorithms for Higher Order Spatial Statisticshipacc.ucsc.edu/Lecture Slides/Szapudi_foa.pdf · Algorithms for Higher Order Spatial Statistics István Szapudi Institute for Astronomy

IntroductionThree-point Algorithm

Summary

3pt in WMAP

I. Szapudi Algorithms for Higher Order Statistics

Page 13: Algorithms for Higher Order Spatial Statisticshipacc.ucsc.edu/Lecture Slides/Szapudi_foa.pdf · Algorithms for Higher Order Spatial Statistics István Szapudi Institute for Astronomy

IntroductionThree-point Algorithm

Summary

Recent Challanges

Processors becoming multicore (CPU and GPU)To take advantage of Moore’s law: parallelizationDisk sizes growing exponentially, but not the IO speedData size can become so large that reading mightdominate processingNot enough to just consider scaling

I. Szapudi Algorithms for Higher Order Statistics

Page 14: Algorithms for Higher Order Spatial Statisticshipacc.ucsc.edu/Lecture Slides/Szapudi_foa.pdf · Algorithms for Higher Order Spatial Statistics István Szapudi Institute for Astronomy

IntroductionThree-point Algorithm

Summary

Alternative view of the algorithm: lossy compression

!!4!4!3!2!1

I. Szapudi Algorithms for Higher Order Statistics

Page 15: Algorithms for Higher Order Spatial Statisticshipacc.ucsc.edu/Lecture Slides/Szapudi_foa.pdf · Algorithms for Higher Order Spatial Statistics István Szapudi Institute for Astronomy

IntroductionThree-point Algorithm

Summary

Compression

Compression can increase processing speed simply by theneed of reading less dataThe full compressed data set can be sent to all nodesThis enables parallelization in multicore or MapReduceframeworkFor any algorithm specific (lossy compression) is needed

I. Szapudi Algorithms for Higher Order Statistics

Page 16: Algorithms for Higher Order Spatial Statisticshipacc.ucsc.edu/Lecture Slides/Szapudi_foa.pdf · Algorithms for Higher Order Spatial Statistics István Szapudi Institute for Astronomy

IntroductionThree-point Algorithm

Summary

Another pixellization as lossy compression

I. Szapudi Algorithms for Higher Order Statistics

Page 17: Algorithms for Higher Order Spatial Statisticshipacc.ucsc.edu/Lecture Slides/Szapudi_foa.pdf · Algorithms for Higher Order Spatial Statistics István Szapudi Institute for Astronomy

IntroductionThree-point Algorithm

Summary

Summary

Fast algorithm for calculating 3pt functions with N log Nscaling instead of N3

Approximate algorithm with a specific lossy compressionphaseScaling with resolution and not with data elementsCompression in the algorithm enables multicore orMapReduce style parallelizationWith a different compression we have done approximatelikelihood analysis for CMB (Granett,PhD thesis)

I. Szapudi Algorithms for Higher Order Statistics