The Millimeter-wave Bolometric Interferometer: Data Analysis,
Simulations and Microwave Instrumentation
by
Siddharth S. Malu
A dissertation submitted in partial fulfillment of the
requirements for the degree of
Doctor of Philosophy
(Physics)
at the
University of Wisconsin – Madison
2007
The Millimeter-Wave Bolometric Interferometer:
Data Analysis, Simulations and Microwave Instrumentation
Siddharth S. Malu
Under the supervision of Professor Peter T. Timbie
At the University of Wisconsin–Madison (Co-Supervisor: Professor Benjamin D. Wandelt,
University of Illinois at Urbana-Champaign)
Abstract
The past decade has been the most exciting time in cosmology in these respects:
1. The discovery of Dark Energy, and an estimate of the composition of the universe.
2. Advances in the understanding of the composition of dark matter.
3. The discovery that the universe is flat.
The following advances have occurred in Cosmic Microwave Background (CMB) cosmology:
1. A systematic characterization of cosmological models followed by a large number of
successful large- and small-scale CMB experiments.
2. Measurements of the CMB temperature power spectrum.
3. Detection of CMB polarization.
4. Appearance of large CMB datasets with new techniques for data analysis.
Results from CMB theory, experiments and analysis have thus dominated advances in cosmology
over the past few years, and are expected to do so with the upcoming experiments and analysis
techniques as well. All the aforementioned results fit well within and are explained well by the
inflationary paradigm. However, current evidence for inflation is indirect. The next generation
of CMB experiments (this thesis describes one of these) will aim at providing the most direct
evidence for the inflationary paradigm through the detection of B-modes in CMB polarization.
In this thesis, we describe the design, construction and plans for implementation of a novel
instrument, the Millimeter-Wave Bolometric Interferometer (MBI), an interferometer designed
to measure the power spectrum of CMB polarization. We discuss the optics - antennas, waveg-
uides and Fizeau beam combiner, as well as simulations of the instrument and data analysis /
power spectrum estimation techniques to be used after the instrument begins observations.
MBI is designed for sensitive measurements of the polarization of the cosmic microwave
background (CMB). MBI combines the differencing capabilities of an interferometer with the
high sensitivity of bolometers at millimeter wavelengths. It views the sky directly through
corrugated horn antennas with low sidelobes and nearly symmetric beam patterns to avoid
spurious instrumental polarization from reflective optics. The design of the first version of the
instrument with four 7° field-of-view corrugated horns (MBI-4) is discussed. The MBI-4 optical
band is defined by filters with a central frequency of 90 GHz. The set of baselines determined by
the antenna separations makes the instrument sensitive to CMB polarization fluctuations over
the multipole range ℓ=150-270. In MBI-4, signals from antennas are combined with a Fizeau
beam combiner and interference fringes are detected by an array of spider-web bolometers. In
order to separate the visibility signals from the total power detected by each bolometer, the
phase of the signal from each antenna is modulated by a ferrite-based waveguide phase shifter.
Observations are planned from the Pine Bluff Observatory outside Madison, WI.
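As a minimal illustration of how phase modulation separates the visibility signal from the total power, the sketch below uses the adding-interferometer output quoted in the caption of Fig. 5.8, which is proportional to E0^2 + E0^2 cos(φ + Δφ(t)). The function name `demodulate` and the equally spaced four-step modulation sequence are assumptions made for this example only; they are not the modulation sequence used in MBI.

```python
import math


def demodulate(e0, phi, n_steps=4):
    """Recover total power and the interference terms for one baseline.

    Hypothetical sketch: the phase shifter steps through n_steps equally
    spaced values of delta_phi over one cycle (n_steps must be >= 3).
    """
    steps = [2.0 * math.pi * k / n_steps for k in range(n_steps)]
    # Detector output for each modulation state: E0^2 + E0^2 cos(phi + delta_phi)
    power = [e0**2 + e0**2 * math.cos(phi + d) for d in steps]
    # The unmodulated (total power) term is the mean over the cycle.
    total = sum(power) / n_steps
    # Projecting onto cos(delta_phi) isolates E0^2 cos(phi).
    in_phase = 2.0 / n_steps * sum(p * math.cos(d) for p, d in zip(power, steps))
    # Projecting onto -sin(delta_phi) isolates E0^2 sin(phi).
    quadrature = -2.0 / n_steps * sum(p * math.sin(d) for p, d in zip(power, steps))
    return total, in_phase, quadrature


total, in_phase, quadrature = demodulate(e0=2.0, phi=0.7)
print(total, in_phase, quadrature)
```

The in-phase term returned here is the quantity that the Fig. 5.8 caption identifies as proportional to the visibility of the baseline; a real instrument would additionally have to handle noise, losses in the phase shifter and several baselines modulated at once.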
Acknowledgements
- Sanskrit. Translation: What I am dedicating to you, O Guru, O Lord, was never mine
- it was always yours.
Friend, philosopher and guide - that is what a Guru is supposed to be. It is my pleasure
to have worked with an advisor who has turned out to be all of these, in every sense of the word.
Peter Timbie has been a pillar of support the entire time that I have been his student. Obviously,
I have learnt everything I know about laboratory techniques in Experimental Cosmology from
him. He has, however, taught me much more than that - to be patient when the first few
versions of anything do not work out, to keep my calm when everything that can possibly go
wrong does, but above all, to believe in myself - and that, at times when I had almost given
up.
Of course, one could describe those many dinners, picnics, and ’work-parties’ that were
a lot of fun, but it really is Peter’s dedication to students - teaching, training, and sometimes
even tolerating them - that makes him a true Guru.
Ben Wandelt has been equally encouraging and supportive during the time that I have
worked with him. Peter and Ben are together responsible for most of my knowledge and
achievements during the course of my thesis, and it is with them in mind that I quote the Sanskrit
shloka above.
I have learned a great deal from members of the MBI team - Carolina helped me through
data analysis, Jaiseung with programming, Andrei with instrumentation and instrument design.
It has been fun working with a wonderful team at UW-Madison - Peter H., Amanda and Emily,
my fellow graduate students. A special thanks to the undergraduates who worked with me on
various projects - Steve Kaeppler, Seth Bruch, Eric Lopez and Lauren Levac. It was inspiring
to work with such a dedicated bunch of people.
I have been fortunate to have been guided by others quite like Peter and Ben throughout
my life, the first of them being my parents. It is one thing to guide and support, and quite
another to brave all the storms and ridicule in addition to guiding and supporting me through all
the troubles I faced because of the obviously wrong decisions I made in my life. It takes a
huge amount of strength to believe in someone when all they are doing is committing mistakes,
repeating them over and over, and generally making a hash of their life and career. I am proud
to say that my parents were never found wanting, and while I am sorry that I made them go
through all that they did in the past ten years (which had nothing to do with this thesis!), I
am glad that they taught me, along with Peter Timbie, to believe in myself and the people
close to me. They have been my base, my pillar of support, without which I would barely have
become a tenth of what I am, let alone achieved anything. They changed their lives around my
sister and me, just so we could have a stable childhood. They stayed apart for long periods
of time, so that we would not have to change cities or even schools as my parents’ jobs took
them from one place to another. Nor can I forget the contribution of the rest of my family -
my grandparents in particular, who had already filled up our home with all sorts of books and
supported us through difficult times, because they, like my parents, believed in the value of a
good education. It was my parents who instilled in us (my sister and me) a sense of curiosity about
the world/universe around us and the value and importance of perseverance in the face of all
difficulty and disenchantment. This thesis is dedicated to them - my father, Suman Malu, and
my mother, Shashi Rani Malu. And to my sister, who, with her great sense of humour and wit
kept me alive.
Going through my school years will produce a long list of people, all of them dedicated
teachers and great colleagues, but a few of them stand out in my memory. Ms. Suchita Bhengra,
for making even the dreariest parts of Chemistry come alive; Mr. Alan Cowell, for teaching
me the value of discipline and for making men out of us children; Mr. Donald Martin, for
patiently plowing through the derivations; Ms. Annamma, for kindling my interest in Biology;
and my friends Evanjan Banerjee, Rohit Sharma and Ravikirti for their constant support and
unwavering belief in my abilities, especially through two of the toughest years in my life; and
finally, Don Bosco Academy, Patna, which was my anchor for 12 years.
St. Stephen’s College, while elitist and exclusive, gave me the rare opportunity to learn
from Dr. Bhargava, Dr. Swaminathan, Dr. Phookun and Mr. Bhatia - every one of them a
gem of a teacher. I owe my mathematical physics background to Dr. Bhargava, who made the
subject so lively that I ended up extending one of the ideas he gave out in a lecture as a full
project! Working on this project with Dr. Bhargava and Dr. Phookun has been one of the most
memorable experiences of my life - only now do I realize the full extent of their dedication
to the welfare and training of students and their patience. Yes, it would be fair to say that
I wouldn’t have the training or the courage to end up doing research in Physics had it not
been for these two Gurus. They taught me to take my dreams more seriously than I thought
was possible. They also taught me to keep my feet firmly on the ground, in order to be able to
translate those dreams into reality. SSC also introduced me to some truly colourful characters
that have provided different shades of companionship and amusement - from Swamit’s unending
laugh-fest to Chako’s paranoia; Vivek’s overcautiousness and conscientiousness to Sumantra’s
pragmaticism; Vinayak and Vikram’s steely resolve to uncover the mysteries of Geek-land to
Advaith and Pranjal’s crazy ideas of fun.
Under Prof. Stone and Dr. Podsiadlowski’s guidance, I continued my training at Oxford.
I thank Prof. Stone for his encouragement, particularly when I needed it during the dreary,
grey days. He drives his students and appreciates their qualities in a way that I have rarely
ever seen anyone do. Dr. Podsiadlowski has an amazing knack for presenting anything in
theoretical physics and making it look simple. I am forever indebted to Jenny, my High Energy
Physics supervisor/tutor - she has to be the most enthusiastic and encouraging tutor I have
come across. My classmates Rachel, Tom and the two Wills helped me get through the doom
of the Finals. Venkat and Prashant have been my pillars of support here in Madison through
my worst times.
The author gratefully acknowledges support from Sigma-Xi through the Grants-In-Aid
of Research program, grant number G20063131556544060. The MBI program has been made
possible by the NASA APRA grants. Lauren Levac was supported by the Bernice Durand Award
for her work with the MBI team in summer 2007. Prof. van der Weide in UW Engineering
very kindly allowed us to use his equipment for our tests.
This thesis has made extensive use of the CMBFAST and HEALPix packages, and the LAMBDA
website and tools.
To my family - my first Gurus
- Sanskrit couplet about the Guru. Translation: Creation, sustenance and destruction
are but like child’s play to the Guru, who is the supreme Lord, and to this Lord do I bow with
all my soul.
Contents
1 Overview 1
1.1 Thesis overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
2 Introduction 5
2.1 Hubble’s Law and FRWL Cosmology . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.2 Cosmodynamic calculations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.2.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.2.2 Horizon size at recombination . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2.3 Age of the Universe . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.3 The CMB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.3.1 Problems with the simple early-universe model . . . . . . . . . . . . . . . 14
2.3.2 Multipole expansion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3 Theory of CMB Polarization 21
3.1 Quasi-monochromatic EM waves . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.2 Spin Harmonics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.3 Application of Spin-harmonics to Polarization . . . . . . . . . . . . . . . . . . . . 26
3.4 Thomson Scattering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.5 CMB Polarization and Cosmology . . . . . . . . . . . . . . . . . . . . . . . . . . 30
4 Current status of CMB observations 36
4.1 Detectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
4.2 The Wilkinson Microwave Anisotropy Probe . . . . . . . . . . . . . . . . . . . . . 37
4.3 The Degree Angular Scale Interferometer . . . . . . . . . . . . . . . . . . . . . . 38
5 Interferometry 41
5.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
5.2 The Mutual Coherence Function . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
5.3 The Coherence Function of Extended Sources . . . . . . . . . . . . . . . . . . . . 42
5.4 Visibility as a function on Intensity pattern on the sky . . . . . . . . . . . . . . . 44
5.5 Interlude: A small discussion on interferometry . . . . . . . . . . . . . . . . . . . 48
5.6 Visibility, the power spectrum and the beam . . . . . . . . . . . . . . . . . . . . 51
5.6.1 Window function for one baseline in an interferometer . . . . . . . . . . . 55
5.6.2 Effect of finite frequency bandwidth on width of window function . . . . . 55
5.7 Visibility in the polarized case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.8 Why Use an Interferometer? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.8.1 Angular Resolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.8.2 No Rapid Chopping and Scanning . . . . . . . . . . . . . . . . . . . . . . 60
5.8.3 Clean Optics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.8.4 Direct Measurement of Stokes Parameters . . . . . . . . . . . . . . . . . . 61
5.9 Systematic Effects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.10 The Adding Interferometer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
6 The Fizeau Combiner: A Concept Study 69
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
6.2 Spectral information from an interferometer using a Fizeau approach . . . . . . . 73
6.2.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6.2.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
6.2.3 Effect of non-zero detector size . . . . . . . . . . . . . . . . . . . . . . . . 76
6.2.4 Feasibility of using techniques in §6.2 for MBI . . . . . . . . . . . . . . . . 76
6.3 The Fizeau combiner as an imager . . . . . . . . . . . . . . . . . . . . . . . . . . 77
6.3.1 Remarks about the Fizeau system . . . . . . . . . . . . . . . . . . . . . . 79
7 The MBI Instrument 83
7.1 Antennae . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
7.2 Fizeau Beam combiner . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
7.3 Detectors, electronics and data acquisition . . . . . . . . . . . . . . . . . . . . . . 88
7.4 Cryogenics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
7.5 Telescope and mount . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
7.6 Measurements 1: Analysis of data from the Faraday-Effect Phase Modulator . . 90
7.6.1 Estimation - no losses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
7.6.2 Estimation with losses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
7.6.3 Correcting for Ferrite loss . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
7.6.4 Over/under-estimation of Ferrite loss . . . . . . . . . . . . . . . . . . . . . 94
7.7 Measurements 2: Antenna Beam Patterns . . . . . . . . . . . . . . . . . . . . . . 95
7.7.1 Loss in an overmoded circular waveguide . . . . . . . . . . . . . . . . . . 96
7.7.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
8 Simulations of the CMB sky and the MBI Instrument 107
8.1 Simulation of the CMB sky patch . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
8.2 Simulation of the MBI Instrument . . . . . . . . . . . . . . . . . . . . . . . . . . 110
8.2.1 Interferometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
8.2.2 Integration over the field-of-view (FOV) / sky patch . . . . . . . . . . . . 116
8.2.3 Interference pattern in focal plane . . . . . . . . . . . . . . . . . . . . . . 118
8.2.4 Effect of finite bandwidth . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
8.2.5 Implementation of formalism to the instrument . . . . . . . . . . . . . . . 121
8.2.6 Recovery of Cℓ from instrument simulation . . . . . . . . . . . . . . . . . 123
9 CMB Data Analysis 130
9.1 Mapmaking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
9.1.1 The general mapmaking problem . . . . . . . . . . . . . . . . . . . . . . . 130
9.2 Power Spectrum Estimation: Bayesian Approach . . . . . . . . . . . . . . . . . . 134
9.2.1 Detailed Bayesian Formalism . . . . . . . . . . . . . . . . . . . . . . . . . 135
9.2.2 The problem with the Bayesian approach . . . . . . . . . . . . . . . . . . 135
9.3 Interlude: The Gibbs Sampler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
9.3.1 The problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
9.3.2 Bayes’ Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
9.3.3 Sampling Technique . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
9.3.4 Application to experiment . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
9.3.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
9.4 Cℓ extraction using Gibbs’ Sampling . . . . . . . . . . . . . . . . . . . . . . . . . 140
9.4.1 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
9.4.2 Formalism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
9.5 Application to simulated data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
9.5.1 Gelman-Rubin Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
10 Conclusions 158
A Dr. Planck, or: How I Learned to Stop Worrying and Love Stat Mech. 161
A.1 The general problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
A.2 Average Energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
A.3 Number of phase states available, or phase factor . . . . . . . . . . . . . . . . . . 162
A.4 Planck Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
A.5 Distribution for particle number . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
B S- and T-matrix formulation 166
B.1 Two port devices and the S-matrix . . . . . . . . . . . . . . . . . . . . . . . . . . 166
B.2 The need for a T-matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
B.3 Conversion between S- and T-matrix . . . . . . . . . . . . . . . . . . . . . . . . . 168
C Relationship between ℓ and θ 170
D Inflaton field equation of motion and slow-roll conditions 172
D.1 The equation of motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
D.2 Slow-roll conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
E E-B decomposition 176
E.1 Stokes’ parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
E.2 Relationship between E-B and Q-U . . . . . . . . . . . . . . . . . . . . . . . . . . 177
List of Tables
5.1 Comparison of various optical designs for the EIP. To achieve the same angular
resolution each instrument allows different amounts of throughput (number of
modes) and requires different aperture diameters, D. For the Gregorian the edge
taper on the primary mirror illumination is assumed to be −40 dB, the diameter
of the FOV is given in degrees and the number of modes is approximately
[FOV/(angular resolution)]², assuming all the modes reaching the focal plane are
coupled to detectors. For the imaging horn array, the horn diameter = D. For
the interferometric horn array, D = B, the diameter of a close-packed array of
horns, each of diameter d, and the number of modes is given by the number of
horns ∼ (D/d)². In the last three columns, for all cases, the angular resolution
= 1° and λ = 3 mm. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.2 A Comparison of Systematic Effects . . . . . . . . . . . . . . . . . . . . . . . . . 63
List of Figures
2.1 Evolution of perturbations. Shown here are three oscillation sizes which are
important for extracting information from the CMB. . . . . . . . . . . . . . . . . . 14
2.2 Acoustic oscillations in the CMB. What we are able to measure today is pro-
portional to the square of the amplitude at recombination, via the CMB power
spectrum. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.3 B-mode power spectrum compared with temperature and EE power spectra[5]. . 17
2.4 WMAP 3 year power spectrum. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.1 B-mode level compared with the levels of E-modes, foregrounds and the lensing
contribution to B-modes[7] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.2 Scalar and Tensor modes with corresponding E and B components. . . . . . . . . 32
3.3 The contribution of tensor modes to the temperature power spectrum (in green). 32
3.4 WMAP 1st year power spectrum, showing cosmic variance at low ℓs. Notice
that the cosmic variance shown here is significantly larger than the tensor mode
contribution in fig.(3.3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.1 A schematic of a bolometer, showing how it works. . . . . . . . . . . . . . . . . . 36
4.2 A schematic of how a bolometer is used. . . . . . . . . . . . . . . . . . . . . . . . 37
4.3 WMAP parameters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
5.1 A general interferometric setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
5.2 One baseline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
5.3 Schematic of interferometric observation - one baseline. The two antennas are
at G and D. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
5.4 The u-v plane coverage of an imager and an interferometer. Figure courtesy Dr.
Carolina Calderon[2]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
5.5 The u-v plane with several pixels. Pixels marked “1” and “2” have the same
distance from the origin, but differ only in their angular position (this corresponds
to the phase of the fringe). Pixels marked “3” and “4” differ in their distance
from the origin and angular position. . . . . . . . . . . . . . . . . . . . . . . . . . 51
5.6 FOV and pixel in the image plane. In this figure and the one alongside, red
represents a pixel in image space and green the FOV in image space. . . . . . . . 52
5.7 The same FOV and pixel as in the previous figure. The size of the interferometer’s
FOV determines its resolution in u-v space. Notice that the two objects have
swapped their dimensions. If N pixels fit in the FOV in the image plane, then
the u-v plane is also divided into N pixels whose size is inversely proportional to
the FOV. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
5.8 Adding interferometer. At antenna A2 the electric field is $E_0$, and at A1 it is
$E_0 e^{i\phi}$, where $\phi = kB\sin\alpha$ and $k = 2\pi/\lambda$. $B$ is the length of the baseline, and
$\alpha$ is the angle of the source with respect to the symmetry axis of the baseline,
as shown. (For simplicity consider only one wavelength, $\lambda$, and ignore
time-dependent factors.) In a multiplying interferometer the in-phase output of the
correlator is proportional to $E_0^2\cos\phi$. For the adding interferometer, the output
is proportional to $E_0^2 + E_0^2\cos(\phi + \Delta\phi(t))$. Modulation of $\Delta\phi(t)$ allows the
recovery of the interference term, $E_0^2\cos\phi$, which is proportional to the visibility
of the baseline. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
5.9 Block diagram of a planned CMB polarization experiment. Light enters the
instrument from the left. Each phase switch is modulated in a sequence that
allows recovery of the interference terms (visibilities) by phase-sensitive detec-
tion at the detectors. The signals are mixed in the beam combiner and detected
on cold bolometers at the right. The beam combiner can be implemented either
using guided waves (Butler combiner, as shown here) or quasioptically (Fizeau
combiner, see below). The triangles represent corrugated conical horn antennas,
which connect through transitions to rectangular waveguide. Orthomode trans-
ducers (OMTs) allow all the Stokes parameters to be determined simultaneously. 66
6.1 A simple multi-slit diffraction/interference experiment. Phase differences occur
after light has passed through the slit, inside the instrument. . . . . . . . . . . . 69
6.2 A simple traditional interferometer. Rays suffer phase differences before they
enter the slits. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
6.3 A simple 1-d Fizeau system. Notice that there are two sets of phase differences. . 71
6.4 2-slit diffraction pattern. The large envelope is caused by the single-slit diffrac-
tion and the fine features by the interference between 2 slits. . . . . . . . . . . . 72
6.5 8-slit diffraction pattern. The pattern is more “focused”, leading to better image
recovery. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
6.6 16-slit diffraction pattern, source 10 away from center. . . . . . . . . . . . . . . 72
6.7 16-slit diffraction pattern, source 20 away from center. . . . . . . . . . . . . . . 72
6.8 The u-v plane coverage of one baseline of an interferometer for a single pointing
in a single baseline orientation angle. The two causes of spread in a single pixel
in the u-v plane are shown. Also shown is the size of the FOV, which is the
fundamental limit to u-v resolution. . . . . . . . . . . . . . . . . . . . . . . . . . 73
6.9 The u-v coverage of a single baseline has been divided into many pixels; however,
the beam of a single antenna is larger than a single pixel, so that this division is
not physical. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
7.1 A schematic of the main parts of the MBI instrument. . . . . . . . . . . . . . . . 83
7.2 A schematic of the main parts of the MBI instrument. . . . . . . . . . . . . . . . 84
7.3 A detailed schematic/view of how the Fizeau combiner system fits inside the
MBI instrument. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
7.4 CMB foreground spectra from the WMAP team [2]. The frequency range of MBI
is indicated by the last yellow column on the right marked “W” for the W-band,
which is very close to the minimum of the combined foreground spectrum. This
is the frequency band in which the MBI operates. . . . . . . . . . . . . . . . . . . 86
7.5 The antenna arrangement (right) and how it looks from atop the cryostat, covered
by filters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
7.6 (a) Simulation of fringe patterns formed in the focal plane of the Fizeau beam
combiner from a single baseline. (b) Superposition of fringes from 6 baselines (as
expected in MBI). Fringes are separated by phase modulation sequence. . . . . . 88
7.7 A spider-web JPL bolometer, with NTD germanium thermistor. . . . . . . . . . 89
7.8 The MBI mount. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
7.9 The Vector Network Analyzer (VNA) at the van der Weide lab at UW-Madison.
The FRM is inside the gold cryostat. . . . . . . . . . . . . . . . . . . . . . . . . . 92
7.10 Rotation angle and how it is related to S21 . . . . . . . . . . . . . . . . . . . . . 93
7.11 Rotation angle vs. current, corrected for Ferrite loss, as described in the text. . . 100
7.12 The WR-10 to 0.2” transition (gold) connected with an adapter which then
connects to the circular copper tube. . . . . . . . . . . . . . . . . . . . . . . . . . 101
7.13 Schematics of the planned antenna beam test. . . . . . . . . . . . . . . . . . . . . 101
7.14 Raw data from the tube test for pipes of different lengths. The oscillations are
caused by standing waves in the pipes. Notice that the signal from different
lengths decreases monotonically with increasing length. . . . . . . . . . . . . . . 102
7.15 The same data as in fig.(7.14), but with resonances smoothed out. . . . . . . . . 103
7.16 Graph of loss per 10 feet derived from smoothed data. . . . . . . . . . . . . . . . 104
7.17 Resonances in the data in a small frequency range. These are consistent with
standing waves in the tube lengths used. . . . . . . . . . . . . . . . . . . . . . . . 105
8.1 The power spectrum used to generate the simulated maps shown below. This
was obtained by choosing a set of cosmological parameters in CMBFAST[1]. . . . 109
8.2 The temperature map obtained from the power spectrum above and the method
described in this chapter. The size of the map is in degrees, indicated on the two
axes. Temperatures are in K. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
8.3 Q map obtained from the power spectrum above and the method described in
this chapter. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
8.4 The temperature map obtained from the power spectrum above and the method
described in this chapter. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
8.5 The temperature map that a 6-baseline ideal interferometer is expected to output,
given the sky map shown in fig.(8.2). . . . . . . . . . . . . . . . . . . . . . . . . . 112
8.6 This is a basic check of the map in fig.(8.2). The curves on the top and bottom
indicate the 1-σ error bars expected from eq.(8.6), and the marked points make
up the recovered power spectrum. Note that the vertical scale is different from
the power spectrum in fig.(8.1). . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
8.7 Schematic of the Quasioptical beam combination set-up inside the cryostat . . . 122
8.8 Schematic of the Quasioptical beam combination set-up inside the cryostat . . . 123
8.9 The power spectrum used for the simulation. . . . . . . . . . . . . . . . . . . . . 126
8.10 Temperature map from the power spectrum shown in fig.(8.9) above. Used as
input for the instrument simulation. Temperature anisotropies are in µK. . . . . 127
8.11 Recovered power spectrum from the Fizeau system simulation. . . . . . . . . . . 128
9.1 Results from Gibbs’ sampling for the experiment mentioned above. . . . . . . . . 141
9.2 Simulated “flat-sky” CMB map. . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
9.3 The power spectrum used for map simulation and the spectrum recovered from
the simulated map. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
9.4 Estimates of Cℓ recovered from Gibbs’ sampling; beam effects not included. . . . 150
9.5 Map recovered from Gibbs’ sampling, no beam. . . . . . . . . . . . . . . . . . . . 151
9.6 Estimates of Cℓ recovered from Gibbs’ sampling; beam effects included - I. . . . . 152
9.7 Estimates of Cℓ recovered from Gibbs’ sampling; beam effects included - II. . . . 153
9.8 Map recovered from Gibbs’ sampling, beam included - I. . . . . . . . . . . . . . . 154
9.9 Map recovered from Gibbs’ sampling, beam included - II. . . . . . . . . . . . . . 155
9.10 Histograms of recovered values of Cℓs: beam NOT included. . . . . . . . . . . . . 156
B.1 Schematic of the 2-port device . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
C.1 Stereographic projection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
Chapter 1
Overview
We present here a brief overview of this thesis, followed by an overview of the author’s specific
contributions to the MBI project as described in this thesis.
1.1 Thesis overview
In Chapter 2, we introduce cosmology and build up from first principles to get to the
Friedmann equation. An overview of the physics behind anisotropies in the CMB is given,
followed by a discussion of the serious problems in the model of the early universe, and inflation
is presented as a possible and logical solution to all of these problems. Observable signatures
of inflation on CMB polarization are mentioned.
In Chapter 3, we discuss a way to analyze CMB polarization using spin-harmonics, a
technique reviewed by Wandelt et al [1]. It is shown heuristically that the existence of B-modes
implies the existence of gravitational waves in the early universe, which had their origins in the
inflationary era.
In Chapter 4, we discuss the current state of CMB polarization experiments and briefly
discuss problems with imaging experiments. This leads into a discussion of interferometry and
its merits in Chapter 5. Chapter 6 discusses a novel idea for beam combination that yields
spectral information in Fourier space, unlike traditional interferometric systems.
Chapter 7 is an overview of the MBI instrument. Chapter 8 discusses sky and instrument
simulations. Data analysis techniques are discussed in Chapter 9.
1.2 Contributions
MBI is a collaboration between several institutions, UW-Madison, and Brown and Cardiff
Universities being the largest contributors in terms of manpower and resources. MBI’s mount
has been designed and built at UW-Madison by Peter Hyland. Tests of MBI’s tracking ability
are ongoing and have so far proven successful. The cryostat was built by Lucio Piccirillo and
tested extensively at Cardiff by Carolina Calderon. Corrugated antennae were tested at Brown
and UW-Madison by Andrei Korotkov and Melissa Lucero. The Fizeau beam combiner was
conceived and designed by Peter Timbie, Gregory Tucker, Lucio Piccirillo and Andrei Korotkov
and has been tested extensively at Brown by Andrei Korotkov. Spider-web bolometers have
been provided by JPL and have been tested at Brown by AK. Faraday effect phase modulators
have been provided by Brian Keating at UCSD. These devices have undergone tests at UW-
Madison, done mostly by Amanda Gault, with support from the author and Peter Hyland.
Evan Bierman at UCSD has provided expert knowledge necessary to carry out these tests.
MBI started out with a Butler beam combiner. Analysis of simulated data from the
Butler version of MBI and extraction of bandpowers have been discussed in exquisite detail
in C. Calderon’s thesis [2]. CC has also studied non-linear methods to recover images from
incomplete u-v coverage. Jaiseung Kim provided the antenna placement that maximizes u-v
coverage.
The author’s main contributions: MBI is a unique instrument in that it is able to function
simultaneously as an imager and an interferometer. The realization that MBI is capable of
this is a novel idea introduced in this thesis. In addition, the simulation and analysis techniques
developed in this thesis greatly enhance the capability of the instrument: they will allow
exquisite control of systematic effects and, in future versions of MBI-4, the ability to characterize
foregrounds - the most important step towards B-mode detection and characterization. With
these as the broad aims of this thesis, the specific contributions of the author are as follows:
1. Chapter 6: The Fizeau combiner system is developed and its application to interferometry
to recover spectral information in the Fourier plane, as well as the possibility of operating
an instrument with the Fizeau system as an imager and an interferometer simultaneously
are discussed in detail.
2. Chapter 7:
(a) A measurement of loss in an overmoded circular waveguide system, with a view to
testing antenna beam patterns for MBI.
(b) Characterization of a ferrite-based phase modulator in the W-band (with Amanda
Gault).
(c) Plans to carry out tests of antenna beam patterns once tests described in 2a above
are complete.
3. Chapter 8:
(a) Simulation of a CMB sky patch - this was done with a lot of help from Carolina
Calderon.
(b) Simulation of the MBI instrument (specifically the Fizeau system) and crude power
spectrum recovery.
4. Chapter 9: Gibbs’ sampling is a robust, computationally efficient data analysis technique
and is the only efficient method that allows global inference of covariance. This has been
applied to imaging before [3, 4], and we adapt the technique to use it with interferometric
data. This work has been done with Benjamin Wandelt.
5. Chapter 10: Measurements of losses in microstrip lines with a view to replacing guided
wave systems by a compact beam combination scheme (with R. Pathak). This measurement
is mentioned only in passing, since this technique is being developed for future
versions of MBI and space-based interferometric experiments.
MBI’s novelty lies not just in the fact that it is a new instrument with a novel combination
of interferometry and bolometry, but that its specific design allows it to achieve the capability
of characterizing both the CMB signal and foreground. This thesis explores how this is made
possible through design, instrumentation, simulations and data analysis.
Bibliography
[1] Y.-T. Lin and B. D. Wandelt, “A beginner’s guide to the theory of CMB temperature and
polarization power spectra in the line-of-sight formalism,” Astroparticle Physics, vol. 25,
pp. 151–166, Mar. 2006.
[2] C. Calderon, “Simulation of the performance of the Millimetre-wave Bolometric
Interferometer (MBI) for cosmic microwave background observations,” Ph.D. thesis,
Cardiff, 2006.
[3] B. D. Wandelt, D. L. Larson, and A. Lakshminarayanan, “Global, exact cosmic microwave
background data analysis using Gibbs sampling,” Phys. Rev. D, vol. 70, no. 8, pp. 083511–+,
Oct. 2004.
[4] B. D. Wandelt, “MAGIC: Exact Bayesian Covariance Estimation and Signal Reconstruction
for Gaussian Random Fields,” ArXiv Astrophysics e-prints, Jan. 2004.
Chapter 2
Introduction
- Rig Ved, Mandala I (Translation: The primeval atom gave rise to everything we know
in the universe. However, where did it come from, and if its source is unknown, does there even
exist anyone we can offer prayers to?)
It is only relatively recently (19th century onwards) that scientists have made predictions
about and observations of the early Universe and have come up with a successful paradigm that
explains the observations and reconciles them with physical theories.
Once upon a redshift (c. 1965), two scientists at Bell Labs decided to test their shiny new
antenna by pointing it to different parts of the sky. They ended up with a residual noise with
an equivalent temperature of ∼ 3K and a huge confusion on their hands. The puzzle about the
source of this seemingly uniform source was solved only when physicists at the nearby Princeton
University shared with them their ideas about the origins of the universe. Thus started the field
of CMB cosmology, one which has proved to be even more fundamental to our understanding
of the universe over time.
The rest of this chapter will briefly introduce two of the three “pillars” of Cosmology (we
do not discuss primordial nucleosynthesis here - see, e.g. [1]), as well as the background in General
Relativity, and discuss the physics and importance of CMB temperature and polarization
anisotropies and how they can act as windows to the very early universe.
2.1 Hubble’s Law and FRWL Cosmology
In the 1920s[2], Hubble pointed his telescope to a few galaxies and discovered the fact
that each one of them was moving away from us, with a velocity proportional to the distance
between us and the galaxy we’re looking at. Since there is no reason to expect that the Milky
Way is at the centre of the Universe, it is reasonable to extend this result and say that every
galaxy is receding from every other with the same property of recession. This has been checked
with observations as well. It turns out that the formalism for expressing Hubble’s law is simple,
and the idea along with all its results remains the same in the General Theory of Relativity
(GR henceforth) as well as Newtonian mechanics. Clearly, Newtonian mechanics is not up to
the task of dealing with the expanding Universe, for several reasons.
Let us denote by r the physical distance between two galaxies, and by v their relative
velocity. Hubble’s law then says that v ∝ r, so we can write the equation
v = Hr (2.1)
where H is called the Hubble parameter (technically, it should be a “constant”, but we have
tacitly ignored curvature and every other issue associated with GR; H can be thought to en-
compass all these GR effects). We would do well to remember that this expansion is not just a
widening in distance between galaxies, it is a “stretching” of space (space-time, strictly speak-
ing, but the beauty of the presently-accepted Friedmann-Robertson-Walker-Lemaitre (FRWL)
universe model is that one can view “spatial slices” or spatial hypersurfaces at different times; it
is possible that the Universe is not FRWL - there are other solutions to the Einstein equations
that are not homogeneous spatially or temporally, but while that is an active area of research,
everyone in the astrophysics community agrees that FRWL is by far the most likely model that
the Universe obeys). In that case, we can (as a matter of fact, we ought to, as we will see later)
reformulate the picture in the following way. We encode the expansion of the Universe in a
single variable which is a function of time, and define what is called a “comoving” frame of
reference in which the distance between, say, any two given galaxies is a constant, i.e. we are
“viewing” this distance from a pre-defined epoch. There is nothing that prevents us from setting this
pre-defined epoch to “now” - indeed, this is often a convenient choice as we will see. The variable
that encodes the expansion of the Universe is called the “scale-factor”, which we represent
here with a(t). We can then write any given physical distance as
r = a (t)x (2.2)
where x is the comoving distance between the two given points under consideration. The
velocity is $v = \frac{dr}{dt}$, meaning that
$$ v = \frac{d}{dt}\left(a(t)\,x\right) = x\,\frac{da}{dt} \tag{2.3} $$
where the last equality holds because x (i.e. the comoving distance between any two
given objects) is fixed by definition. We can then write the Hubble law as
$$ v = x\,\frac{da}{dt} = Hax = Hr \tag{2.4} $$
$$ \Rightarrow H = \frac{1}{a}\frac{da}{dt} \tag{2.5} $$
This, then, is the most general definition of the Hubble parameter. By calling it a parameter,
we have gotten away with proving this relation for any theory of gravity we might choose to
consider - Newtonian or Einsteinian.
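As a minimal numerical sketch of eq.(2.5), the snippet below estimates H = (1/a) da/dt by a central finite difference. The power-law scale factor a(t) = t^(2/3) and all function names are assumptions chosen purely for illustration; they do not come from the thesis.

```python
def hubble_parameter(scale_factor, t, dt=1e-6):
    """Estimate H(t) = (1/a) da/dt with a central finite difference."""
    dadt = (scale_factor(t + dt) - scale_factor(t - dt)) / (2.0 * dt)
    return dadt / scale_factor(t)


def toy_scale_factor(t):
    # Assumed toy model: a(t) = t^(2/3) in arbitrary units.
    return t ** (2.0 / 3.0)


H_numeric = hubble_parameter(toy_scale_factor, t=2.0)
H_exact = 2.0 / (3.0 * 2.0)  # for a = t^(2/3), H = 2 / (3 t)
```

For this toy model the finite-difference estimate should agree with the analytic value H = 2/(3t) to high accuracy.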
Next, we look at a special case of the FRWL metric, namely, the Minkowski metric in
spherical polar co-ordinates:
$$ ds^2 = c^2 dt^2 - a(t)^2\left[dr^2 + r^2 d\Omega^2\right] \tag{2.6} $$
It is more convenient to set c = 1 so that
$$ ds^2 = dt^2 - a(t)^2\left[dr^2 + r^2 d\Omega^2\right] \tag{2.7} $$
where $d\Omega^2 = d\theta^2 + \sin^2\theta\, d\phi^2$. This represents flat space-time only. Let us generalize
this to a space-time with positive curvature, in analogy with a 2-sphere (the object we know
and love as a “sphere”). This is a 3-d surface, so in analogy with the “normal” or 2-d sphere
whose equation is
$$ x^2 + y^2 + z^2 = r^2 \tag{2.8} $$
(where r is the radius of the sphere), we have
$$ x^2 + y^2 + z^2 + w^2 = b^2 \tag{2.9} $$
Here, x, y and z are ordinary spatial dimensions, and w can be thought of as a fiducial variable,
whose physical interpretation is that it is a 3-sphere embedded in 4-d space. If we accept this
without much ado, we can go about expressing w completely in terms of r, b etc. in the following
way.
We first rewrite the above equation as
$$ r^2 + w^2 = b^2 \Rightarrow w^2 = b^2 - r^2 \tag{2.10} $$
Differentiating this equation, we get
$$ 2r\,dr + 2w\,dw = 0 \Rightarrow dw = -\frac{r\,dr}{w} \Rightarrow dw^2 = \frac{r^2 dr^2}{w^2} = \frac{r^2 dr^2}{b^2 - r^2} \tag{2.11} $$
Now, the metric has to be modified to
$$ ds^2 = dt^2 - a(t)^2\left[dr^2 + dw^2 + r^2 d\Omega^2\right] \tag{2.12} $$
Let us evaluate a part of the metric:
$$ dr^2 + dw^2 = dr^2\left[1 + \frac{r^2}{b^2 - r^2}\right] = dr^2\left[\frac{b^2 - r^2 + r^2}{b^2 - r^2}\right] = \frac{dr^2}{1 - r^2/b^2} \tag{2.13} $$
The $1/b^2$ in the denominator is reminiscent of curvature, and so we call it exactly that and rewrite
it as k. Combining everything together, we then have
$$ ds^2 = dt^2 - a(t)^2\left[\frac{dr^2}{1 - kr^2} + r^2 d\Omega^2\right] \tag{2.14} $$
where k is curvature. Notice that when k = 0, the FRWL metric reduces to Minkowski, as we
would expect it to.
This method can be applied without loss of generality to negative curvature as well, and
the only difference is that the fiducial variable will satisfy this equation
$$ r^2 - w^2 = b^2 \tag{2.15} $$
so that we will end up with this metric
$$ ds^2 = dt^2 - a(t)^2\left[\frac{dr^2}{1 + kr^2} + r^2 d\Omega^2\right] \tag{2.16} $$
We can generalize and write
$$ ds^2 = dt^2 - a(t)^2\left[\frac{dr^2}{1 - kr^2} + r^2 d\Omega^2\right] \tag{2.17} $$
where it is understood that k can take positive and negative values. We can write the metric
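another way by substituting $\sqrt{k}\,r = \sin\sqrt{k}\chi$ and working out that
$$ dr = \cos\sqrt{k}\chi\; d\chi \tag{2.18} $$
$$ \Rightarrow dr^2 = \cos^2\sqrt{k}\chi\; d\chi^2 \tag{2.19} $$
$$ \Rightarrow dr^2 = \left(1 - kr^2\right) d\chi^2 \tag{2.20} $$
$$ \Rightarrow d\chi^2 = \frac{dr^2}{1 - kr^2} \tag{2.21} $$
Substituting for this and for r, we get that the metric is
$$ ds^2 = dt^2 - a(t)^2\left[d\chi^2 + \frac{\sin^2\sqrt{k}\chi}{k}\, d\Omega^2\right] \tag{2.22} $$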
This exercise is useful because we can immediately extract the Angular Diameter Distance
from the new form of the metric - it is the square root of the factor that multiplies $d\Omega^2$:
$$ D_A = \frac{\sin\sqrt{k}\chi}{\sqrt{k}} \tag{2.23} $$
In the case of flat space-time, $k \to 0$ such that $\frac{\sin\sqrt{k}\chi}{\sqrt{k}} \to \chi = r$, which is what we expect.
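The distance factor of eq.(2.23) can be sketched as a small function covering all three signs of curvature. The function name is hypothetical, and the sinh branch for k < 0 is the standard analytic continuation of the sine expression (sin(ix) = i sinh(x)), stated here as an assumption rather than something derived in the text.

```python
import math


def distance_factor(chi, k):
    """Return sin(sqrt(k) chi)/sqrt(k), continued to k = 0 and k < 0."""
    if k > 0:
        root = math.sqrt(k)
        return math.sin(root * chi) / root
    if k < 0:
        root = math.sqrt(-k)
        # Analytic continuation: sin(i x) / i = sinh(x)
        return math.sinh(root * chi) / root
    return chi  # flat limit, k -> 0


closed = distance_factor(1.0, 0.5)
flat = distance_factor(1.0, 0.0)
open_ = distance_factor(1.0, -0.5)
```

For the same comoving χ, the factor is smaller than χ in a closed geometry and larger than χ in an open one, with the flat case in between.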
Having studied the geometrical aspects of the metric, let us now turn our attention
to the dynamics of the Universe. The equations that are derived below are again very useful,
especially in their most general form, and their beauty lies in the fact that though the derivation
has nothing to do with GR, these are the exact same results we would get if we worked with
the Einstein equations instead. The GR approach will be outlined briefly after the following
derivation. Let us start from the first law of thermodynamics:
dU + pdV = 0 (2.24)
where, naturally, $U = \rho a^3$, where ρ is the density (total energy density, but this can be simplified
for those epochs when the total energy density is dominated by just one component) and a the
scale factor, which is a function of time. Substituting for U (and taking $V = a^3$), we get
$$ a^3 d\rho + 3a^2\rho\, da + 3a^2 p\, da = 0 \tag{2.25} $$
$$ \Rightarrow 3a^2 da\,(p + \rho) = -a^3 d\rho \tag{2.26} $$
$$ \Rightarrow -3\,(p + \rho)\,\frac{da}{a} = d\rho \tag{2.27} $$
$$ \Rightarrow -3\left(\frac{p}{\rho} + 1\right)\frac{da}{a} = \frac{d\rho}{\rho} \tag{2.28} $$
Now $p/\rho$ is what is referred to as the equation of state. It is usually denoted by w in the literature,
so we will follow the convention:
$$ -3\,(w + 1)\,\frac{da}{a} = \frac{d\rho}{\rho} \tag{2.29} $$
$$ \Rightarrow \frac{d\ln\rho}{d\ln a} = -3\,(1 + w) \tag{2.30} $$
This, then, is the most general expression relating ρ and a. Notice that we have not yet made
any assumption about w - it may very well be a function of a, and this equation will still hold.
If we assume a constant equation of state w (as is true for baryonic matter and radiation), we
get a simpler relation:
$$ \rho \sim a^{-3(1+w)} \tag{2.31} $$
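The scaling of eq.(2.31) can be checked numerically by integrating the continuity relation of eq.(2.26) in the form d ln ρ = −3(1 + w) d ln a. The sketch below uses a simple midpoint rule; the function name and the step count are illustrative assumptions, and w is passed as a function of a so that a varying equation of state could be tried as well.

```python
import math


def density_ratio(a, w_of_a, steps=10000):
    """rho(a) / rho(a=1), integrating d ln(rho) = -3 (1 + w) d ln(a)."""
    step = math.log(a) / steps          # step in ln(a), negative for a < 1
    log_ratio = 0.0
    for i in range(steps):
        a_mid = math.exp((i + 0.5) * step)
        log_ratio += -3.0 * (1.0 + w_of_a(a_mid)) * step
    return math.exp(log_ratio)


matter = density_ratio(0.5, lambda a: 0.0)          # w = 0
radiation = density_ratio(0.5, lambda a: 1.0 / 3.0)  # w = 1/3
vacuum = density_ratio(0.5, lambda a: -1.0)         # w = -1
```

With constant w the results should reproduce a^(-3), a^(-4) and a constant density for matter, radiation and vacuum energy respectively.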
There is another dynamical equation we can derive with our simplistic approach, but this
one requires a leap of faith on one count. Start out with the classical statement for conservation
of energy
$$ \frac{1}{2}mv^2 - \frac{GMm}{r} = \text{constant} \tag{2.32} $$
where m is the mass of a “particle” and M is the mass of the Universe in the shape of a sphere
of uniform density ρ and $M = (4/3)\pi r^3\rho$. Changing the above equation to represent quantities
per unit mass, we get
$$ \frac{1}{2}v^2 - \frac{4\pi G}{3}\frac{r^3\rho}{r} = \text{constant} \tag{2.33} $$
Use Hubble’s law, v = Hr, and absorb a factor of 2 into the constant to get
$$ H^2 = \frac{8\pi G}{3}\rho + \frac{\text{constant}}{r^2} \tag{2.34} $$
This is called the Friedmann equation. It is one of the most fantastic coincidences of Cosmology
that a line of argument as weak as the preceding one can yield the same result as GR. We
can derive this from Einstein’s Equations, with the only difference that the second term on the
right will be $-k/r^2$, where k is space-time curvature, as before, so that the final equation is
$$ H^2 = \frac{8\pi G}{3}\rho - \frac{k}{r^2} \tag{2.35} $$
2.2 Cosmodynamic calculations
Having introduced the basic concepts in cosmology, let us work through a few small
calculations that will be relevant in §2.3. It is conventional to write eq.(2.35) as
$$ H^2 = \frac{8\pi G}{3}\rho_{crit} \tag{2.36} $$
where we have incorporated curvature and the net energy density of the universe in the quantity
ρcrit. When studying cosmology, we are not always interested in the value of ρ for different
components - just what fraction of the energy density they make up. To this end, we define a
set of parameters denoted by Ω such that for a component X,
$$ \Omega_X = \frac{\rho_X}{\rho_{crit}} \tag{2.37} $$
is the fraction of energy density in component X at a given time.
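To give a sense of scale, the sketch below inverts eq.(2.36) to obtain ρcrit = 3H²/(8πG) today and then forms the ratio of eq.(2.37). The value h = 0.72 is the one used later in this chapter; the constants G and the length of a megaparsec are standard values, and the function names are hypothetical.

```python
import math

G_SI = 6.674e-11       # Newton's constant in m^3 kg^-1 s^-2
MPC_IN_M = 3.0857e22   # one megaparsec in metres


def critical_density(h):
    """rho_crit = 3 H0^2 / (8 pi G) in kg/m^3, for H0 = 100 h km/s/Mpc."""
    h0_si = 100.0 * h * 1000.0 / MPC_IN_M   # H0 converted to s^-1
    return 3.0 * h0_si ** 2 / (8.0 * math.pi * G_SI)


def density_fraction(rho_x, h):
    """Omega_X = rho_X / rho_crit, as in eq.(2.37)."""
    return rho_x / critical_density(h)


rho_crit_today = critical_density(0.72)
```

The result is of order 10^-26 kg/m^3, i.e. a few hydrogen atoms per cubic metre.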
2.2.1 Preliminaries
Here is my notation: m, γ, Λ, κ denote matter, radiation, vacuum and curvature respectively.
$$ \Omega_m = \Omega_{m,\mathrm{NOW}} = \Omega_{m0} \tag{2.38} $$
etc., and $\Omega_m(t)$ is the same parameter at time t.
Let us just write down the expressions for H (t) and H0:
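$$ H^2 = \frac{8\pi G}{3}\left[\rho_m + \rho_\gamma + \rho_\Lambda + \rho_\kappa\right] = \frac{8\pi G}{3}\left[\rho_{m0}a^{-3} + \rho_{\gamma 0}a^{-4} + \rho_\Lambda + \rho_{\kappa 0}a^{-2}\right] = \frac{8\pi G}{3}\rho_{cr}(t) \tag{2.39} $$
and
$$ H_0^2 = \frac{8\pi G}{3}\left[\rho_{m0} + \rho_{\gamma 0} + \rho_\Lambda + \rho_{\kappa 0}\right] = \frac{8\pi G}{3}\rho_{cr0} \tag{2.40} $$
Now divide the two:
$$ \frac{H^2}{H_0^2} = \frac{\rho_{cr}(t)}{\rho_{cr0}} = \frac{\Omega_m a^{-3} + \Omega_\Lambda + \Omega_\gamma a^{-4} + \Omega_\kappa a^{-2}}{\Omega_m + \Omega_\Lambda + \Omega_\gamma + \Omega_\kappa\ (= 1)} \tag{2.41} $$
And so:
$$ \rho_{cr}(t) = \left(\Omega_m a^{-3} + \Omega_\Lambda + \Omega_\gamma a^{-4} + \Omega_\kappa a^{-2}\right)\rho_{cr0} \tag{2.42} $$
And also, for any general component l whose density dilutes as $a^{-n_l}$ (with $n_l$ = 3, 4, 0 and 2 for
matter, radiation, vacuum and curvature respectively), $\Omega_l(t)$ is:
$$ \Omega_l(t) = \frac{\rho_l(t)}{\rho_{cr}(t)} = \frac{\rho_{l0}\,a^{-n_l}}{\rho_{cr0}\left(\Omega_m a^{-3} + \Omega_\Lambda + \Omega_\gamma a^{-4} + \Omega_\kappa a^{-2}\right)} \tag{2.43} $$
and so finally:
$$ \Omega_l(t) = \frac{\Omega_l\,a^{-n_l}}{\Omega_m a^{-3} + \Omega_\Lambda + \Omega_\gamma a^{-4} + \Omega_\kappa a^{-2}} \tag{2.44} $$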
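The expression for the density fractions at an arbitrary scale factor can be sketched in a few lines. The present-day values Ωm = 0.3 and Ωγ = 4.8 × 10^-5 are the ones quoted later in this chapter; a flat universe (Ωκ = 0, with ΩΛ making up the remainder) is an assumption made only for this illustration, as are the function and dictionary names.

```python
def density_fractions(a, present):
    """Fractions Omega_l at scale factor a.

    `present` maps a component name to (present-day fraction, exponent n),
    where the component's density dilutes as a^(-n).
    """
    weights = {name: frac * a ** (-n) for name, (frac, n) in present.items()}
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


OMEGA_M, OMEGA_G = 0.3, 4.8e-5
present_day = {
    "matter": (OMEGA_M, 3),
    "radiation": (OMEGA_G, 4),
    "vacuum": (1.0 - OMEGA_M - OMEGA_G, 0),  # assumed flat universe
    "curvature": (0.0, 2),
}

at_recombination = density_fractions(1e-3, present_day)
at_equality = density_fractions(OMEGA_G / OMEGA_M, present_day)
```

With these inputs, matter and radiation contribute equally at a = Ωγ/Ωm, and the vacuum term is negligible at a = 10^-3.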
2.2.2 Horizon size at recombination
Since light travels at a finite speed c, in a time t, only those spots that are within a
distance ct of each other are in causal contact. Therefore, if the age of the universe is t, then
parts as big as ct are causally connected. This is called the horizon size.
The universe is radiation-dominated from the Big-Bang almost all the way up to recombination.
Matter-radiation equality occurs just before recombination, so in principle, both matter
and radiation terms must be kept while calculating the horizon size.
Let us write down the expression for the Hubble parameter:
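$$ H^2 = \frac{8\pi G}{3}\left[\Omega_\gamma(t) + \Omega_m(t)\right]\rho_{cr}(t) = \frac{8\pi G}{3}\left[\Omega_\gamma a^{-4} + \Omega_m a^{-3}\right]\rho_{cr0} = \frac{H_0^2}{\rho_{cr0}}\,\rho_{cr0}\left[\Omega_\gamma + \Omega_m a\right]a^{-4} \tag{2.45} $$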
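Replacing $H_0$ by $100\,h$ km s$^{-1}$ Mpc$^{-1}$, we get:
$$ H = \sqrt{\Omega_\gamma + \Omega_m a}\;\, a^{-2}\, h \left(100\,\frac{\mathrm{km}}{\mathrm{s\,Mpc}}\right) \tag{2.46} $$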
Now we get to what we started out to calculate, the horizon size at recombination:
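$$ \eta_R = c\int_{a=0}^{a=10^{-3}} \frac{dt}{a} = c\int_0^{10^{-3}} \frac{da}{a^2 H} \tag{2.47} $$
Replacing the value of H from above, we get:
$$ \eta_R = \frac{c}{100}\int_0^{10^{-3}} \frac{da}{\sqrt{\Omega_\gamma + \Omega_m a}\;h}\ \mathrm{Mpc} = \frac{3000\ \mathrm{Mpc}}{(\Omega_m h^2)^{1/2}}\int_0^{10^{-3}} \frac{da}{\sqrt{\frac{\Omega_\gamma}{\Omega_m} + a}} \tag{2.48} $$
The final result is:
$$ \eta_R = \frac{6000\ \mathrm{Mpc}}{(\Omega_m h^2)^{1/2}}\left(\sqrt{\frac{\Omega_\gamma}{\Omega_m} + 10^{-3}} - \sqrt{\frac{\Omega_\gamma}{\Omega_m}}\right) \tag{2.49} $$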
Putting in $\Omega_m = 0.3$, $\Omega_\gamma = 4.8\times10^{-5}$ and h = 0.72, we get $\eta_R$ = 326 Mpc. This is the comoving
horizon size at recombination. Considering the age of the universe to be ∼14G light years (and using
1 Mpc ≈ 3.26 million light years), we get that the angle that the horizon subtends on the sky should be
$$ \theta_{\text{recombination horizon}} = \frac{326 \times 3.26}{14000}\ \mathrm{rad} \approx 4.3^\circ \tag{2.50} $$
This means that only patches ∼4.3° across should have similar temperatures on the sky! However, this
is not true - the CMB sky is very nearly uniform. This problem is discussed further in §2.3.1.
In the foregoing calculation, we have assumed that information is able to travel at the
speed of light. However, in reality, information travels at the speed of sound in the plasma,
which happens to be $\sim c/\sqrt{3}$, so that the above estimate revises to ∼2°.
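The horizon-size estimate of this subsection can be reproduced with a short script. This is a sketch under the same assumptions as the text (only matter and radiation contribute, recombination at a = 10^-3, a crude distance of 14 billion light years to the last scattering surface); the function names are hypothetical, and the speed of light is taken as 299792.458 km/s rather than the rounded 3 × 10^5 km/s, so the result differs slightly from the quoted 326 Mpc.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def horizon_closed_form(omega_m, omega_g, h, a_rec=1e-3):
    """Comoving horizon in Mpc from the closed-form integral."""
    ratio = omega_g / omega_m
    prefactor = 2.0 * C_KM_S / (100.0 * math.sqrt(omega_m * h * h))
    return prefactor * (math.sqrt(ratio + a_rec) - math.sqrt(ratio))


def horizon_numeric(omega_m, omega_g, h, a_rec=1e-3, steps=100000):
    """Same quantity from a midpoint-rule integral of c da / (a^2 H)."""
    da = a_rec / steps
    total = 0.0
    for i in range(steps):
        a = (i + 0.5) * da
        total += da / math.sqrt(omega_g + omega_m * a)  # a^2 H / H0
    return C_KM_S / (100.0 * h) * total


def horizon_angle_deg(eta_mpc, distance_mly=14000.0):
    """Angle subtended by eta_mpc at the assumed distance (1 Mpc = 3.26 Mly)."""
    return math.degrees(eta_mpc * 3.26 / distance_mly)


eta = horizon_closed_form(0.3, 4.8e-5, 0.72)
angle = horizon_angle_deg(eta)
```

The closed-form and numerical values should agree closely, and the angle comes out at a little over 4 degrees, in line with the estimate above.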
2.2.3 Age of the Universe
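From eq.(2.41), we have
$$ H^2 = H_0^2\left[\frac{\rho_{cr}(t)}{\rho_{cr0}}\right] = H_0^2\left[\frac{\Omega_m a^{-3} + \Omega_\Lambda + \Omega_\gamma a^{-4} + \Omega_\kappa a^{-2}}{\Omega_m + \Omega_\Lambda + \Omega_\gamma + \Omega_\kappa\ (= 1)}\right] \tag{2.51} $$
and so
$$ H = H_0\left[\Omega_m a^{-3} + \Omega_\Lambda + \Omega_\gamma a^{-4} + \Omega_\kappa a^{-2}\right]^{1/2} \tag{2.52} $$
Remember the definition of the Hubble parameter:
$$ H = \frac{1}{a}\frac{da}{dt} \tag{2.53} $$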
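from where
$$ dt = \frac{da}{aH} = \frac{da}{aH_0\left[\Omega_m a^{-3} + \Omega_\Lambda + \Omega_\gamma a^{-4} + \Omega_\kappa a^{-2}\right]^{1/2}} \tag{2.54} $$
so that
$$ t = \int\frac{da}{aH} = \int\frac{da}{aH_0\left[\Omega_m a^{-3} + \Omega_\Lambda + \Omega_\gamma a^{-4} + \Omega_\kappa a^{-2}\right]^{1/2}} \tag{2.55} $$
is a general expression for the age of the universe, without quintessence. Now, $a = \frac{1}{1+z}$ so that
$da = -\frac{dz}{(1+z)^2}$, so that
$$ t = -\int\frac{dz}{(1+z)\,H_0\left[\Omega_m(1+z)^3 + \Omega_\Lambda + \Omega_\gamma(1+z)^4 + \Omega_\kappa(1+z)^2\right]^{1/2}} \tag{2.56} $$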
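The age integral can be evaluated numerically as a sketch. The code below integrates da/(aH) from a = 0 to a = 1 with a midpoint rule and converts to Gyr using 1/H0 ≈ 9.778/h Gyr. The parameter values (Ωm = 0.3, ΩΛ = 0.7, h = 0.72, radiation and curvature neglected) and all names are assumptions for illustration only.

```python
import math

HUBBLE_TIME_GYR = 9.778  # 1/H0 in Gyr for H0 = 100 km/s/Mpc


def dimensionless_age(omega_m, omega_l, omega_g=0.0, omega_k=0.0, steps=100000):
    """H0 * t0 = integral from 0 to 1 of da / (a E(a)), midpoint rule."""
    da = 1.0 / steps
    total = 0.0
    for i in range(steps):
        a = (i + 0.5) * da
        e_of_a = math.sqrt(omega_m * a ** -3 + omega_l
                           + omega_g * a ** -4 + omega_k * a ** -2)
        total += da / (a * e_of_a)
    return total


def age_gyr(h, omega_m, omega_l, omega_g=0.0, omega_k=0.0):
    """Age of the universe in Gyr for the given density parameters."""
    return HUBBLE_TIME_GYR / h * dimensionless_age(omega_m, omega_l, omega_g, omega_k)


age = age_gyr(0.72, omega_m=0.3, omega_l=0.7)
```

For a flat matter-plus-vacuum universe the integral has the known closed form (2 / (3 √ΩΛ)) asinh(√(ΩΛ/Ωm)), which gives a convenient cross-check, and for Ωm = 1 alone it reduces to 2/3.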
2.3 The CMB
The CMB is another “pillar” of cosmology, and by far the most informative one. Before
we delve into what cosmological parameters can be constrained with the CMB, let us look
briefly at the CMB itself.
Hubble’s law implies that as we go back in time, the size of the universe decreases monotonically.
This means that the wavelength of photons decreases and the temperature of the
universe increases. This implies that there must have been an epoch earlier than which the
universe would have been ionized. This epoch is called “recombination” or “last scattering
surface” and we shall use these terms interchangeably. Before recombination, the universe can
be thought of as a “primordial soup” of protons, electrons, neutrons (i.e. baryonic matter) and
photons. Baryonic matter experiences two opposing forces: the attractive force of gravity and
repulsive force of radiation pressure. These two opposing forces set up acoustic oscillations in
the “primordial soup”. But these end at recombination, and the photons that travel freely after
recombination constitute the CMB. We need to remember, though, that the universe is expand-
ing even as these acoustic oscillations permeate the universe. Keeping this in mind, and looking
at comoving distances instead of physical ones, let us examine the acoustic oscillations in a little
more detail. Ignoring the origin of the oscillations for the moment, we immediately see from
figs.(2.1) and (2.2) that every length scale ends up with a different amplitude. If the wavelength
of a “mode” (i.e. a length scale) is sufficiently large, small changes in the wavelength do not
produce an appreciable effect (this is the reason that the power spectrum is nearly constant
for low ℓs - see fig.(2.4)). As the wavelength decreases, however, the amplitude of the mode at
recombination increases until it reaches a maximum, and then decreases with decreasing wave-
length. The amplitude cannot possibly be measured today, but the power level can, and so this
is the quantity that CMB cosmology aims to measure. The reason that we can measure this
quantity (i.e. the power in fluctuations in matter) is that the photons that we detect today as
Figure 2.1: Evolution of perturbations. Shown here are three oscillation sizes which are important
for extracting information from the CMB.
the CMB were coupled to matter before recombination. This is why fluctuations in the CMB
temperature directly indicate fluctuations in the matter before recombination. What makes the
study of the CMB fundamental to our understanding of the universe is that it is these small
fluctuations in matter that grow to become all the structure we see in the universe today. The
study of fluctuations in the CMB is the study of the origins of all structure in the
universe.
2.3.1 Problems with the simple early-universe model
We have explained the origin of the CMB, but there are problems with this model:
1. What is the origin of these oscillations? In particular, if there is no fixed phase relation
between the oscillations at different scales, the resulting spectrum turns out to be flat!
But this is not what we observe; what causes the initial phases of these oscillations to be
related to each other?
2. We know that these oscillations must have been small - but why?
3. The universe is very nearly spatially flat - what causes this particular value of curvature
to be chosen? But the WORST problem is:
Figure 2.2: Acoustic oscillations in the CMB. What we are able to measure today is proportional
to the square of the amplitude at recombination, via the CMB power spectrum.
4. Why is the entire CMB sky nearly at one temperature when parts of it could not have
been in causal contact (as calculated in §2.2.2)?
It is possible to explain part 4 above if the universe started out small, but was expanded out
by a large amount in a short period of time. This would cause parts that were in causal contact
before this expansion to be more than a comoving horizon away from each other.
This simple idea was put forth by Alan Guth in 1981[3] as an elegant solution to all
the four problems mentioned above, and is called “Inflation”. Before we discuss how inflation
solves the problems mentioned above, let us look at its dynamics.
One of the simplest possible rapid expansions is exponential expansion, which can happen
in the following way. Look at the definition of the Hubble parameter:
$$ H = \frac{1}{a}\frac{da}{dt} \implies H\,dt = d\ln a \tag{2.57} $$
Exponential expansion $\implies a \sim e^{\text{constant}\times t}$, which can be easily achieved if H is a constant.
Thus, exponential expansion $\implies$ constant H. But what component of the universe can
satisfy this condition? Let us look at the Friedmann equation:
$$ H^2 = \frac{8\pi G}{3}\rho - \frac{k}{r^2} \implies H \sim \sqrt{\rho} \tag{2.58} $$
This means that the energy density of the component dominating the total energy density would
have to be constant. However, from eq.(2.31), we get that ρ can be constant with time if and
only if w = −1, which implies a negative pressure. While the standard model of particle physics
does not provide us with a particle with this property, [4] shows a possible way to get w = −1:
a scalar field that is “slowly rolling” down a potential, such that the potential energy dominates
the kinetic energy at first, but this slowly reverses. Certain criteria need to be satisfied in order
for this to happen, and these are discussed in Appendix E.
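The statement that constant H gives exponential expansion follows from integrating H dt = d ln a, and it can be illustrated with a tiny sketch. The expansion factor of 10^25 used here is only an order-of-magnitude figure of the kind quoted in this section, and the function names and the unit choice for H are assumptions.

```python
import math


def efolds(expansion_factor):
    """Number of e-folds N with a_end / a_start = e^N."""
    return math.log(expansion_factor)


def scale_factor(hubble, t, a_start=1.0):
    """a(t) for constant H, obtained by integrating H dt = d ln(a)."""
    return a_start * math.exp(hubble * t)


n_efolds = efolds(1e25)
# With constant H (arbitrary units), the time needed is N / H.
duration = n_efolds / 2.0
growth = scale_factor(2.0, duration)
```

An expansion by a factor of 10^25 corresponds to roughly 58 e-folds, i.e. the scale factor only needs to grow for about 58 Hubble times at constant H.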
Let us now return to the four problems mentioned above and see how inflation can solve
them:
1. Quantum field theory tells us that there must be fluctuations at the level of $\sim 10^{-30}$ in
classical vacuum. If these fluctuations in energy density can be expanded out by factors
of $\sim 10^{25}$, we get classical fluctuations $\sim 10^{-5}$, which can act as seeds for the acoustic
oscillations which lead to the formation of the CMB and large scale structure in the
universe. Furthermore, the spectrum of these fluctuations is flat.
2. Inflation expands EVERY scale by the same factor. Combined with the flatness of the
initial quantum fluctuations, this leads to all the acoustic oscillations starting out in the
same phase.
3. The universe can easily have a non-zero curvature pre-inflation. However, it is always
possible to find a small enough region of space which is spatially flat. Inflation can
expand out this small section to the entire observable universe.
4. As stated before, inflation can get rid of the horizon problem with the correct amount of
expansion.
Inflation doesn’t just solve the problems in early universe cosmology. It produces gravi-
tational waves as well - these are the tensor perturbations in Einstein’s equations of GR in the
early universe[4]. Scattering produces polarization before the LSS because photons have a small
quadrupole moment1. The gravitational wave passing through space-time while polarization
is being produced causes a certain “curl” pattern to be produced [6]. Thus, polarization over
the CMB sky can be split into two parts - one with a “gradient” pattern and the other with
1The reason for this is discussed in detail in chapter 2
Figure 2.4: WMAP 3 year power spectrum.
a “curl” pattern. These are called “E-modes” and “B-modes” respectively. In the absence
of any interactions between LSS and now, the presence of B-modes indicates the presence of
gravitational waves in the early universe. Thus, the detection of B-modes in CMB po-
larization anisotropy is the most direct indication of inflation and the B-mode signal
is proportional to the inflaton potential [4]. Slow-roll inflation (a model developed by Alan
Guth, Andrei Linde and Andreas Albrecht) and parameters associated with it are discussed in
Appendix D.
2.3.2 Multipole expansion
Anisotropies in the CMB can be expanded over the full sky in terms of spherical harmonic
functions:
$$ \frac{\Delta T}{T} = \sum_{\ell}\sum_{m} a_{\ell m}\,Y_{\ell m}(\theta, \phi) \tag{2.59} $$
This is fine, but how do we extract useful information about cosmology from here? And how
do we relate this to measurements?
If early universe physics described in this section is correct, then the CMB is gaussian2,
so that a two-point correlation function contains all the information in the CMB anisotropy
field. Thus,
$$ C(\theta) = \langle \Delta T_1(\theta, \phi)\,\Delta T_2(\theta, \phi)\rangle \tag{2.60} $$
2In reality, there is some non-gaussianity, but little of it originates in the early universe
contains all the information in the CMB. It turns out that the Fourier transform of C(θ) is
$$ C_\ell\,\delta_{\ell\ell'}\,\delta_{mm'} = \langle a_{\ell m}\,a^*_{\ell' m'}\rangle \tag{2.61} $$
where $C_\ell$ is known as the power spectrum of the CMB. It tells us the amount of power in
anisotropies at a given lengthscale specified by ℓ, where for large enough ℓ (>20), $\ell = \pi/\theta$. For
a detailed discussion of this relationship, see Appendix C. In fig.(2.1), the amplitude of the
oscillation at the LSS is determined by the wavelength of the particular oscillation. Each Cℓ is
the square of the amplitude for a particular value of the wavelength, which is a function of ℓ
and therefore an angle on the sky. This is the reason that a power spectrum is a more
useful tool for studying the early universe than an image - it probes individual angular
scales on the sky and therefore individual length scales in the early universe. We shall discuss
later in §5.6 how the power spectrum is related to the output of an interferometer. The power
spectrum from 3-year WMAP data is shown in fig.(2.4) [5].
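The relation between the coefficients $a_{\ell m}$ and the power spectrum $C_\ell$ can be illustrated with a toy Monte Carlo. The sketch below draws Gaussian $a_{\ell m}$ for a single multipole of a real field, forms the usual estimate of $C_\ell$ by averaging $|a_{\ell m}|^2$ over m, and converts an angle to a multipole with ℓ = π/θ. The function names, the seed and the input value of $C_\ell$ are arbitrary assumptions; this is not the analysis pipeline of the thesis.

```python
import math
import random


def draw_alm(cl, ell, rng):
    """Draw a_lm for m = 0..ell of a real Gaussian field with <|a_lm|^2> = cl."""
    a_l0 = complex(rng.gauss(0.0, math.sqrt(cl)), 0.0)   # m = 0 is real
    sigma = math.sqrt(cl / 2.0)                           # per real/imag part
    rest = [complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for _ in range(ell)]
    return [a_l0] + rest


def estimate_cl(alm):
    """Average |a_lm|^2 over all 2l+1 values of m (m < 0 mirrors m > 0)."""
    ell = len(alm) - 1
    power = abs(alm[0]) ** 2 + 2.0 * sum(abs(a) ** 2 for a in alm[1:])
    return power / (2 * ell + 1)


def multipole_from_angle(theta_deg):
    """Approximate multipole for an angular scale, l = pi / theta."""
    return math.pi / math.radians(theta_deg)


rng = random.Random(0)
estimates = [estimate_cl(draw_alm(2.0, 10, rng)) for _ in range(2000)]
mean_estimate = sum(estimates) / len(estimates)
```

Individual estimates scatter around the input value because only 2ℓ + 1 modes are available at each ℓ; averaging over many realizations recovers the input spectrum.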
Bibliography
[1] R. H. Cyburt, B. D. Fields, and K. A. Olive, “Primordial nucleosynthesis with CMB inputs:
probing the early universe and light element astrophysics,” Astroparticle Physics, vol. 17,
pp. 87–100, Apr. 2002.
[2] E. Hubble, “A Relation between Distance and Radial Velocity among Extra-Galactic Nebulae,”
Proceedings of the National Academy of Science, vol. 15, pp. 168–173, Mar. 1929.
[3] A. H. Guth, “Inflationary universe: A possible solution to the horizon and flatness problems,”
Phys. Rev. D, vol. 23, pp. 347–356, Jan. 1981.
[4] S. Dodelson, Modern Cosmology. Amsterdam (Netherlands): Academic Press, 2003,
XIII + 440 p., ISBN 0-12-219141-2.
[5] L. Page, G. Hinshaw, E. Komatsu, M. R. Nolta, D. N. Spergel, C. L. Bennett, C. Barnes,
R. Bean, O. Dore, J. Dunkley, M. Halpern, R. S. Hill, N. Jarosik, A. Kogut, M. Limon, S. S.
Meyer, N. Odegard, H. V. Peiris, G. S. Tucker, L. Verde, J. L. Weiland, E. Wollack, and
E. L. Wright, “Three-Year Wilkinson Microwave Anisotropy Probe (WMAP) Observations:
Polarization Analysis,” ApJ Suppl., vol. 170, pp. 335–376, June 2007.
[6] W. Hu and M. White, “A CMB polarization primer,” New Astronomy, vol. 2, pp. 323–344,
Oct. 1997.
Chapter 3
Theory of CMB Polarization
Most of the discussion in this chapter can be found in [1] and [2].
A monochromatic plane electromagnetic wave is characterized in the following way. The
x and y components of both E and H fields obey the wave equation. If the direction of
propagation is z, then the electric fields are given by
$$ E_x = E_{x0}\,e^{i(kz - \omega t + \delta_x)} $$
$$ E_y = E_{y0}\,e^{i(kz - \omega t + \delta_y)} \tag{3.1} $$
where δx and δy are phases associated with the two components.
Despite the appearance of 4 variables, there really are only 3 independent ones in the
above equations: Ex0, Ey0 and δ = δy − δx. We therefore need 3 quantities to completely
characterize a monochromatic wave. The extension to quasi-monochromatic waves is discussed
in §3.1 - it will emerge there that we need 4, and not 3 parameters to completely characterize
a general wave.
Even though Ex0, Ey0 and δ = δy − δx completely characterize a monochromatic wave,
this parametrization/characterization is not satisfactory, because none of these quantities can
be directly measured by an instrument. Instruments can measure $|E_{x0}|^2$, $|E_{y0}|^2$ or their linear
combinations. For instance, it is possible to use waveguides and detectors to separate out and
measure $|E_{x0}|^2$ and $|E_{y0}|^2$ for this wave. We therefore need 3 parameters in terms of $|E_{x0}|^2$ and
$|E_{y0}|^2$ which contain all information about Ex0, Ey0 and δ = δy − δx.
Simultaneously, we need to describe the state of polarization of the wave. These two prob-
lems are tightly coupled, and can be solved simultaneously as follows. One obvious parameter
is the total intensity of the wave, $I = |E_{x0}|^2 + |E_{y0}|^2$, or equivalently, $I = I_x + I_y$, which is easily
measured by total-power detectors. Next, thinking only in terms of linear polarization, we can
define polarization as a “difference in intensity along two independent axes”. The preceding
22
sentence is strictly speaking, wrong, since I is a scalar. But it does make sense to compare
|Ex0|2 and |Ey0|2 to check if there is more power on one axis than the other. But this “extra
power along one axis” is precisely the definition of polarization! We can therefore define one
polarization parameter in the following way
Q = |Ex0|2 − |Ey0|2 (3.2)
We need to check what happens to Q under a rotation, since it is not guaranteed to be a
rotation-invariant quantity. We do this as follows.
Under a rotation by an angle θ, co-ordinates transform as
$$x' = x\cos\theta + y\sin\theta, \qquad y' = -x\sin\theta + y\cos\theta \tag{3.3}$$
Electric fields therefore transform the same way:
$$E'_x = E_x\cos\theta + E_y\sin\theta, \qquad E'_y = -E_x\sin\theta + E_y\cos\theta \tag{3.4}$$
In the rotated co-ordinate system, Stokes’ Q is
$$Q' = |E'_x|^2 - |E'_y|^2 \tag{3.5}$$
and
$$\begin{aligned}
|E'_x|^2 &= (E_x\cos\theta + E_y\sin\theta)(E_x^*\cos\theta + E_y^*\sin\theta) \\
|E'_y|^2 &= (-E_x\sin\theta + E_y\cos\theta)(-E_x^*\sin\theta + E_y^*\cos\theta)
\end{aligned} \tag{3.6}$$
so that
$$\begin{aligned}
|E'_x|^2 &= |E_x|^2\cos^2\theta + |E_y|^2\sin^2\theta + \cos\theta\sin\theta\,(E_x E_y^* + E_x^* E_y) \\
|E'_y|^2 &= |E_x|^2\sin^2\theta + |E_y|^2\cos^2\theta - \cos\theta\sin\theta\,(E_x E_y^* + E_x^* E_y)
\end{aligned} \tag{3.7}$$
But the quantity in the last bracket is just $2\Re(E_x^* E_y)$. Subtract the two expressions to get
$$Q' = \left(|E_x|^2 - |E_y|^2\right)\left(\cos^2\theta - \sin^2\theta\right) + 2\sin\theta\cos\theta\,\left(2\Re(E_x^* E_y)\right) \tag{3.8}$$
Using the double-angle identities and the definition $Q = |E_x|^2 - |E_y|^2$, we get
$$Q' = Q\cos 2\theta + 2\Re(E_x^* E_y)\sin 2\theta \tag{3.9}$$
Comparing this with the transformation of co-ordinates above suggests that we define
the quantity $2\Re(E_x^* E_y)$, which we call Stokes’ U. We can check that U
transforms as
$$U' = -Q\sin 2\theta + U\cos 2\theta \tag{3.10}$$
while, in terms of U, eq. (3.9) reads
$$Q' = Q\cos 2\theta + U\sin 2\theta \tag{3.11}$$
It is also possible to define a 4th parameter V:
$$V = 2\Im(E_x^* E_y) \tag{3.12}$$
We state the definitions of the 4 quantities:
$$\begin{aligned}
I &= |E_x|^2 + |E_y|^2 \\
Q &= |E_x|^2 - |E_y|^2 \\
U &= 2\Re(E_x^* E_y) \\
V &= 2\Im(E_x^* E_y)
\end{aligned} \tag{3.13}$$
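The rotation behaviour derived above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the amplitudes, phases and rotation angle are arbitrary illustrative values and the function names are ours, not part of any instrument software.

```python
import numpy as np

def stokes(ex, ey):
    """Return (I, Q, U, V) for complex amplitudes Ex, Ey, following eq. (3.13)."""
    i = abs(ex) ** 2 + abs(ey) ** 2
    q = abs(ex) ** 2 - abs(ey) ** 2
    cross = np.conj(ex) * ey          # E_x^* E_y
    return i, q, 2.0 * cross.real, 2.0 * cross.imag

def rotate_fields(ex, ey, theta):
    """Rotate the field components by theta, as in eq. (3.4)."""
    c, s = np.cos(theta), np.sin(theta)
    return ex * c + ey * s, -ex * s + ey * c

# Arbitrary example wave (illustrative values only).
ex = 1.0 * np.exp(1j * 0.3)
ey = 0.6 * np.exp(1j * 1.1)
theta = 0.4

i0, q0, u0, v0 = stokes(ex, ey)
i1, q1, u1, v1 = stokes(*rotate_fields(ex, ey, theta))

# Predictions of eqs. (3.10) and (3.11) for the rotated Q and U.
q_pred = q0 * np.cos(2 * theta) + u0 * np.sin(2 * theta)
u_pred = -q0 * np.sin(2 * theta) + u0 * np.cos(2 * theta)

print("Q' direct vs predicted:", q1, q_pred)
print("U' direct vs predicted:", u1, u_pred)
print("I and V before/after rotation:", (i0, i1), (v0, v1))
```

In this sketch Q and U mix under the rotation through the angle 2θ, while I and V come out unchanged, which is the behaviour expected of a scalar intensity and of circular polarization.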
Before we proceed, we note that these definitions in the xy co-ordinate system work well only
in the “flat-sky approximation”. For a general treatment of observations of radiation from the
sky, we would need to switch to the θ − φ co-ordinate system, where the definitions are as
follows:
$$\begin{aligned}
I &= |E_\theta|^2 + |E_\phi|^2 \\
Q &= |E_\theta|^2 - |E_\phi|^2 \\
U &= 2\Re(E_\theta^* E_\phi) \\
V &= 2\Im(E_\theta^* E_\phi)
\end{aligned} \tag{3.14}$$
In what follows, we will work with $E_x$ and $E_y$; the generalization to the θ − φ co-ordinate
system is straightforward.
We state without proof that V is a measure of circular polarization. It is expected to be zero
for the CMB, since Thomson scattering does not generate circular polarization. Also, for a
monochromatic wave,
$$I^2 = Q^2 + U^2 + V^2 \tag{3.15}$$
An equivalent but more rigorous and interesting method of defining Stokes’ parameters, the
Poincaré sphere, is described in §3.2 of [1].
3.1 Quasi-monochromatic EM waves
Regardless of the degree of polarization, the observable intensity of a wave is given by its
time-averaged Poynting flux (PF henceforth). For the monochromatic case, the expression for
the PF is straightforward:
$$I(P) = E_x E_x^* + E_y E_y^* \tag{3.16}$$
For a non-monochromatic EM wave, the electric and magnetic fields can be expressed
most generally as an integral over frequency:
$$E(t) = \int_0^\infty a(\nu)\, e^{i[\phi(\nu) - 2\pi\nu t]}\, d\nu \tag{3.17}$$
(Mathematically, this can be thought of as an infinite sum over a finite frequency range.)
Then, the PF is given by
$$I(P) = \langle E(P,t)\, E^*(P,t) \rangle \equiv \left\langle |E_x|^2 + |E_y|^2 \right\rangle \tag{3.18}$$
The rest of the Stokes’ parameters can be defined in exactly the same way. The 4 parameters are
$$\begin{aligned}
I &= \langle |E_x|^2 \rangle + \langle |E_y|^2 \rangle \\
Q &= \langle |E_x|^2 \rangle - \langle |E_y|^2 \rangle \\
U &= 2\Re\, \langle E_x^* E_y \rangle \\
V &= 2\Im\, \langle E_x^* E_y \rangle
\end{aligned} \tag{3.19}$$
We find from equations (3.19) that
$$I^2 \geq Q^2 + U^2 + V^2 \tag{3.20}$$
We can thus define the degree of polarization as
$$p = \frac{\sqrt{Q^2 + U^2 + V^2}}{I} \tag{3.21}$$
Notice that there are 4 parameters needed to describe a quasi-monochromatic wave and we have
defined exactly 4 Stokes’ parameters. These Stokes’ parameters can be measured by a variety
of instruments, and the 4 parameters needed to characterize the wave can then be derived from
them, if needed.
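To illustrate the inequality (3.20) and the degree of polarization (3.21), here is a small sketch, assuming NumPy. The field model (a common random component shared by both axes plus independent noise on each axis) is a hypothetical toy chosen only to produce a partially polarized wave, and the sample mean stands in for the time average.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the illustration is repeatable
n = 20000

def complex_noise(size):
    """Unit-variance-per-component complex Gaussian samples (toy field)."""
    return rng.normal(size=size) + 1j * rng.normal(size=size)

# Hypothetical model: a component common to both axes (polarized part)
# plus independent noise on each axis (unpolarized part).
common = complex_noise(n)
ex = 1.0 * common + 0.5 * complex_noise(n)
ey = 0.5 * common + 0.5 * complex_noise(n)

def averaged_stokes(ex, ey):
    """Stokes parameters of eq. (3.19), with the sample mean used for <...>."""
    ix = np.mean(np.abs(ex) ** 2)
    iy = np.mean(np.abs(ey) ** 2)
    cross = np.mean(np.conj(ex) * ey)
    return ix + iy, ix - iy, 2.0 * cross.real, 2.0 * cross.imag

i, q, u, v = averaged_stokes(ex, ey)
p = np.sqrt(q ** 2 + u ** 2 + v ** 2) / i    # degree of polarization, eq. (3.21)
print("I, Q, U, V =", i, q, u, v)
print("degree of polarization p =", p)
```

Because the averages obey the Cauchy-Schwarz inequality, p computed this way always lies between 0 and 1; it approaches 1 only as the independent noise terms are switched off.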
3.2 Spin Harmonics
Equations (3.10) and (3.11) can now be written using a compact notation. Using
$$\cos\theta = \frac{e^{i\theta} + e^{-i\theta}}{2}, \qquad \sin\theta = \frac{e^{i\theta} - e^{-i\theta}}{2i} \tag{3.22}$$
we get
$$(Q' \pm iU') = e^{\mp i2\theta}\,(Q \pm iU) \tag{3.23}$$
which is shorthand for how the quantity $(Q \pm iU)$ transforms under a rotation by an angle θ.
However, this is the definition of a spin-2 system! This implies, among other things,
that Q and U cannot be described by spherical harmonics, because they are not invariant under
rotation.
It turns out that there exists a class of functions that describe quantities with non-zero
spin. These are called spin-weighted harmonics or spin-harmonics and they are related to
spherical harmonics. We shall discuss them in brief here. For a more detailed and complete
treatment, see [3].
The basic idea is this: there exist “spin-s” harmonic functions, ${}_sY_{lm}(\theta,\phi)$, which form a
complete, orthonormal basis on the sphere for all $|s| \leq l$:
$$\int d\Omega\; {}_sY^*_{lm}(\theta,\phi)\; {}_sY_{l'm'}(\theta,\phi) = \delta_{ll'}\,\delta_{mm'}$$
$$\sum_l \sum_m {}_sY^*_{lm}(\theta,\phi)\; {}_sY_{lm}(\theta',\phi') = \delta(\phi - \phi')\,\delta(\cos\theta - \cos\theta') \tag{3.24}$$
For these spin-harmonic functions ${}_sY_{lm}(\theta,\phi)$, there exist “spin-raising” and “spin-lowering”
operators, denoted here by ♯ and ♭ respectively, which, as the names suggest, “raise” or “lower”
the spin of a function by one. For instance, let a function $f_s = f_s(\theta,\phi)$ have spin s and therefore
transform under a rotation ψ as
$$f'_s = e^{-is\psi} f_s \tag{3.25}$$
Then,
$$(\sharp f_s)' = e^{-i(s+1)\psi}\,(\sharp f_s) \tag{3.26}$$
and
$$(\flat f_s)' = e^{-i(s-1)\psi}\,(\flat f_s) \tag{3.27}$$
Explicitly, the spin raising and lowering operators are [2, 4]:
$$\sharp = -\sin^{s}\theta \left[\partial_\theta + \frac{i}{\sin\theta}\,\partial_\phi\right] \sin^{-s}\theta \tag{3.28}$$
$$\flat = -\sin^{-s}\theta \left[\partial_\theta - \frac{i}{\sin\theta}\,\partial_\phi\right] \sin^{s}\theta \tag{3.29}$$
These two operators can be used to raise (lower) the spin of the functions ${}_{-s}Y_{lm}(\theta,\phi)$ (${}_{s}Y_{lm}(\theta,\phi)$)
to exactly zero. In other words, these spin-weighted functions (spin-harmonics) can be
expressed as
$${}_sY_{lm} = \left[\frac{(l-s)!}{(l+s)!}\right]^{1/2} \sharp^{s}\, Y_{lm} \tag{3.30}$$
$${}_sY_{lm} = \left[\frac{(l+s)!}{(l-s)!}\right]^{1/2} (-1)^s\, \flat^{-s}\, Y_{lm} \tag{3.31}$$
These are spin-s harmonics. Spin-(−s) harmonics can be expressed in a similar way:
$${}_{-s}Y_{lm} = \left[\frac{(l-s)!}{(l+s)!}\right]^{1/2} (-1)^s\, \flat^{s}\, Y_{lm} \tag{3.32}$$
$${}_{-s}Y_{lm} = \left[\frac{(l+s)!}{(l-s)!}\right]^{1/2} \sharp^{-s}\, Y_{lm} \tag{3.33}$$
We end by stating some useful properties of spin-harmonics that will come in handy later:
$$\sharp\; {}_sY_{lm} = \left[(l-s)(l+s+1)\right]^{1/2}\; {}_{s+1}Y_{lm} \tag{3.34}$$
$$\flat\; {}_sY_{lm} = -\left[(l+s)(l-s+1)\right]^{1/2}\; {}_{s-1}Y_{lm} \tag{3.35}$$
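As a quick consistency check on the ladder relations (3.34) and (3.35), the sketch below applies the lowering coefficient twice, starting from spin 2, and the raising coefficient twice, starting from spin −2, and compares each product with the factorial ratio $[(l+2)!/(l-2)!]^{1/2}$. It uses only the Python standard library; the function names are ours and purely illustrative.

```python
from math import factorial, isclose, sqrt

def raise_coeff(l, s):
    """Coefficient multiplying (s+1)Y_lm when the raising operator acts on sY_lm, eq. (3.34)."""
    return sqrt((l - s) * (l + s + 1))

def lower_coeff(l, s):
    """Coefficient multiplying (s-1)Y_lm when the lowering operator acts on sY_lm, eq. (3.35)."""
    return -sqrt((l + s) * (l - s + 1))

def lower_twice_from_spin2(l):
    """Net factor from lowering a spin-2 harmonic down to spin 0."""
    return lower_coeff(l, 2) * lower_coeff(l, 1)

def raise_twice_from_spin_minus2(l):
    """Net factor from raising a spin-(-2) harmonic up to spin 0."""
    return raise_coeff(l, -2) * raise_coeff(l, -1)

def factorial_ratio(l):
    """[(l+2)! / (l-2)!]^(1/2), defined for l >= 2."""
    return sqrt(factorial(l + 2) / factorial(l - 2))

for l in (2, 3, 10, 50):
    print(l, lower_twice_from_spin2(l), raise_twice_from_spin_minus2(l), factorial_ratio(l))
```

The two minus signs from (3.35) cancel, so both ladders give the same positive factor for every l ≥ 2.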
We are now ready to apply this formalism to polarization parameters over the sky.
3.3 Application of Spin-harmonics to Polarization
Let a position on the sky be defined by the co-ordinates (θ, φ). Let the unit vector along
the line-of-sight be n. The unit vectors on the tangent plane at any point (θ, φ) are given by
$(e_\theta, e_\phi)$. From equation (3.23),
$$(Q' \pm iU') = e^{\mp i2\theta}\,(Q \pm iU) \tag{3.36}$$
We can now expand $Q \pm iU$ in spin-2 spherical harmonics:
$$(Q + iU)(n) = \sum_{lm} a_{2,lm}\; {}_{2}Y_{lm}(n) \tag{3.37}$$
$$(Q - iU)(n) = \sum_{lm} a_{-2,lm}\; {}_{-2}Y_{lm}(n) \tag{3.38}$$
Temperature is characterized by spherical harmonics, which are spin-0, i.e. invariant under
rotation:
$$T(n) = \sum_{lm} a_{lm}\, Y_{lm}(n) \tag{3.39}$$
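Since we wish to work with spin-0 quantities, we first lower the spin of $Q + iU$ thus (with s = 2):
$$\begin{aligned}
\flat^2 (Q + iU) &= \sum_{lm} a_{2,lm}\, \flat^2\; {}_{2}Y_{lm} \\
&= \sum_{lm} a_{2,lm}\, \flat^2 \left( \left[\frac{(l+s)!}{(l-s)!}\right]^{1/2} (-1)^2\, \flat^{-2}\, Y_{lm} \right) \quad \text{from eq. (3.31)} \\
&= \sum_{lm} \left[\frac{(l+s)!}{(l-s)!}\right]^{1/2} a_{2,lm}\, Y_{lm}
\end{aligned} \tag{3.40}$$
Similarly,
$$\begin{aligned}
\sharp^2 (Q - iU) &= \sum_{lm} a_{-2,lm}\, \sharp^2\; {}_{-2}Y_{lm} \\
&= \sum_{lm} a_{-2,lm}\, \sharp^2 \left( \left[\frac{(l+s)!}{(l-s)!}\right]^{1/2} \sharp^{-2}\, Y_{lm} \right) \quad \text{from eq. (3.30)} \\
&= \sum_{lm} \left[\frac{(l+s)!}{(l-s)!}\right]^{1/2} a_{-2,lm}\, Y_{lm}
\end{aligned} \tag{3.41}$$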
Now, since our aim is to work with spin-0 quantities constructed from $(Q \pm iU)$, we can
in principle work with $\flat^2 (Q + iU)$ and $\sharp^2 (Q - iU)$. However, this is not a convenient choice for
the following reason. Q has even parity and U has odd parity, i.e. under the parity transformation n → −n,
we get Q → Q, U → −U.
We would, therefore, like to work with two spin-0 quantities with well-defined parities,
i.e. one with even parity and the other with odd parity. However, the two quantities $Q \pm iU$ do
not have this property, and so we cannot expect the parities of $\flat^2 (Q + iU)$ and $\sharp^2 (Q - iU)$ to
work out to be even/odd. We need to construct two other quantities, say E and B, from these
two thus:
$$E = a\,\flat^2 (Q + iU) + b\,\sharp^2 (Q - iU) \quad \text{(even)} \tag{3.42}$$
$$B = c\,\flat^2 (Q + iU) + d\,\sharp^2 (Q - iU) \quad \text{(odd)} \tag{3.43}$$
where we need to determine the 4 quantities a, b, c, d.
We can write
$$E = \left(a\,\flat^2 + b\,\sharp^2\right) Q + i\left(a\,\flat^2 - b\,\sharp^2\right) U \tag{3.44}$$
$$B = \left(c\,\flat^2 + d\,\sharp^2\right) Q + i\left(c\,\flat^2 - d\,\sharp^2\right) U \tag{3.45}$$
Thus, we need even parities for $a\,\flat^2 + b\,\sharp^2$ and $c\,\flat^2 - d\,\sharp^2$ as well as odd parities for $a\,\flat^2 - b\,\sharp^2$
and $c\,\flat^2 + d\,\sharp^2$. But under a parity transformation $\flat^2 \to (-1)^l\, \sharp^2$ and $\sharp^2 \to (-1)^l\, \flat^2$. Thus, we
will have all the required parities iff a = b and c = −d. In particular, we choose
$a = b = -\frac{1}{2}$ and $c = -d = \frac{1}{2i}$ for reasons of normalization [2]. The expressions for E and B are
$$E = -\frac{1}{2}\left[\flat^2 (Q + iU) + \sharp^2 (Q - iU)\right] \quad \text{(even)} \tag{3.46}$$
$$B = \frac{1}{2i}\left[\flat^2 (Q + iU) - \sharp^2 (Q - iU)\right] \quad \text{(odd)} \tag{3.47}$$
These are the so-called “E and B-modes” in CMB polarization. The reason for the choice of the
letters E and B is primarily their respective parities: E-modes have even parity, like electric
fields, and B-modes have odd parity, like magnetic fields. This relationship between E and
B-modes and Stokes’ parameters is derived in a different way in Appendix E.
E and B-modes can now be expanded in terms of spherical harmonics:
$$E = \sum_{lm} a^E_{lm}\, Y_{lm} \tag{3.48}$$
$$B = \sum_{lm} a^B_{lm}\, Y_{lm} \tag{3.49}$$
where, substituting eqs. (3.40) and (3.41) into eqs. (3.46) and (3.47),
$$a^E_{lm} = -\left[\frac{(l+2)!}{(l-2)!}\right]^{1/2} \frac{a_{2,lm} + a_{-2,lm}}{2} \tag{3.50}$$
$$a^B_{lm} = \left[\frac{(l+2)!}{(l-2)!}\right]^{1/2} \frac{a_{2,lm} - a_{-2,lm}}{2i} \tag{3.51}$$
We can now define the power spectra that provide a statistical description of CMB temperature
and polarization anisotropies:
$$C^X_\ell = \frac{1}{2\ell + 1} \sum_m \left\langle a^{X*}_{\ell m}\, a^X_{\ell m} \right\rangle \tag{3.52}$$
where X = T, E, B and ⟨· · ·⟩ denotes an ensemble average.
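In practice only one sky is available, so the ensemble average in (3.52) is replaced by the average over m alone. The sketch below, assuming NumPy, shows this single-realization estimator on toy coefficients. The simulated $a_{\ell m}$ are independent complex Gaussians for m = −ℓ..ℓ, which ignores the reality condition linking +m and −m on a real sky, and the input spectrum 1/[ℓ(ℓ+1)] is a hypothetical choice for illustration only.

```python
import numpy as np

def estimate_cl(alm_by_l):
    """Single-sky estimate of C_l: the mean of |a_lm|^2 over the 2l+1 values of m."""
    return {l: float(np.mean(np.abs(alm) ** 2)) for l, alm in alm_by_l.items()}

def simulate_alm(cl_true, rng):
    """Toy coefficients: independent complex Gaussians with <|a_lm|^2> = C_l."""
    out = {}
    for l, cl in cl_true.items():
        n_m = 2 * l + 1
        re = rng.normal(scale=np.sqrt(cl / 2.0), size=n_m)
        im = rng.normal(scale=np.sqrt(cl / 2.0), size=n_m)
        out[l] = re + 1j * im
    return out

rng = np.random.default_rng(1)
cl_true = {l: 1.0 / (l * (l + 1)) for l in (2, 10, 100, 500)}   # hypothetical spectrum
cl_hat = estimate_cl(simulate_alm(cl_true, rng))

for l in cl_true:
    print(l, cl_true[l], cl_hat[l], cl_hat[l] / cl_true[l])
```

With only 2ℓ+1 modes per multipole, the ratio of estimated to input power scatters widely at low ℓ and tightens at high ℓ; this is the sampling limitation referred to as cosmic variance.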
Here is another reason for working with E and B-modes instead of $\flat^2 (Q + iU)$ and
$\sharp^2 (Q - iU)$: since E-modes are parity even and B-modes parity odd, the cross-correlations
BE and BT vanish. This means that we have to deal with fewer power spectra. Had we chosen
$\flat^2 (Q + iU)$ and $\sharp^2 (Q - iU)$, we would have had to analyze at least two more power spectra,
without gaining any additional physical insight.
3.4 Thomson Scattering
Scattering of a photon from an electron, when there is no change in photon energy, is
called Thomson scattering. Since electrons (and protons) are free before last scattering, this is
the dominant process that causes “communication” between photons and matter.
Thomson scattering cannot “produce” polarization if the incident radiation is completely
uniform. However, if there are anisotropies in the incident radiation (in particular, quadrupole
anisotropy, as we will later see) then the scattered radiation can have polarization. This is
the case with the CMB. In particular, a (temporally) thin slice of the last scattering surface
(LSS henceforth) causes polarization anisotropies to appear because of Thomson scattering
of radiation that has a quadrupole moment. Both temperature and polarization anisotropies
depend on evolution before the LSS, albeit differently - Thomson scattering causes polarization
right before recombination/LSS, but it also destroys polarization information before the LSS
([5] chapter 4).
To delve into the details of how Thomson scattering leads to E and B-modes, consider an
electron at the origin close to the LSS (or just before recombination). An incoming plane wave,
which consists of oscillating electric and magnetic fields will accelerate the electron which then
radiates EM waves. This can be viewed as scattering of radiation by an electron, and we will
refer to it as such.
Let us define co-ordinate systems first. Let x′ − y′ refer to the co-ordinate system of
the incoming (incident) radiation, which has wavevector ki. Let x− y refer to the co-ordinate
system of the scattered radiation, which has wavevector ks. Scattering is represented in the
figure below, which is a copy of the figure in [2].
If the electric field vector of the incoming linearly polarized wave is in the $k_i - k_s$ plane
(we call this the “scattering plane”), the differential cross-section of Thomson scattering is [6]
$$\left.\frac{d\sigma}{d\Omega}\right|_{\mathrm{POL}} = \frac{3\sigma_T}{8\pi} \left|\hat{k}_i \cdot \hat{k}_s\right|^2 \tag{3.53}$$
where $\sigma_T$ is the Thomson cross-section. If the electric field is perpendicular to the scattering
plane,
$$\left.\frac{d\sigma}{d\Omega}\right|_{\mathrm{POL}} = \frac{3\sigma_T}{8\pi} \tag{3.54}$$
where the solid angle $d\Omega = d(\cos\theta)\, d\phi$ is defined in the usual spherical coordinates.
Now, consider unpolarized radiation, which is equivalent to many linearly polarized waves at all
angles to each other. We can thus regard an incoming E-field as consisting of one E-field
polarized parallel to the scattering (i.e. $k_i - k_s$) plane and another polarized perpendicular
to it. The net differential cross-section is just the sum of the two cross-sections, each weighted by one half:
$$\left.\frac{d\sigma}{d\Omega}\right|_{\mathrm{UNPOL}} = \frac{3\sigma_T}{16\pi} \left(1 + \left|\hat{k}_i \cdot \hat{k}_s\right|^2\right) \tag{3.55}$$
Thus, for right-angle scattering (i.e. θ = π/2), the scattered radiation is completely linearly polarized
perpendicular to the scattering plane. Eqs. (3.53) and (3.54) tell us what happens to $I_\parallel$ and
$I_\perp$ respectively. Expressions for these two immediately give us two Stokes’
parameters: $I = I_\perp + I_\parallel$ and $Q = I_\parallel - I_\perp$ (where the definition of Q is arbitrary up to an
overall sign). The other two scatter as follows:
$$U = U' \left(\hat{k}_s \cdot \hat{k}_i\right) \tag{3.56}$$
$$V = V' \left(\hat{k}_s \cdot \hat{k}_i\right) \tag{3.57}$$
But the CMB has V ≡ 0, and our choice of co-ordinate systems and geometry for this one
particular angle ensures that U = 0. Thus, the four Stokes’ parameters are
$$I(z) = \frac{3\sigma_T}{16\pi} \left(1 + \cos^2\theta'\right) I'(\theta',\phi') \tag{3.58}$$
$$Q(z) = \frac{3\sigma_T}{16\pi} \sin^2\theta'\; I'(\theta',\phi') \tag{3.59}$$
$$U(z) = 0 \tag{3.60}$$
$$V(z) \equiv 0 \tag{3.61}$$
However, this geometry is defined only for φ′ = 0. For any general angle φ′ ≠ 0, we will have
$$Q(z,\phi') = Q(z)\cos 2\phi' + U(z)\sin 2\phi' = Q(z)\cos 2\phi' \tag{3.62}$$
$$U(z,\phi') = Q(z)\sin 2\phi' + U(z)\cos 2\phi' = Q(z)\sin 2\phi' \tag{3.63}$$
We can now integrate over the solid angle to get
$$I(z) = \frac{3\sigma_T}{16\pi} \int d\Omega' \left(1 + \cos^2\theta'\right) I'(\theta',\phi') \tag{3.64}$$
$$Q(z) = \frac{3\sigma_T}{16\pi} \int d\Omega' \sin^2\theta' \cos 2\phi'\; I'(\theta',\phi') \tag{3.65}$$
$$U(z) = \frac{3\sigma_T}{16\pi} \int d\Omega' \sin^2\theta' \sin 2\phi'\; I'(\theta',\phi') \tag{3.66}$$
We can now expand the incoming intensity in spherical harmonics
$$I'(\theta',\phi') = \sum_{lm} a'_{lm}\, Y_{lm}(\theta',\phi') \tag{3.67}$$
and, remembering that $\sin^2\theta = \frac{1 - \cos 2\theta}{2}$ and $\cos^2\theta = \frac{1 + \cos 2\theta}{2}$, and that $\int d\Omega\, \cos n\theta\, \sin q\phi\; Y_{lm}$
picks out $a_{nq}$, we get that
$$Q \pm iU \propto a'_{2,\pm 2} \tag{3.68}$$
This is the result we had quoted earlier: polarization anisotropies in the CMB are caused only
because of the quadrupole moment in the radiation just before the LSS.
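The statement that only the quadrupole contributes can be checked by direct quadrature. The sketch below, assuming NumPy, integrates the angular kernel of eq. (3.65), sin²θ′ cos 2φ′, against a handful of low-order spherical harmonics written out explicitly in the usual Condon-Shortley convention. The grid size and the selection of harmonics are arbitrary choices; only the Q kernel is tested, because whether Q + iU pairs with m = +2 or m = −2 depends on sign and conjugation conventions that are not fixed here.

```python
import numpy as np

# Midpoint grid on the sphere (grid size is an arbitrary illustrative choice).
n_theta, n_phi = 400, 400
theta = (np.arange(n_theta) + 0.5) * np.pi / n_theta
phi = (np.arange(n_phi) + 0.5) * 2.0 * np.pi / n_phi
T, P = np.meshgrid(theta, phi, indexing="ij")
d_omega = np.sin(T) * (np.pi / n_theta) * (2.0 * np.pi / n_phi)

# A few low-order spherical harmonics Y_lm, keyed by (l, m).
harmonics = {
    (0, 0): np.full(T.shape, 0.5 * np.sqrt(1.0 / np.pi), dtype=complex),
    (1, 0): 0.5 * np.sqrt(3.0 / np.pi) * np.cos(T) + 0j,
    (1, 1): -0.5 * np.sqrt(3.0 / (2.0 * np.pi)) * np.sin(T) * np.exp(1j * P),
    (2, 0): 0.25 * np.sqrt(5.0 / np.pi) * (3.0 * np.cos(T) ** 2 - 1.0) + 0j,
    (2, 1): -0.5 * np.sqrt(15.0 / (2.0 * np.pi)) * np.sin(T) * np.cos(T) * np.exp(1j * P),
    (2, 2): 0.25 * np.sqrt(15.0 / (2.0 * np.pi)) * np.sin(T) ** 2 * np.exp(2j * P),
}

def q_response(ylm):
    """Integral of sin^2(theta') cos(2 phi') Y_lm over the sphere, i.e. the
    kernel of eq. (3.65) without the 3 sigma_T / (16 pi) prefactor."""
    kernel = np.sin(T) ** 2 * np.cos(2.0 * P)
    return np.sum(kernel * ylm * d_omega)

responses = {lm: q_response(y) for lm, y in harmonics.items()}
for lm, value in responses.items():
    print(lm, abs(value))
```

Of the harmonics listed, only (l, m) = (2, 2) gives a non-zero response; the monopole, the dipole terms and the other quadrupole terms integrate to zero to within rounding error.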
3.5 CMB Polarization and Cosmology
We have shown in the preceding sections that
1. Scattering produces polarization - both Q and U modes
2. Both E and B modes are thus produced in polarization due to scattering
Figure 3.1: B-mode level compared with the levels of E-modes, foregrounds and the lensing
contribution to B-modes[7]
While (2) is true in general, Hu and White [8] have shown that there are only two
ways to produce B-modes: by having either tensor or vector perturbations before recombination
(also, see fig. (3.2)). Both vector and tensor modes decay after recombination, but vector
modes decay faster, such that none survive to the present time. Thus, tensor modes are the
only reason for B-modes to show up and a measurement of B-modes indicates the
presence of tensor modes in the early universe. These tensor modes are equivalent to (or
lead to) gravitational waves, which could only have been produced during inflation, according
to our present understanding. Schematically, the relation between the spin-2 spherical harmonic
coefficients for B-modes and the energy scale of inflation, quantified by the inflaton potential
(since it is the potential that drives inflation; see chapter 6 in [5]), is given by
$${}_{\pm 2}a_{\ell m} \propto \int j_\ell(r)\; r^2\, dr\; k^2\, dk\; V_\phi\; T_{\phi,a}\; G(a) \tag{3.69}$$
where
$j_\ell$ = Bessel function of order ℓ
$r$ = Comoving distance
$k$ = Wavenumber of a mode
$T_{\phi,a}$ = Transfer function: quantifies the change at a mode transition
$G(a)$ = Growth function: describes behaviour of mode at late times
$V_\phi$ = The inflaton potential (3.70)
Figure 3.2: Scalar and Tensor modes with corresponding E and B components.
Figure 3.3: The contribution of tensor modes to the temperature power spectrum (in green).
Figure 3.4: WMAP 1st year power spectrum, showing cosmic variance at low ℓs. Notice that
the cosmic variance shown here is significantly larger than the tensor mode contribution in
fig.(3.3)
The actual relation is more involved and is given by, e.g., eq. (70) in [2]. The parameter r in
fig. (3.1) is the ratio of the average power in tensor modes to the average power in the scalar
modes of perturbation in the early universe before recombination. Current upper limits on
r are ∼0.3. This corresponds to $V_\phi \sim 10^{15}$ GeV (i.e. the GUT scale), beyond
the reach of current particle accelerators by more than ten orders of magnitude!
This is the reason we need more sensitive cosmological probes of the
early universe.
Tensor modes in the early universe contribute to temperature and polarization power
spectra. However, they decay away exponentially with time, and the smaller the scale (i.e. the
higher the value of ℓ), the faster they decay away[5]. Thus, they have a small effect on the low-ℓ
part of the temperature power spectrum as shown in fig.(3.3). However, this is the part of the
power spectrum dominated by cosmic variance (the fact that we have only one sky to look at
implies that the sampling error is high at low ℓs) as shown in fig.(3.4), which is large enough
that the effect of the tensor modes cannot possibly be distinguished from that of scalar modes.
Thus, B-modes are the most direct indicators of cosmological inflation. The
expected level of B-mode signal is shown in fig.(3.1). However, all that is stated about the
connection between B-modes and cosmological inflation above holds true when there are no
foregrounds. There are two ways foregrounds can produce a spurious B-mode signal:
1. Emission: All processes that produce polarization, e.g. synchrotron can produce polarized
foregrounds in the presence of inhomogeneous magnetic fields.
2. Conversion: Gravitational lensing of the CMB by galaxies and galaxy clusters produces
distortions because lensing depends on the 2-D surface density, which is necessarily non-
uniform for clusters. This produces a “torsion” effect ([5] chapter 11) which converts a
portion of E-modes to B-modes. Since B-modes are an order of magnitude smaller, even
a small percentage of conversion leads to a large spurious B-mode effect.
These systematics will challenge the next generation of CMB polarization experiments.
In the next two chapters, we discuss results from recent experiments and the reason we
prefer interferometry over imaging.
Bibliography
[1] K. Rohlfs and T. L. Wilson, Tools of Radio Astronomy, Springer-Verlag, Berlin Heidelberg
New York, Astronomy and Astrophysics Library, 1996.
[2] Y.-T. Lin and B. D. Wandelt, “A beginner's guide to the theory of CMB temperature and
polarization power spectra in the line-of-sight formalism,” Astroparticle Physics, vol. 25,
pp. 151–166, Mar. 2006.
[3] M. Zaldarriaga, Fluctuations in the cosmic microwave background, Ph.D. thesis,
Massachusetts Institute of Technology, 1998.
[4] N. Goldberg, ,” J. Math. Phys., vol. 8, pp. 2155+, 1966.
[5] S. Dodelson, Modern Cosmology, Academic Press, Amsterdam, 2003. ISBN 0-12-219141-2.
[6] G. B. Rybicki and A. P. Lightman, Radiative Processes in Astrophysics, Wiley-Interscience,
New York, 1979.
[7] J. Bock, S. Church, M. Devlin, G. Hinshaw, A. Lange, A. Lee, L. Page, B. Partridge,
J. Ruhl, M. Tegmark, P. Timbie, R. Weiss, B. Winstein, and M. Zaldarriaga, “Task Force
on Cosmic Microwave Background Research,” ArXiv Astrophysics e-prints, Apr. 2006.
[8] W. Hu and M. White, “A CMB polarization primer,” New Astronomy, vol. 2, pp. 323–344,
Oct. 1997.
Chapter 4
Current status of CMB observations
Detection of CMB anisotropy has always been a challenge because of its low amplitude, ∼10 µK
against a background of 2.7 K. In fact, it took over two decades to discover anisotropies in the
CMB [1] from the time the CMB temperature was first measured in 1965 by Penzias and Wilson.
The reason is that CMB anisotropies are smaller than the mean CMB temperature by a factor of ∼10^5.
COBE (the COsmic Background Explorer) was the first experiment to measure
anisotropy in the CMB [1]. It was also the first experiment that proved conclusively that the
spectrum of the CMB is Planckian. Since COBE, many CMB experiments (e.g. WMAP)
have constrained the CMB temperature power spectrum to exquisite precision. We discuss
two of the most successful of these post-COBE experiments, WMAP and DASI.
4.1 Detectors
Detectors used in CMB cosmology can be divided into two broad categories:
Figure 4.1: A schematic of a bolometer, showing how it works.
Figure 4.2: A schematic of how a bolometer is used.
1. Coherent Receivers - These detect the amplitude and phase of the incoming signal.
This is why they are used in interferometric CMB probes. Amplifiers that use High
Electron Mobility Transistors (HEMTs) have been the coherent receivers of choice for
CMB experiments. However, their sensitivity is low above 100 GHz.
2. Incoherent detectors - These are total power detectors and are unable to detect phase.
Bolometers are an example of incoherent detectors. They consist of an absorber, a thermometer,
a cold reservoir and a thermal link from the absorber to the reservoir. The
radiation incident on the absorber warms it up, and the resulting change in temperature is
measured by the thermometer. This heat is then drained into the cold reservoir and the
cycle is repeated. Bolometers can work at any temperature; however, they are most sensitive
when cryogenically cooled. Since bolometers cannot detect any phase or spectral information,
the instrument that they are part of has to incorporate some method that enables phase
detection. A novel technique for one such arrangement is discussed in chapter 6.
Fig. (4.1) is a cartoon of a bolometer and fig. (4.2) shows how a typical bolometer
operates.
4.2 The Wilkinson Microwave Anisotropy Probe
Named in honor of its pioneer, Prof. David T. Wilkinson, the Wilkinson Microwave
Anisotropy Probe (WMAP) is a satellite that orbits the sun at the second Lagrange point.
WMAP uses differential radiometers, meaning that it differences the input from two horns
that point 140° away from each other. It takes six months to image the entire sky. WMAP’s
radiometers use a series of Orthomode Transducers, Hybrid T’s, HEMT amplifiers and phase
shifters. A pair of horns each has its polarization components separated, processed, amplified
and the signals recorded by a pair of detectors that are a combination of a single polarization
from both of the horns. Differencing the two detector signals then produces a result that is
proportional to the difference in polarization between the two horns. From these measurements
the WMAP team then reconstructs the amplitude and orientation of the CMB’s polarized signal
at each point on the sky [2]. WMAP was optimized for CMB temperature measurements, which
it has done with unprecedented precision. A table of cosmological parameters constrained by
WMAP is given below.
Figure 4.3: WMAP parameters.
While WMAP is a phenomenal success in terms of the temperature power spectrum
it has recovered, its sensitivity is not enough to enable it to detect B-modes. The upper
limit it has placed on B-modes is shown in fig.(2.3). Clearly, we need more sensitivity by a
factor of about 10 to get down to the expected B-mode levels shown. For this reason, a
combination of interferometry and bolometry is preferred over the techniques used
by the instruments mentioned in this chapter. We believe that this combination
will provide us with the sensitivity required to detect the weak B-mode signal.
4.3 The Degree Angular Scale Interferometer
The Degree Angular Scale Interferometer (DASI) is a 13-element co-planar interferometer
array. It operates with HEMTs in the 26-36 GHz range, with the frequencies broken into ten
1 GHz wide bands [3]. DASI uses right and left circular polarizers to separate polarizations, as
opposed to the linear polarizers in WMAP. This turns out to be desirable for control of systematic
effects. DASI focused on 140 < l < 900. DASI reported a 6.3σ detection of the EE power
spectrum and a 2.9σ detection of the TE cross-correlation power spectrum [4]. Data
from DASI enabled the detection of the second peak in the temperature power spectrum, but
show no evidence of B-modes.
Most CMB experiments have used imaging techniques to estimate the power spectrum of
the CMB. We discuss power spectrum estimation from CMB imaging in some detail in chapter
9, but the essential steps are as follows:
1. Image the CMB using a certain scan-strategy and beam
2. Use a computationally optimal technique to extract the signal from the equation
d = A · s+ n (4.1)
where d is imaging data, A is a matrix that describes the beam and the scan strategy and
is called the “pointing matrix”, s the signal and n is the noise. This is where the image
is “pixelized”. Care has to be taken not to pixelize the data beyond the beam resolution.
3. Define likelihood. Using an optimal computational technique, estimate the values of Cℓs
that maximize the likelihood.
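As an illustration of step 2, the sketch below solves d = A · s + n for the map with a generalized least-squares estimator, $\hat{s} = (A^T N^{-1} A)^{-1} A^T N^{-1} d$, assuming NumPy. This is one standard choice and is not necessarily the technique used later in this thesis. The pixel count, the raster scan, the noise level and the neglect of the beam (each sample sees exactly one pixel) are all simplifying assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pix, n_samples = 8, 400

s_true = rng.normal(size=n_pix)                 # hypothetical sky signal per pixel
hits = np.arange(n_samples) % n_pix             # toy raster scan: pixel seen by each sample
A = np.zeros((n_samples, n_pix))
A[np.arange(n_samples), hits] = 1.0             # pointing matrix (beam ignored)

sigma = 0.1                                     # white-noise level, illustrative
d = A @ s_true + sigma * rng.normal(size=n_samples)   # time-ordered data, eq. (4.1)

def make_map(A, d, noise_var):
    """Generalized least-squares map for diagonal noise covariance N = diag(noise_var)."""
    w = 1.0 / noise_var                         # inverse-variance weights, N^-1
    lhs = A.T @ (w[:, None] * A)                # A^T N^-1 A
    rhs = A.T @ (w * d)                         # A^T N^-1 d
    return np.linalg.solve(lhs, rhs)

s_hat = make_map(A, d, np.full(n_samples, sigma ** 2))
print("largest map error:", np.max(np.abs(s_hat - s_true)))
```

For this simple pointing matrix and uniform white noise, the estimator reduces to averaging all samples that fall in the same pixel, which is why coarser pixels lower the noise per pixel.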
The trouble with imaging lies in points 2 and 3 above. In 2, we could decide to pixelize too
coarsely. This would certainly increase the signal-to-noise ratio per pixel, leading to lower errors in the
power spectrum, but the Cℓ estimates could then not be made for high ℓs. What is needed for future
CMB experiments is a system that can sample the power spectrum more directly and with
better control of systematics. The connection between an image and the power spectrum is
indirect; the power spectrum is the Fourier transform of the two-point correlation function in
image space.
However, we show in chapter 5 that the power spectrum can also be expressed as a two-
point correlation function of the visibility (the output from one baseline of an interferometer).
This means that the interferometer samples ℓ-space directly - in fact, it turns out that every
unique baseline length corresponds to a unique ℓ-band where the width of the band depends
on the bandwidth of the instrument. Thus, there is never any confusion about the location of
the values of ℓ where the power spectrum is sampled - these are fixed in interferometry.
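To give a feel for the baseline-to-multipole correspondence, the sketch below uses the common flat-sky rule of thumb ℓ ≈ 2πB/λ. This relation, the scaling of the band width with fractional bandwidth, and the baseline lengths are assumptions introduced for illustration and are not taken from this chapter; the wavelength of about 3 mm is the W-band value quoted for MBI.

```python
import math

def baseline_to_ell(baseline_m, wavelength_m):
    """Flat-sky rule of thumb: ell ~ 2 pi u, with u = B / lambda in wavelengths."""
    return 2.0 * math.pi * baseline_m / wavelength_m

def ell_band_width(ell, fractional_bandwidth):
    """Rough spread in ell from the observing bandwidth alone, since u scales with frequency."""
    return ell * fractional_bandwidth

wavelength = 3.0e-3                      # metres, ~3 mm (W-band)
baselines = [0.05, 0.10, 0.20]           # metres, hypothetical baseline lengths

for b in baselines:
    ell = baseline_to_ell(b, wavelength)
    print(f"B = {b:.2f} m -> ell ~ {ell:.0f}, band width ~ {ell_band_width(ell, 0.1):.0f}")
```

Each baseline length maps to its own value of ℓ, so the multipoles at which the power spectrum is sampled are set by the array geometry rather than by a pixelization choice.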
These and other characteristics of interferometry make its use preferable for CMB cos-
mology. The advantages of interferometry are discussed in detail in chapter 5. Additionally, in
chapter 6, we explore a new technique to utilize the information in a particular kind of inter-
ferometric system to enhance resolution in ℓ-space, leading to better estimates of Cℓs, as well
as better images.
Bibliography
[1] C. L. Bennett, A. J. Banday, K. M. Gorski, G. Hinshaw, P. Jackson, P. Keegstra, A. Kogut,
G. F. Smoot, D. T. Wilkinson, and E. L. Wright, “Four-Year COBE DMR Cosmic Mi-
crowave Background Observations: Maps and Basic Results,” ApJ Lett., vol. 464, pp. L1+,
June 1996.
[2] G. Hinshaw, M. R. Nolta, C. L. Bennett, R. Bean, O. Dore, M. R. Greason, M. Halpern, R. S.
Hill, N. Jarosik, A. Kogut, E. Komatsu, M. Limon, N. Odegard, S. S. Meyer, L. Page, H. V.
Peiris, D. N. Spergel, G. S. Tucker, L. Verde, J. L. Weiland, E. Wollack, and E. L. Wright,
“Three-Year Wilkinson Microwave Anisotropy Probe (WMAP) Observations: Temperature
Analysis,” ApJ Suppl., vol. 170, pp. 288–334, June 2007.
[3] N. W. Halverson, J. E. Carlstrom, M. Dragovan, W. L. Holzapfel, and J. Kovac, “DASI:
Degree Angular Scale Interferometer for imaging anisotropy in the cosmic microwave
background,” in Advanced Technology MMW, Radio, and Terahertz Telescopes,
T. G. Phillips, Ed., Proc. SPIE, vol. 3357, pp. 416–423, July 1998.
[4] E. M. Leitch, J. M. Kovac, N. W. Halverson, J. E. Carlstrom, C. Pryke, and M. W. E. Smith,
“Degree Angular Scale Interferometer 3 Year Cosmic Microwave Background Polarization
Results,” ApJ, vol. 624, pp. 10–20, May 2005.
Chapter 5
Interferometry
5.1 Overview
The observing wavelength of the Millimeter-wave Bolometric Interferometer (MBI)(∼3mm,
W-band) places it in the category of a radio telescope. However, MBI is not an imaging tele-
scope, but an interferometer. Even though it can be used as an imaging instrument (as shown
in the following chapter), we will discuss it here only as an interferometer.
Classically, interferometers were preferred over dish antennae for the following reason.
The angular resolution of a dish of diameter D is given by θ ∼ λ/D. However, radio waves have long
wavelengths, so radio astronomy requires huge single dishes for any reasonable angular resolution.
Interferometers are fundamentally different in that they produce diffraction patterns of the
field-of-view (‘FOV’ henceforth), and not images. Interferometers can achieve high angular resolution
by combining signals from widely separated small dishes. To a good first approximation, we can
treat them the way we treat diffraction slits. It will be shown later that there are fundamental
differences between a simple 1-slit diffraction of a point source and the diffraction of an extended
source through an interferometer.
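The scaling θ ∼ λ/D can be made concrete with a few lines of arithmetic. In the sketch below the target resolution of one arcminute and the two wavelengths (21 cm and 3 mm) are illustrative choices, and the order-unity prefactor that depends on the aperture illumination is ignored.

```python
import math

ARCMIN = math.pi / (180.0 * 60.0)    # one arcminute in radians

def dish_diameter(wavelength_m, resolution_rad):
    """Aperture D needed for angular resolution theta ~ lambda / D."""
    return wavelength_m / resolution_rad

def resolution(wavelength_m, aperture_m):
    """Angular resolution theta ~ lambda / D, in radians."""
    return wavelength_m / aperture_m

for name, wavelength in (("21 cm", 0.21), ("3 mm", 3.0e-3)):
    d = dish_diameter(wavelength, ARCMIN)
    print(f"{name}: aperture for 1 arcmin resolution ~ {d:.1f} m")
```

The same resolution that needs an aperture hundreds of metres across at decimetre wavelengths can instead be reached by separating two small antennas by that distance, which is the classical motivation for interferometry.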
5.2 The Mutual Coherence Function
While interferometry has many advantages (as described in §5.8), we need to be able to
relate the output of an interferometer to the image on the sky. It turns out that this connection
can be made via the study of the coherence properties of the source. This is not surprising since,
as will become clear later in this chapter and the next, an interferometer makes use of both
intensity and phase information. The following discussion can be found in greater detail in several
texts, e.g. [1].
The simplest wave-field that can be imagined is the plane monochromatic wave. For
this wave, if we know the field at a point A, we can find the field at any other point B - all
we need is the phase-difference between A and B. This is a completely coherent wave-field.
The other extreme is that of a random polychromatic wave - for this wave, the field at any two
points is completely uncorrelated. In general, though, all real wave-fields lie between these two
extremes, i.e. they are partially coherent.
For a general wave, therefore, we require some measure of coherence. This measure must
be a time average, and we will want to compare the field at two different points, say $P_1$ and $P_2$.
Let the (electric) fields at the two points be $E(P_1, t_1)$ and $E(P_2, t_2)$. The Mutual Coherence
Function is defined as
$$\Gamma(P_1, P_2, \tau) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} E(P_1, t)\, E^*(P_2, t + \tau)\, dt \equiv \langle E(P_1, t)\, E^*(P_2, t + \tau) \rangle \tag{5.1}$$
where we have used ⟨ ⟩ to indicate a time-average and recognized the fact that the difference in
the fields at the two different points due to a single point source is just a time-delay. Note that
the intensity is a special case of this definition:
$$I(P) = \Gamma(P, P, 0) = \langle E(P, t)\, E^*(P, t) \rangle \tag{5.2}$$
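The two limiting cases described above, a fully coherent field and a fully incoherent one, can be illustrated with a discrete version of eq. (5.1). This is a sketch assuming NumPy; the sampling, the frequency and the phase delay between the two points are arbitrary illustrative values, and the sample mean stands in for the time average.

```python
import numpy as np

def mutual_coherence(e1, e2, lag=0):
    """Discrete estimate of <E(P1, t) E*(P2, t + tau)>, with tau = lag samples (lag >= 0)."""
    n = len(e1) - lag
    return np.mean(e1[:n] * np.conj(e2[lag:lag + n]))

t = np.arange(4096) * 1.0e-3          # sample times, arbitrary units
omega = 2.0 * np.pi * 5.0             # illustrative angular frequency
phase_delay = 0.7                     # hypothetical phase difference between P1 and P2

# Coherent case: the same monochromatic wave seen at the two points.
e1 = np.exp(-1j * omega * t)
e2 = np.exp(-1j * (omega * t - phase_delay))
gamma_coherent = mutual_coherence(e1, e2)

# Incoherent case: independent random fields of unit mean intensity.
rng = np.random.default_rng(0)
r1 = (rng.normal(size=t.size) + 1j * rng.normal(size=t.size)) / np.sqrt(2.0)
r2 = (rng.normal(size=t.size) + 1j * rng.normal(size=t.size)) / np.sqrt(2.0)
gamma_incoherent = mutual_coherence(r1, r2)

intensity = mutual_coherence(e1, e1).real      # special case of eq. (5.2)

print("coherent:   |Gamma| =", abs(gamma_coherent), " phase =", np.angle(gamma_coherent))
print("incoherent: |Gamma| =", abs(gamma_incoherent))
print("intensity:", intensity)
```

For the coherent wave the magnitude of Γ equals the intensity and its phase records the delay between the two points; for the independent fields Γ averages towards zero as more samples are included.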
5.3 The Coherence Function of Extended Sources
Single point-sources have limited use in astrophysics - we need to extend the definition of
the Mutual Coherence Function to extended sources, especially if we want to study the CMB,
a diffuse source over the whole sky. We can do this - we just need to remember that any two
points on an extended source, which is necessarily very distant, are completely independent. In
short, the source is spatially incoherent.
Consider two waves originating at two different points on an extended source and therefore
with two different wavevectors $k_a$ and $k_b$, incident on the observing plane. The resulting field
at any point in the observing plane is given by $E = E_a + E_b$. The Mutual Coherence Function
for two points on the observing plane then is
$$\begin{aligned}
\Gamma(P_1, P_2, \tau) &= \langle E(P_1, t_1)\, E^*(P_2, t_2) \rangle \\
&= \langle [E_a(P_1, t_1) + E_b(P_1, t_1)]\, [E_a^*(P_2, t_2) + E_b^*(P_2, t_2)] \rangle \\
&= \langle E_a(P_1, t_1)\, E_a^*(P_2, t_2) \rangle + \langle E_b(P_1, t_1)\, E_b^*(P_2, t_2) \rangle \\
&\quad + \underbrace{\langle E_a(P_1, t_1)\, E_b^*(P_2, t_2) \rangle + \langle E_b(P_1, t_1)\, E_a^*(P_2, t_2) \rangle}_{=0}
\end{aligned} \tag{5.3}$$
The last two terms are zero because of the assumed spatial incoherence of the source. Thus, the
Mutual Coherence Function for two points a and b on the source becomes
$$\Gamma(P_1, P_2, \tau) = \langle E_a(P_1, t_1)\, E_a^*(P_2, t_2) \rangle + \langle E_b(P_1, t_1)\, E_b^*(P_2, t_2) \rangle \tag{5.4}$$
We want to define the M.C.F. for the whole source; but now we can imagine the extended
source being made of a huge number of point sources, and sum over all of them thus:
$$\Gamma(P_1, P_2, \tau) = \sum_{i=1}^{N} \langle E_i(P_1, t)\, E_i^*(P_2, t+\tau) \rangle \tag{5.5}$$
It is more practical to define it as an integral over the FOV:
$$\Gamma(P_1, P_2, \tau) = \frac{1}{\Delta\Omega} \int \langle E(P_1, t)\, E^*(P_2, t+\tau) \rangle\, d\Omega \tag{5.6}$$
In other words, for an interferometer, we need to sample the field from a distant source at two
different points on the observation plane (which we can do with antennae) and then multiply
these together. This can be achieved by the following simple setup:
Figure 5.1: A general interferometric setup. The fields E1 and E2 enter the combiner C, whose output goes to the detector.
In the combiner marked C, the two electric fields E1 and E2 are just added. The detector then
squares this sum to get
$$(E_1 + E_2)(E_1^* + E_2^*) = \underbrace{|E_1|^2 + |E_2|^2}_{\text{Total Power}} + \underbrace{E_1 E_2^* + E_1^* E_2}_{\text{Visibility}} \tag{5.7}$$
where only the last two terms indicate interference. This is the basic idea in interferometry,
and we will keep using this schematic to treat interferometry in the following chapters.
The last two terms on the RHS of eq(5.7), taken together, are called the “Visibility”. What
follows is a mathematical derivation of the Visibility expressed as an integral transform of
the intensity pattern on the sky.
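As a quick numerical illustration of eq(5.7), the following minimal Python sketch splits the output of an adding interferometer into its total-power and interference parts. The function name and the unit-amplitude test fields are assumptions made purely for illustration; this is not the MBI analysis code.

```python
import cmath
import math


def detector_output(e1, e2):
    """Split |E1 + E2|^2 into the two groups of terms in eq. (5.7).

    Returns (total_power, interference), where
    total_power  = |E1|^2 + |E2|^2
    interference = E1 E2* + E1* E2 = 2 Re(E1 E2*)
    """
    total_power = abs(e1) ** 2 + abs(e2) ** 2
    interference = 2.0 * (e1 * e2.conjugate()).real
    return total_power, interference


def sweep_relative_phase(n_steps):
    """Detector output for two unit-amplitude fields as their relative phase varies."""
    rows = []
    for k in range(n_steps):
        phase = 2.0 * math.pi * k / n_steps
        e1 = 1.0 + 0.0j
        e2 = cmath.exp(1j * phase)
        total_power, interference = detector_output(e1, e2)
        rows.append((phase, total_power, interference))
    return rows
```

In this toy case the total-power term stays fixed while the interference term swings between +2 and -2 as the relative phase changes, which is why only the last two terms of eq(5.7) carry the fringe information.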
5.4 Visibility as a function of the Intensity pattern on the sky
Consider two horn antennas / radio telescopes separated by a distance B. These define
one baseline. Let them be oriented as shown to receive a signal from an extended object in the
sky.
Then, as shown in figure 5.2, consider a single point P on the extended object or source.
Let the distances from P to the two telescopes be d1 and d2.
The reason we consider a single point on the source is that rays originating in different
parts of the source are not mutually coherent, i.e. their relative phases are random. So it does
not make sense, for instance, to calculate the net electric field at one telescope due to rays from
the object as a whole. We do need to make an image of the whole object, however, and for this
reason we scan across it with our baseline. Mathematically, this is equivalent to calculating the
net field and then integrating over the source.
Figure 5.2: One baseline. The two antennas are separated by B; the paths to the distant source differ in length by d2 − d1.
Let (x, y) be the co-ordinate system on the source. Let I (x, y) be the intensity as a
function of position on the source, and let E (x, y) be the electric field due to the source on
both the antennas. D is the distance between the telescopes and the source.
There is a time delay of (d2 − d1)/c between the signals received by the two telescopes (see
fig.(5.2)). The electric fields at the two telescopes can be written as:
$$E_1 = E(x, y)\, e^{i\omega\left(t - \frac{d_1}{c}\right)} \tag{5.8}$$
$$E_2 = E(x, y)\, e^{i\omega\left(t - \frac{d_2}{c}\right)} \tag{5.9}$$
Consider now the product of these two electric fields. In brief, it is only by multiplying
the two signals that we can get interference, as discussed in §5.3.
$$E_1 E_2^* \sim E^2(x, y)\, e^{i\omega\left(t - \frac{d_1}{c} - t + \frac{d_2}{c}\right)} = I(x, y)\, e^{i\frac{\omega}{c}(d_2 - d_1)} \tag{5.10}$$
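Looking at figure 5.3, the two distances d1 and d2 can be written as
$$d_1^2 = \left(x - \tfrac{1}{2}B\right)^2 + y^2 + D^2 = D^2\left[1 + \frac{y^2}{D^2} + \left(\frac{x - \frac{1}{2}B}{D}\right)^2\right] \tag{5.11}$$
$$d_2^2 = \left(x + \tfrac{1}{2}B\right)^2 + y^2 + D^2 = D^2\left[1 + \frac{y^2}{D^2} + \left(\frac{x + \frac{1}{2}B}{D}\right)^2\right] \tag{5.12}$$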
Clearly, B/D ≪ 1; in other words, the distance between us and the source is much greater
than the length of the baseline. We assume that x/D, y/D ≪ 1, or that the size of the source is
much smaller than the distance between us and the source.
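Then, expanding the square roots to first order,
$$d_1 \simeq D\left[1 + \frac{1}{2}\frac{y^2}{D^2} + \frac{1}{2}\left(\frac{x - \frac{1}{2}B}{D}\right)^2\right] \tag{5.13}$$
$$d_2 \simeq D\left[1 + \frac{1}{2}\frac{y^2}{D^2} + \frac{1}{2}\left(\frac{x + \frac{1}{2}B}{D}\right)^2\right] \tag{5.14}$$
so that
$$d_2 - d_1 = D \cdot \frac{1}{2} \cdot \frac{1}{D^2} \cdot 2 \cdot 2 \cdot \frac{1}{2} \cdot Bx = \frac{Bx}{D} \tag{5.15}$$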
$$\Rightarrow E_1 E_2^* \sim I(x, y)\, e^{i\frac{\omega}{c}\frac{Bx}{D}} = I(x, y)\, e^{i 2\pi \frac{Bx}{D\lambda}} \tag{5.16}$$
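The small-angle result d2 − d1 ≃ Bx/D of eq(5.15) can be checked numerically against the exact geometry of eqs(5.11) and (5.12). The sketch below is only an illustration; the function names and the sample numbers used to exercise it are assumptions, not values from the instrument.

```python
import math


def path_difference_exact(x, y, baseline, distance):
    """d2 - d1 from the exact expressions in eqs. (5.11) and (5.12)."""
    d1 = math.sqrt((x - 0.5 * baseline) ** 2 + y ** 2 + distance ** 2)
    d2 = math.sqrt((x + 0.5 * baseline) ** 2 + y ** 2 + distance ** 2)
    return d2 - d1


def path_difference_small_angle(x, baseline, distance):
    """First-order result of eq. (5.15): d2 - d1 ~ B x / D."""
    return baseline * x / distance


def relative_error(x, y, baseline, distance):
    """Fractional disagreement between the exact and approximate path difference."""
    exact = path_difference_exact(x, y, baseline, distance)
    approx = path_difference_small_angle(x, baseline, distance)
    return abs(exact - approx) / abs(approx)
```

For a source offset of about one percent of its distance, the two expressions agree to better than one part in a thousand, which is the regime assumed in the derivation.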
All this is fine, but we would like to work in terms of angles, so we change variables to α = x/D,
β = y/D so that
$$E_1 E_2^* \sim I(\alpha, \beta)\, e^{i 2\pi \frac{B}{\lambda}\alpha} \tag{5.17}$$
However, the basis (α, β) is relative to the source, and not the observation plane or the sky. We
therefore slip the formalism into something more comfortable and convenient - the equatorial
co-ordinates (x′, y′), related to (α, β) by a rotation through θ, thus:
$$\alpha = \cos\theta\, x' + \sin\theta\, y' \tag{5.18}$$
$$\beta = -\sin\theta\, x' + \cos\theta\, y' \tag{5.19}$$
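Then,
$$E_1 E_2^* \sim I(x', y')\, e^{i 2\pi \frac{B}{\lambda}(\cos\theta\, x' + \sin\theta\, y')} \tag{5.20}$$
Write
$$u = \frac{B}{\lambda}\cos\theta \tag{5.21}$$
$$v = \frac{B}{\lambda}\sin\theta \tag{5.22}$$
These are what are called the u, v co-ordinates. Now we integrate over x′ and y′ to get:
$$\iint E_1 E_2^*\, dx'\, dy' = K \iint I(x', y')\, e^{i 2\pi(ux' + vy')}\, dx'\, dy' \tag{5.23}$$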
But the right side of the equation is just the Fourier transform of I (x′, y′). The left hand side
is what we call the visibility.
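To make the Fourier-transform statement concrete, the sketch below evaluates the right-hand side of eq(5.23) (with K = 1) by a direct Riemann sum for a Gaussian patch of sky and compares it with the known analytic transform of a Gaussian. The source width S, the grid size and all function names are assumptions chosen for illustration.

```python
import cmath
import math

S = 0.01  # assumed Gaussian source width in radians (illustrative only)


def gaussian_sky(x, y):
    """Toy intensity pattern I(x', y') = exp(-(x'^2 + y'^2) / (2 S^2))."""
    return math.exp(-(x * x + y * y) / (2.0 * S * S))


def visibility(intensity, u, v, half_width, n):
    """Riemann-sum estimate of eq. (5.23) with K = 1 on an n x n grid."""
    h = 2.0 * half_width / (n - 1)
    total = 0.0 + 0.0j
    for i in range(n):
        x = -half_width + i * h
        for j in range(n):
            y = -half_width + j * h
            total += intensity(x, y) * cmath.exp(2j * math.pi * (u * x + v * y))
    return total * h * h


def gaussian_visibility(u, v):
    """Analytic Fourier transform of gaussian_sky at the point (u, v)."""
    return 2.0 * math.pi * S * S * math.exp(-2.0 * math.pi ** 2 * S * S * (u * u + v * v))
```

A single call such as `visibility(gaussian_sky, 10.0, 5.0, 6.0 * S, 81)` plays the role of one baseline: it returns one complex number, the value of the transform at one (u, v) point.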
In this discussion, we ignored the effect of the diffraction patterns of the telescopes /
antennas themselves. We can introduce it in eqs(5.8) and (5.9) above:
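$$E_1 = \sqrt{A(x, y)}\, E(x, y)\, e^{i\omega\left(t - \frac{d_1}{c}\right)} \tag{5.24}$$
$$E_2 = \sqrt{A(x, y)}\, E(x, y)\, e^{i\omega\left(t - \frac{d_2}{c}\right)} \tag{5.25}$$
and then follow it through, to get:
$$\iint E_1 E_2^*\, dx'\, dy' = K \iint A(x', y')\, I(x', y')\, e^{i 2\pi(ux' + vy')}\, dx'\, dy' \tag{5.26}$$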
We do not need to make the small-sky patch approximation to get a useful result, though.
It just so happens that in this approximation, the output of the interferometer, the “Visibility”,
or as we defined it earlier, the “Mutual Coherence Function”, is the Fourier transform
of the intensity pattern on the sky. If the approximation is relaxed, the visibility becomes a
general mathematical transform of the intensity pattern, and not necessarily a Fourier transform.
The main idea is that one baseline, i.e. one pair of antennas, gives us a single point in
the Fourier transform of the intensity pattern on the sky, convolved, of course, with the Fourier
transform of the beam. To get more distinct points, we need more baselines, each with a different
length or orientation.
The most general form of eq(5.26) is then
$$\iint E_1 E_2^*\, dx'\, dy' = K \iint A(x', y')\, I(x', y')\, e^{i 2\pi \mathbf{u}\cdot\mathbf{x}}\, dx'\, dy' \tag{5.27}$$
(x′ and y′ are really θ and φ on the sky) where $\mathbf{u}$ is the vector $u\hat{\mathbf{u}} + v\hat{\mathbf{v}}$ and the unit vectors
$\hat{\mathbf{u}}$ and $\hat{\mathbf{v}}$ span what is called the “u-v” plane, which is the Fourier-transform equivalent of the
θ − φ plane on the sky. What this means is that visibility, which is a function of u and v, i.e.
V = f (u, v), is the Fourier transform of the image (or intensity pattern) on the sky (for small
patches), i.e.
$$V(u, v) = \mathcal{F}\left(I(\theta, \phi)\right) \tag{5.28}$$
which is exactly what eq(5.27) says above.
Then, to find an expression for |u|, consider eqs(5.21) and (5.22):
$$|\mathbf{u}| = \frac{B}{\lambda} \tag{5.29}$$
that is, |u| is proportional to the baseline length. For the same baseline, though, a different
orientation will give us the same |u| but different values of u and v. What this means is that if
we were to track a single patch on the sky and rotate the instrument w.r.t. the patch, we will be
observing at all those points in the uv plane that lie on a circle with the radius |u| = B/λ.
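The circular track in the uv plane follows directly from eqs(5.21) and (5.22). A minimal sketch, with an assumed baseline and wavelength used only to exercise the functions:

```python
import math


def uv_point(baseline_m, wavelength_m, theta_rad):
    """(u, v) for one baseline orientation, from eqs. (5.21) and (5.22)."""
    radius = baseline_m / wavelength_m
    return radius * math.cos(theta_rad), radius * math.sin(theta_rad)


def uv_track(baseline_m, wavelength_m, n_steps):
    """(u, v) samples as the instrument rotates through one full turn."""
    return [
        uv_point(baseline_m, wavelength_m, 2.0 * math.pi * k / n_steps)
        for k in range(n_steps)
    ]


def uv_radius(point):
    """|u| for a single (u, v) sample."""
    return math.hypot(point[0], point[1])
```

Every sample returned by `uv_track` has the same radius B/λ; only the angle in the uv plane changes with the orientation of the instrument.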
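Now, temperature on the sky can be expanded out as
$$T(\theta, \phi) = \sum_{\ell} \sum_{m} a_{\ell m}\, Y_{\ell m}(\theta, \phi) \tag{5.30}$$
The power spectrum is defined through
$$\langle a_{\ell m}\, a^*_{\ell' m'} \rangle = C_\ell\, \delta_{\ell\ell'}\, \delta_{mm'} \tag{5.31}$$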
Eqs.(5.30) and (5.31) are meant for the full-sky case. However, the quantity that the interfer-
ometer measures, i.e. the visibility, is the flat-sky equivalent of aℓm. Therefore, in the flat-sky
case, the power spectrum is just the two-point correlation function of the visibility.
Recall that the power spectrum is the Fourier transform of the two-point correlation
function [Chapter 1]. It can also be written as the two-point correlation of the Fourier transform
of the intensity pattern on the sky (later in this chapter). But the visibility is the FT of the sky
image! Therefore, with an interferometer, all we need to do is find the two-point correlations
between observations from different baselines!
Furthermore, we are looking for the two-point correlation function of the deviation from
the mean temperature. Now every baseline, which has already defined an angular scale on the
sky, gives us this correlation between several pairs of points separated at a certain angle. If we
find the variance of these values, that will be the power spectrum we are after.
The power spectrum is just the variance of the visibility and visibility is the
output of the interferometer. Naturally, in real instruments one has to extract the visibility
from the detectors.
The foregoing discussion is summarized in the statement: Interferometers directly
measure Fourier modes on the sky.
We describe a few characteristics of the u-v plane in the next section and discuss polarized
interferometry in the following section.
5.5 Interlude: A small discussion on interferometry
As was shown in §5.4, visibility (i.e. the output of an interferometer) is the fourier
transform of the image on the sky. It is useful then to compare what an imager and an
interferometer “see” - both in the image plane and the fourier plane (which we shall refer to as
the “u-v plane” henceforth). Fig.(5.4) shows this comparison. From §5.4, recall that a single
visibility from one baseline of an interferometer is one point in the u-v plane. The length of
the baseline determines the spatial frequency of the fringe, which is the same as a length in
the u-v plane¹. The exact form of this relationship is as follows. For a baseline B, the angular
resolution is θ = λ/B. Then, the value of ℓ that this corresponds to is ℓ = 2π/θ = 2πB/λ, i.e. ℓ ∝ B,
or, longer baselines correspond to higher ℓ-modes or higher angular frequencies. By comparison,
an imager is an interferometer with B ≡ 0. This is illustrated in fig.(5.4).
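The relation ℓ = 2πB/λ is simple enough to tabulate. The helper below is a sketch; the 8.6 cm baseline and 3 mm wavelength in the usage note are borrowed from Table 5.1 purely as sample inputs.

```python
import math


def fringe_spacing_rad(baseline_m, wavelength_m):
    """Angular resolution (fringe spacing) theta = lambda / B, in radians."""
    return wavelength_m / baseline_m


def multipole_for_baseline(baseline_m, wavelength_m):
    """Multipole probed by a baseline: l = 2 pi / theta = 2 pi B / lambda."""
    return 2.0 * math.pi * baseline_m / wavelength_m
```

For example, `multipole_for_baseline(0.086, 0.003)` gives ℓ of roughly 180, and doubling the baseline doubles ℓ.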
It is also pertinent to mention that each baseline produces its own fringe pattern. The
Fourier transform of a real quantity is in general complex (a property of FTs) and therefore the
u-v plane image is always complex. This means that the fringes have a real and an imaginary
part. The precise combination depends on the phase of the fringe, which is determined by the
relative orientation of the baseline and the sky. Thus, rotating the instrument through 360°
w.r.t. the sky allows each baseline to cover a circular ring in the u-v plane and shifts the phase
of the corresponding fringe continuously through 360°. But what effect do these fringes have
on the image? Effectively, every baseline chooses a Fourier mode from the image on the sky. In
other words, each baseline modulates the image with a fringe pattern whose spatial frequency
depends on the length of the baseline and whose phase depends on instrument orientation.
¹ Just as frequencies appear as lengths in the Fourier plane.
Figure 5.3: Schematic of interferometric observation - one baseline. The two antennas are at
G and D.
Figure 5.4: The u-v plane coverage of an imager and an interferometer. Figure courtesy Dr.
Carolina Calderon[2].
Let us look at several different pixels in the u-v plane. In fig.(5.5), vectors marked “3” and
“4” clearly have different lengths. These different lengths imply different lengths of baselines
and hence different spatial frequencies of the fringe pattern on the focal plane. However, “1”
and “2” are of equal length, and differ only in their angular position. This angle in the u-v
plane corresponds to a phase in the image plane. In other words, this angle represents the
phase of the fringe pattern produced by a particular baseline. The only way that a baseline can
produce fringe patterns that differ in phase is by rotating it w.r.t. the image on the sky. Thus,
angle in the u-v plane is the same as the phase of the corresponding fringe or the orientation
of the instrument w.r.t. the sky.
Figs.(5.6) and (5.7) illustrate that the image plane and the u-v plane are “inverses” of
each other in a sense - the smaller dimension in the image plane (a pixel) becomes the larger
dimension in the u-v plane and vice-versa.
Figure 5.5: The u-v plane with several pixels. Pixels marked “1” and “2” have the same distance
from the origin, but differ only in their angular position (this corresponds to the phase of the
fringe). Pixels marked “3” and “4” differ in their distance from the origin and angular position.
5.6 Visibility, the power spectrum and the beam
In this section, we prove the claim in §5.4 that the power spectrum is the variance of
visibility. The effect of the beam has to be taken into account, and it turns out that a new
quantity, called the Window Function needs to be defined.
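For an imager, the output signal is given by
$$s_i = \int d\mathbf{n}\, \Theta(\mathbf{n})\, B_i(\mathbf{n}) \tag{5.32}$$
This is equivalent to the expression:
$$\iint E_1 E_2^*\, dx'\, dy' = K \iint A(x', y')\, I(x', y')\, e^{i 2\pi(ux' + vy')}\, dx'\, dy' \tag{5.33}$$
from §5.4 earlier in this chapter (eq(5.26)), which
$$\Rightarrow V_i \equiv s_i \equiv \int d\mathbf{n}\, E_1 E_2^* = K \int d\mathbf{n}\, A_i(\mathbf{n})\, I(\mathbf{n})\, e^{i 2\pi \mathbf{u}\cdot\mathbf{n}} \tag{5.34}$$
Remember, however, that this is ONE pair of antennas, and so ONE baseline. $\mathbf{u}$
is therefore fixed; it should be labelled $\mathbf{u}_i$. So,
$$s_i \equiv V_i = K \int d\mathbf{n}\, A_i(\mathbf{n})\, I(\mathbf{n})\, e^{i 2\pi \mathbf{u}_i\cdot\mathbf{n}} \tag{5.35}$$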
Figure 5.6: FOV and pixel in the image plane. In this figure and the one alongside, red
represents a pixel in image space and green the FOV in image space.
Figure 5.7: The same FOV and pixel as in the previous figure. The size of the interferometer’s
FOV determines its resolution in u-v space. Notice that the two objects have swapped their
dimensions. If N pixels fit in the FOV in the image plane, then the u-v plane is also divided
into N pixels whose size is inversely proportional to the FOV.
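Now,
$$I_{CMB}(\mathbf{n}, \nu) \simeq B(\nu, T_0) + \left.\frac{\partial B}{\partial T}\right|_{T_0} T_0\, \frac{\Delta T}{T_0} \tag{5.36}$$
from Jaiseung Kim’s thesis [3]. We will consider only the perturbed part, and therefore
$$K = \left.\frac{\partial B}{\partial T}\right|_{T_0} T_0 \tag{5.37}$$
but let us NOT write this down and use K instead.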
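Let us rewrite (rework) $C_{s,ij} = \langle V_i V_j^* \rangle$ now. Remember the full-sky decomposition of
temperature anisotropies:
$$\frac{\Delta T}{T}(\mathbf{n}) = \sum_{l=1}^{\infty} \sum_{m=-l}^{+l} a_{lm}\, Y_{lm}(\mathbf{n}) \equiv \sum_{lm} a_{lm}\, Y_{lm}(\mathbf{n}) \tag{5.38}$$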
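so then
$$\frac{\langle V_i V_j^* \rangle}{K^2 T^2} = \int d\mathbf{n} \int d\mathbf{n}'\, A_i(\mathbf{n})\, A_j^*(\mathbf{n}') \sum_{lm} \sum_{l'm'} Y_{lm}(\mathbf{n})\, Y^*_{l'm'}(\mathbf{n}')\, \langle a_{lm} a^*_{l'm'} \rangle\, e^{i 2\pi(\vec{u}_i\cdot\mathbf{n} - \vec{u}_j\cdot\mathbf{n}')} \tag{5.39}$$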
But remember, the $a_{lm}$’s are like Fourier coefficients, and the Power Spectrum ≡ square of
Fourier coefficients. More precisely,
$$\langle a_{lm}\, a^*_{l'm'} \rangle = C_l\, \delta_{ll'}\, \delta_{mm'} \tag{5.40}$$
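$$\Rightarrow \frac{\langle V_i V_j^* \rangle}{K^2 T^2} = \int d\mathbf{n} \int d\mathbf{n}'\, A_i(\mathbf{n})\, A_j^*(\mathbf{n}') \sum_{l} \sum_{m} C_l\, Y_{lm}(\mathbf{n})\, Y^*_{lm}(\mathbf{n}')\, e^{i 2\pi(\vec{u}_i\cdot\mathbf{n} - \vec{u}_j\cdot\mathbf{n}')} \tag{5.41}$$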
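But from properties of spherical harmonics (the addition theorem),
$$\sum_{m} Y_{lm}(\mathbf{n})\, Y^*_{lm}(\mathbf{n}') = \frac{2l+1}{4\pi}\, P_l(\mathbf{n}\cdot\mathbf{n}') \tag{5.42}$$
$$\Rightarrow \frac{\langle V_i V_j^* \rangle}{K^2 T^2} = \int d\mathbf{n} \int d\mathbf{n}'\, A_i(\mathbf{n})\, A_j^*(\mathbf{n}') \sum_{l} \frac{2l+1}{4\pi}\, C_l\, P_l(\mathbf{n}\cdot\mathbf{n}')\, e^{i 2\pi(\vec{u}_i\cdot\mathbf{n} - \vec{u}_j\cdot\mathbf{n}')} \tag{5.43}$$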
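Again, we remind ourselves that $V_i \propto$ Visibility, so that $|V_i|^2$ should give us $C_l \times$ another quantity.
$$\Rightarrow \frac{\langle V_i V_j^* \rangle}{K^2 T^2} = \sum_{l} \left(\frac{2l+1}{4\pi}\right) C_l \int d\mathbf{n} \int d\mathbf{n}'\, A_i(\mathbf{n})\, A_j^*(\mathbf{n}')\, P_l(\mathbf{n}\cdot\mathbf{n}')\, e^{i 2\pi(\vec{u}_i\cdot\mathbf{n} - \vec{u}_j\cdot\mathbf{n}')} \tag{5.44}$$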
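For various reasons, we always prefer to plot $\frac{l(l+1)}{2\pi} C_l$ instead of $C_l$. Let us therefore manipulate
this equation to get those factors; i.e. multiply and divide by $l(l+1)$
$$\Rightarrow \frac{\langle V_i V_j^* \rangle}{K^2 T^2} = \sum_{l} \left(\frac{l(l+1)}{2\pi}\right) C_l\, \frac{2l+1}{2l(l+1)} \int d\mathbf{n} \int d\mathbf{n}'\, A_i(\mathbf{n})\, A_j^*(\mathbf{n}')\, P_l(\mathbf{n}\cdot\mathbf{n}')\, e^{i 2\pi(\vec{u}_i\cdot\mathbf{n} - \vec{u}_j\cdot\mathbf{n}')} \tag{5.45}$$
$$\equiv \sum_{l} \frac{l(l+1)}{2\pi}\, C_l\, (2l+1)\, W_{ij,l}\, \frac{1}{2l(l+1)} \tag{5.46}$$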
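where we have chosen $\frac{l(l+1)}{2\pi} C_l \equiv \mathcal{C}_l$ instead of $C_l$ - the reason is buried in the math in White et
al [4], and this is the reason we end up with $\frac{1}{2l(l+1)}$ - and we have defined
$$W_{ij,l} = \int d\mathbf{n} \int d\mathbf{n}'\, A_i(\mathbf{n})\, A_j^*(\mathbf{n}')\, P_l(\mathbf{n}\cdot\mathbf{n}')\, e^{i 2\pi(\vec{u}_i\cdot\mathbf{n} - \vec{u}_j\cdot\mathbf{n}')} \tag{5.47}$$
as the WINDOW FUNCTION. It really is just the fraction of $C_l$ that the antenna “lets in”
at every l. This expression is completely general, i.e. without any approximations:
$$\frac{\langle V_i V_j^* \rangle}{K^2 T^2} = \sum_{l} (2l+1)\, \mathcal{C}_l\, W_{ij,l}\, \frac{1}{2l(l+1)} \tag{5.48}$$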
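We could also have defined the Window function another way:
$$\overline{W}_{ij,l} = \frac{1}{2l(l+1)}\, W_{ij,l} \tag{5.49}$$
where $\overline{W}_{ij,l}$ is now the ‘net’ Window function, and we have
$$\frac{\langle V_i V_j^* \rangle}{K^2 T^2} = \sum_{l} (2l+1)\, \mathcal{C}_l\, \overline{W}_{ij,l} \tag{5.50}$$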
Aside: this implies that the height of the ‘net’ window function decreases with increasing
l. Physically, what this means is that by increasing the baseline, the amount of light we let
into the telescope system decreases compared to the amount that would have been let in had
the telescope been a filled aperture with the length of baseline as the diameter. Succinctly, the
‘filling-factor’ (the ratio of the total area of antennas in one baseline to the area of a dish with
the baseline as a diameter) decreases with increasing l (equivalent to increasing the baseline).
The foregoing argument implies that the ‘net’ window function scales as $l^{-2}$, which is indeed
the case, as can be seen from the factor $\frac{1}{2l(l+1)}$ in the above expression.
In the following, we will derive an expression for $W_{ij,l}$, and will finally write down the
expression for the ‘net’ window function at the end of this section.
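We can now apply the FLAT-SKY APPROXIMATION (small θ, large l):
1. $\mathbf{x}$ and $\mathbf{x}'$ are vectors in directions $\mathbf{n}$ and $\mathbf{n}'$ respectively. Approximating them to 2-D vectors
on the sky
$$\Rightarrow P_l(\mathbf{n}\cdot\mathbf{n}') \simeq P_l\left(\cos|\mathbf{x} - \mathbf{x}'|\right) \tag{5.51}$$
and
$$\int d\mathbf{n} \int d\mathbf{n}' \simeq \int d^2x \int d^2x' \tag{5.52}$$
$$\Rightarrow W_{ij,l} = \int d^2x \int d^2x'\, A_i(\mathbf{x})\, A_j^*(\mathbf{x}')\, P_l\left(\cos|\mathbf{x} - \mathbf{x}'|\right) e^{i 2\pi(\vec{u}_i\cdot\mathbf{x} - \vec{u}_j\cdot\mathbf{x}')} \tag{5.53}$$
This is the modified version of eq. 11.40 in Dodelson.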
2.
$$P_l\left(\cos|\mathbf{x} - \mathbf{x}'|\right) \to J_0\left(l|\mathbf{x} - \mathbf{x}'|\right) \equiv \frac{1}{2\pi} \int_0^{2\pi} d\phi\, e^{-i l|\mathbf{x} - \mathbf{x}'| \cos\phi} \tag{5.54}$$
where the last equality is the integral representation of the Bessel function of order zero.
Now, $l|\mathbf{x} - \mathbf{x}'| \cos\phi$ can be written as $\mathbf{l}\cdot(\mathbf{x} - \mathbf{x}')$, because φ is really just a parameter
we are integrating over. We are therefore free to provide our own physical interpretation of it.
The one convenient for us is: φ is an angle in l-space, defined as $\phi = \tan^{-1}(l_y/l_x)$, and then
$l = \sqrt{l_x^2 + l_y^2}$.
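Both ingredients of step 2 can be checked numerically: the integral in eq(5.54) does reproduce J0, and for large l and small separations P_l(cos θ) is close to J0(lθ). The sketch below uses only the standard library; the function names, the quadrature resolution and the sample values of l and θ are assumptions chosen for illustration.

```python
import math


def legendre(l, x):
    """P_l(x) from the recurrence (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}."""
    if l == 0:
        return 1.0
    p_prev, p = 1.0, x
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p


def bessel_j0(z, n_phi=2000):
    """J_0(z) from the integral in eq. (5.54).

    The imaginary part of exp(-i z cos(phi)) integrates to zero over a full
    period, so only the cosine part is summed.
    """
    dphi = 2.0 * math.pi / n_phi
    total = sum(math.cos(z * math.cos(k * dphi)) for k in range(n_phi))
    return total * dphi / (2.0 * math.pi)


def flat_sky_error(l, theta):
    """|P_l(cos theta) - J_0(l theta)|, the error made by the replacement in eq. (5.54)."""
    return abs(legendre(l, math.cos(theta)) - bessel_j0(l * theta))
```

For l = 200 and a separation of 0.01 rad the two functions differ by a few parts in a thousand, which gives a feel for how good the flat-sky replacement is in the regime of interest.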
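So then
$$\Rightarrow W_{ij,l} = \int d^2x \int d^2x'\, A_i(\mathbf{x})\, A_j^*(\mathbf{x}')\, \frac{1}{2\pi} \int_0^{2\pi} d\phi\, e^{-i\mathbf{l}\cdot(\mathbf{x} - \mathbf{x}')}\, e^{i 2\pi(\vec{u}_i\cdot\mathbf{x} - \vec{u}_j\cdot\mathbf{x}')} \tag{5.55}$$
Again, remember, $2\pi\mathbf{u} = \mathbf{l}$, so that $2\pi\mathbf{u}_i = \mathbf{l}_i$ and $2\pi\mathbf{u}_j = \mathbf{l}_j$
$$W_{ij,l} = \int d^2x \int d^2x'\, A_i(\mathbf{x})\, A_j^*(\mathbf{x}')\, \frac{1}{2\pi} \int_0^{2\pi} d\phi\, e^{-i\left((\mathbf{l} - \mathbf{l}_i)\cdot\mathbf{x} - (\mathbf{l} - \mathbf{l}_j)\cdot\mathbf{x}'\right)} \tag{5.56}$$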
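$$\Rightarrow W_{ij,l} = \frac{1}{2\pi} \int_0^{2\pi} d\phi \left[\int d^2x'\, A_j^*(\mathbf{x}')\, e^{+i\mathbf{l}\cdot\mathbf{x}'}\, e^{-i\mathbf{l}_j\cdot\mathbf{x}'}\right] \left[\int d^2x\, A_i(\mathbf{x})\, e^{-i\mathbf{l}\cdot\mathbf{x}}\, e^{+i\mathbf{l}_i\cdot\mathbf{x}}\right] \tag{5.57}$$
The quantity in the left square-bracket is the Fourier transform of $A_j^*(\mathbf{x})$ × a phase factor and
the one inside the right square bracket its complex conjugate. Recall that $\mathcal{F}\left(f(x)\, e^{-ik_0x}\right) = \tilde{f}(k - k_0)$.
If we denote $\mathcal{F}(A_i(\mathbf{x}))$ as $\tilde{A}_i(\mathbf{l})$, we end up with
$$\Rightarrow W_{ij,l} = \frac{1}{2\pi} \int_0^{2\pi} d\phi\, \tilde{A}_j^*(\mathbf{l} - \mathbf{l}_j)\, \tilde{A}_i(\mathbf{l}_i - \mathbf{l}) \tag{5.58}$$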
5.6.1 Window function for one baseline in an interferometer
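Suppose we just want to calculate $W_{ii,l}$ for gaussian beams. In that case,
$$A_i(\mathbf{x}) \equiv A_j^*(\mathbf{x}) = \frac{1}{2\pi\sigma^2}\, e^{-\frac{x^2 + y^2}{2\sigma^2}} \tag{5.59}$$
and
$$\tilde{A}(l) = e^{-\frac{l^2\sigma^2}{2}} \tag{5.60}$$
$$\Rightarrow |\tilde{A}(l - l_i)|^2 = e^{-(l - l_i)^2\sigma^2} \equiv W_{ii,l} \tag{5.61}$$
where the last equality holds because there is no angular dependence in a gaussian distribution.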
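The single-baseline window function of eq(5.61) is a Gaussian in l centred on l_i, with a width set by the beam. The sketch below evaluates it and its full width at half maximum; the function names are our own, and any beam width passed in is an input assumption rather than an MBI value.

```python
import math


def window_single_baseline(l, l_i, sigma):
    """W_ii,l = exp(-(l - l_i)^2 sigma^2) from eq. (5.61); sigma in radians."""
    return math.exp(-((l - l_i) ** 2) * sigma ** 2)


def window_fwhm(sigma):
    """Full width at half maximum of W_ii,l in l: 2 sqrt(ln 2) / sigma."""
    return 2.0 * math.sqrt(math.log(2.0)) / sigma
```

A broader primary beam (larger σ) gives a narrower window in l, which is the image-plane / u-v plane reciprocity discussed in §5.5.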
5.6.2 Effect of finite frequency bandwidth on width of window function
Our beam-combiner-detector system works in the following way. For two antennas which
output electric fields E1 and E2, it first adds them and then squares the sum, so that what we
record in the detector is $(E_1 + E_2)(E_1^* + E_2^*)$. This is all very well, but when there is a finite
bandwidth, the detector sums this up over all frequencies, and we get $\int d\nu\, (E_1 + E_2)(E_1^* + E_2^*)$
instead.
Now recall from §5.4 that $E_1 E_2^*$ is proportional to the visibility. Therefore, we need to
integrate over all frequencies to get the visibility and the expression for a single visibility now
becomes
$$V_i = K \int d\nu \int d\mathbf{n}\, A(\mathbf{n})\, I(\mathbf{n})\, e^{i 2\pi \mathbf{u}_i\cdot\mathbf{n}} \tag{5.62}$$
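So, we can follow through the entire last section with two integrals over the previous expression
thus:
$$\frac{\langle V_i V_j^* \rangle}{K^2 T^2} = \int d\nu \int d\nu' \int d\mathbf{n} \int d\mathbf{n}'\, A_i(\mathbf{n})\, A_j^*(\mathbf{n}') \sum_{lm} \sum_{l'm'} Y_{lm}(\mathbf{n})\, Y^*_{l'm'}(\mathbf{n}')\, \langle a_{lm} a^*_{l'm'} \rangle\, e^{i 2\pi(\vec{u}_i\cdot\mathbf{n} - \vec{u}_j\cdot\mathbf{n}')} \tag{5.63}$$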
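Now, from the 1-D relation for angular resolution
$$\Delta\theta = \frac{\lambda}{B} \tag{5.64}$$
- where in this case, B is the baseline - we can deduce the following:
$$l \sim \frac{2\pi}{\Delta\theta} = 2\pi \frac{B}{\lambda} = 2\pi \frac{B}{c}\, \nu \;\Rightarrow\; d\nu = \frac{c}{2\pi} \frac{1}{B}\, dl \tag{5.65}$$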
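Substituting this in the above expression for the window function, we get:
$$W_{ij,l} = \frac{1}{B_i B_j} \left(\frac{c}{(2\pi)^2}\right)^2 \int_0^{2\pi} d\phi \int dl \int dl'\, \tilde{A}_j^*(\mathbf{l} - \mathbf{l}_j)\, \tilde{A}_i(\mathbf{l}_i - \mathbf{l}) \tag{5.66}$$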
This is the general expression for two different baselines. But what are the limits of integration
over l and l′? Define the center frequency to be ν0. Let the band be defined by the lower and
upper frequencies ν0 − ∆ν and ν0 + ∆ν respectively. The lower and upper limits of integration
for the baseline labelled i are then
$$l_{i1} = 2\pi \frac{B_i}{c} (\nu_0 - \Delta\nu) \tag{5.67}$$
and
$$l_{i2} = 2\pi \frac{B_i}{c} (\nu_0 + \Delta\nu) \tag{5.68}$$
respectively, and similarly also for the baseline labelled j.
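The limits in eqs(5.67) and (5.68) show how a finite band smears one baseline over a range of l. The sketch below computes that range; the baseline, centre frequency and half-bandwidth in the usage note are hypothetical numbers chosen only to show the scaling.

```python
import math

C_M_PER_S = 299_792_458.0  # speed of light


def l_band_limits(baseline_m, nu0_hz, delta_nu_hz):
    """Lower and upper l for one baseline, from eqs. (5.67) and (5.68)."""
    scale = 2.0 * math.pi * baseline_m / C_M_PER_S
    return scale * (nu0_hz - delta_nu_hz), scale * (nu0_hz + delta_nu_hz)


def fractional_l_width(baseline_m, nu0_hz, delta_nu_hz):
    """(l_i2 - l_i1) / l_centre, which reduces to 2 delta_nu / nu0."""
    l_lo, l_hi = l_band_limits(baseline_m, nu0_hz, delta_nu_hz)
    return (l_hi - l_lo) / (0.5 * (l_lo + l_hi))
```

With, say, `l_band_limits(0.086, 90e9, 10e9)`, the fractional width in l equals the fractional bandwidth 2∆ν/ν0 regardless of the baseline length, so longer baselines are smeared over proportionally more multipoles.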
5.7 Visibility in the polarized case
We discuss now how the output of a polarized interferometer is related to the Stokes’
parameters.
Consider two horn antennas / radio telescopes separated by a distance B. These define
one baseline. Let them be oriented to receive a signal from an extended object in the sky.
Then, the electric fields at the two antennas are the same except for a phase factor that
depends on their separation: $\frac{2\pi}{\lambda} B \sin\theta$. Both E1 and E2 can be written in terms of x- and y-
polarized states thus:
$$\mathbf{E}_1 = E_x\hat{\mathbf{x}} + E_y\hat{\mathbf{y}} \tag{5.69}$$
$$\mathbf{E}_2 = \left(E_x\hat{\mathbf{x}} + E_y\hat{\mathbf{y}}\right) e^{-i\frac{2\pi}{\lambda}B\alpha} \tag{5.70}$$
The reason we do this is that we wish to express all measurable quantities in terms of the Q,U
parameters, which are very easily expressed in terms of the components $E_x$ and $E_y$.
In general, waveguides can be coupled to some combination of linear polarizations, so:
$$\mathbf{E}_1 = a_1 E_x\hat{\mathbf{x}} + a_2 E_y\hat{\mathbf{y}} \tag{5.71}$$
$$\mathbf{E}_2 = \left(b_1 E_x\hat{\mathbf{x}} + b_2 E_y\hat{\mathbf{y}}\right) e^{-i\frac{2\pi}{\lambda}B\alpha} \tag{5.72}$$
If $a_2 = b_2 = 0$, then one linear polarization is chosen; if $a_2/a_1 = \pm i$ then a circular polarization is
chosen.
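The Stokes’ parameters are defined as follows:
$$T = \langle |E_x|^2 + |E_y|^2 \rangle \tag{5.73}$$
$$Q = \langle |E_x|^2 - |E_y|^2 \rangle \tag{5.74}$$
$$U = \langle 2\,\Re(E_x^* E_y) \rangle \tag{5.75}$$
$$V = \langle 2\,\Im(E_x^* E_y) \rangle \tag{5.76}$$
Then, $E_x$ and $E_y$ can be expressed in terms of the Stokes’ parameters:
$$|E_x|^2 = \frac{1}{2}(T + Q) \tag{5.77}$$
$$|E_y|^2 = \frac{1}{2}(T - Q) \tag{5.78}$$
$$E_x^* E_y = \frac{1}{2}(U + iV) \tag{5.79}$$
$$E_x E_y^* = \frac{1}{2}(U - iV) \tag{5.80}$$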
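Then, the output of the multiplying interferometer is:
$$\langle E_1^* E_2 \rangle = \frac{1}{2}\, e^{-i\frac{2\pi}{\lambda}B\alpha} \left[a_1^* b_1 (T + Q) + a_2^* b_2 (T - Q) + a_1^* b_2 (U + iV) + a_2^* b_1 (U - iV)\right] \tag{5.81}$$
Simplifying,
$$\langle E_1^* E_2 \rangle = \frac{1}{2}\, e^{-i\frac{2\pi}{\lambda}B\alpha} \left[(a_1^* b_1 + a_2^* b_2)\,T + (a_1^* b_1 - a_2^* b_2)\,Q + (a_1^* b_2 + a_2^* b_1)\,U + i\,(a_1^* b_2 - a_2^* b_1)\,V\right] \tag{5.82}$$
We can now do an integration over x′ and y′, as shown in §5.4, to end up with:
$$\iint \langle E_1^* E_2 \rangle\, dx'\, dy' = K \iint A(x', y') \left[(a_1^* b_1 + a_2^* b_2)\,T + (a_1^* b_1 - a_2^* b_2)\,Q + (a_1^* b_2 + a_2^* b_1)\,U + i\,(a_1^* b_2 - a_2^* b_1)\,V\right] e^{-i\frac{2\pi}{\lambda}B\alpha}\, dx'\, dy'$$
or
$$V = K\tilde{A} * \left[(a_1^* b_1 + a_2^* b_2)\,\tilde{T} + (a_1^* b_1 - a_2^* b_2)\,\tilde{Q} + (a_1^* b_2 + a_2^* b_1)\,\tilde{U} + i\,(a_1^* b_2 - a_2^* b_1)\,\tilde{V}\right] \tag{5.83}$$
where A(x′, y′) is the antenna pattern, tildes denote a Fourier transform and the asterisk denotes
convolution.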
We need to assign one kind of polarization, i.e. either linear or circular, in order to figure
out the four different visibilities. Let us consider two horns; one that outputs left circular
polarization and the other that outputs right. Define left and right polarization states thus:
$$R \equiv \frac{a_2}{a_1} = -i \tag{5.84}$$
$$L \equiv \frac{a_2}{a_1} = i \tag{5.85}$$
for E1 and
$$R \equiv \frac{b_2}{b_1} = -i \tag{5.86}$$
$$L \equiv \frac{b_2}{b_1} = i \tag{5.87}$$
for E2.
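Substituting these values in the above equations leads to:
$$V_{RL} = K\tilde{A} * \widetilde{(Q + iU)} \tag{5.88}$$
Similarly, we get the other visibilities:
$$V_{LR} = K\tilde{A} * \widetilde{(Q - iU)} \tag{5.89}$$
$$V_{RR} = K\tilde{A} * \widetilde{(T + V)} \tag{5.90}$$
$$V_{LL} = K\tilde{A} * \widetilde{(T - V)} \tag{5.91}$$
Eqs(5.88)-(5.91) are visibilities for the circular polarization case. For linear polarization,
$$X \equiv \frac{a_2}{a_1} = 0 \tag{5.92}$$
$$Y \equiv \frac{a_1}{a_2} = 0 \tag{5.93}$$
for E1 and
$$X \equiv \frac{b_2}{b_1} = 0 \tag{5.94}$$
$$Y \equiv \frac{b_1}{b_2} = 0 \tag{5.95}$$
for E2 and so
$$V_{XY} = K\tilde{A} * \widetilde{(U + iV)} \tag{5.96}$$
$$V_{YX} = K\tilde{A} * \widetilde{(U - iV)} \tag{5.97}$$
$$V_{XX} = K\tilde{A} * \widetilde{(T + Q)} \tag{5.98}$$
$$V_{YY} = K\tilde{A} * \widetilde{(T - Q)} \tag{5.99}$$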
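The eight visibilities above can be checked mechanically. The sketch below expands $(a_1 E_x + a_2 E_y)^* (b_1 E_x + b_2 E_y)$ using the Stokes relations of eqs(5.77)-(5.80) and returns the coefficient multiplying each of T, Q, U and V; the fringe phase factor and the constant K are left out. The function name and the tuples used for the R, L, X and Y couplings are our own labels, following the ratios in eqs(5.84)-(5.87) and (5.92)-(5.95).

```python
def stokes_coefficients(a, b):
    """Coefficients (cT, cQ, cU, cV) such that
    (a1 Ex + a2 Ey)* (b1 Ex + b2 Ey) = cT T + cQ Q + cU U + cV V,
    using |Ex|^2 = (T+Q)/2, |Ey|^2 = (T-Q)/2, Ex* Ey = (U+iV)/2, Ex Ey* = (U-iV)/2.
    """
    a1c = complex(a[0]).conjugate()
    a2c = complex(a[1]).conjugate()
    b1, b2 = complex(b[0]), complex(b[1])
    c_t = 0.5 * (a1c * b1 + a2c * b2)
    c_q = 0.5 * (a1c * b1 - a2c * b2)
    c_u = 0.5 * (a1c * b2 + a2c * b1)
    c_v = 0.5j * (a1c * b2 - a2c * b1)
    return c_t, c_q, c_u, c_v


# Coupling coefficients (first, second) for each polarization state.
R = (1.0, -1j)   # a2/a1 = -i
L = (1.0, 1j)    # a2/a1 = +i
X = (1.0, 0.0)   # only the x component is coupled
Y = (0.0, 1.0)   # only the y component is coupled
```

For instance, `stokes_coefficients(R, L)` returns coefficients proportional to Q + iU and `stokes_coefficients(X, Y)` returns coefficients proportional to U + iV, matching eqs(5.88) and (5.96) up to the overall normalization absorbed into K.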
5.8 Why Use an Interferometer?
The preceding section describes the output of an interferometer and how it relates to
the power spectrum. But why build an interferometer instead of a more traditional imaging
system for studying CMB polarization? There are a number of reasons that have motivated the
construction of the interferometers listed in Table 5.2. The main reason is to control systematic
effects, which in some cases are more manageable than in imaging systems. There are additional
factors, especially aperture size, that favor interferometric approaches over imaging for space-
based systems. For equivalent angular resolution, an interferometer can be substantially simpler
and less costly than a single aperture.
5.8.1 Angular Resolution
For a monolithic dish of diameter, D, equal to the length of a two-element interferometer
baseline, B, the interferometer has angular resolution (fringe spacing) roughly twice as good
as that of the monolithic dish. The reason for this difference in angular resolution is that
the filled dish is dominated by spacings that are much smaller than the aperture diameter.
The full width to the first zero for a uniformly illuminated aperture of diameter D is 2.4λ/D.
The full width to the first zero for a two-element interferometer, when the baseline B is much
larger than the individual aperture diameter, is λ/B. It is helpful to consider the difference
between the systems in l-space as well. For an interferometer the window function peaks at
l = 2πB/λ. For an imaging system with a Gaussian beam the window function is $W_l = e^{-l^2\sigma^2}$.
The beamwidth σ = 0.42 FWHM and FWHM = (1.02 + 0.0135Te)λ/D where Te is the edge
taper of the antenna in dB [5]. For an edge taper of 40 dB (typical for CMB instruments),
FWHM = 1.51λ/D, σ = 0.66λ/D and the window function falls to 10% of its peak value at
l = 2.29D/λ, which is less than half of the peak l-value for an interferometer baseline of the
same size.
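The l-space comparison in the preceding paragraph can be reproduced in a few lines. The sketch takes the quoted σ = 0.66λ/D as given and works in units of D/λ (imager) and B/λ (interferometer); the function names are our own.

```python
import math


def imager_l_at_fraction(sigma_coeff, fraction):
    """l, in units of D/lambda, at which W_l = exp(-l^2 sigma^2) falls to
    `fraction` of its peak, for a beam with sigma = sigma_coeff * lambda / D."""
    return math.sqrt(-math.log(fraction)) / sigma_coeff


def interferometer_peak_l():
    """Peak of the interferometer window function, in units of B/lambda."""
    return 2.0 * math.pi


L_IMAGER_10_PERCENT = imager_l_at_fraction(0.66, 0.10)
L_INTERFEROMETER_PEAK = interferometer_peak_l()
```

With B = D, the imager's window has already dropped to 10% at l of about 2.3 D/λ, while the interferometer's window peaks at about 6.3 B/λ.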
This angular resolution factor is important because the size of the aperture is a cost-driver
for the EIP mission. Angular resolution is important for CMB polarization measurements in
two ways. First, imperfections in the shape and pointing of beams couple the CMB temperature
anisotropy into false polarization signals. These problems can be reduced significantly if the
CMB is smooth on the scale of the beam size, which happens for beams smaller than ∼10′ [6].
Second, removing contamination of the tensor B-mode signal by B-modes from weak lensing
requires maps of the lensing at higher angular resolution than the scale at which the tensor
B-modes peak [7].
5.8.2 No Rapid Chopping and Scanning
Imaging systems with either coherent or incoherent detectors typically use some form of
“chopping,” either by nutating a secondary mirror or by steering the entire primary at a rate
faster than the 1/f noise in the atmosphere and detectors. Similar approaches are used with
arrays of detectors. Since an interferometer does not require this rapid chopping, the time
constants of the bolometers used can be relatively long. When using an imaging system to
form a two-dimensional (2D) map with minimal striping or other artifacts, the scan method
must move the beam (or beams) on the sky at a rapid rate. Interferometers provide direct
2D imaging and do not require such scanning strategies. In the interferometer, only correlated
signals are detected, so it has reduced sensitivity to changes in the total power signal absorbed
by the detectors [4].
5.8.3 Clean Optics
The simplicity of an interferometric optical system eliminates numerous systematic prob-
lems that plague any imaging optical system. Instead of a single reflector antenna, the in-
terferometers we have studied use arrays of corrugated horn antennas. These antennas have
extremely low sidelobes and have easily calculable, symmetric beam patterns. Furthermore,
there are no reflections from optical surfaces to induce spurious instrumental polarization, an
unavoidable problem for any system with imaging optics [8, 9]. In principle, one could con-
struct an imaging instrument without reflective optics; an array of horn antennas, each coupled
directly to a polarimeter, could view the sky directly. Each horn aperture would be sized to
provide the required angular resolution. However, such a system uses the aperture plane ineffi-
ciently. A single horn antenna in such an imaging system will have angular resolution ∼ 2λ/D,
where D is the horn diameter. An N - element interferometric horn array that achieves the
same angular resolution will have a maximum baseline length of B = D (and require the same
aperture size), but will collect N modes of radiation from the sky and hence be more sensitive.
Another advantage over an imaging system is the absence of aberrations from off-axis
pixels: all feed elements are equivalent for the interferometer. In contrast to an imaging system,
the field-of-view (FOV) of an interferometer is determined by the primary beamwidth of the
array elements, not by beam distortion and cross-polarization at the edge of the focal plane.
One can choose to increase the sensitivity of the instrument by collecting more modes (optical
throughput) of radiation from the sky. In the interferometer this can be done by adding
additional antennas; the only limitation is the size of the aperture plane rather than optical
aberrations in the focal plane. The largest usable FOV for an off-axis Gregorian reflector is
approximately 7° [10]. See Table 5.1 for a comparison of imaging and interferometric optical
systems.
Table 5.1: Comparison of various optical designs for the EIP. To achieve the same angular
resolution each instrument allows different amounts of throughput (number of modes) and
requires different aperture diameters, D. For the Gregorian the edge taper on the primary
mirror illumination is assumed to be −40 dB, the diameter of the FOV is given in degrees and
the number of modes is approximately [FOV/(angular resolution)]², assuming all the modes
reaching the focal plane are coupled to detectors. For the imaging horn array, the horn diameter
= D. For the interferometric horn array, D = B, the diameter of a close-packed array of horns,
each of diameter d, and the number of modes is given by the number of horns ∼ (D/d)². In
the last three columns, for all cases, the angular resolution = 1° and λ = 3 mm.
Instrument              Angular resolution (FWHM)   FOV (°)   Aperture D (cm)   Modes
Gregorian telescope     1.51λ/D                     ∼7        26                49
Imaging horn array      2λ/D                        2λ/D      34                1
Interfer. horn array    λ/2D                        2λ/d      8.6               16
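The aperture diameters in Table 5.1 follow from setting each resolution formula equal to 1° at λ = 3 mm. The sketch below redoes that arithmetic; the function and variable names are our own.

```python
import math

WAVELENGTH_CM = 0.3                  # lambda = 3 mm
RESOLUTION_RAD = math.radians(1.0)   # target angular resolution of 1 degree


def aperture_cm(coefficient):
    """Aperture D (cm) for which coefficient * lambda / D equals the target resolution."""
    return coefficient * WAVELENGTH_CM / RESOLUTION_RAD


gregorian_d = aperture_cm(1.51)      # resolution 1.51 lambda / D
imaging_horn_d = aperture_cm(2.0)    # resolution 2 lambda / D
interferometer_d = aperture_cm(0.5)  # resolution lambda / (2 D)

# Number of modes for the Gregorian: [FOV / (angular resolution)]^2 with a 7 degree FOV.
gregorian_modes = (7.0 / 1.0) ** 2
```

The three diameters come out at roughly 26 cm, 34 cm and 8.6 cm, so for the same resolution the interferometric array needs an aperture about a third the size of the Gregorian's.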
5.8.4 Direct Measurement of Stokes Parameters
Interferometry solves many of the problems related to mismatched beams and pointing
errors raised by Hu et al. (2003) [6]. This advantage arises because interferometers measure
the Stokes parameters directly, without differencing the signal from separate detectors.
Imaging instruments for CMB polarization measure the power in each linear polarization
on separate bolometers and then form the difference of the two signals to determine the linear
polarization. This approach requires careful matching of the bolometers. Moreover, if the
signals being differenced come from two different antennas, then the beam patterns and pointing
of the two antennas must coincide precisely. Any mismatch converts power from the total
intensity into a spurious polarization signal [6]. In an interferometer, differences in antenna
patterns for the different horns do not couple intensity to polarization in this way (See §5.9).
An interferometer measures the Stokes parameters by correlating the components of the
electric field captured by each antenna with the components from all of the other antennas. If
the output of each antenna is split into Ex and Ey by an orthomode transducer (OMT), on the
baseline formed by two antennas, 1 and 2, the interferometer’s correlators measure 〈E1xE2x〉,
〈E1yE2y〉, 〈E1xE2y〉, and 〈E1yE2x〉. The first two are used to determine I and the latter two
measure U . Rotating the instrument allows a measurement of Q. Stokes V can be recovered in
a similar manner but is expected to be zero for the CMB. Alternatively, the antenna outputs can
be separated into left- and right-circular polarization components by a combination of an OMT
and a polarizer. Correlating these signals also allows recovery of all four Stokes parameters.
DASI uses a switchable polarizer to accomplish this [11].
Separation of E and B Modes. A significant challenge in CMB polarization measurements
is separation of the very weak B modes from the much stronger E modes. Unless a full-sky
map (an impossibility because of Galactic cuts) is made with infinite angular resolution the
two modes “leak” into each other [12, 13]. It has been shown [14, 15], however, that an
interferometer can separate the E and B modes more cleanly than can an imaging experiment,
although detailed calculations of this advantage in realistic simulations remain to be done.
5.9 Systematic Effects
Hu et al. (2003) [6] have reviewed systematic effects relevant to CMB polarization mea-
surements, mainly in the context of imaging instruments. Bunn (2006) [16] performs similar
calculations for interferometers. Table 5.2 outlines a variety of systematic errors and how they
can be managed in imaging and interferometric instruments. The relative importance of these
effects is quite different in interferometric systems: some sources of systematic error in imaging
systems are dramatically reduced in interferometers. As an example we consider the effects of
pointing errors and mismatched antenna patterns.
In a traditional imaging system, the Stokes parameters Q and U are determined by
subtracting the intensities of two different polarizations. For example, Q might be measured by
splitting the incoming radiation into x and y polarizations, determining the intensities Ix and
Iy of the two polarizations, and subtracting. In such an experiment, any mismatch in the beam
patterns used to determine Ix and Iy (including differential pointing errors as well as different
beam shapes) will cause leakage from total power (T ) into polarization (Q,U).
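The leakage mechanism described above can be illustrated with a minimal one-dimensional sketch. It assumes an unpolarized sky (so the true Q is zero) with a temperature gradient, Gaussian beams, and a small differential pointing offset; the grid, beam widths, offset, and function names are hypothetical choices for illustration and are not MBI parameters.

```python
import numpy as np

def gaussian_beam(r, center, fwhm):
    """Gaussian beam on the 1-D sky coordinate r (degrees), normalized to unit sum."""
    sigma = fwhm / np.sqrt(8.0 * np.log(2.0))
    beam = np.exp(-0.5 * ((r - center) / sigma) ** 2)
    return beam / beam.sum()

def measured_q(intensity, beam_x, beam_y):
    """Differencing estimate Q = Ix - Iy for an unpolarized sky (Ix = Iy = I/2 everywhere)."""
    i_x = np.sum(beam_x * 0.5 * intensity)
    i_y = np.sum(beam_y * 0.5 * intensity)
    return i_x - i_y

r = np.linspace(-10.0, 10.0, 2001)      # sky coordinate in degrees (assumed grid)
intensity = 1.0 + 0.1 * r               # unpolarized sky with a temperature gradient

# Matched beams: the difference vanishes, as it should for an unpolarized sky.
q_matched = measured_q(intensity, gaussian_beam(r, 0.0, 1.0), gaussian_beam(r, 0.0, 1.0))
# A 0.1-degree differential pointing error turns the gradient into a spurious Q.
q_offset = measured_q(intensity, gaussian_beam(r, 0.0, 1.0), gaussian_beam(r, 0.1, 1.0))
print(q_matched, q_offset)
```

In this sketch the spurious signal is proportional to the product of the temperature gradient and the pointing offset, which is why an imaging experiment must control differential pointing tightly.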
In an interferometer, the signals are combined before squaring to get intensities. In such a
system, mismatched beams do not lead to leakage from temperature into polarization. Suppose
that the signal entering each horn of an interferometer is split into horizontal and vertical
polarizations. Working in the flat-sky approximation, let $E_{ix}(\mathbf{r})$ and $E_{iy}(\mathbf{r})$ stand for the x and
y components of the electric field of the radiation entering the ith horn from position $\mathbf{r}$ on the
sky. The signals coming out of each horn are averages of the incoming electric fields weighted
by some antenna patterns $G_{i(x,y)}(\mathbf{r})$.
Table 5.2: A Comparison of Systematic Effects

| Systematic Effect | Imaging System Solution | Interferometer Solution |
|---|---|---|
| Cross-polar beam response | Instrument rotation & correction in analysis | Instrument rotation & non-reflective optics |
| Beam ellipticity | Instrument rotation & small beamwidth | No T to E and B leakage from beams; instrument rotation |
| Polarized sidelobes | Correction in analysis | Correction in analysis |
| Instrumental polarization | Rotation of instrument & correction in analysis | Clean, non-reflective optics |
| Polarization angle | Construction & characterization | No T to E and B leakage from beams; construction & characterization |
| Relative pointing | Rotation of instrument & dual polarization pixels | No T to E and B leakage from beams; instrument rotation |
| Relative calibration | Measure calibration using temperature anisotropies | Detector comparison not required for mapping or measuring Q and U |
| Relative calibration drift | Control scan-synchronous drift to 10^-9 level | All signals on all detectors |
| Optics temperature drifts | Cool optics to ∼3 K & stabilize to < µK | No reflective optics |
| 1/f noise in detectors | Scanning strategy & phase modulation/lock-in | Instantaneous measurement of power spectrum without scanning |
| Astrophysical foregrounds | Multiple frequency bands | Multiple frequency bands |

In an interferometer, these signals are multiplied together to obtain a visibility. To
measure the Stokes parameter U, for example, we would multiply the x signal from horn i with
the y signal from horn j to obtain the visibility
\[
V^U_{ij} = \int d^2r_1\, d^2r_2\, G_{ix}(\mathbf{r}_1)\, G_{jy}(\mathbf{r}_2)\, \langle E_{ix}(\mathbf{r}_1)\, E^*_{jy}(\mathbf{r}_2) \rangle .
\]
The angle brackets denote a time average. The electric fields due to radiation coming from two
different points on the sky are uncorrelated, and the product of x and y components of the
electric field gives the Stokes U parameter:
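\[
\langle E_{ix}(\mathbf{r}_1)\, E^*_{jy}(\mathbf{r}_2) \rangle = U(\mathbf{r}_1)\, e^{2\pi i\, \mathbf{u}\cdot\mathbf{r}_1}\, \delta(\mathbf{r}_1 - \mathbf{r}_2),
\]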
so the visibility is
\[
V^U_{ij} = \int d^2r\, G_{ix}(\mathbf{r})\, G_{jy}(\mathbf{r})\, U(\mathbf{r})\, e^{2\pi i\, \mathbf{u}\cdot\mathbf{r}} .
\]
Note that the visibility $V^U_{ij}$ does not contain any contribution from the total intensity
(Stokes I), even if the two antenna patterns are different. This means that differential pointing
errors and different beam shapes for different antennas do not cause leakage from T into E and
B. Antenna pattern differences do cause distortion of the observed polarization field, so errors
in modeling beam shapes and pointing may cause mixing between E and B.
Coupling between intensity and polarization will arise if the beams have cross-polar
contributions. In that case, the visibility $V^U_{ij}$, which is supposed to be sensitive to just
polarization, will contain contributions proportional to $\langle E_x E_x^* \rangle$ and $\langle E_y E_y^* \rangle$, to which
Stokes I does contribute.
The same considerations apply if the incoming radiation is split into circular rather
than linear polarization states. The visibility $V^{RL}_{ij}$, obtained by interfering the right-circularly-polarized
signal entering horn i with the left-circularly-polarized signal entering horn j, contains
only contributions from Q and U if the beams are co-polar, even if the two horns have different
beams. Again, cross-polarity induces leakage from intensity into polarization.
In short, in an interferometer, beam mismatches are less of a worry than cross-polar
contributions. The reverse is true for an imaging system.
5.10 The Adding Interferometer
In a simple 2-element radio interferometer, signals from two telescopes aimed at the same
point in the sky are correlated so that the sky temperature is sampled with an interference
pattern with a single spatial frequency. The output of the multiplying interferometer is the vis-
ibility (defined in the last section). With more antennas these same correlations are performed
along each baseline. To recover the full phase information, complex correlators are used to
measure simultaneously both the in-phase and quadrature-phase components of the visibility.
In interferometers that use incoherent detectors, such as an optical interferometer, EPIC
and MBI, the electric field wavefronts from two antennas are added and then squared in a
detector — an “adding” interferometer as opposed to a “multiplying” interferometer [17]. (See
Figure 5.8.) The result is a constant term proportional to the intensity plus an interference term.
The constant term is an offset that is removed by phase-modulating one of the signals. Phase-
sensitive detection at the modulation frequency recovers both the in-phase and quadrature-
phase interference terms and reduces susceptibility to low-frequency drifts (1/f noise) in the
bolometer and readout electronics. The adding interferometer recovers the same visibility as a
multiplying interferometer.
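The recovery of the interference term by phase modulation can be sketched numerically. The following is a minimal illustration, assuming two equal-amplitude fields, a modulation phase swept uniformly through one full cycle, and ideal demodulation against the known modulation phase; the amplitude, fringe phase, and function names are hypothetical, and the actual MBI modulation sequence may differ.

```python
import numpy as np

def detector_power(e0, phi, dphi):
    """Square-law detector output for the summed fields E0 and E0*exp(i(phi + dphi))."""
    field = e0 + e0 * np.exp(1j * (phi + dphi))
    # |field|^2 = 2 E0^2 + 2 E0^2 cos(phi + dphi): constant offset plus interference term
    return np.abs(field) ** 2

def lock_in(power, dphi):
    """Phase-sensitive detection against the known modulation phase dphi."""
    in_phase = np.mean(power * np.cos(dphi))      # tends to E0^2 cos(phi)
    quadrature = -np.mean(power * np.sin(dphi))   # tends to E0^2 sin(phi)
    return in_phase, quadrature

e0, phi = 1.5, 0.7                                # hypothetical amplitude and fringe phase
dphi = np.linspace(0.0, 2.0 * np.pi, 1000, endpoint=False)  # one full modulation cycle
in_phase, quadrature = lock_in(detector_power(e0, phi, dphi), dphi)
print(in_phase, e0**2 * np.cos(phi))
print(quadrature, e0**2 * np.sin(phi))
```

The constant offset averages to zero against both references, so only the in-phase and quadrature interference terms survive, which is the sense in which the adding interferometer recovers the same visibility as a multiplying one.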
In an interferometer with an array of N > 2 antennas, the signals are combined in such
a way that interference fringes are measured for all possible baselines (N(N − 1)/2 antenna
pairs). This combination can occur in two different ways: pairwise combination or Fizeau (or
Butler) combination [18]. Pairwise combination involves splitting the power from each of the N
antennas in the array N − 1 ways, adding the signals in a pairwise fashion, and then squaring
the signals and separating out the interference term as described above. In optical systems
this approach is analogous to Michelson interferometry. In Butler combination the signals from
each of the antennas are split and then combined in such a way that linear combinations of all
the antenna signals are formed at each of the outputs of the combiner (Figure 5.9). To allow
all the Stokes parameters to be determined simultaneously, orthomode transducers (OMTs)
are inserted after corrugated horn antennas. In this case, the Butler combiner delivers the
signals from 2N antenna outputs to 2N detectors. Each detector squares these amplitudes,
creating interference signals from all baselines simultaneously on each detector. Effectively, the
signals from all baselines are multiplexed onto each of the 2N detectors. Only 2N detectors are
required, rather than the 2N(2N − 1)/2 detectors required for pairwise combination. Butler
combiners are commonly used for phased array antennas with coherent systems using either
waveguide or coaxial techniques. The optical analog is Fizeau combination, which is typically
used for incoherent systems at optical wavelengths. We have developed both Butler and Fizeau
approaches and have decided to concentrate on the Fizeau method because of its relative
simplicity and low loss. However, in a coherent system, with amplifiers, the Butler approach is still
an attractive option for forming a large-N interferometer.
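The detector counts quoted above for the two combination schemes can be tabulated with a short sketch. It simply evaluates the expressions given in the text, N(N − 1)/2 baselines, 2N detectors for Butler combination and 2N(2N − 1)/2 for pairwise combination with OMTs; the function names are ours.

```python
def n_baselines(n_antennas):
    """Number of antenna pairs, N(N - 1)/2."""
    return n_antennas * (n_antennas - 1) // 2

def detectors_butler(n_antennas):
    """Butler (or Fizeau) combination with OMTs: one detector per antenna output, 2N."""
    return 2 * n_antennas

def detectors_pairwise(n_antennas):
    """Pairwise combination with OMTs: one detector per pair of the 2N outputs."""
    outputs = 2 * n_antennas
    return outputs * (outputs - 1) // 2

for n in (4, 8, 16, 64):
    print(n, n_baselines(n), detectors_butler(n), detectors_pairwise(n))
```

The pairwise count grows quadratically with N while the Butler count grows linearly, which is the practical argument for multiplexing all baselines onto every detector in a large array.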
Figure 5.8: Adding interferometer. At antenna A2 the electric field is $E_0$, and at A1 it is $E_0 e^{i\phi}$,
where $\phi = kB\sin\alpha$ and $k = 2\pi/\lambda$. B is the length of the baseline, and α is the angle of the
source with respect to the symmetry axis of the baseline, as shown. (For simplicity consider
only one wavelength, λ, and ignore time dependent factors.) In a multiplying interferometer
the in-phase output of the correlator is proportional to $E_0^2\cos\phi$. For the adding interferometer,
the output is proportional to $E_0^2 + E_0^2\cos(\phi + \Delta\phi(t))$. Modulation of $\Delta\phi(t)$ allows the recovery
of the interference term, $E_0^2\cos\phi$, which is proportional to the visibility of the baseline.
Figure 5.9: Block diagram of a planned CMB polarization experiment. Light enters the instru-
ment from the left. Each phase switch is modulated in a sequence that allows recovery of the
interference terms (visibilities) by phase-sensitive detection at the detectors. The signals are
mixed in the beam combiner and detected on cold bolometers at the right. The beam combiner
can be implemented either using guided waves (Butler combiner, as shown here) or quasiopti-
cally (Fizeau combiner, see below). The triangles represent corrugated conical horn antennas,
which connect through transitions to rectangular waveguide. Orthomode transducers (OMTs)
allow all the Stokes parameters to be determined simultaneously.
Bibliography
[1] K. Rohlfs and T. L. Wilson, Tools of Radio Astronomy, Astronomy and Astrophysics
Library, Springer-Verlag, Berlin Heidelberg New York, 1996.
[2] C. Calderon, “Simulation of the Performance of the Millimetre-Wave Bolometric
Interferometer (MBI) for Cosmic Microwave Background Observations,” Ph.D. Thesis, Cardiff, 2006.
[3] J. Kim, “The Millimeter-wave Bolometric Interferometer (MBI) for Observing the
Cosmic Microwave Background Polarization,” Ph.D. Thesis, 2006.
[4] M. White, J. E. Carlstrom, M. Dragovan, and W. L. Holzapfel, “Interferometric
Observation of Cosmic Microwave Background Anisotropies,” ApJ, vol. 514, pp. 12–24, Mar. 1999.
[5] P. F. Goldsmith, Quasioptical Systems, IEEE Press, 1998.
[6] W. Hu, M. M. Hedman, and M. Zaldarriaga, “Benchmark parameters for CMB polarization
experiments,” Phys. Rev. D, vol. 67, no. 4, p. 043004, Feb. 2003.
[7] L. Knox and Y.-S. Song, “Limit on the Detectability of the Energy Scale of Inflation,”
Physical Review Letters, vol. 89, no. 1, p. 011303, July 2002.
[8] E. Carretti, R. Tascone, S. Cortiglioni, J. Monari, and M. Orsini, “Limits due to
instrumental polarisation in CMB experiments at microwave wavelengths,” New Astronomy, vol.
6, pp. 173–187, May 2001.
[9] E. Carretti, S. Cortiglioni, C. Sbarra, and R. Tascone, “Antenna instrumental polarization
and its effects on E- and B-modes for CMBP observations,” Astronomy & Astrophysics,
vol. 420, pp. 437–445, June 2004.
[10] S. Hanany and D. P. Marrone, “Comparison of designs of off-axis Gregorian telescopes for
millimeter-wave large focal-plane arrays,” Appl. Opt., vol. 41, pp. 4666–4670, Aug. 2002.
[11] E. M. Leitch, J. M. Kovac, C. Pryke, J. E. Carlstrom, N. W. Halverson, W. L. Holzapfel,
M. Dragovan, B. Reddall, and E. S. Sandberg, “Measurement of polarization with the
Degree Angular Scale Interferometer,” Nature, vol. 420, pp. 763–771, Dec. 2002.
[12] A. Lewis, A. Challinor, and N. Turok, “Analysis of CMB polarization on an incomplete
sky,” Phys. Rev. D, vol. 65, no. 2, p. 023505, Jan. 2002.
[13] E. F. Bunn, “Separating E from B,” New Astronomy Review, vol. 47, pp. 987–994, Dec.
2003.
[14] C.-G. Park, K.-W. Ng, C. Park, G.-C. Liu, and K. Umetsu, “Observational Strategies
of Cosmic Microwave Background Temperature and Polarization Interferometry
Experiments,” ApJ, vol. 589, pp. 67–81, May 2003.
[15] C.-G. Park and K.-W. Ng, “E/B Separation in Cosmic Microwave Background
Interferometry,” ApJ, vol. 609, pp. 15–21, July 2004.
[16] E. F. Bunn, “Systematic Errors in Microwave Background Interferometry,” to be submitted
to Phys. Rev. D, 2006.
[17] K. Rohlfs and T. L. Wilson, Tools of Radio Astronomy, Springer, 2004.
[18] J. Zmuidzinas, “Cramer-Rao sensitivity limits for astronomical instruments: implications
for interferometer design,” Optical Society of America Journal A, vol. 20, pp. 218–233,
Feb. 2003.
Chapter 6
The Fizeau Combiner: A Concept Study
6.1 Introduction
The advantages of interferometry have been discussed in the preceding chapter.
However, extraction of visibilities from an interferometer is not a unique process - there are
many different techniques that can be employed to do this. A general introduction to “adding
interferometry” was provided in §5.10, and one of the adding techniques was discussed (the
Butler beam combination technique). In general, we wish to obtain the highest signal-to-noise
ratio for every baseline, and based on this and other design-related criteria, it is possible to
choose the extraction technique best suited to the instrument.
Figure 6.1: A simple multi-slit diffraction/interference experiment. Phase differences occur
after light has passed through the slit, inside the instrument.
Figure 6.2: A simple traditional interferometer. Rays suffer phase differences before they enter
the slits.
In this chapter, we introduce one such technique, which we refer to as “Fizeau beam
combination”, and the beam combiner as a “Fizeau system”. This is a wavefront-division
interferometry technique which is analogous to the simplest interferometer in 1-D: the Young’s
double- (or multiple-) slit interference/diffraction set-up, an example of which is shown in fig.(6.1). While
this is very well-known, we stress here the fact that there is a path difference (and therefore a
phase difference) introduced inside the instrument, i.e. different rays entering the instrument
acquire a phase difference after they pass through the slits/antennae (these two terms will be
used interchangeably in the remainder of this chapter). Compare this to a traditional
interferometer (also shown in 1-D, though the extension to 2-D is straightforward) as shown in fig.(6.2),
where rays entering the instrument have already acquired a phase difference before they enter
the antennae.
A “Fizeau system” is one that combines both the aforementioned instruments, quite
literally. A simple Fizeau system is shown in fig.(6.3). Notice that rays entering the instrument
suffer phase differences both before and after they pass through the slit. It is this fact that
makes Fizeau combination a powerful tool. Let us explore what this combination achieves. We
start by noting that the “external” phase difference as shown in fig.(6.2) is the reason that
visibility is a Fourier transform of the image on the sky. As mentioned in §5.4, this implies that
the output (visibility) is essentially the intensity modulated by a fringe, where the fringe is a
function of baseline length, and therefore selects one “mode” from the image. In the Fizeau
system, we have an additional set of phase differences. Without loss of generality, we may
assign a negative sign to the phase introduced inside the instrument. Now, if we sum over both the
phases, we get a Fourier transform followed by an inverse Fourier transform - but this is the
image itself! Thus, Fizeau combination enables imaging in an interferometer. This is
discussed later in this chapter in §6.3.

Figure 6.3: A simple 1-d Fizeau system. Notice that there are two sets of phase differences.
Just as a traditional interferometer multiplies the image with the fringes produced by its
baselines (i.e. the fringes due to the “outer” phase differences), so the Fizeau system multiplies
visibilities with internal fringes, where each fringe is a function of baseline length, and is
produced due to “internal” phase differences. This is not an added complexity - visibility is a
complex quantity, but detectors measure only real and positive quantities. By modulating the
visibility by two different known phases, we can recover both the real and imaginary parts of
the visibility. But we can do more - if there is a large number of detectors on the focal plane, we
can modulate each visibility by many known phases and extract much more information than
is possible in a traditional interferometer. Let us explore how. Irrespective of which technique
we choose to employ, CMB interferometry will always require that we use as wide a bandwidth
as possible, since the CMB polarization signal is very low (∼ µK). Let ν be the center frequency
and ∆ν the bandwidth. Then, a baseline of length B will measure CMB polarization at
\[
\ell = \frac{2\pi B}{\lambda} \equiv \frac{2\pi \nu B}{c} \tag{6.1}
\]
where the width in ℓ-space is
\[
\Delta\ell = \frac{2\pi\, \Delta\nu\, B}{c} \tag{6.2}
\]
Thus, the larger the bandwidth, the larger the radial width of the pixel in the u-v plane. This
is shown in fig.(6.8). Herein lies one of the main problems with CMB interferometry: a large
bandwidth ensures high enough signal-to-noise, but increases the size of a pixel in the u-v
plane. The additional information that is available to us as mentioned above can be utilized
to sub-divide the band in the u-v plane. Thus, the Fizeau system enables extraction of
spectral information via geometry, without additional components like filters. We
discuss this aspect in detail in §6.2.

Figure 6.4: 2-slit diffraction pattern. The large envelope is caused by the single-slit
diffraction and the fine features by the interference between 2 slits.
Figure 6.5: 8-slit diffraction pattern. The pattern is more “focused”, leading to better image recovery.
Figure 6.6: 16-slit diffraction pattern, source 10 away from center.
Figure 6.7: 16-slit diffraction pattern, source 20 away from center.
To summarize, in this chapter, we study the aforementioned Fizeau combiner approach
to interferometry and find that it is more useful than traditional interferometry in two ways:
1. Possible to get spectral information within a single frequency band
2. Possible to use the instrument as both an imager and an interferometer
Before we begin to discuss the two aspects of the Fizeau system in detail, we note that the
simple 1-d multi-slit system acts as an imager as well. Figs.(6.4) and (6.5) show the diffraction
pattern due to 2 and 8 slits respectively, illustrating the fact that a larger number of slits leads
to better image recovery. This can be explained in terms of interferometry as follows. Each
baseline detects a mode in the image. The greater the number of baselines, the greater the
number of modes that can be recovered and hence the better the recovered image. Figs.(6.6)
and (6.7) illustrate that even in a simple multi-slit system, the image formed on the focal plane
traces the actual image faithfully.
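The sharpening of the pattern with the number of slits can be checked with a small numerical sketch. It assumes ideal, equally spaced, equally illuminated slits and keeps only the interference term (the single-slit envelope is ignored); the phase grid and function names are hypothetical.

```python
import numpy as np

def multi_slit_pattern(n_slits, phase_step):
    """Normalized interference pattern |sum_n exp(i n phase_step)|^2 / N^2."""
    slit_index = np.arange(n_slits)[:, None]
    field = np.exp(1j * slit_index * phase_step[None, :]).sum(axis=0)
    return np.abs(field) ** 2 / n_slits ** 2

def central_lobe_fwhm(n_slits, phase_step):
    """Full width at half maximum of the central lobe, in units of phase per slit."""
    pattern = multi_slit_pattern(n_slits, phase_step)
    above_half = phase_step[pattern >= 0.5]   # only the central lobe exceeds one half here
    return above_half.max() - above_half.min()

# Phase difference between adjacent slits, restricted to one period around zero.
phase_step = np.linspace(-3.0, 3.0, 6001)
fwhm_2 = central_lobe_fwhm(2, phase_step)
fwhm_8 = central_lobe_fwhm(8, phase_step)
fwhm_16 = central_lobe_fwhm(16, phase_step)
print(fwhm_2, fwhm_8, fwhm_16)
```

The width of the central lobe falls roughly as 1/N, which is the single-source version of the statement that more baselines recover more modes of the image.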
While the idea of using a Fizeau system is a novel one in CMB cosmology, and while the
Fizeau system employed in the MBI was developed by the MBI collaboration (and the following
ideas by the author), this is by no means the first time this technique has been employed
[1],[2],[3]. But our attempt to extract spectral information using the Fizeau system and use it
to run the instrument as an imager and an interferometer are certainly new developments, to
the best of our knowledge.
6.2 Spectral information from an interferometer using a Fizeau approach
6.2.1 Motivation
Figure 6.8: The u-v plane coverage of one baseline of an interferometer for a single pointing in
a single baseline orientation angle. The two causes of spread in a single pixel in the u-v plane
are shown. Also shown is the size of the FOV, which is the fundamental limit to u-v resolution.
As mentioned earlier in the chapter, cosmological signals are very weak; therefore, a
wide bandwidth helps increase the power input from the cosmological source. However, a
wide bandwidth also means poor u-v coverage as shown in fig.(5.4) and fig.(6.8). This can be
explained as follows. Consider a two-slit experiment with a monochromatic point source. This
experiment yields fringes that can be computed given the exact parameters of the experiment.
Now, if the same point source emits two different wavelengths that differ by a small percentage,
the two sets of fringes are slightly shifted relative to each other. If many such wavelengths are
present, each only slightly different from the one preceding it, then a “fringe-band” is produced
instead of clear, sharp fringes. We call this effect a “fringe wash-out”. The wider the bandwidth,
the greater the wash-out.
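The wash-out can be made quantitative with a minimal sketch that averages ideal two-slit fringes over a flat band. The band edges are taken from the 75–110 GHz range quoted in this chapter; the flat spectrum, the range of path delays, and the function names are assumptions made for illustration.

```python
import numpy as np

def fringe(delay_s, freq_hz):
    """Ideal two-slit fringe for a path delay (seconds) at a single frequency."""
    return 1.0 + np.cos(2.0 * np.pi * freq_hz * delay_s)

def band_averaged_fringe(delay_s, f_lo_hz, f_hi_hz, n_freq=2001):
    """Average of monochromatic fringes over a flat band between f_lo and f_hi."""
    freqs = np.linspace(f_lo_hz, f_hi_hz, n_freq)
    return fringe(delay_s[None, :], freqs[:, None]).mean(axis=0)

def contrast(pattern):
    """Fringe contrast (max - min) / (max + min)."""
    return (pattern.max() - pattern.min()) / (pattern.max() + pattern.min())

delay = np.linspace(0.0, 60e-12, 1201)            # path delay between the two slits
mono = fringe(delay, 92.5e9)                      # single frequency at band centre
broad = band_averaged_fringe(delay, 75e9, 110e9)  # full band

near = delay < 5e-12     # close to zero path difference
far = delay > 25e-12     # path differences comparable to 1/bandwidth and beyond
print(contrast(mono[far]), contrast(broad[near]), contrast(broad[far]))
```

In this sketch the monochromatic fringes keep full contrast at every delay, whereas the band-averaged fringes are sharp only near zero path difference and are strongly suppressed once the delay approaches the inverse bandwidth.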
Now, a single sharp (i.e. monochromatic) fringe corresponds to a single point on the u-v
plane. If the effect of the bandwidth is to add many such fringes, what it means is that we are
measuring the average visibility over a certain finite area on the u-v plane, where the radial
stretch is due to a finite bandwidth and the angular stretch represents the integration time for
the interferometer.
6.2.2 Preliminaries
The output measured at the bolometers in MBI contains the following phase information
integrated over the entire bandwidth (75–110 GHz):
1. Phase introduced because of the path difference between any two rays that arrive from
the same part of the sky on the two outward-facing antennae that make up a baseline
2. Phase introduced because of the path difference between any two rays that arrive from
two different antennae on to the same point in the focal plane
The phase in point 1 is due to the fact that MBI is an interferometer, and so the visibility that
we measure must, by definition, include this phase. However, the phase in 2 above introduced
by the beam combiner needs to be eliminated to recover visibility from each bolometer. If
there were a way to calculate the net phase introduced by the beam combiner over the whole
bandwidth, then all we would need to do is to divide the output at each point in the focal plane
by this net phase, and we would get visibility directly.
However, calculating this net phase is not easy, since integration over the bandwidth
makes the calculation intractable. So we choose instead to work with fringe
patterns (see footnote 1). In order to do so, we need to realize that what we observe at every detector is the
visibility on the sky times the phase factor, summed over a part of the fringe pattern.
Footnote 1: This is exactly the same as saying that the visibilities in each sub-band are modulated by a fringe which
depends on baseline length. To extract these visibilities, we need to separate out the fringes.
But the fringe pattern is different for every single frequency in the bandwidth. Visibility
is also different for every different frequency, because every
frequency defines a single value of ℓ for a single baseline. The angular resolution of
a single baseline of length D is
\[
\Delta\theta = \frac{\lambda}{D}
\;\Rightarrow\; \ell = \frac{2\pi}{\Delta\theta} = \frac{2\pi D}{\lambda}
\;\Rightarrow\; \ell = \frac{2\pi D \nu}{c} \tag{6.3}
\]
Thus, a range of values of ν will produce a range of ℓ’s, or a band in ℓ-space. A finite-bandwidth
interferometer thus measures what is called a “bandpower” instead of a single value of the power
spectrum at one value of ℓ. But the power spectrum is just the variance of the visibilities for
a circle (ring) in the u − v plane, as proved in the previous chapter. And so we get different
visibilities for the same baseline and orientation but for different frequencies [fig. 4.8?].
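As a rough numerical illustration of equation (6.3), the sketch below evaluates ℓ at the edges of the 75–110 GHz band. The 15 cm baseline is the longest MBI-4 baseline quoted later in this chapter and is used here only as an example; the function name is ours.

```python
import math

C_M_PER_S = 299_792_458.0  # speed of light in m/s

def multipole(baseline_m, freq_hz):
    """Multipole probed by a baseline of length D at frequency nu: ell = 2 pi D nu / c."""
    return 2.0 * math.pi * baseline_m * freq_hz / C_M_PER_S

baseline_m = 0.15                         # example baseline length
ell_lo = multipole(baseline_m, 75e9)      # lower band edge
ell_hi = multipole(baseline_m, 110e9)     # upper band edge
print(ell_lo, ell_hi, ell_hi - ell_lo)
```

For these inputs a single baseline spans a band roughly 110 wide in ℓ, which is the radial smearing in the u-v plane that the sub-band analysis below aims to resolve.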
We can therefore think of the effect of the instrument on the visibilities in the following
way. Let us divide the entire bandwidth of the instrument into N sub-bands and let ν1, ν2...νN
be the centre-frequencies of these N sub-bands. Then for one baseline, one orientation, and
one detector position, these will correspond to visibilities V1, V2...VN and to phase differences
φj1, φj2...φjN (where j represents the detector). If we represent the output at the jth detector
as Oj, then we get
\[
O_j = \sum_{\alpha=1}^{N} V_\alpha\, e^{i\phi_{j\alpha}} \tag{6.4}
\]
Given just one detector, it is impossible to extract every Vα for every sub-band, even though we
know precisely what the φjα’s are. However, if we have N detectors, then we can easily write
the following system of equations
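\[
\begin{aligned}
O_1 &= V_1 e^{i\phi_{11}} + V_2 e^{i\phi_{12}} + \cdots + V_N e^{i\phi_{1N}} \\
O_2 &= V_1 e^{i\phi_{21}} + V_2 e^{i\phi_{22}} + \cdots + V_N e^{i\phi_{2N}} \\
&\;\;\vdots \\
O_N &= V_1 e^{i\phi_{N1}} + V_2 e^{i\phi_{N2}} + \cdots + V_N e^{i\phi_{NN}}
\end{aligned} \tag{6.5}
\]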
This is a system of N equations with N unknowns - V1, V2 · · · VN , and so we can get the values
for each one of these “sub-band visibilities”. The beam combiner thus achieves far more than
just separating the real and imaginary parts of visibilities. (In fact, there are 2N equations,
since visibilities are complex quantities, but this has been overlooked to simplify the equations
for the discussion).
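The inversion of the system (6.5) can be sketched as a linear solve. This is a minimal illustration under the same simplification used in the text (the outputs are treated as complex numbers); the phases are assumed to lie on a DFT-like grid, φjα = 2πjα/N, purely so that the matrix is well conditioned, and the visibility values and function names are hypothetical. Real combiner phases would come from the instrument geometry.

```python
import numpy as np

def mixing_matrix(phases):
    """Matrix with entries exp(i phi_{j alpha}); rows are detectors, columns sub-bands."""
    return np.exp(1j * phases)

def recover_subband_visibilities(outputs, phases):
    """Solve O_j = sum_alpha V_alpha exp(i phi_{j alpha}) for the sub-band visibilities."""
    return np.linalg.solve(mixing_matrix(phases), outputs)

n = 4                                             # number of detectors = number of sub-bands
detector = np.arange(n)[:, None]
subband = np.arange(n)[None, :]
phases = 2.0 * np.pi * detector * subband / n     # assumed, illustrative phases

true_v = np.array([1.0 + 0.5j, 0.3 - 0.2j, -0.7 + 0.1j, 0.2 + 0.9j])  # hypothetical visibilities
outputs = mixing_matrix(phases) @ true_v          # forward model, equation (6.5)
recovered = recover_subband_visibilities(outputs, phases)
print(np.allclose(recovered, true_v))
```

In practice the conditioning of the phase matrix, and therefore the noise on the recovered sub-band visibilities, depends on how different the fringe phases are from one detector to the next.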
6.2.3 Effect of non-zero detector size
In the discussion above, we assumed that the collection area of the detectors is negligible,
and we completely ignored the effect of the fringe pattern. Let us account for these effects
in the following way. Let A be the effective collecting area of each detector. Let f (x, να) be
the value of the fringe pattern (i.e. just a fraction) at a point on the focal plane x and in a
frequency sub-band marked by α. Then, equations (6.5) become
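\[
\begin{aligned}
O_1 &= \int V_1 e^{i\phi_{11}(x)} f(x,\nu_1)\, d^2x + \cdots + \int V_N e^{i\phi_{1N}(x)} f(x,\nu_N)\, d^2x \\
O_2 &= \int V_1 e^{i\phi_{21}(x)} f(x,\nu_1)\, d^2x + \cdots + \int V_N e^{i\phi_{2N}(x)} f(x,\nu_N)\, d^2x \\
&\;\;\vdots \\
O_N &= \int V_1 e^{i\phi_{N1}(x)} f(x,\nu_1)\, d^2x + \cdots + \int V_N e^{i\phi_{NN}(x)} f(x,\nu_N)\, d^2x
\end{aligned} \tag{6.6}
\]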
where it is understood that integration is done over the area of the detector.
This still leaves us with a problem - that of deconvolving the V ’s from the integrals.
However, if the area of the detector is small compared to the width of fringes, then we can
assume that the phase differences remain roughly constant over the collecting area of one
detector, so that we may write
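\[
\begin{aligned}
O_1 &= A \left[ V_1 e^{i\phi_{11}(x)} F(x,\nu_1) + \cdots + V_N e^{i\phi_{1N}(x)} F(x,\nu_N) \right] \\
O_2 &= A \left[ V_1 e^{i\phi_{21}(x)} F(x,\nu_1) + \cdots + V_N e^{i\phi_{2N}(x)} F(x,\nu_N) \right] \\
&\;\;\vdots \\
O_N &= A \left[ V_1 e^{i\phi_{N1}(x)} F(x,\nu_1) + \cdots + V_N e^{i\phi_{NN}(x)} F(x,\nu_N) \right]
\end{aligned} \tag{6.7}
\]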
where F (x, να) represents an “average” value of the fringe pattern, perhaps the value at the
centre of the detector.
Equations (6.7) again have N variables and can be solved to get N visibilities over the
bandwidth.
6.2.4 Feasibility of using techniques in §6.2 for MBI
1. Bandwidth Issues
For MBI, we are using ∼ 20 detectors, meaning that we can extract visibilities for 20
“sub-bands”. The bandwidth for MBI is 35 GHz, so the width of every sub-band is 1.75
GHz, and for every sub-band
\[
\frac{\Delta\nu}{\nu} = \frac{1.75}{90} \sim 0.02 \tag{6.8}
\]
which is small. Thus, the small-bandwidth approximation holds for equations (6.7).
2. Detector Area Issues
We need to compare the area of every detector to the width of the fringes, in order to
estimate whether the area approximation holds in equations (6.7). We first estimate the
width of fringes on the focal plane thus (distance to focal plane = L ∼ 1m):
\[
\Delta w = \frac{\lambda}{D}\, L \sim \frac{0.3}{10} \times 1\,\mathrm{m} \sim 3\,\mathrm{cm} \tag{6.9}
\]
The diameter of a detector is ∼ 1 inch (2.54 cm), so that one detector covers about
one entire fringe. This reduces the spectral resolution that can be obtained using this
technique with MBI-4. Future versions of MBI will have much smaller collecting areas,
and will thus provide better resolution.
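The two feasibility estimates above can be reproduced with a short sketch. The inputs are the values quoted in the text (35 GHz bandwidth, 20 sub-bands, 90 GHz centre frequency, 10 cm antenna spacing, 1 m to the focal plane, 2.54 cm detectors); the function names are ours and the numbers are order-of-magnitude estimates only.

```python
C_CM_PER_S = 2.998e10  # speed of light in cm/s

def fractional_subband_width(bandwidth_ghz, n_subbands, center_ghz):
    """Fractional width of one sub-band, (bandwidth / N) / centre frequency."""
    return (bandwidth_ghz / n_subbands) / center_ghz

def fringe_width_cm(freq_ghz, spacing_cm, focal_distance_cm):
    """Fringe width on the focal plane, (lambda / D) * L."""
    wavelength_cm = C_CM_PER_S / (freq_ghz * 1e9)
    return wavelength_cm / spacing_cm * focal_distance_cm

frac = fractional_subband_width(35.0, 20, 90.0)
width = fringe_width_cm(90.0, 10.0, 100.0)
detector_cm = 2.54
print(frac)                  # fractional sub-band width, about 0.019
print(width)                 # fringe width in cm, about 3.3
print(detector_cm / width)   # fraction of a fringe covered by one detector
```

With these inputs a detector spans roughly three quarters of a fringe, consistent with the statement that the detector size limits the spectral resolution attainable with MBI-4.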
6.3 The Fizeau combiner as an imager
In addition to acting as an interferometer, the Fizeau system can also be used directly
as an imager and additionally be used to extract spectral information in the u-v plane not
normally possible with conventional interferometer systems. But how is this possible? After
all, an interferometer chooses certain Fourier modes specified by the lengths of its baselines.
However, in the Fizeau system, every Fourier mode from every baseline falls on every detector
in the focal plane. In addition, the phase differences introduced within the instrument correct
for the phase differences introduced outside the system. This way, every detector detects ALL
possible modes in the right phase - but this is exactly a part of the image!
Thus, the Fizeau system acts naturally as an imager. As a matter of fact, one has to
make a greater effort to operate it in the interferometer mode, for precisely the reason mentioned
above - that the output from every baseline occurs at every detector. So, in order to be able to
distinguish between visibilities from different baselines at every detector, we need another level
of modulation. We use a phase modulator based on the Faraday Effect, and discuss it in the
following chapter. A more detailed discussion and mathematical description of the operation
of this phase modulator combined with the Fizeau system will be provided in a subsequent
publication.
In what follows, we denote the output at the bolometers as O and a Fourier transform as
$\mathcal{F}$.
Let ε be the phase difference introduced outside the instrument and δ the phase difference
inside the instrument. If E1 and E2 are the electric fields at the two antennae that make a
baseline, then the output from one baseline at any detector is:
\[
O = \int_{\theta=\theta_1}^{\theta_2} \int_{\phi=\phi_1}^{\phi_2} \left[ E_1^* E_2\, e^{i(\delta+\epsilon)} + E_1 E_2^*\, e^{-i(\delta+\epsilon)} \right] \sin\theta\, d\phi\, d\theta \tag{6.10}
\]
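The units of this “intermediate” visibility are $\mathrm{W\,m^{-2}\,Hz^{-1}}$, since we have integrated over the
solid angle.
Equivalently,
\[
O = \int_{\theta=\theta_1}^{\theta_2} \int_{\phi=\phi_1}^{\phi_2} 2\,\Re\!\left( E_1^* E_2\, e^{i(\delta+\epsilon)} \right) \sin\theta\, d\phi\, d\theta \tag{6.11}
\]
\[
O \approx \int_{\theta=\theta_1}^{\theta_2} \int_{\phi=\phi_1}^{\phi_2} 2\,\Re\!\left( E_1^* E_2\, e^{i(\delta+\epsilon)} \right) d\phi\, d\theta \tag{6.12}
\]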
in the flat-sky case. We can now integrate over the focal-plane area in the following way: if the
area on the focal plane being integrated over is AF , then
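\[
O \approx \frac{1}{A_F} \iint \int_{\theta=\theta_1}^{\theta_2} \int_{\phi=\phi_1}^{\phi_2} 2\,\Re\!\left( E_1^* E_2\, e^{i(\delta(x,y)+\epsilon(\theta,\phi))} \right) d\phi\, d\theta\, dx\, dy \tag{6.13}
\]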
Let us consider just one term in the expression ℜ (...):
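\[
O \approx \frac{1}{A_F} \iint \int_{\theta=\theta_1}^{\theta_2} \int_{\phi=\phi_1}^{\phi_2} \left( E_1^* E_2\, e^{i\epsilon(\theta,\phi)} \right) d\phi\, d\theta\; e^{-i\delta(x,y)}\, dx\, dy \tag{6.14}
\]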
where we have changed the sign on δ without loss of generality, since phase differences inside
the cryostat are independent of phases due to the skyward horn antennae.
Now, $E_1 E_2^* \propto I_S$, where $I_S$ is a linear combination of Stokes parameters [3]. In addition,
we need to take into account the antenna beam:
\[
E_1 E_2^* \propto B(\theta,\phi)\, I_S \tag{6.15}
\]
We can thus write
\[
O = \frac{1}{A_F} \iint B(x,y) \iint B(\theta,\phi)\, I_S\, e^{i\epsilon(\theta,\phi)}\, d\theta\, d\phi\; e^{-i\delta(x,y)}\, dx\, dy \tag{6.16}
\]
Now, if ε(θ, φ) is linear in θ and φ (this is true in the flat-sky case), then
\[
O = \frac{1}{A_F} \iint B(x,y)\, \mathcal{F}(B I_S)\, e^{-i\delta(x,y)}\, dx\, dy \tag{6.17}
\]
Now, if the distance from the inward-facing antennae to the focal plane is much greater than the
size of the collecting area for each bolometer, δ(x, y) is linear in (x, y) and
\[
O = \frac{1}{A_F}\, \mathcal{F}^{-1}\!\left( B\, \mathcal{F}(B I_S) \right) \tag{6.18}
\]
If this is correct, the beam needs to be deconvolved from the above expression in order to
obtain the image from MBI.
Now, eq(6.17) can be split up over the focal plane:
\[
O = \frac{1}{A_F} \left[ \iint_1 B\, \mathcal{F}(B I_S)\, e^{-i\delta(x,y)} + \cdots + \iint_N B\, \mathcal{F}(B I_S)\, e^{-i\delta(x,y)} \right] \tag{6.19}
\]
where 1 · · · N are labels for bolometers on the focal plane.
Each of the bolometer outputs then represents a pixel in image space. However, the total
number of pixels depends on the resolution of the instrument, and not the number of bolometers
on the focal plane. Therefore, if the number of bolometers on the focal plane is greater than
the number of pixels in the image, we need to “repixelize” the image obtained.
In general, this is how the beam is convolved with the image on the sky for the Fizeau
beam combiner:
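\[
\begin{aligned}
O &= \frac{1}{A_F}\, \mathcal{F}^{-1}\!\left( B\, \mathcal{F}(B I_S) \right) && (6.20) \\
  &= \frac{1}{A_F} \left[ \left( \mathcal{F}^{-1} B \right) * \left( B I_S \right) \right] && (6.21)
\end{aligned}
\]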
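The step from (6.20) to (6.21) is the convolution theorem, which can be checked numerically in one dimension. The sketch below uses discrete Fourier transforms, for which the convolution is circular; the grid, the two Gaussian weightings standing in for the sky beam and the focal-plane beam, and the two-point-source sky are all hypothetical, and the overall 1/A_F factor is dropped.

```python
import numpy as np

n = 64
theta = np.linspace(-4.0, 4.0, n, endpoint=False)          # sky angle grid (assumed, degrees)

sky = np.zeros(n)
sky[20], sky[40] = 1.0, 0.5                                # two point sources standing in for I_S
beam_sky = np.exp(-0.5 * (theta / 2.0) ** 2)               # sky-side beam B(theta)
beam_focal = np.exp(-0.5 * (np.fft.fftfreq(n) / 0.1) ** 2) # focal-plane weighting B(x)

source = beam_sky * sky                                    # B * I_S

# Left-hand side: inverse transform of the focal-plane beam times the transform of B * I_S.
lhs = np.fft.ifft(beam_focal * np.fft.fft(source))

# Right-hand side: circular convolution of the inverse-transformed beam with B * I_S.
kernel = np.fft.ifft(beam_focal)
shifts = np.arange(n)
rhs = np.array([np.sum(kernel * source[(i - shifts) % n]) for i in range(n)])

print(np.allclose(lhs, rhs))
```

The image delivered by the combiner is therefore the beam-weighted sky smoothed by the transform of the focal-plane beam, which is why a deconvolution step is needed to recover the sky image.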
There are two assumptions inherent in the foregoing discussion:
1. The focal plane is large enough to receive most of the power from the inward-facing
antennae
2. There are no “blank” areas on the focal plane for which the incident power is not absorbed
by a bolometer
This approach can be extended to include a finite bandwidth. Also, it is possible to do
this with what is known as a Butler beam combiner as well [4, 5]. In that case, δ is fixed for
every pair of antennae; therefore, we can do one of two things:
1. Construct a Butler combiner that produces several different values of δ for the same pair
of antennae
2. Devise a phase-switching scheme for the phase modulator which allows us to recover
visibilities for a certain time and an image for some time while observing the sky
In MBI-4, the longest baseline is ∼ 15 cm, which translates to an angular resolution of
∼1-1.5°. Since the FOV is ∼8°, this implies ∼36 pixels for an image. However, there are only
19 bolometers on the focal plane, and so we can have only a 19-pixel resolution in the image.
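The pixel estimate can be reproduced approximately with the sketch below. It assumes that the resolution is λ/B at a wavelength of about 0.33 cm (90 GHz), that the quoted resolution and field of view are in degrees, and that the image is a square of side equal to the FOV; the function names are ours.

```python
import math

def angular_resolution_deg(wavelength_cm, baseline_cm):
    """Angular resolution lambda / B of the longest baseline, converted to degrees."""
    return math.degrees(wavelength_cm / baseline_cm)

def image_pixels(fov_deg, resolution_deg):
    """Number of resolution elements in a square field of view."""
    return (fov_deg / resolution_deg) ** 2

resolution = angular_resolution_deg(0.33, 15.0)   # about 1.3 degrees
pixels = image_pixels(8.0, resolution)            # about 40 for an 8 degree FOV
usable_pixels = min(round(pixels), 19)            # limited by the 19 bolometers
print(resolution, pixels, usable_pixels)
```

Under these assumptions the count comes out near 40, comparable to the ∼36 quoted above (which corresponds to a resolution of about 1.3°), and in either case it exceeds the number of bolometers.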
6.3.1 Remarks about the Fizeau system
1. The Fizeau system acts naturally as an imager.
2. By introducing phase modulators discussed in the following chapter, we can measure
visibilities for all baselines in a Fizeau system.
3. The Fizeau system makes it possible to recover spectral information without the need for
filters.
While it is possible to divide the bandwidth into many different sub-bandwidths, it isn’t
possible to do this indefinitely. The beam for a single antenna determines the FOV of the
instrument and limits the resolution in the u-v plane, as shown in fig.(6.9).
[Figure 6.9 labels: axes u and v; “Fine repixelization due to Fizeau combiner”; “Beam of single antenna”]
Figure 6.9: The u-v coverage of a single baseline has been divided into many pixels; however,
the beam of a single antenna is larger than a single pixel, so that this division is not physical.
There exists a way to reduce the effective u-v beamsize below that determined by the
FOV: “super-resolution coverage”. Further discussion on this is left to a future publication.
It is also possible to operate the interferometer simultaneously as an imager. The
additional modulation mentioned above opens up a range of possibilities, including the simul-
taneous measurement of visibilities and images. This shall also be explored further in a future
publication.
In conclusion, the Fizeau system introduced here is a powerful tool for CMB cosmology:
it allows the recovery of more information than is possible with traditional interferometers or
imagers and does not need significantly more resources to build. A discussion of the simulation
of a simple Fizeau system is given in §8.2.
Bibliography
[1] D. Loreggia, D. Gardiol, M. Gai, M. G. Lattanzi, and D. Busonero, “Fizeau interferometry
from space: a challenging frontier in global astrometry,” in New Frontiers in Stellar
Interferometry, W. A. Traub, Ed., Proceedings of SPIE, vol. 5491, p. 255, Oct. 2004.
[2] M. R. Swain, C. K. Walker, M. Dragovan, P. J. Dumont, P. R. Lawson, E. Serabyn, and
H. W. Yorke, “A Fizeau Spatial-Spectral Imaging Submillimeter Interferometer for the
Large Binocular Telescope,” Bulletin of the American Astronomical Society, vol. 34,
p. 1302, Dec. 2002.
[3] M. L. Cobb, “A Comparison of Michelson and Fizeau Beam Combiners for Optical
Interferometry,” Bulletin of the American Astronomical Society, vol. 32, p. 1429, Dec. 2000.
[4] J. Zmuidzinas, “Cramér-Rao sensitivity limits for astronomical instruments: implications
for interferometer design,” J. Opt. Soc. Am. A, vol. 20, no. 2, p. 218, 2003.
[5] C. Calderon, “Simulation of the performance of the Millimetre-wave Bolometric
Interferometer (MBI) for cosmic microwave background observations,” Ph.D. thesis,
Cardiff, 2006.
Chapter 7
The MBI Instrument
Figure 7.1: A schematic of the main parts of the MBI instrument.
The Millimeter-wave Bolometric Interferometer (MBI) is a ground-based instrument
designed to measure both intensity and polarization of astronomical sources. The first version of
MBI has 4 antennae and is called MBI-4. MBI-4 does not have adequate sensitivity to detect
CMB polarization. Rather, the current instrument is a technology demonstrator. MBI measures
visibilities using a kind of incoherent detector called a “bolometer” (fig.(7.7)). These are more
sensitive than coherent receivers (e.g. amplifier systems like HEMT) at λ ≤ 3 mm, the region
where the CMB spectrum peaks (footnote 1). Ultimately, instruments with 100s of apertures at
multiple wavelengths are envisioned. One such proposed instrument is the Einstein Polarization
Interferometer for Cosmology (EPIC) [1].
Figure 7.2: A schematic of the main parts of the MBI instrument.
The MBI consists of 4 outward-facing horn antennae, and each antenna selects a single
linear polarization. The configuration of MBI-4 optics and cryostat is shown in Fig.(7.3). A
photograph of the MBI-4 optics is also shown in Fig.(7.3). The cryostat is attached to an
altitude-azimuth mount. This mount has a third axis to rotate the instrument about its optical
axis.
Footnote 1: The choice of the frequency band in which the instrument operates is very important.
Fortunately, there exists a window in which the foreground emission is a minimum - the CMB
spectrum happens to be a maximum there as well, as shown in fig.(7.4).
Figure 7.3: A detailed schematic/view of how the Fizeau combiner system fits inside the MBI
instrument.
The feed horn configuration is chosen to provide uniform uv coverage. The instrument
is sensitive to CMB temperature and polarization fluctuations in a medium multipole range
(ℓ ∼200).
The phase of each of the four inputs is sequentially modulated between −90° and +90°
using ferrite-based modulators [3] implemented in circular waveguide. The modulation rate is
∼1-10 Hz and the loss is < 1 dB. The phase shifters dissipate negligible power, ∼1 mW each.
Differential loss between the two phase states will produce an offset after demodulation of the
detector signal, so the differential loss between the two phase states must be small. Details of
the phase modulators are discussed in §7.6. Light is interfered on an array of 16 bolometers
at the focal plane of the primary mirror. MBI-4 uses spider-web bolometers, provided by JPL,
with NTD germanium thermistors. The bolometers are coupled to the incoming radiation with
conical horns; the horns form a hexagonally packed array. The bolometers and horns are cooled
to ∼330 mK with a 3He refrigerator.
Figure 7.4: CMB foreground spectra from the WMAP team [2]. The frequency range of MBI
is indicated by the last yellow column on the right marked “W” for the W-band, which is very
close to the minimum of the combined foreground spectrum. This is the frequency band in
which the MBI operates.
MBI-4 will be demonstrated at the Pine Bluff Observatory (PBO) near Madison, Wisconsin.
Key tests include measuring the interferometric beam patterns, observing bright objects
such as the Moon, and, during the winter, when atmospheric conditions are good, carrying out
long integrations on test fields.
We follow this with a brief discussion of MBI operation and then examine parts of the
MBI in more detail in the rest of this chapter.
Fig.(7.2) shows the Fizeau combiner system discussed in the previous chapter. It is easy
to trace the path of a ray from the sky into the instrument all the way to the detector. A
ray entering an outward-facing horn produces an electric field in the antenna; this electric field
is the same as that on the sky weighted by the beam pattern of the outward-facing antenna.
The E-field then gets modulated as it passes through the phase modulator and weighted by the
beam pattern of the inward-facing antenna, before being reflected by the primary and secondary
mirrors onto the detector array/unit.
7.1 Antennae
Figure 7.5: The antenna arrangement (right) and how it looks from atop the cryostat, covered
by filters.
The observation of the sky directly with feed horns rather than a telescope has several
advantages. The optical design is simple and clean. A large number of feed horns, not limited
in number by a telescope design, can be used to increase sensitivity. The cost of this approach
is the loss of angular resolution unless extremely large feed horns are used. MBI uses this
approach, but adds interferometry between feed horns to recover some of the angular resolution
lost by dispensing with a telescope. In MBI-4 we have used electroformed corrugated conical
feed horns with aperture 5.3 cm for the input elements. These feed horns have a symmetric
beam pattern with measured beam FWHM of ∼7°. MBI-4 only collects a single polarization
for each feed, selected by the rectangular WR-10 waveguide attached to the horn output. In a
future MBI instrument a waveguide ortho-mode transducer will be used. The relative placement
of the feed horns is chosen in order to provide uniform u-v coverage for polarization-sensitive
channels with 10° step rotation of the instrument around its optical axis (Fig. 4). This set of
baselines makes the instrument sensitive to CMB polarization fluctuations over the multipole
range ℓ = 150 − 270. Temperature channels will be used for calibration by comparison with
temperature maps of WMAP.
7.2 Fizeau Beam combiner
(a) (b)
Figure 7.6: (a) Simulation of fringe patterns formed in the focal plane of the Fizeau beam
combiner from a single baseline.(b) Superposition of fringes from 6 baselines (as expected in
MBI). Fringes are separated by phase modulation sequence.
The signals from each of the input units2 (IUs) are interfered using a so-called Fizeau
beam combiner. The Fizeau combiner acts as an image-plane correlator or interferometer, as
described in the previous chapter. In our instrument, the Fizeau combiner is essentially a
Cassegrain telescope. All signals from the IUs illuminate the primary mirror, and the light is
correlated or interfered on the array of 16 bolometers at the focal plane behind the primary
mirror. For MBI-4 an alternative version of the beam combiner based on a 4 × 4 waveguide
Butler matrix has also been developed and will be tested. Simulations of fringes from a Fizeau
system set-up are shown in figs.(7.6(a)) and (7.6(b)).
7.3 Detectors, electronics and data acquisition
MBI-4 uses 16 traditional spider-web bolometers, provided by JPL, with NTD germanium
thermistors. The bolometers are placed in an optical cavity (see fig.(7.7)) and coupled to the
incoming radiation via 30° flare smooth-wall conical horns with 2.54 cm diameter. The horns
form a hexagonally packed array with spacing 2.8 cm in the image-plane of the beam combiner.
The whole unit is suspended from the supporting frame by Kevlar threads and connected to
the cold plate of the 3He refrigerator. The optical efficiency for this configuration is expected
to be ∼50%.
The MBI-4 bolometers are read out with a standard AC-biased differential circuit. The
readout circuit demodulates the detector signals to provide stability to low frequencies (<30
mHz). The bolometer bias and readout electronics are based on those of BLAST. The
preamplifiers consist of Siliconix U401 differential JFETs with 57 nV/√Hz noise at ν > 100 Hz
and 120 µW power dissipation per pair. They are suspended on a lithographed silicon nitride
membrane, using fabrication techniques similar to those used to make the bolometers, and
self-heat to the optimal operating temperature of 120 K. The total power of the JFETs for 16
channels is only 4 mW, which allows them to be placed close to the detectors. The data are
read by two NI-7833R FPGA boards.
Footnote 2: An input unit consists of an outward-facing antenna, a phase modulator and an
inward-facing antenna.
Figure 7.7: A spider-web JPL bolometer, with NTD germanium thermistor.
7.4 Cryogenics
A schematic of the MBI-4 instrument is shown in Fig.(7.1) and a photograph of the
receiver is shown in Fig.(7.3). The cryostat holds 17 liters of liquid nitrogen and 25.7 liters
of liquid helium. In its operational configuration the liquid helium lasts for 50 hours. The
detectors are cooled by a self-contained 3He refrigerator manufactured by Simon Chase. The
3He condenser is cooled by a self-contained charcoal-pumped 4He pot. The base temperature
of the 3He refrigerator in its operational configuration is 330 mK and lasts at least 90 hours.
Cycling the refrigerator takes about one hour. The refrigerator is designed so that an additional
3He stage can be attached to the first 3He stage, which would provide lower temperatures ( 200
mK).
7.5 Telescope and mount
The MBI pointing platform, shown in Fig.(7.8), consists of a fully-steerable altitude-
azimuth mount. In addition, the entire cryostat can be rotated around the optical (θ) axis.
Tracking of the sky occurs under computer control using feedback from 17-bit absolute optical
encoders on each of the three axes: altitude, azimuth and theta. Absolute pointing is
established using a boresighted optical telescope. This altitude-azimuth mounting scheme was
used successfully on the COMPASS experiment.
Figure 7.8: The MBI mount.
7.6 Measurements 1: Analysis of data from the Faraday-Effect Phase Modulator
In order to separate the interference (visibility) signals from the total power signal (see
chapter on Interferometry) detected by each bolometer, the phase of the signal from each
antenna must be modulated. The phase is sequentially modulated between −90° and +90°, and
a “lock-in” amplification is done in software to recover the signal. For MBI-4 we use ferrite-
based phase modulators; these waveguide devices have been fabricated by the Observational
Cosmology team at UCSD and are a modification of the Faraday rotators used in BICEP [4].
The modulation rate is ∼10-100 Hz. The loss in the phase shifter is ≤1 dB. The magnetic
field in the ferrite is controlled by a small superconducting coil. The phase shifters dissipate
negligible power, ∼1 mW each. Also, the differential loss between the two phase states must
be small. Differential loss will produce an “offset” signal after demodulation of the detector
signal.
This section discusses the tests performed on the UCSD-made Ferrite Phase Modulators
(FRMs henceforth). This work was carried out in the Electrical Engineering laboratory of
Prof. Dan van der Weide with A. Gault [5]. It was necessary to measure not just the
input/output ratio, but also the relative phase of the outgoing phase-modulated signal. For this
reason, we had to use a device that can not only generate an input signal for the FRM and
measure the output/input ratio (S21 henceforth; see Appendix B) but also measure the relative
phase of the outgoing signal. This device is called a Vector Network Analyzer (VNA henceforth)
because it can measure the vector (i.e. both magnitude and phase) of outgoing signal values.
Fig.(7.9) shows an early test of one of the FRMs in MBI. The FRM is inside the cryostat.
7.6.1 Estimation - no losses
Let us suppose that we have a perfect measuring device; in particular, a perfect VNA.
In this case, the only way loss can occur is if the phase shift is not 90°. This
is illustrated in fig.(7.10). Looking at fig.(7.10), we see that the fraction of the signal (in terms
of electric fields) that gets through is sin θ, where θ is the amount of phase shift/rotation angle.
However, the VNA measures the power ratio, so that the ratio of output to input power is
$$\frac{\mathrm{OutputPower}}{\mathrm{InputPower}} = \sin^2\theta \tag{7.1}$$
However, this ratio is expressed in dB by the device, where
$$S_{21}\,(\mathrm{dB}) = 10\log_{10}\left(\frac{\mathrm{OutputPower}}{\mathrm{InputPower}}\right) = 10\log_{10}\left(\sin^2\theta\right) \equiv 20\log_{10}\left(\sin\theta\right) \tag{7.2}$$
where
$$S_{21}\,(\mathrm{dB}) = 10\log_{10}\left(S_{21}\,(\mathrm{ratio})\right) \tag{7.3}$$
Then, the rotation angle can be extracted from S21 by the formula
$$\theta = \sin^{-1}\left(10^{\frac{S_{21}}{20}}\right) \tag{7.4}$$
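As a quick illustration of eq.(7.4), the following Python sketch converts an S21 reading in dB into a rotation angle for the lossless case. The function name is illustrative, and the check that S21 is not positive is an added assumption for a passive set-up.

```python
import math

def rotation_angle_deg(s21_db):
    """Rotation angle from eq.(7.4): theta = asin(10**(S21/20)), with S21 in dB."""
    if s21_db > 0:
        # Assumption: a passive, lossless set-up cannot give S21 > 0 dB
        raise ValueError("S21 in dB must be <= 0 in the lossless estimate")
    return math.degrees(math.asin(10.0 ** (s21_db / 20.0)))

full_rotation = rotation_angle_deg(0.0)                        # all power transmitted
half_amplitude = rotation_angle_deg(20.0 * math.log10(0.5))    # field ratio of 0.5
```

A reading of 0 dB corresponds to a full 90° rotation, and a field ratio of 0.5 (about −6 dB) to 30°.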
7.6.2 Estimation with losses
If, however, there are losses in other parts of the set-up, e.g. waveguides, then S21 is no
longer a measure of the angle. We need to subtract this loss (the loss being represented by a
negative number in dBs) from S21, and then that quantity will be the true measure of rotation
angle.
The losses that we expect are as follows
1. Adapter losses
2. Waveguide losses
3. Ferrite losses
Figure 7.9: The Vector Network Analyzer(VNA) at the van der Weide lab at UW-Madison.
The FRM is inside the gold cryostat.
[Figure 7.10 labels: input w/g orientation; output w/g orientation; Faraday rotation angle θ; angle of outgoing wave with output w/g orientation, 90° − θ]
Figure 7.10: Rotation angle and how it is related to S21
We will discuss ferrite losses in §7.6.4. For now, we limit ourselves to correcting for adapter and
waveguide losses, which we represent by ‘adloss’ and ‘wgloss’ respectively. We stress again that
both these quantities, ‘adloss’ and ‘wgloss’, are negative numbers in dB which we subtract
from S21. After correcting for these losses, the rotation angle is given by
$$\theta = \sin^{-1}\left(10^{\frac{S_{21} - \mathrm{wgloss} - \mathrm{adloss}}{20}}\right) \tag{7.5}$$
or
$$\sin\theta = 10^{\frac{S_{21} - \mathrm{wgloss} - \mathrm{adloss}}{20}} \tag{7.6}$$
Now, the VNA does not give us numbers in dB. Instead, for each S-parameter, it gives us the
ℜ (real) and ℑ (imaginary) parts of the ratios. Thus, what the VNA gives us is ℜ (S21ratio)
and ℑ (S21ratio). We can then easily extract S21ratio thus
$$S_{21\,\mathrm{ratio}} = \sqrt{\left(\Re\left(S_{21\,\mathrm{ratio}}\right)\right)^2 + \left(\Im\left(S_{21\,\mathrm{ratio}}\right)\right)^2} \tag{7.7}$$
Obviously, we can convert this, as well as adloss and wgloss, into dB and estimate θ. However,
we do not really need to convert to dB, because eq(7.6) can be written as
$$\sin\theta = \frac{10^{S_{21\,\mathrm{dB}}/20}}{10^{\mathrm{wgloss}_{\mathrm{dB}}/20}\;10^{\mathrm{adloss}_{\mathrm{dB}}/20}} \tag{7.8}$$
However, each one of the factors on the right is really a ratio, so that
$$\sin\theta = \frac{S_{21\,\mathrm{ratio}}}{\mathrm{wgloss}_{\mathrm{ratio}}\;\mathrm{adloss}_{\mathrm{ratio}}} \tag{7.9}$$
i.e. we just need to divide S21 by the modulus of the loss ratios that we get from the VNA.
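The chain from the complex VNA reading to the rotation angle, eqs.(7.7) and (7.9), can be sketched in Python as below. The function names and the numerical inputs are illustrative assumptions, not measured values; the loss ratios are taken to be linear ratios no greater than 1.

```python
import math

def sin_theta_from_vna(s21, wgloss_ratio, adloss_ratio):
    """Eqs.(7.7) and (7.9): modulus of complex S21 divided by the loss ratios."""
    s21_ratio = abs(s21)   # sqrt(Re^2 + Im^2), eq.(7.7)
    return s21_ratio / (wgloss_ratio * adloss_ratio)

def theta_deg(s21, wgloss_ratio, adloss_ratio):
    """Rotation angle in degrees; rejects unphysical values of sin(theta) > 1."""
    s = sin_theta_from_vna(s21, wgloss_ratio, adloss_ratio)
    if s > 1.0:
        raise ValueError("corrected ratio exceeds 1; check the loss calibration")
    return math.degrees(math.asin(s))

# Hypothetical reading: Re = 0.3, Im = 0.4, with assumed loss ratios 0.8 and 0.9
example_sin = sin_theta_from_vna(complex(0.3, 0.4), 0.8, 0.9)
example_theta = theta_deg(complex(0.3, 0.4), 0.8, 0.9)
```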
7.6.3 Correcting for Ferrite loss
In principle, we need to correct for ferrite loss in exactly the same way, i.e.
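$$\sin\theta = \frac{S_{21\,\mathrm{ratio}}}{\mathrm{wgloss}_{\mathrm{ratio}}\;\mathrm{adloss}_{\mathrm{ratio}}\;\mathrm{floss}_{\mathrm{ratio}}} \tag{7.10}$$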
where ‘flossratio’ is the ferrite loss as a ratio. However, it cannot be measured in any obvious
way unlike adloss and wgloss, which are measured by an adapter calibration and a separate
baseline test respectively. Instead, we need to make an estimate indirectly as follows
1. From the ferrite-uncorrected angle vs. current graph, find a current for which the rotation
angle is zero
2. Find the S11 for this current, as a ratio
3. This S11 is the result of the wave traversing the entire length of waveguide once, plus
traversing through the ferrite twice (after correcting for adapter loss).
4. Subtract wgloss obtained from the baseline test from this S11
After the aforementioned operations are done, this S11 is a good estimate of twice the ferrite
loss (in dB) ≡ square of the ferrite loss ratio (as a ratio). Now, we are ready to obtain the
corrected θ:
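$$\sin\theta = \frac{S_{21\,\mathrm{ratio}}}{\mathrm{wgloss}_{\mathrm{ratio}}\;\mathrm{adloss}_{\mathrm{ratio}}\;\mathrm{floss}_{\mathrm{ratio}}}\quad\text{(corrected)} \tag{7.11}$$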
This is what was done, and the result is in fig.(7.11).
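One possible reading of the indirect ferrite-loss estimate, written in linear ratios, is sketched below in Python. It assumes that the adapter-corrected S11 at zero rotation equals the waveguide loss ratio times the square of the ferrite loss ratio (one pass through the waveguide, two through the ferrite). The function names and the numbers are hypothetical.

```python
import math

def ferrite_loss_ratio(s11_ratio, wgloss_ratio):
    """Assumed model: S11 = wgloss * floss**2 at zero rotation, so floss = sqrt(S11 / wgloss)."""
    return math.sqrt(s11_ratio / wgloss_ratio)

def corrected_sin_theta(s21_ratio, wgloss_ratio, adloss_ratio, floss_ratio):
    """Eq.(7.11): S21 ratio divided by all three loss ratios."""
    return s21_ratio / (wgloss_ratio * adloss_ratio * floss_ratio)

# Hypothetical numbers: wgloss = 0.8 and a true ferrite loss ratio of 0.9
floss = ferrite_loss_ratio(0.8 * 0.9 ** 2, 0.8)
sin_theta = corrected_sin_theta(0.5, 0.8, 0.9, floss)
```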
7.6.4 Over/under-estimation of Ferrite loss
If we pick out the current at which the phase shift angle is zero, and if S11 and S22 are
the same, then loss estimation is exact. However, this is rarely ever the case. In reality, the
current chosen always has some non-zero phase-shift associated with it. If so, we have actually
underestimated the loss in the ferrite. Then there is the question of whether the hysteresis
loop is symmetric.
To summarize, the estimation error could be a combination of the following factors:
1. Asymmetry in the reflection at the zero-phase angle point
2. The supposed “zero-phase angle” point not being at exactly zero angle
3. The asymmetry of the hysteresis loop because of other reasons
The following possibilities exist about point 2 above:
1. If θ is the phase angle at the current we have chosen, S11 is off by a factor of sin θ, so that
we need a corrective multiplicative factor of 1/(1 − sin θ)
2. The factor in point (1) above is actually sin² θ instead of sin θ, so the corrective factor is
1/(1 − sin² θ) ≡ 1/cos² θ
3. The factor is 1/cos θ
This will be discussed in greater detail in [5]. Concerning points 1 and 3 above, an iterative
approach is a good solution in the absence of a detailed knowledge of the modulator. In this
iterative scheme, we shift the hysteresis loop in every iteration until it is approximately centered
and then recalculate all parameters. We can continue to repeat these steps until the required
accuracy is reached.
7.7 Measurements 2: Antenna Beam Patterns
Since MBI is ground-based, it has to detect sub-µK signals in the vicinity of warm sources
(the Earth, the sky). This level of accuracy has never been studied before in any ground-based
telescope system. We need an excellent understanding of the beam pattern of MBI for the
following reasons.
1. Hu et al. [6] have shown that even if the errors in the main beam are very low, the
corresponding error in measuring polarization is huge, because the temperature signal,
which is 2-3 orders of magnitude higher than polarization for the CMB, “leaks” into the
polarization signal, and even a small leakage causes huge errors in observed polarization,
and changes one form of polarization into another (true especially of sidelobes, however
low). This makes it extremely difficult to extract useful cosmological information from
the data.
2. Antenna sidelobes can couple signals from warm objects (the sky, the Earth).
These are problems that will challenge the next generation CMB polarization probes. For this
reason, antenna beam patterns need to be measured to exquisite precision in order to eliminate
mixing the CMB temperature signal into polarization (to ∼1 part in $10^8$).
Requirements of beam-mapping measurement:
The top of the MBI instrument is about 3 m above the ground and the antennae receive
signals in a band of wavelengths centered on 3 mm. All cosmological sources are at a distance
of many billions of light-years from Earth, so that the source for these antennae is always in the
far-field. Therefore, all beam patterns need to be measured with the test source in the far-field.
For an antenna of 15 cm diameter operating at 3 mm, the far-field is ∼14 m away. However,
placing a test source on the ground at this distance is not an option because we cannot tilt MBI
any more than 45 degrees from the zenith for the following reasons:
1. Our limitation in tilt is caused by the dewar - the refrigerator that cools the detectors
stops working well when the dewar is tipped more than 45 degrees.
2. Signals from the ground start to interfere with that from the standard source. Since the
ground signal is significantly stronger than that from the standard source, this will result
in appreciable distortion of beam measurements, even with an AC-modulated source. For
high-precision beam-mapping, we thus need to place the source about 14 m above the
ground.
The signal thus needs to be conducted 14 m without appreciable loss. Standard sweepers output
∼0.1 mW of power, and the sensitivity of MBI is $10^{-14}$ W√s. At ∼15 m, we expect an attenuation
of at most 60 dB (factor of $10^6$). If the source power is 0.1 mW, we expect $10^{-10}$ W at the MBI.
We can therefore tolerate a maximum conduction loss of 40 dB (factor of $10^4$) in the apparatus
that conducts the signal 14 m. Placing a source on the tower poses problems, since power or
frequency cannot be adjusted easily. Therefore, all the frequency and loss properties of the
conducting material need to be characterized before we begin to measure the beam.
To summarize, requirements for the measurement are:
1. A 14 m apparatus to conduct the signal with a maximum loss of 40 dB
2. A tower to hold the apparatus steady
We discuss a technique to minimize conduction loss below.
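The numbers in the requirements above can be reproduced with a short Python sketch. It assumes the common far-field criterion 2D²/λ evaluated at the 93 GHz band centre, and a 1 s integration so that the quoted sensitivity can be compared directly with the received power; neither assumption is stated explicitly in the text, and the function names are illustrative.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def far_field_distance_m(aperture_m, freq_hz):
    """Far-field distance from the assumed criterion 2 D^2 / lambda."""
    wavelength = C / freq_hz
    return 2.0 * aperture_m ** 2 / wavelength

def conduction_loss_budget_db(source_w, path_loss_db, sensitivity_w):
    """Extra loss (in dB) that can be tolerated before the signal reaches the sensitivity."""
    received_w = source_w * 10.0 ** (-path_loss_db / 10.0)
    return 10.0 * math.log10(received_w / sensitivity_w)

distance = far_field_distance_m(0.15, 93e9)             # roughly 14 m
budget = conduction_loss_budget_db(1e-4, 60.0, 1e-14)   # roughly 40 dB
```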
7.7.1 Loss in an overmoded circular waveguide
7.7.2 Introduction
Microwave signals often need to be carried over large distances, e.g. in precision
astrophysical applications (interferometry, for instance). In our case, to make precision measurements
of beam patterns, we need to transport RF power to a tower ∼20 m high. We describe a simple
technique to propagate a signal in the W-band over ∼20 m or more without appreciable loss.
The technique is easy to implement and does not require elaborate fabrication. It involves
propagating the signal through a small section of standard WR-10 waveguide and then
transitioning to a wider circular waveguide (i.e. overmoding) for ∼20 m. We then transition back to
WR-10 and detect the loss through the entire section.
This overmoding technique depends on low-loss transitions. In order to be low-loss, these
transitions had to be smooth and gradual. We used a 2” transition from WR-10 to a 0.3” inner
diameter circular waveguide. The reason for this choice was that copper tubes of this width are
readily available commercially.
7.7.2.1 Theory - Loss in a waveguide at room temperature
We now calculate the loss in a waveguide that occurs due to the resistive element of the
waveguide material. We do this calculation for two waveguide systems:
1. Rectangular W-band waveguide made of silver
2. Circular 0.3” Waveguide made of copper
A naive first assessment would assume a lower loss in (1) above, because silver is a better
conductor than copper. We show below that resistive losses depend on the dimensions of the
waveguide as well as material conductivity.
For a section of waveguide of length z, the ratio of the amount of power in the TE10
mode, which carries almost all the transmitted power, to the input power P0 is given by
$$\frac{P_{10}}{P_0} = e^{-2\alpha z} \tag{7.12}$$
where α is the “attenuation constant” and is measured in Np/m ([7] pp.188). Let us calculate
the attenuation constants for the two cases, followed by an estimate of the loss through a
waveguide length of 60’ in each case. This is the maximum length for which we measured the
loss through a circular copper waveguide.
Rectangular W-band waveguide
For a rectangular waveguide, the attenuation constant is given by ([7] pp.188):
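$$\alpha = \left(\frac{R_m}{Z_0}\right)\frac{1}{a\,b\,\beta_{10}\,k_0}\left(2 b\,k_{c10}^2 + a\,k_0^2\right) \tag{7.13}$$
where
$$\begin{aligned}
R_m &= \text{Real part of surface impedance of waveguide}\\
Z_0 &= \text{Impedance of free space}\\
a, b &= \text{Dimensions of waveguide}\\
k_0 &= \text{Wavenumber in free space}\\
k_{c10} &= \text{Wavenumber corresponding to cutoff frequency}\\
\lambda_c, f_c &= \text{Cutoff wavelength and frequency respectively}\\
\beta &= \text{Propagation factor}\\
\beta_{10} &= \text{Propagation factor for the TE10 mode}
\end{aligned} \tag{7.14}$$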
For the W-band, the parameters are as follows:
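$$\begin{aligned}
f &\equiv \text{Centre frequency} = 93\ \mathrm{GHz}\\
f_c &= 60\ \mathrm{GHz}\\
a &= 0.10'' \equiv 0.254\ \mathrm{cm}\\
b &= 0.05'' \equiv 0.127\ \mathrm{cm}\\
k_0 &= \frac{2\pi}{c} f = 1947.79\ \mathrm{m^{-1}}\\
\beta_{10} &= \frac{2\pi}{c}\sqrt{f^2 - f_c^2} = 1488.20\ \mathrm{m^{-1}}\\
k_{c10} &= \frac{2\pi}{c} f_c = 1256.64\ \mathrm{m^{-1}}\\
R_m &= \frac{1}{\sigma\delta} \equiv \sqrt{\frac{\pi f \mu}{\sigma}} = 0.078 \text{ for silver}\\
Z_0 &= 377\ \Omega
\end{aligned} \tag{7.15}$$
With these values, the attenuation constant, α, is calculated to be
$$\alpha = 0.303\ \mathrm{Np/m} \tag{7.16}$$
The ratio of transmitted power for a 60’-long waveguide section is given by
$$\frac{P_{10}}{P_0} = e^{-2\times 0.303\times 18.288} = 1.54\times 10^{-5} \equiv -48\ \mathrm{dB} \tag{7.17}$$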
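The rectangular-waveguide estimate can be reproduced numerically. The Python sketch below evaluates eq.(7.13) and eq.(7.12) with the parameter values listed in (7.15), converted to SI units; the function names are illustrative.

```python
import math

def alpha_rect_te10(r_m, z0, a, b, k0, kc10, beta10):
    """Attenuation constant of the TE10 mode, eq.(7.13), in Np/m (SI inputs)."""
    return (r_m / z0) * (2.0 * b * kc10 ** 2 + a * k0 ** 2) / (a * b * beta10 * k0)

def power_ratio_db(alpha, length_m):
    """Transmitted power ratio exp(-2 alpha z) from eq.(7.12), expressed in dB."""
    return 10.0 * math.log10(math.exp(-2.0 * alpha * length_m))

# Values from (7.15): silver WR-10, a = 0.254 cm, b = 0.127 cm
alpha_w = alpha_rect_te10(0.078, 377.0, 0.254e-2, 0.127e-2, 1947.79, 1256.64, 1488.20)
loss_db = power_ratio_db(alpha_w, 18.288)   # 60 feet = 18.288 m
```

With these inputs the sketch gives an attenuation constant close to the quoted 0.303 Np/m and a loss close to −48 dB; small differences come from rounding of the tabulated values.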
Circular Waveguide: 0.3”
The attenuation constant for a circular waveguide is given by ([7] pp.196):
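$$\alpha = \left(\frac{R_m}{Z_0}\right)\frac{1}{a}\left(1 - \frac{1.841^2}{k_0^2 a^2}\right)^{-\frac{1}{2}}\left(\frac{1.841^2}{k_0^2 a^2} + 0.4185\right) \tag{7.18}$$
where the only changes from eq(7.13) are:
$$\begin{aligned}
a &= \text{Diameter of waveguide} = 0.3'' = 0.00762\ \mathrm{m}\\
R_m &= 0.0795 \text{ for copper}
\end{aligned} \tag{7.19}$$
With these values, the attenuation constant, α, is found to be
$$\alpha = 0.0121\ \mathrm{Np/m} \tag{7.20}$$
for a 0.3” circular copper waveguide. The ratio of transmitted power for a 60’-long waveguide
section is given by
$$\frac{P_{10}}{P_0} = e^{-2\times 0.0121\times 18.288} = 0.642 \equiv -1.92\ \mathrm{dB} \tag{7.21}$$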
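The circular-waveguide numbers can be checked in the same way. The Python sketch below evaluates eq.(7.18) using a = 0.00762 m exactly as listed in (7.19), together with the free-space wavenumber from (7.15); the function name is illustrative.

```python
import math

def alpha_circ(r_m, z0, a, k0):
    """Attenuation constant from eq.(7.18) in Np/m, with a and k0 in SI units."""
    x = 1.841 ** 2 / (k0 * a) ** 2
    return (r_m / z0) / a * (1.0 - x) ** -0.5 * (x + 0.4185)

# Values from (7.15) and (7.19): copper tube, a = 0.00762 m as given in the text
alpha_c = alpha_circ(0.0795, 377.0, 0.00762, 1947.79)
loss_db = 10.0 * math.log10(math.exp(-2.0 * alpha_c * 18.288))   # 60 feet = 18.288 m
```

The result agrees with the quoted 0.0121 Np/m and −1.92 dB, roughly 46 dB better than the WR-10 estimate over the same length.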
7.7.2.2 Measurements, data and conclusions
The two transitions were attached together to measure the loss through them. This
can then be subtracted from the data to get an estimate of loss through just the 0.3” tube
section. Raw data from experiments is shown in fig.(7.14). It is clear that the loss increases
monotonically with the length of the copper tube at all frequencies.
Fig.(7.16) shows the average loss per unit length calculated from the smoothed data. The
net loss is about 1 dB per 10 feet of tube length; however, this estimate holds for frequencies
below ∼105 GHz.
Fig.(7.17) shows the signal in a small frequency range (90.0-90.4 GHz). Notice that the
frequency interval between resonances decreases with increasing tube length. Calculations [8]
show that these frequency intervals correspond exactly to what is expected for the corresponding
tube lengths.
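The trend in resonance spacing can be illustrated with a simple standing-wave estimate in Python. This is a sketch under two assumptions that are not taken from the text: the resonances are spaced by v_g/(2L), and the group velocity in the strongly overmoded tube is close to the free-space speed of light. The detailed calculation is the one cited as [8]. The 10 and 20 foot lengths are hypothetical examples; 60 feet is the maximum length quoted earlier.

```python
C = 299_792_458.0   # speed of light in m/s
FOOT = 0.3048       # metres per foot

def standing_wave_spacing_hz(length_m, group_velocity=C):
    """Frequency spacing between standing-wave resonances, assumed to be v_g / (2 L)."""
    return group_velocity / (2.0 * length_m)

# Spacing for a few tube lengths in feet (10 and 20 are hypothetical examples)
spacings = {feet: standing_wave_spacing_hz(feet * FOOT) for feet in (10, 20, 60)}
```

Under these assumptions a 60-foot tube gives a spacing of about 8 MHz, and the spacing shrinks in inverse proportion to the tube length.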
Figure 7.12: The WR-10 to 0.2” transition (gold) connected with an adapter which then con-
nects to the circular copper tube.
Figure 7.13: Schematics of the planned antenna beam test.
Figure 7.14: Raw data from the tube test for pipes of different lengths. The oscillations are
caused by standing waves in the pipes. Notice that the signal from different lengths decreases
monotonically with increasing length.
Figure 7.17: Resonances in the data in a small frequency range. These are consistent with
standing waves in the tube lengths used.
Bibliography
[1] P. T. Timbie, G. S. Tucker, P. A. R. Ade, S. Ali, E. Bierman, E. F. Bunn, C. Calderon,
A. C. Gault, P. O. Hyland, B. G. Keating, J. Kim, A. Korotkov, S. S. Malu, P. Mauskopf,
J. A. Murphy, C. O’Sullivan, L. Piccirillo, and B. D. Wandelt, “The Einstein polarization
interferometer for cosmology (EPIC) and the millimeter-wave bolometric interferometer
(MBI),” New Astronomy Review, vol. 50, pp. 999–1008, Dec. 2006.
[2] C. L. Bennett, R. S. Hill, G. Hinshaw, M. R. Nolta, N. Odegard, L. Page, D. N. Spergel,
J. L. Weiland, E. L. Wright, M. Halpern, N. Jarosik, A. Kogut, M. Limon, S. S. Meyer,
G. S. Tucker, and E. Wollack, “First-Year Wilkinson Microwave Anisotropy Probe (WMAP)
Observations: Foreground Emission,” ApJ Suppl., vol. 148, pp. 97–117, Sept. 2003.
[3] J. Bock, S. Church, M. Devlin, G. Hinshaw, A. Lange, A. Lee, L. Page, B. Partridge,
J. Ruhl, M. Tegmark, P. Timbie, R. Weiss, B. Winstein, and M. Zaldarriaga, “Task Force
on Cosmic Microwave Background Research,” ArXiv Astrophysics e-prints, Apr. 2006.
[4] K. W. Yoon, P. A. Ade, D. Barkats, J. O. Battle, E. M. Bierman, J. J. Bock, H. C. Chiang,
C. D. Dowell, L. Duband, G. S. Griffin, E. F. Hivon, W. L. Holzapfel, V. V. Hristov, B. G.
Keating, J. M. Kovac, C. Kuo, A. E. Lange, E. M. Leitch, P. V. Mason, H. T. Nguyen,
N. Ponthieu, and Y. D. Takahashi, “Report on BICEP’s First Season Observing the Cosmic
Microwave Background from South Pole,” in Bulletin of the American Astronomical Society,
Dec. 2006, vol. 38 of Bulletin of the American Astronomical Society, pp. 963–+.
[5] A. C. Gault and S. S. Malu, “A measurement of the Faraday-effect Phase modulator
performance,” In preparation, 2007.
[6] W. Hu, M. M. Hedman, and M. Zaldarriaga, “Benchmark parameters for CMB polarization
experiments,” Phys. Rev. D, vol. 67, no. 4, pp. 043004–+, Feb. 2003.
[7] R. E. Collin, Foundations for Microwave Engineering, Wiley-IEEE Press. ISBN-13 978-
0780360310, 2000, XIII + 944 p. 2nd ed., 2000.
[8] L. Levac, S. S. Malu, and P. T. Timbie, “Loss in an overmoded circular waveguide over
medium distances,” In preparation, 2007.
Chapter 8
Simulations of the CMB sky and the MBI Instrument
In chapter 7, we described the MBI instrument in detail. Before the MBI can be put to use in
CMB observations, though, we need to know:
1. its response to a simulated CMB sky, in order to perform checks on its various parts
2. its response as a function of ℓ, which depends on its antenna beam patterns.
To achieve this, we need the following calculations/simulations:
1. simulation of the CMB sky over a patch as large as the MBI beam
2. a calculation of the window functions of MBI for $C_\ell^X$, where X = T, E, B
3. simulation of the MBI instrument
In addition, we need to describe the analysis of data from the FRM.
We describe these three calculations/simulations below.
8.1 Simulation of the CMB sky patch
As described in §5.6, the power spectrum is a statistical description of CMB anisotropies.
That is, it does not contain information about the amount of power in every single anisotropy
over the sky as a function of position. Instead, it tells us how much power there is in the
anisotropy at a given angular scale. This can be pictured as follows. Imagine a point on the
sky, say θ = 0, i.e. the NCP. Now consider all the points (ideally, infinite, but practically, a
large number) at some θ ≠ 0. If we compare the temperature at θ = 0 with the temperatures
at points θ, φ = 0 → 2π (i.e. find the angular two-point correlation function or the power
spectrum), and take the standard deviation, we end up with the value of the power spectrum
at ℓ = π/θ. Thus, information in a CMB map is “compressed”, so to speak, to form the power
spectrum.
So if we are given a set of cosmological parameters, and therefore a power spectrum, which
can be calculated via software packages like CMBFAST[1], we need to “add a dimension” to
it in order to get a simulated map. But it isn’t possible to just generate random numbers to
get the temperature of the points at θ, φ = 0 → 2π, without knowing anything else about the
CMB. There is one property of the CMB that we have not recalled yet - its Gaussianity. If we
include this property, we need to do the following in order to generate a simulated map of a
patch of the CMB sky:
1. Generate N (depends on the desired resolution of the simulated map) Gaussian random
numbers with unit variance for every angle and therefore every ℓ.
2. Multiply the vector containing the random numbers by the value of √Cℓ (the standard
deviation)
3. Repeat the above steps for all values of θ - this forms a map in Fourier space
4. Take the inverse FT of this map to get a map of CMB anisotropies in real space
Since only a small patch of the sky is observed, the curvature in this patch may be ne-
glected, and this is also the reason we can use fourier transforms instead of spherical harmonics.
Under this assumption, this is how fourier decomposition works[2]:
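\[ \langle a^*(\mathbf{u})\, a(\mathbf{u}') \rangle = S(\mathbf{u})\, \delta(\mathbf{u} - \mathbf{u}') \tag{8.1} \]
where
\[ S(u)\, u^2 = \frac{l(l+1)}{(2\pi)^2}\, C_l \tag{8.2} \]
and $u = l/(2\pi)$.
The Fourier definitions are as follows:
\[ a(\mathbf{u}) = \int a(\mathbf{x})\, e^{-2\pi i\, \mathbf{u}\cdot\mathbf{x}}\, d^2x \tag{8.3} \]
for the forward and
\[ a(\mathbf{x}) = \int a(\mathbf{u})\, e^{+2\pi i\, \mathbf{u}\cdot\mathbf{x}}\, d^2u \tag{8.4} \]
for the reverse FT.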
Figure 8.1: The power spectrum used to generate the simulated maps shown below. This was
obtained by choosing a set of cosmological parameters in CMBFAST[1].
To generate a small CMB map, we first start with a power spectrum - the one used is
shown in fig.(8.1).
We then derive the Fourier transform of the map, $a^T_{mp}$, in the following way:
\[ a^T_{mp} = r^T \sqrt{C^{TT}_l} \tag{8.5} \]
and then take the inverse Fourier transform to get the real map, shown in fig.(8.2).
Q and U maps are also shown below in figures (8.3) and (8.4) respectively.
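The recipe above can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the code used for the figures: the function name `simulate_flat_sky_map`, its arguments, and the normalization by the pixel size are choices made here, and the input `cl` is assumed to be an array indexed by ℓ (zero beyond its last entry).

```python
import numpy as np

def simulate_flat_sky_map(cl, npix, patch_deg, seed=0):
    """Gaussian realization of a square flat-sky patch from a power spectrum.

    cl        : 1-D array, cl[ell] for ell = 0, 1, 2, ... (assumed layout)
    npix      : number of pixels per side
    patch_deg : side length of the patch in degrees
    """
    rng = np.random.default_rng(seed)
    pix = np.deg2rad(patch_deg) / npix            # pixel size in radians
    # Fourier coordinate u in cycles per radian; ell = 2*pi*|u|
    u1d = np.fft.fftfreq(npix, d=pix)
    ux, uy = np.meshgrid(u1d, u1d, indexing="ij")
    ell = 2.0 * np.pi * np.hypot(ux, uy)
    cl_grid = np.interp(ell.ravel(), np.arange(len(cl)), cl,
                        right=0.0).reshape(ell.shape)
    # Steps 1-3: unit-variance Gaussian numbers, scaled by sqrt(C_ell).
    # Transforming real white noise guarantees the Hermitian symmetry
    # needed for a real-valued map.
    white = rng.standard_normal((npix, npix))
    modes = np.fft.fft2(white) * np.sqrt(cl_grid)
    # Step 4: inverse FT; dividing by the pixel size makes the map
    # variance approximate the integral of C_ell over d^2u.
    return np.fft.ifft2(modes).real / pix
```

For example, `simulate_flat_sky_map(cl, npix=128, patch_deg=10.0)` would return a 128 × 128 array in the same units as √Cℓ.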
We can also generate the map we should expect to see with an ideal (no noise) interfer-
ometer, given 6 baselines (like the MBI) - this is shown in fig.(8.5).
We can also perform a very basic check on the maps generated as follows. As described
in [Knox], the error bars expected on the power spectrum, given an ideal instrument observing
a fraction of the sky fSKY are
\[ \sigma_{C_l} = \sqrt{\frac{2}{(2l+1)\, f_{SKY}}}\; C_l \tag{8.6} \]
To perform this check, we recover the power spectrum from the simulated map, with the given
map size, and compare with the formula above. The comparison is shown in fig.(8.6).
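As a small illustration of eq.(8.6), the expected 1-σ error bars can be evaluated as follows. The function name `knox_sigma` and its argument order are hypothetical, chosen only for this sketch.

```python
import numpy as np

def knox_sigma(ells, cl, f_sky):
    """Error bars of eq. (8.6) for an ideal (noise-free) instrument."""
    ells = np.asarray(ells, dtype=float)
    cl = np.asarray(cl, dtype=float)
    return np.sqrt(2.0 / ((2.0 * ells + 1.0) * f_sky)) * cl
```

A recovered spectrum is then consistent with the input if most points fall within `cl ± knox_sigma(ells, cl, f_sky)`.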
Figure 8.2: The temperature map obtained from the power spectrum above and the method
described in this chapter. The size of the map is in degrees, indicated on the two axes. Tem-
peratures are in K.
8.2 Simulation of the MBI Instrument
Aim: “observe” a simulated CMB sky patch with MBI-4 and recover bandpowers for
different baselines, given a nearly ideal instrument, i.e. no noise or systematic effects.
8.2.1 Interferometry
In §8.1, we discuss how a polarization interferometer works and the relation between
observable quantities (Stokes’ T, Q, U and V) and sky signal. In section 2, we calculate what
we see at one frequency by integrating over the field-of-view. In section 3, we integrate over the
bandwidth that the antenna / waveguide system and presumably the detectors (in the case of
the MBI, the bolometers) are sensitive to. For MBI, we needn’t worry - spider-web bolometers
being used are not sensitive to any particular bandwidth. Sections 1 through 3 are general and
can be applied to any interferometric observations of CMB polarization.
Section 4 describes the beam combiner system being used in MBI, and calculates the
phase difference between two rays from two different antennas, i.e. it calculates the fringe
pattern produced at the focal plane by one baseline.
Notation: θ and φ represent a direction on the sky, and ψ is an angle inside the cryostat. ε
and δ are phase differences of a “pixel” on the sky and a position in the focal plane respectively.
These will be useful later. χ denotes an orientation of the instrument.
Figure 8.3: Q map obtained from the power spectrum above and the method described in this
chapter.
Figure 8.4: U map obtained from the power spectrum above and the method described in this
chapter.
Figure 8.5: The temperature map that a 6-baseline ideal interferometer is expected to output,
given the sky map shown in fig.(8.2).
We follow here the discussion in §5.7, and write the output electric fields of the two horns
as
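\[ E_1 = E_x \hat{x} + E_y \hat{y} \tag{8.7} \]
\[ E_2 = \left(E_x \hat{x} + E_y \hat{y}\right) e^{i\epsilon} \tag{8.8} \]
In general, waveguides can be coupled to some combination of linear polarizations, so:
\[ E_1 = a_1 E_x \hat{x} + a_2 E_y \hat{y} \tag{8.9} \]
\[ E_2 = \left(b_1 E_x \hat{x} + b_2 E_y \hat{y}\right) e^{i\epsilon} \tag{8.10} \]
If $a_2 = b_2 = 0$, then linear polarization is chosen; if $a_2/a_1 = \pm i$, then circular polarization is chosen.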
Figure 8.6: This is a basic check of the map in fig.(8.2). The curves on the top and bottom
indicate the 1-σ error bars expected from eq.(8.6), and the marked points make up the recovered
power spectrum. Note that the vertical scale is different from the power spectrum in fig.(8.1).
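The Stokes’ parameters are defined as follows:
\[ T = \left\langle |E_x|^2 + |E_y|^2 \right\rangle \tag{8.11} \]
\[ Q = \left\langle |E_x|^2 - |E_y|^2 \right\rangle \tag{8.12} \]
\[ U = \left\langle 2\,\Re\left(E_x^* E_y\right) \right\rangle \tag{8.13} \]
\[ V = \left\langle 2\,\Im\left(E_x^* E_y\right) \right\rangle \tag{8.14} \]
Then, $E_x$ and $E_y$ can be expressed in terms of the Stokes’ parameters:
\[ |E_x|^2 = \tfrac{1}{2}(T + Q) \tag{8.15} \]
\[ |E_y|^2 = \tfrac{1}{2}(T - Q) \tag{8.16} \]
\[ E_x^* E_y = \tfrac{1}{2}(U + iV) \tag{8.17} \]
\[ E_x E_y^* = \tfrac{1}{2}(U - iV) \tag{8.18} \]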
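The definitions (8.11) to (8.14) translate directly into code. The sketch below is illustrative only; the function name `stokes_from_fields` is hypothetical, and the angle brackets are taken to be a simple average over the supplied samples.

```python
import numpy as np

def stokes_from_fields(ex, ey):
    """Return (T, Q, U, V) from complex field samples, eqs. (8.11)-(8.14)."""
    ex = np.atleast_1d(np.asarray(ex, dtype=complex))
    ey = np.atleast_1d(np.asarray(ey, dtype=complex))
    cross = np.conj(ex) * ey                      # E_x^* E_y
    t = np.mean(np.abs(ex) ** 2 + np.abs(ey) ** 2)
    q = np.mean(np.abs(ex) ** 2 - np.abs(ey) ** 2)
    u = np.mean(2.0 * cross.real)
    v = np.mean(2.0 * cross.imag)
    return t, q, u, v
```

With this sign convention, a field with $E_y = iE_x$ gives V = T, and a field purely along x gives Q = T.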
In general, the multiplying interferometer works in the following way. The two electric
fields are first added using a beam combiner (in the present MBI configuration, this is the
Fizeau scheme) and then detected on the focal plane. With no other phase differences, e.g. at
exactly the middle of the focal plane, the output at the detector will be $(E_1 + E_2)(E_1^* + E_2^*)$.
However, there will be an additional phase factor due to the difference in path length between
the two paths to the focal plane from the two antennas, as shown in fig.(8.7). There is a relative
phase of ε due to the position of the two antennas looking towards the sky and δ between the
rays from antenna two and antenna one inside the cryostat, and therefore between $E_1$ and $E_2$.
Now recall that the part of the detected signal that has been phase modulated is ∝ $E_1 E_2^*$
and its conjugate. Let us work out the detected quantity explicitly:
\[ \left(E_1 + E_2 e^{i(\delta+\epsilon)}\right)\left(E_1^* + E_2^* e^{-i(\delta+\epsilon)}\right) = |E_1|^2 + |E_2|^2 + E_1 E_2^* e^{-i(\delta+\epsilon)} + E_1^* E_2 e^{i(\delta+\epsilon)} \tag{8.19} \]
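Eq.(8.19) can be checked numerically for scalar field amplitudes. This is a minimal sketch; the function names are hypothetical, and `phase` stands for the total phase δ + ε.

```python
import numpy as np

def detected_power(e1, e2, phase):
    """Left-hand side of eq. (8.19): |E1 + E2 exp(i*phase)|^2."""
    return np.abs(e1 + e2 * np.exp(1j * phase)) ** 2

def fringe_term(e1, e2, phase):
    """Sum of the last two (phase-modulated) terms of eq. (8.19)."""
    return 2.0 * np.real(np.conj(e1) * e2 * np.exp(1j * phase))
```

The two phase-modulated terms are complex conjugates of each other, so their sum is twice the real part of either one; this is what `fringe_term` returns.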
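The first two terms are easily evaluated:
\[ |E_1|^2 = E_1 E_1^* = |a_1|^2 |E_x|^2 + |a_2|^2 |E_y|^2 \tag{8.20} \]
\[ |E_2|^2 = E_2 E_2^* = |b_1|^2 |E_x|^2 + |b_2|^2 |E_y|^2 \tag{8.21} \]
We can substitute for $|E_x|^2$ etc. from the above equations to get
\[ |E_1|^2 = \frac{1}{2}\left[T\left(|a_1|^2 + |a_2|^2\right) + Q\left(|a_1|^2 - |a_2|^2\right)\right] \tag{8.22} \]
\[ |E_2|^2 = \frac{1}{2}\left[T\left(|b_1|^2 + |b_2|^2\right) + Q\left(|b_1|^2 - |b_2|^2\right)\right] \tag{8.23} \]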
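If we want to study interference, we wish to look at only the last two terms, which will
have been phase-modulated. However, they are just complex conjugates of each other. So we
need evaluate only one, and the other will follow. WLOG, we consider the last term:
\[ \left\langle E_1^* E_2 e^{i(\delta+\epsilon)} \right\rangle = \frac{1}{2} e^{i(\delta+\epsilon)} \left[a_1^* b_1 (T + Q) + a_2^* b_2 (T - Q) + a_1^* b_2 (U - iV) + a_2^* b_1 (U + iV)\right] \tag{8.24} \]
Simplifying,
\[ \left\langle E_1^* E_2 e^{i(\delta+\epsilon)} \right\rangle = \frac{1}{2} e^{i(\delta+\epsilon)} \left[(a_1^* b_1 + a_2^* b_2)\,T + (a_1^* b_1 - a_2^* b_2)\,Q + (a_1^* b_2 + a_2^* b_1)\,U + i\,(a_2^* b_1 - a_1^* b_2)\,V\right] \tag{8.25} \]
Similarly,
\[ \left\langle E_1 E_2^* e^{-i(\delta+\epsilon)} \right\rangle = \frac{1}{2} e^{-i(\delta+\epsilon)} \left[(a_1 b_1^* + a_2 b_2^*)\,T + (a_1 b_1^* - a_2 b_2^*)\,Q + (a_1 b_2^* + a_2 b_1^*)\,U - i\,(a_2 b_1^* - a_1 b_2^*)\,V\right] \tag{8.26} \]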
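We need to remind ourselves that the four quantities T, Q, U and V already have the effect of
the primary antenna beam included. Just so we are clear, let us replace T etc. by $\mathcal{T}$, where
$\mathcal{T} = A(\phi, \theta)\, T$ etc., thus:
\[ \left\langle E_1^* E_2 e^{i\delta} \right\rangle = \frac{1}{2} e^{i(\delta-\epsilon)} \left[(a_1^* b_1 + a_2^* b_2)\,\mathcal{T} + (a_1^* b_1 - a_2^* b_2)\,\mathcal{Q} + (a_1^* b_2 + a_2^* b_1)\,\mathcal{U} + i\,(a_2^* b_1 - a_1^* b_2)\,\mathcal{V}\right] \tag{8.27} \]
and
\[ \left\langle E_1 E_2^* e^{-i\delta} \right\rangle = \frac{1}{2} e^{-i(\delta-\epsilon)} \left[(a_1 b_1^* + a_2 b_2^*)\,\mathcal{T} + (a_1 b_1^* - a_2 b_2^*)\,\mathcal{Q} + (a_1 b_2^* + a_2 b_1^*)\,\mathcal{U} - i\,(a_2 b_1^* - a_1 b_2^*)\,\mathcal{V}\right] \tag{8.28} \]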
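We need to assign one kind of polarization, i.e. either linear or circular, in order to figure
out the sum of these two quantities. Let us consider the case of linear polarization first, where
X ≡ ($a_1 = 1$, $a_2 = 0$) and Y ≡ ($a_1 = 0$, $a_2 = 1$) for $E_1$, and X ≡ ($b_1 = 1$, $b_2 = 0$) and
Y ≡ ($b_1 = 0$, $b_2 = 1$) for $E_2$, so that
\[ \left\langle E_1 E_2^* e^{-i\delta} \right\rangle_{XX} = \frac{1}{2} e^{-i(\delta-\epsilon)} \left(\mathcal{T} + \mathcal{Q}\right) \tag{8.29} \]
\[ \left\langle E_1 E_2^* e^{-i\delta} \right\rangle_{YY} = \frac{1}{2} e^{-i(\delta-\epsilon)} \left(\mathcal{T} - \mathcal{Q}\right) \tag{8.30} \]
\[ \left\langle E_1 E_2^* e^{-i\delta} \right\rangle_{XY} = \frac{1}{2} e^{-i(\delta-\epsilon)} \left(\mathcal{U} + i\mathcal{V}\right) \tag{8.31} \]
\[ \left\langle E_1 E_2^* e^{-i\delta} \right\rangle_{YX} = \frac{1}{2} e^{-i(\delta-\epsilon)} \left(\mathcal{U} - i\mathcal{V}\right) \tag{8.32} \]
And similarly, the complex conjugate term, evaluated from eq.(8.27), gives us
\[ \left\langle E_1^* E_2 e^{i\delta} \right\rangle_{XX} = \frac{1}{2} e^{i(\delta-\epsilon)} \left(\mathcal{T} + \mathcal{Q}\right) \tag{8.33} \]
\[ \left\langle E_1^* E_2 e^{i\delta} \right\rangle_{YY} = \frac{1}{2} e^{i(\delta-\epsilon)} \left(\mathcal{T} - \mathcal{Q}\right) \tag{8.34} \]
\[ \left\langle E_1^* E_2 e^{i\delta} \right\rangle_{XY} = \frac{1}{2} e^{i(\delta-\epsilon)} \left(\mathcal{U} - i\mathcal{V}\right) \tag{8.35} \]
\[ \left\langle E_1^* E_2 e^{i\delta} \right\rangle_{YX} = \frac{1}{2} e^{i(\delta-\epsilon)} \left(\mathcal{U} + i\mathcal{V}\right) \tag{8.36} \]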
8.2.1.1 Application to simulations of time-ordered data (TOD)
In an interferometer, the only difference between the electric fields at the different antennas
is a phase factor that depends on the path difference between the photons that arrive at those
antennas. Therefore, we need to find the electric field at only one antenna; the field at the
others will follow easily.
Following equations (8.15) and (8.16), we get for the two components of the electric field:
\[ |E_x|^2 = \frac{1}{2}\left(\mathcal{T} + \mathcal{Q}\right) \tag{8.37} \]
\[ |E_y|^2 = \frac{1}{2}\left(\mathcal{T} - \mathcal{Q}\right) \tag{8.38} \]
It seems a little strange at first that the electric fields should be maps instead of just a number, but
we have to remember that they do not get added/averaged over until they reach the detectors.
At the detector, they are summed over with all the appropriate phase factors.
8.2.2 Integration over the field-of-view (FOV) / sky patch
This is not a single step, since there are several things involved:
1. Q and U as functions of orientation χ: As the instrument is rotated, the response
from every baseline changes - Q and U are functions of orientation angle χ - this relation
is described in the appendix.
2. Relative phase of each point on the sky: Every antenna’s position is a point, and
the distance of every point in the sky patch / field-of-view to the antenna is different;
consequently, each point on the FOV has a unique phase associated with it, which needs
to be calculated. In other words, we need to calculate the exact functional form of ǫ,
which is a function of φ, θ and χ
Let us perform each operation, one by one. Before performing integration, though, we need to
get the units of each quantity right. The units of the incoming power are W m$^{-2}$ Hz$^{-1}$ sr$^{-1}$.
8.2.2.1 Integration
We can simply integrate over an “area” on the sky. An “area” on the sky is given by
\[ A = \int_{\theta=\theta_1}^{\theta_2} \int_{\phi=\phi_1}^{\phi_2} \sin\theta \, d\phi \, d\theta \tag{8.39} \]
where θ1 and φ1 can have any value depending on which part of the sky we are looking at, and
θ2−θ1 and φ2−φ1 are determined by the FOV - these will be calculated in sub-section 4 (to be
added later). We want to integrate the last two terms in (8.19); let us call the “intermediate”
visibility V - this is clearly a function of the orientation of the instrument, χ. So
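\[ V(\chi) = \int_{\theta=\theta_1}^{\theta_2} \int_{\phi=\phi_1}^{\phi_2} \left[E_1^* E_2 e^{i(\delta+\epsilon)} + E_1 E_2^* e^{-i(\delta+\epsilon)}\right] \sin\theta \, d\phi \, d\theta \tag{8.40} \]
The units of this “intermediate” visibility are W m$^{-2}$ Hz$^{-1}$, since we have integrated over the
solid angle.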
8.2.2.2 The relative phase difference ǫ
Look at fig.(8.8). This figure shows the orientation of a baseline w.r.t. the co-ordinate
system. As the instrument (and therefore the baseline) rotates, the two points labelled ‘A’ and
‘B’, i.e. the two antennas that form the baseline, also rotate. Their position at an orientation
angle χ is given by
\[ x_1' = x_1 \cos\chi + y_1 \sin\chi \tag{8.41} \]
\[ y_1' = -x_1 \sin\chi + y_1 \cos\chi \tag{8.42} \]
\[ x_2' = x_2 \cos\chi + y_2 \sin\chi \tag{8.43} \]
\[ y_2' = -x_2 \sin\chi + y_2 \cos\chi \tag{8.44} \]
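The distance of each antenna from the origin remains constant, though, so that the two
quantities
\[ b_1 = \sqrt{x_1^2 + y_1^2} \tag{8.45} \]
\[ b_2 = \sqrt{x_2^2 + y_2^2} \tag{8.46} \]
remain constant. For what we are about to do, it is useful to define the unit vectors along the
direction of the antennas thus:
\[ \hat{\mathbf{b}}_1 = \frac{1}{b_1}\left(x_1', y_1', 0\right) \tag{8.47} \]
\[ \hat{\mathbf{b}}_2 = \frac{1}{b_2}\left(x_2', y_2', 0\right) \tag{8.48} \]
Write out all the quantities explicitly:
\[ \hat{\mathbf{b}}_1 = \frac{1}{b_1}\left(x_1 \cos\chi + y_1 \sin\chi,\; -x_1 \sin\chi + y_1 \cos\chi,\; 0\right) \tag{8.49} \]
\[ \hat{\mathbf{b}}_2 = \frac{1}{b_2}\left(x_2 \cos\chi + y_2 \sin\chi,\; -x_2 \sin\chi + y_2 \cos\chi,\; 0\right) \tag{8.50} \]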
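Let us also write out the unit vector at a point on the sky:
\[ \hat{\mathbf{r}} = \left(\cos\phi \sin\theta,\; \sin\phi \sin\theta,\; \cos\theta\right) \tag{8.51} \]
Now, look at figure(to be drawn) - the two antennas clearly form a triangle with the point
on the sky under consideration. Since we know the unit vectors to all the three points on the
triangle, we can find the three angles $a$, $a_1$ and $a_2$:
\[ \cos a_1 = \hat{\mathbf{b}}_1 \cdot \hat{\mathbf{r}} \tag{8.52} \]
\[ \cos a_2 = \hat{\mathbf{b}}_2 \cdot \hat{\mathbf{r}} \tag{8.53} \]
\[ \cos a = \frac{\left(\hat{\mathbf{r}} - \hat{\mathbf{b}}_1\right) \cdot \left(\hat{\mathbf{r}} - \hat{\mathbf{b}}_2\right)}{|\hat{\mathbf{r}} - \hat{\mathbf{b}}_1|\,|\hat{\mathbf{r}} - \hat{\mathbf{b}}_2|} \tag{8.54} \]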
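The rotation (8.41) to (8.44), the unit vectors (8.47) to (8.51) and the angle (8.52) can be sketched as below. All function names are hypothetical and chosen for this illustration; angles are in radians.

```python
import numpy as np

def rotate_antenna(x, y, chi):
    """Antenna position after rotating the instrument by chi, eqs. (8.41)-(8.44)."""
    xr = x * np.cos(chi) + y * np.sin(chi)
    yr = -x * np.sin(chi) + y * np.cos(chi)
    return xr, yr

def antenna_unit_vector(x, y, chi):
    """Unit vector towards the rotated antenna, eqs. (8.47)-(8.50)."""
    xr, yr = rotate_antenna(x, y, chi)
    b = np.hypot(x, y)                 # distance from origin, unchanged by rotation
    return np.array([xr, yr, 0.0]) / b

def sky_unit_vector(theta, phi):
    """Unit vector towards the sky direction (theta, phi), eq. (8.51)."""
    return np.array([np.cos(phi) * np.sin(theta),
                     np.sin(phi) * np.sin(theta),
                     np.cos(theta)])

def cos_angle_to_sky(x, y, chi, theta, phi):
    """cos(a_1) or cos(a_2) of eqs. (8.52)-(8.53) for one antenna."""
    return float(antenna_unit_vector(x, y, chi) @ sky_unit_vector(theta, phi))
```

Because the antennas lie in the plane z = 0, `cos_angle_to_sky` vanishes for the direction θ = 0 at every orientation χ.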
The path difference can then be calculated with the help of the sine identity for triangles
(suggested by Peter H.):
\[ \frac{B}{\sin a} = \frac{s_2}{\sin a_1} = \frac{s_1}{\sin a_2} \tag{8.55} \]
so that the path difference is
\[ s_2 - s_1 = \frac{B}{\sin a}\left(\sin a_1 - \sin a_2\right) \tag{8.56} \]
and the phase difference is
\[ \epsilon = \frac{2\pi B}{\lambda}\, \frac{\left(\sin a_1 - \sin a_2\right)}{\sin a} \tag{8.57} \]
Since each one of the angles $a$, $a_1$ and $a_2$ is a function of φ, θ and χ, ε = ε(φ, θ, χ).
We are now ready to write the integral over the FOV. For simplicity, let us not write the
normalization or the integration limits for now:
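\[ V(\chi) = \int\!\!\int \left[E_1^* E_2 e^{i(\delta+\epsilon(\phi,\theta,\chi))} + E_1 E_2^* e^{-i(\delta+\epsilon(\phi,\theta,\chi))}\right] \sin\theta \, d\phi \, d\theta \tag{8.58} \]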
8.2.2.3 Limits of integration
The most convenient thing to do is to assume that the centre of the FOV has θ = 0, and
then, θ2 − θ1 = FOV/2; φ2 − φ1 = 2π.
8.2.3 Interference pattern in focal plane
First, let us recall that we have an antenna radiating out to the focal plane. We must
account for the primary beam of the antenna again, i.e. we must multiply our result with the
antenna beam.
Now, notice that it really does not matter which configuration we wish to work out; they
will have the same factor of
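\[ e^{i(\delta-\epsilon)} + e^{-i(\delta-\epsilon)} = 2\cos(\delta - \epsilon) \tag{8.59} \]
The antenna beam can be written generically as
\[ A(\phi) = e^{-\frac{\phi^2}{2\sigma^2}} \tag{8.60} \]
where
\[ \sigma \equiv \frac{\mathrm{FWHM}}{2\sqrt{2 \ln 2}} \tag{8.61} \]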
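A minimal sketch of the Gaussian beam of eq.(8.60) follows. Note that for a beam of this form the half-maximum points lie at φ = ±σ√(2 ln 2), so the sketch uses the standard relation FWHM = 2√(2 ln 2) σ; the function names are hypothetical.

```python
import numpy as np

def sigma_from_fwhm(fwhm):
    """Gaussian width from the full width at half maximum."""
    return fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))

def gaussian_beam(phi, fwhm):
    """Beam response A(phi) = exp(-phi^2 / (2 sigma^2)), eq. (8.60)."""
    sigma = sigma_from_fwhm(fwhm)
    return np.exp(-np.asarray(phi, dtype=float) ** 2 / (2.0 * sigma ** 2))
```

By construction the response drops to one half at an angle of FWHM/2 from the beam axis, which is a quick sanity check on the width convention.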
However, when we calculate the pattern due to a baseline at a single point in the focal plane,
we must remember that the signal at that point is coming from two different antennas, and the
beam pattern of each antenna at that point is, in general, different. We label the antennas by i
and j, and each one of them has a beam that is a gaussian. We will evaluate each angle φi and
φj separately in section 4. For now, we bring together all the factors for finding an expression
for the interference pattern as below.
\[ A_{ij} = e^{-\frac{\phi_i^2}{2\sigma^2}}\, e^{-\frac{\phi_j^2}{2\sigma^2}} = e^{-\frac{\phi_i^2 + \phi_j^2}{2\sigma^2}} \tag{8.62} \]
The net interference pattern then is
\[ I(x, y) \propto 2A(\phi_i, \phi_j) \cos(\delta - \epsilon) \equiv e^{-\frac{\phi_i^2 + \phi_j^2}{2\sigma^2}} \cos(\delta - \epsilon) \tag{8.63} \]
If we wish to calculate the pattern for a single pointing, we could include the receiving antenna
beam as well:
\[ I(x, y) \propto 2A(\phi_i, \phi_j)\, A(\theta) \cos(\delta - \epsilon) \equiv e^{-\frac{\phi_i^2 + \phi_j^2}{2\sigma^2}}\, e^{-\frac{\theta^2}{2\sigma^2}} \cos(\delta - \epsilon) \tag{8.64} \]
8.2.4 Effect of finite bandwidth
In the frequency range ν → ν + dν, the proportion of intensity that an instrument receives
is P(ν) dν, where P(ν) is the Planck brightness function. The total amount of energy that a
detector receives in a bandwidth is then
\[ I(\nu_1, \nu_2) \propto \int_{\nu_1}^{\nu_2} f(\nu)\, P(\nu)\, d\nu \tag{8.65} \]
where f(ν) is the interference pattern above.
Looking at the document ‘distribution.pdf’, all we need to do is weight the interference
term in eq.(8.58) with the distribution function
\[ P(z) = \frac{z^3}{e^z - 1} \tag{8.66} \]
and divide by the integral
\[ \int_{z_1}^{z_2} \frac{z^3}{e^z - 1}\, dz \tag{8.67} \]
where $z_1 = h\nu_1/kT_0$ and $z_2 = h\nu_2/kT_0$. Therefore, changing variables to $z = h\nu/kT_0$ in the
above expression for I(ν1, ν2), we get
\[ I(\nu_1, \nu_2; \chi) = \frac{\int_{z_1}^{z_2} V(\chi) \left(\frac{z^3}{e^z - 1}\right) dz}{\int_{z_1}^{z_2} \frac{z^3}{e^z - 1}\, dz} \tag{8.68} \]
or, equivalently, in a short form,
\[ I(\nu_1, \nu_2; \chi) = \frac{\int_{z_1}^{z_2} V(\chi)\, P(z)\, dz}{\int_{z_1}^{z_2} P(z)\, dz} \tag{8.69} \]
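The band average of eq.(8.69) is a weighted mean, which can be sketched with a simple trapezoidal rule. The names `planck_weight` and `band_average` are hypothetical, and the value $T_0$ = 2.725 K used as the default is an assumption made here for illustration; the text does not quote a number.

```python
import numpy as np

H_PLANCK = 6.62607015e-34      # J s
K_BOLTZ = 1.380649e-23         # J / K

def planck_weight(z):
    """Distribution function P(z) = z^3 / (e^z - 1), eq. (8.66)."""
    z = np.asarray(z, dtype=float)
    return z ** 3 / np.expm1(z)

def _trapezoid(y, x):
    """Trapezoidal rule on a (possibly non-uniform) grid."""
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

def band_average(values, nu_hz, t0=2.725):
    """Planck-weighted average of `values` sampled at frequencies nu_hz, eq. (8.69).

    t0 is an assumed CMB temperature in kelvin.
    """
    z = H_PLANCK * np.asarray(nu_hz, dtype=float) / (K_BOLTZ * t0)
    w = planck_weight(z)
    return _trapezoid(np.asarray(values, dtype=float) * w, z) / _trapezoid(w, z)
```

Here `values` would hold the visibility V(χ) evaluated at each sampled frequency; a frequency-independent visibility is returned unchanged.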
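In its full glory, the expression is
\[ I(\nu_1, \nu_2; \chi) = \frac{\int_{z_1}^{z_2} \int\!\!\int \left[E_1^* E_2 e^{i(\delta+\epsilon(\phi,\theta,\chi))} + E_1 E_2^* e^{-i(\delta+\epsilon(\phi,\theta,\chi))}\right] \sin\theta \, d\phi \, d\theta \left(\frac{z^3}{e^z - 1}\right) dz}{\int_{z_1}^{z_2} \frac{z^3}{e^z - 1}\, dz} \tag{8.70} \]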
We will evaluate an expression for δ as a function of z in section 4.
8.2.4.1 Dependence of FWHM on frequency
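The above expression for the fringe pattern is not final. We have to take into account the
FWHM of the beam and the fact that FWHM ∼ λ/D, where D is some aperture width associated
with the antennae. In other words, FWHM(ν) ∼ 1/ν. All our beam pattern measurements have
been at 90 GHz (more generally, the central frequency, call it $\nu_c$); therefore,
\[ \frac{\mathrm{FWHM}(\nu)}{\mathrm{FWHM}(\nu = \nu_c)} = \frac{\nu_c}{\nu} \tag{8.71} \]
\[ \Rightarrow \mathrm{FWHM}(\nu) = \frac{\nu_c}{\nu} \cdot \mathrm{FWHM}(\nu = \nu_c) = \frac{h \nu_c}{k T_0} \cdot \frac{1}{z}\, \mathrm{FWHM}(\nu = \nu_c) \tag{8.72} \]
The same expression will then hold for σ, since it is related to FWHM by a constant factor:
\[ \sigma(\nu) = \frac{h \nu_c}{k T_0} \cdot \frac{1}{z}\, \sigma(\nu = \nu_c) \tag{8.73} \]
Now, let $h\nu_c / kT_0 = D$ (say; this is a new constant, not the aperture width above), and call $\sigma(\nu = \nu_c) = \sigma_0$:
\[ \Rightarrow \sigma(\nu) = \frac{D}{z}\, \sigma_0 \;\Rightarrow\; \sigma^2 = \frac{D^2}{z^2}\, \sigma_0^2 \tag{8.74} \]
The net interference pattern is then
\[ I(x, y) \propto 2A(\phi_i, \phi_j) \cos(\delta - \epsilon) \equiv e^{-\frac{(\phi_i^2 + \phi_j^2)\, z^2}{2 D^2 \sigma_0^2}} \cos(\delta - \epsilon) \tag{8.75} \]
and the final expression becomes
\[ I(\nu_1, \nu_2) = \frac{\int_{z_1}^{z_2} e^{-\frac{(\phi_i^2 + \phi_j^2)\, z^2}{2 D^2 \sigma_0^2}} \cos(\delta - \epsilon) \left(\frac{z^3}{e^z - 1}\right) dz}{\int_{z_1}^{z_2} \frac{z^3}{e^z - 1}\, dz} \tag{8.76} \]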
8.2.5 Implementation of formalism to the instrument
The foregoing formula is difficult to implement in the case of an actual instrument, because
it contains two phase angles instead of positions of the antennas or the detectors in the focal
plane. Let us consider the internal antennas first, and let us define a convenient co-ordinate
system in the following way. Choose a point on the focal plane and call that the origin. Let
the optical distance from the antennas to the focal plane be d. Let the antennas be labeled by
the numbers i and j (we need two labels since this is an interferometer and the basic unit is
one baseline, i.e. two antennas). Then the position of one of the antennas is specified by the
vector
\[ \mathbf{r}_{ib} = (x_i, y_i, d) \tag{8.77} \]
(the notation $\mathbf{r}_{ib}$ may seem a little intriguing; after all, what it means is “position of the ith
bolometer”; however, we will need to define another vector $\mathbf{r}_{i0}$ which represents a point on the
focal plane with the same (x, y) co-ordinates as the bolometer - we will need this to calculate
angles). Positions on the focal plane are specified by
\[ \mathbf{r} = (x, y, 0) \tag{8.78} \]
The path length from one antenna to any point in the focal plane is then given by
\[ |\mathbf{r}_{ib} - \mathbf{r}| = \left((x - x_i)^2 + (y - y_i)^2 + d^2\right)^{1/2} \tag{8.79} \]
For the second antenna in a baseline, a similar equation can be written down:
\[ |\mathbf{r}_{jb} - \mathbf{r}| = \left((x - x_j)^2 + (y - y_j)^2 + d^2\right)^{1/2} \tag{8.80} \]
Then, the path difference between the two antennas in one baseline is:
\[ |\mathbf{r}_{ib} - \mathbf{r}| - |\mathbf{r}_{jb} - \mathbf{r}| = \left((x - x_i)^2 + (y - y_i)^2 + d^2\right)^{1/2} - \left((x - x_j)^2 + (y - y_j)^2 + d^2\right)^{1/2} = r_{ij} \tag{8.81} \]
The “phase angle” associated with the path difference $r_{ij}$ is
\[ \delta = \frac{2\pi}{\lambda}\, r_{ij} \tag{8.82} \]
This is the definition of δ that needs to be substituted in the last equation in the previous
sub-section.
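Eqs.(8.79) to (8.82) map directly onto a few array operations. The following is an illustrative sketch with hypothetical function names; positions, the distance d and the wavelength must all share the same length unit.

```python
import numpy as np

def path_length(antenna_xy, x, y, d):
    """Distance from an antenna at (x_i, y_i, d) to the focal-plane point (x, y, 0)."""
    xi, yi = antenna_xy
    return np.sqrt((x - xi) ** 2 + (y - yi) ** 2 + d ** 2)

def focal_plane_phase(antenna_i, antenna_j, x, y, d, wavelength):
    """Phase delta = 2*pi*r_ij / lambda for one baseline, eqs. (8.81)-(8.82)."""
    r_ij = path_length(antenna_i, x, y, d) - path_length(antenna_j, x, y, d)
    return 2.0 * np.pi * r_ij / wavelength
```

Passing arrays from `np.meshgrid` as `x` and `y` gives δ over the whole focal plane at once, from which the fringe factor cos(δ − ε) follows.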
8.2.5.1 Calculation of Angles
Using the above geometrical setup, we can also evaluate the angles introduced in section
3, φi and φj. Looking at fig.(8.7), we can define two vectors that represent two points with the
same (x, y) co-ordinates as the two bolometers respectively, but both of them on the focal plane.
Let us call these vectors $\mathbf{r}_{i0}$ and $\mathbf{r}_{j0}$; their co-ordinates are $(x_i, y_i, 0)$ and $(x_j, y_j, 0)$ respectively.
Figure 8.7: Schematic of the Quasioptical beam combination set-up inside the cryostat
[Figure labels: antennae at $\mathbf{r}_{ib} = (x_i, y_i)$ and $\mathbf{r}_{jb} = (x_j, y_j)$; focal plane point $\mathbf{r} = (x, y)$; angles φi and φj; separation d; path difference.]
Again, looking at fig.(8.7), we can figure out the angles φi and φj with the help of the two
vectors we just defined. Notice that the two vectors $\mathbf{r}_{ib} - \mathbf{r}_{i0}$ and $\mathbf{r}_{ib} - \mathbf{r}$ enclose the angle φi
between them. We can therefore easily figure out the cosine of the angle:
\[ \cos\phi_i = \frac{(\mathbf{r}_{ib} - \mathbf{r}_{i0}) \cdot (\mathbf{r}_{ib} - \mathbf{r})}{|\mathbf{r}_{ib} - \mathbf{r}_{i0}| \cdot |\mathbf{r}_{ib} - \mathbf{r}|} \tag{8.83} \]
But $|\mathbf{r}_{ib} - \mathbf{r}_{i0}| = d$, so that
\[ \phi_i = \cos^{-1} \frac{(\mathbf{r}_{ib} - \mathbf{r}_{i0}) \cdot (\mathbf{r}_{ib} - \mathbf{r})}{d \cdot |\mathbf{r}_{ib} - \mathbf{r}|} \tag{8.84} \]
Similarly,
\[ \phi_j = \cos^{-1} \frac{(\mathbf{r}_{jb} - \mathbf{r}_{j0}) \cdot (\mathbf{r}_{jb} - \mathbf{r})}{d \cdot |\mathbf{r}_{jb} - \mathbf{r}|} \tag{8.85} \]
We can now substitute the values of δ, φi, φj and ε (which depends on the baseline and pointing).
Given a set of n antennas and their positions, we can cycle between all the baselines to get the
net interference pattern as a function of x and y, the co-ordinates on the focal plane.
Figure 8.8: Schematic of the Quasioptical beam combination set-up inside the cryostat
[Figure labels: antennas A $(x_1, y_1, 0)$ and B $(x_2, y_2, 0)$; origin (0, 0, 0).]
8.2.5.2 Size and placement of bolometers
Let the placement of bolometers on the focal plane be characterized by xk, and let their
lengths along the x and y axes be a and b respectively. To get the signal from one bolometer,
we need to integrate the expression for the signal as a function of x and y over this range.
8.2.6 Recovery of Cℓ from instrument simulation
8.2.6.1 Simulation
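Let us start with the expression for the output at the Fizeau combiner’s focal plane:
\[ I(\nu_1, \nu_2; \chi) = \frac{\int_{z_1}^{z_2} \int\!\!\int \left[E_1^* E_2 e^{i(\delta+\epsilon(\phi,\theta,\chi))} + E_1 E_2^* e^{-i(\delta+\epsilon(\phi,\theta,\chi))}\right] \sin\theta \, d\phi \, d\theta \left(\frac{z^3}{e^z - 1}\right) dz}{\int_{z_1}^{z_2} \frac{z^3}{e^z - 1}\, dz} \tag{8.86} \]
Excluding normalization, and denoting Planck distribution effects as P(z),
\[ I(\nu_1, \nu_2; \chi) = \int_{z_1}^{z_2} \int\!\!\int \left[E_1^* E_2 e^{i(\delta+\epsilon(\phi,\theta,\chi))} + E_1 E_2^* e^{-i(\delta+\epsilon(\phi,\theta,\chi))}\right] \sin\theta \, d\phi \, d\theta \, P(z)\, dz \tag{8.87} \]
In the actual simulation, the ∫’s are replaced by ∑’s:
\[ I(\nu_1, \nu_2; \chi) = \sum_z \sum_\Omega \left[E_1^* E_2 e^{i(\delta+\epsilon(\phi,\theta,\chi))} + E_1 E_2^* e^{-i(\delta+\epsilon(\phi,\theta,\chi))}\right] \Delta\Omega \, P(z)\, \Delta z \tag{8.88} \]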
Let HA = collecting area of the horn antenna; FPA = area of the focal plane. Let (x, y) be
co-ordinates on the focal plane. Then,
\[ O(\nu_1, \nu_2;\, \chi;\, x, y(ND);\, NB) = \frac{HA}{FPA}\, \mathrm{SUM} \tag{8.89} \]
where
\[ \mathrm{SUM} = \sum_{x,y} \sum_z \sum_\Omega \left[E_1^* E_2 e^{i(\delta+\epsilon(\phi,\theta,\chi,NB))} + E_1 E_2^* e^{-i(\delta+\epsilon(\phi,\theta,\chi,NB))}\right] \Delta\Omega \, P(z)\, \Delta z \, \Delta x \, \Delta y \tag{8.90} \]
where (x, y) specify the position of one detector, χ is the orientation of the instrument and
NB and ND specify a baseline and a detector respectively; O is the output at a detector for a
particular baseline and orientation.
To recover visibilities, we need to “undo” the effect of all the factors above. Let
$\Omega_O$ = Solid angle on the sky observed by the instrument
DA = Area of one detector
P(ν)∆ν = Net Planck factor
Φ(x, y(ND)) = Net phase introduced inside Fizeau combiner
$f_{SKY}$ = Fraction of sky covered by instrument (8.91)
Then, the visibility $\mathcal{V}$ for a given baseline, detector and orientation is given by
\[ \mathcal{V}(\chi;\, ND;\, NB) = \frac{O(\nu_1, \nu_2;\, \chi;\, x, y(ND);\, NB)}{\Omega_O \, DA \, e^{i\Phi(ND)}} \left(\frac{FPA}{HA}\right) \tag{8.92} \]
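To obtain an estimate of the power spectrum, Cℓ, recall from §5.6 the relation between
visibility and power spectrum:
\[ \frac{\left\langle \mathcal{V}_i \mathcal{V}_j^* \right\rangle}{K^2 T^2} = \sum_\ell \left(\frac{2\ell + 1}{4\pi}\right) C_\ell \int d\mathbf{n} \int d\mathbf{n}'\, A_i(\mathbf{n})\, A_j^*(\mathbf{n}')\, P_\ell(\mathbf{n} \cdot \mathbf{n}')\, e^{i 2\pi (\vec{u}_i \cdot \mathbf{n} - \vec{u}_j \cdot \mathbf{n}')} \tag{8.93} \]
or, equivalently,
\[ \frac{\left\langle \mathcal{V}_i \mathcal{V}_j^* \right\rangle}{K^2 T^2} = \sum_\ell \left(\frac{2\ell + 1}{4\pi}\right) C_\ell \, W_{ij,\ell} \tag{8.94} \]
An estimate of the power spectrum is then obtained by
\[ C_\ell = \left(\frac{4\pi}{2\ell + 1}\right) \frac{1}{f_{SKY}}\, \frac{1}{P(\nu)\Delta\nu} \left\langle \mathcal{V}(\chi_i;\, ND;\, NB)\, \mathcal{V}^*(\chi_i;\, ND;\, NB) \right\rangle \tag{8.95} \]
where we have completely disregarded
1. The antenna beam and therefore the window function, which is assumed to be a delta-
function above.
2. Finite sky coverage. This leads to a convolution discussed in §...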
An estimate of errors on these estimates for the power spectrum is also needed, to find
out whether the recovered values of Cℓ are consistent with the values that the simulation used.
A very small fraction of the sky is used, so we expect the cosmic variance to be high. Let us
list the various errors on this estimate and find out how they contribute to the net error:
Finite sky coverage: $\sqrt{\dfrac{1}{f_{SKY}(2\ell + 1)}}$
Cℓ sampling variance: $\sqrt{\dfrac{1}{N_\chi}}$
Simulation sampling variance: $\sqrt{\dfrac{1}{N_{PIX}}}$ (8.96)
Errors due to finite sky coverage and finite number of instrument orientations are coupled
to each other, whereas the number of pixels on the simulated sky is independent:
\[ \sigma^2_{NET} = \frac{1}{f_{SKY}(2\ell + 1)}\, \frac{1}{N_\chi} + \frac{1}{N_{PIX}} \tag{8.97} \]
Since I am not sure that eq.(8.97) is correct, the plotted error bars are given by
\[ \sigma^2_{NET} = \frac{1}{f_{SKY}(2\ell + 1)} \tag{8.98} \]
A sample recovered power spectrum is shown in fig.(8.11). Notice that the normalization
is different from the power spectrum that the simulation started out with. The recovered
spectrum does, however, present the same features as the input power spectrum. The beam
of each antenna was assumed to be a “top-hat”, which leads to a window function with large
sidelobes with which bandpowers are convolved. This convolution has not been reversed in
the recovered spectrum. Also, it was not possible to make baselines that would correspond to
ℓ ≳ 200 because of the size of the antennae.
This simulation shall be extended to demonstrate the u-v plane spectral resolution ability
of the Fizeau combiner.
8.2.6.2 Simulation parameters
Frequency range: 93 − 94 GHz
Antenna 1 position (in cm): (−5, −5)
Antenna 2 position (in cm): (10, −5)
Antenna 3 position (in cm): (4, 6)
Antenna 4 position (in cm): (−8, 3)
Baseline lengths (in cm): 15.0, 14.2, 12.5, 8.5, 19.7, 12.4
Corresponding values of ℓ: 147, 139, 123, 84, 193, 121 (8.99)
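The baseline lengths listed above follow from the antenna positions by cycling over all pairs. In the sketch below, the conversion to multipole uses ℓ = πB/λ at the band centre (93.5 GHz); this relation is an assumption inferred here because it reproduces the listed ℓ values, and it is not stated explicitly in the text.

```python
import itertools
import numpy as np

C_CM_PER_S = 2.99792458e10                 # speed of light in cm/s

antennas_cm = np.array([(-5.0, -5.0), (10.0, -5.0), (4.0, 6.0), (-8.0, 3.0)])
nu_centre_hz = 93.5e9                      # centre of the 93-94 GHz band
wavelength_cm = C_CM_PER_S / nu_centre_hz

# Every unordered pair of antennas forms one baseline (6 for 4 antennas).
pairs = list(itertools.combinations(range(len(antennas_cm)), 2))
lengths_cm = np.array([np.linalg.norm(antennas_cm[i] - antennas_cm[j])
                       for i, j in pairs])
# Assumed conversion from baseline length to multipole: ell = pi * B / lambda.
ells = np.pi * lengths_cm / wavelength_cm

for (i, j), b, ell in zip(pairs, lengths_cm, ells):
    print(f"antennas {i + 1}-{j + 1}: B = {b:.1f} cm, ell = {ell:.0f}")
```

The pairs come out in a different order from the list above, but the set of six lengths and multipoles is the same.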
Figure 8.9: The power spectrum used for the simulation.
Figure 8.10: Temperature map from the power spectrum shown in fig.(8.9) above. Used as
input for the instrument simulation. Temperature anisotropies are in µK.
Bibliography
[1] M. Zaldarriaga and U. Seljak, “CMBFAST for Spatially Closed Universes,” ApJ Suppl.,
vol. 129, pp. 431–434, Aug. 2000.
[2] M. White, J. E. Carlstrom, M. Dragovan, and W. L. Holzapfel, “Interferometric Observation
of Cosmic Microwave Background Anisotropies,” ApJ, vol. 514, pp. 12–24, Mar. 1999.
Chapter 9
CMB Data Analysis
For almost three decades after Penzias and Wilson’s discovery, the task of finding anisotropies
in the CMB remained a challenge. COBE brought about a revolution when it reported a
detection on a 7° angular scale. A number of smaller instruments focused on smaller angular
scales followed, and in a short period of time, the size of the datasets exceeded the capabilities of
the techniques used to analyze the data. Development of analysis techniques has been the
prime focus of theorists in CMB ever since. In addition, experiments like DASI have upped the
stakes because they use interferometry. In a way, interferometry has some advantages, since it
enables sampling directly from Fourier space, i.e. directly from the ℓ-modes themselves, which is
precisely what we aim for in power spectrum estimation. However, interferometers come with
their own challenges.
In this chapter, we start with a concise discussion of linear mapmaking techniques in §9.1.
We then move to Bayesian maximum-likelihood analysis of interferometry data to recover Cℓ’s
and show that a full Bayesian approach is computationally infeasible. In section 4, we explore
a novel approach to likelihood analysis that enables computationally efficient calculation within
a fully Bayesian framework, without having to make approximations. This is the first time this
technique (called “Gibbs sampling”) has been applied to interferometry. We then present results
and show that Gibbs sampling as used here is indeed robust by applying the Gelman-Rubin
test for convergence.
9.1 Mapmaking
9.1.1 The general mapmaking problem
This section is a concise summary of the detailed discussion in [1].
In general, every instrument has its own unique scan strategy, and what we receive (in our
case, the visibility from some baseline) depends on the signal on the sky, and the convolution
with the beam, combined with the scan strategy. Instrumental noise has to be added to that
later. All this is summarized in the nice equation
\[ \mathbf{d} = \mathbf{P}\Delta + \mathbf{n} \tag{9.1} \]
where d is TOD (“time-ordered data”), ∆ is the signal we are trying to recover, n is instrumental
noise, and P has all the information about our scan strategy. Bear in mind in the discussion
that follows that d, ∆ and n are vectors, i.e. column matrices (one could take them to be row
matrices just as well, since the final results will be exactly the same), and P is a matrix with
the correct dimensions.
In order to make a map, then (or, for that matter, do any analysis), we need to recover
∆, the signal on the sky; in our case, the visibility from different baselines. We therefore need
to invert the above equation to recover ∆. While it isn’t clear to me whether there is a finite
number of non-linear methods, our first task should be to find linear solutions. All linear
solutions can be expressed as
\[ \Delta' = \mathbf{B}\mathbf{d} \tag{9.2} \]
W.L.O.G., where ∆′ is an estimate of ∆, and therefore, there is an estimation error involved,
no matter how good our technique. Before we jump into calculating B, let us define a few basic
quantities. Let $\mathbf{S} = \langle \Delta\Delta^T \rangle$ denote the theory covariance matrix, and $\mathbf{N} = \langle \mathbf{n}\mathbf{n}^T \rangle$ the noise
covariance matrix. It seems reasonable to me (do let me know if you object, and why, since
I may have missed something) to suppose that noise and signal are uncorrelated, since their
sources have nothing to do with each other, i.e. $\langle \Delta \mathbf{n}^T \rangle = \langle \mathbf{n} \Delta^T \rangle \equiv 0$.
We are now ready to address the mapmaking problem. What follows is a discussion of
my own analysis, and unfortunately, I haven’t been able to compare it to any reference to see
whether it has any element of sensibility. What I found is that broadly, there are three different
conditions on can impose, and each will lead to a different (and unique) definition of B. The
three conditions are:
1. Ease of calculation
2. Minimizing the estimation error
3. Minimizing χ2
Let us look at these in detail.
9.1.1.1 The Brute-force simplistic method
In the equation
\[ \mathbf{d} = \mathbf{P}\Delta + \mathbf{n} \tag{9.3} \]
the simplest thing to do to recover ∆ is to multiply throughout by $\mathbf{P}^{-1}$ to get
\[ \mathbf{P}^{-1}\mathbf{d} = \Delta + \mathbf{P}^{-1}\mathbf{n} \tag{9.4} \]
so that
\[ \Delta' = \mathbf{P}^{-1}\mathbf{d} \tag{9.5} \]
with an estimation error
\[ \delta = \mathbf{P}^{-1}\mathbf{n} \tag{9.6} \]
However, we CANNOT possibly do the inverse operation on P, since it is not a square matrix.
But we can do other operations, like taking the transpose. So we recall that $\mathbf{P}^T\mathbf{P}$ is the modulus
of P. This is a square matrix, and we can take its inverse, which is, schematically, $\mathbf{P}^{-1}(\mathbf{P}^T)^{-1}$
(I say schematically, because the operation inverse is not permitted on the individual matrices).
Clearly, if we multiply this by $\mathbf{P}^T$, we recover (again schematically) $\mathbf{P}^{-1}$. The net matrix is then $(\mathbf{P}^T\mathbf{P})^{-1}\mathbf{P}^T$.
If we look at the last equation more carefully, we may perhaps be persuaded to feel a
little less silly for adopting such a simplistic approach, since it tells us that the estimation error
is independent of signal, and depends only on instrumental noise. While this is nice, we may
not entirely be happy with it and want to minimize $|\delta|^2$.
9.1.1.2 Minimizing the estimation error
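In general, we want to estimate ∆ by a “correction matrix”; call it B:
\[ \Delta' = \mathbf{B}\mathbf{d} = \mathbf{B}\mathbf{P}\Delta + \mathbf{B}\mathbf{n} \tag{9.7} \]
so that the estimation error is
\[ \delta = \Delta' - \Delta = \mathbf{B}\mathbf{P}\Delta + \mathbf{B}\mathbf{n} - \Delta = (\mathbf{B}\mathbf{P} - \mathbf{I})\Delta + \mathbf{B}\mathbf{n} \tag{9.8} \]
Therefore,
\[ |\delta|^2 = \left\langle \delta\delta^T \right\rangle = \left\langle \left[(\mathbf{B}\mathbf{P} - \mathbf{I})\Delta + \mathbf{B}\mathbf{n}\right] \left[\Delta^T\left(\mathbf{P}^T\mathbf{B}^T - \mathbf{I}\right) + \mathbf{n}^T\mathbf{B}^T\right] \right\rangle \tag{9.9} \]
Multiplying out explicitly,
\[ |\delta|^2 = \left\langle (\mathbf{B}\mathbf{P} - \mathbf{I})\Delta\Delta^T\left(\mathbf{P}^T\mathbf{B}^T - \mathbf{I}\right) + \mathbf{B}\mathbf{n}\mathbf{n}^T\mathbf{B}^T + \left[\mathbf{B}\mathbf{n}\Delta^T\left(\mathbf{P}^T\mathbf{B}^T - \mathbf{I}\right) + (\mathbf{B}\mathbf{P} - \mathbf{I})\Delta\mathbf{n}^T\mathbf{B}^T\right] \right\rangle \tag{9.10} \]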
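The two terms in the square brackets are ∝ either $\langle \Delta\mathbf{n}^T \rangle$ or its transpose, both of which are
zero. Therefore,
\[ |\delta|^2 = \left\langle (\mathbf{B}\mathbf{P} - \mathbf{I})\Delta\Delta^T\left(\mathbf{P}^T\mathbf{B}^T - \mathbf{I}\right) + \mathbf{B}\mathbf{n}\mathbf{n}^T\mathbf{B}^T \right\rangle \tag{9.11} \]
Substituting $\mathbf{S} = \langle \Delta\Delta^T \rangle$ and $\mathbf{N} = \langle \mathbf{n}\mathbf{n}^T \rangle$, we get
\[ |\delta|^2 = (\mathbf{B}\mathbf{P} - \mathbf{I})\,\mathbf{S}\left(\mathbf{P}^T\mathbf{B}^T - \mathbf{I}\right) + \mathbf{B}\mathbf{N}\mathbf{B}^T \tag{9.12} \]
Now, we need a B such that $|\delta|^2$ is minimized. Therefore, we need a solution to the equation
\[ \frac{\partial |\delta|^2}{\partial \mathbf{B}} = 0 \tag{9.13} \]
Differentiating w.r.t. B, we get
\[ \mathbf{P}\mathbf{S}\left(\mathbf{P}^T\mathbf{B}^T - \mathbf{I}\right) + \mathbf{N}\mathbf{B}^T = 0 \tag{9.14} \]
Collecting terms with $\mathbf{B}^T$,
\[ \left(\mathbf{P}\mathbf{S}\mathbf{P}^T + \mathbf{N}\right)\mathbf{B}^T - \mathbf{P}\mathbf{S} = 0 \tag{9.15} \]
or
\[ \left(\mathbf{P}\mathbf{S}\mathbf{P}^T + \mathbf{N}\right)\mathbf{B}^T = \mathbf{P}\mathbf{S} \tag{9.16} \]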
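At first glance, it seems like solving this equation is impossible for any value of $\mathbf{B}^T$, because
if we define $\mathbf{B}^T$ such that it cancels out either term inside the brackets, it is impossible to make
the LHS equal to PS.
However, we can always cancel out the bracket in the LHS by defining a $\mathbf{B}^T$ that is ∝ its
inverse. If we also require that $\mathbf{B}^T$ is simultaneously ∝ PS, then we have essentially solved the
equation. Formally, the solution is
\[ \mathbf{B}^T = \left(\mathbf{P}\mathbf{S}\mathbf{P}^T + \mathbf{N}\right)^{-1}\mathbf{P}\mathbf{S} \tag{9.17} \]
All we need to do now is to take the transpose of the RHS, and we will have our solution, i.e.
\[ \mathbf{B} = (\mathbf{P}\mathbf{S})^T \left(\left(\mathbf{P}\mathbf{S}\mathbf{P}^T + \mathbf{N}\right)^T\right)^{-1} \tag{9.18} \]
where we have used the fact that $(\mathbf{A}\mathbf{B})^T = \mathbf{B}^T\mathbf{A}^T$. We can now expand out to get
\[ \mathbf{B} = \left(\mathbf{S}^T\mathbf{P}^T\right) \left(\left(\mathbf{P}\mathbf{S}\mathbf{P}^T\right)^T + \mathbf{N}^T\right)^{-1} \tag{9.19} \]
But for S, $\mathbf{S}^T = \left(\Delta\Delta^T\right)^T = \left(\Delta^T\right)^T\Delta^T = \Delta\Delta^T = \mathbf{S}$, and similarly, $\mathbf{N}^T = \mathbf{N}$. Now
$\left(\mathbf{P}\mathbf{S}\mathbf{P}^T\right)^T = \left(\mathbf{S}\mathbf{P}^T\right)^T\mathbf{P}^T = \left(\mathbf{P}^T\right)^T\mathbf{S}^T\mathbf{P}^T = \mathbf{P}\mathbf{S}\mathbf{P}^T$. So the final expression is
\[ \mathbf{B} = \mathbf{S}\mathbf{P}^T \left[\mathbf{P}\mathbf{S}\mathbf{P}^T + \mathbf{N}\right]^{-1} \tag{9.20} \]
We could have differentiated $|\delta|^2$ w.r.t. $\mathbf{B}^T$ instead, and we would get precisely the same result.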
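The minimum-error solution of eq.(9.20) can be evaluated without forming an explicit inverse, by solving the linear system (9.16) directly. This is a sketch only; `wiener_matrix` is a hypothetical name, and S and N are assumed symmetric, as argued above.

```python
import numpy as np

def wiener_matrix(P, S, N):
    """Return B = S P^T (P S P^T + N)^(-1), eq. (9.20).

    Solves (P S P^T + N) B^T = P S, eq. (9.16), and transposes the result.
    """
    A = P @ S @ P.T + N
    return np.linalg.solve(A, P @ S).T
```

An estimate of the signal is then `wiener_matrix(P, S, N) @ d`. As the signal covariance grows relative to the noise, this estimator approaches the χ²-minimizing one, provided the relevant inverses exist.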
9.1.1.3 Minimizing χ2
Following the discussion in [2] §11.5 (equations 11.129 to 11.131),
\[ \chi^2 = (\mathbf{d} - \mathbf{P}\Delta)^T\, \mathbf{N}^{-1} (\mathbf{d} - \mathbf{P}\Delta) \tag{9.21} \]
To minimize χ², we set
\[ \frac{\partial \chi^2}{\partial \Delta} = 0 \tag{9.22} \]
which gives us
\[ \Delta' = \left(\mathbf{P}^T\mathbf{N}^{-1}\mathbf{P}\right)^{-1}\mathbf{P}^T\mathbf{N}^{-1}\mathbf{d} \tag{9.23} \]
\[ \Rightarrow \mathbf{B} = \left(\mathbf{P}^T\mathbf{N}^{-1}\mathbf{P}\right)^{-1}\mathbf{P}^T\mathbf{N}^{-1} \tag{9.24} \]
This map-making method can be employed to extract a “Fourier map”, in other words,
visibilities, from an interferometer. However, the method involves a number of matrix inversions,
each of which costs ∼ $N_P^3$ operations, where $N_P$ is the number of pixels in the map (in Fourier or
real space). Wandelt et al ([3, 4]) have introduced fully Bayesian methods that allow a global
inference of covariance and allow map recovery at the same time, and cost only $N_P^{3/2}$ operations.
This method (called “Gibbs sampling”) will be discussed in §.. The true advantages of Gibbs
sampling come to light when power spectra need to be evaluated.
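For small problems, the χ²-minimizing estimate of eq.(9.23) can be sketched with linear solves in place of explicit inverses. The function name `chi2_map` is hypothetical, and dense matrices are assumed.

```python
import numpy as np

def chi2_map(P, N, d):
    """Estimate Delta' = (P^T N^-1 P)^-1 P^T N^-1 d, eq. (9.23)."""
    ninv_p = np.linalg.solve(N, P)       # N^-1 P
    ninv_d = np.linalg.solve(N, d)       # N^-1 d
    return np.linalg.solve(P.T @ ninv_p, P.T @ ninv_d)
```

When N is proportional to the identity, this reduces to the brute-force estimator $(\mathbf{P}^T\mathbf{P})^{-1}\mathbf{P}^T\mathbf{d}$ of §9.1.1.1, and for noiseless data it returns the input signal exactly.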
9.2 Power Spectrum Estimation: Bayesian Approach
Let d = pixelized data from an interferometer. We wish to explore the posterior density
\[ P(C_\ell | \mathbf{d}) \propto P(\mathbf{d} | C_\ell)\, P(C_\ell) \tag{9.25} \]
where $P(C_\ell)$ is the prior and
\[ P(\mathbf{d} | C_\ell) \propto \frac{1}{\sqrt{|C_D|}} \exp\left(-\frac{1}{2}\, \mathbf{d}^\dagger C_D^{-1} \mathbf{d}\right) \tag{9.26} \]
where
\[ C_D = S(C_\ell) + N \tag{9.27} \]
is the covariance matrix.
Traditionally, least-squares [5] and maximum-likelihood [6] estimators have been
employed to explore the posterior. However, evaluating either is computationally very costly,
requiring $O(N_P^3)$ operations.
9.2.1 Detailed Bayesian Formalism
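In the Bayesian approach, we wish to compute the posterior density
\[ P(C_\ell, \mathbf{s} | \mathbf{d}) = \frac{P(\mathbf{d} | \mathbf{s})\, P(\mathbf{s} | C_\ell)\, P(C_\ell)}{P(\mathbf{d})} \tag{9.28} \]
If S and N are signal and noise covariance matrices respectively,
\[ P(\mathbf{d} | \mathbf{s}) = \frac{1}{\sqrt{|2\pi N|}} \exp\left(-\frac{1}{2} (\mathbf{d} - \mathbf{s})^\dagger N^{-1} (\mathbf{d} - \mathbf{s})\right) \tag{9.29} \]
Also, since we know that the CMB is very nearly gaussian,
\[ P(\mathbf{s} | C_\ell) = \frac{1}{\sqrt{|2\pi S|}} \exp\left(-\frac{1}{2}\, \mathbf{s}^\dagger S^{-1} \mathbf{s}\right) \tag{9.30} \]
so that
\[ -\ln P(C_\ell, \mathbf{s} | \mathbf{d}) = \frac{1}{2} (\mathbf{d} - \mathbf{s})^\dagger N^{-1} (\mathbf{d} - \mathbf{s}) + \frac{1}{2}\, \mathbf{s}^\dagger S^{-1} \mathbf{s} + \frac{1}{2} \ln |S| \tag{9.31} \]
The best estimate of the signal s can then be found by
\[ \frac{\partial \left(-\ln P(C_\ell, \mathbf{s} | \mathbf{d})\right)}{\partial \mathbf{s}^\dagger} = -N^{-1} (\mathbf{d} - \mathbf{s}) + S^{-1} \mathbf{s} = 0 \;\Longrightarrow\; \left(S^{-1} + N^{-1}\right) \mathbf{s}_{BE} = N^{-1} \mathbf{d} \tag{9.32} \]
Matrix inversions are computationally costly, and in the final expression for $\mathbf{s}_{BE}$, there are
three inversions, each costing $O(N_P^3)$ operations. If we transform the above equation
into the form Ax = b, we can employ efficient techniques to solve for $\mathbf{s}_{BE}$.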
However, an efficient method of extracting sBE leaves the problem of Cℓ extraction being
extremely time-consuming computationally.
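To make the “$Ax = b$” remark concrete, here is a minimal Python sketch under the assumption of small dense toy matrices (the matrices `S`, `N` and the data `d` are invented for illustration). It uses the algebraically equivalent form $s_{BE} = S (S + N)^{-1} d$, which needs a single linear solve and no explicit inverse, and compares it with the three-inversion form of eq. (9.32).

```python
import numpy as np

def best_estimate(S, N, d):
    """Solve (S^-1 + N^-1) s = N^-1 d via the equivalent form s = S (S + N)^-1 d."""
    y = np.linalg.solve(S + N, d)   # one linear solve, no explicit inverse
    return S @ y

rng = np.random.default_rng(0)
n = 6                                            # hypothetical number of pixels
A = rng.normal(size=(n, n))
S = A @ A.T + np.eye(n)                          # toy symmetric positive-definite signal covariance
N = np.diag(rng.uniform(0.5, 2.0, n))            # toy diagonal noise covariance
d = rng.normal(size=n)                           # toy data vector

s_be = best_estimate(S, N, d)

# Direct evaluation with three explicit inversions, for comparison only.
S_inv, N_inv = np.linalg.inv(S), np.linalg.inv(N)
s_direct = np.linalg.inv(S_inv + N_inv) @ (N_inv @ d)
```

For large $N_P$ one would replace the dense solve with an iterative method such as conjugate gradients; that choice is outside the scope of this sketch.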
9.2.2 The problem with the Bayesian approach
Interferometry has been used to detect CMB temperature and polarization anisotropy
(VSA, DASI, CBI references). Here are some of the advantages of interferometry:
1. Direct sampling of Fourier space
2. No leakage T → Q,U , so better control of systematic effects
However, an exact Bayesian analysis is just as infeasible for an interferometer as for an imaging experiment. Wandelt et al. [3] have introduced a fully Bayesian approach, Gibbs sampling, that allows a global inference of the covariance. Among the many advantages of the Gibbs sampler is that it is easily extended to foreground removal.
Let us first illustrate Gibbs sampling by applying it to a simple problem in the next section.
9.3 Interlude: The Gibbs Sampler
9.3.1 The problem
The problem is to infer the variance of a fluctuating signal when only noisy measurements of that signal are available. Suppose we are given, say, 10 data values $d_i$, each an independent sample of the sum of two Gaussian variates, $s_i$ (the signal) and $n_i$ (the noise). Both $s$ and $n$ have zero mean, and we know the variance of $n$ but not the variance of $s$, $\sigma_s^2$. Write down Bayes' theorem for this case, and compute
• the conditional density for the vector $s$ (with components $s_i$) given $\sigma_s^2$, and
• the conditional density for $\sigma_s^2$ (up to normalization);
• then sample from the joint density of $s$ and $\sigma_s^2$ using Gibbs sampling.
We spell out Bayes' theorem in §9.3.2 and the sampling technique in §9.3.3.
9.3.2 Bayes’ Theorem
Since we are working with an interferometer, let us assume that all the formalism we
write down is for an interferometer. For instance, si stands for visibility from a baseline, free
from noise, and di is the measured visibility etc.
Our aim is to find the posterior density, i.e. the probability of the theory given the data.
We can write, from Bayes’ theorem
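\[ P(\sigma_s^2 | d_i) = \frac{P(d_i | \sigma_s^2) \, P(\sigma_s^2)}{P(d_i)} \tag{9.33} \]
which can be written schematically as
\[ \text{Joint Posterior} = \frac{\text{Likelihood} \times \text{Prior}}{\text{Normalization}} \tag{9.34} \]
Assume a flat prior; then, up to a normalization, we get
\[ P(\sigma_s^2 | d_i) \propto P(d_i | \sigma_s^2) \tag{9.35} \]
However, the whole problem is that $\sigma_s^2$ cannot be related directly to the data $d_i$; the two are connected only through the signal $s_i$. In other words,
\[ P(\sigma_s^2 | d_i) \propto P(d_i | s_i) \, P(s_i | \sigma_s^2) \tag{9.36} \]
Now, we know that
1. $s_i$ and $d_i$ are related through noise, so that
\[ P(d_i | s_i) = \frac{1}{\sqrt{2\pi\sigma_N^2}} \exp\left( -\frac{(s_i - d_i)^2}{2\sigma_N^2} \right) \tag{9.37} \]
2. $P(s_i | \sigma_s^2)$ is Gaussian in $s_i$, so that
\[ P(s_i | \sigma_s^2) = \frac{1}{\sqrt{2\pi\sigma_s^2}} \exp\left( -\frac{s_i^2}{2\sigma_s^2} \right) \tag{9.38} \]
Thus,
\[ P(\sigma_s^2 | d_i) \propto \frac{1}{\sqrt{2\pi\sigma_s^2}} \exp\left( -\frac{(s_i - d_i)^2}{2\sigma_N^2} \right) \exp\left( -\frac{s_i^2}{2\sigma_s^2} \right) \tag{9.39} \]
Since there are $N$ observations, the right-hand side is really a product of $N$ factors:
\[ P(\sigma_s^2 | d) \propto \prod_{i=1}^{N} P(d_i | s_i) \, P(s_i | \sigma_s^2) \tag{9.40} \]
But we know the form of each factor, assuming that the mean is zero:
\[ P(d_i | s_i) \, P(s_i | \sigma_s^2) = \frac{1}{\sqrt{2\pi\sigma_s^2}} \frac{1}{\sqrt{2\pi\sigma_N^2}} \exp\left( -\frac{(s_i - d_i)^2}{2\sigma_N^2} \right) \exp\left( -\frac{s_i^2}{2\sigma_s^2} \right) \tag{9.41} \]
And so
\[ P(\sigma_s^2 | d) \propto \mathrm{Norm} \int \cdots \int_{-\infty}^{\infty} \exp\left( -\sum_{i=1}^{N} \frac{(s_i - d_i)^2}{2\sigma_N^2} \right) \exp\left( -\sum_{i=1}^{N} \frac{s_i^2}{2\sigma_s^2} \right) ds_1 \ldots ds_N \tag{9.42} \]
where Norm is the normalization,
\[ \mathrm{Norm} = \left( \frac{1}{\sqrt{2\pi\sigma_s^2}} \right)^N \left( \frac{1}{\sqrt{2\pi\sigma_N^2}} \right)^N \tag{9.43} \]
and we have marginalized over $s$ to get the posterior. This $N$-dimensional integral is hard to evaluate for large values of $N$; that is, evaluating this probability requires huge amounts of computational time. So we sample from the joint probability instead.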
It seems like an even more difficult task, but it is made easier if we are willing to undergo
a paradigm shift in the way we visualize probabilities. Normally, we look at probability as a
function of N variables. If we stick to this interpretation, we will have to evaluate the function
at some point. We can, however, choose to see this probability as a “density of points”, kind of
like the way we see the electron probability density inside an atom. If we can make this step,
then there exist sampling techniques in Statistics that make our job easier. One of them is the
Gibbs’ sampling technique, described in §9.3.3 below.
9.3.3 Sampling Technique
Let us state our problem in a general way first. We have two variables, x and y, and we
know the functional form of P (x|y) and P (y|x), and we wish to find either P (x) or P (y) or
both. Normally, we would marginalize over x or y thus:
\[ P(x) = \int P(x, y) \, dy = \int P(x|y) \, P(y) \, dy \tag{9.44} \]
However, we Gibbs sample instead, in the following way:
• Start with an initial guess of $x$, say $x_0$.
• Sample $y_1$ from $P(y|x_0)$; this can be done since we know the functional form of $P(y|x)$.
• Sample $x_1$ from $P(x|y_1)$.
• Repeat the last two steps, keeping track of all the values of $x$ and $y$ sampled.
After a sufficient number of iterations, the density of the sampled $(x, y)$ pairs represents the joint density $P(x, y)$, and the sampled values of $x$ (or $y$) alone represent $P(x)$ (or $P(y)$).
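The steps above can be sketched in a few lines of Python. The target here is an assumption chosen purely for illustration: a bivariate Gaussian with unit variances and correlation `rho`, for which both conditionals are known in closed form, $P(x|y) = \mathcal{N}(\rho y, 1 - \rho^2)$ and likewise for $P(y|x)$. The function name and the burn-in length are also hypothetical.

```python
import numpy as np

def gibbs_bivariate_normal(rho, n_iter, x0=0.0, seed=0):
    """Gibbs-sample a unit-variance bivariate Gaussian with correlation rho."""
    rng = np.random.default_rng(seed)
    sd = np.sqrt(1.0 - rho**2)          # conditional standard deviation
    xs, ys = np.empty(n_iter), np.empty(n_iter)
    x = x0                              # initial guess x_0
    for i in range(n_iter):
        y = rng.normal(rho * x, sd)     # sample y from P(y|x)
        x = rng.normal(rho * y, sd)     # sample x from P(x|y)
        xs[i], ys[i] = x, y             # keep track of all sampled values
    return xs, ys

xs, ys = gibbs_bivariate_normal(rho=0.8, n_iter=20000)
burn = 1000                             # discard early samples (arbitrary choice)
sample_rho = np.corrcoef(xs[burn:], ys[burn:])[0, 1]
sample_std_x = xs[burn:].std()
```

The sampled pairs reproduce the correlation and the marginal width of the target density without the joint density ever being evaluated.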
9.3.4 Application to experiment
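In the scheme of §9.3.3, let $x = \sigma_s^2$ and $y = s$, and follow through. There is just one small issue: there are two Gaussians multiplied together, and not just one:
\[ P(\sigma_s^2 | d_i) \propto P(d_i | s_i) \, P(s_i | \sigma_s^2) = \frac{1}{\sqrt{2\pi\sigma_s^2}} \frac{1}{\sqrt{2\pi\sigma_N^2}} \exp\left( -\frac{(s_i - d_i)^2}{2\sigma_N^2} \right) \exp\left( -\frac{s_i^2}{2\sigma_s^2} \right) \tag{9.45} \]
We really need to reduce this to one Gaussian with a single mean and variance. The solution to this problem is purely algebraic and simple. In general, when there are two Gaussians $g_1$ and $g_2$ with different means $m_1$ and $m_2$ and different variances $\sigma_1^2$ and $\sigma_2^2$ respectively, we want to find one (mean, variance) pair for
\[ g_1 g_2 \propto \exp\left( -\frac{(x - m_1)^2}{2\sigma_1^2} \right) \exp\left( -\frac{(x - m_2)^2}{2\sigma_2^2} \right) \tag{9.46} \]
We need to remember that the logarithm of the Gaussian function is quadratic. Therefore, we can differentiate the log of the Gaussian and equate it to zero to get the mean. For example, with the above distribution $g_1$,
\[ g_1 = \exp\left( -\frac{(x - m_1)^2}{2\sigma_1^2} \right) \tag{9.47} \]
\[ \Rightarrow \ln g_1 = -\frac{(x - m_1)^2}{2\sigma_1^2} \tag{9.48} \]
\[ \Rightarrow \frac{\partial \ln g_1}{\partial x} = -\frac{(x - m_1)}{\sigma_1^2} \tag{9.49} \]
so that $\partial \ln g_1 / \partial x = 0 \Rightarrow x = m_1$. This is the general way to get the mean of a complicated Gaussian.
Also, we can differentiate twice to get rid of all dependence on the variable $x$, so that only the variance remains. In our present example, differentiate eq. (9.49) again to get
\[ \frac{\partial^2 \ln g_1}{\partial x^2} = -\frac{1}{\sigma_1^2} \tag{9.50} \]
\[ \Rightarrow \sigma_1^2 = -\left( \frac{\partial^2 \ln g_1}{\partial x^2} \right)^{-1} \tag{9.51} \]
We apply this to the product $g_1 g_2$ in eq. (9.46) and get
\[ \ln (g_1 g_2) = -\left[ \frac{(x - m_1)^2}{2\sigma_1^2} + \frac{(x - m_2)^2}{2\sigma_2^2} \right] \tag{9.52} \]
\[ \frac{\partial \ln (g_1 g_2)}{\partial x} = -\left[ \frac{(x - m_1)}{\sigma_1^2} + \frac{(x - m_2)}{\sigma_2^2} \right] \tag{9.53} \]
Now, equate this to zero, which means that we have to solve the following equation for $x$ (the net mean):
\[ \frac{(x - m_1)}{\sigma_1^2} + \frac{(x - m_2)}{\sigma_2^2} = 0 \tag{9.54} \]
Collect all the terms containing $x$ on one side:
\[ x \left( \frac{1}{\sigma_1^2} + \frac{1}{\sigma_2^2} \right) = \left( \frac{m_1}{\sigma_1^2} + \frac{m_2}{\sigma_2^2} \right) \tag{9.55} \]
\[ \Rightarrow x = \frac{ \dfrac{m_1}{\sigma_1^2} + \dfrac{m_2}{\sigma_2^2} }{ \dfrac{1}{\sigma_1^2} + \dfrac{1}{\sigma_2^2} } \tag{9.56} \]
This is the mean for $g_1 g_2$. Now, differentiate eq. (9.53) to get
\[ \frac{\partial^2 \ln (g_1 g_2)}{\partial x^2} = -\left( \frac{1}{\sigma_1^2} + \frac{1}{\sigma_2^2} \right) \equiv -\frac{1}{\sigma^2} \tag{9.57} \]
where $\sigma^2$ is the net variance. Solve to get
\[ \sigma^2 = \frac{1}{ \dfrac{1}{\sigma_1^2} + \dfrac{1}{\sigma_2^2} } = \frac{\sigma_1^2 \sigma_2^2}{\sigma_1^2 + \sigma_2^2} \tag{9.58} \]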
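The results in eqs. (9.56) and (9.58) can be checked numerically. The sketch below is a minimal Python illustration with made-up values for the means and variances; it locates the peak and the curvature of $\ln (g_1 g_2)$ on a grid and compares them with the closed-form mean and variance. The function name `combine_gaussians` is hypothetical.

```python
import numpy as np

def combine_gaussians(m1, var1, m2, var2):
    """Mean and variance of the Gaussian proportional to g1 * g2 (eqs. 9.56 and 9.58)."""
    var = 1.0 / (1.0 / var1 + 1.0 / var2)
    mean = var * (m1 / var1 + m2 / var2)
    return mean, var

m1, var1, m2, var2 = 1.0, 4.0, 3.0, 1.0      # illustrative values only
mean, var = combine_gaussians(m1, var1, m2, var2)

# Numerical check: ln(g1 g2) peaks at `mean` and has second derivative -1/var.
x = np.linspace(-10.0, 10.0, 200001)
log_prod = -(x - m1) ** 2 / (2 * var1) - (x - m2) ** 2 / (2 * var2)
x_peak = x[np.argmax(log_prod)]
curvature = np.gradient(np.gradient(log_prod, x), x)[len(x) // 2]
```

Note that the combined mean is the inverse-variance-weighted average of the two means, so it always lies closer to the mean of the narrower Gaussian.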
Now we have everything we need for implementing Gibbs’ sampling.
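A minimal Python sketch of the resulting sampler for the toy problem of §9.3.1 follows. Two points are assumptions that are not derived in the text: the conditional density for $\sigma_s^2$ given $s$ under a flat prior is taken to be a scaled inverse chi-square with $N - 2$ degrees of freedom, i.e. $\sigma_s^2 = \sum_i s_i^2 / \chi^2_{N-2}$, and the true signal variance, noise variance and chain length are arbitrary illustrative values.

```python
import numpy as np

def gibbs_signal_variance(d, var_n, n_iter=5000, seed=0):
    """Gibbs-sample sigma_s^2 for data d = s + n with known noise variance var_n."""
    rng = np.random.default_rng(seed)
    n = d.size
    var_s = np.var(d)                               # initial guess for sigma_s^2
    chain = np.empty(n_iter)
    for k in range(n_iter):
        # P(s | sigma_s^2, d): product of two Gaussians with means d_i and 0,
        # combined using eqs. (9.56) and (9.58).
        var = var_s * var_n / (var_s + var_n)
        mean = d * var_s / (var_s + var_n)
        s = rng.normal(mean, np.sqrt(var))
        # P(sigma_s^2 | s) with a flat prior: scaled inverse chi-square (assumption).
        var_s = np.sum(s ** 2) / rng.chisquare(n - 2)
        chain[k] = var_s
    return chain

rng = np.random.default_rng(1)
true_var_s, var_n = 4.0, 1.0                        # hypothetical values
d = rng.normal(0.0, np.sqrt(true_var_s), 10) + rng.normal(0.0, np.sqrt(var_n), 10)
chain = gibbs_signal_variance(d, var_n)
```

A histogram of `chain` (after discarding some initial samples) approximates the posterior density of $\sigma_s^2$; with only 10 data values that posterior is expected to be broad and skewed.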
9.3.5 Results
9.4 Cℓ extraction using Gibbs’ Sampling
This section is a concise version of [7].
9.4.1 Method
Let d = pixelized data from an interferometer. We wish to explore the posterior density
\[ P(C_\ell | d) \propto P(d | C_\ell) \, P(C_\ell) \tag{9.59} \]
where $P(C_\ell)$ is the prior and
\[ P(d | C_\ell) \propto \frac{1}{\sqrt{|C_D|}} \exp\left( -\frac{1}{2} d^\dagger C_D^{-1} d \right) \tag{9.60} \]
where
\[ C_D = S(C_\ell) + N \tag{9.61} \]
is the covariance matrix of the data.
Traditionally, least-squares [5] and maximum-likelihood [6] estimators have been employed to explore the posterior. However, evaluating either is computationally very costly, requiring $O(N_P^3)$ operations.
Instead, we use the Gibbs sampler introduced to CMB data analysis by Wandelt et al. [3, 4], and sample from the joint distribution
\[ P(C_\ell, s, d) = P(d|s) \, P(s|C_\ell) \, P(C_\ell) \tag{9.62} \]
since there is no known way to sample directly from $P(C_\ell | d)$ in eq. (9.59) above. The main point of using Gibbs sampling is that it can be proved [8] that if it is possible to sample from $P(s | C_\ell, d)$ and $P(C_\ell | s, d) \propto P(C_\ell | s)$, then we can sample iteratively from the joint distribution [3].
9.4.2 Formalism
In the Bayesian approach, we wish to compute the posterior density
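\[ P(C_\ell, s | d) = \frac{P(d|s) \, P(s|C_\ell) \, P(C_\ell)}{P(d)} \tag{9.63} \]
If $S$ and $N$ are the signal and noise covariance matrices respectively,
\[ P(d|s) \propto \exp\left( -\frac{1}{2} (d - s)^\dagger N^{-1} (d - s) \right) \tag{9.64} \]
Also, since we know that the CMB is very nearly Gaussian [ref],
\[ P(s|C_\ell) \propto \frac{1}{\sqrt{|S|}} \exp\left( -\frac{1}{2} s^\dagger S^{-1} s \right) \tag{9.65} \]
so that, up to an additive constant and with a flat prior on $C_\ell$,
\[ -\ln P(C_\ell, s|d) = \frac{1}{2} (d - s)^\dagger N^{-1} (d - s) + \frac{1}{2} s^\dagger S^{-1} s + \frac{1}{2} \ln |S| \tag{9.66} \]
The best estimate of the signal $s$ can then be found from
\[ \frac{\partial \left( -\ln P(C_\ell, s|d) \right)}{\partial s^\dagger} = -N^{-1} (d - s) + S^{-1} s = 0 \implies \left( S^{-1} + N^{-1} \right) s_{BE} = N^{-1} d \tag{9.67} \]
Matrix inversions are computationally costly, and in the final expression for $s_{BE}$ there are three inversions, each costing $O(N_P^3)$ operations. If we transform the above equation into the form $Ax = b$, we can employ efficient techniques to solve for $s_{BE}$.
We also need to remember that eq. (9.67) gives us a value for the average of the signal. The real signal on the sky is of the form $x + C^{1/2} \xi$, where $\xi$ is a Gaussian random variable with unit variance describing the fluctuation and $C$ is the covariance. $C$ is evaluated easily by noting that $P$ is a product of two Gaussians, with covariances $S$ and $N$; therefore the covariance of $P$ is
\[ C^{-1} = S^{-1} + N^{-1} \implies C = \left( S^{-1} + N^{-1} \right)^{-1} \tag{9.68} \]
Eq. (9.67) can be recast in a more suitable format thus:
\[ \left( S^{-1} + N^{-1} \right) s_{BE} = N^{-1} d \]
\[ \implies S^{-1/2} \left( 1 + S^{1/2} N^{-1} S^{1/2} \right) S^{-1/2} x = N^{-1} d \]
\[ \implies \left( 1 + S^{1/2} N^{-1} S^{1/2} \right) S^{-1/2} x = S^{1/2} N^{-1} d \tag{9.69} \]
where we have replaced $s_{BE}$ by $x$. We now need to find a similar equation for the fluctuating part of the signal. Call this part $b$, such that $s = x + b$. Then $b$ needs to satisfy the following properties:
\[ \langle b \rangle = 0 \tag{9.70} \]
\[ \langle b b^\dagger \rangle = C, \quad \text{since} \tag{9.71} \]
\[ b = C^{1/2} \xi \tag{9.72} \]
\[ \implies C^{-1} b = C^{-1/2} \xi \tag{9.73} \]
We claim then that if
\[ \left( S^{-1} + N^{-1} \right) b = S^{-1/2} \xi_1 + N^{-1/2} \xi_2 \tag{9.74} \]
then $b$ has the properties outlined in eqs. (9.70) and (9.71). To prove the first property, take the average of both sides of eq. (9.74):
\[ \left( S^{-1} + N^{-1} \right) \langle b \rangle = S^{-1/2} \langle \xi_1 \rangle + N^{-1/2} \langle \xi_2 \rangle \equiv 0 \tag{9.75} \]
The second property is proved thus:
\[ \langle b b^\dagger \rangle = C \left\langle \left( S^{-1/2} \xi_1 + N^{-1/2} \xi_2 \right) \left( \xi_1^\dagger S^{-1/2} + \xi_2^\dagger N^{-1/2} \right) \right\rangle C \]
\[ = C \left( S^{-1/2} \langle \xi_1 \xi_1^\dagger \rangle S^{-1/2} + N^{-1/2} \langle \xi_2 \xi_2^\dagger \rangle N^{-1/2} \right) C \]
\[ = C \left( S^{-1} + N^{-1} \right) C = C \tag{9.76} \]
where we have used the fact that $\xi_1$ and $\xi_2$ are independent Gaussian variates with unit variance:
\[ \langle \xi_1 \xi_2^\dagger \rangle = \langle \xi_2 \xi_1^\dagger \rangle = 0 \tag{9.77} \]
\[ \langle \xi_1 \xi_1^\dagger \rangle = \langle \xi_2 \xi_2^\dagger \rangle = 1 \tag{9.78} \]
The equation for $b$ is then
\[ \left( S^{-1} + N^{-1} \right) b = S^{-1/2} \xi_1 + N^{-1/2} \xi_2 \]
\[ \implies S^{-1/2} \left( 1 + S^{1/2} N^{-1} S^{1/2} \right) S^{-1/2} b = S^{-1/2} \xi_1 + N^{-1/2} \xi_2 \]
\[ \implies \left( 1 + S^{1/2} N^{-1} S^{1/2} \right) S^{-1/2} b = \xi_1 + S^{1/2} N^{-1/2} \xi_2 \tag{9.79} \]
To summarize, the two equations for simulating the signal are
\[ \left( 1 + S^{1/2} N^{-1} S^{1/2} \right) S^{-1/2} x = S^{1/2} N^{-1} d \tag{9.80} \]
\[ \left( 1 + S^{1/2} N^{-1} S^{1/2} \right) S^{-1/2} b = \xi_1 + S^{1/2} N^{-1/2} \xi_2 \tag{9.81} \]
These can be solved to obtain $x$ and $b$ for every iteration, and $s = x + b$.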
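The claim that $b$ has covariance $C$ can be checked numerically. The following minimal Python sketch assumes, purely for simplicity, that $S$ and $N$ are diagonal (stored as vectors), so that eqs. (9.67) and (9.74) can be solved element by element. The numerical values and the function name `draw_signal` are made up for illustration.

```python
import numpy as np

def draw_signal(S_diag, N_diag, d, rng):
    """Return the mean part x and one fluctuation draw b for diagonal S and N."""
    C = 1.0 / (1.0 / S_diag + 1.0 / N_diag)         # eq. (9.68), diagonal case
    x = C * d / N_diag                              # solves (S^-1 + N^-1) x = N^-1 d
    xi1 = rng.standard_normal(d.shape)              # unit-variance Gaussian variates
    xi2 = rng.standard_normal(d.shape)
    b = C * (xi1 / np.sqrt(S_diag) + xi2 / np.sqrt(N_diag))   # solves eq. (9.74)
    return x, b

rng = np.random.default_rng(0)
S_diag = np.array([4.0, 1.0, 0.25])                 # toy signal variances
N_diag = np.array([1.0, 1.0, 1.0])                  # toy noise variances
d = np.array([2.0, -1.0, 0.5])                      # toy data

draws = np.array([draw_signal(S_diag, N_diag, d, rng)[1] for _ in range(40000)])
C_expected = 1.0 / (1.0 / S_diag + 1.0 / N_diag)
b_mean = draws.mean(axis=0)                         # should be close to 0
b_var = draws.var(axis=0)                           # should be close to C_expected
```

Each signal sample in the Gibbs chain is then `x + b`; in the non-diagonal case the element-wise divisions are replaced by linear solves of eqs. (9.80) and (9.81).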
9.4.2.1 Beam / Window Function
Brute-force Implementation The foregoing discussion ignores the beam of the instrument, which is assumed to be flat in Fourier space. This assumption is unrealistic, and the beam needs to be included in signal and power spectrum estimation. Let us denote the signal covariance matrix with the “flat” beam mentioned above as $S_D$. $S_D$ is clearly diagonal, and its elements are the different $C_\ell$s. If $S$ is the signal covariance matrix, then, schematically,
\[ S = B S_D \tag{9.82} \]
where $B$ is the beam matrix of the instrument in Fourier space. Recalling that $S = \langle s s^\dagger \rangle$ and that this is a Fourier-space representation of the signal, $B$ is really the window function of the instrument. Let us therefore denote the beam in Fourier space as $B_F$, such that
\[ S = B_F^\dagger S_D B_F \tag{9.83} \]
In other words, we have replaced the signal $s$ with $B_F s$.
Let us work out relations for the best estimate of the average signal, $s_{BE}$ or $x$, and the fluctuation $b$, retracing the steps in §9.4.2. In particular, we start with the modified version of eq. (9.66) with $s \to B_F s$ and $S_D \equiv \langle s s^\dagger \rangle$:
\[ -\ln P(C_\ell, s|d) = \frac{1}{2} (d - B_F s)^\dagger N^{-1} (d - B_F s) + \frac{1}{2} s^\dagger S_D^{-1} s + \frac{1}{2} \ln |S_D| \tag{9.84} \]
The best estimate of the average signal can be found as before:
\[ \frac{\partial \left( -\ln P(C_\ell, s|d) \right)}{\partial s^\dagger} = -B_F^\dagger N^{-1} (d - B_F s) + S_D^{-1} s = 0 \implies \left( S_D^{-1} + B_F^\dagger N^{-1} B_F \right) s_{BE} = B_F^\dagger N^{-1} d \tag{9.85} \]
As before, we can recast eq. (9.85) into a more suitable form:
\[ \left( S_D^{-1} + B_F^\dagger N^{-1} B_F \right) s_{BE} = B_F^\dagger N^{-1} d \]
\[ \implies S_D^{-1/2} \left( 1 + S_D^{1/2} B_F^\dagger N^{-1} B_F S_D^{1/2} \right) S_D^{-1/2} x = B_F^\dagger N^{-1} d \]
\[ \implies \left( 1 + S_D^{1/2} B_F^\dagger N^{-1} B_F S_D^{1/2} \right) S_D^{-1/2} x = S_D^{1/2} B_F^\dagger N^{-1} d \tag{9.86} \]
where we have again replaced $s_{BE}$ by $x$.
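To obtain an equation for the fluctuations, we note that the covariance of $P$ is now $C$, where
\[ C^{-1} = S_D^{-1} + B_F^\dagger N^{-1} B_F \implies C = \left( S_D^{-1} + B_F^\dagger N^{-1} B_F \right)^{-1} \tag{9.87} \]
Following eq. (9.79), we get
\[ \left( S_D^{-1} + B_F^\dagger N^{-1} B_F \right) b = S_D^{-1/2} \xi_1 + B_F^\dagger N^{-1/2} \xi_2 \]
\[ \implies S_D^{-1/2} \left( 1 + S_D^{1/2} B_F^\dagger N^{-1} B_F S_D^{1/2} \right) S_D^{-1/2} b = S_D^{-1/2} \xi_1 + B_F^\dagger N^{-1/2} \xi_2 \]
\[ \implies \left( 1 + S_D^{1/2} B_F^\dagger N^{-1} B_F S_D^{1/2} \right) S_D^{-1/2} b = \xi_1 + S_D^{1/2} B_F^\dagger N^{-1/2} \xi_2 \tag{9.88} \]
Results are shown in figs. (9.6, 9.7, 9.8, 9.9). Both the $C_\ell$s and the maps seem to show a considerable loss in power, which may imply that the beam has not been accounted for properly.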
Computationally Efficient Implementation The foregoing discussion about including the beam is easily implemented; however, the computational cost is higher than before. In order to reduce computation time, we employ the following trick. Let us denote the beam in pixel (or real) space as $B_P$. It is more desirable to work with $B_P$ since
1. It is a sparse matrix, diagonal in the ideal case when there is no “leakage” of signal from one pixel into another.
2. It is a measured quantity, and so any non-ideal behaviour can be directly inferred from measurement.
However, the quantities that an interferometer measures (visibilities) are in the Fourier domain. Therefore, a convenient representation of the beam in Fourier space is
\[ B_F = F B_P F^{-1} \tag{9.89} \]
where $F$ represents a Fourier transform. The idea is as follows: when the Fourier-space beam $B_F$ multiplies another quantity, take the inverse Fourier transform of the quantity, multiply by the pixel-space beam $B_P$, and then Fourier transform the result.
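The recipe in the last sentence can be sketched in Python with NumPy's FFT routines. This is a one-dimensional illustration only, assuming a diagonal pixel-space beam stored as a vector; the Gaussian beam profile, the array size and the function name `apply_fourier_beam` are invented for the example. The dense matrix $F B_P F^{-1}$ is built solely to check the fast version.

```python
import numpy as np

def apply_fourier_beam(beam_pix, v):
    """Apply B_F = F B_P F^-1 to a Fourier-space vector v, for a diagonal B_P."""
    return np.fft.fft(beam_pix * np.fft.ifft(v))

n = 16                                                          # hypothetical number of pixels
rng = np.random.default_rng(0)
beam_pix = np.exp(-0.5 * ((np.arange(n) - n / 2) / 3.0) ** 2)   # toy Gaussian pixel beam
v = rng.normal(size=n) + 1j * rng.normal(size=n)                # toy Fourier-space vector

F = np.fft.fft(np.eye(n))                    # dense DFT matrix, for checking only
B_F = F @ np.diag(beam_pix) @ np.linalg.inv(F)
fast = apply_fourier_beam(beam_pix, v)       # O(n log n)
dense = B_F @ v                              # O(n^2), same result
```

The fast version never forms $B_F$ explicitly, which is the source of the saving in computation time.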
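This representation changes the quantities in §9.4.2 above. For a unitary Fourier transform, $F^\dagger = F^{-1}$, the signal covariance matrix becomes
\[ S = B_F^\dagger S_D B_F \tag{9.90} \]
\[ = F B_P^\dagger F^{-1} S_D F B_P F^{-1} \tag{9.91} \]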
This can also be written as
\[ S = \left( B_F^\dagger S_D^{1/2} \right) \left( S_D^{1/2} B_F \right) \tag{9.92} \]
and since the two bracketed quantities are Hermitian conjugates of each other, we can take
\[ S^{1/2} = B_F^\dagger S_D^{1/2} \tag{9.93} \]
We can then replace $S^{1/2}$ and $S$ in all the equations in §9.4.2 above to include the beam.
The advantage of using this representation of the instrument becomes apparent when we look at the modified posterior density
\[ -\ln P(C_\ell, s|d) = \frac{1}{2} (d - B_F s)^\dagger N^{-1} (d - B_F s) + \frac{1}{2} s^\dagger S_D^{-1} s + \frac{1}{2} \ln |S_D| \tag{9.94} \]
Notice that in the second term, there is no factor that depends on the beam. This leads to the possibility of combining two or more datasets from different instruments for a joint analysis via Gibbs sampling. For two datasets $d_1$ and $d_2$ with noise covariances $N_1$ and $N_2$, the posterior density becomes
\[ -\ln P(C_\ell, s|d_1, d_2) = \frac{1}{2} (d_1 - B_{F_1} s)^\dagger N_1^{-1} (d_1 - B_{F_1} s) + \frac{1}{2} (d_2 - B_{F_2} s)^\dagger N_2^{-1} (d_2 - B_{F_2} s) + \frac{1}{2} s^\dagger S_D^{-1} s + \frac{1}{2} \ln |S_D| \tag{9.95} \]
where $B_{F_1}$ and $B_{F_2}$ are the Fourier-space beams of the two instruments.
9.5 Application to simulated data
We simulated a 7°×7° patch of the sky with CMB signal. The simulated map is shown in fig. (9.2). Histograms of the recovered values of $C_\ell$ are shown in fig. (9.10).
9.5.1 Gelman-Rubin Test
In order to test the convergence of the Gibbs sampling setup, we perform the Gelman-Rubin test [9, 10] in the following way.
1. Let $n$ be the number of samples in each run of the Gibbs sampling algorithm. The parameters $\theta$ we wish to estimate are, in our case, the band powers, one per bin.
2. Run the Gibbs sampling algorithm with many different initial values. These should be sampled from a wide range. Effectively, run $m$ “chains” of the Gibbs sampling algorithm, where $m$ is the number of different initial values.
3. Compare the “within-chain” and “between-chain” variances. These should be approximately equal if the chains have converged.
The last step is completed by calculating the following quantities, where $\theta_{ij}$ is the $i$th sample of chain $j$, $\bar{\theta}_j$ is the mean of chain $j$, and $\bar{\theta}$ is the mean over all chains:
1. “Within-chain” variance
\[ W = \frac{1}{m (n - 1)} \sum_{j=1}^{m} \sum_{i=1}^{n} \left( \theta_{ij} - \bar{\theta}_j \right)^2 \tag{9.96} \]
2. “Between-chain” variance
\[ B = \frac{n}{m - 1} \sum_{j=1}^{m} \left( \bar{\theta}_j - \bar{\theta} \right)^2 \tag{9.97} \]
3. Estimated variance
\[ \hat{V}(\theta) = \left( 1 - \frac{1}{n} \right) W + \frac{1}{n} B \tag{9.98} \]
4. The Gelman-Rubin statistic
\[ \sqrt{R} \equiv \sqrt{\frac{\hat{V}(\theta)}{W}} = \sqrt{ \left( 1 - \frac{1}{n} \right) + \frac{1}{n} \left| B W^{-1} \right| } \tag{9.99} \]
where $| \cdot |$ indicates the trace.
The Gelman-Rubin statistic was evaluated to be ∼0.999951, sufficiently close to 1 to indicate convergence.
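For a single parameter, eqs. (9.96) to (9.99) reduce to a few lines of Python. The sketch below is illustrative only: the chains are synthetic Gaussian draws rather than Gibbs output, and the function name `gelman_rubin` and the offsets used to mimic non-converged chains are assumptions.

```python
import numpy as np

def gelman_rubin(chains):
    """Gelman-Rubin statistic sqrt(R) for an array of shape (m chains, n samples)."""
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    W = chains.var(axis=1, ddof=1).mean()       # within-chain variance, eq. (9.96)
    B = n * chain_means.var(ddof=1)             # between-chain variance, eq. (9.97)
    V = (1.0 - 1.0 / n) * W + B / n             # estimated variance, eq. (9.98)
    return np.sqrt(V / W)                       # eq. (9.99), scalar case

rng = np.random.default_rng(0)
converged = rng.normal(0.0, 1.0, size=(4, 5000))               # four well-mixed toy chains
stuck = converged + np.array([[0.0], [2.0], [4.0], [6.0]])     # chains centred at different values
r_good = gelman_rubin(converged)
r_bad = gelman_rubin(stuck)
```

Chains that explore the same distribution give a value very close to 1, while chains stuck around different values give a value well above 1.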
Figure 9.3: The power spectrum used for map simulation and the spectrum recovered from the
simulated map.
Bibliography
[1] M. Tegmark, “How to Make Maps from Cosmic Microwave Background Data without
Losing Information,” ApJ Lett., vol. 480, pp. L87+, May 1997.
[2] S. Dodelson, Modern Cosmology. Amsterdam (Netherlands): Academic Press, ISBN 0-12-219141-2,
2003, XIII + 440 p.
[3] B. D. Wandelt, D. L. Larson, and A. Lakshminarayanan, “Global, exact cosmic microwave
background data analysis using Gibbs sampling,” Phys. Rev. D, vol. 70, no. 8, pp. 083511–
+, Oct. 2004.
[4] B. D. Wandelt, “MAGIC: Exact Bayesian Covariance Estimation and Signal Reconstruc-
tion for Gaussian Random Fields,” ArXiv Astrophysics e-prints, Jan. 2004.
[5] J. R. Bond, A. H. Jaffe, and L. Knox, “Estimating the power spectrum of the cosmic
microwave background,” Phys. Rev. D, vol. 57, pp. 2117–2137, Feb. 1998.
[6] M. P. Hobson and K. Maisinger, “Maximum-likelihood estimation of the cosmic microwave
background power spectrum from interferometer observations,” Monthly Notices of the
RAS, vol. 334, pp. 569–588, Aug. 2002.
[7] B. Wandelt and S.S. Malu, “Gibbs’ sampling for Interferometry,” Work in progress, 2007.
[8] Tanner, Tools for Statistical Inference: Methods for the Exploration of Posterior Distribu-
tions and Likelihood Functions, Springer Verlag, Heidelberg, Germany., 1996.
[9] G. Huey, R. H. Cyburt, and B. D. Wandelt, “Precision primordial 4He measurement from
the CMB,” Phys. Rev. D, vol. 69, no. 10, pp. 103503–+, May 2004.
[10] A. Gelman and D. B. Rubin, “Inference from iterative simulation using multiple
sequences,” Statistical Science, vol. 7, no. 4, pp. 457–472, Nov. 1992.
Chapter 10
Conclusions
In this thesis, we have introduced a novel interferometer, the Millimeter-wave Bolometric Interferometer, which combines the advantages of interferometry with the sensitivity of bolometers. Furthermore, MBI has a novel quasi-optical beam-combination arrangement that will allow it to function simultaneously as an interferometer and an imager. In addition, an efficient statistical technique already employed by Wandelt et al. [1, 2] for imagers was adapted for an interferometer. Several instrumental aspects were also studied in chapter 5, and two different measurement/instrument optimization techniques were explored (§7.7.1, §7.6, §7.7). These techniques will provide MBI with the sensitivity required to place upper limits on B-mode levels. But MBI-4 has just six baselines and a 7° beam; these parameters imply large pixels in u-v space. Future versions of MBI and a planned space-based version [3] will have larger beams (∼15°), leading to smaller pixels in the u-v plane. The space-based version (called EPIC) is also intended to have many more detectors in the focal plane of its Fizeau system. EPIC is also planned to have several “units”, each in a different frequency band and each with several dozen antennae, providing both the ℓ-space coverage and a means to characterize foregrounds.
This thesis has thus introduced a new kind of instrument with exquisite control over systematic effects, brought about through instrument optimization. In particular, it can observe simultaneously as an imager and an interferometer. Power spectra can thus be computed with both approaches and compared.
But comparing power spectra is not all. MBI can make images of those parts of the sky where foregrounds are known to dominate the CMB signal. The same information can simultaneously be obtained in the u-v plane, split into several sub-bands. This provides us with the ability to perform a unique comparison of foregrounds and systematic effects in the image plane and the u-v plane. If we add the advantage of several bands to this instrument, we gain the ability to characterize foregrounds as well as detect the faint B-mode signal. This is a unique ability, not yet achieved by an experiment in CMB cosmology. It eliminates the need to cross-correlate data from several experiments to eliminate foregrounds, thereby allowing greater control over instrument systematics. In this sense, the MBI is a complete instrument, supported by the analysis and simulation techniques developed in this thesis.
Bibliography
[1] B. D. Wandelt, D. L. Larson, and A. Lakshminarayanan, “Global, exact cosmic microwave
background data analysis using Gibbs sampling,” Phys. Rev. D, vol. 70, no. 8, pp. 083511–+,
Oct. 2004.
[2] B. D. Wandelt, “MAGIC: Exact Bayesian Covariance Estimation and Signal Reconstruction
for Gaussian Random Fields,” ArXiv Astrophysics e-prints, Jan. 2004.
[3] P. T. Timbie, G. S. Tucker, P. A. R. Ade, S. Ali, E. Bierman, E. F. Bunn, C. Calderon,
A. C. Gault, P. O. Hyland, B. G. Keating, J. Kim, A. Korotkov, S. S. Malu, P. Mauskopf,
J. A. Murphy, C. O’Sullivan, L. Piccirillo, and B. D. Wandelt, “The Einstein polarization
interferometer for cosmology (EPIC) and the millimeter-wave bolometric interferometer
(MBI),” New Astronomy Review, vol. 50, pp. 999–1008, Dec. 2006.
Appendix A
Dr. Planck, or: How I Learned to Stop Worrying
and Love Stat Mech.
A.1 The general problem
We wish to figure out how particles are distributed among N states, ni, each with energy
En - this is the general problem that Statistical Mechanics addresses. In these notes we will
restrict ourselves to photons.
In general, the distribution function $U(E)$ is given by $U(E)\,dE$ = number of available states × average energy per state. The number of available states is usually referred to in the literature as the ‘phase factor’. We must, then, deal with two separate calculations.
A.2 Average Energy
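For a photon, $E_\gamma = h\nu$. In general, the $n$th state will have energy $E_n = n h\nu$. To find the average energy per state, we sum over the entire distribution, weighting each state with a factor given by the distribution function, in this case the Boltzmann factor $e^{-\beta E_n}$, where $\beta = 1/kT$. The average energy then is
\[ \bar{E} = \frac{\sum_n E_n e^{-\beta E_n}}{\sum_n e^{-\beta E_n}} \tag{A.1} \]
In Statistical Mechanics, the sum $\sum_n e^{-\beta E_n}$ is called the partition function, and all physical quantities like (average) energy, entropy etc. can easily be related to it. Below is a simple example of such a relation that is useful to us. Call $Z$ the partition function, so that $Z = \sum_n e^{-\beta E_n}$. Then differentiate $Z$ with respect to $\beta$ to get:
\[ \frac{\partial Z}{\partial \beta} = -\sum_n E_n e^{-\beta E_n} \tag{A.2} \]
We immediately see that this is very close to the numerator of the expression for $\bar{E}$ above; all we need to do is to divide it by $\sum_n e^{-\beta E_n}$, which is exactly $Z$. So we get, adding a minus sign,
\[ \bar{E} = -\frac{1}{Z} \frac{\partial Z}{\partial \beta} \equiv -\frac{\partial \ln Z}{\partial \beta} \tag{A.3} \]
Now, for photons,
\[ Z = \sum_n e^{-\beta E_n} = \sum_{n=0}^{\infty} e^{-\beta n h\nu} \tag{A.4} \]
Recall now that, for $|r| < 1$,
\[ \sum_{n=0}^{\infty} r^n = \frac{1}{1 - r} \tag{A.5} \]
Applying that here with $r \equiv e^{-\beta h\nu}$, we get
\[ Z = \frac{1}{1 - e^{-\beta h\nu}} \tag{A.6} \]
\[ \Rightarrow \ln Z = -\ln \left| 1 - e^{-\beta h\nu} \right| \tag{A.7} \]
\[ \Rightarrow \bar{E} = -\frac{\partial \ln Z}{\partial \beta} = +\frac{1}{1 - e^{-\beta h\nu}} \cdot \left( -e^{-\beta h\nu} \right) \cdot \left( -h\nu \right) = \frac{h\nu}{\left( 1 - e^{-\beta h\nu} \right) e^{\beta h\nu}} = \frac{h\nu}{e^{\beta h\nu} - 1} \tag{A.8} \]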
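Eq. (A.8) can be verified numerically by evaluating the sums in eq. (A.1) directly. The following Python sketch works in units of $h\nu$ and uses an arbitrary illustrative value of $\beta h\nu$; the truncation of the sums at `n_max` and the function names are assumptions of the sketch.

```python
import numpy as np

def mean_energy_closed_form(beta_h_nu):
    """Mean energy in units of h*nu from eq. (A.8): 1 / (exp(beta h nu) - 1)."""
    return 1.0 / np.expm1(beta_h_nu)

def mean_energy_direct(beta_h_nu, n_max=2000):
    """Evaluate eq. (A.1) with E_n = n h nu, truncating the sums at n_max."""
    n = np.arange(n_max + 1)
    weights = np.exp(-beta_h_nu * n)            # Boltzmann factors
    return np.sum(n * weights) / np.sum(weights)

x = 0.5                                         # hypothetical value of beta*h*nu
closed = mean_energy_closed_form(x)
direct = mean_energy_direct(x)
```

The two values agree to machine precision as long as `n_max` is large enough for the neglected terms to be negligible.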
A.3 Number of phase states available, or phase factor
The Uncertainty Principle decides the ‘minimum size’ of a ‘phase cell’ because
\[ dx \, dp \geq h \tag{A.9} \]
\[ \Rightarrow d^3x \, d^3p \geq h^3 \tag{A.10} \]
Therefore, given a certain phase-space volume $d^3x \, d^3p$, the number of ‘cells’ available in that phase space is
\[ \zeta = \frac{d^3x \, d^3p}{h^3} \tag{A.11} \]
Writing the volume in real space as $V$, we get
\[ \zeta = \frac{V \, d^3p}{h^3} = \frac{4\pi p^2 \, dp \cdot V}{h^3} \tag{A.12} \]
For photons, $E = pc$ from Special Relativity $\Rightarrow p = E/c = h\nu/c$
\[ \Rightarrow \zeta = 4\pi \frac{h^2 \nu^2}{c^2} \cdot \frac{h \, d\nu}{c} \cdot \frac{1}{h^3} \cdot V = \frac{4\pi}{c^3} \nu^2 \, d\nu \cdot V \tag{A.13} \]
Now, we account for the fact that there are two independent polarization states possible for any given direction of propagation. The above expression becomes
\[ \zeta = \frac{8\pi}{c^3} \nu^2 \, d\nu \cdot V \tag{A.14} \]
A.4 Planck Distribution
The Planck distribution can then be written as the product of the two factors in the two sections above:
\[ U(\nu) \, d\nu = \frac{8\pi}{c^3} \nu^2 \cdot \frac{h\nu}{e^{\beta h\nu} - 1} \, d\nu \, V \tag{A.15} \]
We can express the LHS in terms of the energy density instead of energy, so that
\[ u(\nu) \, d\nu = \frac{8\pi}{c^3} \nu^2 \cdot \frac{h\nu}{e^{\beta h\nu} - 1} \, d\nu \tag{A.16} \]
This is the expression we wanted.
What we really need in calculations for the MBI, though, are integrals of this function multiplied by other functions, like the interference pattern on the bolometers. While doing an integral, it is always very convenient to separate out dimensionless quantities that we integrate over from factors that depend on the physics of the situation. In this case we define a dimensionless quantity $x = h\nu/kT$ and present this distribution in terms of $x$ thus:
\[ u(\nu) \, d\nu = \frac{8\pi}{c^3} \left( \frac{h\nu}{kT} \right)^2 \cdot \left( \frac{kT}{h} \right)^2 \cdot \left( \frac{h\nu}{kT} \right) \cdot kT \left( \frac{1}{e^{h\nu/kT} - 1} \right) d\left( \frac{h\nu}{kT} \right) \left( \frac{kT}{h} \right) \tag{A.17} \]
Replacing $h\nu/kT$ by $x$, and collecting all other factors together, we get
\[ u(\nu) \, d\nu = \frac{8\pi h}{c^3} \left( \frac{kT}{h} \right)^4 \frac{x^3}{e^x - 1} \, dx \tag{A.18} \]
where $u$ has units $\mathrm{J\,Hz^{-1}\,m^{-3}}$, i.e. energy per unit volume per unit frequency interval. But the energy radiated out in all directions is the same. Moreover, we are interested in the spectral intensity, i.e. power per unit area per unit solid angle per unit frequency interval. We denote the spectral intensity by $I$, and this is how $I$ is related to $u$:
\[ I(\nu, T) = \frac{u(\nu, T) \, c}{4\pi} \tag{A.19} \]
such that
\[ I(\nu, T) = \frac{2h}{c^2} \left( \frac{kT}{h} \right)^4 \frac{x^3}{e^x - 1} \, dx \tag{A.20} \]
We can also write this as
\[ I(\nu, T) = B_\nu(T) \, d\nu \tag{A.21} \]
for a black body, where
\[ B_\nu = \frac{2h\nu^3}{c^2} \left( e^x - 1 \right)^{-1} \quad \left[ \mathrm{W\,m^{-2}\,sr^{-1}\,Hz^{-1}} \right] \tag{A.22} \]
so that
\[ I(\nu, T) = \frac{2h\nu^3}{c^2} \left( e^x - 1 \right)^{-1} d\nu \quad \left[ \mathrm{W\,m^{-2}\,sr^{-1}} \right] \tag{A.23} \]
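However, if the signal varies across the sky, then $B_\nu$ is a function of $(\theta, \phi)$. Thus, if we write
\[ T = T_0 + \delta T(\theta, \phi) \tag{A.24} \]
for the CMB, then, to first order in $\delta T$,
\[ B_\nu(T_0 + \delta T) = B_\nu(T_0) + \frac{\partial B}{\partial T} \delta T \equiv B_\nu(T_0) + \Delta B \tag{A.25} \]
If we are studying the anisotropies in the CMB, we are interested only in the second term, whose coefficient we can write as
\[ \frac{\partial B}{\partial T} = \frac{\partial B}{\partial x} \frac{\partial x}{\partial \left( \frac{1}{T} \right)} \frac{\partial \left( \frac{1}{T} \right)}{\partial T} = \frac{2k\nu^2}{c^2} \frac{x^2 e^x}{\left( e^x - 1 \right)^2} \tag{A.26} \]
so that
\[ \Delta B(\theta, \phi) = \frac{2k}{\lambda^2} \left( \frac{x^2 e^x}{\left( e^x - 1 \right)^2} \right) \delta T(\theta, \phi) \tag{A.27} \]
Multiply and divide by $h/kT_0$ (in order to end up with $dx$ instead of $d\nu$ in the expression for intensity, $I$):
\[ \Delta B(\theta, \phi) = \frac{2k^2 T_0}{h\lambda^2} \left( \frac{x^2 e^x}{\left( e^x - 1 \right)^2} \right) \delta T(\theta, \phi) \left( \frac{h}{kT_0} \right) \quad \left[ \mathrm{W\,m^{-2}\,sr^{-1}\,Hz^{-1}} \right] \tag{A.28} \]
Now, because of the extra factor of $h/kT_0$ at the end, we can write the intensity as
\[ \Delta I(\theta, \phi) = \frac{2k^2 T_0}{h\lambda^2} \left( \frac{x^2 e^x}{\left( e^x - 1 \right)^2} \right) \delta T(\theta, \phi) \, dx \quad \left[ \mathrm{W\,m^{-2}\,sr^{-1}} \right] \tag{A.29} \]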
This is the quantity that we wish to calculate in the instrument simulation, given a simulated
map of the sky.
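As a cross-check of the temperature-to-intensity conversion, the Python sketch below compares the analytic derivative of eq. (A.26) with a finite-difference derivative of the Planck function of eq. (A.22). The observing frequency of 90 GHz and the mean temperature $T_0 = 2.725$ K are illustrative assumptions, and the function names are hypothetical.

```python
import numpy as np

H = 6.62607015e-34      # Planck constant, J s
K = 1.380649e-23        # Boltzmann constant, J / K
C = 2.99792458e8        # speed of light, m / s

def planck(nu, T):
    """Planck function B_nu(T) of eq. (A.22), in W m^-2 sr^-1 Hz^-1."""
    x = H * nu / (K * T)
    return 2.0 * H * nu**3 / C**2 / np.expm1(x)

def dB_dT(nu, T):
    """Analytic derivative of B_nu with respect to T, eq. (A.26)."""
    x = H * nu / (K * T)
    return 2.0 * K * nu**2 / C**2 * x**2 * np.exp(x) / np.expm1(x) ** 2

nu, T0 = 90e9, 2.725                            # assumed frequency (Hz) and temperature (K)
analytic = dB_dT(nu, T0)
eps = 1e-4                                      # small temperature step, K
numeric = (planck(nu, T0 + eps) - planck(nu, T0 - eps)) / (2.0 * eps)
```

Multiplying `analytic` by a simulated $\delta T(\theta, \phi)$ map gives $\Delta B(\theta, \phi)$ as in eq. (A.27).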
A.5 Distribution for particle number
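We can follow through §A.2 for the number of particles too. Similar to eq. (A.1), we can write
\[ \bar{n} = \frac{\sum_n n \, e^{-\beta E_n}}{\sum_n e^{-\beta E_n}} \tag{A.30} \]
Recall that $E_n = n h\nu$, so that
\[ \frac{1}{h\nu} \frac{\partial Z}{\partial \beta} = -\sum_n n \, e^{-\beta E_n} \tag{A.31} \]
and we can write
\[ \bar{n} = \frac{\bar{E}}{h\nu} = -\frac{1}{h\nu} \frac{1}{Z} \frac{\partial Z}{\partial \beta} \equiv -\frac{1}{h\nu} \frac{\partial \ln Z}{\partial \beta} = \frac{1}{e^{\beta h\nu} - 1} \tag{A.32} \]
Using the phase space factor from §A.3, we get that the number of particles per unit volume between $\nu$ and $\nu + d\nu$, given by $n(\nu) \, d\nu$, is
\[ n(\nu) \, d\nu = \frac{8\pi}{c^3} \frac{\nu^2}{e^{\beta h\nu} - 1} \, d\nu \tag{A.33} \]
The author is forever indebted to the late Dr. Swaminathan for his expositions on Statistical Mechanics.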
Appendix B
S- and T-matrix formulation
B.1 Two port devices and the S-matrix
Most of the microwave devices we use in our lab are 2-port devices, and are usually used in series, e.g. a waveguide twist with a straight waveguide section. Any 2-port device has two possible inputs and two outputs. We label the inputs with a and the outputs with b.
For all practical purposes, we are interested not in the values of the outputs b, but in what they are compared to the inputs. In other words, we wish to look at the generic ratios
\[ \frac{\text{output}}{\text{input}} \tag{B.1} \]
for all four quantities.
The most simple-minded approach would be to define the four ratios $b_1/a_1$ etc., but we can write these out systematically as:
\[ b_1 = S_{11} a_1 + S_{12} a_2 \tag{B.2} \]
\[ b_2 = S_{21} a_1 + S_{22} a_2 \tag{B.3} \]
The above equations can also be written in vector form as:
\[ \vec{b} = S \cdot \vec{a} \tag{B.4} \]
or, better still, in matrix form, which is more useful for our purpose:
\[ \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = \begin{bmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} \tag{B.5} \]
[Figure: a box labelled “2-port device” with the quantities a1, b1, a2 and b2 marked at its ports]
Figure B.1: Schematic of the 2-port device
Note for the mathematically inclined (a.k.a nerdy): each of the four quantities b1, b2, a1,
and a2 is independent of all others, and so these are four linearly-independent quantities. This
purely mathematical fact deduced from common-sense will help us later.
This definition of the so-called S-matrix is good-enough for anyone involved in making
measurements, and the four S-parameters have the (by now obvious) meanings:
B.2 The need for a T-matrix
All this is fine for a single device, but what if there is a series of 2-port devices? Taking
the familiar example from our lab of many waveguide devices in series, we see immediately that
while we care about characterizing every single device, our eventual aim is not to slog away
tediously trying to figure out how the input from one device becomes the output of another,
but to figure out the effect of all the series devices at the same time.
Note, however, that this is not really possible with the S-matrix, since the inputs and
also the outputs are on both sides of the device. Therefore, we need to change into a system
where the inputs are both on the left (right) and the outputs on the right (left). The simplest
thing to do then would be to have this formalism worked out such that the net effect of all
devices would be:
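\[ \text{Net effect} = \text{device}_1 \times \text{device}_2 \times \text{device}_3 \times \ldots \times \text{device}_n \]
This is why we need the so-called T-matrix. Here is how the formalism is defined: instead of going from input to output (this is what the S-matrix does), we want to go from left to right. Recall that when we wanted to go from input to output, we changed from the vector
\[ \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} \quad \text{to} \quad \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} \]
Now look at fig. B.1. The two quantities on the left are $a_1$ and $b_1$, and the two quantities on the right are $a_2$ and $b_2$. So, very naively, we wish to go from
\[ \begin{bmatrix} a_1 \\ b_1 \end{bmatrix} \quad \text{to} \quad \begin{bmatrix} a_2 \\ b_2 \end{bmatrix} \]
And so we need a new matrix to go from the left-vector to the right-vector. Schematically, we can write this as:
\[ \vec{\text{Right}} = T \cdot \vec{\text{Left}} \tag{B.6} \]
or, a little more clearly, as:
\[ \begin{bmatrix} a_2 \\ b_2 \end{bmatrix} = \begin{bmatrix} T_{11} & T_{12} \\ T_{21} & T_{22} \end{bmatrix} \begin{bmatrix} a_1 \\ b_1 \end{bmatrix} \tag{B.7} \]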
B.3 Conversion between S- and T-matrix
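When we make measurements of a device, it makes sense to think in terms of S-parameters, especially since those are what all network analyzers output. So, we need to figure out a way to change from S-parameters to T-parameters and back. Let's try to figure out the former first. Essentially, there are four equations we need to work with for the four T-parameters:
\[ b_1 = S_{11} a_1 + S_{12} a_2 \tag{B.8} \]
\[ b_2 = S_{21} a_1 + S_{22} a_2 \tag{B.9} \]
\[ a_2 = T_{11} a_1 + T_{12} b_1 \tag{B.10} \]
\[ b_2 = T_{21} a_1 + T_{22} b_1 \tag{B.11} \]
Equations (B.9) and (B.11), with $b_1$ substituted from (B.8), imply
\[ S_{21} a_1 + S_{22} a_2 = T_{21} a_1 + T_{22} \left( S_{11} a_1 + S_{12} a_2 \right) \tag{B.12} \]
Similarly, equations (B.8) and (B.10) imply
\[ a_2 = T_{11} a_1 + T_{12} \left( S_{11} a_1 + S_{12} a_2 \right) \tag{B.13} \]
Expanding equation (B.12) gives
\[ S_{21} a_1 + S_{22} a_2 = T_{21} a_1 + T_{22} S_{11} a_1 + T_{22} S_{12} a_2 \tag{B.14} \]
or, grouping terms with $a_1$ and $a_2$ separately:
\[ a_1 \left[ S_{21} - T_{21} - T_{22} S_{11} \right] + a_2 \left[ S_{22} - T_{22} S_{12} \right] = 0 \tag{B.15} \]
Each of the two brackets must vanish separately, since $a_1$ and $a_2$ are independent, so that the second bracket yields
\[ S_{22} - T_{22} S_{12} = 0 \Rightarrow T_{22} = \frac{S_{22}}{S_{12}} \tag{B.16} \]
Now substitute this value of $T_{22}$ into the equation we get from the first bracket:
\[ S_{21} - T_{21} - T_{22} S_{11} = 0 \Rightarrow T_{21} = S_{21} - T_{22} S_{11} \Rightarrow T_{21} = \frac{S_{12} S_{21} - S_{22} S_{11}}{S_{12}} \tag{B.17} \]
Now look at equation (B.13), which reads
\[ a_2 = T_{11} a_1 + T_{12} S_{11} a_1 + T_{12} S_{12} a_2 \tag{B.18} \]
As above, we collect terms with $a_1$ and $a_2$:
\[ a_2 \left[ 1 - T_{12} S_{12} \right] - a_1 \left[ T_{11} + T_{12} S_{11} \right] = 0 \tag{B.19} \]
Again, using the linear independence of $a_1$ and $a_2$, we get from the first bracket
\[ T_{12} = \frac{1}{S_{12}} \tag{B.20} \]
and from the second bracket, using this value of $T_{12}$,
\[ T_{11} + \frac{S_{11}}{S_{12}} = 0 \Rightarrow T_{11} = -\frac{S_{11}}{S_{12}} \tag{B.21} \]
We can now write out our T-matrix in terms of elements of the S-matrix thus:
\[ \begin{bmatrix} T_{11} & T_{12} \\ T_{21} & T_{22} \end{bmatrix} = \begin{bmatrix} -\dfrac{S_{11}}{S_{12}} & \dfrac{1}{S_{12}} \\ \dfrac{S_{12} S_{21} - S_{22} S_{11}}{S_{12}} & \dfrac{S_{22}}{S_{12}} \end{bmatrix} \tag{B.22} \]
It turns out that we can manipulate the same equations to express the S-matrix in terms of the T-matrix thus:
\[ \begin{bmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{bmatrix} = \begin{bmatrix} -\dfrac{T_{11}}{T_{12}} & \dfrac{1}{T_{12}} \\ \dfrac{T_{12} T_{21} - T_{22} T_{11}}{T_{12}} & \dfrac{T_{22}}{T_{12}} \end{bmatrix} \tag{B.23} \]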
Another note for the vector-space inclined: what we have done essentially is changed
from the Input-Output basis to the Right-Left basis, and found the corresponding change in
the transformation matrix.
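The two conversions can be checked numerically. The following is a minimal sketch, assuming the T-matrix convention of equations B.10 and B.11; the function names and the example S-parameter values are hypothetical and chosen only for illustration.

```python
def s_to_t(s):
    """Convert a 2x2 S-matrix to a T-matrix using equation B.22.

    `s` is ((S11, S12), (S21, S22)); entries may be complex.
    """
    (s11, s12), (s21, s22) = s
    if s12 == 0:
        raise ZeroDivisionError("S12 must be non-zero for the conversion")
    return (
        (-s11 / s12, 1 / s12),
        ((s12 * s21 - s22 * s11) / s12, s22 / s12),
    )


def t_to_s(t):
    """Convert a 2x2 T-matrix back to an S-matrix using equation B.23."""
    (t11, t12), (t21, t22) = t
    if t12 == 0:
        raise ZeroDivisionError("T12 must be non-zero for the conversion")
    return (
        (-t11 / t12, 1 / t12),
        ((t12 * t21 - t22 * t11) / t12, t22 / t12),
    )


# Hypothetical S-parameters for a slightly mismatched, slightly lossy two-port.
s_example = ((0.1 + 0.05j, 0.9 - 0.1j), (0.88 - 0.12j, 0.08 + 0.02j))
t_example = s_to_t(s_example)
s_roundtrip = t_to_s(t_example)  # should reproduce s_example
```

Converting to T and back should return the original S-matrix to within rounding error, which is a quick sanity check on measured network-analyzer data before cascading devices.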
Appendix C
Relationship between ℓ and θ
CMB anisotropy is usually “broken-down” in spherical harmonics
\[ \frac{\Delta T}{T_0}(\theta, \phi) = \sum_{\ell} \sum_{m} a_{\ell m} Y_{\ell m}(\theta, \phi) \tag{C.1} \]
with the power spectrum
\[ C_\ell = \sum_{m} |a_{\ell m}|^2 \tag{C.2} \]
However, how are ‘θ’ (angular scale on the sky) and ℓ related? The physical intuition is that the
sky is divided into ℓ parts, so there must be an inverse relationship between the two. Looking
at how ‘θ’ is defined in spherical-polar co-ordinates, we notice that θ: 0 → π. We are then
essentially dividing this angle into ‘ℓ’ parts, so that
\[ \theta = \frac{\pi}{\ell} \tag{C.3} \]
In reality, this relation is approximate and holds only for small-enough (∼ 5°–10°) angles.
What is the most general relationship between θ and ℓ?
To answer this question, we first have to make sense of scales on a non-flat geometry; since
for us, relative scales make sense when represented in a flat geometry. So let us represent our
sphere on a flat sheet of paper - the only way to do this is via the Stereographic projection
(figure C.1).
We place a sheet of paper so that it forms a tangent and draw a line from point ‘A’
through the point ‘P’ (at an angle θ to the origin ‘O’) onto the sheet. ‘BC’ is then the length
we wish to calculate. We can write
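\[ \tan\frac{\theta}{2} = \frac{2R}{r} \tag{C.4} \]
\[ \Rightarrow r = 2R \cot\frac{\theta}{2} \tag{C.5} \]
\[ \Rightarrow r = 2 \cot\frac{\theta}{2} \tag{C.6} \]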
for a unit circle (R = 1).
The projected length r is proportional to ℓ, and this is the relationship between θ and ℓ.
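As a rough numerical illustration of equations C.3 and C.5, the sketch below evaluates both relations. The function names are hypothetical. The small-angle comparison uses cot(θ/2) ≈ 2/θ, so that r ≈ 4R/θ = 4Rℓ/π, which is one way to see the proportionality between r and ℓ.

```python
import math


def theta_from_ell(ell):
    """Approximate angular scale in radians, theta = pi / ell (equation C.3)."""
    if ell <= 0:
        raise ValueError("ell must be positive")
    return math.pi / ell


def stereographic_r(theta, radius=1.0):
    """Projected length r = 2 R cot(theta / 2) (equation C.5)."""
    return 2.0 * radius / math.tan(theta / 2.0)


# ell = 180 corresponds to an angular scale of about one degree.
theta_deg = math.degrees(theta_from_ell(180))

# For small angles, r is close to 4 * ell / pi on the unit sphere.
ell = 200
r_exact = stereographic_r(theta_from_ell(ell))
r_small_angle = 4.0 * ell / math.pi
```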
Figure C.1: Stereographic projection (figure labels: R, r, P, B, C, O, A)
Appendix D
Inflaton field equation of motion and slow-roll conditions
This is my attempt to heuristically ‘derive’ the inflaton-field equation of motion (which is fairly
straightforward) and the slow-roll conditions, which every set of notes or book defines or quotes in
its own way. Fed up with the lack of consensus in the literature, I follow the convention
that makes most sense to me. Usual health warnings apply: this is my attempt at understanding
these topics, so I make no claim about these notes being right.
D.1 The equation of motion
Let us start from the first law of thermodynamics:
\[ dU + p\,dV = 0 \tag{D.1} \]
where, naturally, $U = \rho a^3$ and the volume is $V = a^3$. Here ρ is the density (total energy
density, but since we are in the inflation era, ρ is dominated by the energy density of the
inflaton field φ) and a is the scale factor, which is a function of time.
Substituting for U and V, we get
\[ a^3\,d\rho + 3a^2 \rho\,da + 3a^2 p\,da = 0 \tag{D.2} \]
\[ \Rightarrow 3a^2 (p + \rho)\,da = -a^3\,d\rho \tag{D.3} \]
Now, for a field, we have the following expressions for pressure and energy density:
\[ \rho = KE + PE = \frac{1}{2}\left(\frac{d\phi}{dt}\right)^2 + V(\phi) \tag{D.4} \]
\[ p = KE - PE = \frac{1}{2}\left(\frac{d\phi}{dt}\right)^2 - V(\phi) \tag{D.5} \]
where the second equation can be derived from the general expression for the energy-momentum
tensor. See, for instance, [1]. From these two expressions, we get
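\[ p + \rho = \left(\frac{d\phi}{dt}\right)^2 \tag{D.6} \]
which we can now substitute in eq. D.3 to get:
\[ 3a^2 \left(\frac{d\phi}{dt}\right)^2 da = -a^3\,d\rho \tag{D.7} \]
\[ \Rightarrow \frac{d\rho}{da} = -\frac{3}{a}\left(\frac{d\phi}{dt}\right)^2 \tag{D.8} \]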
Now, if we differentiate the expression for ρ w.r.t. time, we get
\[ \frac{d\rho}{dt} = \frac{d\phi}{dt}\frac{d^2\phi}{dt^2} + \frac{dV}{d\phi}\frac{d\phi}{dt} \tag{D.9} \]
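But $\frac{d\rho}{dt} = \frac{d\rho}{da}\frac{da}{dt}$, so the above equation becomes, after substituting for $\frac{d\rho}{da}$,
\[ -\frac{3}{a}\frac{da}{dt}\left(\frac{d\phi}{dt}\right)^2 = \frac{d\phi}{dt}\frac{d^2\phi}{dt^2} + \frac{dV}{d\phi}\frac{d\phi}{dt} \tag{D.10} \]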
After cancelling $\frac{d\phi}{dt}$ from every term,
\[ -3\,\frac{1}{a}\frac{da}{dt}\frac{d\phi}{dt} = \frac{d^2\phi}{dt^2} + \frac{dV}{d\phi} \tag{D.11} \]
But $\frac{1}{a}\frac{da}{dt}$ is H, the Hubble parameter, so that
\[ \frac{d^2\phi}{dt^2} + 3H\frac{d\phi}{dt} + \frac{dV}{d\phi} = 0 \tag{D.12} \]
This, then, is the equation of motion.
Let us now slip into a more comfortable notation: $\frac{d\phi}{dt} \equiv \dot{\phi}$ and $\frac{dV}{d\phi} \equiv V'$. The equation of
motion is
\[ \ddot{\phi} + 3H\dot{\phi} + V' = 0 \tag{D.13} \]
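To see how the equation of motion behaves, it can be integrated numerically. The sketch below is only an illustration under stated assumptions: a quadratic potential V = m²φ²/2 with m = 1, units in which 8πG = 1, the Friedmann relation H² = (8πG/3)ρ to supply H, and a hand-written fourth-order Runge-Kutta step. None of these choices, nor the function names, come from the derivation above. The final velocity is compared with −V′/(3H), the value expected once the acceleration term has become negligible.

```python
import math


def potential(phi, m=1.0):
    """Assumed quadratic potential V = m^2 phi^2 / 2."""
    return 0.5 * m * m * phi * phi


def dpotential(phi, m=1.0):
    """Derivative V' = m^2 phi of the assumed potential."""
    return m * m * phi


def hubble(phi, phidot, m=1.0):
    """H from H^2 = rho / 3, i.e. the Friedmann equation with 8 pi G = 1."""
    rho = 0.5 * phidot * phidot + potential(phi, m)
    return math.sqrt(rho / 3.0)


def rhs(state, m=1.0):
    """Right-hand side of D.13 written as a first-order system."""
    phi, phidot = state
    return (phidot, -3.0 * hubble(phi, phidot, m) * phidot - dpotential(phi, m))


def rk4_step(state, dt, m=1.0):
    """Advance (phi, phidot) by one classical Runge-Kutta step."""
    k1 = rhs(state, m)
    k2 = rhs((state[0] + 0.5 * dt * k1[0], state[1] + 0.5 * dt * k1[1]), m)
    k3 = rhs((state[0] + 0.5 * dt * k2[0], state[1] + 0.5 * dt * k2[1]), m)
    k4 = rhs((state[0] + dt * k3[0], state[1] + dt * k3[1]), m)
    return (
        state[0] + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
        state[1] + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0,
    )


def evolve(phi0, phidot0, dt, n_steps, m=1.0):
    """Integrate the equation of motion for n_steps of size dt."""
    state = (phi0, phidot0)
    for _ in range(n_steps):
        state = rk4_step(state, dt, m)
    return state


# Start at rest, well up the potential, and evolve for two time units.
phi_end, phidot_end = evolve(phi0=15.0, phidot0=0.0, dt=1e-3, n_steps=2000)
friction_dominated_phidot = -dpotential(phi_end) / (3.0 * hubble(phi_end, phidot_end))
```

With these starting values the Hubble friction term 3Hφ̇ damps the initial transient quickly, after which the field velocity tracks the friction-dominated value closely.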
D.2 Slow-roll conditions
In order to sustain inflation for long enough to solve the horizon problem etc., we need
the inflaton field to move slowly. The two conditions can be written as follows:
1. Define slow:
\[ \frac{1}{2}\dot{\phi}^2 \ll V(\phi) \tag{D.14} \]
or, KE ≪ PE
2. Keep it slow:
\[ |\ddot{\phi}| \ll |3H\dot{\phi}| \tag{D.15} \]
The equation of motion then reduces to
\[ 3H\dot{\phi} + V' \simeq 0 \tag{D.16} \]
Aside: while sliding from one sordid equation to the other, remember that
\[ H^2 = \frac{8\pi G}{3}\rho \simeq \frac{8\pi G}{3}V(\phi) \tag{D.17} \]
which is ≃ constant during inflation. The approximate equation of motion means that
\[ \dot{\phi} \simeq -\frac{V'}{3H} \tag{D.18} \]
\[ \Rightarrow \frac{1}{2}\dot{\phi}^2 \simeq \frac{V'^2}{18H^2} \tag{D.19} \]
Substituting for $H^2$ from above,
\[ \frac{1}{2}\dot{\phi}^2 \simeq \frac{V'^2}{18 \cdot \frac{8\pi G}{3} V} = \frac{V'^2}{48\pi G V} \tag{D.20} \]
Applying the first slow roll condition, we get
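\[ \frac{V'^2}{48\pi G V} \ll V \tag{D.21} \]
or,
\[ \frac{1}{\sqrt{48\pi G}}\left|\frac{V'}{V}\right| \ll 1 \tag{D.22} \]
This leads us to define our first slow-roll parameter
\[ \epsilon \equiv \frac{1}{\sqrt{48\pi G}}\left|\frac{V'}{V}\right| \ll 1 \tag{D.23} \]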
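To work out the second slow-roll condition, differentiate equation D.18 w.r.t. time again
(treating H as constant) to get
\[ \ddot{\phi} \simeq -\frac{V''\dot{\phi}}{3H} \tag{D.24} \]
so that the second slow-roll condition is
\[ -\frac{V''\dot{\phi}}{3H} \ll 3H\dot{\phi} \Rightarrow \frac{V''}{9H^2} \ll 1 \tag{D.25} \]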
Substituting for $H^2$ again from eq. D.17
\[ \frac{V''}{24\pi G V} \ll 1 \tag{D.26} \]
This leads us to define our second slow-roll parameter
\[ \eta \equiv \frac{1}{24\pi G}\frac{V''}{V} \ll 1 \tag{D.27} \]
The use of η is unfortunate, since the same Greek letter is used in the literature to denote
conformal time. However, this is the convention followed in the literature.
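The two parameters are easy to evaluate for a given potential. The sketch below uses the definitions of equations D.23 and D.27 as written here (other references normalize them differently). The quadratic potential, the choice G = 1 and all names are assumptions made only for illustration.

```python
import math


def slow_roll_parameters(v, dv, d2v, phi, G=1.0):
    """Return (epsilon, eta) following equations D.23 and D.27.

    v, dv and d2v are callables giving V, V' and V'' at the field value phi.
    """
    v_phi = v(phi)
    epsilon = abs(dv(phi) / v_phi) / math.sqrt(48.0 * math.pi * G)
    eta = d2v(phi) / (24.0 * math.pi * G * v_phi)
    return epsilon, eta


M_SCALE = 1.0  # hypothetical mass scale of the assumed potential


def v_quadratic(phi):
    """Assumed potential V = m^2 phi^2 / 2."""
    return 0.5 * M_SCALE ** 2 * phi ** 2


def dv_quadratic(phi):
    """First derivative V' = m^2 phi."""
    return M_SCALE ** 2 * phi


def d2v_quadratic(phi):
    """Second derivative V'' = m^2."""
    return M_SCALE ** 2


# Field value in units where G = 1; both parameters come out well below 1.
eps, eta = slow_roll_parameters(v_quadratic, dv_quadratic, d2v_quadratic, phi=3.0)
```

For this potential the definitions reduce to ε = 2/(φ√(48πG)) and η = 1/(12πGφ²), so both conditions hold whenever φ is large enough.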
Appendix E
E-B decomposition
E.1 Stokes’ parameters
When dealing with temperature anisotropies, it is conventional to ignore polarization.
However, CMB photons are polarized, and we need to think about how to characterize
temperature and polarization at the same time. We need to consider them together
because our instruments are capable of measuring only intensities, and not the
amplitudes of radiation falling on them.
The most “common-sense” way to characterize polarization is to figure out the difference
between the intensities along the two rectangular-coordinate axes. This is referred to as Stokes
Q and its definition is easily extended to the case of circular polarization. All this is very well,
but how many independent quantities do we need to characterize the radiation field?
Consider this: detectors are sensitive to intensities, which scale as amplitude squared. We are
therefore dealing with two electric fields and their phases, i.e. four quantities. We therefore
need four quantities to completely characterize radiation. (Logic suspect).
But how do we represent these four quantities? In quantum mechanics, we represent
observables by hermitian matrices. The four quantities, then, should be written as a 2×2
hermitian matrix of observable quantities, two of which are intensity and Stokes Q.
In general, a 2×2 hermitian matrix can be written as
\[ \begin{bmatrix} a & b + ic \\ b - ic & d \end{bmatrix} \]
In our case, this matrix happens to be
\[ \begin{bmatrix} I + V & Q + iU \\ Q - iU & I - V \end{bmatrix} \]
where the four quantities are defined as
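\[ I = \langle E_x^2 \rangle + \langle E_y^2 \rangle \tag{E.1} \]
\[ Q = \langle E_x^2 \rangle - \langle E_y^2 \rangle \tag{E.2} \]
\[ U = \langle 2 E_x E_y \cos\delta \rangle \tag{E.3} \]
\[ V = \langle 2 E_x E_y \sin\delta \rangle \tag{E.4} \]
where δ is the phase difference between $E_x$ and $E_y$ and the unit vectors are $(\hat{e}_x, \hat{e}_y)$. The
definitions are very similar in the $(\hat{e}_\theta, \hat{e}_\phi)$ basis:
\[ I = \langle E_\theta^2 \rangle + \langle E_\phi^2 \rangle \tag{E.5} \]
\[ Q = \langle E_\theta^2 \rangle - \langle E_\phi^2 \rangle \tag{E.6} \]
\[ U = \langle 2 E_\theta E_\phi \cos\delta \rangle \tag{E.7} \]
\[ V = \langle 2 E_\theta E_\phi \sin\delta \rangle \tag{E.8} \]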
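The definitions can be tried out on a single coherent wave. In the sketch below the amplitudes and the phase difference are taken as constant, so the averaging brackets drop out; that simplification and the function name are assumptions made for illustration.

```python
import math


def stokes_parameters(ex, ey, delta):
    """Stokes I, Q, U, V for constant amplitudes ex, ey and phase difference delta.

    Follows equations E.1 to E.4 with the averages removed, which is valid
    only for a fully coherent wave with fixed amplitudes and phase.
    """
    i = ex ** 2 + ey ** 2
    q = ex ** 2 - ey ** 2
    u = 2.0 * ex * ey * math.cos(delta)
    v = 2.0 * ex * ey * math.sin(delta)
    return i, q, u, v


# Linear polarization along x: all of the intensity appears in Q.
linear_x = stokes_parameters(1.0, 0.0, 0.0)

# Equal amplitudes with a quarter-cycle phase difference: circular, so V carries it.
circular = stokes_parameters(1.0, 1.0, math.pi / 2)
```

For such a fully polarized wave the four numbers satisfy I² = Q² + U² + V², which is a convenient check on any implementation.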
When V = 0, as is the case with CMB (i.e. the CMB is not circularly polarized),
polarization of the CMB is completely characterized by Q and U . Here are the transformation
properties of Q and U :
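\[ \begin{pmatrix} Q' \\ U' \end{pmatrix} = \begin{pmatrix} \cos 2\psi & \sin 2\psi \\ -\sin 2\psi & \cos 2\psi \end{pmatrix} \begin{pmatrix} Q \\ U \end{pmatrix} \tag{E.9} \]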
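A short sketch of the transformation in equation E.9 makes the factor of 2ψ concrete. The function name is hypothetical.

```python
import math


def rotate_qu(q, u, psi):
    """Apply equation E.9: rotate the polarization basis by psi (radians)."""
    c = math.cos(2.0 * psi)
    s = math.sin(2.0 * psi)
    return q * c + u * s, -q * s + u * c


# A 45-degree rotation turns pure Q into pure U (with a sign change).
q_45, u_45 = rotate_qu(1.0, 0.0, math.pi / 4)

# A 180-degree rotation leaves Q and U unchanged.
q_180, u_180 = rotate_qu(0.3, -0.4, math.pi)
```

The combination Q² + U² is unchanged by any such rotation, since the matrix in E.9 is orthogonal.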
E.2 Relationship between E-B and Q-U
Looking at the expression Q + iU gives us the idea that they could be represented by
\[ Q\hat{x} + U\hat{y} \tag{E.10} \]
However, we must remember the transformation properties of Q and U stated above. These
imply that Q and U transform into each other after a rotation of 45°, and therefore in this
basis we cannot write the above expression. We can, however, if we define φ = 2ψ and work
in a basis / co-ordinate system where angles go from 0° to 180°.
We can now proceed with the math of E-B decomposition. In very simple terms, what
we want is to split both Q and U into a gradient component and a curl component. But that
is easily done, for any vector field can be written as a sum of the two. For an arbitrary vector
field A, we write
\[ \mathbf{A} = \nabla f + \nabla \times \mathbf{B} \equiv \mathbf{G} + \mathbf{C} \tag{E.11} \]
where f is a scalar field and B is a vector field, and G and C are the gradient and curl components
of A respectively. Using vector calculus identities, we get
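\[ \nabla \cdot \mathbf{A} = \nabla \cdot \mathbf{G} \tag{E.12} \]
\[ \nabla \times \mathbf{A} = \nabla \times \mathbf{C} \tag{E.13} \]
Substituting $Q\hat{x} + U\hat{y}$ for A, and E and B for G and C respectively, we get
\[ \nabla \cdot \mathbf{E} = \nabla \cdot (Q\hat{x} + U\hat{y}) \tag{E.14} \]
\[ \nabla \times \mathbf{B} = \nabla \times (Q\hat{x} + U\hat{y}) \tag{E.15} \]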
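We cannot really do very much else in real space, so let's take the Fourier transform of the first
equation, changing all derivatives to factors of l, and take the ∇ out of the integral:
\[ \nabla \cdot \int d^2x\, \mathbf{E}\, e^{i\mathbf{l}\cdot\mathbf{r}} = \nabla \cdot \int d^2x\, (Q\hat{x} + U\hat{y})\, e^{i\mathbf{l}\cdot\mathbf{r}} \tag{E.16} \]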
Remember the definition of ∇:
\[ \nabla = \frac{\partial}{\partial x}\hat{x} + \frac{\partial}{\partial y}\hat{y} \tag{E.17} \]
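and with $\mathbf{r} = x\hat{x} + y\hat{y}$ and $\mathbf{l} = l(\cos\phi\,\hat{x} + \sin\phi\,\hat{y})$, so that $\mathbf{l}\cdot\mathbf{r} = l(x\cos\phi + y\sin\phi)$, we get
\[ \int d^2x \left(\frac{\partial}{\partial x}\hat{x} + \frac{\partial}{\partial y}\hat{y}\right) \cdot \mathbf{E}\, e^{il(x\cos\phi + y\sin\phi)} = \int d^2x \left(\frac{\partial}{\partial x}\hat{x} + \frac{\partial}{\partial y}\hat{y}\right) \cdot (Q\hat{x} + U\hat{y})\, e^{il(x\cos\phi + y\sin\phi)} \tag{E.18} \]
But
\[ \left(\frac{\partial}{\partial x}\hat{x} + \frac{\partial}{\partial y}\hat{y}\right) e^{il(x\cos\phi + y\sin\phi)} = il\,(\cos\phi\,\hat{x} + \sin\phi\,\hat{y})\, e^{il(x\cos\phi + y\sin\phi)} \tag{E.19} \]
\[ \Rightarrow \cancel{i}\,\cancel{l} \int d^2x\, (\cos\phi\,\hat{x} + \sin\phi\,\hat{y}) \cdot \mathbf{E}\, e^{i\mathbf{l}\cdot\mathbf{r}} = \cancel{i}\,\cancel{l} \int d^2x\, (\cos\phi\,\hat{x} + \sin\phi\,\hat{y}) \cdot (Q\hat{x} + U\hat{y})\, e^{i\mathbf{l}\cdot\mathbf{r}} \tag{E.20} \]
\[ \Rightarrow \int d^2x\, E\, e^{i\mathbf{l}\cdot\mathbf{r}} = \int d^2x\, (Q\cos\phi + U\sin\phi)\, e^{i\mathbf{l}\cdot\mathbf{r}} \tag{E.21} \]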
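Denoting Fourier transforms like this: $\tilde{E}$, we get:
\[ \tilde{E} = \tilde{Q}\cos\phi + \tilde{U}\sin\phi \tag{E.22} \]
Now restore φ = 2ψ:
\[ \tilde{E} = \tilde{Q}\cos 2\psi + \tilde{U}\sin 2\psi \tag{E.23} \]
Similarly, for B, we get:
\[ \tilde{B} = -\tilde{Q}\sin 2\psi + \tilde{U}\cos 2\psi \tag{E.24} \]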
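Equations E.23 and E.24 can be applied mode by mode in Fourier space. The sketch below handles a single mode; identifying ψ with the polar angle of the wavevector (lx, ly) is an assumption here (it is the usual flat-sky convention), and the function name is hypothetical.

```python
import math


def eb_from_qu_mode(q_mode, u_mode, lx, ly):
    """Combine one Fourier mode of Q and U into E and B (equations E.23, E.24).

    q_mode and u_mode may be complex Fourier amplitudes. psi is taken to be
    the polar angle of the wavevector (lx, ly), which is an assumption.
    """
    psi = math.atan2(ly, lx)
    c = math.cos(2.0 * psi)
    s = math.sin(2.0 * psi)
    e_mode = q_mode * c + u_mode * s
    b_mode = -q_mode * s + u_mode * c
    return e_mode, b_mode


# Build a mode that should be pure E: Q = A cos(2 psi), U = A sin(2 psi).
lx, ly = 2.0, 1.0
psi_mode = math.atan2(ly, lx)
amplitude = 0.5 - 0.2j
e_pure, b_pure = eb_from_qu_mode(
    amplitude * math.cos(2.0 * psi_mode),
    amplitude * math.sin(2.0 * psi_mode),
    lx,
    ly,
)
```

For this input the E amplitude equals the chosen amplitude and the B amplitude vanishes to rounding error; swapping to Q = −A sin 2ψ, U = A cos 2ψ gives a pure B mode instead.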
Index
baseline, 41
bolometer, 82
FOV, 37
Gaussianity, 99
last scattering, 26–28
MBI
Instrument, 79
instrument, 82, 98
interferometry, 37
Mutual Coherence Function, 38
North Celestial Pole(NCP), 98
Pine Bluff Observing Site, 81
Power Spectrum, 26
polarized, 26
recombination, 27
Thomson Scattering, 26
Visibility, 42