Bayesian Inversion of Stokes Profiles
A. Asensio Ramos (IAC)
M. J. Martínez González (LERMA)
J. A. Rubiño Martín (IAC)
Beaulieu Workshop (Beaulieu sur Mer, 8-10 October 2007)
Outline
• Introduction
• Bayesian Inversion
• Markov Chain Monte Carlo
• Applications
• Conclusions
Introduction
What is an inversion process?
We have a set of observations and we propose a physical model.
FORWARD PROBLEM: model → observations (typically unique)
INVERSION PROBLEM: observations → model (is it unique?)
Introduction
If we were living in a perfect, ideal, Teletubbie world with
• No noise
• No ambiguities
• No degeneracies
there would be ONE model that best explains the observations, and one could safely say that this is THE correct model.
Introduction
However, we are fortunately living in an imperfect, non-ideal world with
• Noise
• Ambiguities
• Degeneracies
There is more than ONE model that explains the observations equally well, and one cannot say that a single model is THE correct model.
Introduction
As a consequence, any inversion procedure carried out in our noisy and ambiguous world cannot give only one model as the solution, but has to give a set of models that are compatible with our observables.
Any inversion problem has to be understood as a probabilistic problem that has to be tackled using a statistical approach: we give the probability that any given model explains the observables.
Bayesian Inference
D represents our observables (Stokes profiles).
M represents our model (Milne-Eddington, LTE, ...) and it is parameterized by a vector of parameters θ.
We have some a-priori knowledge of their values.
Bayesian Inference
The inductive inference problem is to update from our a-priori knowledge of the parameters to an a-posteriori knowledge after taking into account the information encoded in the observed dataset.

Bayes' theorem:

$$p(\theta|D) = \frac{p(D|\theta)\,p(\theta)}{p(D)}$$

where p(θ|D) is the posterior probability, p(θ) is the prior probability and p(D|θ) is the likelihood.
Priors
Any Bayesian reasoning scheme introduces the prior probability (a-priori information).
Typical priors:
• Top-hat function (flat prior), constant between θ_min and θ_max
• Gaussian prior (we know a highly probable value of θ_i)
Assuming statistical independence for all parameters, the total prior can be calculated as

$$p(\theta) = \prod_i p(\theta_i)$$
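A minimal Python sketch of these two priors and the independent-parameter product, working in log-probability for numerical safety; the parameter choices at the bottom are illustrative assumptions, not values from the talk.

```python
import numpy as np

def log_tophat_prior(theta, theta_min, theta_max):
    """Flat (top-hat) prior: constant inside [theta_min, theta_max], zero outside."""
    inside = (theta_min <= theta) & (theta <= theta_max)
    return np.where(inside, -np.log(theta_max - theta_min), -np.inf)

def log_gaussian_prior(theta, mu, sigma):
    """Gaussian prior centred on a highly probable value mu, with width sigma."""
    return -0.5 * ((theta - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2.0 * np.pi))

def log_total_prior(theta, log_priors):
    """Statistically independent parameters: the total prior is the product of
    the individual priors, i.e. the sum of the log-priors."""
    return sum(lp(t) for lp, t in zip(log_priors, theta))

# Illustrative choices, e.g. a field strength B (G) and an inclination (degrees)
log_priors = [lambda t: log_tophat_prior(t, 0.0, 4000.0),
              lambda t: log_gaussian_prior(t, 45.0, 10.0)]
```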
Likelihood
Assuming normal (Gaussian) noise, the likelihood can be calculated as

$$p(D|\theta) \propto \exp\left(-\chi^2/2\right)$$

where the χ² function is defined as usual. In this case, the χ² function is specific to the case of Stokes profiles, summing over the four profiles I, Q, U, V and over wavelength:

$$\chi^2 = \sum_{s=I,Q,U,V}\sum_{k}\frac{\left[S_s^{\rm obs}(\lambda_k)-S_s^{\rm syn}(\lambda_k;\theta)\right]^2}{\sigma_s^2}$$
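A sketch of this likelihood in Python, under the stated Gaussian-noise assumption; `forward_model` is a placeholder for whatever synthesis code (e.g. a Milne-Eddington one) generates profiles from θ, not a real API.

```python
import numpy as np

def chi2_stokes(obs, syn, sigma):
    """chi^2 summed over the four Stokes profiles (I, Q, U, V) and over
    wavelength. obs and syn have shape (4, n_lambda); sigma holds the
    noise standard deviation of each Stokes parameter."""
    sigma = np.asarray(sigma, dtype=float)
    return np.sum(((obs - syn) / sigma[:, None]) ** 2)

def log_likelihood(theta, obs, sigma, forward_model):
    """Gaussian-noise log-likelihood: log p(D|theta) = -chi^2/2 + const.
    `forward_model` stands in for the spectral synthesis (assumed callable)."""
    syn = forward_model(theta)
    return -0.5 * chi2_stokes(obs, syn, sigma)
```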
Bayesian Inference
In order to completely solve the inversion procedure, we NEED to know the complete posterior probability distribution p(θ|D).
Sometimes we are interested only in a subset of parameters. Marginalization:

$$p(\theta_i|D) = \int p(\theta|D)\, \prod_{j\neq i} d\theta_j$$

In any case, we still need the complete posterior distribution.
Bayesian Inference. The naïve approach
Our model is parameterized with N parameters.
We use M values for each parameter (to have good coverage).
We end up with M^N evaluations of the forward problem to obtain the full posterior distribution.
Example: if N = 10 (typical for ME models) and M = 10 (a relatively coarse grid), we end up with 10^10 evaluations, i.e. ~31 years if each model is evaluated in 0.1 s (see the quick check below).
Only one experiment is possible during a typical scientific life!!! You had better choose the correct experiment!!!
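The quoted figure checks out with a one-liner:

```python
M, N = 10, 10                      # grid points per parameter, number of parameters
t_eval = 0.1                       # seconds per forward evaluation, as quoted above
total_seconds = M**N * t_eval      # 10^10 evaluations at 0.1 s each
print(total_seconds / (3600 * 24 * 365.25))  # ~31.7 years
```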
Bayesian Inference. The practical approach
Markov Chain Monte Carlo
"HAPPY IDEA"!! Build a Markov chain with an equilibrium probability distribution equal to the posterior distribution.
GOOD NEWS!! The Markov chain rapidly converges towards the desired distribution using a reduced number of evaluations (typically increasing linearly with the number of parameters).
MCMC. Technical details
1. Propose an initial set of parameters θ_0
2. Calculate the posterior p(θ_0|D)
3. Obtain a new set of parameters θ_i by sampling from the proposal density q(θ_i|θ_{i-1})
4. Calculate the posterior p(θ_i|D) and the ratio

$$r = \frac{p(\theta_i|D)\, q(\theta_{i-1}|\theta_i)}{p(\theta_{i-1}|D)\, q(\theta_i|\theta_{i-1})}$$

5. Accept the set of parameters θ_i with probability min(1, r); otherwise keep θ_{i-1}
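A minimal random-walk Metropolis sketch of this loop; it assumes a symmetric Gaussian proposal (so the q terms cancel in the ratio) and that `log_posterior` is the sum of the log-prior and log-likelihood sketched earlier. Names and defaults are illustrative.

```python
import numpy as np

def metropolis_hastings(log_posterior, theta0, n_steps, step_sigma, rng=None):
    """Random-walk Metropolis sampler. The Gaussian proposal is symmetric,
    so the acceptance ratio reduces to a ratio of posteriors."""
    rng = np.random.default_rng() if rng is None else rng
    theta = np.atleast_1d(np.asarray(theta0, dtype=float))
    logp = log_posterior(theta)
    chain = np.empty((n_steps, theta.size))
    for i in range(n_steps):
        # Step 3: propose theta* ~ q(.|theta), a Gaussian step around theta
        proposal = theta + step_sigma * rng.standard_normal(theta.size)
        logp_new = log_posterior(proposal)
        # Steps 4-5: accept with probability min(1, exp(logp_new - logp))
        if np.log(rng.random()) < logp_new - logp:
            theta, logp = proposal, logp_new
        chain[i] = theta
    return chain
```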
Bayesian Inference. Simple example
[Figure: a simple MCMC sampling example, from Andrieu et al. 2003]
Bayesian Inference. Proposal density
The proposal density is the key point in MCMC methods. It should ideally be as similar as possible to the posterior distribution, but easy to calculate.
Typical proposal densities:
• Uniform distribution
• Multi-dimensional Gaussian
In the limit that the proposal equals the distribution you want to sample, all proposed models are accepted. For a symmetric proposal, such as a Gaussian random walk, the q terms cancel and the ratio reduces to r = p(θ_i|D) / p(θ_{i-1}|D).
MCMC. Possible post-facto analysis
BURN-IN: if the chains start far from the region of large posterior probability, it takes some iterations to locate that region, so the first N_burn-in iterations are thrown away.
[Figure: chain trace showing the starting point and the burn-in phase]
THINNING: reduces the size of the chain while hopefully maintaining its properties. Less used now due to the increase of computer capabilities.
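Both post-facto steps amount to a single slice of the raw chain from the sampler above; the cutoff and stride here are illustrative.

```python
n_burn, thin = 1000, 10             # illustrative values
clean_chain = chain[n_burn::thin]   # drop burn-in, keep every 10th sample
```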
Academic example: σ = 10^-5 I_c
Original values: B = 100 G, θ_B = 45º
Academic example: σ = 10^-5 I_c
Markov chains without burn-in and thinning
Marginalization (multi-dimensional integration) is obtained by "making histograms" of the chain samples.
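Concretely, the marginal posterior of one parameter is just a histogram of its column in the cleaned chain; the integration over all other parameters happens automatically. The column index below is illustrative.

```python
import numpy as np

j = 0  # column of the parameter of interest, e.g. the field strength B
counts, edges = np.histogram(clean_chain[:, j], bins=50, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
# `counts` versus `centers` approximates the marginalized posterior p(theta_j|D)
```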
Academic example: σ = 10^-3 I_c
Original values: B = 100 G, θ_B = 45º
At this larger noise level a degeneracy appears: B cos θ_B = constant
Academic example: σ = 10^-3 I_c
B cos θ_B = 100 cos 45º = 70.7 G
Realistic example: σ = 10^-4 I_c. Stokes profiles with a low flux (10 Mx/cm²)
B = 1000 G, f = 1%
Fields between 500 and 1800 G are compatible with the observed Stokes profiles at the 1σ confidence level.
Some parameters are not constrained by the observables
Low flux region
The "thermodynamical" parameters of the non-magnetic component can be nicely constrained by the data.
Low flux region
High flux region: σ = 10^-4 I_c. Stokes profiles with a high flux (200 Mx/cm²)
B = 1000 G, f = 20%
Magnetic field strength and other parameters of the magnetic
component are constrained by the data
High flux region: σ = 10^-4 I_c
Broader confidence levels are seen in the non-magnetic component due to its reduced filling factor.
Lack of information
Marginalized magnetic field strength posterior distribution
Note the similarity with the prior distribution
Observed sunspot
Umbral profile observed with THÉMIS
6302 Å
Observed sunspot
Umbral profile observed with THÉMIS
6301 Å
Combination of information
Inclusion of new information is trivial under the Bayesian approach: the corresponding posteriors are directly multiplied (a sketch follows below).
The 6302 Å line constrains the magnetic field vector better than the 6301 Å line in this case.
[Figure panels: marginalized posteriors from the 6301 Å and 6302 Å lines]
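A sketch of the direct multiplication on a common grid, following the prescription above (exact for flat priors, since otherwise the shared prior is counted twice). The two sample arrays and the grid are hypothetical, standing in for independent MCMC runs on each line.

```python
import numpy as np

# B samples from two independent MCMC runs, one per line (hypothetical arrays)
edges = np.linspace(0.0, 4000.0, 101)              # common B grid in G (assumed)
post_6301, _ = np.histogram(B_chain_6301, bins=edges, density=True)
post_6302, _ = np.histogram(B_chain_6302, bins=edges, density=True)

centers = 0.5 * (edges[:-1] + edges[1:])
combined = post_6301 * post_6302                   # multiply the two posteriors
combined /= np.trapz(combined, centers)            # renormalize to unit area
```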
Conclusions
• The inversion process is a statistical problem: it gives the set of parameters of a model that are compatible with the observables for a given noise level
• Bayesian methods allow us to move from the a-priori information to the a-posteriori situation using the information encoded in the data
• The posterior and its marginalized distributions are easily obtained using a Markov Chain Monte Carlo method
• Applications to synthetic and real data show the potential of the technique and point out severe degeneracy problems