Multi-output separable Gaussian process: Towards an efficient, fully Bayesian paradigm for uncertainty quantification

Ilias Bilionis a,b, Nicholas Zabaras a,b,*, Bledar A. Konomi c, Guang Lin c

a Center for Applied Mathematics, 657 Frank H.T. Rhodes Hall, Cornell University, Ithaca, NY 14853, USA
b Materials Process Design and Control Laboratory, Sibley School of Mechanical and Aerospace Engineering, 101 Frank H.T. Rhodes Hall, Cornell University, Ithaca, NY 14853-3801, USA
c Computational Sciences and Mathematics Division, Pacific Northwest National Laboratory, 902 Battelle Boulevard, P.O. Box 999, MSIN K7-90, Richland, WA 99352, USA

Article info

Article history: Received 29 July 2012; Accepted 10 January 2013; Available online 7 February 2013

Keywords: Bayesian; Gaussian process; Uncertainty quantification; Separable covariance function; Surrogate models; Stochastic partial differential equations; Kronecker product

Abstract

Computer codes simulating physical systems usually have responses that consist of a set of distinct outputs (e.g., velocity and pressure) that also evolve in space and time and depend on many unknown input parameters (e.g., physical constants, initial/boundary conditions, etc.). Furthermore, essential engineering procedures such as uncertainty quantification, inverse problems, or design are notoriously difficult to carry out, mostly because of the limited number of simulations available. The aim of this work is to introduce a fully Bayesian approach for treating these problems which accounts for the uncertainty induced by the finite number of observations. Our model is built on a multi-dimensional Gaussian process that explicitly treats correlations between distinct output variables as well as space and/or time. The proper use of a separable covariance function enables us to describe the huge covariance matrix as a Kronecker product of smaller matrices, leading to efficient algorithms for carrying out inference and predictions. The novelty of this work is the recognition that the Gaussian process model defines a posterior probability measure on the function space of possible surrogates for the computer code, and the derivation of an algorithmic procedure that allows us to sample it efficiently. We demonstrate how the scheme can be used in uncertainty quantification tasks in order to obtain error bars for the statistics of interest that account for the finite number of observations.

© 2013 Elsevier Inc. All rights reserved.

1. Introduction

It is very common for a research group or a company to spend years developing sophisticated software in order to realistically simulate important physical phenomena. However, carrying out tasks like uncertainty quantification, model calibration, or design using the full-fledged model is in all but the simplest cases a daunting task, since a single simulation might take days or even weeks to complete, even on state-of-the-art modern computing systems. One then has to resort to computationally inexpensive surrogates of the computer code. The idea is to run the solver on a small, well-selected set of inputs and then use these data to learn the response surface. The surrogate surface may subsequently be used to carry out any of the computationally intensive engineering tasks.

0021-9991/$ - see front matter © 2013 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.jcp.2013.01.011

* Corresponding author at: Materials Process Design and Control Laboratory, Sibley School of Mechanical and Aerospace Engineering, 101 Frank H.T. Rhodes Hall, Cornell University, Ithaca, NY 14853-3801, USA. Tel.: +1 607 255 9104.
E-mail address: zabaras@cornell.edu (N. Zabaras).
URL: http://mpdc.mae.cornell.edu/ (N. Zabaras).

Journal of Computational Physics 241 (2013) 212–239


The engineering community and, in particular, researchers in uncertainty quantification have been making extensive use of surrogates, even though most of the time this is not explicitly stated. One example is the so-called stochastic collocation (SC) method (see [1] for a classic illustration), in which the response is modeled using a generalized Polynomial Chaos (gPC) basis [2] whose coefficients are approximated via a collocation scheme based on a tensor product rule of one-dimensional Gauss quadrature points. Of course, such approaches scale extremely badly with the number of input dimensions, since the number of required collocation points explodes quite rapidly. A partial remedy can be found in sparse grids (SG) based on the Smolyak algorithm [3], which have a weaker dependence on the dimensionality of the problem (see [4–6] and the adaptive version developed by our group [7]). Despite the rigorous convergence results of all these methods, their applicability to the situation of very limited observations is questionable. In that case, a statistical approach seems more suitable.
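To make the scaling argument above concrete, a full tensor product of one-dimensional quadrature rules with n points per dimension requires n^d collocation points in d dimensions. The following minimal sketch (illustrative only, not taken from the paper) shows how quickly this count explodes:

```python
# Illustrative sketch of the curse of dimensionality in tensor-grid
# stochastic collocation: with n one-dimensional quadrature points per
# input dimension, a full tensor grid needs n**d solver runs.
def tensor_grid_size(n_1d: int, dim: int) -> int:
    """Number of points in a full tensor product of `n_1d`-point 1-D rules."""
    return n_1d ** dim

for d in (1, 2, 5, 10, 20):
    print(d, tensor_grid_size(5, d))
# With only 5 points per dimension, 10 inputs already require
# 9,765,625 solver runs -- far beyond a budget of days-long simulations.
```

Sparse grids reduce this growth substantially, but the count still grows quickly enough that a statistical treatment becomes attractive when only a handful of runs is affordable.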

To the best of our knowledge, the first attempts of the statistics community to build computer surrogates are the seminal papers of Currin et al. [8] and, independently, Sacks et al. [9], both making use of Gaussian processes. In the same spirit are the subsequent paper by Currin et al. [10] and the work of Welch et al. [11]. Some of the first applications to uncertainty quantification can be found in O'Hagan et al. [12] and Oakley and O'Hagan [13]. The problem of model calibration is considered in [14,15]. Refs. [16,17] model non-stationary responses, while [18,15] (in quite different ways) attempt to capture correlations between multiple outputs. Following these trends, we will consider a Bayesian scheme based on Gaussian processes.

Despite the simplicity of the surrogate idea, there are still many hidden obstacles. Firstly, one must choose the design of inputs on which the full model is to be evaluated. It is generally accepted that a good starting point is a Latin hypercube design [19], because of its excellent coverage properties. However, it is clear that this choice should be influenced by the task in which the surrogate will be used. For example, in uncertainty quantification it makes sense to bias the design using the probability density of the inputs [17], so that highly probable regions are adequately explored. Furthermore, it also pays off to consider a sequential design that depends on what is already known about the surface. Such a procedure, known as active learning, is particularly suitable for Bayesian surrogates, since one may use the predictive variance as an indicative measure of the informational content of particular points in the input space [20]. Secondly, computer codes solving partial differential equations usually have responses that are multi-output functions of space and/or time. One can hope that explicitly modeling this fact may squeeze more information out of the observations. Finally, it is essential to be able to say something about the epistemic uncertainty induced by the limited number of observations and, in particular, about its effect on the task for which the surrogate is constructed.
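The defining property of a Latin hypercube design mentioned above is that every one-dimensional projection of the n design points hits each of n equal-width bins exactly once. A minimal NumPy sketch of such a design (the function name and sizes are illustrative, not the paper's):

```python
import numpy as np

def latin_hypercube(n: int, d: int, seed=None) -> np.ndarray:
    """n design points in [0, 1]^d whose every 1-D projection
    occupies each of the n equal-width bins exactly once."""
    rng = np.random.default_rng(seed)
    # one uniformly random point inside each of the n bins, per dimension
    u = (rng.random((n, d)) + np.arange(n)[:, None]) / n
    # independently shuffle the bin order within each dimension
    for j in range(d):
        rng.shuffle(u[:, j])
    return u

# e.g. 20 runs of an expensive solver with 4 uncertain inputs:
X = latin_hypercube(20, 4, seed=0)
print(X.shape)  # (20, 4)
```

Each row would then be mapped from the unit hypercube to the actual input ranges (or, as suggested above, pushed through the inverse CDF of the input density to bias the design toward highly probable regions).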

In this work, we are mainly concerned with the second and third points of the last paragraph. Even though we make use of active learning ideas in a completely different context (see Section 2.3), we will be assuming that the observations are simply given to us. Our first goal is the construction of a multi-output Gaussian process model that explicitly treats space and time (Section 2.1). This model, in its full generality, is extremely computationally demanding. In Section 2.2, we carefully develop the so-called separable model, which allows us to express the inference and prediction tasks using Kronecker products of small matrices. This, in turn, results in highly efficient computations. Finally, in Section 2.3, we apply our scheme to uncertainty quantification tasks. Contrary to other approaches, we recognize the fact that the predictive distribution of the Gaussian process conditional on the observations actually defines a probability measure on the function space of possible surrogates. The spread of this measure corresponds to the epistemic uncertainty induced by the limited data. Extending ideas of [13], we develop a procedure that allows us to approximately sample this probability space. Each sample is a kernel approximation of a candidate surrogate for the code. In the same section, we show how we can semi-analytically compute all the statistics of the candidate surrogate up to second order. Higher order statistics or even proba