
Multi-output separable Gaussian process: Towards an efficient, fully Bayesian paradigm for uncertainty quantification

Ilias Bilionis a,b, Nicholas Zabaras a,b,*, Bledar A. Konomi c, Guang Lin c

a Center for Applied Mathematics, 657 Frank H.T. Rhodes Hall, Cornell University, Ithaca, NY 14853, USA
b Materials Process Design and Control Laboratory, Sibley School of Mechanical and Aerospace Engineering, 101 Frank H.T. Rhodes Hall, Cornell University, Ithaca, NY 14853-3801, USA
c Computational Sciences and Mathematics Division, Pacific Northwest National Laboratory, 902 Battelle Boulevard, P.O. Box 999, MSIN K7-90, Richland, WA 99352, USA

ARTICLE INFO

    Article history:

    Received 29 July 2012

    Accepted 10 January 2013

    Available online 7 February 2013

    Keywords:

    Bayesian

    Gaussian process

    Uncertainty quantification

    Separable covariance function

    Surrogate models

    Stochastic partial differential equations

    Kronecker product

ABSTRACT

Computer codes simulating physical systems usually have responses that consist of a set of distinct outputs (e.g., velocity and pressure) that also evolve in space and time and depend on many unknown input parameters (e.g., physical constants, initial/boundary conditions, etc.). Furthermore, essential engineering procedures such as uncertainty quantification, inverse problems or design are notoriously difficult to carry out, mostly due to the limited number of simulations available. The aim of this work is to introduce a fully Bayesian approach for treating these problems which accounts for the uncertainty induced by the finite number of observations. Our model is built on a multi-dimensional Gaussian process that explicitly treats correlations between distinct output variables as well as space and/or time. The proper use of a separable covariance function enables us to describe the huge covariance matrix as a Kronecker product of smaller matrices, leading to efficient algorithms for carrying out inference and prediction. The novelty of this work is the recognition that the Gaussian process model defines a posterior probability measure on the function space of possible surrogates for the computer code, and the derivation of an algorithmic procedure that allows us to sample it efficiently. We demonstrate how the scheme can be used in uncertainty quantification tasks to obtain error bars for the statistics of interest that account for the finite number of observations.

© 2013 Elsevier Inc. All rights reserved.
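To make the Kronecker-product claim in the abstract concrete, the following minimal NumPy sketch (our illustration, not the paper's code; A and B are hypothetical symmetric positive-definite covariance factors) verifies that kron(A, B) applied to a row-major vec(Y) equals vec(A Y B^T), and uses that identity to solve against the full covariance at the cost of two small solves:

import numpy as np

rng = np.random.default_rng(0)
n_a, n_b = 5, 7

def spd(n):
    # Random symmetric positive-definite matrix, standing in for a covariance factor.
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

A, B = spd(n_a), spd(n_b)
y = rng.standard_normal(n_a * n_b)   # a vectorized block of observations

# Direct solve against the explicit Kronecker product: O((n_a * n_b)^3).
x_direct = np.linalg.solve(np.kron(A, B), y)

# Factor-wise solve, using kron(A, B) @ vec(Y) = vec(A @ Y @ B.T) with
# row-major vec: cost O(n_a^3 + n_b^3), and the big matrix is never formed.
Y = y.reshape(n_a, n_b)
X = np.linalg.solve(A, Y)            # A^{-1} Y
X = np.linalg.solve(B, X.T).T        # ... times B^{-1} (B is symmetric)
assert np.allclose(x_direct, X.ravel())

The same factorization also gives determinants and samples cheaply, which is why the separable model developed in Section 2.2 scales to very large space-time grids.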

    1. Introduction

It is very common for a research group or a company to spend years developing sophisticated software in order to realistically simulate important physical phenomena. However, carrying out tasks like uncertainty quantification, model calibration or design using the full-fledged model is, in all but the simplest cases, a daunting task, since a single simulation might take days or even weeks to complete, even on state-of-the-art computing systems. One then has to resort to computationally inexpensive surrogates of the computer code. The idea is to run the solver on a small, well-selected set of inputs and then use these data to learn the response surface. The surrogate may subsequently be used to carry out any of the computationally intensive engineering tasks.


* Corresponding author at: Materials Process Design and Control Laboratory, Sibley School of Mechanical and Aerospace Engineering, 101 Frank H.T. Rhodes Hall, Cornell University, Ithaca, NY 14853-3801, USA. Tel.: +1 607 255 9104. E-mail address: zabaras@cornell.edu (N. Zabaras). URL: http://mpdc.mae.cornell.edu/ (N. Zabaras).

Journal of Computational Physics 241 (2013) 212–239. http://dx.doi.org/10.1016/j.jcp.2013.01.011

The engineering community and, in particular, researchers in uncertainty quantification have been making extensive use of surrogates, even though most of the time this is not explicitly stated. One example is the so-called stochastic collocation (SC) method (see [1] for a classic illustration), in which the response is modeled using a generalized Polynomial Chaos (gPC) basis [2] whose coefficients are approximated via a collocation scheme based on a tensor product rule of one-dimensional Gauss quadrature points. Of course, such approaches scale extremely badly with the number of input dimensions, since the number of required collocation points explodes quite rapidly. A partial remedy can be found in sparse grids (SG) based on the Smolyak algorithm [3], which have a weaker dependence on the dimensionality of the problem (see [4–6] and the adaptive version developed by our group [7]). Despite the rigorous convergence results of all these methods, their applicability to the situation of very limited observations is questionable. In that case, a statistical approach seems more suitable.
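To see the scaling problem concretely, here is a back-of-the-envelope illustration (ours, not the paper's) of the full tensor-product rule: with n one-dimensional Gauss quadrature points per input, a full tensor grid in d dimensions requires n**d runs of the expensive solver.

n = 5  # one-dimensional Gauss quadrature points per input
for d in (1, 2, 5, 10, 20):
    print(f"d = {d:2d}: full tensor grid requires {n**d:,} collocation points")
# d = 1 needs 5 runs, but d = 20 already needs ~9.5e13, which is why
# sparse grids, and ultimately statistical surrogates, become necessary.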

To the best of our knowledge, the first attempts of the statistics community to build computer surrogates are the seminal papers of Currin et al. [8] and, independently, Sacks et al. [9], both making use of Gaussian processes. In the same spirit are the subsequent paper by Currin et al. [10] and the work of Welch et al. [11]. Some of the first applications to uncertainty quantification can be found in O'Hagan et al. [12] and Oakley and O'Hagan [13]. The problem of model calibration is considered in [14,15]. Refs. [16,17] model non-stationary responses, while [18,15] (in quite different ways) attempt to capture correlations between multiple outputs. Following these trends, we will consider a Bayesian scheme based on Gaussian processes.

Despite the apparent simplicity of the surrogate idea, there are still many hidden obstacles. Firstly, the question arises of how to choose the design of inputs at which the full model is to be evaluated. It is generally accepted that a good starting point is a Latin hypercube design [19], because of its excellent coverage properties. However, this choice should clearly be influenced by the task in which the surrogate will be used. For example, in uncertainty quantification it makes sense to bias the design using the probability density of the inputs [17], so that highly probable regions are adequately explored. Furthermore, it also pays off to consider a sequential design that depends on what is already known about the surface. Such a procedure, known as active learning, is particularly suitable for Bayesian surrogates, since one may use the predictive variance as a measure of the informational content of particular points in the input space [20]. Secondly, computer codes solving partial differential equations usually have responses that are multi-output functions of space and/or time. One can hope that explicitly modeling this fact may squeeze more information out of the observations. Finally, it is essential to be able to say something about the epistemic uncertainty induced by the limited number of observations and, in particular, about its effect on the task for which the surrogate is constructed.
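Both design ideas above are easy to prototype with off-the-shelf tools. The following sketch (standard scipy/scikit-learn calls, not the paper's own machinery; expensive_solver is a hypothetical stand-in for the computer code) draws an initial Latin hypercube design and then performs active learning by adding the candidate point with the largest predictive standard deviation:

import numpy as np
from scipy.stats import qmc
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def expensive_solver(x):
    # Hypothetical stand-in; the real code might take days per run.
    return np.sin(3 * x[:, 0]) * np.cos(2 * x[:, 1])

d, n_init = 2, 10
sampler = qmc.LatinHypercube(d=d, seed=0)
X = sampler.random(n=n_init)           # space-filling initial design [19]
y = expensive_solver(X)

candidates = sampler.random(n=512)     # pool of affordable future runs
for _ in range(5):                     # five active-learning steps [20]
    gp = GaussianProcessRegressor(kernel=RBF(0.3), normalize_y=True).fit(X, y)
    _, sd = gp.predict(candidates, return_std=True)
    x_new = candidates[np.argmax(sd)]  # most informative candidate
    X = np.vstack([X, x_new])
    y = np.append(y, expensive_solver(x_new[None, :]))

Biasing the design toward the input density [17] would amount to drawing the candidate pool from that density instead of the uniform hypercube.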

In this work, we are mainly concerned with the second and third points of the last paragraph. Even though we make use of active learning ideas in a completely different context (see Section 2.3), we will be assuming that the observations are simply given to us. Our first goal is the construction of a multi-output Gaussian process model that explicitly treats space and time (Section 2.1). This model, in its full generality, is extremely computationally demanding. In Section 2.2, we carefully develop the so-called separable model, which allows us to express the inference and prediction tasks using Kronecker products of small matrices. This, in turn, results in highly efficient computations. Finally, in Section 2.3, we apply our scheme to uncertainty quantification tasks. Contrary to other approaches, we recognize the fact that the predictive distribution of the Gaussian process conditional on the observations actually defines a probability measure on the function space of possible surrogates. The weight of this measure corresponds to the epistemic uncertainty induced by the limited data. Extending ideas of [13], we develop a procedure that allows us to approximately sample this probability space. Each sample is a kernel approximation of a candidate surrogate for the code. In the same section, we show how we can semi-analytically compute all the statistics of the candidate surrogate up to second order. Higher order statistics or even proba
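As a hedged toy illustration of the sampling idea just described (a one-dimensional scikit-learn GP stands in for the paper's multi-output model; f is a hypothetical computer code), each posterior draw below is one candidate surrogate, and the spread of a statistic across draws gives the error bars that account for the finite data:

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
f = lambda x: np.sin(5 * x).ravel()        # hypothetical computer code
X_train = rng.uniform(0, 1, size=(8, 1))   # only 8 expensive simulations
y_train = f(X_train)

gp = GaussianProcessRegressor(kernel=RBF(0.2), normalize_y=True).fit(X_train, y_train)

# Monte Carlo over the (here uniform) input density for each posterior
# draw of the surrogate; the statistic of interest is the output mean.
X_mc = rng.uniform(0, 1, size=(2000, 1))
draws = gp.sample_y(X_mc, n_samples=200, random_state=1)  # shape (2000, 200)
stat = draws.mean(axis=0)                  # one estimate of E[f] per surrogate

print(f"E[f] = {stat.mean():.3f} +/- {stat.std():.3f}  (error bar from finite data)")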