60
SGEMS-UQ: An Uncertainty Quantification Toolkit for SGEMS Lewis Li 1 , Alexandre Boucher 2 and Jef Caers 3 1 Electrical Engineering Department, Stanford University, USA 2 Advanced Resources and Risk Technology, USA 3 Energy Resources Engineering department, Stanford University, USA Abstract While algorithms and methodologies to study uncertainty in the Earth Sciences are constantly evolving, there are currently few open-source integrated softwares that allows the general practitioners access to these developments. This paper presents SGEMS-UQ, a plugin for the SGEMS platform, that is used to perform distance-based uncertainty analysis on geostatistical simulations, and the resulting transfer functions responses used in subsurface modeling and engineering. A versatile XML-derived dialect is defined for communicating with external programs that eliminates the need for ad-hoc linking of codes, and a relational database system is implemented to automate many of the steps in data mining the spatial and forward model parameters. Through a graphical user interface, one can map a set of realizations and transfer function responses into a multidimensional scaling (MDS) space where advanced visualization utilities, and clustering techniques are available. Once mapped in the MDS space, the user can explore linkage between parameters and transfer function responses by querying a SQL database. Consideration is given to the use of programming designs to produce a code-base

SGEMS-UQ: An Uncertainty Quantification Toolkit for SGEMSpangea.stanford.edu/departments/ere/dropbox/scrf/...SGEMS-UQ: An Uncertainty Quantification Toolkit for SGEMS Lewis Li1, Alexandre

  • Upload
    others

  • View
    11

  • Download
    3

Embed Size (px)

Citation preview

SGEMS-UQ: An Uncertainty Quantification Toolkit for

SGEMS

Lewis Li1, Alexandre Boucher2 and Jef Caers3 1Electrical Engineering Department, Stanford University, USA 2Advanced Resources and Risk Technology, USA 3Energy Resources Engineering department, Stanford University, USA

Abstract

While algorithms and methodologies to study uncertainty in the Earth Sciences are constantly

evolving, there are currently few open-source integrated softwares that allows the general

practitioners access to these developments. This paper presents SGEMS-UQ, a plugin for the

SGEMS platform, that is used to perform distance-based uncertainty analysis on geostatistical

simulations, and the resulting transfer functions responses used in subsurface modeling and

engineering. A versatile XML-derived dialect is defined for communicating with external

programs that eliminates the need for ad-hoc linking of codes, and a relational database system

is implemented to automate many of the steps in data mining the spatial and forward model

parameters. Through a graphical user interface, one can map a set of realizations and transfer

function responses into a multidimensional scaling (MDS) space where advanced visualization

utilities, and clustering techniques are available. Once mapped in the MDS space, the user can

explore linkage between parameters and transfer function responses by querying a SQL

database. Consideration is given to the use of programming designs to produce a code-base

that is manageable, efficient, and extensible for future applications, while being scalable to

work with large datasets. Finally, we illustrate the versatility of the code-base on a real-field

application of modeling uncertainty in reservoir forecasts for an Oil reservoir of the West Coast

of Africa.

Keywords: uncertainty quantification, software design, XML, SQL database

Introduction

With an increase in modeling and computational methodologies, codes and hardware

resources, the aspect of uncertainty has received increased attention in the Earth Sciences (e.g.

Wu, et al., 2006; Brown and Heuvelink, 2007; Caers, 2011). In this paper we focus on

uncertainty problems that involve geostatistical modeling (Chilles and Delfiner 1999) and

computationally demanding transfer functions for which these geostatistical models are

created. The example applications targeted in this paper are mostly related to subsurface

modeling and engineering (e.g. modeling of groundwater systems, conventional and

unconventional energy resources, mining and CO2 sequestration), but the resulting software,

SGEMS-UQ, could be applied to any other similar applications. In terms of modeling and

decision-making, a possible workflow is outlined in Figure 1, adapted from Caers (2011). The

geological scenarios are built with a number of geostatistical algorithms, either covariance-

based (Chilles and Delfiner 1999), Boolean (Holden, et al. 1998) or multiple-point statistics

(Caers and Zhang, 2004; Mariethoz, et al., 2009, Strebelle 2002). Each algorithm has parameters

such as mean, variance, covariance, training image, object shapes and dimensions that are

uncertain and need to be varied. In terms of software, several packages, commercial and open-

source are available for geostatistical studies. The development of this uncertainty module is a

plug-in to SGEMS (Stanford Geostatistical Earth Modeling software, sgsems.sourceforge.net).

Geostatistical modeling is generally fast in terms of CPU with algorithms from covariance or

training image being parallelized (Peredo and Ortiz 2011), that run million-cell grid in seconds

or minutes. However, within the described applications, geostatistical modeling is rarely an

end-goal, and the resulting models are then typically used for evaluating flow, geo-mechanical

or geophysical (seismic, EM, GPR) equations, including the coupling of these equations.

Software such as TOUGH (Lawrence Berkeley National Laboratory, 2012) and MODFLOW

(Harbaugh 2005) require hours to days of computational effort. Other evaluations consider

optimization codes such as the optimization of open-pit design in mining (e.g. Dimitrakopoulos

and Ramazan, 2004).

Modern software engineering relies on concepts of re-usability, extensibility and

maintainability of the design, which our software targets as well. Commercial platforms such as

Ocean/Petrel (Schlumberger Limited 2010) aim at providing such capabilities, but no open-

source codes combine the full capabilities of geostatistical modeling with forward modeling or

optimization codes. While most researchers write codes that link the various components

described in Figure 1, these codes are linked in an ad-hoc fashion, often with simple scripting or

MATLAB, and are neither re-useable nor extensible. These codes need then to be rewritten

(and compiled) for every different application or set of input parameters. Then there is the

problem of data management in real field applications. Many components of the workflow in

Figure 1 can be randomized; hence possibly hundreds of geostatistical models may be created

to capture uncertainty within such set of geostatistical models. After evaluating forward

equations on the geostatistical models, there is often a need to post-process the results. For

example, a sensitivity analysis is performed attempting to discover which parameters or

combinations of parameters have the largest impact on the responses. A visualization of the

uncertainty spanned by the responses is performed using dimensionality reduction and data

mining techniques such as multi-dimensional scaling (MDS) (Honarkhah, 2009), support vector

machines (SVM) (Cortes, 1995) or various forms of principle component analysis (PCA) (Abdi,

2010).

In this paper we address the software challenges by defining a versatile XML-derived dialect for

communicating with external programs that eliminates the need for ad-hoc linking of codes,

and the implementation of a relational database system that automates many of the steps in

data mining the spatial and forward model parameters. Consideration is given to the use of

programming designs to produce a code-base that is manageable, efficient, and extensible for

future applications, while being scalable to work with large datasets. Our methodology for

modeling uncertainty within this framework relies on the concept of distance-based modeling

of uncertainty (Caers, 2011; Caers et al., 2011; Scheidt and Caers, 2009a, 2009b), but the

presented design could be extended to other types of approaches. After reviewing the main

concepts of the distance-based uncertainty modeling workflow, we present our software design

to handle the dual challenge of coupling codes and managing the information generated by the

Monte Carlo approach of the method. Finally, we present a real-field application of modeling

uncertainty in reservoir forecasts for an Oil reservoir of the West Coast of Africa.

Review of Distance-Based Modeling of Uncertainty

The contribution of this paper lies in the software design for modeling uncertainty using

geostatistical simulation methods and computationally demanding forward modeling codes.

This section reviews the distance-based approach for modeling uncertainty that was

implemented in the software. Other approaches use experimental design; response surface

modeling could follow a similar software design since they call on similar basic elements of such

analysis: geostatistical codes, forward modeling codes and post-processing of the runs. The

challenge lies in linking codes and managing the information generated by such modeling

frameworks.

The main contribution of distance-based uncertainty quantification is to recognize that often a

more efficient and effective quantification of variation is that by means of distances, as

opposed to for example, variances and covariance. Distances and kernels are commonly used in

computer science (Liu, 2010). In our applications, we recognize that differences (distances) in

response variables can be used to quantify uncertainty. The distance-based approach is

summarized in Figure 2 (adapted from Caers (2011)) and consists of a number of steps:

1. A prior uncertainty is stated (from actual data) on spatial and forward model

parameters. Either experimental design or Monte Carlo simulation is used to generate

sets of input parameters. Some parameters may be continuous (e.g. a mean) other

discrete (e.g. a rock-type) or scenario-based (e.g. a geological interpretation).

2. A geostatistical algorithm is used to generate realizations. A few models are shown in

Figure 2.

3. A distance is defined between the realizations. This distance can be based directly on

the realizations (a static distance) or could involve an approximate physical model

(surrogate or proxy model). These static or dynamic proxies are needed when the full

forward model is too computationally expensive.

4. Multi-dimensional-scaling (MDS) is used to map the realization into a low dimensional

space for visualization purposes. MDS is basically a dimensionality reduction technique

based on the dot-product (Honarkhah and Caers, 2010 for a geostatistical application of

MDS).

5. A clustering algorithm is applied to group models into a set of distinct clusters. Figure 3

shows that such clustering can be a k-medoid or a kernel k-medoid clustering. The k-

medoid selects one of the models as a cluster representative.

6. The full physics forward model is now applied to selected models to represent the

response uncertainty

7. Analysis of the information generated from step 1 to 6: distances, parameters per

cluster, selected models, quantiles of response uncertainty and more.

Any software design for uncertainty quantification will require some flexibility in

What parameters are specified and how their prior uncertainty is modeled

What geostatistical algorithms (or combinations) are used

What proxy models are used: static, dynamic or combinations

How distances are calculated

What the full physics forward models are

How users want to analysis the results and/or do sensitivity analysis

In the next section we present a program design that addresses these challenges.

Software Design

Coupling with existing SGEMS codebase

The main SGEMS program provides the input parameters, geostatistical algorithms and their

results, while dynamic simulators are considered external codes. The decision to integrate the

uncertainty toolkit with SGEMS has the advantage of having access to the source code,

standards and conventions used by SGEMS, which are open source. The main SGEMS software

was written using the Qt4 framework with external dependencies on VTK (Schroeder, Martin

and Loresen 2004), Eigen3 (Guennebaud, Jacob and et al. 2010), and Boost (Karlsson 2005). It

has been successfully compiled and used on Windows, OSX and Linux. By developing SGEMS-

UQ as a plugin to SGEMS, as opposed to a standalone program, much of SGEMS’ codebase can

be reused; most notably loading/saving project files, and visualizing any generated

geostatistical models. SGEMS has the capability to use multiple algorithms to generate models

with the resulting models following a standard format and convention. This means that by

catering to this single specified convention, SGEMS-UQ is guaranteed to be extensible to any

future geostatistical algorithms that may be added to SGEMS.

SGEMS follows a factory method design pattern for its plugins (Fowler, et al. 1999), which

defines an interface for creating plugin classes, but the class must instantiate itself. This

paradigm allows for the SGEMS main program to take over the lifetime management of the

SGEMS-UQ plugin, reducing code duplication. Plugins use the Qt .ui file specification to define

the user interface for the plugin. This means Qt Designer program can be used as an editor for

developing the plugin GUI, allowing for the preservation of visual styles with the main SGEMS

program. However, SGEMS-UQ contains substantially more components than existing SGEMS

plugins. As shown in Figure 4, SGEMS-UQ is comprised of three separate modules. The first is

SGEMS-Metrics, which allows contains the filters required to load responses from third party

simulators. GSTL_item_model contains the framework used to compute and store the MDS

space, in addition to providing an interface to store simulation parameters to an external

database. The final MDS-GUI provides capabilities such as multiple visualizers, clustering and

sensitivity analysis. This increase in software complexity makes the use of an effective

programing design crucial in preventing code bloat, and encouraging extensibility, re-usability,

and maintenance of the codebase.

A motivating factor behind the software design of SGEMS-UQ is the efficient visualization of

multiple forms of data. SGEMS-UQ uses the Model-View-Controller (MVC) programming design

(Weisfield 2008) to display the subsets of data across different views without requiring each

visualizer to have its own copy of the same data. The basic premise of MVC revolves around

abstracting the various aspects of the program into the three constituent components of the

namesake. The model contains the data that is used by the entire program. In SGEMS-UQ, the

models consist of the response variables, MDS spaces, and input parameters. The view is the

visualizer of the models, which in SGEMS-UQ manifests as VTK plots (Schroeder, Martin and

Loresen 2004) of the various variables. The controller defines the method in which the user

interacts with the model. Plotting subsets of response variables or common parameters over

different clusters are all examples where the same model data needs to be displayed over

multiple views. MVC allows SGEMS-UQ to keep only a single copy of the response variables

despite numerous visualization windows.

While SGEMS is used to generate models, external programs are used to generate the response

variables required for performing distance-based uncertainty modeling. This necessitates a

method of importing data from external simulators into SGEMS-UQ. Presently, most dynamic

simulators output the simulation results into space delimited or comma separated values (for

example Eclipse (Schlumberger Limited 2010), Modflow (Harbaugh 2005), TOUGH (Lawrence

Berkeley National Laboratory, 2012)). To address the import of data from external programs,

SGEMS-UQ uses a specified XML dialect to transfer data between processes. While the choice

of XML over a flat file is not superior in terms of reading speed, the extensibility of XML makes

it a more appropriate selection in this application since XML allows for the storage of both data

and structure information within the same file.

At this moment, SGEMS-UQ has the built-in capability to handle four response data format:

scalar, vector, time series and distribution. Each data response format has a distance function

associated with it. For instance the distance between two distributions is calculated differently

than between two scalars. This allows SGEMS-UQ to process arbitrary number of response data

(i.e. well pressure, cumulative water production, etc.), as long as they fall into an implemented

data category type. This drastically reduces overhead when compared to recompiling code for

each set of input parameters. Note that SGEMS-UQ is not restricted to these response data

format and a developer could add more formats that would automatically integrate with the

software infrastructure. Furthermore, the Qt framework provides standard functions to

process XML syntax.

Information management using an SQL database

Another requirement of SGEMS-UQ is to data mine the parameters in order to perform

sensitivity analysis and to better understand linkage between parameters and responses.

SGEMS-UQ follows the XML SGEMS convention for storing the simulation parameters for each

realization. The XML format accounting for the hierarchical nature of geostatistical parameters

(e.g. maximum range of a spherical variogram) and those parameters are not homogenous

across algorithms (Boolean versus Gaussian versus MPS). While XML is natively readable by

users, it is not an ideal format for displaying on a GUI, nor is it an appropriate database on

which searches can be performed. SGEMS-UQ converts the simulation parameters into two

different formats, each geared towards a specific purpose. The first one is to visualize the

parameters in a tree display format (Figure 10). The tree model viewer allows users the

capability to expand and collapse groups, providing a cleaner interface for reading an otherwise

convoluted list of parameters. Secondly, SGEMS-UQ parses the parameters from the XML

format into a SQL database (Date and Darwen 1996) for data mining. The specific database that

is used is SQLite, a database system designed to be small (275 kb), such that it can be shipped

as part of the client application. SQLite is ACID-compliant and is compatible with the majority of

the functions in the SQL (Owens 2006). In SGEMS-UQ, the database is embedded in local

memory, but the web-centric nature of the SQL standard can allow for storage on a central

server while clients connect in remotely. SQL is also highly scalable, which means that SGEMS-

UQ can be used to data mine thousands of realizations. Since SQL is a relational database,

SGEMS-UQ can directly insert and retrieve any information without worrying about the

underlying data structures and retrieval procedures to answer queries. Extensions could take

advantage of full suite SQL packages such as SQL Server that offers built in statistical packages

such as SPSS (IBM Corporation, 2012), SAS (SAS Institute, 2011), and S-Plus (TIBCO Software

Incorporated, 2010) for performing more complicated data mining operations.

The different nature of parameters (e.g. float versus string) adds an additional challenge to the

data-mining process. SGEMS-UQ automatically identifies the data type of each parameter and

provides associated comparison operators. For instance, string-based parameters can only be

matched exactly using string operators, while floating point parameters can be compared using

a variety of operators, such as less than, equal to, greater than, etc. By taking advantage of the

transform iterator and binary function predicates in the C++ Standard Template Library

(Musser, Derge and Saini 2001) released as part of the latest version C++11 (Becker 2008), this

query can be extended to searching across multiple parameters. For instance, finding all

realizations with both mean porosity less than 30 and variogram radius greater than 15.

Another example is to take a series of realizations and finds their common input parameters.

While a third operation considers each cluster generated from the k-medoid step, and

evaluates which parameters types are common to all realizations (realizations generated by

different algorithms can also be compared using this technique as long as they have

comparable parameters with the same name). The entire parameter explorer module is built on

top of two abstract classes; a database interface class (param_base_class) used to query the

system, and a visualizer class (param_plot_window) that allows for data to be plotted (Figure

5). This reduces code redundancy for derived classes, and facilitates the implementation of

future data mining operations.

SGEMS-UQ provides a cohesive and efficient workflow for performing distance-based

uncertainty quantification, as will be shown with a walkthrough example in the following

section.

Application and Example Usage

Case Study

In this section, SGEMS-UQ is demonstrated for a problem of uncertainty quantification in

production forecasts for a reservoir in West Coast Africa (termed WCA). WCA is a deep-water

turbidite offshore reservoir that was successfully analyzed using distance-based uncertainty

modeling in Scheidt and Caers (2009b) and Caers (2011). To account for the depositional

uncertainty, the SGEMS Training-image generator (Boucher 2010) algorithm was used to

generate training images, each representing a geological interpretation of the reservoir

depositional system. Facies realizations are then filled with porosity and permeability values

using traditional variogram-based techniques.

This MDS analysis case study requires multiple geostatistical algorithms and external proxy

simulators as well as full-physics simulators. Scheidt and Caers (2009a) used multiple MATLAB

and Python scripts to perform this analysis. SGEMS-UQ drastically simplifies the uncertainty

quantification workflow, in addition to providing previously unavailable capabilities in analyzing

the results. All code and data are available from http://github.com/SCRFpublic/SGEMS-UQ. The

complete analysis and runs are also provided on this page.

Uncertainty Model for WCA

The response variables used for the WDA case study are cumulative oil and water production,

both of which are time series data types. The computation of oil/water production requires a

set of porosity / permeability realizations. These realizations are generated in two steps. First,

the facies models (Figure 6) is generated using a Single Normal Equation Sequential Simulation

(Strebelle 2002) algorithm (SGEMS program snesim), constrained to well data and

corresponding 3D training image with properties delineated in Table 1. Then the porosity is

generated with the Sequential Gaussian Simulation (SGEMS program sgsim). A histogram of the

distribution of each facies type was extracted from the well data, and used as the target

histogram. Uncertainty in this histogram is modeled through a triangular distribution for mean

and variance (see Table 2 and Table 3).

The permeability values were generated with a Markov Model 1 collocated coregionalization

(Journel, 1999) (SGEMS program cosgsim) with the previously generated porosity values as the

soft data. The spatial distribution of porosity and permeability is simulated independently for

each facies. The porosity, permeability and facies are combined using a cookie cutter approach

using a Python script included in the aforementioned repository. For this particular example, 10

facies models were generated per training image, and 10 porosity/permeability maps for each

facies were created, resulting in 100 realizations per training image. The final resulting

realizations contain parameters from a series of four permeability and porosity maps; one for

each facies. A simple Python script is used gather the sets of parameter files from different

simulations into a single XML file that can be read by SGEMS-UQ. This script

(ParseParameters.py) can be used to group arbitrary sets of simulation parameters, allowing

SGEMS-UQ to be extensible to a wide variety of cases.

Interfacing With Third Party Simulators

The cumulative oil and water production was simulated using ECLIPSE, and exported to a space

delimited text file, where each column corresponds to a response variable, and each row is a

realization. This format can be converted into an appropriate XML file with a simple Python

script (parseResponse.py). This script was developed such that it can be modified or extended

to extract arbitrary data from different simulators into the standard SGEMS-UQ XML dialect.

SGEMS-UQ has the capability to plot varying types of response data, allowing the user to

visualize the amount of uncertainty represented by a set of realizations. Figure 8 shows an

example of a time series response.

Distance-Based Workflow

The following section demonstrates a step-by-step approach for the distance-based uncertainty

analysis study of WCA using SGEMS-UQ. Such steps would be typical to most data sets.

1. Generation of a MDS Space:

This step consists of selecting the realizations and proxy responses to be analyzed. Figure 8

shows the GUI used to specify the generation of the MDS space. Any subset of realizations and

proxy responses can be selected, and several MDS spaces can be independently created.

SGEMS-UQ only displays the proxy responses that are common to selected realizations, which

in turn corresponds to which distances can be used for the MDS calculation. The distances

themselves can be computed using a variety of distance kernels, selectable within the same

window. These kernels can be applied to any of the implemented data types, but additional

kernels can also be added for future extensibility.

2. Visualization of the MDS space:

Once the MDS space is created and loaded into the visualizer, 3D cloud of points can be viewed,

span, zoom and rotated. This gives the user a visual representation of the distances between

realizations with regards to their proxy responses. An example of which is shown in Figure 10.

Individual realizations can be selected with appropriate highlighting.

3. K-Medoid Clustering:

The realizations can then be clustered using the kernel k-medoid (KKM) algorithm, as shown in

Figure 9. The number of clusters is left as an input parameter and all clustering results are

stored within the program for further visualization or analysis; such as clusters highlighting in

the MDS space. The clustering can be repeated with several numbers of medoids, and all the

clusters are kept and saved along with the MDS space. Then, the selected medoid realizations,

assumed to be representative of their respective cluster, are exported to a text file. They can

then be put through full physics flow simulators. For this particular example, twelve medoids

were selected, and the P10, P50 and P90 responses for cumulative oil and water production are

shown along with the quantiles from the exhaustive set in Figure 11.

4. Pareto Plot:

One-way sensitivity analysis can be performed within SGEMS-UQ by selecting the realizations,

the responses and the parameters with the included GUI as shown in Figure 12. This selection

generates a Pareto plot (Figure 12) for the sensitivity analysis of the mean and variance used as

parameters for the permeability/porosity realizations on the cumulative oil production after

2000 days. In this particular example, we can see that the two most statistically significant

parameters are the variance and mean of permeability in facies 2 and 1.

5. Parameter Analysis Within Clusters:

SGEMS-UQ allows a visual representation of the variation of parameters within a cluster or a

group of realizations, such as those generated by KKM in Step 3. Figure 13 shows the mean

permeability parameters (used in cosgim) for a few realizations within the 3rd cluster of the 5 k-

medoid grouping.

6. Finding Realization with Common Parameters:

The querying capabilities of SQL are used to identify realizations with similar parameter values

or within a specified range of values. An example of the interface is shown in Figure 15, where

we have queried the system for all realizations with a permeability variance in facies 3 less that

of realization NTG_real28. Plotting capabilities are also provided to depict the parameter values

among the search results. Additional capabilities include finding all realizations with the same

parameter value as shown in Figure 15, where the system is queried for all realizations with the

same mean porosity value for facies 1 as realization NTG_real15.

7. Extracting Common Parameters From Realizations:

Conversely SGEMS-UQ can query which parameters have the same value within a set of

realizations using the interface shown in Figure 16. By comparing these results to the distance

between the realizations in the MDS space, the user can gain further qualitative insight into the

sensitivity of the response to the parameter. This can be used to determine which simulation

parameters to focus on if there is the possibility of obtaining more information on, and thus

decrease the uncertainty in the response variable.

Conclusions

We have presented SGEMS-UQ, an open-source plugin for SGEMS that consolidates metric

space modeling into a single module that can be used with models from SGEMS. Two add-on

scripts were implemented to allow SGEMS-UQ to interface with a variety of external flow

simulators, and to merge SGEMS properties. A standardized XML grammar was designed such

that SGEMS-UQ is extensible to future geostatistical algorithms. Visualization tools were

included for response variables, clustering and the metric space. In addition, a SQL database

was embedded into the program for storing and querying simulation parameters. This has

facilitated a more powerful method of performing sensitivity analysis. Moreover, SGEMS-UQ

was developed with proven software design patterns and open standards ensuring it is easily

extensible, re-usable and maintainable.

Works Cited

Abdi, H., Williams, L., 2010. Principal component analysis. Computational Statistics, 2 (4), pp.

433-459.

Becker, P., 2008. Working Draft, Standard for Programming Language C++, Technical Report

N2798=08-0308, ISO/IEC JTC 1, Information technology, Subcommittee SC 22, Programming

Language C++. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2798.pdf

[Retrieved January 27th 2013]

Boucher, A., Gupta, R., Caers, J., Satija, A., 2010. Tetris: a training image generator for SGeMS.

Stanford Center for Reservoir Forecasting.

Brown, J. D., & Heuvelink, G. B., 2007. The Data Uncertainty Engine (DUE): A software tool for

assessing and simulating uncertain environmental variables. Computers & Geosciences , 33 (2),

172-190.

Caers, J., 2011. Modeling Uncertainty in the Earth Sciences. Wiley-Blackwell, Chichester, UK,

229pp.

Caers, J., & Zhang, T., 2004. Multiple-Point Geostatistics: A Quantitative Vehicle For Integrating

Geologic Analogs Into Multiple Reservoir Models. In: G. M. Grammer, P. M. Harris, & G. P.

Eberli, AAPG Memoir 80: Integration of Outcrop and Modern Analogs in Reservoir Modeling

AAPG, Tulsa, OK, USA, pp. 383–394.

Caers, J., Park, K., & Scheidt, C., 2009. Modelling Uncertainty in Metric Space. I.A.M.G. 2009, pp.

5-6.

Castelli, F., 2012. QJSON: The Easiest Way To Map JSON To Qt. http://qjson.sourceforge.net

[Retrieved October 24, 2012]

Cortes, C., Vapnik, V., 1995. Support-Vector Networks. Machine Learning, 20 (3), pp. 273-297.

Chilles, J.-P., & Delfiner, P., 1999. Geostatistics: Modeling Spatial Uncertainty, Wiley Hoboken,

NJ, USA, 720pp.

Date, C. J., & Darwen, H., 1996. A Guide to SQL Standard, Addison-Wesley, Boston, MA, USA,

544pp.

Dimitrakopoulos, R., & Ramazan, S., 2004. Uncertainty based production scheduling in open pit

mining. SME Transactions, 316.

Filar, J. A., & Haurie, A., 2009. Uncertainty and Environmental Decision Making: A Handbook of

Research and Best Practice (Vol. 138), Springer-Verlag. Berlin, Germany, 338pp.

Fowler, M., Beck, K., Brant, J., Opdyke, W., & Roberts, D., 1999. Refactoring: Improving the

Design of Existing Code, Addison-Wesley Professional, Boston, MA, USA, 464pp.

Guennebaud, G., Jacob, B., & et al., 2010. Eigen V3, http://eigen.tuxfamily.org, [Retrieved

October 14, 2012]

Goderis, S., 2008. On The Separation Of User Interface Concerns: A Programmer's Perspective

On The Modularisation Of User Interface Code. Ph.D. Dissertation, Vrije Universiteit Brussel,

Brussels, Belgium, 208pp.

Harbaugh, A. W., 2005. MODFLOW-2005, The U.S. Geological Survey Modular Ground-Water

Model—the Ground-Water Flow Process. Techniques and Method, 6 (A16), 253pp.

Holden, L., Hauge, R., Skare, O., & Skorstad, A., 1998. Modeling of Fluvial Reservoirs with Object

Models. Mathematical Geology , 30 (5), 473-496.

Honarkhah, M and Caers, J, 2010, Stochastic Simulation of Patterns Using Distance-Based

Pattern Modeling, Mathematical Geosciences, 42: 487–517

IBM Corporation, 2012. SPSS Statisctics, http://www-01.ibm.com/software/analytics/spss/,

[Retrived October 24, 2012].

Journel, A., 1999. Markov Models for Cross Covariances. Mathematical Geology, 31 (8), 955-964

Karlsson, B., 2005. Beyond the C++ Standard Library: An Introduction to Boost, Addison-Wesley

Professional, Boston, MA, USA, 432pp.

Lawrence Berkeley National Laboratory, 2012. TOUGH2 User's Guide. Berkeley, CA, USA, 197pp.

Liu, W., Principe, J., & Haykin, S., 2010. Kernel Adaptive Filtering: A Comprehensive

Introduction. Wiley, Hoboken, NJ, USA, 209pp.

Musser, D. R., Derge, G. J., & Saini, A., 2001. STL Tutorial and Reference Guide: C++

Programming With The Standard Template Library, Addison-Wesley Professional, New York, NY,

USA, 560pp.

Mariethoz, G., Renard, P., Cornation, F., & Jaquet, O., 2009. High Resolution Plurigaussian

Simulations of Lithofacies and Monte-Carlo Analysis. Ground Water, 47 (1), 13-24.

Owens, M., 2006. The Definitive Guide to SQLite, Apress, New York, NY, USA, 464pp.

Otsu, N., 1979. A Threshold Selection Method from Gray-Level Histograms. IEEE Transactions on

Systems, Man, and Cybernetics, 9 (1) , 62-66.

Peredo, O., & Ortiz, J., 2011. Parallel Implementation of Simulated Annealing To Reproduce

Multiple-Point Statistics. Computers & Geosciences , 37 (8), 1110-1121.

SAS Institute, 2011. Statistical Analysis System, http://www.sas.com, [Retrieved October 24

2012]

Scheidt, C., & Caers, J., 2009a. Uncertainty Quantification in Reservoir Performance Using

Distances and Kernel Methods—Application to a West Africa Deepwater Turbidite Reservoir.

SPE Journal , 14 (4), 680-692.

Scheidt, C., & Caers, J. 2009b. Representing Spatial Uncertainty Using Distances and Kernels.

Mathematical Geosciences , 41 (4), 397-419.

Schlumberger Limited, 2012. ECLIPSE 300, http://www.slb.com/services/software/reseng.aspx,

[Retrieved October 24, 2012].

Schlumberger Limited. 2010. Petrel E&P Software Platform. http://www.slb.com/petrel.aspx,

[Retrieved October 24, 2012].

Schroeder, W., Martin, K., & Loresen, B., 2004. The Visualization Toolkit (3rd Edition ed.),

Kitware Inc, New York, NY, USA, 504pp.

Strebelle, S., 2002. Sequential Simulation Drawing Structures From Training Images. PhD.

Dissertation, Stanford University, Stanford, CA, USA, 374pp.

Suzuki, S., & Caers, J., 2008. A Distance-based Prior Model Parameterization for Constraining

Solutions of Spatial Inverse Problems. Mathematical Geosciences , 40 (4), 445-469.

TIBCO Software Inc. 2010. S-Plus. http://spotfire.tibco.com/discover-spotfire, [Retrieved

October 24, 2012].

Weisfield, M., 2008. The Object-Oriented Thought Process, 3rd Edition ed., Addison-Wesley

Professional, Boston, MA, USA, 360pp.

Wu, J., Jones, K. J., Li, H., & Loucks, O. L., 2006. Scaling and Uncertainty Analysis in Ecology:

Methods and Applications. Springer, New York, NY, USA, 338pp.

Figure 1: Components of geostatistical modeling with forward modeling that are currently

combined in ad-hoc fashion.

Forwardmodel

GeostatisticsSpatial

uncertainty

SpatialModel

parameters

Analysis ofResponses

Forward model

parameters

FieldData

Figure 2: Workflow of distance based uncertainty quantification

definition of a distance

(1)

(3)

(4)

(5)

Responses

Time

Spatial & ForwardModel

parameters

GeostatisticsSpatial

uncertainty

(2)

Forwardmodel

(6)

Analysis ofResponses

(7)

Figure 3: Application of k-medoid and kernel k-medoid clustering to generate cluster models

Figure 4: Overview of software structure of SGEMS-UQ and data flow of SGEMS-UQ program

Figure 5: Data mining base classes structure. Future modules can be easily added by inheriting

from the two base classes.

Figure 6: Selection of two 3D training images with each having two sample facies models

generated using SGEMS program snesim.

Training image Facies models from SGEMS

Figure 7: Sample plot of a time series proxy response (cumulative oil production) that can be

visualized within SGEMS-UQ.

Figure 8: GUI interface used to generate MDS space within SGEMS-UQ.

Figure 9: Visualizer for MDS space and parameters in tree format for realization 92. Property

highlighting and other GUI enhancements have been included.

Figure 10: Sample 3D visualization of MDS performed on 100 realizations using Cumulative Oil

Production (over the entire field), as the distance metric. SGEMS-UQ allows for multiple

instances of k-means to be performed.

Figure 11: (a) Cumulative oil production P10, P50 and P90 values and (b) Cumulative water

production response of the 12 realizations selected by KKM compared to P10, P50 and P90

values from exhaustive set.

A B

P90P50

P10

P90

P50

P10

Figure 12: Pareto plot produced by examining sensitivity of facies porosity/permeability means

and variances. The GUI interface for producing Pareto plots is also shown.

Figure 13: Data mining of parameter value within a cluster. This operation plots the mean value

of the histogram used to generate the permeability map for facies 2 of all properties within a

particular cluster.

Figure 14: Data mining interface over all the realizations. This particular instance returns and

plots all realizations with facies3 permeability standard deviation less than that of realization

NTG_real28. This can also be used to return realizations with a parameter within a certain

range of a baseline realization.

Figure 15: Data mining interface for locating all realizations with a common parameter value.

Figure 16: Data mining interface for locating all common parameter values given a set of

realizations.

Table 1: Properties of training images

Facies

Architecture

Channel

Thickness

Width/Thickness

Ratio

Channel

Sinuosity

1 0.019 0.025 0.030 25

2 0.113 0.162 0.213 1000

3 0.151 0.191 0.231 1430

Table 2: Mean values of target histograms used to generate porosity and permeability

Facies

Architecture

Channel

Thickness

Width/Thickness

Ratio

Channel

Sinuosity

1 0.019 0.025 0.030 25

2 0.113 0.162 0.213 1000

3 0.151 0.191 0.231 1430

Table 3: Standard deviation of target histograms used to generate porosity and permeability

STD Porosity Permeability

Min Mid Max Min Mid Max

Facies1 0.022 0.027 0.032 25 37 50

Facies2 0.045 0.055 0.065 550 650 750

Facies3 0.048 0.058 0.068 485 585 685

Facies4 0.035 0.045 0.055 295 345 395

Forward model

Geostatistics Spatial

uncertainty

Spatial Model

parameters

Analysis of Responses

Forward model

parameters

Field Data

Figure 1

definition of a distance

(1)

(3)

(4)

(5)

Responses

Time

Spatial & Forward

Model parameters

Geostatistics Spatial

uncertainty

(2)

Forward model

(6)

Analysis of Responses

(7)

Figure 2

Figure 3

Figure 4

Figure 5

Training image Facies models from SGEMS

Figure 6

Figure 7

Figure 8

Figure 9

Figure 10

Figure 11

A B

P90 P50

P10

P90

P50

P10

Figure 12

Figure 13

Figure 14

Figure 15

Figure 16

Facies

Architecture

Channel

Thickness

Width/Thickness

Ratio

Channel

Sinuosity

1 0.019 0.025 0.030 25

2 0.113 0.162 0.213 1000

3 0.151 0.191 0.231 1430

Mean

Porosity Permeability

Min Mid Max Min Mid Max

Facies1 0.019 0.025 0.030 25 50 75

Facies2 0.113 0.162 0.213 1000 1200 1400

Facies3 0.151 0.191 0.231 1430 1630 1830

Facies4 0.190 0.230 0.270 1875 2175 2475

STD Porosity Permeability

Min Mid Max Min Mid Max

Facies1 0.022 0.027 0.032 25 37 50

Facies2 0.045 0.055 0.065 550 650 750

Facies3 0.048 0.058 0.068 485 585 685

Facies4 0.035 0.045 0.055 295 345 395