Principal component spectral analysis

Hao Guo¹, Kurt J. Marfurt², and Jianlei Liu³

ABSTRACT

Spectral decomposition methods help illuminate lateral changes in porosity and thin-bed thickness. For broadband data, an interpreter might generate 80 or more somewhat redundant amplitude and phase spectral components spanning the usable seismic bandwidth at 1-Hz intervals. Large numbers of components can overload not only the interpreter but also the display hardware. We have used principal component analysis to reduce the multiplicity of spectral data and enhance the most energetic trends inside the data. Each principal component spectrum is mathematically orthogonal to other spectra, with the importance of each spectrum being proportional to the size of its corresponding eigenvalue. Principal components are ideally suited to identify geologic features that give rise to anomalous moderate- to high-amplitude spectra. Unlike the input spectral magnitude and phase components, the principal component spectra are not direct indicators of bed thickness. By combining the variability of multiple components, principal component spectra highlight stratigraphic features that can be interpreted using a seismic geomorphology workflow. By mapping the three largest principal components using the three primary colors of red, green, and blue, we could represent more than 80% of the spectral variance with a single image. We have applied and validated this workflow using a broadband data volume containing channels draining an unconformity, which was acquired over the Central Basin Platform, Texas, U.S.A. Principal component analysis reveals a channel system with only a few output data volumes.
The same process gives the interpreter the flexibility to remove unwanted high-amplitude geologic trends or random noise from the original spectral components by eliminating, during the reconstruction process, those principal components that do not aid in delineating prospective features.

INTRODUCTION

Spectral decomposition of seismic data is a recently introduced interpretation tool that aids in the identification of hydrocarbons, classification of facies, and calibration of thin-bed thickness. Whether based on the discrete Fourier transform (Partyka et al., 1999), wavelet transform (Castagna et al., 2003), or S-transform (Matos et al., 2005), spectral decomposition typically generates significantly more output data than input data, presenting challenges in conveying the meaning of these data in a concise and interpreter-friendly form. Typically, an interpreter might generate 80 or more spectral amplitude and phase components spanning the usable seismic bandwidth at 1-Hz intervals. With so much data available, the key issue for interpretation is to develop an effective way for data representation and reduction.

The most common means of displaying these components is simply by scrolling through them to determine manually which single frequency best delineates an anomaly of interest. We illustrate this process for the phantom horizon slice through the seismic amplitude volume (zero-phase reflectivity) 66 ms above the Atoka unconformity from a survey acquired over the Central Basin Platform, Texas, U.S.A. (Figure 1). We compute 86 spectral components ranging from 5 Hz through 90 Hz using a matched pursuit technique described by Liu and Marfurt (2007a). Figure 2 shows representative corresponding phantom horizon slices at 20-Hz intervals from 10 Hz (Figure 2a) through 90 Hz (Figure 2e).
Manuscript received by the Editor 25 June 2008; revised manuscript received 17 September 2008; published online 29 May 2009.
1 University of Houston, Allied Geophysical Laboratories, Houston, Texas, U.S.A. E-mail: [email protected].
2 The University of Oklahoma, ConocoPhillips School of Geology and Geophysics, Norman, Oklahoma, U.S.A. E-mail: [email protected].
3 Chevron Energy Technology Company, Houston, Texas, U.S.A. E-mail: [email protected].
© 2009 Society of Exploration Geophysicists. All rights reserved.
GEOPHYSICS, VOL. 74, NO. 4 (JULY-AUGUST 2009); P. P35–P43, 11 FIGS. 10.1190/1.3119264
Downloaded 08 Nov 2009 to 118.137.43.11. Redistribution subject to SEG license or copyright; see Terms of Use at http://segdl.org/

By observing how bright and dim areas of the response move laterally with increasing frequency, a skilled interpreter can determine whether a channel or other stratigraphic feature of interest is thickening or thinning. For example, the thinner upstream portions of the channel indicated by the magenta arrows in Figure 1 are better delineated on the 40–60-Hz components displayed in Figure 2c, whereas the thicker downstream portions of the channel indicated by the yellow arrows in Figure 1 are better delineated by the 20–40-Hz components displayed in Figure 2b. After initial analysis, we might be able to limit the display to only those frequency components most important to the task at hand (Fahmy et al., 2005). For example, we might choose the 35-Hz component to map the channel if we observe
that most channel beds have the maximum constructive interference at this frequency. However, in terms of conveying information about the spectral variation in the data space, only a small portion of the spectral variation is displayed (one out of a possible 86 spectral components).

Figure 1. Phantom horizon slice through the seismic data 66 ms above the Atoka unconformity from a survey acquired over the Central Basin Platform, Texas, U.S.A. Yellow arrows indicate downstream, and magenta arrows upstream, components of complex channel systems draining the unconformity high. (Color bar: amplitude, low to high; scale bar: 2 km.)

Figure 2. Phantom horizon slices corresponding to those shown in Figure 1 through the (a) 10-Hz, (b) 30-Hz, (c) 50-Hz, (d) 70-Hz, and (e) 90-Hz amplitude spectral component volumes computed using a matched pursuit algorithm. Yellow arrows indicate downstream components of the channel system and are better delineated by the 20–40-Hz frequency images. Magenta arrows indicate upstream components of complex channel systems and are better delineated by the 40–60-Hz frequency images. (Color bars: amplitude, low to high; scale bars: 2 km.)

Furthermore, although scanning works well for choosing the best frequency for a given horizon, it can become impractical when trying to determine the best frequency for multiple spectral component volumes. The spectral decomposition algorithm also outputs the phase spectrum corresponding to each amplitude spectrum. Figure 3 shows the phase spectra corresponding to the amplitude spectra shown in Figure 2. The phase images in Figure 3 are not easy to interpret when compared to the amplitude slices in Figure 2, and thus the interpreter might choose to ignore the phase spectrum information during the interpretation.

We can reduce the number of images to be scanned by the use of color stacking (e.g., Liu and Marfurt, 2007b; Stark, 2005; Theophanis and Queen, 2000). In this workflow, we plot one component against red, a second against green, and a third against blue, and scale so that the color value range (0–255) of each primary color represents 95% of the corresponding component data values. Figure 4 shows red, green, and blue (RGB) images of different frequency combinations. Figure 4a-c is designed to accentuate the spectral variation in the low-, intermediate-, and high-frequency ranges, respectively. Figure 4d-f shows three possible frequency combinations on the global frequency ranges.

Compared with the images in Figure 2, the RGB images in Figure 4 show greater detail by combining multiple images. Simply stated, color stacking can triple the information that is conveyed by a single mono-frequency display. Considering the complexity of the spectra in the target horizon and the limit of color channels for display, each combination will highlight only certain parts of the whole spectrum. Without further data reduction, representation of the original spectral content reaches its limits (for example, 86×85×84 combinations). This situation motivates more efficient data-reduction techniques.
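The color-stacking workflow described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function names, the symmetric 2.5/97.5 percentile clip used to approximate "represents 95% of the data values", and the random stand-in maps are all assumptions made here.

```python
import numpy as np

def scale_to_byte(component, coverage=95.0):
    """Map one component map to 0-255 so that `coverage` percent of its
    values fall inside the color range; the tails are clipped.
    (Symmetric percentile clipping is an assumption of this sketch.)"""
    tail = (100.0 - coverage) / 2.0
    lo, hi = np.percentile(component, [tail, 100.0 - tail])
    if hi <= lo:                      # constant map: avoid division by zero
        return np.zeros(component.shape, dtype=np.uint8)
    scaled = (np.clip(component, lo, hi) - lo) / (hi - lo)
    return np.round(255.0 * scaled).astype(np.uint8)

def rgb_stack(red, green, blue, coverage=95.0):
    """Stack three 2D component maps into an (M, N, 3) RGB image."""
    return np.dstack([scale_to_byte(c, coverage) for c in (red, green, blue)])

# Hypothetical stand-ins for three horizon slices (e.g., 10, 40, and 70 Hz).
rng = np.random.default_rng(0)
maps = rng.normal(size=(3, 50, 60))
image = rgb_stack(*maps)

# Number of ordered ways to assign 3 of 86 components to R, G, and B.
n_combinations = 86 * 85 * 84
```

The product 86×85×84 evaluates to 614,040 ordered red-green-blue assignments, which is why exhaustive scanning of color stacks is not practical.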

One method of data reduction is to generate synthetic attributes first by correlating the data against predefined spectra (or basis functions) and plotting the correlation coefficients, instead of the spectral components, against color. In the Central Basin Platform example, we note that the combination of 30, 60, 90 Hz is not strikingly different from the combination of 20, 50, 80 Hz, implying that there are no significant changes over a 10-Hz range. Taking advantage of the redundancy or correlation in the original spectra, Stark (2005) defines three spectral basis functions that can be interpreted as simple averages and plots the low-frequency average against red, the intermediate-frequency average against green, and the highest-frequency average against blue.

Liu and Marfurt (2007b) provide a moderate improvement to Stark's (2005) gate or "boxcar" spectral basis function by applying raised cosines over user-defined spectral ranges to generate low-, mid-, and high-frequency color stack images. Whereas in some cases the raised cosine function better approximates the cosine-like thin-bed tuning response, such approximations are suboptimal; we might lose a considerable amount of frequency variability by using either Stark's (2005) or Liu and Marfurt's (2007b) three predefined basis functions. Furthermore, there is no simple measure of how much of the spectrum we do not represent.
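To make the idea of predefined basis functions concrete, the sketch below builds boxcar and raised-cosine weights over frequency and applies them to a set of spectral maps. The exact functional forms used by Stark (2005) and Liu and Marfurt (2007b) are not given in the text, so the unit-sum normalization, the band edges, and the function names here are assumptions for illustration only.

```python
import numpy as np

def boxcar_basis(freqs, f_lo, f_hi):
    """Unit-sum gate that averages the components between f_lo and f_hi."""
    w = ((freqs >= f_lo) & (freqs <= f_hi)).astype(float)
    return w / w.sum()

def raised_cosine_basis(freqs, f_lo, f_hi):
    """Unit-sum raised cosine that peaks at the center of [f_lo, f_hi]
    and tapers to zero at both edges."""
    center = 0.5 * (f_lo + f_hi)
    half_width = 0.5 * (f_hi - f_lo)
    inside = np.abs(freqs - center) <= half_width
    taper = 0.5 * (1.0 + np.cos(np.pi * (freqs - center) / half_width))
    w = np.where(inside, taper, 0.0)
    return w / w.sum()

freqs = np.arange(5, 91, dtype=float)          # 86 components, 5-90 Hz
bands = [(5, 30), (30, 60), (60, 90)]          # hypothetical low/mid/high ranges

# Hypothetical spectra: 86 components on a 50 x 60 horizon slice.
rng = np.random.default_rng(1)
spectra = rng.random((freqs.size, 50, 60))

basis = np.array([raised_cosine_basis(freqs, lo, hi) for lo, hi in bands])
# Each output map is a weighted sum over frequency: three maps for R, G, B.
low_mid_high = np.tensordot(basis, spectra, axes=(1, 0))
```

Whatever the shape of the weights, only three fixed combinations of the 86 components are kept, and nothing in the construction reports how much spectral variability falls outside them.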

Instead of using predefined basis functions or spectra, we propose using principal component analysis to determine mathematically those frequency spectra that best represent the data variability, thereby segregating noise components and reducing the dimensionality of spectral components. We begin with a review of principal component analysis applied to real spectral amplitudes, and then show how it is readily generalized to represent complex spectra consisting of spectral magnitudes and phases. We apply the method to a land seismic data volume from the Central Basin Platform and show how we can represent 80% of the data variation using a color stack. We conclude by showing how we can filter out selected spectral variations and thereby enhance individual spectral images by eliminating interpreter-chosen components during the reconstruction.

Figure 3. Phantom horizon slices corresponding to those shown in Figures 1 and 2 through the (a) 10-Hz, (b) 30-Hz, (c) 50-Hz, (d) 70-Hz, and (e) 90-Hz phase spectral component volumes computed using a matched pursuit algorithm. Yellow arrows mark the location of downstream components of the channel system. Note the phase rotation of the channel system in different frequency phase components. White arrows mark the position of a northwest linear feature. (Color bars: phase, −180° to 180°; scale bars: 2 km.)

THEORY AND METHOD

Principal component analysis finds a new set of orthogonal axes that have their origin at the data mean and that are rotated so that the data variance is maximized (Figure 5). The orthogonal axes are called eigenvectors and represent spectra in the original frequency domain. The projections of the original spectra onto these axes are called principal component (PC) bands. The amount of the total variance that each PC band can represent is quantified by its corresponding eigenvalue. Compared with the original frequency components, PC bands are orthogonal (and thus uncorrelated) linear combinations of the original spectra. In theory, we calculate the same number of output PC bands as the input spectral components. The first PC band represents the largest percentage of data variance, the second PC band represents the second-largest data variance, and so on. The last PC bands represent the uncorrelated part of the original spectral data, which includes random noise.

In the seismic spectral volumes we have analyzed, we find that the first 15 PC bands represent more than 95% of the total variance of the original data. The first three PC bands represent as much as 80% of the total variance of the data. Furthermore, we can reconstruct "filtered" spectra by using a subset of the interpreter-chosen PC spectra, thereby providing a means of rejecting random noise and components that interpreters consider as uninteresting background. We show a quantitative example of this in the next section.
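The reconstruction of "filtered" spectra from a subset of PC spectra can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the helper name `pc_filter`, the use of NumPy's `eigh`, and the synthetic rank-one test data are all choices made here, and the matrix is formed without mean removal, following the crosscorrelation form of equation 1.

```python
import numpy as np

def pc_filter(spectra, keep):
    """Rebuild spectra from a chosen subset of principal components.

    spectra : (J, n_traces) array, one row per frequency component.
    keep    : indices of the PC bands to retain (0 = largest eigenvalue).
    """
    cov = spectra @ spectra.T                 # J x J crosscorrelation matrix
    eigvals, eigvecs = np.linalg.eigh(cov)    # ascending eigenvalue order
    order = np.argsort(eigvals)[::-1]         # re-sort to descending
    eigvecs = eigvecs[:, order]
    pc_bands = eigvecs.T @ spectra            # project every trace
    kept = eigvecs[:, keep]
    return kept @ pc_bands[keep, :]           # sum of kept eigenvector * band

# Synthetic test: one coherent spectral shape plus weak random noise.
rng = np.random.default_rng(2)
J, n_traces = 8, 500
signal = np.outer(np.linspace(1.0, 2.0, J), rng.normal(size=n_traces))
noisy = signal + 0.05 * rng.normal(size=(J, n_traces))

filtered = pc_filter(noisy, keep=[0])         # keep only the first PC band
everything = pc_filter(noisy, keep=list(range(J)))
```

Keeping every band returns the input unchanged, because the eigenvectors form a complete orthonormal set; dropping bands removes whatever part of each spectrum lies along the rejected eigenvectors.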

Principal component analysis of more than 100 spectral components is well established in remote-sensing interpretation software and workflows (Rodarmel and Shan, 2002). Principal component analysis of seismic attributes also is well established, particularly with respect to seismic shape analysis (Coléou et al., 2003). Mathematically, principal component analysis is an effective statistical data-reduction method for data spaces in which the dimension is very large and the data axes are not orthogonal (that is, somewhat redundant) to each other. From an interpreter's point of view, principal component analysis applied to spectral components begins by determining which frequency spectrum (the first principal component) best represents the entire data volume. The second principal component is that spectrum that best represents the part of the data not represented by the first principal component. The third principal component is the spectrum that best represents that part of the data not represented by the first two principal components, and so on.
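One standard way to make "best represents" precise is the variational form below. It is the usual textbook characterization of principal components, added here as a reading aid rather than taken from the text; $\mathbf{C}$ is the covariance matrix of equation 1 and each $\mathbf{v}_p$ is a unit-length spectrum.

```latex
\mathbf{v}_1 = \arg\max_{\|\mathbf{v}\| = 1} \; \mathbf{v}^{T}\mathbf{C}\,\mathbf{v},
\qquad
\mathbf{v}_p = \arg\max_{\substack{\|\mathbf{v}\| = 1 \\ \mathbf{v} \perp \mathbf{v}_1, \ldots, \mathbf{v}_{p-1}}} \; \mathbf{v}^{T}\mathbf{C}\,\mathbf{v},
\qquad
\lambda_p = \mathbf{v}_p^{T}\mathbf{C}\,\mathbf{v}_p .
```

Under this reading, each successive spectrum captures the most remaining energy among all spectra orthogonal to those already chosen, and the maximized value is the corresponding eigenvalue.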

If normalized by the total sum of all eigenvalues, the eigenvalue associated with each eigenvector represents the percentage of data that can be represented by its corresponding principal component. In general, the first principal component is a spectrum that represents more than 50% of the data variance, the second principal component is a spectrum that represents another 15% through 25% of the data variance, and the third principal component is a spectrum that represents about 5% of the data variance. Together, these three components usually represent about 80% of the frequency variance seen in the data along this horizon slice.

Figure 4. Composite RGB images of (a) 10–20–30-Hz, (b) 40–50–60-Hz, (c) 70–80–90-Hz, (d) 10–40–70-Hz, (e) 20–50–80-Hz, and (f) 30–60–90-Hz phantom horizon slices through amplitude spectral component volumes where the first spectral amplitude is plotted against red, the second against green, and the third against blue. Each spectral component has been balanced over a 500-ms analysis window over the entire survey. (Color bars: amplitude at the listed frequencies, low to high; scale bars: 2 km.)

Principal component analysis on spectral components consists of three steps. The first step is to assemble the covariance matrix by crosscorrelating every frequency component, $j = 1, 2, \ldots, J$, with itself and all other frequency components (Figure 6), resulting in a square, J by J symmetrical, covariance matrix $\mathbf{C}$:

$$
C_{jk} = \sum_{n=1}^{N} \sum_{m=1}^{M} d_{mn}^{(j)} \, d_{mn}^{(k)}, \tag{1}
$$

where $C_{jk}$ is the jkth element of the covariance matrix $\mathbf{C}$; N is the number of seismic lines in the survey; M is the number of seismic crosslines in the survey; and $d_{mn}^{(j)}$ and $d_{mn}^{(k)}$ are spectral magnitudes of the jth and kth frequencies at line n and crossline m.

The second step is to decompose the covariance matrix into J scalar eigenvalues $\lambda_p$ and J unit-length J×1 eigenvectors $\mathbf{v}_p$ by solving the equation

$$
\mathbf{C}\,\mathbf{v}_p = \lambda_p \mathbf{v}_p. \tag{2}
$$

Almost all numerical solutions of equation 2 (we use LAPACK (2007) program ssyevx) sort the eigenvectors $\mathbf{v}_p$ in either ascending or descending order according to their corresponding eigenvalues. The third step is to project the spectrum at each trace onto each eigenvector $\mathbf{v}_p$ to obtain a map of coefficients $a_{mn}^{(p)}$ that measure how much of each spectrum is represented by a given eigenvector:

$$
a_{mn}^{(p)} = \sum_{j=1}^{J} v_p(j) \, d_{mn}^{(j)}, \tag{3}
$$

where the index j indicates the jth frequency. The output is a series of PC bands sorted in descending order of their statistical significance, that is, the percentage of original data variance observable in the particular PC band.
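The three steps of equations 1 through 3 can be sketched with NumPy as follows. This is a minimal sketch, not the authors' implementation: `np.linalg.eigh` stands in for the LAPACK `ssyevx` routine named in the text, the `[frequency, crossline, line]` array layout and the function name `spectral_pca` are assumptions, and the input is random stand-in data.

```python
import numpy as np

def spectral_pca(d):
    """Principal component analysis of spectral magnitudes.

    d : (J, M, N) array of magnitudes, indexed [frequency, crossline, line].
    Returns eigenvalues (descending), eigenvectors (columns), and PC maps
    of shape (J, M, N).
    """
    J, M, N = d.shape
    flat = d.reshape(J, M * N)

    # Step 1 (equation 1): C[j, k] = sum over all traces of d_j * d_k.
    C = flat @ flat.T

    # Step 2 (equation 2): C v_p = lambda_p v_p for symmetric C.
    eigvals, eigvecs = np.linalg.eigh(C)      # returned in ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Step 3 (equation 3): a_p = sum_j v_p(j) * d_j at every trace.
    pc_maps = (eigvecs.T @ flat).reshape(J, M, N)
    return eigvals, eigvecs, pc_maps

# Hypothetical data: 86 components (5-90 Hz) on a small 40 x 30 patch.
rng = np.random.default_rng(3)
d = rng.random((86, 40, 30))
eigvals, eigvecs, pc_maps = spectral_pca(d)
percent = 100.0 * eigvals / eigvals.sum()     # share carried by each PC band
```

With this construction the summed squared amplitude of each PC map equals its eigenvalue, so `percent` is the quantity the text describes as the eigenvalue normalized by the sum of all eigenvalues.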

EXAMPLE

To illustrate the effectiveness of this technique, we examine the phantom horizon slice 66 ms above the Pennsylvanian age Atoka unconformity from a survey acquired over the Central Basin Platform, shown in Figure 1. We compute 86 spectral components ranging from 5 to 90 Hz using a matched pursuit technique described by Liu and Marfurt (2007a). Next we perform principal component analysis on the 86 spectral components, form an 86 by 86 covariance matrix using equation 1, decompose it into 86 eigenvalue-eigenvector pairs using equation 2, and project the original spectra at each trace onto each eigenvector using equation 3. Figure 7a displays the percentage of data defined by each of the principal component spectra (or eigenvectors) displayed in Figure 7b. The first three principal components account for most of the spectral variance seen along this horizon. The remaining components account for only about 17% of the data variance.

Figure 5. Principal component analysis (PCA) of data consisting of only three frequency components, with black spheres representing the three spectral components for each trace. The data cloud indicates that the three components are highly correlated and somewhat redundant. The data variance is the projection of the data cloud onto the component axis. The PCA rotates the original 10, 20, and 30-Hz axes so that the first eigenvector (PC band) represents the greatest variability in the data. The variance along the second eigenvector is relatively small and mathematically uncorrelated with the major trend. The third eigenvector (not shown) is perpendicular to the first two and represents the least amount of variance in the data. Thus the first PC band can effectively capture the major features seen in the data, reducing the amount of data by a factor of three. (Figure modified from a similar one courtesy of ScottPickford Ltd.)

Figure 6. Cartoon showing the computation of the covariance matrix C. Each complex spectral time slice is simply crosscorrelated with itself and all other complex spectral time slices along the horizon of interest. The cartoon arranges the crosscorrelations of the 10-Hz through 90-Hz slices as

$$
\mathbf{C} =
\begin{pmatrix}
C(10,10) & \cdots & C(10,50) & \cdots & C(10,90) \\
\vdots & & \vdots & & \vdots \\
C(50,10) & \cdots & C(50,50) & \cdots & C(50,90) \\
\vdots & & \vdots & & \vdots \\
C(90,10) & \cdots & C(90,50) & \cdots & C(90,90)
\end{pmatrix}
$$

Figure 7. (a) The first 20 eigenvalues and (b) the first five corresponding eigenvectors (or principal component spectra) computed from the 86 spectral components, eight of which are shown in Figure 2. The first three eigenvectors represent more than 80% of the variance in the data. The remaining eigenvectors represent only a small part of the spectral variance. The black line in (a) represents the total cumulative percentage that the first N principal components can represent. Because the data were spectrally balanced using the spectral decomposition algorithm described by Liu and Marfurt (2007a), the first principal component appears as a flat spectrum. By construction, the length of each eigenvector is exactly 1.0. (Axes: (a) percentage (%) versus PC band number; (b) coefficients of PCA bands 1 through 5 versus frequency component (Hz).)
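The cumulative curve described for Figure 7a is a running sum of the normalized eigenvalues. The sketch below shows that calculation and a helper that reports how many leading PC bands are needed to reach a target fraction. The eigenvalue list is hypothetical: only the first value (62) and the three-band total (83) echo percentages quoted in the text, and the remaining values and the name `bands_needed` are invented for illustration.

```python
import numpy as np

def bands_needed(eigenvalues, target):
    """Smallest number of leading PC bands whose summed eigenvalues reach
    `target` (a fraction between 0 and 1) of the total."""
    ordered = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    cumulative = np.cumsum(ordered) / ordered.sum()
    return int(np.argmax(cumulative >= target)) + 1

# Hypothetical eigenvalues, already expressed in percent of the total.
# Only the first value (62) and the three-band total (83) echo the text.
eigenvalues = [62, 15, 6, 5, 4, 3, 2, 1, 1, 1]
cumulative_percent = np.cumsum(eigenvalues)

n_for_80 = bands_needed(eigenvalues, 0.80)
n_for_90 = bands_needed(eigenvalues, 0.90)
```

A measure of this kind is what the predefined basis functions lack: the eigenvalues state directly how much of the spectral variance a given set of bands leaves unrepresented.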

Although the input amplitude volume has nonwhite amplitude spectra, the spectral components were statistically balanced as part of Liu and Marfurt's (2007a) matched-pursuit spectral decomposition algorithm. For this reason, the first eigenvector (PC band 1) is approximately flat and represents 62% of the total variance in the original data. The second eigenvector is monotonically decreasing, with the high frequencies contributing less than the low frequencies. We interpret this trend as representing the fact that the low frequencies probably are in-phase with each other, whereas the higher frequencies might have greater variance and need to be represented by more than one eigenvector.

Although it is tempting to assign physical significance to these spectra (with eigenvector 3 perhaps representing thin-bed tuning at about 55 Hz), we need to remember that they reside in mathematical rather than geologic space. Whereas the first eigenvector best represents the data, all subsequent eigenvectors are constructed mathematically to be orthogonal to the previous ones. The PC bands are just weighted sums of the original spectral components, as seen in equation 3.

In Figure 8, we plot the four largest principal components of the data, as well as the 71st component. It is important to remember that if a given event has very high reflectivity, it will have high amplitude in its components as well. For this reason, we see channels tuning in and out in PC 1. We note that components 2, 3, and 4 display more anomalous behavior in the sense of spectral shape changes relative to the total of all components. The PC 71 represents one example of the "noisy" PC bands starting from number 20; we interpret PC 71 as random noise.

Figure 8. The spectra projected onto the (a) first, (b) second, (c) third, (d) fourth, and (e) 71st principal component. Note that the anomalous areas are best represented by the second, third, and fourth principal component. The 71st principal component represents random noise. (Color bars: amplitude, low to high; scale bars: 2 km.)

By mapping the three largest principal components against red, green, and blue, we can represent 83% of the spectral information with a single colored image (Figure 9) in which each component is rescaled to span the range 0–255. The conspicuous red features delineate the wider channels seen on the erosional high in the central and upper region in the horizon time map. The green features delineate narrower channels that are upstream from the previous red channels. Note that the narrow channels indicated by green arrows are quite difficult to see in Figures 2–4. Although they are best delineated by the higher-frequency components, these components also are contaminated by noise. Based on our earlier discussion, note that coherent spectra will be represented by the first few principal components, and incoherent spectra corresponding to random noise will be represented by a linear combination of later principal components (such as PC 71 shown in Figure 8e). Selection of the PC bands with large eigenvalues implicitly increases the signal-to-noise ratio by rejecting noisy PC bands such as 71.

PRINCIPAL COMPONENTS OF COMPLEX SPECTRA

The previous analysis was performed only on the magnitude of the spectra, represented by the images displayed in Figure 2. The phase spectra displayed in Figure 3 provide additional information. For example, in Figure 3c the channels marked by the yellow arrows show a 90° phase shift relative to the background, and these channels are more chaotic in the higher-frequency phase image as shown in Figure 3e. Note the spectral variation of the northwest features
marked by the white arrows in Figure 3a-e. Several geologic depositional environments might be represented by the same amplitude spectrum. In an extremely simple example, consider the following four 2-term time series: (1, −1/2), (1/2, −1), (−1/2, 1), and (−1, 1/2). Each of these four series will have the same amplitude spectra. This ambiguity could create a false sense of continuity, whereas in reality we change from an upward-coarsening to an upward-fining sequence.
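The claim that these four series share one amplitude spectrum can be checked numerically. The sketch below reads the four series as (1, −1/2), (1/2, −1), (−1/2, 1), and (−1, 1/2), which is the sign pattern that makes the claim hold; the zero-padded length of 64 samples is an arbitrary choice made here.

```python
import numpy as np

# The four 2-term series from the text.
series = np.array([
    [ 1.0, -0.5],
    [ 0.5, -1.0],
    [-0.5,  1.0],
    [-1.0,  0.5],
])

n_freq = 64                                   # zero-pad for a denser spectrum
spectra = np.fft.rfft(series, n=n_freq, axis=1)
amplitude = np.abs(spectra)
phase = np.angle(spectra)

# Compare every series against the first one.
same_amplitude = np.allclose(amplitude, amplitude[0])
same_phase = np.allclose(phase, phase[0])
```

For a series (a, b) the squared amplitude at angular frequency ω is a² + b² + 2ab cos ω, and all four series have a² + b² = 5/4 and ab = −1/2, so the amplitudes coincide while the phases differ. That difference is the information discarded when only magnitudes are analyzed.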

A formal generalization of equations 1–3 would begin by computing a J by J complex Hermitian-symmetrical covariance matrix from the amplitude and phase of the complex spectra. This complex Hermitian covariance matrix then is decomposed into J real eigenvalue-complex eigenvector pairs. Crosscorrelating the complex conjugate of the complex eigenvectors with the complex spectra provides complex crosscorrelation coefficients or principal component maps. We have implemented this approach and find it unsatisfactory for two reasons. First, it is unclear how to generalize our RGB multiattribute display to represent multiple complex maps. Second, and more important, the phase of the eigenvectors provides an extra degree of freedom to fit the complex spectra. This extra degree of freedom results in the channels being blurred instead of appearing as sharp discontinuities.

Marfurt (2006) recognized this same limitation in principal-component coherence computations using the analytic (complex) rather than the original (real) trace. He devised an alternative means of computing the statistics of the original (real) and Hilbert transform (imaginary) data, and simply treated the Hilbert transform of the data as if they were additional real samples. For our complex spectral analysis problem, we therefore simply add the covariance matrices computed from the real components, $d_{mn}^{(j)} \cos \varphi_{mn}^{(j)}$, to the covariance matrix computed from the imaginary components, $d_{mn}^{(j)} \sin \varphi_{mn}^{(j)}$, where $d_{mn}^{(j)}$ is the magnitude and $\varphi_{mn}^{(j)}$ is the phase of the complex spectra at the jth frequency at line n and crossline m. If our survey has N × M seismic traces, this process is equivalent to considering the survey as having 2N × M traces. Because the real and imaginary components of the complex spectra are independent, in general, we will need to use more principal components to reconstruct the data than if we used the real data (or alternatively magnitude data) alone.

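Generalizing equation 1, we obtain a J by J real symmetrical covariance matrix:

$$C_{jk} = \sum_{n=1}^{N} \sum_{m=1}^{M} \left[ d_{mn}^{(j)} \cos \varphi_{mn}^{(j)} \, d_{mn}^{(k)} \cos \varphi_{mn}^{(k)} + d_{mn}^{(j)} \sin \varphi_{mn}^{(j)} \, d_{mn}^{(k)} \sin \varphi_{mn}^{(k)} \right]. \tag{4}$$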

We then reapply equations 2 and 3 and note that the phase between the real and imaginary parts of the complex spectrum is “locked” and cannot be rotated when crosscorrelated with the real eigenvectors.
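The covariance of equation 4 can be sketched in a few lines of NumPy. The example below is illustrative only. It assumes that equations 2 and 3 amount to a standard symmetric eigendecomposition followed by projection of the data onto the eigenvectors, it uses random synthetic magnitude and phase volumes in place of real spectral components, and all function and variable names are hypothetical. It also checks two equivalent views of equation 4: stacking the real and imaginary parts as extra samples, and taking the real part of the complex Hermitian covariance matrix.

```python
import numpy as np

def real_and_imaginary(mag, phase):
    """Flatten (J, N, M) magnitude/phase volumes into (J, N*M) real and imaginary parts."""
    J = mag.shape[0]
    re = (mag * np.cos(phase)).reshape(J, -1)
    im = (mag * np.sin(phase)).reshape(J, -1)
    return re, im

def covariance_eq4(mag, phase):
    """J x J real symmetric covariance: real-part covariance plus imaginary-part covariance."""
    re, im = real_and_imaginary(mag, phase)
    return re @ re.T + im @ im.T

def pc_maps(mag, phase, n_bands):
    """Project real and imaginary parts onto the leading real eigenvectors (assumed form of eqs. 2-3)."""
    J, N, M = mag.shape
    re, im = real_and_imaginary(mag, phase)
    eigvals, eigvecs = np.linalg.eigh(covariance_eq4(mag, phase))  # ascending order
    order = np.argsort(eigvals)[::-1]                              # largest eigenvalue first
    V = eigvecs[:, order[:n_bands]]
    re_maps = (V.T @ re).reshape(n_bands, N, M)
    im_maps = (V.T @ im).reshape(n_bands, N, M)
    return eigvals[order], V, re_maps, im_maps

# Synthetic stand-in data (hypothetical sizes): J frequencies, N lines, M crosslines.
rng = np.random.default_rng(0)
J, N, M = 6, 10, 12
mag = rng.random((J, N, M))
phase = rng.uniform(-np.pi, np.pi, (J, N, M))

C = covariance_eq4(mag, phase)

# Equivalent view 1: treat imaginary parts as additional real samples (2NM traces).
re, im = real_and_imaginary(mag, phase)
stacked = np.concatenate([re, im], axis=1)
C_stacked = stacked @ stacked.T

# Equivalent view 2: real part of the complex Hermitian covariance matrix.
Z = (mag * np.exp(1j * phase)).reshape(J, -1)
C_hermitian = Z @ Z.conj().T

eigvals, V, re_maps, im_maps = pc_maps(mag, phase, n_bands=4)
```

Because the eigenvectors are real, the same weights are applied to the real and imaginary parts of each spectrum, which is one way to read the statement that the phase is locked.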

In Figure 10, we display maps of the complex spectra projected onto the first four principal components. The first PC band is strikingly similar to the counterpart in Figure 8. For the second PC band, the channels stand out in excellent contrast in Figure 10 compared with Figure 8.

Figure 9. Composite RGB image of the first three principal component bands (red = PC band 1, green = PC band 2, and blue = PC band 3). Each principal component is scaled to display 95% of the data to each color channel. Note very narrow channels indicated by green arrows.

Figure 10. The complex spectra projected onto the (a) first, (b) second, (c) third, and (d) fourth principal component computed from the complex spectra using equations 4, 2, and 3.

PC ANALYSIS AS A FILTER

Because principal component analysis assigns the most coherent spectra to the first eigenvalues and the incoherent or random components of the spectra to the later eigenvalues, we can use PC analysis as a filter. For these data, the first five PC bands account for more than 85% of the variance of the original data. The eigenvalues corresponding to PC bands greater than five drop to a negligible level, so that spectra crosscorrelated against eigenvectors numbered greater than five appear to be quite random, representing very small variance in the data spectra. Plotting the first three principal components, we generate the RGB color stack image in Figure 9, thereby implicitly suppressing noise and increasing the signal-to-noise ratio. A more dramatic example can be found in Guo et al. (2006), in which the horizon was contaminated by bad picks.
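The share of variance carried by the leading PC bands follows directly from the eigenvalues. The sketch below is a minimal illustration, assuming that each eigenvalue is proportional to the variance of its PC band. The eigenvalue list is a made-up example and not the eigenvalue spectrum of the survey, and the function names are hypothetical.

```python
import numpy as np

def cumulative_variance_fraction(eigenvalues):
    """Cumulative fraction of total variance, with eigenvalues sorted largest first."""
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    return np.cumsum(lam) / lam.sum()

def bands_needed(eigenvalues, target=0.85):
    """Smallest number of leading PC bands whose variance fraction reaches the target."""
    frac = cumulative_variance_fraction(eigenvalues)
    count = int(np.searchsorted(frac, target)) + 1
    return min(count, frac.size)  # guard against round-off when target is 1.0

# Hypothetical eigenvalues, chosen only to exercise the functions.
example_eigenvalues = [50.0, 20.0, 10.0, 5.0, 3.0, 1.0, 1.0]
fractions = cumulative_variance_fraction(example_eigenvalues)
n_bands = bands_needed(example_eigenvalues, target=0.85)
print(np.round(fractions, 3), n_bands)
```

Applied to the eigenvalues of a real covariance matrix, the same two functions would indicate how many bands are needed before the remaining eigenvalues can be treated as noise.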

We can use the first five PC bands to reconstruct most of the original data. In Figure 11a, we redisplay the 90-Hz spectral magnitude component shown originally in Figure 2e. In Figure 11b, we reconstruct the 90-Hz spectral component from the first five PC bands. We notice that the reconstructed data are very similar to the original data, which demonstrates the effectiveness of data reconstruction. In Figure 11c, we use PC bands 1, 3, 4, and 5 to reconstruct the spectral magnitude. Note how the narrow channels are delineated more easily after rejection of PC band 2 in Figure 8b.

Figure 11. (a) The original 90-Hz component, (b) the same component obtained after reconstruction using only the first five PC bands, and (c) the same component obtained after reconstruction using only the PC bands numbered 1, 3, 4, and 5. Note how the channels are more clearly defined in (c).

Principal component filtering provides interpreters with flexibility to remove any unwanted trends from the original data by interactively rejecting those principal components that are not correlated to features of interpretation interest during data reconstruction. Such exploratory data analysis is a well-accepted workflow in attribute analysis. In some cases, the acquisition footprint might appear strongly in a given principal component spectrum and thus can be suppressed in subsequent reconstruction. In other cases, a strong thin-bed tuning imprint corresponding to the background matrix might show up in a given component and can be rejected during reconstruction, thereby enhancing more subtle lateral changes in spectral components. We interpret PC band 2 in Figure 8b to be such a tuning effect.
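Reconstruction from a chosen subset of PC bands can be sketched as follows. This is an illustrative example rather than the implementation used for Figure 11. It assumes a covariance matrix formed without mean removal (consistent with the form of equation 4), real-valued spectral magnitudes, and random synthetic data, and the function names are hypothetical. PC bands are numbered from 1, as in the text.

```python
import numpy as np

def pc_bands(spectra):
    """Eigenvalues and eigenvectors of the J x J covariance, largest eigenvalue first."""
    J = spectra.shape[0]
    D = spectra.reshape(J, -1)                 # one row per frequency component
    eigvals, eigvecs = np.linalg.eigh(D @ D.T)
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]

def reconstruct(spectra, eigvecs, keep):
    """Rebuild all spectral components from the PC bands listed in `keep` (1-based)."""
    J = spectra.shape[0]
    D = spectra.reshape(J, -1)
    Vk = eigvecs[:, [b - 1 for b in keep]]     # selected eigenvectors
    maps = Vk.T @ D                            # PC maps for the selected bands
    return (Vk @ maps).reshape(spectra.shape)

# Synthetic stand-in volume (hypothetical sizes): 8 frequencies on a 20 x 30 grid.
rng = np.random.default_rng(1)
spectra = rng.random((8, 20, 30))
eigvals, V = pc_bands(spectra)

first_five = reconstruct(spectra, V, keep=[1, 2, 3, 4, 5])
without_band_2 = reconstruct(spectra, V, keep=[1, 3, 4, 5])
band_2_only = reconstruct(spectra, V, keep=[2])
all_bands = reconstruct(spectra, V, keep=range(1, 9))
```

Because the eigenvectors are orthonormal, the contribution of each band is additive, so rejecting a band simply subtracts its contribution from the reconstruction.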

Unfortunately, the principal components themselves do not always have a fixed relation to reflector thickness but instead will vary from horizon to horizon and survey to survey. Thus, we object to predicting reservoir thickness using principal components through geostatistics or neural nets, although we suspect this might work in some cases. Still, prediction of thickness from physical principles requires use of the original spectral components or of original components reconstructed from the major principal components. On the other hand, as with principal components applied to seismic trace-shape analysis (Coléou et al., 2003), we suspect that principal components will be an excellent tool for anomaly mapping using self-organized maps because the first few principal components preserve most of the original data variance.

LIMITATIONS

Display of major principal components might overlook subtle features with little reflectivity. By construction, principal components are ordered, with the first principal component representing the greatest variance in the data. For this reason, high-amplitude background reflectors (the rock matrix through which the channel was cut in our example) will show up strongly in the first few components. However, not all features of exploration interest have strong reflectivity. A significant challenge in spectral decomposition is the illumination of “invisible channels” (Suarez et al., 2008), channels whose reflectivity is nearly indistinguishable from that of the surrounding matrix. Wallet and Marfurt (2008) discuss an alternative, more exhaustive search, based on the “grand tour” method. Because the grand tour is a projection method, they find that the first few principal components (computed using the method described here) form optimal, compact basis functions that can be projected either as a movie or interactively.

As mathematical combinations of different spectral components, principal component spectra have little direct relation to the porosity thickness provided by the original spectral components themselves. Instead, the images need to be interpreted in a more qualitative manner, using principles of seismic geomorphology within a given depositional, erosional, or diagenetic framework. Calibration of these geomorphological features needs to be done the hard way: through comparison to the original seismic data, and to wells and production data. Direct correlation of spectra to reservoir properties also needs to be calibrated. Fortunately, such workflows are well established in seismic waveform analysis (e.g., Coléou et al., 2003).

CONCLUSIONS

We have shown that principal component analysis can reduce the redundant spectral components into significantly fewer, more manageable bands that capture most of the statistical variance of the original spectral response. By mapping the three largest principal components using an RGB color stack, we can represent most of the spectral variance with a single image, which in our example provided an excellent delineation of channels.
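The RGB color stack can be sketched as below. This is a hedged illustration, not the display code used for Figure 9. It interprets "scaled to display 95% of the data" as a symmetric percentile clip (2.5th to 97.5th percentile) applied to each band independently, which is an assumption, and the function names and synthetic maps are hypothetical.

```python
import numpy as np

def scale_to_unit(band, coverage=95.0):
    """Clip a PC map to its central `coverage` percent and rescale to [0, 1]."""
    tail = (100.0 - coverage) / 2.0
    lo, hi = np.percentile(band, [tail, 100.0 - tail])
    if hi == lo:                       # constant map: avoid division by zero
        return np.zeros_like(band, dtype=float)
    return np.clip((band - lo) / (hi - lo), 0.0, 1.0)

def rgb_composite(pc1, pc2, pc3, coverage=95.0):
    """Stack three PC maps as red, green, and blue channels of one (N, M, 3) image."""
    return np.dstack([scale_to_unit(b, coverage) for b in (pc1, pc2, pc3)])

# Synthetic stand-in PC maps on a hypothetical 40 x 50 grid.
rng = np.random.default_rng(2)
pc1, pc2, pc3 = rng.normal(size=(3, 40, 50))
image = rgb_composite(pc1, pc2, pc3)
```

The resulting array has values in [0, 1] and the (rows, columns, 3) layout accepted by common image display routines.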

Whereas the first PC band having the largest eigenvalue represents most of the signal, the later PC bands having smaller eigenvalues represent random noise. This noise might be associated with input data contaminated by ground roll, acquisition footprint, and/or bad picks. Reconstructing the spectral components from a subset of PC bands provides a filtering tool that allows us to reject noise or enhance spectral behavior that best delineates the features of interest. Mathematically, the first three PC bands represent most of the variance of the original data. However, it is important to note that there is no definite physical significance to the exact shape of the PC bands. Thus, although such images are excellent for highlighting lateral changes in data that can fit into a seismic geomorphology model, they are not easily usable for quantitative inversion, such as predicting porosity thickness. However, they could be useful input attributes to a supervised multiattribute prediction.

ACKNOWLEDGMENTS

We thank Burlington Resources for permission to use their data in this research. We also thank the sponsors of the Allied Geophysical Laboratories (AGL) industrial consortium on mapping of subtle structure and stratigraphic features using modern geometric attributes. We thank associate editor Dengliang Gao and three anonymous reviewers for their help in generating a significantly improved paper.


REFERENCES

Castagna, J., S. Sun, and R. Siegfried, 2003, Instantaneous spectral analysis: Detection of low-frequency shadows associated with hydrocarbons: The Leading Edge, 22, 120–127.

Coléou, T., M. Poupon, and K. Azbel, 2003, Interpreter’s corner - Unsupervised seismic facies classification: A review and comparison of techniques and implementation: The Leading Edge, 22, 942–953.

Fahmy, W. A., G. Matteucci, D. Butters, J. Zhang, and J. Castagna, 2005, Successful application of spectral decomposition technology toward drilling of a key offshore development well: 75th Annual International Meeting, SEG, Expanded Abstracts, 262–264.

Guo, H., K. J. Marfurt, J. Liu, and Q. Dou, 2006, Principal components analysis of spectral components: 76th Annual International Meeting, SEG, Expanded Abstracts, 988–992.

Liu, J., and K. J. Marfurt, 2007a, Instantaneous spectral attributes to detect channels: Geophysics, 72, no. 1, 23–31.

Liu, J., and K. J. Marfurt, 2007b, Multicolor display of spectral attributes: The Leading Edge, 26, 268–271.

Marfurt, K. J., 2006, Robust estimates of reflector dip and azimuth: Geophysics, 71, no. 4, 29–40.

Matos, M. C., P. Osorio, E. C. Mundim, and M. Moraces, 2005, Characterization of thin beds through joint time-frequency analysis applied to a turbidite reservoir in Campos Basin, Brazil: 75th Annual International Meeting, SEG, Expanded Abstracts, 1429–1432.

Partyka, G. A., J. Gridley, and J. Lopez, 1999, Interpretational applications of spectral decomposition in reservoir characterization: The Leading Edge, 18, 353–360.

Rodarmel, C., and J. Shan, 2002, Principal component analysis for hyperspectral image classification: Surveying and Land Information Science, 62, 115–122.

Stark, T. J., 2005, Anomaly detection and visualization using color-stack, cross-plot, and anomalousness volumes: 75th Annual International Meeting, SEG, Expanded Abstracts, 763–766.

Suarez, Y., K. J. Marfurt, and M. Falk, 2008, Seismic attribute-assisted interpretation of channel geometries and infill lithology: A case study of Anadarko Basin Red Fork channels: 78th Annual International Meeting, SEG, Expanded Abstracts, 963–967.

Theophanis, S., and J. Queen, 2000, Color display of the localized spectrum: Geophysics, 65, 1330–1340.

Wallet, B. C., and K. J. Marfurt, 2008, A grand tour of multispectral components: The Leading Edge, 27, 334–341.
