
FAST SEISMIC IMAGING FOR MARINE DATA

Aleksandr Aravkin, Xiang Li, and Felix J. Herrmann

Seismic Laboratory for Imaging and Modeling, the University of British Columbia, Technical Report TR-2011-10

ABSTRACT

Seismic imaging can be formulated as a linear inverse problem where a medium perturbation is obtained via minimization of a least-squares misfit functional. The demand for higher-resolution images in more geophysically complex areas drives the need to develop techniques that handle problems of tremendous size with limited computational resources. While seismic imaging is amenable to dimensionality-reduction techniques that collapse the data volume into a smaller set of "super-shots", these techniques break down for complex acquisition geometries such as marine acquisition, where sources and receivers move during acquisition. To meet these challenges, we propose a novel method that combines sparsity-promoting (SP) solvers with random subset selection of sequential shots, yielding an SP algorithm that only ever sees a small portion of the full data, enabling its application to very large-scale problems. Application of this technique yields excellent results for a complicated synthetic, which underscores the robustness of sparsity promotion and its suitability for seismic imaging.

Index Terms: Seismic imaging, dimensionality reduction, compressive sensing, stochastic optimization

1. INTRODUCTION

Modern-day seismic wave-equation based imaging depends increasingly on computationally and data-intensive wave simulators that require solutions of partial-differential equations (PDEs) over increasingly large domains and frequency ranges. The computational costs involved with the solutions of these PDEs become prohibitively large because the linearized inversions are based on a least-squares fitting procedure carried out for a very large number of sources. This is challenging because each source corresponds to a different right-hand side of the PDE, and hence the computational cost grows exponentially with the size of the domain (the sources move along the surface) and resolution. Motivated by early work of [1, 2], we overcome this challenge of the "curse of dimensionality" by decreasing the number of source experiments. As a result, we lower the computational burden of the inversion significantly by using linearity of the PDEs with respect to the sources. This separable structure allows us to use randomized superposition, where we combine sources and their corresponding data into a smaller number of simultaneous source experiments. During these experiments, sequential sources are replaced by incoherent "beams", where each source position fires with a strength drawn from a Gaussian random distribution. As long as the number of these incoherent source experiments is smaller than the number of sequential experiments, this random superposition leads to a reduction of computations by virtue of the reduced data volume and number of sources.

However, this dimensionality reduction hinges on two conditions: cancellation of source cross talk as the number of simultaneous sources increases, and full acquisition, where each source experiment sees the same receivers. The first condition is met as long as the batch size, i.e., the number of simultaneous sources, is sufficiently large. However, increasing this batch size only leads to a slow decay of the cross talk because this method corresponds to the stochastic-average approximation, which is essentially a Monte-Carlo sampling technique. Unfortunately, the alternative stochastic approximation, where randomized batches are drawn for each gradient update, does not remedy this situation because it is unstable with respect to noise.

To overcome these issues of slow convergence and noise instability, [3] introduced a dimensionality-reduction approach where the source crosstalk is removed by transform-domain sparsity promotion. In this approach, we turn the originally "overdetermined" problem (note that the imaging operator has a null space because of finite-aperture and other effects) into a much smaller underdetermined system of equations. To improve the convergence of the sparsity-promoting solver, [3, 4] use a stochastic-approximation type of approach that removes correlations between the solution vector and source encoding by randomized superposition.

Unfortunately, this approach relies on full acquisition (the second condition), which is generally not met during marine acquisition because receivers are towed behind the sources and therefore move. In this paper, we address this issue by working with random subsets of sequential sources instead of randomized simultaneous sources.

Supported by the Natural Sciences and Engineering Research Council of Canada Collaborative Research and Development Grant DNOISE II (375142-08) with matching support from the following organizations: BG Group, BGP, BP, Chevron, ConocoPhillips, Petrobras, PGS, Total SA, and WesternGeco.


2. THE SEISMIC IMAGING PROBLEM

After discretization, seismic imaging requires inversion of the linearized time-harmonic Born-scattering matrix linking data b to the medium perturbation x. Note that b ∈ C^{N_f N_r N_s}, with N_f, N_r, and N_s the number of angular frequencies, receivers, and sources, while x ∈ R^M, with M the number of grid points.

Because angular frequencies and sequential sources can be treated independently, the linearized inversion has the following separable form:

\[
\min_{x} \; \|b - Ax\|_2^2 = \sum_{i=1}^{K} \|b_i - A_i x\|_2^2, \tag{1}
\]

with K = N_f N_s the batch size, given by the total number of monochromatic sources. The vectors b_i ∈ C^{N_r} represent the corresponding vectorized monochromatic preprocessed (free of surface multiples and direct waves) shot records, and b is a stack of the vectors b_i. The matrix A_i represents the monochromatic linearized scattering matrix for the i-th source, and the matrix A is a stack of the matrices A_i.
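To make the separable structure of (1) concrete, the following toy sketch (hypothetical sizes; dense random matrices standing in for the PDE-based operators A_i, whose action would normally require PDE solves) accumulates the misfit and its gradient shot by shot:

```python
import numpy as np

rng = np.random.default_rng(0)
M, Nr, K = 50, 20, 8          # model size, receivers, monochromatic sources

# Dense stand-ins for the linearized scattering matrices A_i; in practice
# each action of A_i (and of its adjoint) requires PDE solves and the
# matrix is never formed explicitly.
A = [rng.standard_normal((Nr, M)) for _ in range(K)]
x_true = rng.standard_normal(M)
b = [Ai @ x_true for Ai in A]  # noise-free shot records b_i

def misfit_and_gradient(x):
    """f(x) = sum_i ||b_i - A_i x||_2^2 and its gradient, one source at a time."""
    f, g = 0.0, np.zeros(M)
    for Ai, bi in zip(A, b):
        r = bi - Ai @ x        # residual for the i-th source experiment
        f += r @ r
        g += -2.0 * Ai.T @ r   # adjoint action applied to the residual
    return f, g

f0, g0 = misfit_and_gradient(np.zeros(M))
f_true, g_true = misfit_and_gradient(x_true)
```

Because each term in the sum touches one right-hand side, both cost and memory scale linearly in K, which is exactly the dependence the dimensionality reduction below attacks.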

Solving this problem is problematic because each iteration requires 4K PDE solves: two to compute the action of A_i and two for the action of its adjoint A_i^H. Both actions involve solutions of the forward source and reverse-time residual wavefields [5]. Thus, the inversion costs grow linearly with the number of monochromatic sources.

To solve Equation (1) efficiently, we combine recent ideas from stochastic optimization and compressive sensing. We cast the original imaging problem into a series of much smaller subproblems that work on different subsets of data.

2.1. Solution by batching

We based our original algorithm on forming compressive seismic experiments (or, in the language of online machine learning, mini-batches) that consist of collections of small numbers of supershots. These supershots are made of randomized superpositions of sequential sources.

Mathematically, imaging experiments for mini-batches with K′ ≪ K monochromatic supershots require the solution of the reduced system

\[
\min_{x} \; \|\underline{b} - \underline{A} x\|_2^2 = \sum_{j=1}^{K'} \|\underline{b}_j - \underline{A}_j x\|_2^2, \tag{2}
\]

which has the same structure as (1), but with much smaller dimension K′ ≪ K. Each \(\underline{b}_j\) is a weighted linear combination of the b_i, and each matrix \(\underline{A}_j\) is the same linear combination of the matrices A_i:

\[
\underline{b}_j = \sum_{i=1}^{K} w_{ij} b_i, \qquad \underline{A}_j = \sum_{i=1}^{K} w_{ij} A_i.
\]
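The randomized superposition behind (2) amounts to a few lines of linear algebra. The sketch below uses toy sizes and dense random matrices in place of the PDE-based operators A_i:

```python
import numpy as np

rng = np.random.default_rng(1)
M, Nr, K, K_prime = 50, 20, 32, 4   # K' << K supershots

A = [rng.standard_normal((Nr, M)) for _ in range(K)]
x_true = rng.standard_normal(M)
b = [Ai @ x_true for Ai in A]       # sequential-shot data

# Gaussian mixing weights w_ij: each supershot j is an incoherent "beam"
# in which every sequential source fires with a random strength.
W = rng.standard_normal((K, K_prime))

# The supershots and their data inherit the same linear combination, by
# linearity of the PDE with respect to the sources.
A_mix = [sum(W[i, j] * A[i] for i in range(K)) for j in range(K_prime)]
b_mix = [sum(W[i, j] * b[i] for i in range(K)) for j in range(K_prime)]

# The true perturbation remains consistent with the reduced system (2):
residuals = [float(np.linalg.norm(bj - Aj @ x_true))
             for Aj, bj in zip(A_mix, b_mix)]
```

By linearity, any x that fits all sequential experiments also fits every supershot, which is what the residual check confirms; the reduction comes from solving with K′ right-hand sides instead of K.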

To specify how sub-selection and mixing are achieved in the source-frequency space, we write the objective in (2) as

\[
\tfrac{1}{2}\,\|RM(b - Ax)\|_2^2.
\]

The restriction matrix R is defined by the Kronecker product

\[
R \stackrel{\mathrm{def}}{=} R_\Sigma \otimes I \otimes R_\Omega \in \mathbb{R}^{(K' N_r) \times (K N_r)},
\]

with R_Σ ∈ R^{n′_s × n_s} selecting n′_s ≪ n_s rows uniformly at random amongst [1 ⋯ n_s], R_Ω ∈ R^{n′_f × n_f} selecting n′_f ≪ n_f frequencies from the seismic frequency band, and I the identity matrix. The measurement matrix M ∈ R^{(K N_r) × (K N_r)} is given by the Kronecker product

\[
M \stackrel{\mathrm{def}}{=} M_\Sigma \otimes I \otimes I.
\]

For the definition of the measurement matrix M_Σ, we either use phase encoding, as in [6], which turns sequential sources into simultaneous sources, or we select random subsets of source locations by setting M_Σ = I. The latter allows us to work with marine data. It is easy to show that the number of PDE solves required for each iteration of the solution of (2) is reduced by a factor of K′/K.
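For small, hypothetical dimensions, the restriction and measurement matrices can be formed explicitly with Kronecker products; in a realistic code they would be applied implicitly and never built:

```python
import numpy as np

rng = np.random.default_rng(2)
ns, nf, Nr = 12, 6, 4          # sequential sources, frequencies, receivers
ns_p, nf_p = 3, 2              # n'_s << n_s shots, n'_f << n_f frequencies

def restriction(n, n_sub):
    """Rows of the identity chosen uniformly at random without replacement."""
    rows = rng.choice(n, size=n_sub, replace=False)
    return np.eye(n)[rows]

R_sigma = restriction(ns, ns_p)    # subsample source experiments
R_omega = restriction(nf, nf_p)    # subsample frequencies
I_r = np.eye(Nr)

# R = R_Sigma (x) I (x) R_Omega restricts along the source and frequency axes
# while keeping every receiver of each selected experiment.
R = np.kron(R_sigma, np.kron(I_r, R_omega))

# M = M_Sigma (x) I (x) I; M_Sigma = I selects random subsets of sequential
# shots (the marine-friendly choice), while a Gaussian M_Sigma would encode
# simultaneous sources.
M_sigma = np.eye(ns)
M_mat = np.kron(M_sigma, np.kron(I_r, np.eye(nf)))

# Applying R M to a vectorized data volume keeps only the selected entries.
reduced = R @ M_mat @ rng.standard_normal(ns * Nr * nf)
```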

2.2. Solution by sparsity promotion

Because averaging, whether via the stochastic-average approximation or via averaging amongst model iterates as part of stochastic-gradient descent, is not able to remove the source crosstalk efficiently, we rely on transform-domain sparsity promotion instead. In the noise-free case, sparsity-promoting imaging involves the solution of the following optimization problem:

\[
\min_{x} \; \|x\|_1 \quad \text{subject to} \quad b = Ax. \tag{3}
\]

Efficient ℓ1 solvers for this problem are typically based on solutions of a series of relaxed subproblems, where components are allowed to enter into the solution in a controlled manner. It is widely known that these approaches lead to a reduction in the number of iterations needed to reach the solution. The spectral projected-gradient algorithm [7, SPGℓ1] uses this principle by solving a series of LASSO problems,

\[
\min_{x \in \tau B_1} \; \|b - Ax\|_2^2, \tag{4}
\]

where τB₁ = {x : ‖x‖₁ ≤ τ}. The sequence of τ's is generated using Newton root-finding on the value function v(τ) of (4), so that the final value of τ ensures v(τ) = 0; a solution of (4) with this τ then also solves (3).

Each LASSO subproblem is solved using the spectral projected-gradient (SPG) method, so the overall algorithm uses a limited number of matrix-vector multiplies. Because the cost of the solver is determined by this number of multiplies, this approach is particularly suitable for large-scale geophysical problems [8].
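The workhorse inside each LASSO subproblem is a gradient step followed by Euclidean projection onto the ℓ1 ball of radius τ. A common sort-based projection (a textbook construction, not the SPGℓ1 code itself) and a bare-bones projected-gradient LASSO solve on a hypothetical sparse-recovery instance might look like:

```python
import numpy as np

def project_l1_ball(x, tau):
    """Euclidean projection of x onto {z : ||z||_1 <= tau} via sorting."""
    if np.abs(x).sum() <= tau:
        return x.copy()
    u = np.sort(np.abs(x))[::-1]                 # magnitudes, descending
    css = np.cumsum(u)
    k = np.nonzero(u * np.arange(1, len(x) + 1) > css - tau)[0][-1]
    theta = (css[k] - tau) / (k + 1.0)           # soft-threshold level
    return np.sign(x) * np.maximum(np.abs(x) - theta, 0.0)

def lasso_projected_gradient(A, b, tau, n_iter=500):
    """Solve min_{||x||_1 <= tau} ||b - A x||_2^2 by projected gradient."""
    step = 1.0 / (2.0 * np.linalg.norm(A, 2) ** 2)  # 1/L for the smooth part
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        g = -2.0 * A.T @ (b - A @ x)
        x = project_l1_ball(x - step * g, tau)
    return x

rng = np.random.default_rng(3)
A = rng.standard_normal((40, 80))                # underdetermined system
x_true = np.zeros(80); x_true[[3, 17, 42]] = [2.0, -1.5, 1.0]
b = A @ x_true
x_hat = lasso_projected_gradient(A, b, tau=np.abs(x_true).sum())
```

The projection costs one sort per iteration, so the per-iteration cost is dominated by the two matrix-vector products, which is the property the text emphasizes.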

Unfortunately, the degree of randomized dimensionality reduction determines the amount of cross-talk that results from the inversion, and hence we cannot reduce the problem size too much. We overcome this by subsampling each LASSO problem (4), in effect solving a series of problems

\[
\min_{x \in \tau B_1} \; \|\underline{b} - \underline{A} x\|_2^2,
\]

where the subsampling can take the form of either simultaneous "super-shots" or of random subsets of source locations. As in the SPGℓ1 algorithm, each new (subsampled) LASSO problem is warm-started with the solution x to the previous LASSO problem on the way to solving (3). The algorithm can be seen as 'curve'-hopping within a random family of Pareto curves corresponding to the subsampled realizations of b and A.

This approach dramatically decreases the time per iteration, because the computational effort for every iteration of the SPG algorithm is reduced by a factor of K′/K. While convergence to the true solution of (3) is still an open question, the approach works well in practice and allows sparsity-promotion techniques to be used on a previously inaccessible scale.
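Putting the pieces together, the redraw strategy can be sketched end to end. This is a simplified stand-in, not the authors' implementation: hypothetical sizes, dense random matrices instead of PDE-based operators, a fixed increasing τ schedule instead of Newton root-finding on the Pareto curve, and the same sort-based ℓ1 projection as above:

```python
import numpy as np

rng = np.random.default_rng(4)
M, Nr, K, K_prime = 60, 10, 40, 8

A = [rng.standard_normal((Nr, M)) for _ in range(K)]
x_true = np.zeros(M); x_true[[5, 23, 47]] = [1.0, -2.0, 1.5]
b = [Ai @ x_true for Ai in A]

def project_l1_ball(x, tau):
    if np.abs(x).sum() <= tau:
        return x.copy()
    u = np.sort(np.abs(x))[::-1]
    css = np.cumsum(u)
    k = np.nonzero(u * np.arange(1, len(x) + 1) > css - tau)[0][-1]
    theta = (css[k] - tau) / (k + 1.0)
    return np.sign(x) * np.maximum(np.abs(x) - theta, 0.0)

x = np.zeros(M)
for tau in np.linspace(0.5, np.abs(x_true).sum(), 10):
    # Redraw RM: a fresh random subset of K' sequential shots per subproblem,
    # which decorrelates the subsampling from the current solution estimate.
    idx = rng.choice(K, size=K_prime, replace=False)
    A_sub = np.vstack([A[i] for i in idx])
    b_sub = np.concatenate([b[i] for i in idx])
    step = 1.0 / (2.0 * np.linalg.norm(A_sub, 2) ** 2)
    for _ in range(100):                 # LASSO solve, warm-started with x
        g = -2.0 * A_sub.T @ (b_sub - A_sub @ x)
        x = project_l1_ball(x - step * g, tau)

# Misfit over the FULL data set, before and after:
f_start = sum(float(bi @ bi) for bi in b)
f_final = sum(float((bi - Ai @ x) @ (bi - Ai @ x)) for Ai, bi in zip(A, b))
```

Each subproblem only ever touches K′ of the K shot records, yet the full-data misfit drops across subproblems because every redraw is consistent with the same underlying perturbation.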

A natural question from a practical standpoint is whether it is better to do random source encoding (random Gaussian mixing) or to select random subsets of shots and frequencies at each LASSO subproblem. This is investigated in the context of a large-scale seismic example in the next section.

3. CASE STUDY: THE BG COMPASS MODEL

To test our imaging algorithm for the two dimensionality-reduction scenarios in a realistic setting, we consider a synthetic velocity model with a large degree of complexity. To build the background-velocity model, we employ our modified Gauss-Newton method with sparse updates in 10 overlapping frequency bands on the interval 2.9–22.5 Hz and with the initial model plotted in Figure 1. (Note that this approach, reported in [6], is based on a similar dimensionality-reduction technique to the one presented in this paper.) The output of this procedure is plotted in Figure 1 and is used as the background-velocity model for our imaging algorithm.

We parametrize the velocity perturbation on a 409 × 1401 grid with a grid spacing of 5 m. We use a Helmholtz solver to generate data from 350 source positions sampled at an interval of 20 m and 701 receivers sampled at an interval of 10 m. We use 10 random frequencies selected from 20–50 Hz and scaled by the spectrum of a 30 Hz Ricker wavelet. We solve 10 LASSO subproblems with and without independent redraws of RM. The results are summarized in Figure 2 (b–c) and clearly show significant improvements from the redraws. Not only is the crosstalk removed more efficiently, but the reflectors are also better imaged, in particular at the deeper parts of the model, where recovery without redraws is not able to image the events. It is also interesting to see that replacing the Gaussian measurement matrix by the identity matrix, i.e., replacing simultaneous shots by randomly selected sequential shots, does not seriously affect the imaging result.

4. DISCUSSION AND CONCLUSIONS

We introduced an efficient algorithm to solve the linearized imaging problem. Our method combines recent findings from the fields of stochastic optimization and compressive sensing and turns the originally "overdetermined" seismic imaging problem into a series of underdetermined dimensionality-reduced subproblems. By considering these subproblems as sparse-recovery problems, we were able to create high-fidelity images at a fraction of the computational cost for subsampling scenarios based on either randomized superposition or on random selections of subsets of sequential sources. The latter relaxes the condition of having stationary receivers, which makes the method conducive to marine acquisition.

Fig. 1. Full-waveform inversion result. (a) Initial model. (b) Inverted result starting from 2.9 Hz with 7 simultaneous shots and 10 frequencies in each of the 10 frequency bands.

5. REFERENCES

[1] S. A. Morton and C. C. Ober, "Faster shot-record depth migrations using phase encoding," in SEG Technical Program Expanded Abstracts, vol. 17, pp. 1131–1134, SEG, 1998.

[2] L. A. Romero, D. C. Ghiglia, C. C. Ober, and S. A. Morton, "Phase encoding of shot records in prestack migration," Geophysics, vol. 65, no. 2, pp. 426–436, 2000.

[3] Felix J. Herrmann and Xiang Li, "Efficient least-squares migration with sparsity promotion," in EAGE Technical Program Expanded Abstracts, EAGE, 2011.

[4] Felix J. Herrmann and Xiang Li, "Efficient least-squares imaging with sparsity promotion and compressive sensing," Tech. rep., University of British Columbia, Vancouver, August 2011.

[5] R. Plessix and W. Mulder, "Frequency-domain finite-difference amplitude-preserving migration," Geoph. J. Int., vol. 157, pp. 975–987, 2004.

[6] Xiang Li, Aleksandr Aravkin, Tristan van Leeuwen, and Felix J. Herrmann, "Modified Gauss-Newton with sparse updates," SBGF, 2011.

[7] E. van den Berg and M. P. Friedlander, "Probing the Pareto frontier for basis pursuit solutions," SIAM Journal on Scientific Computing, vol. 31, no. 2, pp. 890–912, January 2008.

[8] G. Hennenfent, E. van den Berg, M. P. Friedlander, and F. J. Herrmann, "New insights into one-norm solvers from the Pareto curve," Geophysics, vol. 73, no. 4, July–August 2008.


Fig. 2. Dimensionality-reduced sparsity-promoting imaging from random subsets of 17 simultaneous shots and 10 frequencies. We used the background velocity model plotted in Figure 1. (a) True perturbation given by the difference between the true velocity model and the FWI result. (b) Imaging result without redraws for the supershots. (c) The same, but with redraws. (d) Image for sequential shots with redraws. Notice the significant improvement in image quality when renewing the collection of supershots after solving each LASSO subproblem. Also notice that the result with randomized subsets of sequential shots is also very reasonable.