
Efficient Image-Based Projective Mapping using the Master Texture Space Encoding

David Guinnip, David Rice, Christopher Jaynes

Dept. of Computer Science, University of Kentucky, Lexington, KY 40506

[email protected]

Randall Stevens, ArchVision, Inc.

110 South Upper St., Lexington, KY 40507

[email protected]

ABSTRACT
We introduce an encoding technique that supports efficient view-dependent image-based rendering applications that combine photorealistic images with an underlying surface mesh. The representation increases rendering efficiency, reduces the space required to store large numbers of object views, and supports direct image-based editing for realistic object manipulation. The Master Texture Space encoding transforms an original set of exemplar images into a set of Master Textures that share a globally consistent set of texture coordinates based on underlying object geometry and independent of the camera positions used to create the exemplar views.

An important property of the Master Texture space is that an arbitrary but fixed pixel position in all the Master Textures corresponds to the same point on the object surface. This property increases rendering efficiency for real-time dynamic image-based rendering applications because new texture coordinates do not have to be loaded as a function of viewpoint. In addition, changes in a Master Texture image can be rapidly propagated to all views to add or remove features, generate new viewpoints, and remove artifacts from a pre-existing image-based scene. Results presented here demonstrate that the technique reduces per-frame rendering time by 1.6 milliseconds for reasonably complex models on a commodity graphics card. Two example scenarios demonstrate how the Master Texture encoding supports efficient update of an image-based model.

Keywords
Image-based rendering, texture mapping, compression, image processing.

1. INTRODUCTION
Image-based rendering has emerged in recent years as an approach to rendering three-dimensional scenes [Lip80] without the representation of an overly complex geometric model. As opposed to pure geometric approaches, the image-based approach uses real-world images of an object or scene as the basis for rendered viewpoints to achieve photorealism.

The image-based approach maximizes realism (by using real-world imagery) while minimizing rendering time. Image-based rendering is particularly successful for complex objects that are cumbersome to represent in a purely model-based domain (e.g. cars, trees, and people). Today, image-based content is used in still renderings, offline animations, and real-time interactive applications [Arc02].

There have been many techniques introduced for representing image-based content [Che93, Deb98, Gor96, Lav94, Lev96]. All of these techniques can be described as view dependent, in that images captured from known positions in the world are dynamically selected (or interpolated) based on the current rendering view. Each technique utilizes some geometric representation of the object to be rendered in order to incorporate the image-based model into a scene. Model complexity varies from a single plane to complex surface meshes, onto which images are statically or projectively mapped.

This continuum of full geometry versus complete image-based rendering offers users the ability to represent objects based upon application needs. Points in this continuum can yield different levels of detail depending upon the application requirements. A single polygon and 200 views may be sufficient for a model of a person that is to be inserted into an office scene and rendered from a distance. On the other hand, a vehicle rendered from nearby may exhibit significant perspective distortion that is not modeled in the available viewpoints. In this case, a simple object mesh can be combined with the available images to improve the rendered result.

This paper focuses on view-dependent projectively texture mapped (VDTM) objects with an underlying mesh of moderate complexity.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Journal of WSCG, Vol. 11, No. 1, ISSN 1213-6972. WSCG'2003, February 3-7, 2003, Plzen, Czech Republic. Copyright UNION Agency – Science Press.


We introduce an encoding technique designed to support photorealistic image-based rendering and view-dependent projective texturing of objects. The encoding process transforms an original set of object views into Master Texture Space, so that the new images share a globally consistent, single set of texture coordinates regardless of the camera positions used to create the initial images.

This global set of texture coordinates only needs to be stored once and is invariant with respect to the number of images used for the image-based model. For standard conditions the Master Texture encoding can achieve reasonable compression rates while increasing rendering efficiency.

The encoding process guarantees that Master Texture images are pixel-wise aligned to one another. This alignment property supports efficient image-based editing, straightforward view interpolation, and efficient updating of previously captured image-based scenes.

2. IMAGE-BASED APPROACHES AND RELATED WORK
At the heart of image-based rendering techniques is direct processing of image data to derive new views of an object or scene. Typically, a set of exemplar images of a scene or object is captured from many viewpoints. These images, combined with known camera positions, are the basis for rendering new views, either by directly placing the captured image data into the scene or by first interpolating between known views. Surface constraints (e.g. that the model is piecewise planar) can be used to constrain the interpolation process. As opposed to a purely image-based approach, explicit use of a low-resolution model has been shown to improve rendering quality and extend the range of viewpoints available for novel views [Nis99, Deb96].

The Master Texture encoding is applicable to image-based rendering techniques that combine a geometric model with view-dependent texture mapping (VDTM). Techniques described in [Deb96] employ VDTM to achieve photorealistic novel views from a sparse set of images by projecting interpolated pixel data directly onto low-resolution models. A distinct advantage of this approach is the ability to realistically integrate image-based models into computer-generated scenes. Even a low-fidelity geometric representation supports model interaction within its environment (e.g. illumination, shadow casting). Combination of image-based views of an object with an underlying surface representation can be efficiently implemented as a projective texture map [Hec91], and is widely supported in rendering engines, real-time displays, and graphics hardware.

Image-based modeling and rendering techniques that do not exploit a geometric representation [Mat02, Che93, Lav94] use morphing maps of adjacent images to render new images from viewpoints not contained in the original set of basis views. Interpolation requires accurate camera calibration to the object coordinate system, or known relative calibration between the exemplar views for constrained epipolar interpolation [Car01]. Although approaches that use partially reconstructed surfaces or range data to constrain the interpolation process [Deb98, Mcm97] have been introduced, interpolative methods still rely heavily on accurate point correspondences between images and are typically only robust given a dense sampling of base images.

Explicit estimation of the plenoptic function has advantages similar to view-dependent texture mapping in that novel viewpoints can be reconstructed from a set of basis images without modeling the optic flow of pixels as they move from one camera to another [Mcm95]. Light-Field rendering [Lev96] and the Lumigraph [Gor96, Bue01] reduce the dimensionality of the plenoptic function while utilizing a sparse mesh representation. A drawback these techniques share is the dependence on a large number of images for accurate estimation. Furthermore, these approaches are not widely supported in hardware and do not support real-time image-based rendering.

Regardless of the particular tradeoff between exemplar views and geometry that each of these approaches makes, rendering quality and resolution are limited by the number of basis images that can be stored and efficiently rendered. The large number of exemplar images, calibration information, and object mesh required for VDTM pose a significant challenge to efficient storage and rendering of photorealistic models. As a result, researchers have been addressing representation, manipulation, and efficient rendering of image-based data [Che02, Deb98].

In work perhaps most similar to our own, [Nis01] describes an encoding scheme that stores all the views of a particular model face in a single image texture. This representation has the advantage of being amenable to off-line compression methods, and results show that the technique is able to achieve between 5:1 and 15:1 compression rates with little or no loss in image fidelity. However, manipulation of the representation is cumbersome and, more importantly, rendering of an object encoded using the Eigentexture method [Nis01] does not support real-time applications because texture coordinates are computed at rendering time based on viewpoint.

Our approach is motivated by the observation that a single set of texture coordinates for any number of basis views is desirable for render-time compression and efficiency. The Master Texture Space preserves the image fidelity contained in the exemplar views, achieves reasonable compression, and facilitates efficient rendering and manipulation of the encoded images.


3. MASTER TEXTURE GENERATION
Current approaches to projective texture mapping first select or create an appropriate image of the scene from the exemplar images based on the view of the object to be rendered. This image is then projectively mapped to the object mesh to support lighting effects and to induce depth effects (such as foreshortening) not present in the mesh alone [Deb96]. Because it is infeasible to compute texture coordinates in real time using the available projection matrices, the mapping between object vertices and image coordinates must be precomputed and stored as a set of texture coordinates with each image. Given a mesh containing m triangular faces and n exemplar images, between m*n and 3m*n texture coordinates must be stored with the model. In total, then, the images, model, and texture coordinates are used at render time to facilitate a dynamically texture mapped, photorealistic model. Figure 1 depicts this general approach.

Although VDTM successfully combines photorealistic images with simple geometric models, the amount of data required to achieve a high-fidelity, accurate representation of the object is significant.

The general approach to computing the Master Texture space encoding is to precompute the relationship between all exemplar images and the object mesh. Mesh elements are projected into all exemplar images to produce a corresponding set of image regions (facets) that are extracted, warped, and stored in the Master Texture space.

When facets are moved from the original image space to the Master Texture space, a new set of texture coordinates is created under the constraint that all Master Textures share a single set of coordinates. Given a mesh containing m faces associated with n exemplar images, only 3m coordinates are needed to completely represent the relationship between mesh faces and any number of example images.
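To make the bookkeeping savings concrete, the short sketch below compares the two coordinate counts using the face and view counts of the car dataset described in Section 4 (m = 651, n = 28); it illustrates only the scaling, not measured storage.

# Sketch: texture-coordinate bookkeeping for traditional VDTM vs. Master Texture space.
# The counts (651 faces, 28 views) are taken from the car dataset in Section 4.

m_faces = 651      # triangular mesh faces
n_views = 28       # exemplar images

# Traditional VDTM: every view needs its own (u, v) pair per face vertex.
traditional_coords = 3 * m_faces * n_views

# Master Texture space: one globally consistent set of coordinates, reused by all views.
master_coords = 3 * m_faces

print(f"traditional: {traditional_coords} texture coordinates")        # 54684
print(f"master:      {master_coords} texture coordinates")             # 1953
print(f"reduction factor: {traditional_coords / master_coords:.0f}x")  # 28x, i.e. n_views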

The encoding process requires a significant amount of processing. It is important to note, however, that this is a one-time encoding cost that is not recomputed at rendering time. As will be shown in Section 4, the advantages of the Master Texture representation outweigh this one-time cost.

The Master Texture encoding guarantees that a pixel, (i,j)k, in Master Texture image k corresponds to the same point on the surface as pixel (i,j) in all other Master Textures. This pixelwise correspondence is an important advantage of the Master Texture encoding, both because it represents a significant compression over storing all texture assignments explicitly and because image-based operations such as interpolation can occur on the now pixel-aligned data directly in Master Texture space.

The Master Texture Encoding Algorithm
The first step in creating a set of Master Texture images from a set of n different exemplar images is to project each of the m mesh triangles into each view to produce n*m image facets. For a single mesh triangle, n facets, each comprised of 3 image coordinates (u,v)1,2,3, are generated for the n views using the known projection matrix corresponding to each view (Equation 1).

For non-degenerate cases, a mesh triangle corresponds to a triangular region in each image. Pixel data within each triangular region is rasterized to produce a facet containing image information. Next, the n different facets corresponding to a single mesh face m are transformed to be of uniform size and shape.

All facets are warped to a right triangle of width w and height h, where w and h are the maximum independent width and height of any of the n facets corresponding to a single mesh face. For a given w and h, an affine warp, given by the matrix C of Equation 2, is applied to each of the n facets.

Figure 1: Traditional view-dependent projective texture mapping approach. Exemplar images of a real-world object from n viewpoints (top left) are dynamically combined with a mesh using a per-view set of texture coordinates ((i,j)_1 ... (i,j)_3m for each of the views 1 ... n) based on the current view.

$$
\begin{bmatrix} s\,u_{mcn} \\ s\,v_{mcn} \\ s \end{bmatrix}
= P_n
\begin{bmatrix} x_{mc} \\ y_{mc} \\ z_{mc} \\ 1.0 \end{bmatrix},
\qquad c = 1,\ldots,3,\quad m = 1,\ldots,M,\quad n = 1,\ldots,N
\qquad \text{(Equation 1)}
$$
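As a minimal illustration of this facet-generation step, the sketch below applies Equation 1 with NumPy; the projection matrix and vertex values are made up for the example, and the function names are not taken from the paper.

import numpy as np

def project_vertex(P, X):
    # Equation 1: [s*u, s*v, s]^T = P [x, y, z, 1]^T, followed by division by s.
    su, sv, s = P @ np.append(X, 1.0)
    return np.array([su / s, sv / s])

def project_face(P, face_vertices):
    # One triangular mesh face (3 vertices) projects to one image facet:
    # three (u, v) coordinates in that exemplar view.
    return np.array([project_vertex(P, X) for X in face_vertices])

# Illustrative 3x4 projection matrix (focal length 800, principal point (320, 240)).
P = np.array([[800.0,   0.0, 320.0, 0.0],
              [  0.0, 800.0, 240.0, 0.0],
              [  0.0,   0.0,   1.0, 0.0]])
triangle = np.array([[0.1, 0.0, 2.0],
                     [0.3, 0.1, 2.2],
                     [0.0, 0.4, 2.1]])
print(project_face(P, triangle))   # the facet's three image coordinates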


In Equation 2, A is the matrix of initial image coordinates of the mesh triangle and B is a vector containing the target coordinates of the warped facet. Note that B is constructed so that the resulting facets are all aligned with the image axes. Figure 2 shows three facets from a single face m, as seen from different views of an object, before warping (inset, top row) and after they have been warped to Master Texture space.

Once all facets corresponding to a particular mesh face have been warped, they are of uniform size and shape regardless of their initial projectively warped shape. This step ensures that corresponding facets in different Master Textures can be pixelwise aligned and that the facets are of maximum fidelity.
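The following sketch shows one way to solve Equation 2 for the warp C and to verify it on a facet; the coordinate values are made up, the homogeneous third coordinate (z in Equation 2) is taken to be 1, and the rasterization of the facet pixels is omitted.

import numpy as np

def facet_warp(src_uv, w, h):
    # Solve Equation 2 for the affine warp C mapping the facet's three image
    # coordinates onto the axis-aligned right triangle (0,0), (w,0), (0,h).
    (u0, v0), (u1, v1), (u2, v2) = src_uv
    A = np.array([[u0, v0, 1, 0,  0,  0],
                  [0,  0,  0, u0, v0, 1],
                  [u1, v1, 1, 0,  0,  0],
                  [0,  0,  0, u1, v1, 1],
                  [u2, v2, 1, 0,  0,  0],
                  [0,  0,  0, u2, v2, 1]], dtype=float)
    B = np.array([0, 0, w, 0, 0, h], dtype=float)   # target corners, axis-aligned
    C = np.linalg.solve(A, B)                        # C = A^-1 B, six affine coefficients
    return C.reshape(2, 3)                           # so that [u', v']^T = C [u, v, 1]^T

# Illustrative use: a foreshortened facet warped to a 64x48 right triangle.
src = np.array([[412.0, 230.0], [455.0, 241.0], [418.0, 266.0]])
C = facet_warp(src, w=64, h=48)
for uv in src:
    print(C @ np.append(uv, 1.0))   # maps to approximately (0,0), (64,0), (0,48)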

Ultimately, each exemplar view will be discarded and replaced by a corresponding Master Texture. Each Master Texture image is composed of m different facets corresponding to the m different mesh faces as seen from that view. The m different facets are placed into the Master Texture according to a packing algorithm that minimizes unused pixels in the Master Texture image.

An exemplar view is selected and the m facets produced for that view are packed into a new image. The algorithm proceeds from the largest to the smallest facet, placing each into the Master Texture as efficiently as possible. Prior to packing, each facet is paired with a facet of similar area that has been reflected across the x and y axes, and a bounding box is fit to the result. Figure 3 shows two facets before and after they have been oriented and fit to a bounding box. The resulting bounding region is packed into the new Master Texture image.

Bounding boxes are packed into the image at the leftmost and topmost position at which the bounding box will fit. This process is repeated until all m facets corresponding to a single image have been placed into the Master Texture for that view. Figure 4 shows a Master Texture derived from a single view of the car example.
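The sketch below gives one simple, greedy reading of this placement rule (scan for the first free top-left position, largest box first); it is meant only to illustrate the idea of a reusable packing map, not to reproduce the paper's exact packer.

import numpy as np

def pack_boxes(box_sizes, tex_w, tex_h):
    # Greedy sketch of the bounding-box packing step: boxes (already sorted from
    # largest to smallest area) are placed at the first free top-left position found.
    # Returns a packing map {box index: (x, y)}; the same map is reused for every view.
    occupied = np.zeros((tex_h, tex_w), dtype=bool)
    packing_map = {}
    for idx, (bw, bh) in enumerate(box_sizes):
        placed = False
        for y in range(tex_h - bh + 1):          # topmost first ...
            for x in range(tex_w - bw + 1):      # ... then leftmost
                if not occupied[y:y + bh, x:x + bw].any():
                    occupied[y:y + bh, x:x + bw] = True
                    packing_map[idx] = (x, y)
                    placed = True
                    break
            if placed:
                break
        if not placed:
            raise ValueError(f"box {idx} of size {bw}x{bh} does not fit")
    return packing_map

# Example: pack a few facet bounding boxes (width, height) into a small canvas.
boxes = [(60, 40), (50, 30), (30, 30), (20, 10)]
print(pack_boxes(boxes, tex_w=128, tex_h=96))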

When the process terminates, a packing map that determines how all facets are placed into the Master Texture is stored. Because each facet in a Master Texture corresponds to n-1 other facets that are exactly the same size and shape, this same packing map can be used to efficiently generate the remaining n-1 Master Textures. These new images, the mesh, and a single set of texture coordinates are all that is needed for efficient, view-dependent rendering of the object using the available images.

Figure 2: Example of facet extraction and warping. (top row) Three images of a real-world car object, with a single corresponding mesh face inset on each view. (bottom row) Facets are extracted and warped to uniform shape and size.

Figure 3: Example of facet combination and packing. (top row) Facets belonging to the same view are paired according to size. The facet at top right is rotated and combined with the next largest available facet so that both fit into a bounding box (bottom left). The result is inserted into a Master Texture image using the packing algorithm.

Figure 4: A Master Texture image computed for view 17 of the BMW model shown in Figures 1 and 2. The resulting image resolution is 438x686.

$$
C = A^{-1}B,
\qquad
A = \begin{bmatrix}
x_0 & y_0 & z_0 & 0 & 0 & 0 \\
0 & 0 & 0 & x_0 & y_0 & z_0 \\
x_1 & y_1 & z_1 & 0 & 0 & 0 \\
0 & 0 & 0 & x_1 & y_1 & z_1 \\
x_2 & y_2 & z_2 & 0 & 0 & 0 \\
0 & 0 & 0 & x_2 & y_2 & z_2
\end{bmatrix},
\qquad
B = \begin{bmatrix} 0 \\ 0 \\ w \\ 0 \\ 0 \\ h \end{bmatrix}
\qquad \text{(Equation 2)}
$$


Figure 5: Results of the Master Texture encoding applied to the three different test objects. Each example shows an exemplar image (top left), the object mesh (bottom left), an example Master Texture, and the rendered view (at right). (a) Lamp example, n views, m polygons. (b) Helicopter example, 36 views and 436 polygons. (c) Vehicle example, 28 views and 820 polygons. For related videos the reader is encouraged to visit www.metaverselab.org/research/digital-objects/master/index.html.

Because only redundant information was destroyed in the encoding process, renderings that use the Master Texture encoding have the same fidelity as those that make explicit use of the original image data.

4. RESULTS AND EXAMPLE APPLICATIONS
The Master Texture space encoding algorithm was tested on several datasets to demonstrate compression rates and efficient use of the Master Texture space. We demonstrate the encoding on datasets of varying mesh complexity and varying numbers of exemplar images. The criteria for evaluating the approach are 1) efficiency of rendering a Master Textured object from changing viewpoints, 2) compression rate of the Master Texture encoding as compared to traditional texture mapping, and 3) efficiency in editing the image-based data while it is stored in the Master Texture representation.

Three datasets were used to understand the behavior of the Master Texture encoding (shown in Figure 5). Exemplar images for the Lamp and Helicopter objects were rendered using a very high resolution object mesh with surface markings. Exemplar images of the third object, a BMW 325ti, were captured from the real world. The cameras were calibrated using a target visible in the exemplar images.

Encoding times are dependent on mesh resolution and the number of exemplar images. Typically, encoding times range from 8 minutes for 436 polygons to 26 minutes for 8020 polygons on a Pentium IV 1.7 GHz machine.

Once encoded into Master Texture space, rendering efficiency is improved by removing the need for multiple sets of texture coordinates. The transfer time required to move new texture coordinates from system memory into the graphics pipeline must be taken into account when rendering complex image-based data that cannot all be stored on a graphics card. For a polygon mesh of 8020 polygons we measured this transfer time to be 1.6 milliseconds for each view on a Pentium IV 1.7 GHz machine with a 4x AGP bus and a GeForce 3 graphics card. Using the Master Texture encoding, this fixed, per-view overhead is eliminated.
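The sketch below illustrates why the per-view overhead disappears: only the texture image changes with viewpoint, while a single coordinate set stays resident. The nearest-view selection rule, class, and data layout are illustrative and are not taken from the paper.

import numpy as np

class MasterTextureModel:
    # Simplified view-dependent selection with a single, shared set of texture
    # coordinates: only the texture image varies per frame, never the coordinates.
    def __init__(self, view_directions, master_textures, shared_tex_coords):
        self.view_directions = np.asarray(view_directions, dtype=float)  # n x 3 unit vectors
        self.master_textures = master_textures                           # n Master Textures
        self.shared_tex_coords = shared_tex_coords                       # 3m (u, v) pairs, loaded once

    def select(self, camera_direction):
        # Pick the exemplar view whose capture direction best matches the camera.
        d = np.asarray(camera_direction, dtype=float)
        k = int(np.argmax(self.view_directions @ d))
        # Traditional VDTM would also have to upload a per-view coordinate set here.
        return self.master_textures[k], self.shared_tex_coords

# Illustrative usage with two views of a model with 2 faces (6 texture coordinates).
textures = [np.zeros((4, 4, 3), np.uint8), np.ones((4, 4, 3), np.uint8)]
coords = np.random.rand(6, 2)
model = MasterTextureModel([[0, 0, 1], [1, 0, 0]], textures, coords)
tex, uv = model.select([0.9, 0.1, 0.2])
print(tex.mean(), uv.shape)   # second view selected, same coordinate set every frame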

Compression rates for each of the three different objects were measured and, in the case of the "Heli" dataset, mesh complexity was varied. Compression rates are measured as the ratio of total bytes required for traditional VDTM versus VDTM using the Master Texture encoding. Table 1 summarizes the results of the experiment.

Object   Mesh Size   Views   Image Size   Master Texture Size   Compression Ratio
Lamp     1842        90      640x480      466x932               1 : 2.2
Heli     436         360     720x576      316x540               1 : 2.4
Heli     6,604       72      720x576      620x1214              1 : 1.7
Heli     67,259      36      720x576      820x1650              3 : 1
Car      651         28      640x480      438x686               1 : 1.1

Table 1: Compression ratios for three different datasets of varying mesh size, exemplar image size, and number of views.


Compression rates, particularly for complex meshes and many exemplar views, are not dramatic. This trend is demonstrated by the "Heli" model in Table 1. As mesh complexity and the number of views increase, mesh faces are likely to be seen as small (foreshortened) triangles in some views. These are resized to the maximum image triangle as seen in any view during the encoding process. This increases the resulting Master Texture size and decreases compression rates. This is unsurprising, in that the Master Texture encoding was developed to support image-based rendering scenarios. As the mesh size increases, the geometry begins to dominate the image data and the situation begins to approximate traditional geometric rendering. In cases where the mesh size is reasonable, the Master Texture encoding significantly decreases the amount of storage required for VDTM while increasing rendering efficiency.

Image-Based Editing
Once a set of images is captured under controlled conditions or rendered for an image-based rendering dataset, post-editing of the image data is a cumbersome process [Sei98]. For example, multiple views of an indoor scene captured with a calibrated camera may be used to render photorealistic architectural walkthroughs. If the scene is to be modified (e.g. a billboard advertisement is inserted), all images must be modified by back projecting the geometric position of the new object into all views using the camera calibration information. This requires 16*n*v multiplies for an object of v vertices seen in n different exemplar views.

Alternatively, an image-based approach may be taken by painting the new billboard into each view appropriately. Editing of an exemplar view, however, requires that pixel correspondences are known for all exemplar views containing the new object. Discovering correspondences can be costly and, particularly in the case of real-world imagery, may be prone to error.

Because Master Texture images are pixel-wise aligned across all views regardless of the relative viewing geometry between the model and each view, changes to any one of the Master Textures can be easily propagated to all views. This can be performed efficiently without the need to rediscover pixel correspondences.

We demonstrate this technique using two examples. In one example, changes made directly to an exemplar image are propagated to all views to support photorealistic rendering of the object containing the new data. In a second, more complex example, changes to the underlying geometry are first rendered to create a new image that is encoded using a known packing map to produce a new Master Texture image that can then be incorporated into the Master Texture space.

4.1.1 Direct Image-Based Editing of an Existing Master Texture Encoded Space
A dataset was first created by capturing 28 controlled views of a car rotating in front of a calibrated digital camera. Exemplar views were captured at a resolution of 640x480 pixels. A corresponding geometric model containing 651 polygons was created by hand digitizing points on the object. The object mesh and an exemplar view are shown in Figure 5(c).

Using the mesh, images, and corresponding camera calibration information, each exemplar view was encoded into Master Texture space, resulting in 28 Master Texture images.

Figure 6: An image-based editing example. The original image and corresponding Master Texture are shown at top left. The edited image and the new Master Texture are shown next in the top row (a license plate has been added in this case). Pixelwise subtraction of the new Master Texture from the old reveals the pixels that have changed (top right). These pixels are written directly into all Master Texture views to update the complete model, which can then be viewed from new positions (bottom row).


An example Master Texture image created from one exemplar is shown in Figure 6 (top left).

The exemplar view I directly behind the car was edited by inserting a license plate. The resulting image I' was encoded to produce a new Master Texture image, M'. Figure 6 (top row) shows the initial exemplar image I, its Master Texture encoding M, the edited exemplar I', and the new Master Texture M'.

Since I and I’ were both encoded using the sameprocedure, M and M’ will also be identical except forthe edited regions. Therefore subtraction of M andM’ yields a set of pixels, D , in the Master Texturespace that correspond to edited regions for all views.Figure 6 (top right) shows a difference imagecontaining only pixels from the set D.

All views in the Master Texture space are updated through a pixel-wise addition of D. This operation requires no processing to determine (or search for) pixel correspondences and does not make explicit use of the camera calibration information (which has already been used implicitly in the creation of the Master Texture space). Furthermore, pixelwise addition of image arrays can be easily implemented in hardware. The bottom row of Figure 6 shows the car rendered from three views using the modified Master Texture images.
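A minimal sketch of this update step is shown below, assuming the Master Textures are stored as equally sized NumPy arrays; it writes the changed pixels directly into every view, which is equivalent to the pixel-wise addition of D described above. The array shapes and the edit itself are illustrative.

import numpy as np

def propagate_edit(master_textures, edited_index, edited_texture):
    # Propagate an edit made in one Master Texture to all views (Section 4.1.1).
    # Because the Master Textures are pixel-wise aligned, the changed pixels D can
    # be found by subtraction and written directly into every other view.
    M = master_textures[edited_index].astype(np.int16)
    M_edited = edited_texture.astype(np.int16)
    changed = np.any(M_edited != M, axis=-1)           # the set D: pixels that were edited
    for tex in master_textures:
        tex[changed] = edited_texture[changed]          # pixel-wise write, no correspondence search
    return changed

# Example with illustrative 4-view data: edit view 0 and propagate to all views.
views = [np.zeros((8, 8, 3), dtype=np.uint8) for _ in range(4)]
edited = views[0].copy()
edited[2:4, 2:4] = 255                                  # e.g. paint a license-plate region
mask = propagate_edit(views, edited_index=0, edited_texture=edited)
print(mask.sum(), views[3][3, 3])                       # 4 changed pixels, now present in every view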

4.1.2 Updating Master Textures from Changes in an Arbitrary View
A second example demonstrates how the Master Texture space supports efficient updating of images when a new view is introduced that does not correspond to an existing exemplar image (as in the previous example). Using techniques similar to the one shown here, relighting of the object mesh, changes to material properties on the surface, shadowing effects, and other object-space updates can be rapidly propagated into an existing Master Texture space.

Starting with the original dataset described in Section 4.1.1, a tree model and light source are placed into the virtual scene containing the car mesh. A shadow is then cast on the matte-shaded mesh surface. In a traditional image-based rendering scenario, updating all existing exemplar views with the new lighting effect requires significant computation.

In order to propagate the new effect into Master Texture space, a viewpoint that shows the desired effect (in this case the shadow) is selected. From that view, the shaded mesh is rendered both with and without the desired effect enabled, resulting in two new images I and I', respectively. An image of the object mesh with the shadow effect enabled is shown in Figure 7 (top left).

Both images are then encoded into the Master Texture space using the same packing map P of the original dataset. As before, a set of difference pixels is derived by subtracting the two new Master Texture images M and M'. The difference image is shown in Figure 7 (top right). These differences are propagated into all views through pixel-wise addition across the n Master Texture views.

Once a new set of Master Textures has been computed, the new image-based model can be dynamically observed without having to recompute the effects of the shadow at render time. Figure 7 (bottom row) shows two views of the new model texture mapped with the modified Master Textures.

In this example, the shadow will appear consistent in scenes where the car and shadow are stationary. Figure 8 depicts the new image-based model placed into a scene containing the shadow-casting tree. Using image-based techniques for efficiency, static shadows for all objects in the scene are precomputed and texture mapped onto a planar surface (Figure 8a). Placement of the old image-based model into the scene results in a car without a shadow and a perceptually inconsistent scene (Figure 8b). The new image-based scene appears consistent without the need for expensive shadow casting operations to be computed at render time, and can now be viewed from any viewpoint in real time (Figure 8c) (see www.metaverselab.org/research/mastertexture for related videos).

CONCLUSION
We have introduced the Master Texture encoding and have demonstrated how it can be used to improve compression and efficiency for view-dependent texture mapping applications. Examples presented here show that for sparse meshes and large numbers of images, the Master Texture encoding achieves over 50% compression for common image-based, view-dependent texture mapping scenarios.

Figure 7: An image-based editing example. This example adds a shadow to the sequence of master images by projecting a computer-generated shadow onto the geometric model of the car.


The pixel-wise alignment of the Master Texture images supports rapid modification of the Master Texture space that can be directly propagated into all views without explicit use of image calibration information. We are currently exploring how the Master Texture encoding can support entire scenes rather than a single object. We expect that a single set of globally consistent texture coordinates can be applied to site walkthroughs, for example.

5. REFERENCES
[Arc02] ArchVision RPC technology. Real trees, real people. www.rpcnet.com

[Bue01] Buehler, C., Bosse, M., McMillan, L., Gortler, S., and Cohen, M. Unstructured Lumigraph Rendering. In Proceedings, ACM SIGGRAPH 2001, pp. 425-432, August 2001.

[Car01] Carpenter, C.S., Seales, B., Jaynes, C., and Stevens, R. Automated basis-view and match-point selection for the ArchVision RPC image-based model. In Proc. of the Inter. Conf. on Multimedia, September 2001.

[Che93] Chen, S. E. and Williams, L. View interpolation for image synthesis. In Proceedings, ACM SIGGRAPH 1993, pp. 279-288, July 1993.

[Che02] Chen, W-C., Bouguet, J-Y., Chu, M., and Grzeszczuk, R. Light Field Mapping: Efficient Representation and Hardware Rendering of Surface Light Fields. In Proceedings, ACM SIGGRAPH 2002, pp. 447-456, 2002.

[Deb96] Debevec, P., Taylor, C., and Malik, J. Modeling and rendering architecture from photographs: A hybrid geometry- and image-based approach. In Proceedings, ACM SIGGRAPH 1996, pp. 11-20, August 1996.

[Deb98] Debevec, P., Yizhou, Y., and Borshukov, G. Efficient View-Dependent Image-Based Rendering with Projective Texture-Mapping. Eurographics Rendering Workshop 1998, pp. 105-116, June 1998.

[Gor96] Gortler, S., Grzeszczuk, R., Szeliski, R., and Cohen, M. The lumigraph. In Proceedings, ACM SIGGRAPH 1996, pp. 43-54, August 1996.

[Hec91] Heckbert, P. and Moreton, H. Interpolation for polygon texture mapping and shading. State of the Art in Computer Graphics: Visualization and Modeling, 1991.

[Lav94] Laveau, S. and Faugeras, O. 3-D scene representation as a collection of images. In Proceedings of the 12th Inter. Conf. on Pattern Recognition, Volume 1, pp. 689-691, 1994.

[Lev96] Levoy, M. and Hanrahan, P. Light field rendering. In Proceedings, ACM SIGGRAPH 1996, pp. 31-42, 1996.

[Lip80] Lippman, A. Movie-Maps: An Application of the Optical Videodisc to Computer Graphics. In Proceedings, ACM SIGGRAPH 1980.

[Mat02] Matusik, W., Pfister, H., Ngan, A., Beardsley, P., Ziegler, R., and McMillan, L. Image-Based 3D Photography using Opacity Hulls. In Proceedings, ACM SIGGRAPH 2002, pp. 427-437, 2002.

[Mcm97] McMillan, L. An Image-based Approach to Three-Dimensional Computer Graphics. PhD thesis, U. of North Carolina, Chapel Hill, 1997.

[Mcm95] McMillan, L. and Bishop, G. Plenoptic Modeling: An image-based rendering system. In Proceedings, ACM SIGGRAPH 1995.

[Nis99] Nishino, K., Sato, Y., and Ikeuchi, K. Appearance compression and synthesis based on 3D model for mixed reality. ICCV '99, pp. 38-45, 1999.

[Nis01] Nishino, K., Sato, Y., and Ikeuchi, K. Eigen-Texture Method: Appearance Compression and Synthesis based on a 3D Model. IEEE PAMI, 23:11, pp. 1257-1265, Nov. 2001.

[Pul97] Pulli, K., Cohen, M., Duchamp, T., Hoppe, H., Shapiro, L., and Stuetzle, W. View-based rendering: Visualizing real objects from scanned range and color data. In Proc. of the 8th Eurographics Workshop on Rendering, St. Etienne, France, pp. 22-34, June 1997.

[Sei98] Seitz, S. and Kutulakos, K. Plenoptic Image Editing. In Proceedings of the 5th ICCV, pp. 17-24, 1998.

Figure 8: Rapid relighting of a Master Texture dataset. (a) Scene containing precomputed shadows. (b) Car object added before the relighting effect; inconsistent shadow cues destroy realism. (c) Object added to the scene after the Master Textures have been modified.