854 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO ...mcl.usc.edu/wp-content/uploads/2014/01/200507-Rate-distortion... · 854 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO

854 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 15, NO. 7, JULY 2005

Rate-Distortion Optimized Compressionand View-Dependent Transmission of

3-D Normal MeshesJae-Young Sim, Student Member, IEEE, Chang-Su Kim, Member, IEEE, C.-C. Jay Kuo, Fellow, IEEE, and

Sang-Uk Lee, Senior Member, IEEE

Abstract—A unified approach to rate-distortion (R-D) optimizedcompression and view-dependent transmission of three-dimen-sional (3-D) normal meshes is investigated in this work. A normalmesh is partitioned into several segments, which are then encodedindependently. The bitstream of each segment is truncated opti-mally using a geometry distortion model based on the subdivisionhierarchy. It is shown that the proposed compression algorithmyields a higher coding gain than the conventional algorithm. More-over, to facilitate interactive transmission of 3-D data accordingto a client’s viewing position, the server can allocate an adaptivebitrate to each segment based on its visibility priority. Simulationresults demonstrate that the view-dependent transmission tech-nique can reduce the bandwidth requirement considerably, whilemaintaining a good visual quality.

Index Terms—Independent partitioning, normal meshes, rate-distortion optimization, view-dependent transmission, 3–D meshcompression.

I. INTRODUCTION

THREE-dimensional (3-D) data compression is an essentialtechnology to facilitate the transmission of 3-D models

over the wired or wireless Internet. Especially, progressivecompression techniques are useful when transmitting a hugeamount of 3-D data interactively, since they enable successivereconstruction of the data from low to high resolutions. Therehas been a large amount of work on the progressive codingof triangular meshes [1]–[4], since this mesh representation isprevalent as a modeling tool for 3-D objects. A triangular meshconsists of topology as well as geometry data. The topologydata characterize the connectivity information among vertices,while the geometry data describe vertex positions. In progres-sive mesh coding [2]–[4], a triangular mesh is converted to aset of simplified meshes with various level-of-details (LODs).Then, topology and geometry data are compressed in theincreasing order of LODs.

Manuscript received February 12, 2003; revised July 15, 2004. This paperwas recommended by Associate Editor H. Shum.

J.-Y. Sim and S.-U. Lee are with the School of Electrical Engineering andComputer Science, Seoul National University, Seoul 151–742, Korea (e-mail:[email protected]; [email protected]).

C.-S. Kim is with the Department of Information Engineering, the ChineseUniversity of Hong Kong, Shatin, Hong Kong (e-mail: [email protected]).

C.-C. J. Kuo is with the Department of Electrical Engineering, Univer-sity of Southern California, Los Angeles, CA 90089-2564 USA (e-mail:[email protected]).

Digital Object Identifier 10.1109/TCSVT.2005.848349

A triangular mesh generally has an irregular topology struc-ture, which demands a higher bitrate for coding. To overcomethis drawback, efficient progressive mesh compression algo-rithms have been proposed in [5]–[7] based on remeshingtechniques. Remeshing is a process of converting an irregularmesh into a semiregular one, in which the valence of most ver-tices is equal to six. In other words, most vertices are connectedto six adjacent vertices. The topology data of the resultingsemiregular mesh can be described very compactly. The geom-etry data can also be effectively compressed by exploiting theregular structure of vertices. Specifically, the geometry data canbe regarded as wavelet coefficients, which can be compressedwith a zerotree coding algorithm [8].

Although wavelet-based algorithms [5]–[7] achieve a highcoding gain, the local characteristics of wavelet coefficients canbe exploited to improve the coding gain further. In this paper,we propose an algorithm that first divides an input mesh intoseveral segments and then encodes each segment independentlyaccording to its local characteristics. An optimal rate is assignedto each segment to minimize the geometry distortion subjectto the constraint on a total rate with the Lagrangian multipliermethod. Simulation results demonstrate that the proposed algo-rithm yields a higher coding gain than the conventional waveletalgorithm described in [7].

Furthermore, since many Internet-based 3-D applications re-quire interactive communications between a server and multipleclients, we propose a view-dependent streaming algorithm tofacilitate the server-client interactions. The proposed algorithmencodes each segment independently so that the compressed bit-stream for each segment can be truncated at an arbitrary point.We employ a segment visibility measure to find the optimalset of truncation points, which maximizes the quality of recon-structed 3-D mesh from a client’s viewing position. It is shownthat the transmission bandwidth can be reduced significantly byallocating a major portion of bitrates to visible segments.

This paper is organized as follows. Related previous work isreviewed in Section II. The remeshing algorithm, which con-verts an irregular mesh into a normal one, is described in Sec-tion III. The rate-distortion (R-D) optimized compression algo-rithm and the view-dependent transmission algorithm are de-tailed in Sections IV and V, respectively. Simulation results arepresented in Section VI. Finally, concluding remarks are givenin Section VII.

1051-8215/$20.00 © 2005 IEEE

SIM et al.: RATE-DISTORTION OPTIMIZED COMPRESSION AND VIEW-DEPENDENT TRANSMISSION OF 3-D NORMAL MESHES 855

Fig. 1. Remeshing example, where the triangles of the semiregular mesh are illustrated with a normal flipping pattern to show the semiregular connectivity. (a)Irregular mesh. (b) Base mesh. (c) Semiregular mesh.

II. REVIEW OF PREVIOUS WORK

A. Remeshing and Wavelets

Remeshing is a technique of converting an irregular meshinto a semiregular one [9]–[11]. A remeshing example is givenin Fig. 1. An irregular mesh is simplified to a base mesh, andthen the base mesh is refined to a semiregular mesh by addingnew vertices systematically. Note that vertices in the semireg-ular mesh are positioned more regularly than those in the irreg-ular mesh, and most vertices in the semiregular mesh have avalence 6.

The remeshing process consists of two phases: parameteriza-tion and refinement. In the parameterization phase, a mappingfrom a two-dimensional (2-D) domain, which is homomorphicto a topological disk, to a 3-D surface is defined. Parameteriza-tion is usually performed in the mesh simplification step, whereeach base triangle is employed as the desired 2-D domain. Toexplain this idea clearly, let us consider an example depicted inFig. 1. In this figure, vertices within the region bounded by whitecurves in Fig. 1(a) are projected onto the corresponding basetriangle. These projected vertices are depicted by black dots inFig. 1(b). Each projected vertex onto the base triangle containsthe information of the original vertex position. By interpolatingthese original vertex positions, any point on the base triangle canbe mapped approximately to a point (not necessarily a vertex) inthe original mesh. Eck et al. [9] proposed an algorithm to con-struct a base mesh using the Voronoi partitioning. It computes apiecewise linear harmonic map that minimizes the parameteri-zation distortion using the conjugate gradient solver. Duchampet al. [12] developed a more efficient computation method forharmonic maps based on the hierarchical mesh reduction. Leeet al. [10] proposed an algorithm to find a smoother parame-terization, which employs vertex removal operations for meshsimplification.

Based on the parameterization information, the base meshcan be refined into a semiregular mesh with subdivision con-nectivity. Fig. 2 illustrates the 1–4 subdivision from the thlevel to the th level. Let us use , , and to representthree vertices at the th level, and , , and torepresent their corresponding vertices at the th level.The triangle is divided into the four sub-triangles,

, , , and

Fig. 2. The one-to-four subdivision.

by adding three new vertices , , and. There are several subdivision methods to determine the

positions of vertices at the th level from those of the thlevel [13]. They can be classified into two types: the interpo-lating and the approximating approaches. In the interpolatingapproach, a vertex at a coarser level is also a vertex at a finerlevel, i.e., , , . In contrast,vertex positions may not be preserved as the level increasesin the approximating approach. Fig. 3 represents the butterflysubdivision which is an interpolating scheme [14]. Fig. 3(a)shows the weights used to predict a newly added vertex .Thus, in Fig. 3(b), the position of a new vertex isobtained via

(1)

where , , and denote the positions of , , and, respectively.

The newly added vertex is predicted from the lower levelvertices so that it does not lie on the original mesh surface ingeneral. The actual remeshing point is found on the orig-inal surface which corresponds to the center of edgeon the parameter domain. Then the difference be-tween a remeshing point and its prediction is called a refinementvector or a wavelet coefficient. The semiregular mesh geometrycan hence be described by the base mesh geometry and the cor-responding hierarchical wavelet coefficients [15].

In addition, each refinement vector can be constrained tohave the direction of the surface normal, which is called thenormal mesh representation [11]. Then, each wavelet coeffi-cient is compactly represented by a scalar instead of a 3-D


Fig. 3. Butterfly subdivision. (a) Butterfly structure and weights. (b) Positionof newly added vertex v̂ is predicted by those of the neighboring vertices onthe butterfly structure.

vector, yielding further coding gain [7]. Fig. 4(a) illustrates thenormal remeshing procedure. The normal line is first evaluatedon , whose direction is determined by those of the neigh-boring vertices at the th level. Then, the piercing point , onwhich the normal line passes through the original mesh surface,is employed as a candidate for the remeshing point. Fig. 4(b)shows the parameter domain configuration, where denotesthe parameter coordinates of . Let .The piercing point can degrade the shape of the original meshwhen lies far from or there can be no piercing pointat all. In such a case, the remeshing point is determined from

, using the parameterization mapping, without the constraintof the normal direction. It is called a nonnormal vertex and itswavelet coefficient is described by a 3-D vector. In this work,the piercing point is accepted only if

The constant is set to 0.2 for the first and the second refinementlevels, and 0.6 for the finer refinement levels.

B. Coding

Schröder and Sweldens [16], [17] developed a scheme forthe wavelet representation of functions on a sphere. Based onthis scheme, Kolarov and Lynch [18] combined the wavelettransform and the SPIHT zerotree coding algorithm to com-press a function on 2-manifolds. They introduced the notion ofG-tree as a tree structure for general 2-manifolds. In a G-tree,the base manifold is the root node, and subdivided trianglesand higher level vertices become offsprings of a lower leveltriangle. Morán and García [5] extended this approach tocompress meshes with subdivision connectivity progressively.Moreover, Khodakovsky et al. [6], [7] proposed a progressivegeometry compression algorithm for semiregular meshes andnormal meshes based on the edge-based tree structure and thezerotree coder.

Fig. 5(a) shows the parent-offspring relation between the ver-tices at the th and the th levels based on the edge-basedtree. is a parent node and ’s are four off-springs of . Each at the th level has its own four off-springs at the th level, and so on. The vertex on each baseedge becomes the root node of an edge-based tree. In Fig. 5(b),

is a base triangle, and three vertices , , andare root nodes. The vertices at the second level are classified into

Fig. 4. Normal remeshing. (a) Refinement step and (b) its configuration onthe parameter domain, where u(v) denotes the parameter coordinates of v. Thevalidity of the piercing point is tested by checking whether it lies within theshaded circle on the parameter domain or not.

three sets ’s, ’s, and ’s, which are the offsprings of, , and , respectively. In this way, all vertices in a semireg-

ular mesh can be covered, without overlapping, by base verticesand a number of edge-based trees.

C. View-Dependent Transmission

It requires a huge amount of computations to render detailed3-D models. Many attempts have been made to develop fast andefficient rendering techniques. An approach is the view-depen-dent rendering, which assigns different LODs according to thevisibility of the models. Several algorithms have been proposedfor the view-dependent simplification and refinement of heightfield surfaces [19], irregular meshes [20]–[22], and meshes withsubdivision connectivity [23], [24].

Some effort has been also made on the view-dependent com-pression and transmission of 3-D models to facilitate interactivecommunication of 3-D data in server-client environments. In[25], [26], vertex split operations are transmitted to selectivelyrefine a 3-D model based on the vertex hierarchies. Yang etal. [27] proposed a view-dependent transmission algorithm forcompressed bitstreams of irregular meshes. Grabner and Zach[28] employed an adaptive quantization scheme to provide lo-cally adaptive resolutions to geometry data. Gioia et al. [29]proposed a rendering and transmission algorithm for semireg-ular meshes, which can add or suppress wavelet coefficients toperform view-dependent reconstruction of large meshes at thedecoder side. Also, texture data can be transmitted in a view-de-pendent way [30].

III. PREPROCESSING

In this work, an input irregular mesh is first simplified toa base mesh . Then, is refined times

to obtain a normal mesh .In the simplification process, we use half-edge collapse oper-

ations with quadric error metrics [31] and the polar map [10] toobtain the initial base mesh and its parameterization. The sim-plification scheme in [31] provides base mesh with an ac-ceptable visual quality. However, can be preprocessed for


Fig. 5. Edge-based tree. (a) Parent-offspring relation, where v is a parent node and v ’s (1 � j � 4) are four offsprings of v . (b) Covering of vertices. Thereare three root vertices v , v , and v .

better geometric fidelity to provide a robust remeshing result. In[11], a base vertex is repositioned to be closer, in the parameterdomain, to the center of the adjacent base triangles. In this work,we reposition base vertices not only to balance them in the pa-rameter domain but also to reduce the squared sum of waveletcoefficients to improve the coding gain.

Let represent the wavelet coefficient for . Wavelet co-efficients ’s for the roots of edge-based trees are includedin an energy function, and base vertices are translated to min-imize the energy function. Fig. 6(a) shows how base vertexis used in the prediction of associated vertices at the first level.Note that ’s, ’s, and ’s have as a butterfly neighbor.Based on the structure in Fig. 3, they are predicted from withbutterfly weights 1/2, 1/8, and 1/16, respectively. Thus, when

is repositioned, wavelet coefficients or prediction errors of’s, ’s, and ’s change accordingly. The energy func-

tion for is defined as

(2)

which is the squared sum of affected wavelet coefficients.Fig. 6(b) represents the parameter domain corresponding to

the base triangles adjacent to . The small dots depict the orig-inal vertices, which are mapped to this region by the polar map[10]. As shown in Fig. 6(b), is translated to one of those dots,

, to minimize the cost function in (2). This procedure is iter-atively applied to all base vertices until the decrease in the totalenergy becomes negligible. Fig. 7 shows the energy minimiza-tion results for the “Teeth” model. It is observed that the basemesh yields a more uniform triangles as the iteration continues.

In the energy minimization procedure, the initial parameteri-zation is modified following the change of the base mesh. Then,with the modified parameterization, the energy-minimized basemesh is repeatedly refined to obtain a normal mesh .The butterfly subdivision is adopted in the refinement.

Fig. 6. Energy minimization procedure. (a) Vertices at the first level which arepredicted from the base vertex v based on the butterfly structure. (b) In theparameter domain, u(v ) is moved to u(�v ) to minimize the energy function.

Consequently, can be represented by the base mesh andits geometrical refinements for higher level vertices, i.e., theirwavelet coefficients. The base mesh can be encoded by anystatic mesh compression algorithm, e.g., the algorithm in [32]or [33]. The wavelet coefficients can be compressed effectivelyby exploiting the hierarchical structure. An R-D optimizedwavelet coefficient compression algorithm is proposed in thenext section.


Fig. 7. Results of energy minimization applied to the “Teeth” model.

IV. R-D OPTIMIZED WAVELET COEFFICIENT COMPRESSION

While conventional algorithms proposed in [5]–[7] encode allwavelet coefficients as a whole, our algorithm partitions theminto disjoint segments and encodes each segment adaptively tomaximize the R-D performance.

A. Mesh Partitioning

There are several partitioning algorithms for irregularmeshes. For example, Mangan and Whitaker [34] used thewatershed scheme to partition a 3-D mesh, and Zuckerberger etal. [35] partitioned a polyhedral surface based on the floodingconvex decomposition and the watershed segmentation. Com-pared with irregular mesh partitioning, semiregular meshes canbe partitioned in a relatively simple way. As described in Sec-tion II-B, edge-based trees can provide a disjoint partitioningof vertices in a semiregular mesh [6]. Alternatively, a segmentcan be defined for each base triangle, and the vertices can bepartitioned into appropriate segments [5].

We adopt an edge-based tree in Fig. 5 as one segment for anormal mesh in this work. However, as shown in Fig. 5(b), threetree segments, which grow from , , and , intersect oneanother within the base triangle . Thus, the conven-tional edge-based tree structure cannot yield a compact localiza-tion of a segment. To address this problem, we propose two moreedge-based tree structures called the “scissor” tree and “arrow”tree in this work. Fig. 8(a) shows a scissor tree and Fig. 8(b)illustrates the covering of vertices using scissor trees. All fouroffsprings ’s are reachable from the parentvia a single edge at the th level. In contrast, in Fig. 5(a),

or is reachable from via at least two edges. Thus,the scissor tree structure alleviates the intersection problem andprovides a more compact localization of a segment than the con-ventional tree structure.

Similarly, as shown in Fig. 8(c) and (d), the arrow tree struc-ture also localizes a segment compactly. However, two arrowtrees can conflict with each other at a base edge, which is in-cident with an odd valence vertex. Fig. 9 shows an exceptionalpatching example, where dotted arrows represent the directionsof arrow trees. Since the valence of is 5, the orientationsof two base triangles and conflict witheach other at . In other words, edge cannot be as-sociated with an arrow tree. In this case, a scissor tree is used

instead to reconcile the opposite directions and provide a dis-joint covering of all vertices.

All three types of edge-based trees can be employed in the fol-lowing R-D optimized compression and view-dependent trans-mission. Especially, in view-dependent transmission, the scissorand arrow structures provide better visual quality than the con-ventional structure, since they result in the compact localizationof a segment.

B. Optimal Truncation of Bitstreams

Each segment (i.e., the edge-based tree) in a normal meshis encoded by the SPIHT algorithm [8]. As mentioned inSection II-A, there are two types of vertices in a normal mesh:normal vertices and nonnormal vertices. We use one bit pervertex to specify whether it is a normal vertex or not. TheSPIHT algorithm can be directly applied to encode scalarwavelet coefficients for normal vertices. On the other hand,to encode a vector wavelet coefficient for anonnormal vertex, isfirst encoded as a scalar, where denotes the sign of

. After the scalar is encoded as significant, the encoding ofand follows.

In the SPIHT algorithm, the compressed bitstream consists ofsignificant bits, sign bits, and magnitude refinement bits. In thiswork, truncation points are defined as last bits of significance en-coding passes and magnitude refinement passes for each vertex,and a packet is defined as a set of bits between two consecutivetruncation points. Let denote the th segment, and de-note the th packet for . Also, let denote the cumulativenumber of bits in the first packets , anddenote the distortion of the segment reconstructed using thefirst packets. Then, the packet is associated with the in-cremental rate and distortion

It is expected that the R-D curve is convex so that the slopeis a monotonic decreasing function of

. However, it is possible for some that is larger than. This abnormality is eliminated by merging and

into one packet [36].Let be the total bit budget for all segments with

, where is the number of segments. Then, the


Fig. 8. Two modified edge-based trees. (a) Scissor tree. (b) Covering of vertices using scissor trees. (c) Arrow tree. (d) Covering of vertices using arrow trees.

optimization problem is to find the set of ’s that minimizessubject to the constraint .

To find the set, we minimize the following cost function:

(3)

where is the Lagrangian multiplier. By varying , the optimalsolution satisfying the rate constraint can be found. In practice,all packets are sorted in the decreasing order of slopes .Then, the encoder continues popping the packet with the largestslope from the sorted list and transmitting it to the decoder, untilthe total bit budget is exhausted.

C. Distortion Model

The geometry distortion between the originalnormal mesh and a reconstructed mesh is defined as

(4)

where is the number of vertices in , and and denotethe original and reconstructed positions of , respectively. Theremeshing error is not incorporated in the geometry distortion,since remeshing and compression are performed separately.Note that the remeshing error is usually negligible as comparedto the quantization distortion in the compression procedure.


Fig. 9. Exceptional patching example of the arrow trees. Since a base vertexv has an odd valence 5, the directions of two triangles (v ; v ; v ) and(v ; v ; v ) conflict each other on edge (v ; v ). We substitute the arrow treewith the scissor tree to achieve a disjoint covering of vertices.

Vertex positions are related to wavelet coefficients that are pre-dictionerrors in the butterfly subdivision.Thus, the geometry dis-tortion is related to themeansquarederrorofwavelet coefficients.However, since the synthesis filter in the butterfly subdivision isnotorthogonal,thegeometrydistortionisnotlinearlyproportionalto the mean squared error of wavelet coefficients, i.e.,

where and are the original and reconstructed waveletcoefficients of , respectively. Obviously, wavelet coefficientscan be inverse-transformed to vertex positions, and the exactgeometry distortion can be computed. However, this approachdemands too high computational burden in the R-D optimiza-tion procedure. In our approach, wavelet coefficient errors areweighted depending on their remeshing levels to approximatethe geometry distortion at a modest computational complexity.A similar approach was adopted in image compression [37].

Fig. 3 shows a vertex at the fine level and its eight but-terfly neighboring vertices at the coarse level . Conversely, avertex at the coarse level is used to predict a number of verticesat the fine level. In Fig. 10, vertex at level is used for the pre-diction of ’s, ’s, and ’s at level with weights1/2, 1/8, and 1/16, respectively. In a normal mesh, all verticesexcept a few base vertices have a valence equal to 6. Thus, if

, the numbers of ’s, ’s, and ’s are equal to6, 6, and 12, respectively. Since is used for the predictionof the associated vertices, the position error of propagates to

’s, ’s, and ’s, and the propagation effect (includingthe error of itself) can be approximated by a weighting param-eter

(5)

Fig. 10. Locally associated vertices of v based on the butterfly subdivision.The vertices at level l are classified into v ’s, v ’s and v ’s, which aredepicted in different gray shades. The vertex v is a butterfly neighbor of v ’s,

v ’s and v ’s with weighs 1/2, 1/8, and �1/16, respectively.

The vertex position errors at the th and th levels alsopropagate to associated vertices at the th level. It can beeasily shown that the error propagation of to the th and

th level vertices, including its own error, can be approxi-mated by a weighting parameter , assuming that the errors ofneighboring vertices are statistically independent. In a similarway, the error propagation of to the whole normal meshcan be approximated by

(6)

where is given by (5).Thus, distortion in (3) can be obtained by multiplying

the squared error of each wavelet coefficient at the th level withweight and summing up errors from all levels

(7)

where is given by (6). Intuitively speaking, wavelet coeffi-cients at a lower level represent low frequency components ofan input mesh, and are used in the prediction of higher level co-efficients. Thus, wavelet coefficients at a lower level are moreimportant and, therefore, multiplied by a larger weighting pa-rameter in our distortion model.

V. VIEW-DEPENDENT TRANSMISSION

In 3-D model streaming applications, the client’s viewingposition can be exploited to maximize visual quality by allo-cating a major portion of the available bandwidth to visibleparts. Fig. 11 illustrates the principle of view-dependent trans-mission, where the viewing position lies on . The bold linefrom to depicts the visible region of the object. The servercan reduce the required bandwidth by transmitting only the datafor the visible region. Following discussion in the previous twosections, we propose a more sophisticated algorithm for view-


Fig. 11. View-dependent processing with respect to a viewing position.

dependent compression and transmission of normal meshes inthis section.

A. Visibility Test and Visibility Priority

The viewing parameter of a client is sent to the server througha feedback channel. With this information, the server performsa visibility test and assigns visibility priorities to vertices. Thesimplest way to find the visibility is to use the angle betweenthe surface normal vector and the viewing vector. The surfacenormal test has been used for many graphic processing proce-dures, such as back-face culling, LOD control, silhouette con-struction and collision detection [21]–[23], [26], [27]. Note thatthe surface normal test is valid only if the surface is closed.

In a normal mesh, the visibility of a vertex can be deter-

mined by two vectors, the vertex normal vector and the

viewing position vector that begins at the center of objectand ends at the viewing position . Let denote the angle

between and , given by

(8)

Vertex is more visible to the client as approaches 0. Itis invisible if is larger than . In other words, the visi-bility is proportional to and is 0 if . Sincewe employ the mean squared error of vertex positions as the ge-ometry distortion measure, the self priority of fromthe viewing position is defined as

if

if .(9)

It is called the self priority, since it considers only the view-dependent distortion of itself.

However, as mentioned in Section IV-C, vertices in a normalmesh are related to one another according to the subdivision hi-erarchy, and the position error of a lower level vertex propagatesto higher level vertices. Therefore, the overall visibility priorityof a vertex should take into account the error propagation effectas well. With the notation given in Fig. 10, we can obtain the

overall visibility priority of each vertex from the highest levelto the lowest level recursively via

(10)

Also, the visibility priority of vertex at the highest level isequal to its self priority

(11)

The visibility priority is equal to weighting parametergiven by (6), if we assume all vertices are perfectly visible,

i.e., for all ’s.It is worthwhile to point out that the view-dependent distor-

tion was defined as the projected geometric error in [19]. In otherwords, a vertex is assigned a higher priority if the angle betweenthe normal vector and the viewing vector is closer to . Thisput more emphasis on faithful reconstruction of the object sil-houette. However, in this work, the visibility priorities of ver-tices are not used for individual vertices but employed to allo-cate higher bitrates to more visible surface segments. Therefore,instead of using the sine function, we adopt the cosine functionin (9).

B. Optimal Truncation of View-Dependent Bitstreams

To support view-dependent transmission effectively, we in-corporate the above visibility criterion into the R-D optimizedcompression system. Let be the view-dependent distortionof the th reconstructed segment using the first packets. Tocompute , the visibility priorities of vertices are used in-stead of the weighting parameters ’s in (7). That is

(12)

The optimal view-dependent transmission is achieved byfinding the set of ’s that minimizes

subject to the constraint . This is basi-cally identical to the R-D optimized compression formulationpresented in Section IV-B except that the view-independentdistortion measure is replaced by the view-dependent one. Thus,the optimal set of ’s can be found using the methoddescribed in Section IV-B. By employing the view-dependentdistortion measure, the proposed algorithm can allocate thetotal bit budget more effectively to offer the maximum visualquality to a client.

Since the visibility priorities depend on the viewing position,they should be updated periodically and the R-D optimizationalgorithm should be performed based on updated visibilitypriorities. The view-dependent processing for dynamic viewing


Fig. 12. View-dependent processing with respect to dynamic viewing positions.

Fig. 13. Comparison of the original irregular mesh (left) and the corresponding normal mesh (right) for six test models with their rendered images.

positions is illustrated in Fig. 12. When an initial viewingposition lies on , the bold curve from to is visible.The visibility priorities are initialized to be ’s, and theR-D optimization is performed to transmit the firstpackets for each segment . When the viewing positionapproaches toward , the visible region expands to include

. The server performs the visibility test again, and updates

the visibility priorities to ’s. Then, based on the newinformation, the R-D optimization algorithm is performed onthe remaining packets. Thus, subject to the rate constraint,additional packets are transmitted foreach to maximize the visual quality from . Similarly,the update of visibility priorities and the R-D optimization areperformed based on the next viewing position .


C. Implementation for Real-Time Applications

The above algorithm computes the visibility priorities of allvertices whenever the viewing position changes. Thus, real-timeinteraction may be impeded by a high encoder complexity.We develop another implementation for real-time applications,which can reduce the computational complexity significantly atthe cost of modest performance degradation.

In this implementation, a visibility priority is assigned to eachsegment rather than to each vertex. For each segment , a seg-

ment normal is defined as the sum of vertex normals in thesegment and given by

(13)

Then, the visibility priority of is defined in a waysimilar to (9) via

ifif

(14)

where denotes the angle between and .Then, the view-dependent distortion in (12) is approx-

imated by

(15)

where is the view-independent distortion given by (7). Inother words, to approximate the view-dependent distortion, theview-independent distortion is weighted by the segment visi-bility priority. The server can pre-compute the view-indepen-dent distortion for each pair of and . When the viewingposition changes, the sever only computes the segment visibilitypriorities and approximates the view-dependent distortions with(15). Note that this approach requires much less computation,since it does not compute the visibility priorities of all vertices.

VI. SIMULATION RESULTS

A. Remeshing and R-D Optimized Compression ofNormal Meshes

The performance of the proposed algorithm is evaluated withsix test models as shown in Fig. 13, where the original irreg-ular mesh and the corresponding normal mesh are presentedside-by-side for visual comparison. Furthermore, five levels ofthe “Venus” normal mesh are shown in Fig. 14. Some propertiesof these test models are summarized in Table I.

The remeshing distortion in the table is calculated using theMETRO algorithm [38]. That is, in order to measure the distor-tion between two meshes and , is scan-convertedto yield a sufficient number of sample points. Then, the distancefrom to is given by the root mean of minimum squareddistances from the randomly selected sample points on to

. Similarly, the distance from to is computed. Thedistortion is given by the maximum of the two distances, and

Fig. 14. Representation of the “Venus” model at various levels.

TABLE IPROPERTIES OF TEST MESH MODELS. # V = NUMBEROF VERTICES IN AN

ORIGINAL IRREGULAR MESHM; # T = NUMBER OF TRIANGLES INM; # BT =NUMBER OF TRIANGLES IN A BASE MESHM ; # L = MAXIMUM

REMESHING LEVEL; # NNV = NUMBER OF NON-NORMAL VERTICES IN A

NORMAL MESHM

then normalized by the diagonal of the bounding box forand .

We compare the performance of the proposed R-D optimizedcompression algorithm with that of the conventional algorithm[7], which compresses a normal mesh as a whole. For a fair com-parison, a base mesh is assumed to be transmitted losslessly inboth algorithms. Fig. 15 compares the R-D curves of the pro-posed algorithm and the conventional algorithm. The unit of therate is in byte, and the distortion is measured using the METROalgorithm. The conventional algorithm encodes the forest ofedge-based trees in an implicit order. Thus, it requires no extra


Fig. 15. R-D curves for six test models. (a) BallJoint. (b) Horse. (c) Rabbit. (d) Santa. (e) Teeth. (f) Venus.

information except the SPIHT bitstreams. In contrast, the pro-posed algorithm partitions the forest into tree segments and as-signs a different bitrate to each segment. Therefore, additional

data should be transmitted to indicate the bitrate of each seg-ment. These additional data are taken into account in the R-Dcurves of the proposed algorithm.


As shown in Fig. 15, the proposed algorithm provides a bettercompression performance than the conventional algorithm formost test models, especially at low bitrates. The “Rabbit” modelis relatively smooth, and all of its segments have similar charac-teristics. Thus, the proposed algorithm does not provide a sub-stantial coding gain over the conventional algorithm. However,for other models, the proposed algorithm allocates a total rate ef-fectively according to the segment characteristics. For example,the conventional algorithm requires 4635 bytes to yield the re-constructed distortion of 5 10 on the “Horse” model, whilethe proposed algorithm consumes only 3262 bytes to yield thesame distortion.

Notice that even though the proposed algorithm uses the R-Doptimization technique, the obtained results are only subop-timal, since the resource is allocated based on the approximatedistortion model to alleviate the high computational complexityin Section IV-C.

B. View-Dependent Transmission of Normal Meshes

Next, we investigate the performance of the view-dependenttransmission method, when the viewing position is fixed. Fig. 16shows the rendered images of the reconstructed “Venus” modelsat various bitrates. The left and right images are obtained usingthe view-independent and the view-dependent methods, respec-tively. The two images with similar visual quality are shown atthe same row and labeled with the required bitrates. We see thatthe view-dependent transmission saves more than 50% of the bi-trates while providing similar visual quality by allocating a mainportion of the transmission bandwidth to visible segments.

In the above test, the conventional tree in Fig. 5 is employed.The modified trees in Fig. 8 provide almost the same R-D perfor-mance as the conventional tree. However, in the case of view-dependent transmission, the modified trees reconstruct visibleparts more reliably than the conventional one, since they localizesegments more compactly. The performance of the three edge-based trees in the view-dependent transmission case is com-pared in Fig. 17. The upper and lower rows render local parts ofthe “BallJoint” and “Santa” models, respectively. The bitratesfor the conventional, scissor, and arrow trees are 2836, 2844,and 2849 bytes for the “BallJoint” model, and 3770, 3793, and3814 bytes for the “Santa” model, respectively. It is observedthat the original shapes are reconstructed more faithfully usingthe modified trees.

Finally, we test the performance of the view-dependent trans-mission algorithm when the viewing position varies dynami-cally. In this test, the fast algorithm described in Section V-Cis used to approximate the view-dependent distortion to alle-viate the computational burden of the server, and the conven-tional edge-based tree is used. Fig. 18 shows the reconstructedmodels, when the viewing position rotates with an angular speed

(rad/sec) around the “Venus” model from the right face tothe left face. The upper (or lower) row renders a sequence ofthe front (or rear) part of the model observed. The required bi-trates are 3734, 5643, 6601, and 7100 bytes, from left to right,respectively. We see that good visual quality is offered to theclient by allocating higher bitrates to the front part, while detailsof the rear part are reconstructed gradually by including previ-ously visible regions. These simulation results indicate that the

Fig. 16. Comparison of the reconstructed “Venus” models withview-independent and view-dependent methods.

proposed algorithm is a promising technique for interactive 3-Ddata transmission.

VII. CONCLUSION AND FUTURE WORK

A unified framework to achieve the R-D optimized compres-sion and view-dependent transmission of 3-D normal mesheswas proposed in this work. The proposed algorithm partitionsa normal mesh into disjoint segments, and allocates an optimalbitrate to each segment to minimize the overall distortion. Wedeveloped a geometry distortion model to reduce the computa-tional complexity of the optimization process. It was also shownthat the visibility priorities of vertices or segments can be in-corporated in the distortion model to support view-dependenttransmission effectively. Simulation results demonstrated thatthe proposed algorithm is an efficient technique for 3-D datacompression and streaming.

As an extension of the current work, some future research canbe performed in the following areas. First, we may consider theregion-of-interest coding and transmission. In this work, view-independent and view-dependent distortion models are used inthe R-D optimization procedure. More general criteria can be


Fig. 17. Comparison of three edge-based trees in the view-dependent transmission, where the upper and the lower rows demonstrate the “BallJoint” and the“Santa” models, respectively.

Fig. 18. Reconstructed models with dynamic viewing positions, where the upper and lower rows show the front and rear parts of the “Venus” model, respectively.

used to enable various interactive applications. For example, aclient can inform the server of its region-of-interest. Then, theserver can find the set of segments which cover the region-of-in-terest, and then allocate the bit budget only to this set of seg-ments. Second, this framework can be used in the error-resilientcoding of 3-D models. To transmit 3-D data over error-pronechannels, compressed bitstreams should be resilient to transmis-sion errors. There have been some work on the error-resilientcoding of irregular meshes, e.g., [39]. Based on the edge-basedtree segmentation, we may modify the proposed algorithm toimprove the robustness of the compressed bitstreams for normalmeshes.

REFERENCES

[1] H. Hoppe, “Progressive meshs,” in Proc. SIGGRAPH, Aug. 1996, pp.99–108.

[2] J. Li and C.-C. J. Kuo, “Progressive coding of 3-D graphics models,”Proc. IEEE, vol. 86, no. 6, pp. 1052–1063, Jun. 1998.

[3] D. Cohen-Or, D. Levin, and O. Remez, “Progressive compression ofarbitrary triangular meshes,” in Proc. Visualization’99, 1999, pp. 67–72.

[4] R. Pajarola and J. Rossignac, “Compressed progressive meshs,” IEEETrans. Vis. Comput. Graphics, vol. 6, no. 1, pp. 79–93, Jan.–Mar. 2000.

[5] F. Morán and N. García, “Hierarchical coding of 3-D models with sub-division surfaces,” in Proc. ICIP, vol. 2, 2000, pp. 911–914.

[6] A. Khodakovsky, P. Schröder, and W. Sweldens, “Progressive geometrycompression,” in Proc. SIGGRAPH, 2000, pp. 271–278.

[7] A. Khodakovsky and I. Guskov, “Normal mesh compression,” inGeometric Modeling for Scientific Visualization. Berlin, Germany:Springer-Verlag, 2002.

[8] A. Said and W. Pearlman, “A new, fast, and efficient image codec basedon set partitioning in hierarchical trees,” IEEE Trans. Circuits Syst. VideoTechnol., vol. 6, no. 3, pp. 243–250, Mar. 1996.

[9] M. Eck, T. DeRose, T. Duchamp, H. Hoppe, M. Lounsbery, and W.Stuetzle, “Multiresolution analysis of arbitrary meshes,” in Proc. SIG-GRAPH, 1995, pp. 173–182.

[10] A. W. F. Lee, W. Sweldens, P. Schröder, L. Cowsar, and D. Dobkin,“MAPS: multiresolution adaptive parameterization of surfaces,” in Proc.SIGGRAPH, 1998, pp. 95–104.

[11] I. Guskov, K. Vidimce, W. Sweldens, and P. Schröder, “Normal meshes,”in Proc. SIGGRAPH, 2000, pp. 95–102.

[12] T. Duchamp, A. Certain, A. DeRose, and W. Stuetzle, “Hierarchicalcomputation of PL harmonic embeddings,” University of Washington,Tech. Rep., Seattle, Jul. 1997.

[13] D. Zorin and P. Schröder, “Subdivision for modeling and animation,”presented at the SIGGRAPH Course Notes, 1999.


[14] N. Dyn, D. Levin, and J. A. Gregory, “A butterfly subdivision schemefor surface interpolation with tension control,” ACM Trans. Graphics,vol. 9, no. 2, pp. 160–169, 1990.

[15] M. Lounsbery, T. DeRose, and J. Warren, “Multiresolution analysis forsurfaces of arbitrary topological type,” ACM Trans. Graphics, vol. 16,no. 1, pp. 34–73, Jan. 1997.

[16] P. Schröder and W. Sweldens, “Spherical wavelets: efficiently rep-resenting functions on the sphere,” in Proc. SIGGRAPH, 1995, pp.161–172.

[17] W. Sweldens and P. Schröder, “Building your own wavelets at home,”in SIGGRAPH Course Notes, 1996, pp. 15–87.

[18] K. Kolarov and W. Lynch, “Compression of functions defined onsurfaces of 3-D objects,” in Proc. Data Compression Conf., 1997, pp.281–291.

[19] P. Lindstrom et al., “Real-time continuous level of detail rendering ofheight fields,” in Proc. SIGGRAPH, 1996, pp. 109–118.

[20] J. C. Xia and A. Varshney, “Dynamic view-dependent simplification forpolygonal models,” in Proc. Visualization, 1996, pp. 335–344.

[21] H. Hoppe, “View-dependent refinement of progressive meshes,” in Proc.SIGGRAPH, 1997, pp. 189–198.

[22] D. Luebke and C. Efikson, “View-dependent simplification of arbitrarypolygonal environments,” in Proc. SIGGRAPH, 1997, pp. 199–208.

[23] D. I. Azuma, B. Curless, T. Duchamp, D. H. Salesin, W. Stuetzle, andD. N. Wood, “View-dependent refinement of multiresolution mesheswith subdivision connectivity,” Univ. Washington, Seattle, Tech. Rep.UW-CSE-2001-10-02, Oct. 2001.

[24] P. Alliez, N. Laurent, H. Sanson, and F. Schmitt, “Efficient view-depen-dent refinement of 3-D meshes using

p3-subdivision,” Vis. Comput.,

vol. 19, no. 4, pp. 205–221, 2003.[25] D. To, R. Lau, and M. Green, “A method for progressive and selec-

tive transmission of multiresolution models,” ACM Virtual Reality Softw.Technol., pp. 88–95, 1999.

[26] R. Southern et al., “A stateless client for progressive view-dependenttransmission,” ACM Web3-D, pp. 43–50, 2001.

[27] S. Yang, C.-S. Kim, and C.-C. J. Kuo, “A progressive view-dependenttechnique for interactive 3-D mesh transmission,” IEEE Trans. CircuitsSyst. Video Technol., vol. 14, no. 11, pp. 1249–1264, Nov. 2004.

[28] M. Grabner and C. Zach, “Adaptive quantization with error bounds forcompressed view-dependent multiresolution meshes,” in Proc. EURO-GRAPHICS, 2003, pp. 77–82.

[29] P. Gioia, O. Aubault, and C. Bouville, “Real-time reconstruction ofwavelet-encoded meshes for view-dependent transmission and visual-ization,” IEEE Trans. Circuits Syst. Video Technol., vol. 14, no. 7, pp.1009–1020, Jul. 2004.

[30] G. Lafruit et al., “View-dependent scalable texture streaming in 3-DQoS MPEG-4 visual texture coding,” IEEE Trans. Circuits Syst. VideoTechnol., vol. 14, no. 7, pp. 1021–1031, Jul. 2004.

[31] M. Garland and P. S. Heckbert, “Surface simplification using quadricerror metrics,” in Proc. SIGGRAPH, 1996, pp. 209–216.

[32] G. Taubin and J. Rossignac, “Geometric compression throught topolog-ical surgery,” ACM Trans. Graphics, vol. 17, no. 2, pp. 84–115, Apr.1998.

[33] C. Touma and C. Gotsman, “Triangle mesh compression,” in Proc.Graphics Interface, Jun. 1998, pp. 26–34.

[34] A. P. Mangan and R. T. Whitaker, “Partitioning 3-D surface meshesusing watershed segmentation,” IEEE Trans. Vis. Comput. Graphics,vol. 5, no. 4, pp. 308–321, Oct.–Dec. 1999.

[35] E. Zuckerberger, A. Tal, and S. Shlafman, “Polyhedral surface decompo-sition with applications,” Comput. Graph., vol. 26, no. 5, pp. 733–743,Oct. 2002.

[36] JPEG2000 Verification Model 6.0 (Technical Description), ISO/IECJTC1/SC29/WG1 WG1N1575, Jan. 2000.

[37] G. Davis and A. Nosratinia, “Wavelet-based image coding: anoverview,” Appl. Comput. Contr., Signals, Circuits, vol. 1, no. 1, pp.436–440, Spring 1998.

[38] P. Cignoni, C. Rocchini, and R. Scopigno, “METRO: measuring error onsimplified surfaces,” Comput. Graph. Forum, vol. 17, no. 2, pp. 167–174,1998.

[39] Z. Yan, S. Kumar, and C.-C. J. Kuo, “Error-resilient coding of 3-Dgraphic models via adaptive mesh segmentation,” IEEE Trans. CircuitsSyst. Video Technol., vol. 11, no. 7, pp. 860–873, Jul. 2001.

Jae-Young Sim (S’02) received the B.S. and M.S.degrees in electrical engineering from Seoul Na-tional University, Seoul, Korea, in 1999 and 2001,respectively, where he is currently working towardthe Ph.D. degree in electrical engineering andcomputer science.

His research topics include 3-D multimedia repre-sentation, compression, and transmission.

Chang-Su Kim (S’95–M’01) received the B.S.and M.S. degrees in control and instrumentationengineering and the Ph.D. degree in electricalengineering from Seoul National University (SNU),Seoul, Korea, in 1994, 1996, and 2000, respectively.

From 2000 to 2001, he was a Visiting Scholarwith the Signal and Image Processing Institute,University of Southern California, Los Angeles, anda Consultant for InterVideo Inc., Los Angeles. From2001 to 2003, he was a Postdoctoral Researcherwith the School of Electrical Engineering, SNU.

In August 2003, he joined the Department of Information Engineering, TheChinese University of Hong Kong as an Assistant Professor. His research topicsinclude video and 3-D graphics processing and multimedia communications.He has published more than 70 technical papers in international conferencesand journals.

C.-C. Jay Kuo (S’83–M’86–SM’92–F’99) receivedthe B.S. degree in electrical engineering from theNational Taiwan University, Taipei, in 1980 and theM.S. and Ph.D. degrees in electrical engineeringfrom the Massachusetts Institute of Technology,Cambridge, in 1985 and 1987, respectively.

He was Computational and Applied Mathe-matics (CAM) Research Assistant Professor inthe Department of Mathematics, University ofCalifornia, Los Angeles, from October 1987 toDecember 1988. Since January 1989, he has been

with the Department of Electrical Engineering-Systems and the Signal andImage Processing Institute, University of Southern California, Los Angeles,where he currently has a joint appointment as Professor of Electrical Engi-neering and Mathematics. He has guided about 60 students to their Ph.D.degrees and supervised 15 Postdoctoral Research Fellows. He is coauthorof seven books and more than 700 technical publications in internationalconferences and journals. His research interests are in the areas of digital signaland image processing, audio and video coding, multimedia communicationtechnologies and delivery protocols, and embedded system design.

Dr. Kuo is Editor-in-Chief for the Journal of Visual Communication andImage Representation, and Editor for the Journal of Information Science andEngineering and the EURASIP Journal of Applied Signal Processing. Heis also on the Editorial Board of the IEEE Signal Processing Magazine. Heserved as Associate Editorfor IEEE TRANSACTIONS ON IMAGE PROCESSING

in 1995–1998, IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO

TECHNOLOGY in 1995–1997 and IEEE TRANSACTIONS ON SPEECH AND AUDIO

PROCESSING in 2001–2003.


Sang-Uk Lee (S’75–M’80–SM’99) received the B.S.degree from Seoul National University, Seoul, Korea,in 1973, the M.S. degree from Iowa State University,Ames, in 1976, and the Ph.D. degree from the Uni-versity of Southern California, Los Angeles, in 1980,all in electrical engineering.

From 1980 to 1981, he was with General ElectricCompany, Lynchburg, VA, working on the develop-ment of digital mobile radio. From 1981 to 1983, hewas a Member of Technical Staff, M/A-COM Re-search Center, Rockville, MD. In 1983, he joined the

Department of Control and Instrumentation Engineering, Seoul National Uni-versity, as an Assistant Professor, where he is now a Professor of the School ofElectrical Engineering. He is also affiliated with the Automation and SystemsResearch Institute and the Institute of New Media and Communications, bothat Seoul National University. His current research interests are in the areas ofimage and video signal processing, digital communication, and computer vision.He served as an Editor-in-Chief for the Transactions of the Korean Institute ofCommunication Science from 1994 to 1996.

Dr. Lee is currently an Associate Editor for the IEEE TRANSACTIONS ON

CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY and is on the Editorial Boardof the Journal of Visual Communication and Image Representation and theEURASIP Journal of Applied Signal Processing. He is a Member of Phi KappaPhi.

Documents

854 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO ...mcl.usc.edu/wp-content/uploads/2014/01/200507-Rate-distortion... · 854 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO