

IEEE TRANSACTIONS ON BROADCASTING, VOL. 58, NO. 3, SEPTEMBER 2012 417

Joint Rate Allocation for Statistical Multiplexing in Video Broadcast Applications

Wei Yao, Student Member, IEEE, Lap-Pui Chau, Senior Member, IEEE, and Susanto Rahardja, Fellow, IEEE

Abstract—This paper proposes a new joint rate allocation scheme for statistical multiplexing of multiprogram video coding in broadcast systems. The scheme is based on a scalable video coding (SVC) platform that does not require computationally expensive re-encoding or transcoding to adjust the bit-rate of each video signal. A new complexity measure that incorporates the characteristics of the Human Visual System (HVS) is first introduced in the look-ahead approach. The bit-rate for base layer coding is thus more efficiently distributed among videos according to their relative complexities. A piecewise linear model is used to estimate the rate-distortion relationship in SVC enhancement layers. Based on the model, we develop a joint rate allocation scheme to dynamically allocate the available bandwidth by taking into consideration both inter-program fairness and intra-program smoothness constraints. Experiments have been carried out to compare the performance of existing methods with our proposed scheme. Results demonstrate that the proposed scheme achieves a fine balance in picture quality across all statistically multiplexed videos as well as within each video.

Index Terms—Complexity model, joint rate allocation, scalable video coding, statistical multiplexing.

I. INTRODUCTION

IN TYPICAL broadcast applications, such as digital TV broadcast and video surveillance, different video sequences are encoded in parallel and transmitted simultaneously over a bandwidth-limited channel. Coding each video sequence independently at an equal proportion of the channel bandwidth is a straightforward way to allocate channel bandwidth, but the quality may vary significantly over time within a particular video and between videos. Statistical multiplexing is a way to distribute the channel bandwidth dynamically over time, depending on the relative complexity of each video sequence. The goal is to maximize the overall video broadcast quality under bandwidth constraints while attaining comparable qualities on all received videos (i.e., inter-program fairness), and stable quality in each video (i.e., intra-program smoothness) [1], [13]–[17], [19].

An accurate estimation of each video’s complexity model is a prerequisite for efficient rate allocation. Algorithms used to derive video complexity can be broadly classified into feedback approaches and look-ahead approaches. In the feedback approaches [1]–[3], parameters obtained after the encoding of one or several frames are used to predict the complexity model of subsequent frames. These approaches assume that neighboring frames share similar characteristics, but they often suffer from performance degradation at scene changes. Look-ahead approaches [4], [5], [19] typically employ a pre-processing step to estimate video complexity before encoding. These approaches are more computationally intensive but have a wider choice of statistics and can thus obtain a more accurate estimation. Feedback and look-ahead approaches can also be jointly applied [6]–[8] to predict the bit allocation for each of the videos.

To achieve good statistical multiplexing performance, not only must the channel bandwidth be efficiently allocated between video programs, but the video encoder must also be accurately controlled. A number of statistical multiplexing algorithms [1]–[8] employ single layer encoders, such as the non-scalable mode of MPEG-2 [9] and H.264/AVC [10], to perform joint rate control. On these platforms, multiple videos have to be re-encoded or transcoded in order to fit into the channel. Other works [17]–[19] exploit the Scalable Video Coding (SVC) extension of H.264/AVC [11] to reduce computational complexity. As a trade-off, however, the rate-distortion performance of SVC is usually not as competitive as that of a non-scalable platform [29]. In general, videos encoded by SVC-based platforms consist of one base layer and several enhancement layers. The target bit-rate allocated to each encoded bitstream can be easily achieved through simple truncation instead of re-encoding. This makes the control of the video encoder easier and more accurate in SVC based platforms. Therefore, efficiently allocating the channel bit-rate according to some optimality criterion becomes more crucial in SVC based platforms. Jacobs et al. [18] proposed to distribute channel bandwidth proportionately among video sequences according to their relative complexities, but the bandwidth for the base layer of each video is allocated equally. Wang et al. pointed out in [19] that base layer qualities affect the overall performance and that the equal distribution of base layer bandwidth is not efficient. They also showed that the rate allocation scheme proposed in [18] results in a large quality difference between different videos. In [19], both the base layer bandwidth and the channel bandwidth are dynamically distributed among video sequences in order to minimize the quality differences across videos. Although inter-program fairness was addressed in [19], the performance of base layer rate allocation was not evaluated. More importantly, the intra-program smoothness criterion was not considered in their optimization problem, which may cause large quality fluctuations within each video.

Manuscript received November 10, 2011; revised March 13, 2012; accepted March 14, 2012. Date of publication May 04, 2012; date of current version August 17, 2012.
W. Yao and S. Rahardja are with the Department of Signal Processing, Institute for Infocomm Research, 138632 Singapore (e-mail: [email protected]; [email protected]).
L.-P. Chau is with the School of Electrical and Electronic Engineering, Nanyang Technological University, 639798 Singapore (e-mail: [email protected]).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TBC.2012.2191701

0018-9316/$31.00 © 2012 IEEE


Fig. 1. Structure of the proposed multiprogram video coding system.

This paper proposes a new statistical multiplexing scheme based on the platform in [19] by employing SVC as the video encoder. In our scheme, each video is separated into groups of pictures (GOPs) with fixed lengths and the rate allocation is updated at the GOP boundaries. Our proposed multiplexing system makes contributions in two modules: the look-ahead preprocessor and the joint rate allocator. In look-ahead preprocessing, we propose a new complexity measure to determine the base layer coding rate of each video program. Our complexity measure takes into account both the spatial complexity and the temporal complexity of a video sequence. A new similarity index (SMI), which is a variation of the Structural Similarity Index (SSIM) [21], is introduced to measure the temporal complexity. As inherited from SSIM, the proposed SMI incorporates Human Visual System (HVS) characteristics, but it is modified to match the rate-distortion relationship in video coding. After each video source is compressed into a base layer and several fine granularity scalability (FGS) layers, a joint bit allocator collects the coding statistics generated from the encoders and a piecewise linear model is used to estimate the rate-distortion relation in FGS layers. The bit allocation of each video is determined by solving an optimization problem that aims to achieve a balance between inter-program fairness and intra-program smoothness while keeping the total bit-rate of multiple programs conformant to the channel bandwidth. We implement the scheme to verify its performance, and it is shown that the proposed scheme obtains better intra-program smoothness than [19] while keeping a similar quality difference across all videos.

The rest of this paper is organized as follows. Section II gives an overview of the statistical multiplexing system for multiprogram video coding. Section III details the proposed complexity measure used in the look-ahead preprocessor. The proposed joint rate allocation scheme is described in Section IV. The experimental results are presented and analysed in Section V, and Section VI concludes the paper.

II. SYSTEM OVERVIEW

The basic structure of our statistical multiplexing system is shown in Fig. 1. The system consists of several complexity analysers, scalable video coding (SVC) encoders, bitstream extractors, a 2-D joint rate allocator and a multiplexer. Each complexity analyser is used as a preprocessor to analyse a video source on a GOP basis. It evaluates the similarities between neighboring pixels as well as similarities between adjacent frames to derive picture statistics that reflect the complexity of each video source. The 2-D joint rate allocator collects the picture statistics from every analyser and calculates the base layer bit-rate allocation for each encoder based on the relative complexities of the videos. After receiving the video and base layer bit-rate allocation, each SVC encoder encodes the video into a base layer and several FGS layers. The coding statistics are thus generated as a by-product of the encoding process. The joint rate allocator takes these coding statistics as input and calculates the optimal bit-rate allocation to different video programs in a GOP. The compressed bitstream is then fed into a bitstream extractor, which extracts a substream out of the input bitstream according to the target bit-rate given by the joint rate allocator. In the last module, the multiplexer, the extracted substreams of the different video sources are multiplexed and transmitted over a single channel.

III. SIMILARITY BASED COMPLEXITY ANALYSER

The similarity based complexity analyser shown in Fig. 1 is a preprocessor that evaluates the complexity of each video; the obtained complexity is then used to determine the base layer coding rate. Intuitively, large content variation between a frame and its reference frame, and large variation of pixel values within a frame, are the causes of high complexity in video. Thus our proposed scheme uses both temporal complexity and spatial complexity to characterize the video complexity.

A. Existing Structural Similarity Index

The temporal complexity measures the video complexity in the temporal direction, which is related to the similarity between adjacent frames. Zhou et al. proposed the Structural Similarity Index (SSIM) in [21] as a quality assessment method for measuring the similarity between two static images. It considers the characteristics of the Human Visual System (HVS) in an attempt to incorporate perceptual quality measures [21]. Essentially, SSIM measures the deviations in luminance, contrast and structure between the two images, and these deviations are represented as:

$$l(x,y)=\frac{2\mu_x\mu_y+C_1}{\mu_x^2+\mu_y^2+C_1},\qquad c(x,y)=\frac{2\sigma_x\sigma_y+C_2}{\sigma_x^2+\sigma_y^2+C_2},\qquad s(x,y)=\frac{\sigma_{xy}+C_2/2}{\sigma_x\sigma_y+C_2/2} \tag{1}$$

SSIM, as shown below, is the product of these three elements:

$$\mathrm{SSIM}(x,y)=\frac{(2\mu_x\mu_y+C_1)(2\sigma_{xy}+C_2)}{(\mu_x^2+\mu_y^2+C_1)(\sigma_x^2+\sigma_y^2+C_2)} \tag{2}$$

where $x$ and $y$ represent the two images under comparison; $\mu_x$, $\mu_y$ and $\sigma_x^2$, $\sigma_y^2$ are the means and variances of the two images, respectively; $\sigma_{xy}$ is the covariance of images $x$ and $y$. $C_1$ and $C_2$ are two constants that stabilize the division when the denominator is weak.

The application of SSIM in a video coding context is still under investigation [25], [26]. The conventional usage of SSIM as a quality assessment metric has been studied in past years for both images and videos, and it has been demonstrated that SSIM is an effective measurement of perceptual global degradations [22]–[24]. However, it has been shown in [27] that SSIM is very sensitive to contrast change and to small translation and rotation between two images. This feature is an advantage when SSIM is used as a quality metric in image processing. In video compression, however, coding a frame with slow motion or a small contrast change with respect to its reference frame does not consume many bits, thanks to efficient motion estimation/motion compensation and entropy coding. Therefore, SSIM cannot be directly used in our system as a temporal complexity measure. We need an improved similarity index that is insensitive to small movement and contrast change but still sensitive to fast motion.
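As a rough illustration of the SSIM expression in Equation (2), the following sketch computes a single-window (global) index over two whole frames and shows its sensitivity to a pure contrast change. It is only a sketch under stated assumptions: the function name `global_ssim`, the flat-list frame representation and the default constants (the common choice $(0.01\cdot 255)^2$ and $(0.03\cdot 255)^2$ for 8-bit images) are not taken from this paper, and practical SSIM implementations usually average the index over local windows.

```python
def global_ssim(x, y, c1=(0.01 * 255) ** 2, c2=(0.03 * 255) ** 2):
    """Single-window SSIM of two equally sized images given as flat lists.

    Assumption: one global window; c1 and c2 are the usual 8-bit defaults.
    """
    if len(x) != len(y) or not x:
        raise ValueError("images must be non-empty and of equal size")
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


frame = [10.0, 50.0, 90.0, 130.0, 170.0, 210.0]
dimmed = [0.5 * v for v in frame]   # same structure, reduced contrast
print(global_ssim(frame, frame))    # identical frames
print(global_ssim(frame, dimmed))   # noticeably lower despite identical structure
```

The second call illustrates why a plain SSIM score would overstate the coding cost of a frame that differs from its reference only by a contrast change.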

B. Proposed Complexity Measure

In view of the aforementioned desired features of a temporal complexity measure, we introduce a new similarity index to measure the similarity between two adjacent frames.

Consider the th and th frames in a GOP. Let and be the luminance values at pixel position in the two frames, respectively, where . and represent the frame height and width in pixels. The similarity between frame and is calculated as:

(3)

where and are the average values of pixels in frame and , and are the same as in Equation (2), and are the transformations from original frames and through

(4)

in Equation (4) is a switch function and it is defined as:

(5)

(6)

where is a small constant. is the cosine of the angle between the two lexicographic order vectors of frames and . The value of is in the range of [0,1], which serves as a preliminary detection of similarity between the two frames. When there is a large difference between the two frames, e.g. a scene change, the value of approaches 0, which makes approach 1. The SMI in Equation (3) then becomes almost the same as SSIM, which can effectively detect the large difference. When the two frames are similar, e.g. small movement, the value of would become 0. The true pixel values of the two frames are used to calculate the SMI in Equation (3), which makes the SMI insensitive to contrast change and small movement.

The value of SMI varies from 0 to 1 and a higher value means the two frames are more similar to each other, i.e. the complexity is low. Therefore, we transform SMI to temporal complexity (TCX) and get the overall TCX of video in GOP with a total of frames as follows:

(7)
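The cosine between the lexicographic-order vectors of two frames, used above as a preliminary similarity check, can be sketched as follows. This is a minimal sketch, not the paper's SMI: the helper `gop_temporal_complexity` is hypothetical and simply averages one minus a similarity score over adjacent frame pairs, using the plain cosine as a stand-in because Equations (3) to (7) are not reproduced here.

```python
import math


def frame_cosine(frame_a, frame_b):
    """Cosine of the angle between two frames flattened in raster order."""
    dot = sum(a * b for a, b in zip(frame_a, frame_b))
    norm_a = math.sqrt(sum(a * a for a in frame_a))
    norm_b = math.sqrt(sum(b * b for b in frame_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: an all-black frame is treated as dissimilar
    return dot / (norm_a * norm_b)


def gop_temporal_complexity(frames, similarity=frame_cosine):
    """Hypothetical aggregate: mean of (1 - similarity) over adjacent pairs."""
    pairs = list(zip(frames, frames[1:]))
    if not pairs:
        return 0.0
    return sum(1.0 - similarity(a, b) for a, b in pairs) / len(pairs)


static_gop = [[10.0, 20.0, 30.0]] * 4          # no change between frames
cut_gop = [[255.0, 0.0], [0.0, 255.0]]         # completely different content
print(gop_temporal_complexity(static_gop))     # close to 0
print(gop_temporal_complexity(cut_gop))        # 1.0 for orthogonal frames
```

For non-negative luminance values the cosine lies in [0, 1], which matches the range stated in the text.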

Spatial complexity measures the complexity within each frame. Kim et al. evaluated several methods for measuring the complexity of still images in [30], and found the gradient-based method to be more reliable. Thus, we sum up the spatial complexity (SCX) of each frame and define the spatial complexity of the GOP in video program as

(8)
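A gradient-based spatial complexity of the kind described above might be sketched as below. The exact gradient operator and normalization of Equation (8) and of [30] are not reproduced here, so the choice of mean absolute horizontal and vertical neighbor differences, and the names `frame_spatial_complexity` and `gop_spatial_complexity`, are assumptions made for illustration only.

```python
def frame_spatial_complexity(frame):
    """Mean absolute horizontal plus vertical neighbor difference of one frame.

    `frame` is a list of rows of luminance values (assumed representation).
    """
    height, width = len(frame), len(frame[0])
    total = 0.0
    for r in range(height):
        for c in range(width):
            if c + 1 < width:    # horizontal gradient
                total += abs(frame[r][c + 1] - frame[r][c])
            if r + 1 < height:   # vertical gradient
                total += abs(frame[r + 1][c] - frame[r][c])
    return total / (height * width)


def gop_spatial_complexity(frames):
    """Sum of per-frame spatial complexities over one GOP."""
    return sum(frame_spatial_complexity(f) for f in frames)


flat = [[128, 128], [128, 128]]     # no texture
checker = [[0, 255], [255, 0]]      # maximal local variation
print(frame_spatial_complexity(flat))
print(frame_spatial_complexity(checker))
print(gop_spatial_complexity([flat, checker]))
```

Summing over the frames of a GOP follows the text; any weighting between frames would be a further design choice.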

IV. JOINT RATE ALLOCATION SCHEME

A. Rate Allocation for Base Layer

In the SVC encoder, each video is encoded into a base layer and several FGS layers in each GOP. The base layer provides the minimum quality of the video, which is guaranteed to be transmitted when the bandwidth is low. Given the total transmission bit-rate for the base layers of all the videos, we need to distribute it among the different videos before encoding. Wang et al. demonstrated in [19] that an efficient allocation of base layer bit-rate is critical to achieving better overall performance of the whole system. They also implied in their experiment discussion that a more uniform base layer quality leads to better inter-program fairness, i.e. all received programs have similar video qualities.

Fig. 2. Illustration of the problem in attaining both inter-program fairness and intra-program smoothness for broadcasting applications.

Due to the variation of scene complexity among videos, the base layer bit-rate should be allocated to different videos in proportion to their complexity based on rate-distortion theory [20]. Both temporal and spatial complexity measures obtained in Section III are continuously sent to the joint rate allocator, which dynamically distributes the total base layer coding bit-rate. For the th GOP in the th video sequence, the allocated base layer bit-rate is thus

(9)

with being the fixed total base layer coding bit-rate and being the total number of video programs in the system.
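The proportional rule described above can be sketched as follows. How the temporal and spatial measures are combined into a single complexity value in Equation (9) is not reproduced here, so the sketch simply assumes one non-negative complexity score per video; the function name `allocate_base_layer` and the equal-split fallback are hypothetical.

```python
def allocate_base_layer(total_rate, complexities):
    """Split a fixed base layer budget in proportion to per-video complexity.

    Assumption: `complexities` holds one combined score per video for the
    current GOP; the combination of TCX and SCX is left to the caller.
    """
    total_complexity = sum(complexities)
    if total_complexity <= 0:
        # Degenerate case (assumption): fall back to an equal split.
        return [total_rate / len(complexities)] * len(complexities)
    return [total_rate * c / total_complexity for c in complexities]


rates = allocate_base_layer(1000.0, [1.0, 3.0, 2.0, 4.0])
print(rates)  # the most complex video receives the largest share
```

Because the shares are normalized by the total complexity, the allocated rates always add up to the fixed base layer budget.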

B. Rate Allocation for Video Adaptation

All the bitstreams are multiplexed into a single channel with a channel bandwidth . So the aggregate transmission bit-rates of all the videos should be less than or equal to the channel capacity. Meanwhile, the channel should be fully utilized. So we have the rate constraint in every GOP as

(10)

where is the transmission bit-rate allocated to the th GOP in the th video, and is a predefined small value used as a threshold to adjust channel utilization level. is set to in this paper. In order to accurately control the transmission bit-rate of every video bitstream, we make use of FGS in the system. It is possible to truncate an FGS bitstream at any byte position in the enhancement layers [28].

Given the total channel bandwidth, it is known that an equal distribution of the bandwidth results in a big quality difference across videos. Although Wang et al. [19] proposed a scheme to minimize the quality difference between videos in one GOP, the smoothness between adjacent GOPs is not considered in [19]. As illustrated in Fig. 2, the problem is a 2-D optimization problem where both inter-program fairness and intra-program smoothness should be considered. However, due to the non-stationary content of each video and the rate constraint in (10), at some time instants it may be difficult to simultaneously obtain a smooth quality transition between GOPs in each video and a near-constant quality across all videos in each GOP. Therefore, our scheme aims to dynamically allocate the bit-rate to each video such that a good balance between intra-program smoothness and inter-program fairness can be achieved. The problem can be formulated as

(11)

where is the smoothness factor used to adjust the relative importance of uniform quality among videos and smooth quality within one video. For the first GOP in the video, since no previous GOP is available, is set to 1 and only the inter-program fairness is considered. represents the total number of video bitstreams multiplexed in the channel. and denote the distortion of the reconstructed video in GOP and , respectively. The average mean square error (MSE) is used to measure the distortion. is computed as

(12)
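The trade-off controlled by the smoothness factor can be illustrated with a hypothetical cost of the following shape. The exact objective of Equation (11) is not reproduced here; the variance-style fairness term, the squared-difference smoothness term, and the name `allocation_cost` are assumptions chosen only to match the verbal description (a factor of 1 leaves inter-program fairness alone, as for the first GOP).

```python
def allocation_cost(current, previous, w):
    """Hypothetical cost balancing fairness and smoothness for one GOP.

    current  -- per-video distortion (e.g. average MSE) in this GOP
    previous -- per-video distortion in the previous GOP, or None
    w        -- smoothness factor in [0, 1]; w = 1 keeps fairness only
    """
    n = len(current)
    mean_d = sum(current) / n
    # Inter-program fairness: spread of distortion across videos.
    fairness = sum((d - mean_d) ** 2 for d in current) / n
    if previous is None:
        return fairness  # first GOP: no previous GOP to be smooth against
    # Intra-program smoothness: change of each video against its last GOP.
    smoothness = sum((d - p) ** 2 for d, p in zip(current, previous)) / n
    return w * fairness + (1.0 - w) * smoothness


print(allocation_cost([2.0, 6.0], None, 0.3))        # fairness term only
print(allocation_cost([2.0, 6.0], [2.0, 6.0], 0.0))  # perfectly smooth
print(allocation_cost([4.0, 4.0], [2.0, 6.0], 0.5))  # fair but not smooth
```

A candidate allocation that equalizes quality across videos may still score poorly if it makes each video jump relative to its previous GOP, which is the tension Fig. 2 illustrates.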

The rate-distortion (R-D) function of SVC FGS layers is investigated and suggested to be a piecewise-linear model in [31]. We have conducted experiments on many sequences to verify the piecewise-linear R-D relationship in the FGS layers. Fig. 3 illustrates the results we obtained from a random GOP with 16 frames of the sequence “Foreman”. The base layer of the sequence is coded at and the Hierarchical-B structure is used. It can be observed that the R-D curve of each FGS layer appears to be linear. Thus, knowing both end values of an FGS layer in an R-D curve, any in-between truncation point can be estimated using linear interpolation with the piecewise-linear model. In our system, the coding statistics, e.g. the bit-rate and distortion of the reconstructed video of the base layer and all the FGS layers, can be calculated as a by-product of the encoding process. Let and be the upper and lower bound bit-rates of an FGS layer, and let their corresponding video distortions be and . For any in-between truncation point, we have

(13)

where is the video distortion of the truncation point and represents the truncation bit-rate. By manipulating (13) algebraically, we can get

(14)


Fig. 3. R-D curve of different FGS layers. (a) 1st FGS Layer. (b) 2nd FGS Layer. (c) 3rd FGS Layer.

and

(15)
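The linear interpolation within one FGS layer, and its inverse, can be sketched as below. The argument names (`r_l`, `d_l` for the lower-rate end point of the layer and `r_u`, `d_u` for the upper-rate end point) are hypothetical, and the two functions are a generic two-point interpolation rather than a transcription of Equations (13) to (15).

```python
def distortion_at_rate(r, r_l, d_l, r_u, d_u):
    """Distortion at truncation rate r, interpolated between the two
    end points (r_l, d_l) and (r_u, d_u) of one FGS layer."""
    if r_u == r_l or not (r_l <= r <= r_u):
        raise ValueError("rate must lie inside a non-empty layer interval")
    return d_l + (r - r_l) * (d_u - d_l) / (r_u - r_l)


def rate_at_distortion(d, r_l, d_l, r_u, d_u):
    """Inverse mapping: truncation rate that yields target distortion d."""
    if d_u == d_l:
        raise ValueError("layer end points must have different distortions")
    return r_l + (d - d_l) * (r_u - r_l) / (d_u - d_l)


# Illustrative layer: distortion falls from 40 to 20 as the rate rises
# from 100 to 200 (made-up numbers, units arbitrary).
d = distortion_at_rate(150.0, 100.0, 40.0, 200.0, 20.0)
r = rate_at_distortion(d, 100.0, 40.0, 200.0, 20.0)
print(d, r)
```

Because the mapping is linear and strictly monotone inside a layer, rate and distortion determine each other uniquely, which is what allows the search to be carried out over distortion instead of rate.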

Based on the linear property of the bit-rate and distortion model, we can have a one-to-one mapping between the video bit-rate and the video distortion. Therefore, searching for the optimal rate allocation in (11) is equivalent to searching for the corresponding video distortion of each program.

Although a brute-force search of can be used to solve the problem, it is not practical because of the intensive computation. Hence, we use a simplified approach to find a suboptimal solution to this problem. The details of the method are introduced in the following.

Step 1) Collect coding statistics and initialization
As explained in Section II, the coding statistics such as bit-rate and the corresponding video distortion of every SVC layer can be obtained as a by-product of the encoding process. These statistics are sent to the Joint Rate Allocator as inputs of the bit allocation scheme. Let and denote the bit-rate and distortion of the reconstructed video after receiving th layer and all the lower layers of . indexes over the SVC base layer and FGS layers. First, an initial search range is defined. It is a superset that includes all the distortion values from every encoder in the system. The values of and are thus determined as

(16)

(17)

Step 2) Carry out golden-section search
An iterative golden-section search is carried out in this step. The main idea is to check the golden-section point , i.e. the distortion point positioned at from the search interval endpoint in every iteration. All videos use the current golden-section point in each iteration as the target distortion for bitstream truncation. If the sum of the corresponding truncation bit-rates does not meet the rate constraint in (10), the golden-section point replaces one of the endpoints to form a new search interval. In detail, if , the new search interval in the next iteration becomes ; if , the new search interval is . This process repeats until a converged point is obtained. The corresponding truncation bit-rates fulfil the constraint , and for video , since , the distortion variance across all the videos in GOP is minimized.

Step 3) Carry out refinement search
After the converged golden-section point is obtained, a refinement search is carried out in this step with a new search interval and a set of new search points. The number of search points is predefined. We first identify the video program whose distortion in the previous GOP has the biggest difference from , i.e.

(18)

The length of the search interval is set to twice the difference between and . If , the search range is with

(19)

(20)

If , the new search interval would be . The number of search points is set to 20 in this implementation and they are evenly distributed in the search interval. Every video takes a search point from the interval and all the combinations of search points are checked.


Fig. 4. Screenshot of CIF (352 × 288) sequences: “Foreman”, “Football”, “M&D,” and “Harbor.”

Fig. 5. Screenshot of HD (1280 × 720) sequences: “Crew”, “Bigship”, “Raven,” and “Shuttle_start.”

Step 4) Calculate the bit-rate allocation
Every search point represents the target distortion of the bitstream after truncation. We need to locate the target distortion in one of the FGS layers of each video to find the corresponding bit-rate as follows:
- If , .
- If , .
- If , the target distortion is located in the th FGS layer. The R-D curve of the th FGS layer is used to map the to the allocated bit-rate as in (14) and (15).
The sum of allocated bit-rates is computed and only those combinations that fulfil the rate constraint in (10) are recorded. The combination that minimizes the cost function in (11) is the final bit-rate allocation .
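Steps 1, 2 and 4 above can be sketched roughly as follows; the refinement search of Step 3 is omitted. Several details are assumptions rather than the paper's exact procedure: the golden-section point is placed at 0.618 of the interval length from the lower endpoint, the rate constraint is taken to be (1 - eps) * C <= total rate <= C, each video is described by a list of (rate, distortion) points with strictly decreasing distortion, and the names `rate_for_distortion` and `search_common_distortion` are hypothetical.

```python
GOLDEN = 0.618  # assumed position of the golden-section point


def rate_for_distortion(layers, target_d):
    """Map a target distortion to a truncation rate (cf. Step 4).

    layers -- [(rate, distortion), ...] for the base layer followed by each
              FGS layer end point; rate increasing, distortion decreasing.
    """
    if target_d >= layers[0][1]:
        return layers[0][0]    # target reachable with the base layer alone
    if target_d <= layers[-1][1]:
        return layers[-1][0]   # even all layers cannot do better
    for (r_lo, d_hi), (r_hi, d_lo) in zip(layers, layers[1:]):
        if d_lo <= target_d <= d_hi:
            # Linear interpolation inside the FGS layer that brackets target_d.
            return r_lo + (d_hi - target_d) * (r_hi - r_lo) / (d_hi - d_lo)
    raise ValueError("layers must have strictly decreasing distortion")


def search_common_distortion(videos, channel_rate, eps=0.01, max_iter=100):
    """Golden-section style search for one common target distortion (Step 2)."""
    lo = min(d for layers in videos for _, d in layers)   # Step 1: D_min
    hi = max(d for layers in videos for _, d in layers)   # Step 1: D_max
    d_g = hi
    for _ in range(max_iter):
        d_g = lo + GOLDEN * (hi - lo)
        total = sum(rate_for_distortion(v, d_g) for v in videos)
        if total > channel_rate:
            lo = d_g    # too many bits: accept more distortion
        elif total < (1.0 - eps) * channel_rate:
            hi = d_g    # channel under-used: aim for less distortion
        else:
            break       # rate constraint met
    return d_g, [rate_for_distortion(v, d_g) for v in videos]


videos = [
    [(100.0, 50.0), (200.0, 30.0), (300.0, 20.0)],   # made-up R-D points
    [(100.0, 80.0), (250.0, 40.0), (400.0, 25.0)],
]
d_common, rates = search_common_distortion(videos, 450.0)
print(d_common, rates, sum(rates))
```

Giving every video the same target distortion drives the inter-program variance toward zero; the refinement search then trades some of that fairness for smoothness with respect to the previous GOP.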

V. EXPERIMENTAL RESULTS

A. Experimental Setup

To evaluate the performance of our system, we carry out a number of experiments on the H.264/SVC reference software JSVM 9.8 [32]. Two sets of video sequences with different characteristics are used as video sources in the tests. All the sequences are international standard test sequences. Set 1 contains four video sequences at CIF (352 × 288 pixels) resolution. The sequences are “Foreman”, “Football”, “Mother & Daughter (M&D)”, and “Harbor”, and they are coded using the Hierarchical-B structure with a GOP size of 16 and a frame rate of 30 frames/s. Set 2 consists of four HD (1280 × 720 pixels) video sequences, which are “Crew”, “Bigship”, “Raven”, and “Shuttle_Start”, with a frame rate of 60 frames/s. They are coded with the IPP...P structure and the GOP size is set to 30. A screenshot of each sequence is shown in Figs. 4 and 5. We also characterize each sequence in terms of the amount of motion, camera motion, texture and whether there is a scene change. All the features are listed in Table I.

Section V-B evaluates the performance of base layer bit allocation, which demonstrates the accuracy of the proposed complexity measure. A complexity measure with better accuracy leads to more uniform base layer qualities. Section V-C investigates the performance of the proposed joint rate allocation scheme in terms of inter-program fairness and intra-program smoothness. Therefore, in all the experiments, the concern is not the PSNR in a particular GOP of a single video. Instead, we look at whether all the videos can achieve comparable qualities in each GOP and whether a stable quality can be achieved in each video. The performance comparisons are carried out between the proposed scheme and the scheme in [19], which has been shown to be more efficient than many existing methods, such as the equal bit-rate allocation scheme and the scheme in [18].

TABLE I
SUMMARY OF DIFFERENT CHARACTERISTICS ON TEST SEQUENCES

B. Experimental Results on Base Layer Bit Allocation

Base layer is the minimal guaranteed bandwidth of the channel. It is set to 1 MB/s for Set 1 CIF sequences and 3 MB/s for Set 2 HD sequences in the experiment. The total base layer bandwidth is distributed to different videos for their base layer coding according to the relative video complexity.

This section evaluates the performance of base layer bit allocation by comparing the quality variance among videos in each GOP at their base layers. Variance is a statistical measure of how far a set of numbers is spread out. Thus a smaller variance represents more uniform base layer qualities. The quality variance for GOP $k$ is calculated as

$$\sigma_k^2 = \frac{1}{N}\sum_{i=1}^{N}\left(\mathrm{PSNR}_{i,k} - \overline{\mathrm{PSNR}}_k\right)^2 \tag{21}$$

where $N$ is the total number of videos and $\overline{\mathrm{PSNR}}_k$ is the averaged PSNR of the $N$ videos in GOP $k$. For each GOP, the PSNR of each video sequence at the base layer and the variance of PSNRs are displayed in Table II for CIF sequences, and Table III for HD sequences. The proposed scheme is compared with the scheme in [19], which measures the video complexity by combining I-frame complexity and P-frame motion activities. It can be seen that the variance of video qualities at the base layer is reduced in every GOP using the proposed method. This is because, compared with [19], our proposed method allocates less bandwidth to the low complexity videos, such as “Foreman” and “M&D” at CIF resolution, and “Shuttle_Start” at HD resolution. This leaves more base layer bandwidth, which we then allocate to the high complexity videos. Consequently we attain a more even base layer quality across all videos in each GOP. If the channel bit-rate drops and only the minimum base layer bandwidth is available in some extreme situations, the quality differences among base layers directly indicate the inter-program fairness, and thus a better inter-program fairness can be achieved by our proposed method. On the other hand, since the rate is allocated according to relative complexity, the results also demonstrate that the proposed complexity measure is more accurate.

TABLE II
AVERAGE PSNR & VARIANCE OF PSNR FOR BASE LAYERS OF DIFFERENT SEQUENCES AT CIF RESOLUTION

TABLE III
AVERAGE PSNR & VARIANCE OF PSNR FOR BASE LAYERS OF DIFFERENT SEQUENCES AT HD RESOLUTION
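The per-GOP quality variance used in this comparison can be computed as in the following sketch. It assumes the population form of the variance (division by the number of videos); the function name `gop_quality_variance` and the PSNR values are hypothetical and are not taken from Tables II and III.

```python
from typing import Sequence


def gop_quality_variance(psnr_db: Sequence[float]) -> float:
    """Population variance of the base-layer PSNRs of all videos in one GOP."""
    n = len(psnr_db)
    mean = sum(psnr_db) / n
    return sum((p - mean) ** 2 for p in psnr_db) / n


# Hypothetical PSNRs (dB) of four videos in one GOP under two allocations.
uneven = [38.0, 32.0, 36.0, 34.0]
even = [35.5, 34.5, 35.0, 35.0]
print(gop_quality_variance(uneven))  # 5.0
print(gop_quality_variance(even))    # 0.125
```

Both made-up allocations have the same mean PSNR of 35 dB, so the variance alone separates them: the lower value corresponds to the more uniform base layer quality.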

C. Experimental Results on System Bit-Rate Adaptation

In this section, we simulate the proposed algorithm with CIF sequences under a channel bandwidth of 2 MB/s and with HD sequences under a channel bandwidth of 10 MB/s. The bit-rate for the base layers is set as in the previous experiment. Once the base layer coding bit-rate for each video program is determined, each encoder compresses the video into a base layer and three FGS layers. The joint rate allocator dynamically allocates the truncation bit-rate to each video and the bitstream extractor truncates accordingly. The proposed bit-rate allocation scheme aims to minimize the cost function in (11) and to balance inter-program fairness and intra-program smoothness. A parameter of the cost function adjusts the relative importance of the two; a higher value of this parameter makes the algorithm favor uniform quality among videos. Tables IV and V list the average PSNR of each video and the variance of all the PSNRs in each GOP. The summation of quality fluctuations at GOP boundaries for each video is also shown in the tables for the two methods. The sum of fluctuation is calculated as

(22)

where is the total number of GOPs tested in the experiments.

The variance of PSNRs in each GOP reflects the performance of inter-program fairness. A smaller variance means a more uniform quality level and thus better inter-program fairness. The sum of fluctuation shows the quality variations in each video. A smaller fluctuation represents better intra-program smoothness. The scheme in [19] only aims to minimize the quality variance of all the videos in each GOP, but it does not put any control on the quality variation in a particular video. Therefore, it can be seen that the proposed scheme achieves smaller quality fluctuation in every video sequence, whereas the variance of PSNRs in every GOP is still comparable with that of the scheme in [19]. For many GOPs of the CIF sequences, as shown in Table IV, the proposed scheme even attains a smaller variance than [19]. This is because the scheme in [19] inefficiently encodes the base layers of the low complexity videos “Foreman” and “M&D” at higher bit rates than necessary due to inappropriate base layer bit allocation. Although more channel bandwidth is then allocated to the higher complexity videos, a reconstruction from their truncated bitstreams (comprising the base layer and partial enhancement layers) still results in substantially lower qualities than that of reconstructing low complexity videos whose bitstreams contain only a base layer. It again verifies that a better base layer bit allocation with a more accurate complexity measure is beneficial for obtaining a better overall performance. In addition to the comparison on the sum of fluctuations, we also plot the detailed quality fluctuation at each GOP boundary for all the videos in Figs. 6 and 7 to evaluate the intra-program smoothness. It can be seen that the proposed scheme generally gets smaller quality fluctuation at GOP boundaries, and a bigger improvement can be observed in the low complexity videos.

TABLE IV
AVERAGE PSNR, VARIANCE OF PSNR IN EACH GOP OF DIFFERENT PROGRAMS, AND SUM OF PSNR FLUCTUATIONS IN EACH PROGRAM (CIF RESOLUTION, , , , HIERARCHICAL-B STRUCTURE)

TABLE V
AVERAGE PSNR, VARIANCE OF PSNR IN EACH GOP OF DIFFERENT SEQUENCES, AND SUM OF PSNR FLUCTUATIONS IN EACH SEQUENCE (HD RESOLUTION, , , , IPP...P STRUCTURE)

Fig. 6. Quality fluctuation at GOP boundaries in each video sequence at CIF resolution with the . (a) Foreman. (b) Football. (c) M&D. (d) Harbor.

Fig. 7. Quality fluctuation at GOP boundaries in each video sequence at HD resolution with the . (a) Crew. (b) Bigship. (c) Raven. (d) Shuttle_Start.
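A sum-of-fluctuation metric of the kind used for intra-program smoothness can be sketched as follows. This is only an illustration under a stated assumption: the fluctuation at a GOP boundary is taken to be the absolute PSNR difference between two consecutive GOPs of the same video, which may differ from the exact definition in (22). The function name `sum_of_fluctuation` and the PSNR traces are hypothetical.

```python
from typing import Sequence


def sum_of_fluctuation(psnr_per_gop: Sequence[float]) -> float:
    """Sum of absolute PSNR changes across consecutive GOPs of one video.

    Assumption: fluctuation at a GOP boundary = |PSNR of GOP k - PSNR of GOP k-1|.
    """
    return sum(abs(cur - prev) for prev, cur in zip(psnr_per_gop, psnr_per_gop[1:]))


# Hypothetical per-GOP PSNR traces (dB) of one video under two schemes.
smooth = [35.0, 35.5, 35.0, 35.5]
jumpy = [33.0, 37.0, 34.0, 36.0]
print(sum_of_fluctuation(smooth))  # 1.5
print(sum_of_fluctuation(jumpy))   # 9.0
```

Under this assumed definition, a trace with a single GOP has no boundary and yields zero, and a smaller total indicates a smoother program.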

VI. CONCLUSION

A joint rate allocation scheme for multiprogram video coding is proposed in this paper. The scheme is developed on a platform where the scalable extension of H.264/AVC is applied to compress the video programs in order to avoid complicated re-encoding and transcoding. A new complexity measure that incorporates the characteristics of the Human Visual System is introduced for base layer bit-rate allocation. Benefiting from a better base layer bit allocation, the system achieves a more uniform quality level for all multiplexed videos when the channel bit-rate is low. The new 2-D channel bit-rate allocation proposed in the paper is designed to ensure a comparable quality level for all videos while minimizing the quality variations within each video. It also provides a way to adjust the balance between smoothness and fairness performance. The overall performance of the proposed system has been evaluated via simulations and compared with existing methods. Experimental results show that the proposed system obtains a notable decrease in the intra-program quality fluctuation while still maintaining comparably good inter-program fairness. In video broadcasting applications, the proposed scheme helps provide a high degree of viewing comfort by keeping quality fluctuations to a minimum for clients, whether they switch from one TV program to another or stay on the same TV program.


TABLE OF ACRONYMS

APPENDIX A

See table at the top of this page.

REFERENCES

[1] L. Böröczky, A. Y. Ngai, and E. F. Westermann, “Statistical multiplexing using MPEG-2 video encoders,” Int. Bus. Machines J. Res. Develop., vol. 43, no. 4, pp. 511–520, Jul. 1999.
[2] L. Wang and A. Vincent, “Joint rate control for multiprogram video coding,” IEEE Trans. Consum. Electron., vol. 42, no. 3, pp. 300–305, Aug. 1996.
[3] J. Yang, X. Fang, and H. Xiong, “A joint rate control scheme for H.264 encoding of multiple video sequences,” IEEE Trans. Consum. Electron., vol. 51, no. 2, pp. 617–623, May 2005.
[4] M. Perkins and D. Arnstein, “Statistical multiplexing of multiple MPEG-2 video programs in a single channel,” Soc. Motion Picture Television Engineers J., vol. 104, no. 9, pp. 596–599, 1995.
[5] Z. He and D. O. Wu, “Linear rate control and optimum statistical multiplexing for H.264 video broadcast,” IEEE Trans. Multimedia, vol. 10, no. 7, pp. 1237–1249, Nov. 2008.
[6] L. Böröczky, A. Y. Ngai, and E. F. Westermann, “Joint rate control with look-ahead for multiprogram video coding,” IEEE Trans. Circuits Syst. Video Technol., vol. 10, no. 7, pp. 1159–1163, Oct. 2000.
[7] E. N. Linzer and A. Wells, “Statistical multiplexed video encoding using preencoding a priori statistics and a priori and a posteriori statistics,” U.S. Patent 6 094 457, Jul. 2000.
[8] Z. G. Li, C. Zhu, F. Pan, G. N. Feng, X. K. Yang, S. Wu, and N. Ling, “A novel joint rate control scheme for the coding of multiple real time video programs,” in Proc. Distributed Computing Syst. Workshops, Nov. 2002, pp. 241–245.
[9] “Information technology-Generic coding of moving pictures and associated audio information—Part 2: Video,” ITU-T Recommendation H.262 and ISO/IEC 13818-2 (MPEG-2 Video), Nov. 1994.
[10] “Information technology-Coding of audio-visual objects—Part 10: Advanced video coding,” ITU-T Recommendation H.264 and ISO/IEC 14496-10 (MPEG-4 AVC), Version 1: May 2003, Version 2: Jan. 2004, Version 3: Sep. 2004, Version 4: Jul. 2005.
[11] Information Technology-Coding of Audio-Visual Objects—Part 10: Advanced Video Coding; Amendment 3 Scalable Video Coding, ISO/IEC 14496-10:2005/AMD3.
[12] T. Paridaens, D. De Schrijver, W. De Neve, and R. Van de Walle, “XML-driven bit-rate adaptation of SVC bit-streams,” in Proc. Int. Workshop Image Anal. Multimedia Interactive Services, 2007, pp. 49–52.
[13] S. K. Srinivasan, J. Vahabzadeh-Hagh, and M. Reisslein, “The effects of priority levels and buffering on the statistical multiplexing of single-layer H.264/AVC and SVC encoded video streams,” IEEE Trans. Broadcast., vol. 56, no. 3, pp. 281–287, Sep. 2010.
[14] C. Heyaime-Duvergé and V. K. Prabhu, “Statistical multiplexing of upstream transmissions in DOCSIS cable networks,” IEEE Trans. Broadcast., vol. 56, no. 3, pp. 296–310, Sep. 2010.
[15] H. Lahdili, H. Najaf-Zadeh, and L. Thibault, “Statistical multiplexing for digital audio broadcasting applications,” IEEE Trans. Broadcast., vol. 56, no. 1, pp. 19–27, Mar. 2010.
[16] G. Van der Auwera and M. Reisslein, “Implications of smoothing on statistical multiplexing of H.264/AVC and SVC video streams,” IEEE Trans. Broadcast., vol. 55, no. 3, pp. 541–558, Sep. 2009.
[17] X. Yang and N. Ling, “Statistical multiplexing based on MPEG-4 fine granularity scalability coding,” J. VLSI Signal Process. Syst., vol. 42, no. 1, pp. 69–77, 2006.
[18] M. Jacobs, S. Tondeur, T. Paridaens, J. Barbarien, R. Van de Walle, and P. Schelkens, “Statistical multiplexing using SVC,” in Proc. IEEE Int. Symp. Broadband Multimedia Syst. Broadcast., 2008, pp. 1–6.
[19] Y. Wang, L.-P. Chau, and K.-H. Yap, “Joint rate allocation for multiprogram video coding using FGS,” IEEE Trans. Circuits Syst. Video Technol., vol. 20, no. 6, pp. 829–837, Jun. 2010.
[20] T. Berger, Rate Distortion Theory. Englewood Cliffs, NJ: Prentice-Hall, 1971.
[21] W. Zhou, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: from error measurement to structural similarity,” IEEE Trans. Image Process., vol. 13, no. 4, pp. 600–613, Apr. 2004.
[22] S. Chikkerur, V. Sundaram, M. Reisslein, and L. J. Karam, “Objective video quality assessment methods: A classification, review, and performance comparison,” IEEE Trans. Broadcast., vol. 57, no. 2, pp. 165–182, Jun. 2011.
[23] S. S. Hemami and A. R. Reibman, “No-reference image and video quality estimation: Applications and human-motivated design,” Signal Processing: Image Communication, vol. 25, no. 7, pp. 469–481, Aug. 2010.
[24] K. Seshadrinathan, T. Soundararajan, A. C. Bovik, and L. K. Cormack, “Study of subjective and objective quality assessment of video,” IEEE Trans. Image Process., vol. 19, no. 6, pp. 1427–1441, Jun. 2010.
[25] A. C. Brooks, X. Zhao, and T. N. Pappas, “Structural similarity quality metrics in a coding context: Exploring the space of realistic distortions,” IEEE Trans. Image Process., vol. 17, no. 8, pp. 1261–1273, Aug. 2008.
[26] Y.-H. Huang, T.-S. Ou, P.-Y. Su, and H. H. Chen, “Perceptual rate-distortion optimization using structural similarity index as quality metric,” IEEE Trans. Circuits Syst. Video Technol., vol. 20, no. 11, pp. 1614–1624, Nov. 2010.
[27] Z. G. Li, W. Yao, S. L. Xie, and S. Rahardja, “Robust image similarity indices via intensity mapping functions and image partition,” presented at the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, Dec. 2010.
[28] W. Li, “Overview of fine granularity scalability in MPEG-4 video standard,” IEEE Trans. Circuits Syst. Video Technol., vol. 11, no. 3, pp. 301–317, Mar. 2001.
[29] H. Schwarz, D. Marpe, and T. Wiegand, “Overview of the scalable video coding: Extension of the H.264/AVC standard,” IEEE Trans. Circuits Syst. Video Technol., vol. 17, no. 9, pp. 1103–1120, Sep. 2007.
[30] W. J. Kim, J. W. Yi, and S. D. Kim, “A bit allocation method based on picture activity for still image coding,” IEEE Trans. Image Process., vol. 8, no. 7, pp. 974–977, Jul. 1999.
[31] J. Sun, W. Gao, D. Zhao, and W. Li, “On rate-distortion modeling and extraction of H.264/SVC fine-granular scalable video,” IEEE Trans. Circuits Syst. Video Technol., vol. 19, no. 3, pp. 323–336, Mar. 2009.
[32] JSVM 9.8 Software, ISO/IEC JTC 1/SC 29/WG 11 W8752, MPEG Committee, 2007.

Wei Yao (S’07) received the B.Eng. degree in electrical and electronic engineering from Nanyang Technological University (NTU), Singapore, in 2004. She is currently pursuing the Ph.D. degree in electronic engineering at NTU.

She joined the Institute for Infocomm Research, Agency for Science, Technology, and Research, Singapore, as a Research Engineer in 2004. Her research interests include digital video compression and transmission, and High Dynamic Range (HDR) image compression and processing.

Lap-Pui Chau (SM’03) received the B.Eng. degree with first class honors in Electronic Engineering from Oxford Brookes University, England, and the Ph.D. degree in Electronic Engineering from Hong Kong Polytechnic University, Hong Kong, in 1992 and 1997, respectively.

In June 1996, he joined Tritech Microelectronics as a Senior Engineer. In March 1997, he joined the Centre for Signal Processing, a national research center at Nanyang Technological University, as a Research Fellow. He subsequently joined the School of Electrical & Electronic Engineering, Nanyang Technological University, as an Assistant Professor, and he is currently an Associate Professor. His research interests include video communications, fast signal processing algorithms and multimedia compression, and VLSI for signal processing. He has been involved in the organizing committees of international conferences including the IEEE International Conference on Image Processing (ICIP 2010, ICIP 2004) and the IEEE International Conference on Multimedia & Expo (ICME 2010). He is a Technical Program Co-Chair for the 2010 International Symposium on Intelligent Signal Processing and Communications Systems (ISPACS 2010). He has also regularly served as a track chair on the technical program committees of many international conferences.

He is a senior member of the IEEE, the Chair of the Technical Committee on Circuits & Systems for Communications (TC-CASC), and a member of the Technical Committee on Multimedia Systems and Applications (TC-MSA) and the Technical Committee on Visual Signal Processing and Communications (TC-VSPC) of the IEEE Circuits and Systems Society. He was the chairman of the IEEE Singapore Circuits and Systems Chapter from 2009 to 2010. He served as a member of the Singapore Digital Television Technical Committee from 1998 to 1999. He served as an associate editor for IEEE TRANS. MULTIMEDIA, and is currently serving as an associate editor for IEEE TRANS. CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, IEEE TRANSACTIONS ON BROADCASTING, IEEE SIGNAL PROCESSING LETTERS, and IEEE TECHNOLOGY NEWS. He is also an IEEE Distinguished Lecturer for 2009–2013, and a steering committee member of IEEE TRANSACTIONS FOR MOBILE COMPUTING from 2011 to 2012.

Susanto Rahardja (M’97–SM’03–F’11) received the B.Eng. degree from the National University of Singapore in 1991, and the M.Eng. and Ph.D. degrees from the Nanyang Technological University in 1993 and 1997, respectively, all in electrical engineering.

He is currently the Deputy Executive Director (Research) and Head of the Signal Processing Department at the Institute for Infocomm Research in the Agency for Science, Technology and Research, Singapore. In his capacity as the Deputy Executive Director, he currently leads 10 research departments with more than 500 research scientists and engineers. He was involved in multimedia standardization activities in which he contributed technologies for scalable to lossless audio compression and lossless only coding, which were adopted and published as normative international standards in ISO/IEC 14496-3:2005/Amd.3:2006 and ISO/IEC 14496-3:2005/Amd.2:2006, respectively. He has published more than 250 internationally refereed journal and conference papers in the area of multimedia signal processing and digital communications.

From 2007 to 2011, Dr. Rahardja served as an Associate Editor for the IEEE TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING and the IEEE TRANSACTIONS ON MULTIMEDIA. He is currently serving as an Editorial Board Member of the Elsevier Journal of Visual Communication and Image Representation and APSIPA Transactions on Signal and Information Processing. He has received several awards, including the IEE Hartree Premium Award in 2002, the Tan Kah Kee Young Inventors’ Open Category Gold award in 2003, the National Technology Award in 2007, and the A*STAR Most Inspiring Mentor Award in 2008. He is currently the President of the SIGGRAPH Singapore Chapter (SSC) and the Southeast Asia Graphics (SEAGRAPH) society, a member of the Board of Governors of the Asia-Pacific Signal and Information Processing Association (APSIPA), a Member of the Management Board of the Interactive Digital Media Institute at the National University of Singapore, and a Council member of the National IT Standards Committee in Singapore. He was the Chair of the SPIE Optics East Symposium for 2006 and 2007, Government and Industry Advisor for ACM SIGGRAPH ASIA 2008, General Chair for ACM SIGGRAPH VRCAI 2008, and General Chair of the APSIPA 2nd Summit and Conference 2010, and he was appointed the General Chair of ACM SIGGRAPH ASIA 2012. He holds an adjunct appointment of Full Professor at the National University of Singapore and is a Fellow of the Institute of Electrical and Electronics Engineers (IEEE).