
Efficient Bit Allocation for Dependent Video Coding

Yegnaswamy Sermadevi and Sheila S. Hemami

School of Electrical and Computer Engineering

Cornell University, Ithaca, NY 14853

Email: {swamy,hemami}@ece.cornell.edu

Abstract

A steepest descent based bit allocation method with polynomial iteration com-

plexity for minimizing the sum of frame distortions under a total bit rate constraint

is presented for a predictive video coder. Previous algorithms that solve this problem

have been based on Lagrangian Relaxation followed by unconstrained optimization.

Some of these methods have complexity exponential in the prediction depth in the

worst case. Moreover, the convergence properties of the faster methods are difficult to

analyze. A steepest descent method is utilized here for its comparatively low compu-

tational complexity, excellent performance and ease of analysis. Sufficient conditions

for global optimality are presented without assuming independent encoding of frames.

Results for dependent bit allocation through the assignment of quantizer step-sizes to

frames in MPEG-2 encoded sequences suggest that these optimality conditions are to

a large extent satisfied in practice. A PSNR improvement of up to 1.5 dB is obtained over the standard TM5 rate control algorithm for MPEG-2.

1 Introduction

Optimizing the allocation of bits according to the complexity of frames in a video sequence

is crucial in order to guarantee competitive rate-distortion performance. Bit allocation for a

predictively encoded set of frames is especially difficult since the rate-distortion curves for

predicted frames depend on the reference frame allocation. The Lagrangian Optimization

technique [1] that is highly efficient in the independent case (i.e. without predictive coding)

becomes significantly more cumbersome when applied to the dependent coding framework. In this work, an alternate method based on a steepest descent algorithm is proposed to solve

the dependent coding problem, and sufficient conditions for the optimality of the algorithm

are investigated.

The paper is organized as follows. Section 2 provides some background for the depen-

dent coding problem. Section 3 introduces a steepest descent algorithm and investigates

conditions under which the algorithm is optimal. Results are presented in section 4. Sec-

tion 5 concludes the paper.



2 Background

An efficient algorithm, based on Lagrangian relaxation, for bit allocation among a set of

quantizers was presented in [1]. Assuming that signal blocks are quantized independently,

the total distortion is minimized subject to a bit rate constraint via the optimal choice of

quantization parameters from an admissible set. Denoting the quantization parameter corresponding to the kth quantizer by Qk and the associated distortion and rate by Dk(Qk) and

Rk(Qk), this problem is formulated as

$$\bar{Q}^* = (Q_1^*, \ldots, Q_n^*) = \arg\min_{(Q_1,\ldots,Q_n)} \sum_{k=1}^{n} D_k(Q_k) \qquad (1)$$

subject to

$$\sum_{k=1}^{n} R_k(Q_k) \le R_{\mathrm{total}} \qquad (2)$$

where Rtotal is a constraint on the total number of bits to be used.

Lagrangian optimization relaxes the constrained problem to an easier unconstrained version by incorporating the constraint function into the objective by using a “Lagrange” multiplier λ ≥ 0.

$$\bar{Q}^* = \arg\min_{(Q_1,\ldots,Q_n)} \left[ \sum_{k=1}^{n} D_k(Q_k) + \lambda \sum_{k=1}^{n} R_k(Q_k) \right] \qquad (3)$$

If an optimal solution to the unconstrained problem for a particular λ ≥ 0 has rate equal

to Rtotal, it is also optimal for the original constrained problem ([1]). The unconstrained

problem (3) can be solved efficiently by separately minimizing each component in the sum.

$$Q_k^* = \arg\min_{Q_k} \left[ D_k(Q_k) + \lambda R_k(Q_k) \right] \qquad (4)$$

A solution to the constrained problem is found by searching for the λ that gives the largest

rate value less than or equal to Rtotal.
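The independent-case procedure in (3) and (4) can be sketched as follows. This is a minimal illustration, not the implementation of [1]: the operating-point tables, the function names `minimize_lagrangian`, `total` and `allocate`, and the use of bisection to search for λ are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

# Hypothetical data layout: tables[k][i] = (distortion, rate) of quantizer k
# when its i-th admissible quantization parameter is used.
Points = Sequence[Tuple[float, float]]


def minimize_lagrangian(tables: Sequence[Points], lam: float) -> List[int]:
    """Solve (4) separately per quantizer: minimize D_k + lam * R_k."""
    return [
        min(range(len(pts)), key=lambda i: pts[i][0] + lam * pts[i][1])
        for pts in tables
    ]


def total(tables: Sequence[Points], choice: Sequence[int]) -> Tuple[float, float]:
    """Total (distortion, rate) of a choice of one point per quantizer."""
    d = sum(tables[k][i][0] for k, i in enumerate(choice))
    r = sum(tables[k][i][1] for k, i in enumerate(choice))
    return d, r


def allocate(tables: Sequence[Points], r_total: float, iters: int = 60) -> List[int]:
    """Bisection on lam for the largest total rate not exceeding r_total."""
    lo, hi = 0.0, 1.0
    # Grow hi until the constraint holds; total rate does not grow with lam.
    while total(tables, minimize_lagrangian(tables, hi))[1] > r_total:
        hi *= 2.0
        if hi > 1e12:
            raise ValueError("rate target is below the minimum achievable rate")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if total(tables, minimize_lagrangian(tables, mid))[1] > r_total:
            lo = mid   # too many bits: penalize rate more strongly
        else:
            hi = mid   # feasible: try a smaller multiplier
    return minimize_lagrangian(tables, hi)
```

Because each term of the sum is minimized on its own, one evaluation costs only the sum of the table sizes. This separability is what is lost in the dependent case.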

Sweeping λ from 0 to ∞ generates the entire lower convex hull of achievable distortion-rate pairs. This set of solutions can be efficiently generated by the generalized BFOS [2, 3]

algorithm by formulating the problem in terms of tree-pruning.

The Lagrangian optimization technique was applied to the dependent coding framework

in [4]. In the dependent coding formulation the distortion and rate functions for the kth

quantizer Dk(·) and Rk(·) depend on the quantization parameter choices for prior signal

blocks (frames in the context of video coding).

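$$\bar{Q}^* = \arg\min_{(Q_1,\ldots,Q_n)} D(\bar{Q}), \qquad D(\bar{Q}) = \sum_{k=1}^{n} D_k(Q_1, \ldots, Q_k) \qquad (5)$$

subject to

$$R(\bar{Q}) = \sum_{k=1}^{n} R_k(Q_1, \ldots, Q_k) \le R_{\mathrm{total}} \qquad (6)$$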

It is no longer easy to solve (5)-(6) using the form (4) due to the dependencies in the distortion and rate functions. Therefore, in the worst case, a full search over all possible choices



for the quantization parameters is necessary to solve the problem optimally. This has com-

plexity exponential in the prediction depth. This complexity is significantly reduced in [4]

by pruning the search tree based on the observation that distortion-rate functions for pre-

dicted frames are usually monotonic in the quantization step-size applied to the reference

frame. However, it is not easy in general to bound the complexity of the pruned tree search.

An important fact highlighted in [4] is that unlike the independent case, the data gathering phase in predictive coding is computationally highly expensive. Therefore, it is important to be able to sample as few points as possible while trying to compute the optimal

solution. A model-based method for dependent coding using reduced sampling is presented

in [5]. A Lagrange parameter λ ≥ 0 is introduced in order to relax buffer constraints, and

a gradient descent method is used to find a locally optimal solution to the resulting un-

constrained problem. A recursive algorithm similar to the coordinate-wise steepest descent

algorithm presented here is used in [6] for the selection of JPEG quantization matrices. It

is observed that if the problem can be expressed in the form (1), the algorithm is optimal.

The algorithms in [6] and in this work are closely related to the BFOS algorithm.

3 Algorithm

To derive some intuition about the dependent coding problem, consider the distortion-rate

surface in Figure 1 which plots the mean squared error (MSE) as a function of the individual

frame rates for an I (intra coded) and a P (predictively coded) frame. The set of solutions

of least distortion for different values of the total bit rate essentially form a “path” on this

surface. The basic motivation behind the steepest descent algorithm is to track the solutions

on this path through the use of a local search. Sampling is automatically reduced by this

procedure if a small enough search radius can be used.

To further motivate the algorithm and the conditions under which it is optimal, the

following definitions are made.

Definition 1 Let D(Q̄) = D(Q1, Q2, . . . , Qn) denote the distortion between the original

video sequence and the sequence with the ith frame quantized using quantization parameter

Qi. Similarly, let R(Q̄) = R(Q1, Q2, . . . , Qn) denote the rate function.

D(Q̄) and R(Q̄) are assumed to be strictly increasing and decreasing respectively in

each component variable. The Qi are assumed to take values in the finite set {1, . . . , Mi}.

Definition 2 Let ej denote the n-vector with a 1 in the jth position and 0 elsewhere. The slope in the jth direction at Q̄ is defined as

$$s_j(\bar{Q}) = -\frac{D(\bar{Q} - e_j) - D(\bar{Q})}{R(\bar{Q} - e_j) - R(\bar{Q})}.$$
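As a purely illustrative calculation (the numbers are hypothetical and not taken from the experiments in this paper), suppose D(Q̄) = 120 and R(Q̄) = 400 kbits, and that lowering the jth quantization parameter by one step gives D(Q̄ − ej) = 90 and R(Q̄ − ej) = 460 kbits. Then

$$s_j(\bar{Q}) = -\frac{90 - 120}{460 - 400} = \frac{30}{60} = 0.5 \ \text{MSE per kbit}.$$

The slope is positive because, under the monotonicity assumptions above, reducing a quantization parameter lowers the distortion and raises the rate.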

3.1 Steepest Descent (SD) Algorithm

1. Set all components of the quantizer parameter vector Q̄ to their maximum values Mi.

Include indices 1, . . . , n in the active list of indices that can be reduced.

2. Remove from the active list indices for which the quantization parameter cannot be

reduced or where the reduction by one would cause a rate constraint violation. Stop

if the active list is empty.



3. Let j be an index that has the largest value of sj(Q̄) among all active indices. Set Q̄ ← Q̄ − ej.

4. Repeat from step 2.

Figure 1: (a) Distortion-Rate surface for two MPEG-2 encoded frames. The first frame is intra (I) coded, and the second frame is forward predictively coded (P). Axes: rate of I-frame (bits, ×10^5), rate of P-frame (bits, ×10^5), overall mean distortion (MSE). (b) Switching between optimal D-R curves with the jth quantizer step-size fixed.
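Steps 1 to 4 of the SD algorithm can be sketched in a few lines. This is a minimal sketch under stated assumptions rather than the authors' implementation: `encode` is a hypothetical callable standing in for a full encoder run that returns the total distortion and rate of a quantizer vector, `toy_encode` is an invented two-frame model used only to exercise the code, and the rate is assumed to strictly increase whenever a quantization parameter is reduced, as in the definitions above.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical encoder interface: quantizer vector -> (distortion, rate).
Encoder = Callable[[Tuple[int, ...]], Tuple[float, float]]


def steepest_descent(encode: Encoder, q_max: Sequence[int],
                     r_total: float, q_min: int = 1) -> List[Tuple[int, ...]]:
    """Return the quantizer vectors visited, from lowest rate upward."""
    q = list(q_max)                       # Step 1: all parameters at maximum.
    d, r = encode(tuple(q))
    path = [tuple(q)]
    active = set(range(len(q)))
    while active:
        best = None                       # (slope, index, distortion, rate)
        for j in sorted(active):
            if q[j] <= q_min:             # Step 2: cannot be reduced further.
                active.discard(j)
                continue
            trial = q.copy()
            trial[j] -= 1
            d_j, r_j = encode(tuple(trial))
            if r_j > r_total:             # Step 2: would violate the budget.
                active.discard(j)
                continue
            slope = -(d_j - d) / (r_j - r)   # Definition 2; assumes r_j > r.
            if best is None or slope > best[0]:
                best = (slope, j, d_j, r_j)
        if best is None:                  # Active list is empty: stop.
            break
        _, j, d, r = best                 # Step 3: take the steepest step.
        q[j] -= 1
        path.append(tuple(q))             # Step 4: repeat.
    return path


def toy_encode(q: Tuple[int, ...]) -> Tuple[float, float]:
    """Invented two-frame model; the q1*q2 term couples the two frames."""
    q1, q2 = q
    return float(q1 * q1 + q2 * q2 + q1 * q2), 100.0 / q1 + 60.0 / q2
```

For example, `steepest_descent(toy_encode, (4, 4), 80.0)` walks from the coarsest vector toward higher rates one component at a time. A practical version would cache encoder results, since this sketch re-evaluates every active direction at each iteration.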

This algorithm is a coordinate-wise steepest descent method. Step 3 of the algorithm

computes the index that provides the largest ratio of distortion decrement to rate increment

for a reduction in the quantization parameter by a single step. This gives local information

about how the change in parameter allocation within a frame affects the overall distortion

and rate of the sequence as a whole. Starting at the lowest rate solution at Step 1, solutions

at higher rates are recursively derived until all the available rate is spent. In the independent

coding case, the BFOS algorithm computes the index change that is globally steepest. In

the BFOS algorithm each index can be reduced by any (feasible) number of steps in order

to handle non-convexities in the distortion-rate functions. However, though more complex

local search methods can easily be envisioned, only a single step reduction is allowed in

the SD algorithm in order to reduce the computational complexity.

3.2 Conditions for Optimality

Simple conditions that imply global optimality of the steepest descent method can readily

be derived by considering the limitations of the algorithm: a quantization parameter is never

increased once it has been decreased, exactly one quantization parameter is decreased by a

single step at each iteration, and a quantization parameter that does not result in the steepest

slope (Def. 2) is not reduced at any given iteration. Therefore, the steepest descent method is shown to be optimal for a suitably restricted class of problems. The following definitions make these ideas precise and give some desirable properties for the distortion and rate functions.

Figure 2: MPEG-2 encoded frames with first frame coded I and second frame coded P: (a) MSE vs. overall sum rate (Mbits), fixed I-frame quantizer; (b) MSE vs. overall sum rate (Mbits), fixed P-frame quantizer.

Definition 3 A distortion-rate pair (D, R) is said to be achievable with respect to a finite set of quantizer parameter vectors S if there exists Q̄ ∈ S such that D = D(Q̄) and R = R(Q̄).

The elements of S are called achievable vectors or just vectors when the meaning is clear

by context.

Definition 4 Dc(R) denotes the lower convex hull of achievable solutions with respect to

the set of all quantizer parameter vectors.

Definition 5 $D_j^u(R)$ denotes the lower convex hull of achievable solutions with respect to the set of quantizer parameter vectors with the jth component fixed to u.
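The lower convex hulls in Definitions 4 and 5 can be computed from a list of achievable pairs with a standard monotone-chain scan. The sketch below is an illustration only; the function names are hypothetical, and points are given as (rate, distortion), the reverse of the (D, R) order used in the text, so that sorting orders them by rate.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float]  # (rate, distortion)


def _cross(o: Point, a: Point, b: Point) -> float:
    """Cross product of (a - o) and (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def lower_convex_hull(points: Iterable[Point]) -> List[Point]:
    """Vertices of the lower convex hull, ordered by increasing rate."""
    hull: List[Point] = []
    for p in sorted(set(points)):
        # Drop the last vertex while it lies on or above the chord that
        # joins the vertex before it to the new point p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull
```

To obtain the hull of Definition 5 for a given j and u, the same function would be applied only to the pairs produced by vectors whose jth component equals u.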

The following properties are useful in describing the optimality conditions:

• (D(Q̄), R(Q̄)) is said to have the cross-over property if for each j and v < u, $D_j^v(R) < D_j^u(R)$ for sufficiently large R. Moreover, if $D_j^v(R_0) \le D_j^u(R_0)$ for some $R_0$, then $D_j^v(R) < D_j^u(R)$ $\forall R > R_0$.

This property states that the optimal curves with the jth quantization parameter fixed

at two different values cross at most once and that for sufficiently high rates the curve

corresponding to the smaller value stays below the other curve. Figures 2 (a) and (b) show



the set of distortion-rate values for different fixed values of I-frame and P-frame quantizer

parameters respectively, with the quantizer parameter of the other frame allowed to vary

over all possible choices (1-31). Taking the convex hull of these solutions gives $D_1^u(R)$ in Figure 2 (a) and $D_2^u(R)$ in Figure 2 (b) for different fixed values of u. Intuitively, at lower rates, the reduction in distortion due to a smaller quantization parameter is not enough to offset the increase in rate. However, at large enough rates, it is better to use the smaller quantization parameter.

• Let the cross-over rate $R_c(u, v, j)$ between two curves $D_j^u(R)$ and $D_j^v(R)$ be defined as the smallest rate such that $D_j^v(R) \le D_j^u(R)$ if u > v. The cross-over rate is assumed not to be achievable. (D(Q̄), R(Q̄)) is said to be cross-over ordered if for each j and u > v > w, $R_c(v, w, j)$ is greater than the rate of the first achievable solution on $D_j^v(R)$ with rate greater than or equal to $R_c(u, v, j)$.

This condition states that the cross-over rates of the optimal curves with the jth quantizer fixed are inversely ordered with respect to the quantization parameters. Figure 2 shows

that the 2-frame sequence satisfies the cross-over ordering property.

• (D(Q̄), R(Q̄)) is said to have the reachability property if (i) for all j and u, the vectors corresponding to two adjacent achievable solutions (i.e. solutions with no other achievable solutions in between) on $D_j^u(R)$ differ in exactly one index, and (ii) given a vector $\bar{q}$ corresponding to the highest rate achievable solution less than or equal to $R_c(u, u-1, j)$ on $D_j^u(R)$, the vector $\bar{q} - e_j$ gives the lowest rate achievable solution on $D_j^{u-1}(R)$ with rate greater than or equal to $R_c(u, u-1, j)$.

The reachability condition is quite artificial because the SD algorithm is inherently

myopic. It guarantees that the algorithm can move to the first achievable solution on the

curve $D_j^{u-1}(R)$ after the cross-over rate $R_c(u, u-1, j)$ and that the algorithm properly

detects cross-over points. In general, a similar device is required to certify that a local

search over a sufficiently large radius can recover an optimal solution at a higher rate. The

reachability condition was satisfied for all tested 2-frame sequences. This condition is in

general not satisfied exactly for longer sequences. However, experimental results suggest

that the SD solutions are still close to optimal (See Section 4). To analyze this behavior

consider Figure 1 (b). An optimal algorithm must switch from point e to point f . If, for

instance, point c is reachable from point a in one step whereas point d is not, and if all one

step reductions in the quantization parameter have slope less than that of segment ab, the

algorithm will switch to point c instead, since the segment ac has a larger absolute slope as

compared to the segment ab. Likewise, if the algorithm is at point e, it will not be able to switch to point f if the latter cannot be reached in one step. It may switch at a higher rate (point h is assumed to be reachable from point g). Intuitively, it is possible to recover from

these errors if they do not occur often.

Theorem 1 The steepest descent algorithm will find all achievable (D, R) values on the

lower convex hull with rate less than or equal to Rtotal if the cross-over, cross-over ordering

and reachability conditions hold.



Proof of Theorem 1 The proof proceeds by induction. To begin, assume that $\bar{q}$ is such that $(D(\bar{q}), R(\bar{q}))$ is on $D_c(R)$ and that all achievable solutions (with corresponding vectors) on $D_c(R)$ with $R < R(\bar{q})$ are known. Also, for each achievable $(D, R)$ on $D_c(R)$ with $R > R(\bar{q})$, let there exist a corresponding vector $\bar{q}'$ with every component $\bar{q}'_i \le \bar{q}_i$ $\forall i$. That is, each convex hull solution at a larger rate is accessible to the SD method from the current solution vector $\bar{q}$. This clearly holds at the first iteration by the monotonicity assumptions on $D(\bar{Q})$ and $R(\bar{Q})$.

Let j be the index that gives the largest value of $s_j(\bar{q})$ (Def. 2). Let $\bar{q}' = \bar{q} - e_j$. The first part of the reachability condition, the monotonicity assumptions on $D(\bar{Q})$ and $R(\bar{Q})$, the convexity of $D_j^{\bar{q}_j}(R)$, and the fact that the jth direction is the steepest imply that $D_j^{\bar{q}_j}(R)$ is supported by the line connecting $(D(\bar{q}), R(\bar{q}))$ and $(D(\bar{q}'), R(\bar{q}'))$. Therefore, $R(\bar{q}')$ is certainly a rate for which $D_j^{\bar{q}'_j}(R) \le D_j^{\bar{q}_j}(R)$. Both parts of the reachability condition together imply that $\bar{q}'$ gives the lowest rate achievable solution on $D_j^{\bar{q}'_j}(R)$ with rate $R > R_c(\bar{q}_j, \bar{q}_j - 1, j)$.

By the cross-over property, $D_j^{\bar{q}'_j}(R) < D_j^{\bar{q}_j}(R)$ for all $R > R_c(\bar{q}_j, \bar{q}_j - 1, j)$. Therefore, none of the solutions on $D_j^{\bar{q}_j}(R)$ for $R > R_c(\bar{q}_j, \bar{q}_j - 1, j)$ can be on the $D_c(R)$ curve. The cross-over ordering condition and the cross-over property together imply that none of the achievable solutions of $D_j^v(R)$ for $v < \bar{q}_j$ are on $D_c(R)$ for $R < R(\bar{q}')$. Since no convex hull solutions are lost, convex hull solutions at rates larger than $R(\bar{q}')$ are still accessible from the current solution. The proof can be completed by noting that the algorithm terminates in a finite number of steps.

Under the conditions given in Theorem 1, the SD algorithm switches between the “optimal” curves with one variable fixed ($D_j^u(R)$) so that for each rate region between two cross-over rates it picks the curve that has the least distortion of all curves in that region. For instance, when the SD method moves along index j in an iteration, it optimally switches from the curve $D_j^u(R)$ to the curve $D_j^{u-1}(R)$ for some value of u. Hence, every steepest descent index choice can be interpreted in terms of a curve-switch.

The conditions in the theorem are satisfied in the independent case (1) if the achievable

distortion-rate pairs (Di(Qi), Ri(Qi)) are all on the convex hull (this can be done by suit-

ably re-indexing the quantizer parameter values), the line segments on the convex hull have

distinct slopes (across all i), and the cross-over rates are not achievable. However, not all

conditions are necessary for the algorithm to work. In the independent case, this algorithm

reduces to a variant of the BFOS algorithm as applied to bit allocation [3].

3.3 Complexity

The worst case complexity of the steepest descent algorithm in terms of the number of calls to the encoder is |Q|NM, where |Q| is the maximum number of possible values for

a component of the quantizer parameter vector, N is the number of frames and M is the

maximum number of frames affected by a change in allocation of a single frame. This is

because there are at most |Q|N steps in the algorithm as one component of the quantizer

parameter vector is reduced by one at each step. Also, at each step, slopes need to be

computed only for those frames (at most M ) that depend on the frame whose quantization



parameter was most recently reduced. Consequently, the encoder does not have to recode

a large part of the sequence. The algorithm has O(|Q|MN log N) complexity in terms of

choosing the steepest direction given the slope data, where again there are |Q|N iterations

in the worst case and O(M log N ) complexity at each iteration to replace M slopes in a

sorted list of slopes. The complexity can be significantly decreased by taking longer steps,

by changing more than one component of the quantizer parameter vector at a time, and by fixing motion vectors. The original algorithm, however, gave the best results when applied

to MPEG-2 encoding.

In comparison, the pruned tree search [4] algorithm applied to a Group of Pictures in

MPEG-2 has a worst case number of calls to the encoder of O(|Q|^(R+1)), where |Q| is as

before and R is the number of I and P frames in the GOP with an I P B B P B B . . .

structure. Though the algorithm is empirically orders of magnitude faster, it is not easy

to bound this complexity. The model-based method in [5] is about twice as complex as

the standard TM5 (Test Model 5) [7] algorithm and can be used for on-line applications.

However, the method is sensitive to starting conditions and relies on a good initial solution,

which is not an unreasonable assumption in practice.
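To give a feel for the gap between the two worst-case bounds, the following sketch evaluates them for one illustrative configuration. The values are assumptions chosen for the example: |Q| = 31 quantizer values, a 13-frame GOP, M = 13 (every frame affected by a change, a pessimistic choice), and R = 5 I and P frames. Constant factors hidden by the O(·) notation are ignored, and the function names are hypothetical.

```python
def sd_encoder_calls(q_levels: int, n_frames: int, m_affected: int) -> int:
    """Worst-case encoder calls of steepest descent: |Q| * N * M."""
    return q_levels * n_frames * m_affected


def pruned_tree_worst_case(q_levels: int, anchor_frames: int) -> int:
    """Worst-case growth of the pruned tree search: |Q| ** (R + 1)."""
    return q_levels ** (anchor_frames + 1)


if __name__ == "__main__":
    print(sd_encoder_calls(31, 13, 13))      # 5239
    print(pruned_tree_worst_case(31, 5))     # 887503681
```

These are bounds only. As noted above, the pruned search is empirically much faster than its worst case.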

4 Results

Results are shown for an MPEG-2 encoder with the Mean Squared Error (MSE) dis-

tortion criterion at the frame-level and the sum of frame distortions as the objective to

minimize over the whole sequence. The quantizer step-size is adjusted only at the frame

level. Though this implementation does not explicitly optimize the selection of macroblock

modes, such selection methods can be incorporated without any change in the basic proce-

dure [8].

To verify that the steepest descent algorithm produces near optimal solutions in practice,

an “empirical induction” procedure is performed. Figure 3 (a) shows curves for a two frame

sequence consisting of an I-frame and a P-frame. The curves plot distortion versus rate for

fixed I-frame quantizer scale as the P-frame quantizer scale varies. For the two frame case,

this exhausts all possible allocations. As mentioned previously, the SD solution is optimal

in this case. This procedure can be extended to three frames by applying the steepest

descent method to optimize over two frames while holding the quantization scale of the

third frame fixed. The result can then be compared with steepest descent directly applied to

three frames. Figure 3 (b) shows this for the six frame case, where the I-frame quantization

value is fixed and steepest descent is used to optimize the subsequent five P frames. It is

clear that the steepest descent solution directly applied to six frames (labeled SD) is nearly optimal.

Note that the algorithm can start at the lowest rate solution and allocate bits, or equivalently, start at the highest rate solution and deallocate bits optimally. For all tested sequences essentially the same curve was recovered in either direction. These results suggest

that this is likely the best solution obtainable for quantizer step-size selection at the frame

level assuming a fixed motion-estimation and mode selection algorithm.

Figure 3: Mobile sequence, distortion (MSE) vs. rate (Mbits), showing the curves for fixed I-frame quantization levels together with the SD solution: (a) 2 frames; (b) 6 frames.

The SD algorithm was used to encode a variety of standard CIF/SIF test sequences using an MPEG-2 coder. PSNR results for the SD algorithm and MPEG-2 Test Model 5 (TM5) [7] are shown in Figure 4 (a) and (b) for 28 frames of the SIF (352x240) Football and Mobile sequences respectively. Each sequence consists of two GOPs of size 13 and 15 with an

I P B B P . . . GOP structure. It was observed that at low rates reducing the quantization

step-size could actually decrease the total bit rate due to better prediction. The SD algo-

rithm was accordingly modified in order to detect and move along directions that reduced

the rate when the quantization step-size was decreased. To make a fair comparison between

the SD method and TM5 in terms of gains due to better bit allocation among frames, the

results for the TM5 allocation optimized within frame for MSE are plotted under the label TM5+IFO in Figure 4. The steepest descent method gives up to 1.5 dB improvement in

PSNR over TM5 and up to 0.8 dB over the MSE optimized version of TM5. As expected,

SD has better PSNR at all rates.
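To relate the reported PSNR gains to the MSE objective being minimized, the standard conversion can be sketched as below. The peak value of 255 assumes 8-bit samples, and applying the formula to the mean MSE over the sequence is an assumption suggested by the axis label of Figure 4; the function names are hypothetical.

```python
import math


def psnr_from_mse(mse: float, peak: float = 255.0) -> float:
    """PSNR in dB for a given mean squared error (8-bit peak assumed)."""
    if mse <= 0.0:
        raise ValueError("MSE must be positive")
    return 10.0 * math.log10(peak * peak / mse)


def mse_ratio_for_gain(gain_db: float) -> float:
    """Factor by which the MSE shrinks for a given PSNR gain in dB."""
    return 10.0 ** (gain_db / 10.0)
```

Under these assumptions, `mse_ratio_for_gain(1.5)` is about 1.41, so a 1.5 dB PSNR improvement corresponds to roughly 29% lower mean MSE at the same rate.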

The complexity of the SD algorithm is a function of the total bit rate target as it starts

from the lowest rate solution and moves toward higher rate solutions at each iteration. On

a Pentium 4, 3 GHz machine, the algorithm took 10 hours to generate the full distortion

versus rate curve for 28 frames using an MPEG-2 coder implemented in C.

5 Conclusion

An efficient steepest descent based bit allocation method was introduced for dependent

video coding. Sufficient conditions for global optimality were presented along with empir-

ical results that suggest that the steepest descent method produces near-optimal solutions in

practice. Results for MPEG-2 encoded video show up to 1.5 dB PSNR improvement over

TM5. Steepest descent as implemented here is an off-line algorithm. This algorithm along

with faster variations can be useful for benchmarking and archival applications.



Figure 4: PSNR (mean MSE, dB) vs. rate (Mbps @ 30 Hz) over 28 frames for SD, TM5+IFO and TM5: (a) Football; (b) Mobile.

References

[1] Y. Shoham, A. Gersho, “Efficient bit allocation for an arbitrary set of quantizers,” IEEE Trans. on

Acoustics, Speech and Signal Processing, vol. 36, no. 9, pp. 1445-1453, September 1988.

[2] P.A. Chou, T. Lookabaugh, R.M. Gray, “Optimal pruning with applications to tree-structured source coding and modeling,” IEEE Trans. on Information Theory, vol. 35, no. 2, pp. 299-315, March 1989.

[3] E.A. Riskin, “Optimal bit allocation via the generalized BFOS algorithm,” IEEE Trans. on Information

Theory, vol. 37, no. 2, pp. 400-402, March 1991.

[4] K. Ramchandran, A. Ortega, M. Vetterli, “Bit allocation for dependent quantization with applications to multiresolution and MPEG video coders,” IEEE Trans. on Image Proc., vol. 3, no. 5, pp. 533-545, Sept. 1994.

[5] Liang-Jin Lin, A. Ortega, “Bit-rate control using piecewise approximated rate-distortion characteristics,” IEEE Trans. on Circuits and Systems for Video Technology, vol. 8, no. 4, pp. 446-459, August 1998.

[6] S.W. Wu, A. Gersho, “Rate-constrained picture-adaptive quantization for JPEG baseline coders,” in Proceedings of the ICASSP, pp. 389-392, April 1993.

[7] ISO-IEC/JTC1/SC29/WG11/N0400. Test model 5, April 1993. Document AVC-491b, Version 2.

[8] G.J. Sullivan, T. Wiegand, “Rate-distortion optimization for video compression,” IEEE Signal Process-

ing Magazine, vol. 15, no. 6, pp. 74-90, November 1998.
