ISPASS 2004 © 2004 Marilyn Wolf Multimedia Algorithms Marilyn Wolf Dept. of Electrical Engr. Princeton University


Outline

Compact disc player.
Video compression.

The multimedia processing funnel

[Figure: funnel diagram. Axis labels: data volume, data abstraction. Stage labels: pixel processing; edge extraction; principal component analysis, hidden Markov models.]

CD/MP3 player

[Block diagram. Blocks: audio CPU, servo CPU, error corrector, jog memory, memory (two instances), analog in, analog out, amp, DAC, display, head, drive. Signal labels: FE, TE, amp; focus, tracking, sled, motor; I2S.]

CD medium

Scanning velocity: 1.2-1.4 m/s (constant linear velocity, CLV).
Track pitch: 1.6 microns.
Diameter: 120 mm.
Pit length: 0.8-3 microns.
Pit depth: 0.11 microns.
Pit width: 0.5 microns.
Laser wavelength: 780 nm.

CD mechanism

Laser, lens, sled:

[Figure labels: laser, CD, detectors, diffraction grating, sled, track, focus.]

Laser focus

Focus controlled by vertical position of lens.

Unfocused beam causes irregular spot:

[Figure: spot shapes, labeled out of focus, in focus, out of focus.]

Laser pickup

[Figure: main detector segments A, B, C, D, with side spot detectors E and F.]

Level: A + B + C + D
Focus error: (A + C) - (B + D)
Tracking error: E - F
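
The three pickup signals are simple sums and differences of the six detector outputs. A minimal sketch, assuming the detector readings arrive as plain floating-point values; the `PickupReading` class and `pickup_signals` function are illustrative names, not part of the original slides:

```python
from dataclasses import dataclass


@dataclass
class PickupReading:
    """Detector outputs: a-d are the main detector segments, e and f the side spots."""
    a: float
    b: float
    c: float
    d: float
    e: float
    f: float


def pickup_signals(r):
    """Return (level, focus_error, tracking_error) using the formulas on the slide."""
    level = r.a + r.b + r.c + r.d
    focus_error = (r.a + r.c) - (r.b + r.d)
    tracking_error = r.e - r.f
    return level, focus_error, tracking_error


# A balanced reading gives zero focus and tracking error.
balanced = PickupReading(1.0, 1.0, 1.0, 1.0, 0.5, 0.5)
print(pickup_signals(balanced))
```

A servo loop would drive the focus and tracking errors toward zero; the control law itself is not specified on the slide and is not sketched here.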

Servo control

Four main signals:
focus (laser) @ 245 kHz;
tracking (laser) @ 245 kHz;
sled (motor) @ 800 Hz;
disc motor.

[Figure: optical pickup.]

EFM

Eight-to-fourteen modulation: each 8-bit data value is mapped to a fourteen-bit code that guarantees a maximum distance between transitions.

Example: 00000011 -> 00100100000000
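
The run-length property can be checked mechanically. The sketch below assumes that each 1 bit marks a transition and uses the commonly quoted EFM limits of at least 2 and at most 10 zeros between successive 1 bits; those two numbers are not stated on the slide, and the function names are illustrative:

```python
def zero_runs_between_ones(codeword):
    """Lengths of the zero runs that lie between consecutive 1 bits."""
    ones = [i for i, bit in enumerate(codeword) if bit == "1"]
    return [j - i - 1 for i, j in zip(ones, ones[1:])]


def run_lengths_ok(codeword, min_zeros=2, max_zeros=10):
    """Check the assumed run-length limits on a single code word."""
    inner_ok = all(min_zeros <= run <= max_zeros
                   for run in zero_runs_between_ones(codeword))
    # Leading and trailing zero runs must also stay under the maximum.
    longest = max(len(run) for run in codeword.split("1"))
    return inner_ok and longest <= max_zeros


print(zero_runs_between_ones("00100100000000"))
print(run_lengths_ok("00100100000000"))
```

This checks one code word in isolation; runs that span two adjacent code words are not covered by the sketch.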

Error correction

CD capacity: 6.99 GB raw, 700 MB formatted.

Reed-Solomon code: $g(x) = (x - \alpha)(x - \alpha^{2}) \cdots (x - \alpha^{n-k-1})(x - \alpha^{n-k})$

Produces data, erasure bits.

Time to solve varies greatly depending on noise.

CD interleaves Reed-Solomon blocks to reduce effects of large data gaps.
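
The generator polynomial can be built by multiplying out its linear factors. A minimal sketch, assuming the symbols are elements of GF(2^8) with field polynomial x^8 + x^4 + x^3 + x^2 + 1 and alpha = 2, and using n - k = 4 parity symbols purely as an illustrative value; none of these parameters appear on the slide:

```python
PRIMITIVE_POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1 (assumed field polynomial)


def gf_mul(a, b):
    """Multiply two GF(2^8) elements by shift-and-add with modular reduction."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= PRIMITIVE_POLY
        b >>= 1
    return result


def gf_pow(a, n):
    """Raise a GF(2^8) element to a non-negative integer power."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def poly_mul(p, q):
    """Multiply polynomials over GF(2^8); coefficients are highest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pc in enumerate(p):
        for j, qc in enumerate(q):
            out[i + j] ^= gf_mul(pc, qc)
    return out


def poly_eval(p, x):
    """Evaluate a polynomial at x with Horner's rule."""
    y = 0
    for coeff in p:
        y = gf_mul(y, x) ^ coeff
    return y


def generator_poly(num_parity, alpha=2):
    """g(x) = (x - alpha)(x - alpha^2)...(x - alpha^num_parity).

    Subtraction and addition coincide in GF(2^8), so each factor is [1, alpha^i].
    """
    g = [1]
    for i in range(1, num_parity + 1):
        g = poly_mul(g, [1, gf_pow(alpha, i)])
    return g


g = generator_poly(4)
print(len(g) - 1)  # degree of g(x)
print(all(poly_eval(g, gf_pow(2, i)) == 0 for i in range(1, 5)))
```

Every valid code word is a multiple of g(x), which is why it evaluates to zero at each of the chosen powers of alpha; a decoder uses the non-zero evaluations (syndromes) of a received word to locate errors.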

Control and error correction

Skips caused by physical disturbance:
wait for disturbance to subside;
retry.

Read errors caused by disc/servo problems:
detect error;
choose location for retry;
retry;
fail and interpolate.

Audio output

Audio CD output is straightforward. May perform D/A filtering in software.

MP3 decode is relatively straightforward: about 10% of an ARM7.

File system support for data CD is complex: PC/Mac; arbitrary file structure.

MPEG audio standards

Layer 1: Lossless compression of subbands + optional simple masking model.
Layer 2: More advanced masking model.
Layer 3: Additional processing for lower bit rates.

MPEG audio rates

Input sampling rates: 32, 44.1, 48 kHz.

Output bit rates: 32, 48, 64, 96, 112, 128, 192, 256, 384 kbits/sec.

Output can be mono, dual-channel (bilingual, etc.), stereo.

Other standards

Dolby Digital (AC-3): uses modified discrete cosine transform.
ATRAC (MiniDisc): uses subband + modified DCT.
MPEG-2 AAC.

Software implementations

Many standards with complex code. About 1 million lines of code required to implement all major standards.

Techniques are similar but details vary. Variations from codec to codec. Parameter changes at run time: window size, etc.

MPEG Layer 1

384 samples/block at all frequencies. Equals 8 ms at 48 kHz.

Optional masking model. Driven by separate FFT for better accuracy.
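
The 8 ms figure follows from dividing the block length by the sampling rate. A small sketch of that arithmetic, applied to the three input sampling rates; the function name is illustrative:

```python
SAMPLES_PER_BLOCK = 384  # Layer 1 block length from the slide


def block_duration_ms(sample_rate_hz):
    """Duration of one 384-sample block in milliseconds."""
    return 1000.0 * SAMPLES_PER_BLOCK / sample_rate_hz


for rate in (32_000, 44_100, 48_000):
    print(f"{rate} Hz: {block_duration_ms(rate):.2f} ms")
```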

MPEG Layer 1 data frame

Bit allocation codes specify word length in each subband.

Scale factors give gain for each band.

[Frame layout: header | CRC | bit allocation | scale factors | subband samples | aux data]
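
The roles of the word length and the scale factor can be illustrated with a simplified block quantizer. This is a sketch under stated assumptions, not the standard's procedure: it takes the scale factor to be the peak magnitude in the block and uses a uniform signed quantizer, whereas the real format draws both from fixed tables. The function names are hypothetical:

```python
def encode_band(samples, word_length):
    """Quantize one subband block to signed codes of word_length bits (>= 2)."""
    scale = max(abs(s) for s in samples) or 1.0  # gain for the band
    levels = 2 ** (word_length - 1) - 1          # largest code magnitude
    codes = [round(s / scale * levels) for s in samples]
    return scale, codes


def decode_band(scale, codes, word_length):
    """Rebuild approximate samples from the codes and the band's scale factor."""
    levels = 2 ** (word_length - 1) - 1
    return [scale * c / levels for c in codes]


band = [0.5, -0.25, 0.125, 0.0]
scale, codes = encode_band(band, word_length=8)
restored = decode_band(scale, codes, word_length=8)
print(scale, codes)
```

A shorter word length gives coarser steps and a larger reconstruction error, which is the trade-off the bit allocation controls per subband.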

MPEG Layer 1 encoder

[Block diagram. Blocks: filterbank, choose scale factor, FFT, masking model, requantize, multiplier (*), mux. Output: bit stream (0101..).]

MPEG Layer 1 decoder

[Block diagram. Input: bit stream (0101..). Blocks: demux, expand, inverse quantize, multipliers (*) for step size and scale factor, inverse filterbank.]

MP3

Decoding is easier than encoding, but requires: decompression; filtering.

Basic CD standard for data discs.

No standards for MP3 disc file structure: player must understand Windows, Mac, Unix discs.

Basic principles of MPEG-style compression

Discrete cosine transform (DCT) used to select perceptually significant information from blocks.

Motion estimation identifies temporal redundancy in frames.

Lossless (channel) coding reduces bit rate.

MPEG-style compression engine

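[Block diagram. Blocks: motion estimator, adder (+), DCT, quantizer (Q), variable-length coder, buffer, inverse quantizer (Q-1), inverse DCT (DCT-1), adder (+), picture store/predictor.]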

Spatial frequency in 1D

[Figure: 1-D examples labeled high and low spatial frequency.]

DCT

Discrete cosine transform: $v(k) = \alpha(k) \sum_{t=0}^{N-1} u(t) \cos\left[\frac{\pi (2t+1) k}{2N}\right]$

2-D DCT can be computed from two 1-D DCTs.

1-D DCT can be computed in O(N log N) time.
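
The separability property can be shown directly: apply a 1-D DCT to every row, then to every column of the result. The sketch below uses the direct O(N^2) summation rather than a fast O(N log N) algorithm, and assumes the orthonormal scaling alpha(0) = sqrt(1/N), alpha(k) = sqrt(2/N) for k > 0, which the slide does not spell out:

```python
import math


def dct_1d(u):
    """Direct 1-D DCT of a sequence, with assumed orthonormal scaling."""
    n = len(u)
    out = []
    for k in range(n):
        alpha = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(u[t] * math.cos(math.pi * (2 * t + 1) * k / (2 * n))
                    for t in range(n))
        out.append(alpha * total)
    return out


def dct_2d(block):
    """2-D DCT computed as 1-D DCTs over rows, then over columns."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to row order


flat = [[1.0] * 8 for _ in range(8)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 6))
```

A constant block has no spatial variation, so all of its energy lands in the single lowest-frequency coefficient.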

8-point DCT flowgraph (Lee)

[Figure: flowgraph with inputs x0, x7, x1, x6, x3, x4, x2, x5; outputs y0, y2, y4, y6, y1, y3, y5, y7; and multiplier coefficients C1 through C7.]

DCT and quantization

[Figure: a block passing through the DCT and then the quantizer (Q).]

DCT coefficient quantization

DCT is used to throw out high spatial frequencies in an 8 x 8 block:

33  5  3  1  0  0  0  0
 8  6  4  2  0  0  0  0
 6  2  1  0  0  0  0  0
 4  1  1  0  0  0  0  0
 2  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 1  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
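
Quantization produces a sparse block like this by dividing each coefficient by a step size and rounding, with larger steps at higher spatial frequencies. A minimal sketch; the `STEP` table below is a hypothetical one invented for illustration and is not taken from any standard or from the slides:

```python
# Hypothetical step sizes: coarser at higher horizontal (v) and vertical (u) frequency.
STEP = [[8 + 4 * (u + v) for v in range(8)] for u in range(8)]


def quantize(coeffs, step):
    """Divide each coefficient by its step size and round to an integer level."""
    return [[round(c / s) for c, s in zip(crow, srow)]
            for crow, srow in zip(coeffs, step)]


def dequantize(levels, step):
    """Rebuild approximate coefficients from the integer levels."""
    return [[lv * s for lv, s in zip(lrow, srow)]
            for lrow, srow in zip(levels, step)]


# A block with a large DC term and small values everywhere else.
coeffs = [[10.0] * 8 for _ in range(8)]
coeffs[0][0] = 264.0
levels = quantize(coeffs, STEP)
print(levels[0][0], levels[7][7])
```

Small high-frequency coefficients round to zero and are lost, which is where the compression (and the loss) comes from.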

Channel coding

Lossless encoding is applied to final bit stream to reduce bit rate.

Huffman-style encoding is used:
variable-length code for common symbols;
escape code + fixed-length code for less-common symbols.
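
The escape mechanism can be sketched with a tiny prefix code. The code table, the escape prefix, and the 8-bit fixed-length field below are all hypothetical values chosen for illustration; real codecs define their own tables:

```python
CODE_TABLE = {0: "0", 1: "10", -1: "110"}  # hypothetical short codes for common symbols
ESCAPE = "111"                              # prefix announcing a fixed-length value
FIXED_BITS = 8                              # escaped values are 8-bit two's complement


def encode(symbols):
    """Encode integers in -128..127 as a string of '0'/'1' characters."""
    parts = []
    for s in symbols:
        if s in CODE_TABLE:
            parts.append(CODE_TABLE[s])
        else:
            parts.append(ESCAPE + format(s & (2 ** FIXED_BITS - 1), f"0{FIXED_BITS}b"))
    return "".join(parts)


def decode(bits):
    """Invert encode(); relies on the code being prefix-free."""
    inverse = {code: sym for sym, code in CODE_TABLE.items()}
    symbols, pos = [], 0
    while pos < len(bits):
        if bits.startswith(ESCAPE, pos):
            pos += len(ESCAPE)
            raw = int(bits[pos:pos + FIXED_BITS], 2)
            pos += FIXED_BITS
            symbols.append(raw - 2 ** FIXED_BITS if raw >= 2 ** (FIXED_BITS - 1) else raw)
            continue
        for code, sym in inverse.items():
            if bits.startswith(code, pos):
                symbols.append(sym)
                pos += len(code)
                break
        else:
            raise ValueError("invalid bit stream")
    return symbols


data = [0, 0, 1, -1, 37, 0, -100]
stream = encode(data)
print(stream, decode(stream) == data)
```

Common symbols cost one to three bits, while a rare symbol costs the escape prefix plus the fixed-length field, so the scheme pays off when the symbol distribution is strongly skewed.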

Block motion estimation

[Figure: blocks numbered 1 to 6, shown in two frames.]

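Motion estimation, cont'd.

Two-dimensional correlation of a 16 x 16 macroblock within search range.

Best fit: the displacement that minimizes $\sum |p_b - p_s|$, the sum of absolute differences between macroblock pixels $p_b$ and search-area pixels $p_s$.

Results in a motion vector which shows displacement of macroblock in search area.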
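
An exhaustive (full) search makes the best-fit criterion concrete: try every displacement in the search range and keep the one with the smallest sum of absolute differences. A minimal sketch, assuming frames are lists of rows of integer pixel values; the function names and the default search range of 7 pixels are assumptions for illustration:

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block of cur and a displaced block of ref."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(size) for x in range(size))


def full_search(cur, ref, bx, by, size=16, search=7):
    """Return (dx, dy, cost) of the best match for the block at (bx, by) in cur."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x0, y0 = bx + dx, by + dy
            # Skip candidates that fall outside the reference frame.
            if x0 < 0 or y0 < 0 or x0 + size > width or y0 + size > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2], best[0]


# Synthetic frames: cur is ref displaced so the true motion vector is (-3, 2).
ref = [[(3 * x * x + 5 * y * y + x * y) % 256 for x in range(32)] for y in range(32)]
cur = [[ref[(y + 2) % 32][(x - 3) % 32] for x in range(32)] for y in range(32)]
print(full_search(cur, ref, 8, 8, size=8, search=4))
```

The cost grows with the square of the search range, which motivates the reduced-search algorithms on the following slides.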

Search process

ODFS and PLS algorithms

CBAS and FE2SS

CBAS: center-biased adaptive search.
FE2SS: fast and efficient 2-step search.

3SS related algorithms

E3SS differs from N3SS in that:
1. A small diamond pattern is used instead of a square in the central area.
2. The small diamond uses an unrestricted search step, rather than a single movement as for the small square.

Test sequences: Coastguard, Football, Salesman, Suzie.

Algorithms compared: FS, 3SS, 4SS, N3SS, DS, E3SS.
(1) Large search window (31 x 31): E3SS performs better in terms of MSE and search points than any other non-full-search algorithm.
(2) Small search window (15 x 15): E3SS performs similarly to DS and N3SS.
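
For reference, the basic three-step search (3SS) that these variants refine can be sketched as follows. It assumes the textbook form with step sizes 4, 2, and 1 and a square nine-point pattern; N3SS and E3SS add center-biased patterns that are not shown. Function names are illustrative, and the center point is re-evaluated at each step for simplicity:

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block of cur and a displaced block of ref."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(size) for x in range(size))


def three_step_search(cur, ref, bx, by, size=16, first_step=4):
    """Greedy coarse-to-fine search; returns the motion vector (dx, dy)."""
    height, width = len(ref), len(ref[0])
    cx = cy = 0  # current search center, as a displacement
    step = first_step
    while step >= 1:
        best = None
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                mx, my = cx + dx, cy + dy
                x0, y0 = bx + mx, by + my
                if x0 < 0 or y0 < 0 or x0 + size > width or y0 + size > height:
                    continue
                cost = sad(cur, ref, bx, by, mx, my, size)
                if best is None or cost < best[0]:
                    best = (cost, mx, my)
        _, cx, cy = best  # recenter on the best point, then halve the step
        step //= 2
    return cx, cy


ref = [[(3 * x * x + 5 * y * y + x * y) % 256 for x in range(40)] for y in range(40)]
cur = [[ref[(y - 4) % 40][(x + 4) % 40] for x in range(40)] for y in range(40)]
print(three_step_search(cur, ref, 16, 16, size=8))
```

Because each step commits to the best of nine points, the search is a heuristic: it evaluates far fewer candidates than a full search but can settle on a local minimum when the error surface is not smooth.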

4SS related algorithms

4SS:
1. Three 5 x 5 search windows and a final 3 x 3 window. The first step uses 9 points, the second and third steps use three or five points each, and the final step uses 8 points.
2. The first step of 4SS uses a smaller 5 x 5 search window, versus 9 x 9 in 3SS-related algorithms.
3. More regular search pattern than N3SS.
4. 4SS has similar or worse image quality than N3SS but uses fewer search points.

Other 4SS-related algorithms: E4SS.
Average search points: E4SS < 4SS < N3SS < 3SS.
MSE performance is similar to N3SS.

MPEG-1/2 frame types

[Figure: I frame at time t, B frames at t+1 and t+2, P frame at t+3.]