Huffman Algorithm (Thuật toán Huffman)

  • 7/31/2019 Thuat Toan Huffman

    FIT-HCMUS

    Lecturers:

    Văn Chí Nam, Nguyễn Thị Hồng Nhung, Đặng Nguyễn Đức Tiến

    Introduction

    Some concepts

    The static Huffman compression algorithm

    Data Structures and Algorithms - HCMUS 2011

    Terminology:

    Data compression

    Encoding

    Decoding

    Lossless data compression

    Lossy data compression

    Data compression

    The need arose as soon as the first computer systems appeared.

    Today it serves multimedia data formats.

    It also improves security.

    Applications:

    Storage

    Data transmission

    Principle: encoding and decoding use the same scheme.

    [Figure: data flowing through an encode stage and a decode stage]

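    Data compression ratio

    The ratio between the size of the original data and the size of the data after the compression algorithm is applied.

    Let N be the size of the original data and N_1 the size of the data after compression. The compression ratio R is:

    $R = \frac{N}{N_1}$

    Example: the original data is 8 KB and compresses to 2 KB. Compression ratio: 4:1.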

    Data compression ratio, expressed as space savings

    The relative reduction in data size after the compression algorithm is applied.

    Let N be the size of the original data and N_1 the size of the data after compression. The compression ratio R is:

    $R = 1 - \frac{N_1}{N}$

    Example: the original data is 8 KB and compresses to 2 KB. Compression ratio: 75%.
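    The two definitions of the compression ratio can be checked against the 8 KB / 2 KB example with a few lines of Python. This is only an illustrative sketch; the function names `compression_ratio` and `space_savings` are made up here and do not come from the slides.

```python
def compression_ratio(original_size: float, compressed_size: float) -> float:
    """R = N / N1: how many times smaller the compressed data is."""
    return original_size / compressed_size


def space_savings(original_size: float, compressed_size: float) -> float:
    """R = 1 - N1 / N: fraction of the original size that was saved."""
    return 1 - compressed_size / original_size


# The example from the slides: 8 KB compressed down to 2 KB.
ratio = compression_ratio(8, 2)   # 4.0, i.e. 4:1
savings = space_savings(8, 2)     # 0.75, i.e. 75%
print(f"ratio = {ratio:g}:1, savings = {savings:.0%}")
```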

    Lossless data compression

    Allows the compressed data to be restored exactly to the original data (as it was before compression).

    Examples:

    Run-length encoding

    LZW

    Applications:

    PCX, GIF, and PNG images

    *.ZIP files

    The gzip utility (Unix)
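    To make "restored exactly" concrete, here is a minimal sketch of run-length encoding, one of the lossless examples named above. The slides do not describe an implementation; the `(character, count)` pair format and the function names are assumptions chosen for illustration.

```python
from itertools import groupby


def rle_encode(text):
    """Collapse each run of identical characters into a (char, count) pair."""
    return [(char, len(list(run))) for char, run in groupby(text)]


def rle_decode(pairs):
    """Expand (char, count) pairs back into the original text."""
    return "".join(char * count for char, count in pairs)


sample = "AAAABBBCCD"
encoded = rle_encode(sample)   # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
restored = rle_decode(encoded)
print(encoded, restored == sample)
```

    The round trip returns the input unchanged, which is the defining property of a lossless scheme.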

    Lossy data compression

    The restored data is not identical to the original data, but it is close enough to be usable.

    Applications: compressing multimedia data (images, audio, video):

    Images: JPEG, DjVu

    Audio: AAC, MP2, MP3

    Video: MPEG-2, MPEG-4

    Goals:

    A compression algorithm that preserves information (lossless);

    Independent of the nature of the data;

    Widely applicable to any kind of data, with good performance.

    Main idea:

    The old approach: a fixed-length bit sequence represents each character.

    David Huffman (1952) found a method for determining an optimal code for static data:

    Each character is represented by a few bits (called its bit code).

    Bit codes have different lengths for different characters: characters that occur often get short codes; characters that occur rarely get long codes.

    => Encoding with variable-length codes (Variable Length Encoding)

    Suppose we have the following data: ADDAABBCCBAAABBCCCBBBCDAADDEEAA

    Representing it with 8 bits per character requires: (10 + 8 + 6 + 5 + 2) * 8 = 248 bits

    Character   Frequency
    A           10
    B           8
    C           6
    D           5
    E           2

    Data: ADDAABBCCBAAABBCCCBBBCDAADDEEAA

    Representing it with variable-length codes requires:

    10*2 + 8*2 + 6*2 + 5*3 + 2*3 = 69 bits

    Character   Frequency   Code
    A           10          11
    B           8           10
    C           6           00
    D           5           011
    E           2           010
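    The two bit counts can be reproduced directly from the frequency and code tables. The snippet below is a small sketch that copies both tables from the slides into Python dictionaries; the variable names are illustrative.

```python
# Frequency and code tables copied from the slides.
freq = {"A": 10, "B": 8, "C": 6, "D": 5, "E": 2}
code = {"A": "11", "B": "10", "C": "00", "D": "011", "E": "010"}

# Fixed-length encoding: every character costs 8 bits.
fixed_bits = sum(freq.values()) * 8

# Variable-length encoding: each character costs the length of its own code.
variable_bits = sum(freq[ch] * len(code[ch]) for ch in freq)

print(fixed_bits, variable_bits)  # 248 69
```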

    [Step 1]: Scan the file and build a table of character frequencies.

    [Step 2]: Build the Huffman tree from the frequency table.

    [Step 3]: Generate the bit-code table, one code per character.

    [Step 4]: Scan the file again and replace each character with its bit code.

    [Step 5]: Save the Huffman tree information for decompression.

    Data:

    ADDAABBCCBAAABBCCCBBBCDAADDEEAA

    Encoded data (69 bits, wrapped across two lines):

    11011011111110100000101111111010000
    0001010100001111110110110100101111

    Data: ADDAABBCCBAAABBCCCBBBCDAADDEEAA

    Character   Frequency
    A           10
    B           8
    C           6
    D           5
    E           2
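    Building the frequency table amounts to counting characters. A minimal sketch using Python's standard `collections.Counter` (the slides do not prescribe a data structure, so this choice is an assumption):

```python
from collections import Counter

data = "ADDAABBCCBAAABBCCCBBBCDAADDEEAA"

# Count how many times each character occurs in the data.
freq = Counter(data)

# most_common() lists the entries from most to least frequent.
for char, count in freq.most_common():
    print(char, count)
```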

    Huffman tree: a binary tree.

    Each leaf node holds one character.

    Each parent node holds the characters of its child nodes.

    Node weight:

    Leaf node: the frequency of the corresponding character.

    Parent node: the sum of the weights of its child nodes.
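    One possible way to represent such a node is sketched below. The class name `Node`, its field names, and the helper `make_parent` are all hypothetical; the slides only describe what a node holds, not how it is stored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    chars: str                       # one character for a leaf, the children's characters for a parent
    weight: int                      # frequency for a leaf, sum of the children's weights for a parent
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    @property
    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def make_parent(left: Node, right: Node) -> Node:
    """Combine two nodes under a parent whose weight is the sum of theirs."""
    return Node(left.chars + right.chars, left.weight + right.weight, left, right)


e, d = Node("E", 2), Node("D", 5)
ed = make_parent(e, d)
print(ed.chars, ed.weight)  # ED 7
```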

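    [Figure: Huffman tree for the example data; children are indented under their parent, left child first]

    CEDBA 31
        CED 13
            C 6
            ED 7
                E 2
                D 5
        BA 18
            B 8
            A 10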

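    Generating the tree:

    Step 1: From the statistics table, pick the two elements x and y with the lowest weights.

    Step 2: Create two tree nodes for them, together with a parent node z whose weight is the sum of the weights of the two child nodes.

    Step 3: Remove the two elements x and y from the statistics table.

    Step 4: Add the element z to the statistics table.

    Step 5: Repeat Steps 1-4 until only one element remains in the statistics table.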

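    Conventions:

    The node with the smaller weight goes on the left branch; the other node goes on the right branch.

    If the two nodes have equal weights, the node with the smaller character goes on the left and the node with the larger character goes on the right.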
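    The tree-generation steps and the left/right conventions can be sketched as follows. This is an illustrative version, not the course's reference implementation: nodes are plain `(chars, weight, left, right)` tuples, the function name `build_huffman_tree` is made up, and comparing the full character string of a parent node to break ties is an assumption (the slides only state the rule for characters). The input is assumed to contain at least one character.

```python
def build_huffman_tree(freq):
    """Return (root, merges); each node is a (chars, weight, left, right) tuple."""
    # The statistics table starts with one leaf entry per character.
    table = [(char, weight, None, None) for char, weight in freq.items()]
    merges = []
    while len(table) > 1:
        # Step 1: find the two lowest weights (ties: smaller characters first).
        table.sort(key=lambda node: (node[1], node[0]))
        x, y = table[0], table[1]
        # Step 2: x (the lighter node) becomes the left child, y the right child.
        z = (x[0] + y[0], x[1] + y[1], x, y)
        # Steps 3 and 4: remove x and y from the table, then add z.
        table = table[2:] + [z]
        merges.append((z[0], z[1]))
    return table[0], merges


root, merges = build_huffman_tree({"A": 10, "B": 8, "C": 6, "D": 5, "E": 2})
print(merges)  # [('ED', 7), ('CED', 13), ('BA', 18), ('CEDBA', 31)]
```

    Re-sorting the table on every pass keeps the code close to the steps as written; a priority queue such as Python's `heapq` would be the usual choice for larger alphabets.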

    Initial statistics table:

    Character   Frequency
    A           10
    B           8
    C           6
    D           5
    E           2

    After merging E and D:

    Character   Frequency
    A           10
    B           8
    ED          7
    C           6

    After merging C and ED:

    Character   Frequency
    CED         13
    A           10
    B           8

    After merging B and A:

    Character   Frequency
    BA          18
    CED         13

    After merging CED and BA:

    Character   Frequency
    CEDBA       31

    The bit code of each character is the path from the root of the Huffman tree to that character's leaf node.

    Method:

    A 0 bit is produced when the path follows a left branch.

    A 1 bit is produced when the path follows a right branch.

    Character   Code
    A           11
    B           10
    C           00
    D           011
    E           010
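    The code table can be derived by walking the tree and recording 0 for each left branch and 1 for each right branch. In the sketch below the example tree is written out by hand as nested pairs (a leaf is a character, an internal node is a `(left, right)` pair); this representation and the name `assign_codes` are assumptions made for illustration.

```python
# Hand-written copy of the example tree: CEDBA -> (CED, BA), CED -> (C, ED), ...
tree = (("C", ("E", "D")), ("B", "A"))


def assign_codes(node, prefix="", table=None):
    """Walk the tree, appending 0 for a left branch and 1 for a right branch."""
    if table is None:
        table = {}
    if isinstance(node, str):   # leaf: the path taken so far is this character's code
        table[node] = prefix
        return table
    left, right = node
    assign_codes(left, prefix + "0", table)
    assign_codes(right, prefix + "1", table)
    return table


codes = assign_codes(tree)
print(codes)  # {'C': '00', 'E': '010', 'D': '011', 'B': '10', 'A': '11'}
```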

    Scan the file to be compressed.

    Replace every character in the file with its corresponding bit code.
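    The substitution step is a table lookup per character. The sketch below keeps the result as a string of '0' and '1' characters for readability; packing those bits into bytes for an actual file is left out and would need a padding convention that the slides do not specify.

```python
# Code table from the slides.
codes = {"A": "11", "B": "10", "C": "00", "D": "011", "E": "010"}
data = "ADDAABBCCBAAABBCCCBBBCDAADDEEAA"

# Replace every character with its bit code and join the results.
bits = "".join(codes[char] for char in data)

print(len(bits))  # 69
```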

    The stored information serves decompression.

    What can be stored:

    The Huffman tree

    The frequency table

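    Restore the Huffman tree from the stored information.

    Repeat:

    Start from the root of the Huffman tree.

    Read the compressed file one bit at a time.

    If the bit is 0: follow the left branch.

    If the bit is 1: follow the right branch.

    On reaching a leaf node: output the character at that leaf node.

    Until the data is exhausted.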
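    The decoding loop above can be sketched as follows, again with the example tree written by hand as nested `(left, right)` pairs and the compressed data held as a string of '0' and '1' characters. Both representations, and the name `decode`, are assumptions for illustration.

```python
# Hand-written copy of the example tree; a leaf is a character.
tree = (("C", ("E", "D")), ("B", "A"))


def decode(bits, root):
    """Follow one branch per bit; emit a character at each leaf, then restart at the root."""
    output = []
    node = root
    for bit in bits:
        node = node[0] if bit == "0" else node[1]   # 0: left branch, 1: right branch
        if isinstance(node, str):                   # reached a leaf node
            output.append(node)
            node = root
    return "".join(output)


print(decode("11011011", tree))  # ADD
```

    Because no code is a prefix of another, the decoder never needs separators between codes.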

    Is it possible to avoid storing the Huffman tree or the frequency table in the compressed file?

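    Gather statistics in advance over a large body of data and precompute the Huffman tree for both the encoder and the decoder.

    Advantages:

    Reduces the size of the compressed file.

    Removes the cost of scanning the file to build the frequency table.

    Disadvantage:

    Compression is less effective when the data differs in kind from the data the statistics were gathered on.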
