Harvard University CSCI E-2a: Life, Liberty, and Happiness After the Digital Explosion. 3A: Data Representation

1

Harvard University CSCI E-2a

Life, Liberty, and Happiness After the Digital Explosion

3A: Data Representation

2

Representation

• How do you represent “things” with bits?
• Anything
• Text
• Documents
• Pictures
• Sounds

3

Representation

• How do you represent “things” with bits?
• Why does representation matter?

Power

Money

4

Bits (“Binary digITs”)

• There are two bits: 0 and 1
• Everything else is a sequence of bits
• I.e. a “bit string”
• 0010101, 111100010100101011

5

Digital representations are, by definition, approximations. They leave out a lot.

7

Representing Things

Harry
Ken
Tyler
Sue
Xing

8

Representing Things

Harry  1
Ken    2
Tyler  3
Sue    4
Xing   5

9

Representing Things

Harry  1  0
Ken    2  1
Tyler  3  ?
Sue    4
Xing   5

10

Representing Things

Harry  1  0  00
Ken    2  1  01
Tyler  3  ?  10
Sue    4     11
Xing   5     ??

11

Representing Things

Harry  1  0  00  000
Ken    2  1  01  001
Tyler  3  ?  10  010
Sue    4     11  011
Xing   5     ??  100

12

How many bits for “n” things

• Each bit doubles the number
• 1 bit = 2, 2 bits = 4, 3 bits = 8
• n bits = 2^n things
• 10 bits = 1024 etc.

Another example of exponential growth
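The doubling rule can be turned around to ask how many bits a given number of things requires. The sketch below is one possible way to compute that in Python; the function names `bits_needed` and `assign_codes` are illustrative and not part of the slides, and the codes are simply handed out in list order.

```python
def bits_needed(n_things: int) -> int:
    """Smallest number of bits b such that 2**b >= n_things."""
    if n_things < 1:
        raise ValueError("need at least one thing")
    # (n - 1).bit_length() is the width of the largest code, n - 1.
    return (n_things - 1).bit_length()


def assign_codes(things):
    """Give each thing a fixed-width bit string, in list order."""
    width = max(bits_needed(len(things)), 1)
    return {thing: format(i, f"0{width}b") for i, thing in enumerate(things)}


names = ["Harry", "Ken", "Tyler", "Sue", "Xing"]
codes = assign_codes(names)
for name, code in codes.items():
    print(name, code)
```

With five names, two bits run out after four codes, so three bits are needed and Xing receives 100.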

13

How many bits does it take to represent the 2007 Red Sox season?

14

Red Sox 2007

• 162 games = 162 bits (96 “1”s and 66 “0”s)
• Add 4 bits / game for opponent
• Add inning results
• Add at-bat results
• Add pitch details
• Stop when you’ve had enough
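The first two levels of detail in the list above can be counted directly. This is a rough sketch only: the helper `season_bits` is a made-up name, and the bit string below puts all wins first purely for illustration, since the slides do not give the actual game-by-game order.

```python
WINS, LOSSES = 96, 66      # win/loss counts from the slide
GAMES = WINS + LOSSES      # 162 games
OPPONENT_BITS = 4          # per-game opponent code from the slide


def season_bits(games: int, extra_bits_per_game: int = 0) -> int:
    """One win/loss bit per game, plus any extra bits recorded per game."""
    return games * (1 + extra_bits_per_game)


# Hypothetical ordering: the real season interleaved wins and losses.
outcomes = "1" * WINS + "0" * LOSSES

print(len(outcomes))                       # 162
print(season_bits(GAMES))                  # 162
print(season_bits(GAMES, OPPONENT_BITS))   # 810
```

Each further level of detail (innings, at-bats, pitches) adds more bits per game in the same way.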

15

Representing Text

• 8 bits per character
• “A” = 01000001
• “(” = 00101000
• How many combinations of 8 bits?

2 · 2 · 2 · 2 · 2 · 2 · 2 · 2 = 2^8 = 256
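The two character codes on this slide can be reproduced with Python's built-in `ord` and `format`. The helper names `char_to_bits` and `bits_to_char` are illustrative choices, and the sketch assumes characters whose code fits in 8 bits.

```python
def char_to_bits(ch: str) -> str:
    """Return the 8-bit pattern for a single character."""
    code = ord(ch)
    if code > 0xFF:
        raise ValueError("character does not fit in 8 bits")
    return format(code, "08b")


def bits_to_char(bits: str) -> str:
    """Interpret a bit string as a character code."""
    return chr(int(bits, 2))


print(char_to_bits("A"))         # 01000001
print(char_to_bits("("))         # 00101000
print(bits_to_char("01000001"))  # A
print(2 ** 8)                    # 256
```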

16

Hexadecimal Digits

0000 0001 0010 0011 0100 0101 0110 0111
0    1    2    3    4    5    6    7

1000 1001 1010 1011 1100 1101 1110 1111
8    9    A    B    C    D    E    F
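Because each hexadecimal digit stands for exactly four bits, a longer bit string can be rewritten four bits at a time. The following is a minimal sketch; `bits_to_hex` is an illustrative name, and it assumes the input length is a multiple of four.

```python
def bits_to_hex(bits: str) -> str:
    """Convert a bit string to hex, one digit per group of four bits."""
    if len(bits) % 4 != 0:
        raise ValueError("bit string length must be a multiple of 4")
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "".join(format(int(group, 2), "X") for group in groups)


print(bits_to_hex("1010"))      # A
print(bits_to_hex("01000001"))  # 41
```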

17

ASCII: American Standard Code for Information Interchange

Character represented by hex xy, e.g. 4B is “K”

x\y 0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
0   (control characters)
1   (control characters)
2   sp !  "  #  $  %  &  '  (  )  *  +  ,  -  .  /
3   0  1  2  3  4  5  6  7  8  9  :  ;  <  =  >  ?
4   @  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O
5   P  Q  R  S  T  U  V  W  X  Y  Z  [  \  ]  ^  _
6   `  a  b  c  d  e  f  g  h  i  j  k  l  m  n  o
7   p  q  r  s  t  u  v  w  x  y  z  {  |  }  ~  del
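The row-and-column lookup in the ASCII chart amounts to reading a two-digit hex number as a character code. A small sketch of that lookup in both directions follows; the function names are illustrative only.

```python
def ascii_from_hex(xy: str) -> str:
    """Look up the ASCII character for a two-digit hex code such as '4B'."""
    code = int(xy, 16)
    if not 0 <= code <= 0x7F:
        raise ValueError("ASCII codes run from 00 to 7F")
    return chr(code)


def hex_from_ascii(ch: str) -> str:
    """Return the two-digit hex code (row x, column y) for a character."""
    code = ord(ch)
    if code > 0x7F:
        raise ValueError("not an ASCII character")
    return format(code, "02X")


print(ascii_from_hex("4B"))  # K
print(hex_from_ascii("K"))   # 4B
print(hex_from_ascii("("))   # 28
```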

18

ASCII Underneath

• Emails
• Web pages

19

What if you need more than 256 characters?

• Unicode
• 32 bits per character in the UTF-32 encoding (32 bits allow roughly 4 billion different values, although Unicode defines only about 1.1 million code points)
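To see a character that does not fit in 8 bits, the sketch below encodes the euro sign with UTF-32, a standard Unicode encoding that uses a fixed 32 bits per character. The choice of the euro sign and the variable names are illustrative.

```python
ch = "\u20ac"                    # the euro sign, chosen as an example
code_point = ord(ch)             # 8364, too large for one byte

# "utf-32-be" gives four big-endian bytes with no byte-order mark.
utf32 = ch.encode("utf-32-be")
bits = "".join(format(byte, "08b") for byte in utf32)

print(code_point > 255)   # True
print(len(utf32))         # 4
print(bits)
print(2 ** 32)            # 4294967296
```

Reading the 32 bits back as a single number gives the original code point.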

23

What about documents?

Representation + Interpretation
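One way to see why interpretation matters is to read the same bytes in more than one way. The four bytes below are a made-up example, not taken from any real document format; the sketch only shows that the bits alone do not say what they mean.

```python
data = bytes([0x48, 0x69, 0x21, 0x0A])   # hypothetical 4-byte "document"

as_text = data.decode("ascii")            # interpreted as ASCII characters
as_number = int.from_bytes(data, "big")   # interpreted as one 32-bit integer
as_hex = data.hex()                       # shown as raw hex digits

print(repr(as_text))   # 'Hi!\n'
print(as_number)
print(as_hex)          # 4869210a
```

The representation is identical in all three cases. Only the interpretation applied by the reading program changes.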

24

Word Processors

602PC Suite, AppleWorks, Applix Word, Atlantis Ocean Mind, EasyWord, FrameMaker, Han/Gul, Lotus Word Pro, Mellel, Microsoft Word, Nisus Writer, Pages, Papyrus, PolyEdit, StarOffice, TextMaker, WordExpress, WordPerfect, Hieroglyph, Jarte, Madhyam, Amí, AtariWriter, Bravo, Bank Street Writer, DeskMate, DisplayWrite, Document Editor, EasyWriter, FullWrite Professional, geoWrite, Gypsy, lexicon, LocoScript, MacWrite, Magic Wand, MindWrite, MultiMate, PaperClip, pfs:Write, Protext, SpeedScript, Sprint, Taste, TJ-2, Transwrite, WordMARC, WordStar, Wordsworth, WriteNow, XyWrite

25

Why not pick one?

28

Why representation matters

• Loss of data and the inability to exchange it are the primary deterrents to switching vendors.
• Control the representation and you control what can be seen and what can be done.