1 Information Representation in Computer Lecture Nine



Information Representation in Computer

Lecture Nine


Outline

Everything is a number (in a computer; cf. Pythagoras)
Binary number system
Negative integers
Floating point numbers (reals)
Accuracy and limitations of “computer numbers”
ASCII code for characters
Code for programs, graphics, music, etc.


Bit, Byte, and Word

[Figure: example bit patterns. A bit: 0. A byte: 0110 1100. A 32-bit word: 01110110 01111001 11100011 00000100.]

A bit is the unit of storage that holds one digit of a binary number, 0 or 1.

A byte is 8 bits, so it can store eight 0’s or 1’s.

A word is either 32 or 64 bits, depending on the computer. Regular PCs have a 32-bit word; higher-end workstations have a 64-bit word. The word size is the size of the registers.

What these bits mean is a matter of interpretation! All information in a computer is represented in a uniform format of bit patterns.
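The point about interpretation can be sketched in C. The helper names `as_signed` and `as_float` are made up for this illustration, and the sketch assumes a platform where `float` is a 32-bit IEEE 754 single:

```c
#include <stdint.h>
#include <string.h>

/* Read the same 32 bits as a signed integer. memcpy copies the bytes
   unchanged, so only the interpretation differs. */
int32_t as_signed(uint32_t bits) {
    int32_t v;
    memcpy(&v, &bits, sizeof v);
    return v;
}

/* Read the same 32 bits as a floating point number.
   Assumption: float is 32 bits wide and follows IEEE 754. */
float as_float(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

On such a platform the pattern 0x3F800000 is 1065353216 as an unsigned or signed integer and 1.0 as a float.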


Binary Nonnegative Integers

Given a 32-bit pattern, e.g.,

0000 0000 … 0000 1101 1100,

it can represent the integer (if you interpret it that way)

$$n = a_0 2^0 + a_1 2^1 + a_2 2^2 + \cdots + a_{31} 2^{31} = \sum_{k=0}^{31} a_k 2^k, \qquad a_k \in \{0, 1\},$$

where $a_0$ is the lowest bit and $a_{31}$ is the highest bit.

Note that with a 32-bit word, the smallest number is 0 and the largest number is 11111111 11111111 11111111 11111111, which is $2^{32}-1 = 4294967295$. Numbers bigger than this cannot be represented. If a calculation produces one, we say it has overflowed. This interpretation is called unsigned int in the programming language C (32 bits wide on most current platforms).
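A minimal sketch of the positional formula and of overflow, in C. The function names `bits_to_uint` and `add_wrap` are hypothetical, and the fixed-width type `uint32_t` stands in for the 32-bit word of the slide:

```c
#include <stdint.h>

/* Value of a bit string such as "1101 1100": the sum of a_k * 2^k.
   Characters other than '0' and '1' (e.g. spaces) are skipped. */
uint32_t bits_to_uint(const char *bits) {
    uint32_t n = 0;
    for (const char *p = bits; *p != '\0'; ++p) {
        if (*p == '0' || *p == '1') {
            n = n * 2u + (uint32_t)(*p - '0');  /* shift in the next bit */
        }
    }
    return n;
}

/* Unsigned addition in C wraps around modulo 2^32 when it overflows. */
uint32_t add_wrap(uint32_t a, uint32_t b) {
    return (uint32_t)(a + b);
}
```

The pattern ending in 1101 1100 evaluates to 220, and adding 1 to the largest value 4294967295 wraps around to 0.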


Negative Integers

To represent negative numbers, we agree that the highest bit $a_{31}$ represents the sign rather than magnitude: 0 for positive, 1 for negative numbers.

More precisely, all modern computers use the 2’s complement representation.

The rule to get the representation of a negative number is: first write out the bit pattern of the corresponding positive number, complement all bits, then add 1.
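The rule translates almost literally into C. This is a sketch with a hypothetical function name, working on the raw 32-bit pattern:

```c
#include <stdint.h>

/* 2's complement of a 32-bit pattern: complement all bits, then add 1. */
uint32_t twos_complement(uint32_t n) {
    return (uint32_t)(~n + 1u);
}
```

For example, `twos_complement(1)` gives 0xFFFFFFFF (all ones) and `twos_complement(125)` gives 0xFFFFFF83.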


Property of 2’s Complement Numbers

The two’s complement m of a number n is such that adding the two together gives zero ($m + n = 0$ modulo $2^w$, where $w$ is the word length in bits; $w = 32$ here).

Thus m is interpreted as the negative of n.

The key point is that the computer has a finite word length, so the last carry is thrown away.


2’s Complement, example

1 in bits is 00000000 00000000 00000000 00000001

complementing all bits, we get 11111111 11111111 11111111 11111110

Add 1, we get representation for –1, as 11111111 11111111 11111111 11111111

                                        Decimal
  00000000 00000000 00000000 00000001         1
+ 11111111 11111111 11111111 11111111        –1
-------------------------------------
1 00000000 00000000 00000000 00000000         0

The leading 1 is the overflow (carry) bit, which is not kept by the computer.


2’s Complement, another example

125 in bits is 00000000 00000000 00000000 01111101

Complementing all bits, we get 11111111 11111111 11111111 10000010

Adding 1, we get the representation of –125: 11111111 11111111 11111111 10000011

                                        Decimal
  00000000 00000000 00000000 01111111       127
+ 11111111 11111111 11111111 10000011      –125
-------------------------------------
1 00000000 00000000 00000000 00000010        +2

The leading 1 is the overflow (carry) bit, which is not kept by the computer.
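To make the discarded carry visible, the addition can be redone in a wider accumulator. This is only an illustration (the name `add32` is made up); real hardware simply drops the carry or records it in a status flag:

```c
#include <stdint.h>

/* Add two 32-bit patterns in a 64-bit accumulator so that the carry out of
   bit 31 can be inspected. The return value keeps only the low 32 bits,
   which is what a 32-bit machine stores. */
uint32_t add32(uint32_t a, uint32_t b, unsigned *carry_out) {
    uint64_t wide = (uint64_t)a + (uint64_t)b;
    *carry_out = (unsigned)(wide >> 32);   /* 0 or 1 */
    return (uint32_t)wide;
}
```

Adding the patterns for 127 and –125 (0x0000007F and 0xFFFFFF83) returns 2 with the carry set to 1.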


Dual Interpretation of Bit Patterns

A byte (or char in C)

Bit pattern    Unsigned char    Signed char
0000 0000            0               0
0000 0001            1               1
0000 0010            2               2
0000 0011            3               3
0000 0100            4               4
0000 0101            5               5
0000 0110            6               6
. . .
0111 1111          127             127
1000 0000          128            –128
1000 0001          129            –127
1000 0010          130            –126
1000 0011          131            –125
. . .
1111 1100          252              –4
1111 1101          253              –3
1111 1110          254              –2
1111 1111          255              –1
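The two readings of a byte can be reproduced in C. The function names are hypothetical; `int8_t` is used because the C standard defines it as an 8-bit 2’s complement type:

```c
#include <stdint.h>
#include <string.h>

/* Read a byte as an unsigned value, 0 to 255. */
int as_unsigned_char(uint8_t byte) {
    return (int)byte;
}

/* Read the same byte as a signed 2's complement value, -128 to 127.
   memcpy keeps the bit pattern and only changes the type it is read as. */
int as_signed_char(uint8_t byte) {
    int8_t s;
    memcpy(&s, &byte, sizeof s);
    return (int)s;
}
```

For the pattern 1000 0011 (0x83) the first function returns 131 and the second returns –125, matching the table.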


Signed and Unsigned Int

If we interpret the number as unsigned, an unsigned integer takes the range 0 to $2^{32}-1$ (that is, 000…000 to 1111…1111).

If we interpret the number as a signed value, the range is $-2^{31}$ to $2^{31}-1$ (that is, from 1000…000 up through 1111…111, which is –1, then 0, and on up to 0111…1111).

Who decides which interpretation to take? What if we need numbers bigger than $2^{32}$?
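One possible answer in C, offered as a sketch: the programmer picks the interpretation through the declared type, and values that do not fit in 32 bits can go in a 64-bit type such as `int64_t`. The helper names below are made up for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

/* Does v fit in the unsigned 32-bit range 0 .. 2^32 - 1? */
bool fits_unsigned32(int64_t v) {
    return v >= 0 && v <= (int64_t)UINT32_MAX;
}

/* Does v fit in the signed 32-bit range -2^31 .. 2^31 - 1? */
bool fits_signed32(int64_t v) {
    return v >= INT32_MIN && v <= INT32_MAX;
}
```

The constants `UINT32_MAX`, `INT32_MIN` and `INT32_MAX` come from the standard header `<stdint.h>`.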


Addition, Multiplication, and Division in Binary

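Addition: 19 + 37 = 56

  0 0 0 1 0 0 1 1      (19)
+ 0 0 1 0 0 1 0 1      (37)
-----------------
  0 0 1 1 1 0 0 0      (56)

Multiplication: 7 × 5 = 35

      0 1 1 1          (7)
×     0 1 0 1          (5)
-------------
      0 1 1 1
+ 0 1 1 1              (7 shifted left by two places)
-------------
  1 0 0 0 1 1          (35)

Division: 74 ÷ 8 = 9 remainder 2

              1 0 0 1      (quotient, 9)
1 0 0 0 ) 1 0 0 1 0 1 0    (74)
          1 0 0 0
          -------
                1 0 1 0
                1 0 0 0
                -------
                    1 0    (remainder, 2)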
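The pencil-and-paper multiplication and division above correspond to shift-and-add and shift-and-subtract loops. The sketch below (hypothetical names, 32-bit operands, divisor assumed nonzero) is meant to mirror the worked examples, not to describe how any particular processor implements these operations:

```c
#include <stdint.h>

/* Multiply by adding a shifted copy of a for every 1 bit of b. */
uint32_t mul_shift_add(uint32_t a, uint32_t b) {
    uint32_t product = 0;
    while (b != 0) {
        if (b & 1u) {
            product += a;   /* add the current partial product */
        }
        a <<= 1;            /* next partial product is shifted left */
        b >>= 1;
    }
    return product;
}

/* Long division: bring down one bit of n at a time and subtract d
   whenever it fits. Assumes d != 0. */
uint32_t div_long(uint32_t n, uint32_t d, uint32_t *rem) {
    uint64_t r = 0;         /* running remainder, wide enough to shift */
    uint32_t q = 0;
    for (int i = 31; i >= 0; --i) {
        r = (r << 1) | ((n >> i) & 1u);
        if (r >= d) {
            r -= d;
            q |= (uint32_t)1 << i;   /* quotient bit is 1 */
        }
    }
    *rem = (uint32_t)r;
    return q;
}
```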


Floating Point Numbers (reals)

To represent numbers like 0.5, 3.1415926, etc., we need to do something else. First, we need to represent them in binary, as

$$n = a_m 2^m + \cdots + a_2 2^2 + a_1 2^1 + a_0 2^0 + a_{-1} 2^{-1} + a_{-2} 2^{-2} + a_{-3} 2^{-3} + \cdots = \sum_{k \le m} a_k 2^k, \qquad a_k \in \{0, 1\}.$$

E.g., $11.00110_2$ stands for $2 + 1 + 1/8 + 1/16 = 3.1875$.

Next, we need to rewrite the number in scientific notation, as $1.100110_2 \times 2^1$. That is, the number will be written in the form

$$1.xxxxxx\ldots \times 2^e, \qquad x = 0 \text{ or } 1.$$
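The normalization step can be sketched as repeated halving or doubling until the leading digit is 1. The function name `normalize` is an assumption of this example, and it handles positive inputs only:

```c
/* Rewrite a positive x as m * 2^e with 1 <= m < 2.
   Returns m and stores e through the pointer.
   Non-positive inputs are not handled and return 0 with e = 0. */
double normalize(double x, int *e) {
    int exp2 = 0;
    if (x <= 0.0) {
        *e = 0;
        return 0.0;
    }
    while (x >= 2.0) { x /= 2.0; ++exp2; }   /* too large: halve */
    while (x < 1.0)  { x *= 2.0; --exp2; }   /* too small: double */
    *e = exp2;
    return x;
}
```

For 3.1875 this gives m = 1.59375 (binary 1.100110) and e = 1, in agreement with the example above.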


Real Numbers on Computer

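$$\pm\left(d_0 + d_1 \beta^{-1} + \cdots + d_{p-1} \beta^{-(p-1)}\right) \beta^{e}, \qquad d_0 \ne 0, \quad 0 \le d_i < \beta, \quad e_{\min} \le e \le e_{\max}$$

Example for β = 2, p = 3, e_min = –1, e_max = 2:

[Figure: the representable values marked on a number line labeled 0, 0.5, 1, 2, 3, 4, 5, 6, 7; ε marks the gap between neighbouring values.]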

On modern computers the base β is exclusively 2.
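The small example system (β = 2, p = 3, exponents from –1 to 2) is small enough to list in full. The sketch below enumerates its positive values under the assumption that the leading digit is always 1; the function name is made up:

```c
/* Fill out[] with every positive value of the toy system
   beta = 2, p = 3, e_min = -1, e_max = 2 (leading digit d0 = 1).
   out must have room for 16 values. Returns the number of values. */
int toy_floats(double *out) {
    const int p = 3, emin = -1, emax = 2;
    int count = 0;
    for (int e = emin; e <= emax; ++e) {
        double scale = 1.0;                        /* scale = 2^e */
        for (int i = 0; i < e; ++i) scale *= 2.0;
        for (int i = 0; i > e; --i) scale /= 2.0;
        /* significands 1.00, 1.01, 1.10, 1.11 are 4/4, 5/4, 6/4, 7/4 */
        for (int s = 1 << (p - 1); s < (1 << p); ++s) {
            out[count++] = (double)s / (1 << (p - 1)) * scale;
        }
    }
    return count;
}
```

The 16 values run from 0.5 to 7, with the gap between neighbours doubling at each power of two (0.125 below 1, then 0.25, 0.5 and 1).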


IEEE 754 standard

The IEEE standard for floating point numbers specifies the 32-bit single precision format (float in C) as follows:

The highest bit, bit 31 ($a_{31}$), is used for the sign: 0 for positive, 1 for negative.

The 8 bits from position 23 to 30 store an integer, the biased exponent $e + 127$.

The remaining 23 bits are used for the significand.

bit 31 | sign (1 bit) | exponent (8 bits) | significand (23 bits) | bit 0


IEEE Floating Point

bit 31 | sign: a31 | exponent: e7 e6 … e1 e0 | significand: s-1 s-2 s-3 … s-23 | bit 0

Bit 31 ($a_{31}$) specifies the sign.

Bits 23 to 30 ($e_7, e_6, \ldots, e_1, e_0$), interpreted as an 8-bit unsigned integer, specify the biased exponent $E = e + 127$.

Bits 22 down to 0 ($s_{-1}, s_{-2}, \ldots, s_{-22}, s_{-23}$) specify the fractional part of the number in scientific notation. The value of the pattern is:

$$(-1)^{a_{31}} \left(1 + s_{-1}/2 + s_{-2}/4 + s_{-3}/8 + \cdots + s_{-k} 2^{-k} + \cdots + s_{-23} 2^{-23}\right) \times 2^{E-127}$$

However, the number +0 is a string of all zeros.


Example of Floating Point Number

sign | exponent | significand (bit 31 on the left, bit 0 on the right)

0 | 01111111 | 0000000 00000000 00000000

The sign is positive and the biased exponent is $E = 127 = e + 127$, so the original exponent $e$ is 0. The significand $f$ is 0. Thus the pattern represents $+(1.0 + 0) \times 2^0 = 1.0$.

1 | 01111110 | 1010000 00000000 00000000

The sign is negative and the biased exponent is $E = 126 = e + 127$, so the original exponent $e$ is –1. The significand is $f = 1/2 + 1/8$. Thus the pattern represents $-(1.0 + f) \times 2^{-1} = -0.8125$.
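The two decodings above can be checked with a short C routine that follows the value formula one term at a time. It is a sketch for normalized numbers only (0 < E < 255), and the function name is hypothetical:

```c
#include <stdint.h>

/* Decode a normalized IEEE single from its 32-bit pattern:
   (-1)^sign * (1 + sum of s_{-k} * 2^{-k}) * 2^(E - 127).
   Special cases (E = 0 and E = 255) are not handled here. */
double decode_single(uint32_t bits) {
    uint32_t sign = bits >> 31;
    int E = (int)((bits >> 23) & 0xFFu);
    double significand = 1.0;        /* the implicit leading 1 */
    double weight = 0.5;             /* 2^{-k}, starting at k = 1 */
    for (int k = 1; k <= 23; ++k) {
        if ((bits >> (23 - k)) & 1u) {
            significand += weight;   /* s_{-k} is stored in bit 23 - k */
        }
        weight /= 2.0;
    }
    double value = significand;
    for (int i = 0; i < E - 127; ++i) value *= 2.0;
    for (int i = 0; i > E - 127; --i) value /= 2.0;
    return sign ? -value : value;
}
```

In hexadecimal the two patterns are 0x3F800000 and 0xBF500000; the routine returns 1.0 and –0.8125 for them.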


Encoding to IEEE Format

What is the bit pattern of the real number 10.25 in IEEE single precision (32-bit) format?

Answer: first, we need to convert the number to binary.

We can consider separately the integer part 10 and the fractional part 0.25. For 10, we divide by 2 repeatedly and record the remainders:

10 = 5×2 + 0
5 = 2×2 + 1
2 = 1×2 + 0
1 = 0×2 + 1

Reading the remainders from last to first, $10_{10} = 1010_2$.

For 0.25, we multiply by 2 (repeatedly) and take the integer part as the digit:

2×0.25 = 0.5 -> 0
2×0.5 = 1.0 -> 1
2×0.0 = 0.0 -> 0

So the fractional part is $0.25_{10} = 0.010000_2$.

Thus $10.25_{10} = 1010.01_2 = 1.01001_2 \times 2^3$.


10.25 in bits

$10.25_{10} = 1.01001_2 \times 2^3$

Thus the sign is positive, $f = .01001$, $e = 3$, and $E = e + 127 = 130 = 10000010_2$. The bit pattern is

sign | exponent (E = e + 127) | significand (f), with bit 31 on the left and bit 0 on the right

0 | 10000010 | 0100100 00000000 00000000
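On a machine whose `float` is an IEEE 754 single (an assumption, though true of common platforms), the hand encoding can be compared with what the compiler stores. The helper names are made up for this sketch:

```c
#include <stdint.h>
#include <string.h>

/* Return the raw 32-bit pattern of a float.
   Assumption: float is a 32-bit IEEE 754 single. */
uint32_t float_bits(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Split the pattern into the three fields of the format. */
unsigned sign_field(uint32_t bits)     { return (unsigned)(bits >> 31); }
unsigned exponent_field(uint32_t bits) { return (unsigned)((bits >> 23) & 0xFFu); }
uint32_t fraction_field(uint32_t bits) { return bits & 0x7FFFFFu; }
```

For 10.25f the pattern is 0x41240000, with sign 0, exponent field 130 and fraction field 0x240000 (binary 0100100 followed by zeros).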


IEEE 754 (32-bit) Summary

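The bit pattern

s | E (8 bits) | b-1 b-2 … b-23

with

$$f = b_{-1} 2^{-1} + b_{-2} 2^{-2} + \cdots + b_{-23} 2^{-23}$$

represents the following, where s is the sign bit and E is the stored 8-bit exponent field (the biased exponent, an unsigned integer from 0 to 255):

If E = 0: $(-1)^s \times f \times 2^{-126}$

If 0 < E < 255: $(-1)^s \times (1 + f) \times 2^{E-127}$

If E = 255: $+\infty$ or $-\infty$ when f = 0, and NaN (not a number) when f ≠ 0.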
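The case analysis in the summary can be written as a small classifier over the raw bit pattern. The enum and function names below are invented for this sketch:

```c
#include <stdint.h>

typedef enum {
    KIND_ZERO,        /* E = 0 and f = 0 */
    KIND_DENORMAL,    /* E = 0 and f != 0: value is f * 2^-126 */
    KIND_NORMAL,      /* 0 < E < 255: value is (1 + f) * 2^(E-127) */
    KIND_INFINITY,    /* E = 255 and f = 0 */
    KIND_NAN          /* E = 255 and f != 0 */
} single_kind;

/* Classify a 32-bit single precision pattern by its exponent and
   fraction fields. The sign bit does not affect the class. */
single_kind classify_single(uint32_t bits) {
    uint32_t E = (bits >> 23) & 0xFFu;
    uint32_t f = bits & 0x7FFFFFu;
    if (E == 0)   return f == 0 ? KIND_ZERO : KIND_DENORMAL;
    if (E == 255) return f == 0 ? KIND_INFINITY : KIND_NAN;
    return KIND_NORMAL;
}
```

Zero appears here as the f = 0 instance of the E = 0 case, which is why the all-zeros pattern represents +0.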


Limitations in 32-bit Integer and Floating Point Numbers

Limited range of values (e.g., integers only from $-2^{31}$ to $2^{31}-1$)

Limited resolution for real numbers. E.g., if x is a machine representable value, the next value is x + ε (for some small ε). There is no value in between. This causes “floating point errors” in calculations. The accuracy of a single precision floating point number is about 6 to 7 significant decimal digits.
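The gap ε between neighbouring values can be observed by stepping the bit pattern of a float by one. This sketch assumes a 32-bit IEEE 754 `float` and a positive, finite input; the function name is made up:

```c
#include <stdint.h>
#include <string.h>

/* Next representable float above a positive finite x, obtained by adding 1
   to its bit pattern. Assumption: float is a 32-bit IEEE 754 single. */
float next_above(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits += 1u;
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

Near 1.0 the gap is $2^{-23}$ (about 1.19e-07), while near 16777216 ($2^{24}$) the next value is 16777218, so odd integers in that range cannot be stored.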


Limitations of Single Precision Numbers

Given the representation of the single precision floating point number format, what is the largest magnitude possible? What is the smallest number possible?

With floating point numbers, it can happen that 1 + ε = 1. What is the largest such ε?
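One way to explore these questions experimentally is sketched below. The standard header `<float.h>` provides `FLT_MAX` (largest finite float), `FLT_MIN` (smallest positive normalized float) and `FLT_EPSILON` (gap between 1 and the next float). The search function is hypothetical and only tries powers of two, under the default round-to-nearest mode:

```c
#include <float.h>

/* Halve eps until adding it to 1 no longer changes the sum, and return
   that eps. The sum is stored in a volatile float so that any extra
   precision held in registers is dropped before the comparison. */
float largest_absorbed_power_of_two(void) {
    volatile float sum;
    float eps = 1.0f;
    for (;;) {
        sum = 1.0f + eps;
        if (sum == 1.0f) {
            return eps;
        }
        eps /= 2.0f;
    }
}
```

With IEEE single precision the search stops at $2^{-24}$, which is half of `FLT_EPSILON`.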


Double Precision

An IEEE double precision number takes twice the number of bits of a single precision number, 64 bits:

1 bit for the sign
11 bits for the exponent
52 bits for the significand

Double precision has about 15 significant decimal digits of accuracy.
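A quick way to see the difference in accuracy is to test which integers survive a round trip through each type. The helper names are assumptions of this sketch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Is n stored exactly when converted to float (24 significant bits)?
   volatile forces the value to be rounded to float precision. */
bool exact_in_float(int32_t n) {
    volatile float f = (float)n;
    return (int64_t)f == (int64_t)n;
}

/* Is n stored exactly when converted to double (53 significant bits)? */
bool exact_in_double(int32_t n) {
    volatile double d = (double)n;
    return (int64_t)d == (int64_t)n;
}
```

The integer 16777217 ($2^{24} + 1$) is the first positive integer that a float cannot hold exactly, whereas a double holds every 32-bit integer exactly.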


Floating Point Rounding Error

Consider floating point addition with a 4-bit mantissa:

$1.010_2 \times 2^2 + 1.101_2 \times 2^{-1}$

  1.010000 × 2^2
+ 0.001101 × 2^2     (shift the exponent to that of the larger number)
----------------
  1.011101 × 2^2
  1.100    × 2^2     (round to 4 bits)

In decimal, this means 5.0 + 0.8125 ≈ 6.0 (the exact sum is 5.8125).
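The rounding step can be imitated in C by keeping only a given number of significant bits. This is a rough sketch for positive values: the function name is made up, and it rounds halves upward for simplicity, whereas IEEE arithmetic rounds ties to even by default:

```c
/* Round a positive x to `bits` significant binary digits.
   Simplification: exact halves are rounded up. */
double round_to_bits(double x, int bits) {
    if (x <= 0.0) {
        return 0.0;
    }
    int e = 0;
    while (x >= 2.0) { x /= 2.0; ++e; }    /* normalize to 1 <= x < 2 */
    while (x < 1.0)  { x *= 2.0; --e; }
    double scale = 1.0;
    for (int i = 1; i < bits; ++i) scale *= 2.0;    /* 2^(bits-1) */
    double rounded = (double)(long long)(x * scale + 0.5) / scale;
    for (int i = 0; i < e; ++i) rounded *= 2.0;     /* undo the scaling */
    for (int i = 0; i > e; --i) rounded /= 2.0;
    return rounded;
}
```

Both operands, 5.0 and 0.8125, fit in 4 bits exactly, but their sum 5.8125 rounds to 6.0.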


Representation of Characters, the ASCII code

0 1 2 3 4 5 6 7 8 9 A B C D E F

0 NUL SOH STX ETX EOT ENQ ACK BEL BS HT LF VT FF CR SO SI

1 DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC FS GS RS US

2 SP ! " # $ % & ' ( ) * + , - . /

3 0 1 2 3 4 5 6 7 8 9 : ; < = > ?

4 @ A B C D E F G H I J K L M N O

5 P Q R S T U V W X Y Z [ \ ] ^ _

6 ` a b c d e f g h i j k l m n o

7 p q r s t u v w x y z { | } ~ DEL

How to read the table: the top line specifies the last (low) hexadecimal digit, and the leftmost column specifies the higher digit. E.g., at location $41_{16}$ ($= 0100\ 0001_2 = 65_{10}$) is the letter ‘A’.
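Looking up a character in the table amounts to splitting its code into two hexadecimal digits. A sketch in C, assuming the compiler uses ASCII for character constants (true on common platforms); the function names are hypothetical:

```c
/* Row of the table: the high hexadecimal digit of the character code. */
int ascii_row(char c) {
    return ((unsigned char)c >> 4) & 0xF;
}

/* Column of the table: the low hexadecimal digit of the character code. */
int ascii_col(char c) {
    return (unsigned char)c & 0xF;
}
```

For 'A' (code 65) the row is 4 and the column is 1.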


Base-16 Number, or Hexadecimal

Instead of writing out strings of 0’s and 1’s, it is easier to read if we group them in groups of 4 bits. A four-bit number can vary from 0 to 15; we denote these values by 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F.

binary hexadecimal decimal

0000 0 0

0001 1 1

0010 2 2

0011 3 3

0100 4 4

0101 5 5

0110 6 6

0111 7 7

binary hexadecimal decimal

1000 8 8

1001 9 9

1010 A 10

1011 B 11

1100 C 12

1101 D 13

1110 E 14

1111 F 15
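The grouping into 4-bit digits is easy to express in C. The function names here are made up for illustration:

```c
#include <stdint.h>

/* Hexadecimal digit for a 4-bit value 0..15. */
char hex_digit(unsigned nibble) {
    return "0123456789ABCDEF"[nibble & 0xFu];
}

/* Write the 8 hexadecimal digits of a 32-bit word into out,
   highest group of 4 bits first. out must hold 9 characters. */
void to_hex(uint32_t v, char out[9]) {
    for (int i = 0; i < 8; ++i) {
        out[i] = hex_digit(v >> (28 - 4 * i));
    }
    out[8] = '\0';
}
```

For example, the 8-bit pattern 1101 1100 (decimal 220) is written DC.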


Program as Numbers

High level programming language: C = A + B;

Assembly language: add $5, $10, $3

Machine code (MIPS computer), with bit 31 on the left and bit 0 on the right:

000000 01010 00011 00101 00000 100000

Field          Bits      Meaning
not used       000000
source 1       01010     $10
source 2       00011     $3
destination    00101     $5
not used       00000
operation      100000    add
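Packing the fields into one word can be sketched with shifts and ORs. The function name and parameter names are hypothetical; the field widths (6, 5, 5, 5, 5, 6 bits) are read off the bit pattern above:

```c
#include <stdint.h>

/* Assemble a 32-bit instruction word from the fields shown above.
   The two "not used" fields are left as zero. */
uint32_t encode_rtype(unsigned src1, unsigned src2, unsigned dest,
                      unsigned operation) {
    return ((uint32_t)(src1 & 0x1Fu) << 21)   /* bits 25..21 */
         | ((uint32_t)(src2 & 0x1Fu) << 16)   /* bits 20..16 */
         | ((uint32_t)(dest & 0x1Fu) << 11)   /* bits 15..11 */
         | (uint32_t)(operation & 0x3Fu);     /* bits 5..0   */
}
```

Calling `encode_rtype(10, 3, 5, 0x20)` reproduces the pattern above, which is 0x01432820 in hexadecimal.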


Graphics as Numbers

A picture is also represented on a computer by bits. If you magnify the picture greatly, you’ll see square “pixels”, each of which is either black or white. These can be represented as binary 1’s and 0’s.

Color can also be represented with numbers, if we allow more bits per pixel.


Music as Numbers – MP3 format

CD music is sampled 44,100 times per second (44.1 kHz); each sample is 2 bytes long.

The digital music signals are compressed, at a ratio of 10 to 1 or more, into MP3 format.

The compact MP3 file can be played back on a computer or an MP3 player.
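As a rough worked example of what these figures imply, the data rate for a single channel of sound (the single channel is an assumption of this example) is:

$$44{,}100 \ \text{samples/s} \times 2 \ \text{bytes/sample} = 88{,}200 \ \text{bytes/s}$$

$$88{,}200 \ \text{bytes/s} \times 60 \ \text{s} = 5{,}292{,}000 \ \text{bytes per minute} \approx 5.3 \ \text{MB per minute}$$

At a compression ratio of 10 to 1, the same minute takes about 0.53 MB in MP3 format.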


Summary

All information in a computer is represented by bits. These bits encode information; their meaning has to be interpreted in a specific way.

We’ve learned how to represent unsigned integers, negative integers, floating point numbers, and ASCII characters.