CPSC 2105 Lecture 21 - edwardbosworth.com

Operator Notation

Consider the infix expression (X * Y) + (W * U), with parentheses added to make the evaluation order perfectly obvious.

This is an arithmetic expression written in standard form, called “infix form”. There

are two other forms, prefix and postfix, which are occasionally used in software.

Infix form has the operator in between two operands, as in “X + Y”. It is what we are

used to, but it is hard for computers to process.

The two other forms are prefix and postfix. I give the LISP variant of prefix.

Prefix: (+ (* X Y) (* W U))

Postfix: XY*WU*+

Implicit in each of these examples is the assumption that single letter variables are used.

Both prefix and postfix are easier for a computer to interpret than infix is.

The assumption is that the expression is scanned either left to right or right to left.

Here, we arbitrarily scan left to right.

We return to this expression, but first consider a much simpler expression.

Problem with Scanning an Infix Expression

Suppose that we have a five symbol expression. We are scanning left to right.

We assume: All variables are single alphabetic characters.

Each alphabetic character is taken singly and represents a variable.

At the beginning we have five unknown symbols:

We read the first character and find it to be a variable: X

We read the next character and find it to be an operator: X +

Something is being added.

We read the next character and find it to be a variable: X + Y

Can we do the addition now? It depends on what the next character is.

There are two basic possibilities: X + Y + Z

X + Y * Z

In the first option, we can do the addition immediately, forming (X + Y), to which Z is later added. In the second option, we must wait and first do the multiplication to get (Y * Z), to which X is then added.

Scanning a Postfix Expression

With the same assumptions, we scan a five symbol postfix expression.

At the beginning we have five unknown symbols:

We read the first character and find it to be a variable: Y

We read the next character and find it also to be a variable: Y Z

At this point we need an operator.

We read the next character and find it to be an operator: Y Z *

Because this is postfix, the evaluator can immediately form the product term (Y * Z).

We read the next character and find it to be a variable: Y Z * X

At this point, we need another operator.

We have two operands: (Y * Z) and X.

We read the last character and find it to be an operator: Y Z * X +

We do the addition and have the term (Y * Z) + X.

The Stack Data Type

Stacks are LIFO (Last In, First Out) data structures. Stacks are the data structure most naturally suited to evaluating expressions in postfix notation.

The stack is best viewed as an abstract data type with specific methods, each of which has well-defined semantics (we can say exactly what it does).

We can also consider the stack as having a top. Items are added to

the stack top and then removed from the stack top.

The position of the stack top is indicated by a stack pointer (SP).

More on this later; it has two different definitions, each of which is valid.

The stack data type has three necessary methods, and two optional methods.

The mandatory methods are Push, Pop, and IsEmpty.

The standard optional methods are Top and IsFull.

Note: When studying computer networks, we speak of a “protocol stack”, which has

nothing to do with the ADT (Abstract Data Type) stack that we study today.

We shall discuss the TCP/IP protocol stack later in this course.

Pushing Items onto the Stack

Consider the sequence: Push (X), Push (Y), and Push (Z). This sequence adds three

items to the stack, in the specified order.

Push X

Push Y

Push Z

After the third operation, Z is at the top of the stack and is the first to be removed

when a Pop operation is performed.

The SP (Stack Pointer) is a data item internal to the stack that indicates the location

of the top of the stack. It is never made public by any proper stack implementation.

The Stack Pointer (SP)

The Stack Pointer is that data item used to indicate the location of the stack top.

There are two possible interpretations of the SP. While each is completely valid,

the two are incompatible. A given implementation can use only one of these.

1. The SP indicates where the next item is to be placed.

2. The SP indicates where the last item was placed.

Here we illustrate the operations with Push (X) followed by Push (Y).

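[Figure: the stack after Push (X) and Push (Y), shown for each interpretation: SP as the location of the next item, and SP as the location of the last item.]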
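The two interpretations of the SP can be sketched in Python as follows. This is only an illustration under stated assumptions, not any particular machine's implementation: the class names `NextItemStack` and `LastItemStack`, the fixed size of 8 cells, and the choice to grow toward higher indices are all hypothetical, and overflow and underflow checks are omitted for brevity.

```python
class NextItemStack:
    """Assumed convention 1: SP indicates where the NEXT item is to be placed."""

    def __init__(self, size=8):
        self.cells = [None] * size
        self.sp = 0                      # first free cell

    def push(self, value):
        self.cells[self.sp] = value      # store, then advance
        self.sp += 1

    def pop(self):
        self.sp -= 1                     # retreat, then fetch
        return self.cells[self.sp]


class LastItemStack:
    """Assumed convention 2: SP indicates where the LAST item was placed."""

    def __init__(self, size=8):
        self.cells = [None] * size
        self.sp = -1                     # nothing has been placed yet

    def push(self, value):
        self.sp += 1                     # advance, then store
        self.cells[self.sp] = value

    def pop(self):
        value = self.cells[self.sp]      # fetch, then retreat
        self.sp -= 1
        return value


a, b = NextItemStack(), LastItemStack()
for item in ("X", "Y"):
    a.push(item)
    b.push(item)
print(a.sp, b.sp)   # 2 1
```

Both classes return the same items in the same order; only the internal value of the SP differs, which is why a given implementation must pick one convention and keep to it.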

Popping Items from the Stack

The Pop operation removes the last item placed onto the stack and places its

value into the named variable.

Suppose that the three previous operations were Push (X), Push (Y), and Push (Z).

The Pop operations are now performed.

Pop (W) W gets the value of Z, the last onto the stack.

The top of the stack changes.

Pop (W) W now gets the value of Y.

Pop (W) W now gets the value of X.

Other Stack Operations: IsEmpty, Top, and IsFull

It is usually important to discover when the last item has been popped from the stack.

The stack class can raise an exception, provide a method to determine whether the stack is empty, or both.

The standard Boolean function IsEmpty returns TRUE when the stack is empty.

Consider the following pseudo–code.

Loop Until S.IsEmpty ()

    S.Pop (X) // Get top into X

    Process (X) // Process the value

End Loop

The optional method Top returns the value at the top of the stack without removing it

from the stack. This can be implemented using only Push and Pop.

Top (W) Place the value at the top of the stack into W, but leave it on the stack. This is equivalent to:

Pop (W) Remove the value at the top of the stack and place it into W.

Push (W) Place that value back onto the stack.

The optional Boolean function IsFull returns TRUE when the stack has been filled completely and cannot accept another item. This occurs when the memory available to the stack is full.
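The five methods can be gathered into a small Python class. This is a minimal sketch rather than a reference implementation: the class name `Stack`, the default capacity of 16, and the choice of exceptions are assumptions made for illustration, and `pop` returns the value instead of storing it into a named variable.

```python
class Stack:
    """Bounded stack offering Push, Pop, IsEmpty, Top, and IsFull."""

    def __init__(self, capacity=16):     # capacity is an assumed parameter
        self._items = []
        self._capacity = capacity

    def is_empty(self):
        return len(self._items) == 0

    def is_full(self):
        return len(self._items) == self._capacity

    def push(self, value):
        if self.is_full():
            raise OverflowError("push onto a full stack")
        self._items.append(value)

    def pop(self):
        if self.is_empty():
            raise IndexError("pop from an empty stack")
        return self._items.pop()

    def top(self):
        # Top built only from Pop and Push.
        value = self.pop()
        self.push(value)
        return value


s = Stack()
for item in ("X", "Y", "Z"):
    s.push(item)
peeked = s.top()            # "Z" stays on the stack
popped = []
while not s.is_empty():     # the loop shown in the pseudo-code
    popped.append(s.pop())
print(peeked, popped)       # Z ['Z', 'Y', 'X']
```

The SP is represented here only implicitly, by the length of the private list, which matches the point that a proper implementation never makes it public.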

Evaluation of the Postfix Expression Y Z * X +

This is also called RPN for Reverse Polish Notation.

Rules: 1. Scan left to right.

2. Upon reading a variable, push it onto the stack.

3. Upon reading a dyadic operator, pop two variables from the stack,

perform the operation, and push the result.

We read the first character and find it to be a variable: Y

We read the next character and find it also to be a variable: Y Z

We read the next character and find it to be an operator: Y Z *

Evaluation of the Postfix Expression Y Z * X + (Continued)

We read the next character and find it to be a variable: Y Z * X

We read the last character and find it to be an operator: Y Z * X +

After this, the stack will be popped and the value placed into some variable.

This example just looked at the expression, without considering the assignment.
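The three rules translate almost directly into code. The following Python sketch evaluates the postfix form of (Y * Z) + X; the function name `eval_postfix`, the operator table, and the sample values for X, Y, and Z are assumptions chosen for illustration.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def eval_postfix(expr, values):
    """Evaluate a postfix string of single-letter variables and dyadic operators."""
    stack = []
    for ch in expr.replace(" ", ""):            # Rule 1: scan left to right
        if ch.isalpha():
            stack.append(values[ch])            # Rule 2: push the variable's value
        else:
            right = stack.pop()                 # Rule 3: pop two values,
            left = stack.pop()
            stack.append(OPS[ch](left, right))  # operate, and push the result
    return stack.pop()                          # the final pop yields the answer

result = eval_postfix("Y Z * X +", {"X": 2, "Y": 10, "Z": 15})
print(result)   # 152
```

Note that the first value popped is the right-hand operand. This does not matter for addition or multiplication, but it does for subtraction.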

Stack–Based (Zero Operand) Machine

Evaluate the expression Z = (X * Y) + (W * U), which must be evaluated in its postfix (RPN) version: XY*WU*+

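Instruction   Stack after the instruction (bottom to top)
Push X        X
Push Y        X, Y
Mult          X*Y
Push W        X*Y, W
Push U        X*Y, W, U
Mult          X*Y, W*U
Add           X*Y + W*U
Pop Z         (the result is removed and placed into Z)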
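A zero-operand machine of this kind is easy to simulate. The sketch below is an assumption-laden illustration, not a model of any real instruction set: the function name `run_stack_machine`, the use of a dictionary as memory, and the sample values are all hypothetical.

```python
def run_stack_machine(program, memory):
    """Execute Push/Pop/Mult/Add instructions against a dict used as memory."""
    stack = []
    for line in program:
        op, *args = line.split()
        if op == "Push":
            stack.append(memory[args[0]])      # copy a variable onto the stack
        elif op == "Pop":
            memory[args[0]] = stack.pop()      # move the stack top into a variable
        elif op in ("Mult", "Add"):
            right, left = stack.pop(), stack.pop()
            stack.append(left * right if op == "Mult" else left + right)
        else:
            raise ValueError(f"unknown instruction: {line}")
    return memory

program = ["Push X", "Push Y", "Mult", "Push W", "Push U", "Mult", "Add", "Pop Z"]
memory = run_stack_machine(program, {"X": 2, "Y": 3, "W": 4, "U": 5, "Z": 0})
print(memory["Z"])   # 26
```

Only Push and Pop name an operand; Mult and Add take both operands from the stack, which is where the name "zero operand" comes from.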

More Explicit Evaluation: W = Y * Z

It might be good at this point to make explicit a point that many consider obvious.

Suppose that before the evaluation above, W = 0, Y = 10, and Z = 15.

The push–pop code to execute this instruction will go as follows:

PUSH Y // Value of Y is placed on the stack.

PUSH Z // Value of Z is placed onto the stack.

MULT // Pop the top two values, multiply them

// and push the result onto the stack.

POP W // Pop the value at stack top and place into W.

At the start, W = 0, Y = 10, and Z = 15.

More Explicit Evaluation: Part 2

PUSH Y // Value of Y is placed on the stack. Now 10 is at the stack top.

PUSH Z // Value of Z is placed onto the stack. Now 15 is at the stack top, with 10 beneath it.

More Explicit Evaluation: Part 3

MULT // Pop the top two values, multiply them

// and push the result onto the stack. Now 150 is at the stack top.

POP W // Pop the value at stack top and place into W. Now W = 150.

Practical Evaluation of Postfix Operators

Here are some steps for manual evaluation.

1. Scan the postfix expression left to right. Find the first operator.

2. Determine how many operands that operator requires. Call it N.

Select the N operands immediately preceding it and apply the operator to them.

3. Remove the operator and its operands from the expression.

Replace these with the result of the operation.

4. Repeat until a single value remains.

Here is an example, using single digit numbers to start with.

6 5 + 4 2 – *

(6 5 +) 4 2 – *

(11) 4 2 – *

(11) (4 2 –) *

(11) (2) *

22
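The manual procedure can also be written as a rewriting loop that records every intermediate expression. This Python sketch assumes that all operators are dyadic (N = 2) and uses the ASCII hyphen for subtraction; the function name `reduce_postfix` is hypothetical.

```python
def reduce_postfix(tokens):
    """Return each intermediate expression produced by the manual method."""
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
    }
    tokens = list(tokens)
    steps = [" ".join(map(str, tokens))]
    while len(tokens) > 1:
        # Step 1: find the first operator, scanning left to right.
        i = next(k for k, t in enumerate(tokens) if t in ops)
        # Steps 2 and 3: apply it to the two preceding operands and
        # replace all three tokens with the single result.
        tokens[i - 2:i + 1] = [ops[tokens[i]](tokens[i - 2], tokens[i - 1])]
        steps.append(" ".join(map(str, tokens)))
    return steps

for step in reduce_postfix([6, 5, "+", 4, 2, "-", "*"]):
    print(step)
# 6 5 + 4 2 - *
# 11 4 2 - *
# 11 2 *
# 22
```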

Practical Evaluation of Prefix Operators

Note: We assume Lisp notation, with full use of parentheses.

1. Scan the expression left to right. If the expression does not contain another

parenthesized expression, evaluate it. Otherwise, attempt to evaluate its

subexpression. Formally, this is viewed as a tree traversal.

2. Keep “moving up” until the outermost expression is evaluated. Here is an example.

Again, it will start with single digit numbers in order to be more easily read.

Evaluate the prefix expression: ( + ( * 4 2 ) ( – 5 2 ) ( * ( + 6 4) (+ 1 1 ) ) )

(* 4 2) can be evaluated, so: ( + 8 ( – 5 2 ) ( * ( + 6 4) (+ 1 1 ) ) )

(– 5 2) is defined as 5 – 2, so: ( + 8 3 ( * ( + 6 4) (+ 1 1 ) ) )

(+ 6 4) can be evaluated, so: ( + 8 3 ( * 10 (+ 1 1 ) ) )

(+ 1 1) can be evaluated, so: ( + 8 3 ( * 10 2 ) )

(* 10 2) can be evaluated, so: ( + 8 3 20 )

(+ 8 3 20) = 31, so: 31

Expression Tree for the Prefix Evaluation

The expression is ( + ( * 4 2 ) ( – 5 2 ) ( * ( + 6 4) (+ 1 1 ) ) ).

Software to evaluate this prefix expression would first build this tree (not very hard to do)

and then evaluate it. For example, the node (+ 6 4) would be replaced by 10, etc.
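Building the tree and then evaluating it might look like the following Python sketch. The tree is represented as nested lists, which is an assumption made for brevity; the function names `parse` and `evaluate` are hypothetical, and the ASCII hyphen stands in for the minus sign.

```python
from functools import reduce
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def parse(text):
    """Build a nested-list tree from a fully parenthesized prefix expression."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def read(pos):
        if tokens[pos] == "(":
            node, pos = [], pos + 1
            while tokens[pos] != ")":
                child, pos = read(pos)
                node.append(child)
            return node, pos + 1              # skip the closing parenthesis
        token = tokens[pos]
        return (token if token in OPS else int(token)), pos + 1

    tree, _ = read(0)
    return tree

def evaluate(node):
    """Evaluate bottom-up: the subexpressions first, then the operator."""
    if isinstance(node, int):
        return node
    op, *args = node
    return reduce(OPS[op], [evaluate(arg) for arg in args])

tree = parse("(+ (* 4 2) (- 5 2) (* (+ 6 4) (+ 1 1)))")
print(evaluate(tree))   # 31
```

Using `reduce` lets an operator such as + take more than two operands, as the outermost addition in the example does.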

Expression Tree Evaluation

Single Accumulator Machine


Evaluating Z = (X * Y) + (W * U) on this machine requires an extra storage location, which I call T for “Temp”.

Load X // AC now has X

Mult Y // AC now has (X * Y)

Store T // Now T = (X * Y). AC still has (X * Y)

Load W // AC now has W

Mult U // AC now has (W * U)

Add T // AC now has (X * Y) + (W * U)

Store Z // And so does Z.
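The role of the temporary location T becomes clear when the program is traced. The following Python sketch simulates a single accumulator machine under simple assumptions: memory is a dictionary, every instruction names exactly one memory operand, and the function name `run_accumulator` and the sample values are hypothetical.

```python
def run_accumulator(program, memory):
    """Run Load/Store/Add/Mult instructions using one accumulator, AC."""
    ac = 0
    for line in program:
        op, name = line.split()
        if op == "Load":
            ac = memory[name]
        elif op == "Store":
            memory[name] = ac        # AC keeps its value after a Store
        elif op == "Add":
            ac += memory[name]
        elif op == "Mult":
            ac *= memory[name]
        else:
            raise ValueError(f"unknown instruction: {line}")
    return memory

program = ["Load X", "Mult Y", "Store T", "Load W", "Mult U", "Add T", "Store Z"]
memory = run_accumulator(program, {"X": 2, "Y": 3, "W": 4, "U": 5, "T": 0, "Z": 0})
print(memory["T"], memory["Z"])   # 6 26
```

Without the Store to T, the first product would be lost as soon as W is loaded, because the accumulator is the only place a result can live.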

Multiple Register Machines


Standard CISC Approach (There are many variants of this one.)

Load R1, X // R1 has X

Mult R1, Y // R1 has (X * Y)

Load R2, W // R2 has W

Mult R2, U // R2 has (W * U)

Add R1, R2 // R1 has (X * Y) + (W * U)

Store R1, Z // And so does Z

Load–Store RISC

Load R1, X // R1 has X

Load R2, Y // R2 has Y

Mult R1, R2 // R1 has (X * Y)

Load R2, W // R2 now has W

Load R3, U // R3 has U

Mult R2, R3 // R2 has (W * U)

Add R1, R2 // R1 has (X * Y) + (W * U)

Store R1, Z // And so does Z.
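Both instruction sequences can be checked with one small simulator. This Python sketch is an illustration only: it assumes that any operand spelled R followed by digits is a register and that anything else is a memory location, and the function name `run_register_machine` and the sample values are hypothetical.

```python
def run_register_machine(program, memory):
    """Run 'Op Rd, src' instructions; src may be a register or a memory name."""
    regs = {}

    def is_register(name):
        return name.startswith("R") and name[1:].isdigit()

    for line in program:
        op, operands = line.split(maxsplit=1)
        dest, src = [part.strip() for part in operands.split(",")]
        if op == "Load":
            regs[dest] = memory[src]
        elif op == "Store":
            memory[src] = regs[dest]          # Store R1, Z copies R1 into Z
        elif op in ("Mult", "Add"):
            operand = regs[src] if is_register(src) else memory[src]
            regs[dest] = regs[dest] * operand if op == "Mult" else regs[dest] + operand
        else:
            raise ValueError(f"unknown instruction: {line}")
    return memory

cisc = ["Load R1, X", "Mult R1, Y", "Load R2, W", "Mult R2, U",
        "Add R1, R2", "Store R1, Z"]
risc = ["Load R1, X", "Load R2, Y", "Mult R1, R2", "Load R2, W",
        "Load R3, U", "Mult R2, R3", "Add R1, R2", "Store R1, Z"]
values = {"X": 2, "Y": 3, "W": 4, "U": 5, "Z": 0}
z_cisc = run_register_machine(cisc, dict(values))["Z"]
z_risc = run_register_machine(risc, dict(values))["Z"]
print(z_cisc, z_risc, len(cisc), len(risc))   # 26 26 6 8
```

The load-store version needs two more instructions because its arithmetic instructions may only name registers, so every memory operand must be loaded first.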

Data Types

Each Instruction Set Architecture has a set of data types for which it provides hardware

support. The standard set of data types includes:

1. Character (Single bytes, but soon to be 16 bits to support UNICODE)

2. Integers 8–bit, 16–bit, 32–bit, etc.

Signed (Two’s–Complement) and Unsigned

3. Real Numbers (Usually the IEEE–754 Standard Floating Point)

(Usually 32–bit and 64–bit formats are both supported)

4. Boolean TRUE or FALSE, usually stored as a single byte.

0 = FALSE, everything else is TRUE

Other languages support additional data types.

5. Decimal Business applications, such as those written in COBOL, avoid floating point arithmetic with its round–off errors and use BCD (Binary Coded Decimal) arithmetic.

6. Complex This arithmetic allows for taking the square roots of negative

numbers. Usually supported by a software package.
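The motivation for the last two data types can be seen in a few lines of Python. This is a sketch under one loud assumption: Python's `decimal` module is a software decimal type standing in for hardware BCD arithmetic; it is not itself BCD. The `cmath` module plays the role of the software package for complex arithmetic.

```python
from decimal import Decimal
import cmath

# Binary floating point cannot represent 0.1 exactly, so round-off appears.
print(0.1 + 0.2 == 0.3)                                        # False

# Decimal arithmetic keeps decimal fractions exact.
print(Decimal("0.10") + Decimal("0.20") == Decimal("0.30"))    # True

# Complex arithmetic provides a square root for a negative number.
print(cmath.sqrt(-4))                                          # 2j
```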

Instruction Types

There are basic instruction types that are commonly supported.

1. Data Movement (Better called “Data Copy”)

These copy bytes of data from a source to a destination.

If X and Y are 32–bit real numbers, then the instruction Y = X makes a

copy of the four bytes associated with X and places them in the address for Y.

2. Arithmetic The standard arithmetic operators are usually supported.

If division is supported, one usually also has the mod and rem functions.

On business–oriented machines, decimal arithmetic is always supported.

Graphics–oriented machines usually support saturation arithmetic.

Real number arithmetic is often not handled directly in the CPU, but by

a coprocessor attached to it.

Early on (Intel 80486 / 80487) this was a cost consideration.

RISC machines follow this approach in order to keep the CPU simple.

3. Boolean Logic Here I differ from the book’s description. Boolean instructions

are often used for non–Boolean purposes, such as bit manipulation.

The real Boolean instructions are the conditional branches.

More Instruction Types

4. Bit Manipulation These use instructions that appear to be Boolean, but

in fact operate differently. This is a distinction lost on many students,

perhaps because it is rarely important. More on this in a moment.

5. Input / Output The computer must communicate with external devices,

including permanent storage, printers and keyboards, process control devices

(through dedicated I/O registers), etc.

The MARIE architecture has a dedicated input device and a dedicated output

device. All commercial machines have “addressable” I/O devices; i.e., the

CPU issues a device identifier that appears to be an address to select the device.

From the CPU viewpoint, each I/O device is nothing more than a set of registers

(control, status, and data) and some timing constraints.

6. Transfer of Control These alter the normal sequential execution of code.

At the primitive machine language level, all we have is unconditional jumps,

conditional jumps, and subroutine invocations. Higher level languages elaborate

these features with “structured constructs” such as conditional statements,

counted loops, and conditional loops.

More on Logical Instructions vs. Bitwise Instructions

This is quite important in C, C++, and Java.

Consider the standard implementation of Boolean values as 8–bit bytes. This

is done for convenience in addressing by the CPU, as single bits are not addressable.

Let A = 0000 0000 C = 0000 0010

B = 0000 0001 D = 0000 0011

Logical operators in C++   AND   &&   Expression && Expression
                           OR    ||   Expression || Expression

Bitwise operators in C++   AND   &    Expression & Expression
                           OR    |    Expression | Expression
                           XOR   ^    Expression ^ Expression

Logical               Bitwise
A && B = 0 (FALSE)    A & B = 0000 0000
A || B = 1 (TRUE)     A | B = 0000 0001
C && D = 1 (TRUE)     C & D = 0000 0010
C || D = 1 (TRUE)     C | D = 0000 0011

Source: The Annotated C++ Reference Manual (Sections 5.11 – 5.15)

Margaret Ellis and Bjarne Stroustrup, Addison–Wesley, 1990.
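The same table can be reproduced in Python, with one caveat. Python spells the logical operators `and` and `or`, and they return one of their operands rather than 0 or 1, so the helper functions `logical_and` and `logical_or` below are hypothetical wrappers written to mimic the C++ results. The bitwise operators `&` and `|` behave as in C++.

```python
A, B, C, D = 0b0000_0000, 0b0000_0001, 0b0000_0010, 0b0000_0011

def logical_and(x, y):
    """Counterpart of C++ 'x && y': 1 if both are nonzero, else 0."""
    return int(bool(x) and bool(y))

def logical_or(x, y):
    """Counterpart of C++ 'x || y': 1 if either is nonzero, else 0."""
    return int(bool(x) or bool(y))

print(logical_and(A, B), format(A & B, "08b"))   # 0 00000000
print(logical_or(A, B),  format(A | B, "08b"))   # 1 00000001
print(logical_and(C, D), format(C & D, "08b"))   # 1 00000010
print(logical_or(C, D),  format(C | D, "08b"))   # 1 00000011
```

The third line shows the distinction most clearly: C and D are both TRUE, so the logical result is 1, while the bitwise result is the pattern 0000 0010.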

A Context for Bitwise Operators

For simplicity I consider a very old (late 1960’s) Line Printer, a predecessor to today’s

laser printer. We examine the Status/Control register for the LP–11.

This register is called “LPS” for “Line Printer Status” in the literature.

We have here two status bits and a control bit.

Status bits   Bit 15   Error   If Error = 1, then there is a device error, such as power off, no paper in the printer, etc.
              Bit 7    Done    If Done = 1, the printer is ready for the next line.
Control bit   Bit 6    IE      If IE = 1, the printer is enabled to raise an interrupt whenever Done becomes 1 or Error becomes 1.

More on the LPS Register

Why this arrangement of bits?

The PDP–11, for which the LP–11 was used, did not support 8–bit arithmetic.

A 16–bit integer was the smallest that the CPU would handle.

Viewed as a 16–bit signed integer, we note that the error bit (Bit 15) is the

sign bit. To test for an error, we just read the LPS into a register and test if it is negative.

Testing the Done Bit

Recall that the Done Bit is bit 7 and that 0000 0000 1000 0000 is 0x0080.

LPS            E000 0000 DI00 0000
0x0080         0000 0000 1000 0000
LPS & 0x0080   0000 0000 D000 0000

If ( 0 == (LPS & 0x0080) ) then the Line Printer is not done.

Still More on the LPS Register

Testing and Setting the Interrupt Enable Bit

Recall that the Interrupt Enable Bit is bit 6 and that 0000 0000 0100 0000 is 0x0040.

Its complement, 1111 1111 1011 1111, is 0xFFBF.

Testing the Interrupt Enable Bit

LPS            E000 0000 DI00 0000
0x0040         0000 0000 0100 0000
LPS & 0x0040   0000 0000 0I00 0000

If ( 0 == (LPS & 0x0040) ) then the Line Printer Interrupt is disabled.
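The three tests on the LPS register can be sketched in Python. The constant and function names are hypothetical, and the register value is simply an integer supplied by the caller; reading a real device register is outside the scope of this sketch.

```python
ERROR_BIT = 0x8000   # bit 15
DONE_BIT = 0x0080    # bit 7
IE_BIT = 0x0040      # bit 6

def as_signed16(word):
    """Interpret a 16-bit pattern as a two's-complement integer."""
    return word - 0x10000 if word & 0x8000 else word

def has_error(lps):
    return as_signed16(lps) < 0           # the sign bit is the Error bit

def is_done(lps):
    return (lps & DONE_BIT) != 0          # mask off everything but bit 7

def interrupts_enabled(lps):
    return (lps & IE_BIT) != 0            # mask off everything but bit 6

lps = 0x8080    # hypothetical reading: Error = 1, Done = 1, IE = 0
print(has_error(lps), is_done(lps), interrupts_enabled(lps))   # True True False
```

Python integers are unbounded, so the sign test has to be made explicit with `as_signed16`; on a 16-bit machine the hardware makes the same test for free.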

Yet More on the LPS Register

Enabling Interrupts (Setting the I Bit)

LPS            E000 0000 DI00 0000
0x0040         0000 0000 0100 0000
LPS | 0x0040   E000 0000 D100 0000

Setting LPS = LPS | 0x0040 enables the interrupt and leaves the other bits unchanged.

Disabling Interrupts (Clearing the I Bit)

LPS            E000 0000 DI00 0000
0xFFBF         1111 1111 1011 1111
LPS & 0xFFBF   E000 0000 D000 0000

Setting LPS = LPS & 0xFFBF disables the interrupt and leaves the other bits unchanged.
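Setting and clearing the I bit can be sketched the same way. The names below are hypothetical, and the register is modeled as a plain 16-bit integer rather than a real device register.

```python
IE_BIT = 0x0040            # bit 6
IE_CLEAR_MASK = 0xFFBF     # every bit set except bit 6

def enable_interrupts(lps):
    return lps | IE_BIT            # OR with 1 forces the bit to 1

def disable_interrupts(lps):
    return lps & IE_CLEAR_MASK     # AND with 0 forces the bit to 0

lps = 0x8080               # hypothetical value: Error = 1, Done = 1, IE = 0
enabled = enable_interrupts(lps)
disabled = disable_interrupts(enabled)
print(hex(enabled), hex(disabled))   # 0x80c0 0x8080
```

In both directions the Error and Done bits pass through untouched, because OR with 0 and AND with 1 leave a bit as it was.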
