Operator Notation
Consider the infix expression (X * Y) + (W * U), with parentheses added to
make the evaluation order perfectly obvious.
This is an arithmetic expression written in the standard form, called “infix form”.
Infix form places the operator between its two operands, as in “X + Y”. It is what we are
used to, but it is hard for computers to process.
There are two other forms, prefix and postfix, which are occasionally used in software.
I give the LISP variant of prefix.
Prefix: (+ (* X Y) (* W U))
Postfix: XY*WU*+
Implicit in each of these examples is the assumption that single letter variables are used.
Both prefix and postfix are easier for a computer to interpret than infix.
The assumption is that the expression is scanned either left to right or right to left.
Here, we arbitrarily scan left to right.
We return to this expression, but first consider a much simpler expression.
Problem with Scanning an Infix Expression
Suppose that we have a five symbol expression and that we scan it left to right.
We assume that all variables are single alphabetic characters; each alphabetic
character is taken singly and represents one variable.
At the beginning we have five unknown symbols.
We read the first character and find it to be a variable: X
We read the next character and find it to be an operator: X +
Something is being added.
We read the next character and find it to be a variable: X + Y .
Can we do the addition now? It depends on what the next character is.
There are two basic possibilities: X + Y + Z
                                   X + Y * Z
In the first option, we can do the addition immediately, forming (X + Y) to which Z is
later added. In the second option, we must wait and first do the multiplication to get
(Y * Z) to which X is added.
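The decision described above can be sketched in a few lines of Python. This is only an illustration: the precedence table and the function name can_apply_now are assumptions of the sketch, not part of the original discussion.

```python
# Assumed precedence table for this sketch: a higher number binds tighter.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def can_apply_now(pending_op, next_op):
    """Return True if pending_op may be applied without waiting.

    pending_op is the operator already read (the + in X + Y).
    next_op is the operator that follows the right operand, or None
    if the expression ends there.
    """
    if next_op is None:
        return True
    # Wait only when the next operator binds tighter than the pending one.
    return PRECEDENCE[pending_op] >= PRECEDENCE[next_op]
```

For X + Y + Z the call can_apply_now("+", "+") is True, so (X + Y) may be formed at once. For X + Y * Z the call can_apply_now("+", "*") is False, so the scanner must look ahead before adding.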
Scanning a Postfix Expression
With the same assumptions, we scan a five symbol postfix expression.
At the beginning we have five unknown symbols:
We read the first character and find it to be a variable: Y
We read the next character and find it also to be a variable: Y Z
At this point we need an operator.
We read the next character and find it to be an operator: Y Z *
Because this is postfix, the evaluator can immediately
form the product term (Y * Z).
We read the next character and find it to be a variable: Y Z * X
At this point, we need another operator.
We have two operands: (Y * Z) and X.
We read the last character and find it to be an operator: Y Z * X +
We do the addition and have the term (Y * Z) + X.
The Stack Data Type
Stacks are LIFO (Last In, First Out) data structures. They are the data structure best
suited to evaluating expressions in postfix notation.
The stack is best viewed as an abstract data type with specific methods, each of which
has well defined semantics (we can say exactly what it does).
We can also consider the stack as having a top. Items are added to
the stack top and then removed from the stack top.
The position of the stack top is indicated by a stack pointer (SP).
The SP has two different definitions, each of which is valid; more on this later.
The stack data type has three mandatory methods and two optional methods.
The mandatory methods are Push, Pop, and IsEmpty.
The standard optional methods are Top and IsFull.
Note: When studying computer networks, we speak of a “protocol stack”, which has
nothing to do with the ADT (Abstract Data Type) stack that we study today.
We shall discuss the TCP/IP protocol stack later in this course.
Pushing Items onto the Stack
Consider the sequence: Push (X), Push (Y), and Push (Z). This sequence adds three
items to the stack, in the specified order.
[Figure: the stack after each of Push X, Push Y, and Push Z]
After the third operation, Z is at the top of the stack and is the first to be removed
when a Pop operation is performed.
The SP (Stack Pointer) is a data item internal to the stack that indicates the location
of the top of the stack. It is never made public by any proper stack implementation.
The Stack Pointer (SP)
The Stack Pointer is that data item used to indicate the location of the stack top.
There are two possible interpretations of the SP. While each is completely valid,
the two are incompatible. A given implementation can use only one of these.
1. The SP indicates where the next item is to be placed.
2. The SP indicates where the last item was placed.
Here we illustrate the operations with Push (X) followed by Push (Y).
[Figure: two panels, labeled “Next Item” and “Last Item”, one for each interpretation of the SP]
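The two interpretations of the SP can be compared in a short Python sketch. It assumes a stack stored in a fixed array that grows toward higher indices and omits all overflow and underflow checks. The class names are hypothetical, and the sp field is left visible only so that it can be inspected here.

```python
class NextItemStack:
    """SP holds the index where the NEXT item will be placed."""

    def __init__(self, capacity):
        self.cells = [None] * capacity
        self.sp = 0                      # empty stack: the next free cell is 0

    def push(self, item):
        self.cells[self.sp] = item       # store first ...
        self.sp += 1                     # ... then advance the SP

    def pop(self):
        self.sp -= 1                     # back up first ...
        return self.cells[self.sp]       # ... then read


class LastItemStack:
    """SP holds the index where the LAST item was placed."""

    def __init__(self, capacity):
        self.cells = [None] * capacity
        self.sp = -1                     # empty stack: nothing placed yet

    def push(self, item):
        self.sp += 1                     # advance first ...
        self.cells[self.sp] = item       # ... then store

    def pop(self):
        item = self.cells[self.sp]       # read first ...
        self.sp -= 1                     # ... then back up
        return item
```

After Push (X) followed by Push (Y), the first class has sp = 2 and the second has sp = 1, yet both return Y and then X when popped. The order of "store" and "adjust" is what makes the two conventions incompatible.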
Popping Items from the Stack
The Pop operation removes the last item placed onto the stack and places its
value into the named variable.
Suppose that the three previous operations were Push (X), Push (Y), and Push (Z).
The Pop operations are now performed.
Pop (W)   W gets the value of Z, the last item pushed onto the stack.
          The top of the stack changes.
Pop (W)   W now gets the value of Y.
Pop (W)   W now gets the value of X.
Other Stack Operations: IsEmpty, Top, and IsFull
It is usually important to discover when the last item has been popped from the stack.
The stack class can raise an exception, provide a method to determine whether the
stack is empty, or both.
The standard Boolean function IsEmpty returns TRUE when the stack is empty.
Consider the following pseudo–code.
Loop Until S.IsEmpty ()
    S.Pop (X)       // Get top into X
    Process (X)     // Process the value
End Loop
The optional method Top returns the value at the top of the stack without removing it
from the stack. This can be implemented using only Push and Pop.
Top (W)    Place the value at the top of the stack into W, but leave it on the stack.
This is equivalent to the following pair of operations.
Pop (W)    Remove the value at the top of the stack and place it into W.
Push (W)   Place that value back onto the stack.
The optional Boolean function IsFull returns TRUE when the stack has been filled
completely and cannot accept another item. This occurs when the memory available
to the stack is exhausted.
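The five methods can be gathered into one small Python class. This is a minimal sketch, not a reference implementation: the fixed capacity, the StackError exception, and the helper drain are assumptions made for illustration.

```python
class StackError(Exception):
    """Raised on Pop from an empty stack or Push onto a full one."""


class Stack:
    def __init__(self, capacity=16):
        self._capacity = capacity        # assumed fixed capacity
        self._items = []                 # the end of the list is the stack top

    def is_empty(self):
        return len(self._items) == 0

    def is_full(self):
        return len(self._items) == self._capacity

    def push(self, item):
        if self.is_full():
            raise StackError("stack is full")
        self._items.append(item)

    def pop(self):
        if self.is_empty():
            raise StackError("stack is empty")
        return self._items.pop()

    def top(self):
        # Top built only from Pop and Push, as described above.
        item = self.pop()
        self.push(item)
        return item


def drain(stack):
    """Pop until IsEmpty is true, collecting values in the order they come off."""
    values = []
    while not stack.is_empty():
        values.append(stack.pop())
    return values
```

With X, Y, and Z pushed in that order, top() returns Z and leaves it in place, and drain returns Z, Y, X. The class both raises an exception and offers is_empty, which is the "or both" option mentioned above.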
Evaluation of the Postfix Expression Y Z * X +
This is also called RPN for Reverse Polish Notation.
Rules: 1. Scan left to right.
       2. Upon reading a variable, push its value onto the stack.
       3. Upon reading a dyadic operator, pop two values from the stack,
          perform the operation, and push the result.
We read the first character and find it to be a variable: Y
    Y is pushed onto the stack.
We read the next character and find it also to be a variable: Y Z
    Z is pushed onto the stack, which now holds Y and Z.
We read the next character and find it to be an operator: Y Z *
    Z and Y are popped, and the product (Y * Z) is pushed.
Evaluation of the Postfix Expression Y Z * X + (Continued)
We read the next character and find it to be a variable: Y Z * X
    X is pushed onto the stack, which now holds (Y * Z) and X.
We read the last character and find it to be an operator: Y Z * X +
    X and (Y * Z) are popped, and the sum (Y * Z) + X is pushed.
After this, the stack will be popped and the value placed into some variable.
This example just looked at the expression, without considering the assignment.
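The three rules translate almost directly into code. The sketch below assumes single-letter variables whose values are supplied in a dictionary, and it knows only three dyadic operators. The name eval_postfix and the sample values are made up for the illustration.

```python
import operator

# Dyadic operators recognized by this sketch (an assumed, minimal set).
OPERATORS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def eval_postfix(expression, values):
    """Evaluate a postfix string of single-letter variables and operators."""
    stack = []
    for symbol in expression:            # Rule 1: scan left to right
        if symbol.isspace():
            continue
        if symbol in OPERATORS:          # Rule 3: pop two, operate, push
            right = stack.pop()          # the second operand is on top
            left = stack.pop()
            stack.append(OPERATORS[symbol](left, right))
        else:                            # Rule 2: push the variable's value
            stack.append(values[symbol])
    return stack.pop()                   # the final pop yields the result


result = eval_postfix("Y Z * X +", {"X": 5, "Y": 10, "Z": 15})
```

With these sample values, result is (10 * 15) + 5 = 155. Note that the first value popped is the right-hand operand, which matters for subtraction.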
Stack–Based (Zero Operand) Machine
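Evaluate the expression Z = (X * Y) + (W * U)
which must be evaluated in its postfix (RPN) version: XY*WU*+
Instruction    Stack after the instruction (top at the right)
Push X         X
Push Y         X   Y
Mult           X*Y
Push W         X*Y   W
Push U         X*Y   W   U
Mult           X*Y   W*U
Add            X*Y + W*U
Pop Z          (empty; Z now holds X*Y + W*U)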
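A zero-operand machine of this kind is easy to simulate. The sketch below assumes that a program is a list of (opcode, operand) pairs and that memory is a dictionary from variable names to values. Only the four opcodes used above are handled.

```python
def run_stack_machine(program, memory):
    """Run a list of (opcode, operand) pairs and return the final stack."""
    stack = []
    for opcode, operand in program:
        if opcode == "Push":
            stack.append(memory[operand])     # copy a value onto the stack
        elif opcode == "Pop":
            memory[operand] = stack.pop()     # store the stack top in memory
        elif opcode in ("Add", "Mult"):
            right = stack.pop()               # Add and Mult name no operands;
            left = stack.pop()                # both come from the stack
            stack.append(left + right if opcode == "Add" else left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack


# Z = (X * Y) + (W * U), in the instruction order shown above.
PROGRAM = [
    ("Push", "X"), ("Push", "Y"), ("Mult", None),
    ("Push", "W"), ("Push", "U"), ("Mult", None),
    ("Add", None), ("Pop", "Z"),
]
```

Running PROGRAM with X = 2, Y = 3, W = 4, U = 5 leaves 26 in Z and an empty stack. The arithmetic instructions carry no address at all, which is why this design is called a zero operand machine.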
More Explicit Evaluation: W = Y * Z
It might be good at this point to make explicit a point that many consider obvious.
Suppose that before the evaluation above, W = 0, Y = 10, and Z = 15.
The push–pop code to execute this instruction will go as follows:
PUSH Y // Value of Y is placed on the stack.
PUSH Z // Value of Z is placed onto the stack.
MULT // Pop the top two values, multiply them
// and push the result onto the stack.
POP W // Pop the value at stack top and place into W.
At the start, W = 0, Y = 10, Z = 15, and the stack is empty.
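More Explicit Evaluation: Part 2
PUSH Y // Value of Y is placed on the stack.
       // The stack now holds 10.
PUSH Z // Value of Z is placed onto the stack.
       // The stack now holds 10 and 15, with 15 at the top.
More Explicit Evaluation: Part 3
MULT   // Pop the top two values, multiply them
       // and push the result onto the stack.
       // The stack now holds 150.
POP W  // Pop the value at stack top and place into W.
       // Now W = 150 and the stack is empty.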
Practical Evaluation of Postfix Operators
Here are some steps for manual evaluation.
1. Scan the postfix expression left to right. Find the first operator.
2. Determine how many operands that operator requires. Call it N.
Select the N operands immediately before the operator and apply the operator to them.
3. Remove the operator and its operands from the expression.
Replace them with the result of the operation.
4. Repeat until a single value remains.
Here is an example, using single digit numbers to start with.
6 5 + 4 2 – *
(6 5 +) 4 2 – *
(11) 4 2 – *
(11) (4 2 –) *
(11) (2) *
22
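The manual procedure can also be written as a rewriting loop that records each intermediate expression. This sketch assumes integer operands, tokens separated by spaces, and the ASCII hyphen for subtraction. The function names are hypothetical.

```python
OPERAND_COUNT = {"+": 2, "-": 2, "*": 2}     # every operator here is dyadic


def apply_operator(op, args):
    left, right = args
    if op == "+":
        return left + right
    if op == "-":
        return left - right
    return left * right


def reduce_postfix(text):
    """Return the list of intermediate expressions, ending in a single value."""
    tokens = [t if t in OPERAND_COUNT else int(t) for t in text.split()]
    steps = []
    while len(tokens) > 1:
        # Step 1: find the first operator.
        i = next(k for k, t in enumerate(tokens) if t in OPERAND_COUNT)
        # Step 2: apply it to the N operands just before it.
        n = OPERAND_COUNT[tokens[i]]
        result = apply_operator(tokens[i], tokens[i - n:i])
        # Step 3: replace the operator and its operands with the result.
        tokens[i - n:i + 1] = [result]
        steps.append(list(tokens))
    return steps


steps = reduce_postfix("6 5 + 4 2 - *")
```

For the example above, steps holds three expressions: 11 4 2 - *, then 11 2 *, then 22. Unlike the stack method, this version rescans from the left after every reduction, which mirrors what one does by hand.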
Practical Evaluation of Prefix Operators
Note: We assume Lisp notation, with full use of parentheses.
1. Scan the expression left to right. If the expression does not contain another
parenthesized expression, evaluate it. Otherwise, evaluate its
subexpressions first. Formally, this is viewed as a tree traversal.
2. Keep “moving up” until the outermost expression has been evaluated.
Here is an example. Again, it starts with single digit numbers in order to be more easily read.
Evaluate the prefix expression: ( + ( * 4 2 ) ( – 5 2 ) ( * ( + 6 4) (+ 1 1 ) ) )
(* 4 2) can be evaluated, so: ( + 8 ( – 5 2 ) ( * ( + 6 4) (+ 1 1 ) ) )
(– 5 2) is defined as 5 – 2, so: ( + 8 3 ( * ( + 6 4) (+ 1 1 ) ) )
(+ 6 4) can be evaluated, so: ( + 8 3 ( * 10 (+ 1 1 ) ) )
(+ 1 1) can be evaluated, so: ( + 8 3 ( * 10 2 ) )
(* 10 2) can be evaluated, so: ( + 8 3 20 )
(+ 8 3 20) = 31, so: 31
Expression Tree for the Prefix Evaluation
The expression is ( + ( * 4 2 ) ( – 5 2 ) ( * ( + 6 4) (+ 1 1 ) ) ).
Software to evaluate this prefix expression would first build this tree (not very hard to do)
and then evaluate it. For example, the node (+ 6 4) would be replaced by 10, etc.
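Once the tree exists, evaluation is a short recursive traversal. In the sketch below the tree is written by hand as nested Python tuples of the form (operator, child, child, ...); building it from the text is not shown. It assumes that every operator has at least two operands.

```python
import math


def evaluate(node):
    """Evaluate an expression tree given as nested tuples (op, arg, arg, ...)."""
    if not isinstance(node, tuple):
        return node                            # a leaf is just a number
    op, *children = node
    args = [evaluate(child) for child in children]   # subtrees first
    if op == "+":
        return sum(args)
    if op == "*":
        return math.prod(args)
    if op == "-":
        first, *rest = args
        return first - sum(rest)               # (- 5 2) is 5 - 2
    raise ValueError(f"unknown operator: {op}")


# ( + ( * 4 2 ) ( - 5 2 ) ( * ( + 6 4 ) ( + 1 1 ) ) )
TREE = ("+", ("*", 4, 2), ("-", 5, 2), ("*", ("+", 6, 4), ("+", 1, 1)))
value = evaluate(TREE)
```

Here value is 31, in agreement with the step-by-step evaluation. Each recursive call replaces a node by its value, just as the node (+ 6 4) is replaced by 10.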
Expression Tree Evaluation
Single Accumulator Machine
Evaluate the expression Z = (X * Y) + (W * U)
This requires an extra storage location, which I call T for “Temp”.
Load X   // AC now has X
Mult Y   // AC now has (X * Y)
Store T  // Now T = (X * Y). AC still has (X * Y)
Load W   // AC now has W
Mult U   // AC now has (W * U)
Add T    // AC now has (X * Y) + (W * U)
Store Z  // And so does Z.
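The same program can be traced with a small simulator. This is a sketch under stated assumptions: a program is a list of (opcode, address) pairs, memory is a dictionary, and only the four opcodes used above exist.

```python
def run_accumulator(program, memory):
    """Execute (opcode, address) pairs on a one-register machine; return the AC."""
    ac = 0
    for opcode, address in program:
        if opcode == "Load":
            ac = memory[address]
        elif opcode == "Store":
            memory[address] = ac          # the AC keeps its value
        elif opcode == "Add":
            ac += memory[address]
        elif opcode == "Mult":
            ac *= memory[address]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return ac


# Z = (X * Y) + (W * U), using T as the temporary location.
PROGRAM = [
    ("Load", "X"), ("Mult", "Y"), ("Store", "T"),
    ("Load", "W"), ("Mult", "U"), ("Add", "T"), ("Store", "Z"),
]
```

With X = 2, Y = 3, W = 4, U = 5, the run leaves T = 6 and Z = 26. The temporary T is needed because the single accumulator cannot hold (X * Y) while (W * U) is being computed.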
Multiple Register Machines
Evaluate the expression Z = (X * Y) + (W * U)
Standard CISC Approach (There are many variants of this one.)
Load R1, X    // R1 has X
Mult R1, Y    // R1 has (X * Y)
Load R2, W    // R2 has W
Mult R2, U    // R2 has (W * U)
Add R1, R2    // R1 has (X * Y) + (W * U)
Store R1, Z   // And so does Z
Load–Store RISC
Load R1, X    // R1 has X
Load R2, Y    // R2 has Y
Mult R1, R2   // R1 has (X * Y)
Load R2, W    // R2 now has W
Load R3, U    // R3 has U
Mult R2, R3   // R2 has (W * U)
Add R1, R2    // R1 has (X * Y) + (W * U)
Store R1, Z   // And so does Z.
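Both instruction sequences can be run on one simulated register file to confirm that they compute the same value. The sketch assumes three registers named R1 to R3 and a dictionary for memory. Instructions are (opcode, register, operand) triples, where the operand may name a register or a memory location.

```python
REGISTERS = ("R1", "R2", "R3")


def run_register_machine(program, memory):
    """Execute (opcode, register, operand) triples; return the register file."""
    regs = dict.fromkeys(REGISTERS, 0)

    def read(operand):
        # An operand names either a register or a memory location.
        return regs[operand] if operand in regs else memory[operand]

    for opcode, reg, operand in program:
        if opcode == "Load":
            regs[reg] = read(operand)
        elif opcode == "Store":
            memory[operand] = regs[reg]
        elif opcode == "Add":
            regs[reg] += read(operand)
        elif opcode == "Mult":
            regs[reg] *= read(operand)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return regs


CISC = [
    ("Load", "R1", "X"), ("Mult", "R1", "Y"),
    ("Load", "R2", "W"), ("Mult", "R2", "U"),
    ("Add", "R1", "R2"), ("Store", "R1", "Z"),
]
RISC = [
    ("Load", "R1", "X"), ("Load", "R2", "Y"), ("Mult", "R1", "R2"),
    ("Load", "R2", "W"), ("Load", "R3", "U"), ("Mult", "R2", "R3"),
    ("Add", "R1", "R2"), ("Store", "R1", "Z"),
]
```

In the RISC sequence only Load and Store touch memory, while every arithmetic instruction works on registers. The CISC sequence is shorter because Mult may take one operand straight from memory.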
Data Types
Each Instruction Set Architecture has a set of data types for which it provides hardware
support. The standard set of data types includes:
1. Character (Single bytes, but soon to be 16 bits to support UNICODE)
2. Integers 8–bit, 16–bit, 32–bit, etc.
Signed (Two’s–Complement) and Unsigned
3. Real Numbers (Usually the IEEE–754 Standard Floating Point)
(Usually 32–bit and 64–bit formats are both supported)
4. Boolean TRUE or FALSE, usually stored as a single byte.
0 = FALSE, everything else is TRUE
Other languages support additional data types.
5. Decimal Business–oriented languages, such as COBOL, avoid floating point
arithmetic with its round–off errors and use BCD arithmetic.
6. Complex This arithmetic allows for taking the square roots of negative
numbers. Usually supported by a software package.
Instruction Types
There are basic instruction types that are commonly supported.
1. Data Movement (Better called “Data Copy”)
These copy bytes of data from a source to a destination.
If X and Y are 32–bit real numbers, then the instruction Y = X makes a
copy of the four bytes associated with X and places them in the address for Y.
2. Arithmetic The standard arithmetic operators are usually supported.
If division is supported, one usually also has the mod and rem functions.
On business–oriented machines, decimal arithmetic is always supported.
Graphics–oriented machines usually support saturation arithmetic.
Real number arithmetic is often not handled directly in the CPU, but by
a coprocessor attached to it.
Early on (Intel 80486 / 80487) this was a cost consideration.
RISC machines follow this approach in order to keep the CPU simple.
3. Boolean Logic Here I differ from the book’s description. Boolean instructions
are often used for non–Boolean purposes, such as bit manipulation.
The real Boolean instructions are the conditional branches.
More Instruction Types
4. Bit Manipulation These use instructions that appear to be Boolean, but
in fact operate differently. This is a distinction lost on many students,
perhaps because it is rarely important. More on this in a moment.
5. Input / Output The computer must communicate with external devices,
including permanent storage, printers and keyboards, process control devices
(through dedicated I/O registers), etc.
The MARIE architecture has a dedicated input device and a dedicated output
device. All commercial machines have “addressable” I/O devices; i.e., the
CPU issues a device identifier that appears to be an address to select the device.
From the CPU viewpoint, each I/O device is nothing more than a set of registers
(control, status, and data) and some timing constraints.
6. Transfer of Control These alter the normal sequential execution of code.
At the primitive machine language level, all we have is unconditional jumps,
conditional jumps, and subroutine invocations. Higher level languages elaborate
these features with “structured constructs” such as conditional statements,
counted loops, and conditional loops.
More on Logical Instructions vs. Bitwise Instructions
This is quite important in C, C++, and Java.
Consider the standard implementation of Boolean values as 8–bit bytes. This
is done for convenience in addressing by the CPU, as single bits are not addressable.
Let A = 0000 0000    C = 0000 0010
    B = 0000 0001    D = 0000 0011

Logical operators in C++    AND    &&    Expression && Expression
                            OR     ||    Expression || Expression
Bitwise operators in C++    AND    &     Expression & Expression
                            OR     |     Expression | Expression
                            XOR    ^     Expression ^ Expression

Logical                  Bitwise
A && B = 0 (FALSE)       A & B = 0000 0000
A || B = 1 (TRUE)        A | B = 0000 0001
C && D = 1 (TRUE)        C & D = 0000 0010
C || D = 1 (TRUE)        C | D = 0000 0011
Source: The Annotated C++ Reference Manual (Sections 5.11 – 5.15)
Margaret Ellis and Bjarne Stroustrup, Addison–Wesley, 1990.
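The table can be checked mechanically. The sketch below is in Python, not C++, so the logical operators are modeled by two hypothetical helper functions that follow the C++ rule (zero is FALSE, anything else is TRUE, and the result is 0 or 1). The bitwise operators & and | behave the same way in both languages for these values.

```python
def logical_and(a, b):
    """Model of C++ `a && b`: nonzero counts as TRUE; the result is 0 or 1."""
    return int(bool(a) and bool(b))


def logical_or(a, b):
    """Model of C++ `a || b`: nonzero counts as TRUE; the result is 0 or 1."""
    return int(bool(a) or bool(b))


def bits(value):
    """Format a byte as two groups of four bits, for example 0000 0011."""
    text = format(value & 0xFF, "08b")
    return text[:4] + " " + text[4:]


A, B, C, D = 0b00000000, 0b00000001, 0b00000010, 0b00000011
```

The difference shows most clearly with B and C. Both are nonzero, so logical_and(B, C) is 1 (TRUE), but they share no set bits, so bits(B & C) is 0000 0000.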
A Context for Bitwise Operators
For simplicity I consider a very old (late 1960’s) Line Printer, a predecessor to today’s
laser printer. We examine the Status/Control register for the LP–11.
This register is called “LPS” for “Line Printer Status” in the literature.
We have here two status bits and a control bit.
Status bits    Bit 15   Error   If Error = 1, then there is a device error, such as
                                power off, no paper in the printer, etc.
               Bit 7    Done    If Done = 1, the printer is ready for the next line.
Control bit    Bit 6    IE      If IE = 1, the printer is enabled to raise an interrupt
                                whenever Done becomes 1 or Error becomes 1.
More on the LPS Register
Why this arrangement of bits?
The PDP–11, for which the LP–11 was used, did not support 8–bit arithmetic.
A 16–bit integer was the smallest that the CPU would handle.
Viewed as a 16–bit signed integer, we note that the error bit (Bit 15) is the
sign bit. To test for an error, we just read the LPS into a register and test if it is negative.
Testing the Done Bit
Recall that the Done bit is bit 7 and that 0000 0000 1000 0000 is 0x0080.
LPS             E000 0000 DI00 0000
0x0080          0000 0000 1000 0000
LPS & 0x0080    0000 0000 D000 0000
If ( 0 == (LPS & 0x0080) ) then the Line Printer is Not Done
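Both status tests can be sketched in Python, treating the LPS as a plain 16–bit integer. The function and constant names are hypothetical. Python integers are unbounded, so the signed view of the word is computed explicitly.

```python
DONE_MASK = 0x0080      # bit 7
ERROR_MASK = 0x8000     # bit 15, the sign bit of a 16-bit word


def is_done(lps):
    """True when the Done bit (bit 7) is set."""
    return (lps & DONE_MASK) != 0


def as_signed16(word):
    """Interpret a 16-bit pattern as a two's-complement integer."""
    word &= 0xFFFF
    return word - 0x10000 if word & ERROR_MASK else word


def has_error(lps):
    """Test the Error bit the way the text describes: is the word negative?"""
    return as_signed16(lps) < 0
```

For example, is_done(0x0080) is True and is_done(0x0040) is False. Likewise, has_error(0x8000) is True because the pattern reads as -32768 when viewed as a signed 16–bit integer.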
Still More on the LPS Register
Testing and Setting the Interrupt Enable Bit
Recall that the Interrupt Enable bit is bit 6 and that 0000 0000 0100 0000 is 0x0040.
Its complement, 1111 1111 1011 1111, is 0xFFBF.
Testing the Interrupt Enable Bit
LPS             E000 0000 DI00 0000
0x0040          0000 0000 0100 0000
LPS & 0x0040    0000 0000 0I00 0000
If ( 0 == (LPS & 0x0040) ) then the Line Printer Interrupt is disabled.
Yet More on the LPS Register
Enabling Interrupts (Setting the I Bit)
LPS E000 0000 DI00 0000
0x0040 0000 0000 0100 0000
LPS | 0x0040 E000 0000 D100 0000
Setting LPS = LPS | 0x0040 enables the interrupt and leaves the other bits unchanged.
Disabling Interrupts (Clearing the I Bit)
LPS E000 0000 DI00 0000
0xFFBF 1111 1111 1011 1111
LPS & 0xFFBF E000 0000 D000 0000
Setting LPS = LPS & 0xFFBF disables the interrupt and leaves the other bits unchanged.
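Setting and clearing the I bit can be sketched the same way. The function names are hypothetical, and the LPS is again modeled as a plain integer rather than a real device register. The final & 0xFFFF keeps results within 16 bits.

```python
IE_MASK = 0x0040                     # bit 6, Interrupt Enable


def enable_interrupts(lps):
    """OR with the mask: sets bit 6, leaves every other bit unchanged."""
    return (lps | IE_MASK) & 0xFFFF


def disable_interrupts(lps):
    """AND with the complemented mask: clears bit 6, leaves the rest."""
    return lps & ~IE_MASK & 0xFFFF   # ~0x0040 within 16 bits is 0xFFBF
```

Starting from 0x8080 (Error and Done set), enable_interrupts gives 0x80C0, and disable_interrupts applied to 0x80C0 gives back 0x8080. The Error and Done bits pass through both operations untouched.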