The following is an HTML version of the Instructor's manual for
Computer Organization and Assembly Language Programming by
James L. Peterson, published by Academic Press, 1978. The
Instructor's manual was originally published in 1980 (ISBN
0-12-552252-5). It is mostly a faithful reproduction of the
content of the original printed book, with the exception that it
has been modified to match the current HTML version
of Computer Organization and Assembly Language
Programming. The main difference is that all exercises are
identified by number (1, 2, 3, ...) instead of their original
mixture of numbers and letters (a, b, c, ...).
THE DESIGN OF A MIX SYSTEM
1.0 INTRODUCTION
MIX was introduced by Knuth [4,5,6,7] in 1968 to
aid in the analysis and description of computer
algorithms. It was designed to be like most machines,
but not to be any particular machine. It has been used
throughout the three volumes of The Art of Computer
Programming [4,5,6] which have been written, and has
also been adopted by some universities for the teaching
of assembly language programming [3,11].
1.1 MIX For Teaching
The MIX machine is by no means ideal for teaching
introductory machine organization and programming. Its
major alternative is the use of a real machine. With
the decreasing cost and increasing availability of
minicomputer systems, the potential for teaching with a
real computer system, such as the PDP-11, HP 2100, or
other minicomputer, improves. Most textbooks for this
subject are aimed more toward using a real computer
than a simulated one. The PDP-11 in particular is a
very popular choice.
1.2 Teaching A Real Machine Or Teaching MIX
It has been argued, however, that the use of a
real machine has as many potential disadvantages as
advantages. Specifically, although the IBM 370 and the
PDP-11 computer systems (which are most commonly used
for teaching assembly language programming) are quite
common machines, a student is just as likely to end up
programming on a computer from Univac, CDC, Data
General, Interdata, or any of a number of other
computer manufacturers, or even other architectures
from IBM or DEC (such as the Series 1 or PDP-8 or
DEC-10). Thus, what is important is the basic
programming techniques and understanding, not a
specific programming language. There is also the
problem of programming features or quirks peculiar to
the specific machine being used.
Accessibility is another potential problem.
Either students must have sole use of the machine, or
they must share it with other users. Sharing will
necessitate the use of an operating system, which will
both deny the student access to the real I/O
instructions and require that operating system topics,
such as system calls, file descriptor blocks, and so
on, be discussed rather than the basic I/O concepts of
the underlying machine. Thus, use of a large
multiprogrammed campus computer is providing a
simulated machine to the students; the simulation is
defined by and provided by the operating system.
Using a small dedicated minicomputer system to
provide the students with the ability to access and
control the entire machine without operating system
interference may cause other problems. One is the
requirement for hardware maintenance, and the
disruption of hardware failures. A simulated machine
such as MIX can be made available on all computers on
campus, allowing the job load to simply shift
temporarily to any available backup computer system,
even one remote from campus, if necessary. Another
problem is caused by the cost of providing sufficient
machine resources for the typically large number of
students in an undergraduate assembly language course.
A simulated machine can, of course, be run in a shared
mode on a large fast system.
One last advantage of a simulated machine is the
ability to know exactly what the system does in all
cases, since the program listing is available for
inspection. The only recourse for a real machine may
well be an examination of the actual circuit diagrams
which may themselves be either incorrect or vague. (I
once worked with a machine with an instruction which
moved the program counter into an index register. None
of the manuals indicated when the program counter was
updated, however, and our engineers never were able to
decide either. Empirical testing indicated that either
it was a rather complex function of the previous,
current, and next instruction, or a race condition.)
Other considerations, such as simplicity,
textbooks, cost, and so on can also be argued [3].
2.0 MIX SIMULATION SYSTEMS
All of the purported advantages of a simulated MIX
computer, however, depend upon the existence of a MIX
simulator. Since a MIX system requires a computer, and
the computer probably has an assembler but probably
does not have a MIX system, it is often easier to teach
the available assembly language rather than construct a
MIX system. The alternative to constructing your own
MIX system is of course to import a MIX system written
elsewhere.
Many MIX systems have been written, with various
success and various capabilities. A MIX system written
in 360 assembly language exists at Stanford [8].
Another system for the 360 is used at UCLA [2] and
another at Central Missouri State College [9]. These
are written specifically for the IBM 360/370
architecture, and could not then be moved to another
computer, but should be able to be moved to another
360/370 site. Another system was written at the
University of Texas [3,10] for CDC 6000 series
computers, but it is written in a mixture of assembly
language and an obsolete systems programing language
for a locally developed operating system, so movement
of the simulator even to similar computer systems is
almost impossible.
Systems developed primarily for their portable
nature include a Fortran system at the University of
Wisconsin [12] and one at the University of British
Columbia [1]. The latter system has several documents
which explain the system and how to transport it.
Additional systems may also be available.
3.0 MIX SYSTEM WRITING
Writing a MIX system is not a particularly
difficult task, however, and we would encourage you to
seriously consider writing your own. Knuth has given a
portion of the system and discussed the general
approach, in Volume One [4]. Writing a system does
require some time and care. The resulting system is
typically large (some 70 pages for [1] and around 100
pages for [10]). Since it may exist for some time and
may need to be moved from machine to machine, it should
be well documented, modular, transportable, but also
efficient in both time and space. Writing such a
program is a good exercise in software engineering.
The most crucial part of a successful simulator
system is a good basic design. The basic design of the
system is implementation independent. A clear view of
the complete design of the MIX simulator is very
important for a successful product. This paper
presents a top-down design for a MIX system. The
design is presented only generally, in a narrative
English, supplemented by some occasional Pascal-like
data definitions and flow of control. We are mainly
interested in the overall system design, and its
components. Certain of these components, such as the
assembler and the instruction execution simulator, have
been adequately covered elsewhere [4,11], and we simply
refer you to those references for that portion of the
design. However, the entire system is composed of more
than these two parts.
It is intended that the design presented here can
be and will be freely implemented in many different
languages by many different people on many different
computers.
4.0 A DESIGN FOR A MIX SYSTEM
The most common problem with MIX systems is an
incorrect assumption that the cpu instruction execution
is the primary part of a MIX system. To the contrary,
instruction execution is a minor portion of the
simulated system. Figure 1 shows the hierarchical
structure of our system design.
At the highest level, the MIX system consists of
several parts. The MIXAL assembler reads an assembly
language program and assembles it into MIX memory. The
MIX computer system executes the program which is in
MIX memory. MIX memory is the interface between the
assembler and the computer system.
The flow of control between these two components
is the main program of the MIX system. It can be
written as simply as,
BEGIN
Zero MIX memory;
Assemble a MIXAL program;
If there are no assembly errors,
Then execute the MIX program;
END.
This simple MIX system allows one MIXAL program to be
assembled and executed per job. For many applications
this is entirely adequate. Input MIX jobs would
consist of a MIXAL program, terminated by an END card,
immediately followed by any data to be used by the MIX
program in its execution.
4.1 MIX Control Cards
An alternative, more complex but more general, MIX
system uses MIX control cards to determine the flow of
control between the components of the MIX system,
rather than using a fixed flow of
assemble/execute/quit. In this case, the general MIX
system flow of control would be,
BEGIN
Zero MIX memory;
While there are still more MIX control cards,
do begin
get next control card;
Case control card type of
Assemble;
Execute;
...
end Case;
end While;
END.
The design of the control card system must
obviously be determined at each site, according to the
normal usage at each site. Typically control cards are
identified by a special character ($, @, ! or such) in
the first column of an input line. This is followed by
a keyword identifying the next step to be performed,
and optionally, some parameters. A typical set of
control card functions might include:
- $MIXAL - identifies the following cards as a MIXAL
assembly program which is to be assembled
into MIX memory. Possible parameters
include: listing control, assembly time and
line limits, input and output file locations,
etc.
- $RUN - starts (or restarts) the execution of the
program in MIX memory if there were no
assembly errors. Data would normally follow
this card. Possible parameters include:
time limit, line limit, location of input
data or output results, starting address,
etc.
- $GO - simulate the GO button of exercise 1.3.1-26 of
[4].
- $DUMP - prints MIX memory onto the output device. The
output dump would be for debugging purposes
and might include register contents and
memory contents in octal, decimal, character
or instruction format. Options would
indicate first and last word addresses and
format.
- $TRACE - turns on a tracing debug feature. Parameters
might indicate which words are to be traced.
Other control cards might also be necessary or
desired.
5.0 THE MIXAL ASSEMBLER
The MIXAL assembler has been discussed in great
detail elsewhere [11], and we refer the reader to that
discussion. Suffice it to say that the MIXAL assembler
assembles the input MIXAL program into a MIX machine
language program in MIX memory. The assembler produces
a listing on the output device and preferably a cross
reference listing also. The assembler may be one-pass
or two-pass, may be extended to include additional
pseudo-instructions, macros, conditional assembly or
whatever is desired. The important feature of the
assembler is that it is essentially self-contained and
interfaces with the remainder of the MIX system through
a small number of well-defined variables such as,
- MIX memory.
- The starting address (may be passed in the
program counter).
- The number and severity of assembly errors.
6.0 MIX MEMORY AND REGISTERS
MIX memory is an extremely important aspect of a
MIX system. It is very important that the
representation of the memory be very carefully thought
out. We would recommend strongly that information
hiding techniques be used to encapsulate the actual
representation of memory and hide it from all other
routines in the system.
In the abstract, of course, the definition of MIX
memory is quite straightforward.
MIX memory: array [0..3999] of MIX words;
MIX words = record
sign: (positive, negative);
array [1..5] of byte;
end record;
A byte is either a subrange 0..63 or 0..99 depending on
the base of the MIX machine. (One possibility is to
fix the byte size as a compile time constant; another
is to use a run time parameter.)
In fact, efficiency in time and space demand a
more complex representation of memory. At the least,
bytes should be packed to minimize memory usage. In
the Stanford MIX system, a decimal machine, each MIX
word occupies a double word (8 bytes, 64 bits). The
UT-MIX system stores six 6-bit quantities in one 60-bit
word, with the high-order 6-bit byte being either 0 (a
positive sign) or 63 (a negative sign). John Baker's
MIX system uses the lower 30 bits of a word to
represent bytes 1:5 of a binary MIX word and the next
bit to be the sign (0=positive; 1=negative). The
major problem with all of these is the need for a
minimum of 31 bits per word.
On a 16-bit minicomputer or an 8-bit
microprocessor, alternative schemes may be needed. One
approach is of course simply to use multi-word
programming techniques to simulate a 32-bit word from
four 8-bit bytes, in which a MIX word is then packed.
For a binary MIX machine, this may require considerable
shifting and masking, since the five 6-bit bytes will
not be conveniently aligned within the four 8-bit
bytes. An alternative scheme (also needed for decimal
MIX machines) is to store each 6-bit MIX byte in one
8-bit machine byte. This leaves at least one bit per
byte left over. One of these can be used as the sign,
and others can be used for debugging flags, if desired.
The Stanford MIX system uses some extra bits to catch
accesses to undefined memory locations, and to
locations which are currently being used for input or
output.
Probably the most important attribute of the MIX
memory should be the limited access allowed to it.
Ideally, MIX memory should be treated as an abstract
data structure, and all accesses to it should be made
through a very small number of routines. The most
obvious candidates would be a STORE(address,value) and
a LOAD(address,value) pair of routines which would load
or store the value to or from memory[address]. An
alternative, or complementary, set of routines would be
LOADBYTE (address,value,L,R)
STOREBYTE (address,value,L,R)
which would access the L:R byte of memory[address].
The LOAD routine is obviously just a
LOADBYTE(address,value,0,5), while the LOADBYTE routine
could call LOAD to fetch the entire word and then
extract the relevant bytes to return. Thus, these two
pairs of routines are complementary and both may not be
needed.
With proper design and restraint on the part of
the programmer, only these two routines, and the
routines to perform arithmetic and comparison
operations on MIX words need ever know how memory is
represented. Such an approach would thus allow the
memory representation to be easily changed, if this
were desired.
7.0 THE MIX COMPUTER SYSTEM
As indicated in Figure 1, the MIX computer system
is itself composed of several separate parts. The
reason for this decomposition is to allow the
representation of realistic parallelism between the
input/output system and the central processor unit.
While it is easiest to write a MIX simulator which
ignores I/O-cpu overlap, this prevents the programmer
from learning about this very important part of
assembly language programming.
The most appropriate programming technique to use
for controlling the computer simulation is an
event-driven simulation. This is possible because all
events in the MIX system are deterministic in their
time behavior.
As an example, assume that the cpu is computing
along, and executes an IN instruction for the card
reader, with the current clock at 1274 MIX time units.
The IN instruction takes one MIX unit moving the clock
to 1275. If it takes 1000 MIX units to read a card,
then the card reader will be busy until 2275. So we
post the event "card reader done" to occur at time
2275. The cpu then continues with the next
instruction. If the next instruction is an OUT to the
line printer which takes 1200 MIX units to print a
line, we post the event "line printer done" for time
2476. If the next instruction, after the OUT, were an
IOC to tape unit 0, taking 400 MIX units to rewind tape
0, we post another event "tape unit 0 done" for time
1677. Now we continue executing instructions until the
first posted event occurs. As the clock ticks, each
event must be handled at the appropriate time.
An event driven simulation requires only that
every time the clock is advanced, the current clock
value is compared against the posted time for all
pending events, and if it is time for any event, that
event should then be handled. A simple implementation
of this uses an event queue. The event queue is a list
of all pending events. Posting an event means entering
an element into the event queue. Events are taken off
the queue when it is time for them to occur. Each
event is described by a record which would contain,
- the scheduled time for the event
- the type of event
- other parameters (such as device unit number)
- a pointer to the next pending event
The last item is used to construct a linked list of
events; it would not be needed if a contiguous queue
representation were used.
The queue should be kept sorted by scheduled time
of events. This reduces the search time to find the
next event to occur. Entry into the queue requires a
sorted entry (insertion sort should be adequate). The
time of the next pending event is the time of the event
at the front of the queue. This allows the general
flow of the MIX computer system to be:
While the computer is not halted
do begin
while the queue is empty
or the next event time in the queue > clock
do execute the next instruction;
remove the event at the front of the queue;
case event of ... end case;
end while;
Several points are important here. Only one
instruction is executed between checks of the event
queue to see if an event is scheduled to occur. This
assures that we do not "overrun" any events. Also,
several events may be scheduled to happen at the same
time. Both executing instructions (such as I/O
instructions) and handling events may cause new events
to be put into the queue. The case of halting the
computer can be handled by scheduling an event ("halt")
to happen after the last I/O device is finished, or by
simply advancing the clock to the scheduled time for
the last event (or both).
Some problems with this approach can still exist
for the purist. The design given above does not allow
events to occur except between instruction execution.
Consider an event scheduled to occur at time 4201, with
the current clock value being 4200. Since it is not
yet time for the event, we execute the next
instruction. If the next instruction is, say, a MOVE
of 24 words (taking 49 time units), the clock value
will be 4249 before we check the front of the queue
again. Thus the scheduled event may occur late. In
most cases this does not matter, but if the late event
creates a new event, what time is used for the new
event? Does it occur a fixed amount of time after the
late event should have occurred (its scheduled time) or
its late time? This can be a complicated question.
Consider the case of the event being an input or
output of one word to or from memory, such as is
suggested below. Words are moved between the I/O
system and memory at regular intervals (the transfer
time divided by the number of words per record). Thus,
without the problem mentioned above transfer would
happen at time k, k+t, k+2t, k+3t, ... If the transfer
at time k occurs at time k', are all later transfers
rescheduled? Consider that for memory reference
instructions or MOVES, memory referencing would in fact
be delayed, either for the cpu or the I/O device
because of memory interference, but for shifts, MUL and
DIV, only the first time unit (corresponding to
instruction fetch) could cause interference; at any
other time cpu processing and memory referencing for
I/O would occur in parallel. Also note that on most
systems, I/O has priority over cpu memory references.
Correct simulation at this level of detail is
rather difficult and requires the MIX memory access
routines to check the event queue for I/O transfers
whenever access to memory is made by the cpu. It is
felt that this level of detail is unnecessary and would
overly complicate the simulator.
8.0 INPUT/OUTPUT DEVICE SIMULATION
Each I/O device should be represented by a
description in the I/O device array. As defined by
Knuth, only 20 devices are possible, although the
architecture clearly allows up to 64 devices. A device
descriptor would be indexed by the device unit number
and probably contain at least the following
information:
- device name
- device type: (card reader, tape, disk, ...)
- device record length
- device buffer (of the appropriate length).
All actual I/O would be between the host
machine operating system and this buffer.
Words would be transferred between here and
MIX memory at appropriate times.
- device existence (exists/does not exist for
this configuration)
- device status (busy/ready)
- time when device will be ready
- other device state information
We also need a device event routine for each device,
for each type of event, including IN, OUT, IOC, JBUS
and JRED I/O instructions.
The device event routines will vary from device to
device, but they can all be written in much the same
form. For an I/O instruction which transfers between
MIX memory and an I/O device, the simulator should
transfer words one at a time between MIX memory and the
device buffer in the descriptor record. Each device
would transfer words at regular intervals with a
possible start up delay and a stop delay. Suitable
values for start-up, stop and word transfer times can
be determined by examining times for I/O devices for
typical minicomputers.
These three times would also typically correspond
to the actions that must be taken by the simulator.
For example, for an IN instruction, the first event
would be the IN instruction. This would require the
appropriate host I/O to read the next MIX record into
the device buffer (in the descriptor). We would also
need to compute and save the effective address in MIX
memory to transfer to, and set the device status busy.
This corresponds to the start up delay. An event
"transfer ith word into memory[address]" would be
posted to occur after the start up delay, with i=1 and
the address being the effective address.
When the event "transfer ith word into
memory[address]" occurs, the ith word would be
transferred from the device buffer into MIX memory.
Another event "transfer ith word into memory[address]"
would be posted to occur after a transfer delay time
with i and the memory address incremented by one,
unless the value of i corresponds to the last word in
the input record. In this case, we would post an event
"device j input is complete" to occur after a stop time
delay. The "device j input is complete" event merely
clears the device status back to ready and any other
necessary clean-up.
Output is similar.
An alternative to this approach is to schedule all
of the transfer instructions to occur at their
respective times when the IN or OUT instruction is
executed. The difference between these two approaches
lies in the size of the event queue. In the first
approach, only one event per device will be scheduled,
but that event may cause a new event to be scheduled.
In the second approach, 16 or 24 or 100 events may be
scheduled per device at one time, but none of these
events will create any new events.
From this discussion, we can see that the
following events would probably be sufficient:
"IN instruction", device, address
"OUT instruction", device, address
"IOC instruction", device, address
"move word in", device, word number, address
"move word out", device, word number, address
"I/O operation complete", device
The word number and address are best included in the
event queue entry or the device descriptor record.
This means that no local information need be retained
by the device event handler, and thus, additional
devices of the same type can use the same device event
handler. (We do not need separate device handlers for
tape drive 0 and tape drive 1, although we may need
separate device handlers for each type of device.)
The simulation of card reader and line printer
devices should present no real problem, but tape and
disk drives may require careful thought. A magnetic
tape device can be simulated by most sequentially
accessed files. Simulating a disk or drum, on the
other band, would require both random access files and
the language features to access them. If the simulator
is implemented in Fortran, such random access files are
non-standard, and hence the system will not be
transportable. Also, operating system conventions for
creating, opening, closing, and deleting the files for
these devices may be quite tiresome and clumsy. If at
all possible, however, the system should simulate at
least a few tape and/or disk drives.
I/O device timing is an area which allows some
creative programming. As mentioned, most devices can
be characterized by a word transfer time and (possibly
zero) start and stop times. These times can be
determined by guess, or by examining manufacturers'
literature for typical small machine peripherals. The
IN and OUT instructions are thus quite straightforward.
The IOC instruction, on the other hand, can be quite
interesting.
- Line Printer - IOC 0 means skip to the top of
the page. The time for this is a function of
the distance from the top of the page, in
lines. IOC n (skip n lines) is a function of
n. Both suggest the need to count output
lines to keep track of the current output
line.
- Magnetic Tape - IOC 0 (rewind) is a function
of the current position of the tape. IOC n or
IOC -n (move tape n records forward or back)
is a function of n. Notice that IOC n can be
difficult to simulate in some systems with
sequential files and might then be best
simulated with a direct access file.
- Disk - An IOC (seek) takes a time which varies
according to the current position of the head
and the desired position (in the X register)
of the head. Introducing rotational delay for
IN and OUT instructions as well as head
movement delay could be complex.
A final comment about timing is that in real
systems, I/O timing is variable. Hence, greater
realism can be obtained by perturbing the calculated
times by some small random number. For example, in the
Wisconsin MIX system [12], I/O for tape units takes
1000 + X time units where X is a random number between
-200 and +200.
9.0 THE CPU INSTRUCTION INTERPRETER
The cpu instruction execution unit is a simple
standard simulator. Knuth sketches the design of such
a simulator in 1.4.3 of [4]. We would remark only that
the optional instructions and features (binary
instructions, indirect addressing, floating point
instructions) can be added to make the system more
complete.
10.0 DEBUGGING FACILITIES
What debugging facilities are to be provided for
the user of the MIX system? Remember the MIX is
basically an educational system, and so careful thought
is needed concerning debugging facilities. The
following might be considered:
- Dumps - Either partial or full dumps of MIX
memory should be available whenever a fatal
error occurs during execution. It may also be
necessary to dump on command from the user.
MIX memory is small enough that a complete
dump may be reasonable, particularly if some
scheme is used to avoid printing sequences of
identical lines (like first identical, ...,
last identical), and if MIX memory is
initialized to zero.
- Traces - A line by line trace of instruction,
program counter, register contents, clock,
effective address, and contents of the
effective address (where appropriate) should
be available when requested. An interesting
possibility used in the Stanford system [8] is
to trace any location at most a fixed number
of times (default was 2). This prevents long
loops from being followed too often but
generally gives enough information to find
errors. This would require a counter with
every instruction in MIX memory, but the
counter would also allow an execution profile
to be provided showing how often each
instruction is executed.
- Snaps and Breakpoints - These are dumps and
traces triggered by a specific program counter
value. For example, a breakpoint at a
particular location will cause a line of trace
every time that instruction is executed; a
snap will dump an area of memory every time a
specified instruction is executed.
- Memory Monitoring - A common problem is the
mysterious store which destroys the contents
of a location but the programmer does not know
where the store is (typically it is an indexed
or indirect store, or an incorrectly modified
instruction). Monitoring a particular memory
location will produce trace and dump
information whenever the specified location is
read or written (as selected).
- Execution History - Another common problem is
the fatal error which terminates the program
with the program counter being either a data
location or undefined memory. Some help on
this can be obtained by providing the user,
upon request, with a history of the
instructions executed before the fatal error.
This can be in the form of a complete
backwards trace of the last n instructions, or
a list of the last n jump addresses (address
of jump and address jumped to) allowing the
user to deduce the probable flow of control
into the unknown.
All of these debugging aids can obviously be much
more effective in a time-shared interactive
environment. Notice that the change from batch to
interactive use is only a change in the MIX control
card input source and output device.
11.0 CONCLUSIONS
We have sketched the basic design of a complete
MIX system, giving the major component pieces and
mentioning the major problems involved. Still much
work will need to be done by the programmer. Actual
code has not been given here, since that is very
machine and language dependent. However, the overall
design should be clear. The remainder then is mere
implementation.
12.0 REFERENCES
- Baker, J. L., "A MIX System", notes,
Department of Computer Science, University of
British Columbia, 1974.
- Braden, R. T., "MIX Users Guide", Campus
Computing Network, UCLA, 1975.
- Greenawalt, E. M., and Good, D. I., "The MIX
Computer as an Educational Tool", ACM National
Conference, 1972.
- Knuth, D. E., The Art of Computer
Programming, Volume 1: Fundamental
Algorithms, Addison-Wesley, 1968, Second
Edition, 1973.
- Knuth, D. E., The Art of Computer
Programming, Volume 2: Seminumerical
Algorithms, Addison-Wesley, 1971.
- Knuth, D. E., The Art of Computer
Programming, Volume 3: Sorting and Searching,
Addison-Wesley, 1973.
- Knuth, D. E., MIX, Addison-Wesley, 1970.
- Knuth, D. E., and Sites, R. L., "MIX/360
Users Guide", STAN-CS-71-197, Computer Science
Department, Stanford University, (March,
1971).
- Ohlendorf, M. W., "CMSC MIX: A Simulator
System for the MIX Computer", Master's Thesis,
Department of Computer Sciences, University of
Texas, (August, 1972).
- Peterson, J. L., "UT-MIX Reference Manual",
TR-64, Department of Computer Sciences,
University of Texas, (January, 1977).
- Peterson, J. L., Computer Organization and
Assembly Language Programming, Academic Press,
1978.
- "MIX Users Guide", Computer Sciences
Department, University of Wisconsin, (January
1970).
- Dale, P. B., Margolin, C. G., and Williams,
G. H., "MIX and MIXAL (A User's Manual)",
Computation Center, Union College,
Schenectady, New York, 1975.
MIX System Sources
This is a list of the places and people that I
have heard have MIX systems, which may be sources for
other places who want to acquire a MIX system. Other
systems may exist, and I would be glad to hear of them.
Also, some of these people may not be in a position to
widely distribute their systems.
- Stanford Center for Information Processing;
Stanford University; Stanford, California
94305. (MIX for an IBM 360).
- Campus Computing Network; Mathematical
Sciences Addition; University of California;
Los Angeles, California 90024. (MIX for an
IBM 360).
- Mr. Fred Jacobson; MACC - Madison Academic
Computing Center; 1210 W. Dayton Street;
Madison, Wisconsin 53706. (Fortran version).
- Mr. Kowalik; Academic Services; Washington
State University; Pullman, Washington 99163.
(Fortran version).
- John L. Baker; Department of Computer
Science; University of British Columbia;
Vancouver, British Columbia U6T 1W5; CANADA.
- C. G. Margolin and G. H. Williams;
Computation Center; Union College;
Schenectady, New York 12308. (MIX for B5700).
- Professor Higginbotham; Department of
Industrial Engineering; Auburn University;
Auburn, Alabama 36830. (IBM 360)
- Albert Shpuntoff; Department of Math and.
Computer Science; Morningside College; Sioux
City, Iowa 51106. (IBM 1130)
CHAPTER 1: ANSWERS TO EXERCISES
1.1 THE MEMORY UNIT (PAGE 10-11)
- Core memory and semiconductor memory. The
differences discussed on pages 4 to 8, and
center on: physical construction, speed,
magnetic versus electrical recording techniques,
and volatility. An alternate answer would be
read-write memory versus read-only memory (ROM),
but these are not necessarily physically
different, since both would typically be
semiconductor memories.
- The memory address register defines which word of
memory is to be read or written, while the memory
data register either has the information to be
stored (for a write) or the information which was
read (for a read).
- 12-bit addresses allow 4096 words of memory; 15
bits allow 32,768 words; 24-bit addresses allow
16,777,216 words.
- The cycle time of a memory is at least twice the
switching time since every memory cycle must
first read the information from memory and then
write it back into memory. This defines the
read-write cycle and is necessary because of
destructive read-outs. Access time on the other
hand is the same as switching time since the data
can be used when it is read out, before it is
re-written. Thus, access time = switching time,
and 2 * (switching time) = cycle time.
- An n-bit word can represent 2n different bit
patterns.
- For 4K of 16-bit words, we need 4096 * 16 bits =
65,536 bits. (If we included a parity bit with
each word, it would be 4096 * 17 bits = 69,632
bits.) If the memory module cost $3,000, this
would be $3000 ÷ 65536 = 4.58 cents per bit.
This cost is not unreasonable for old core
memory, but we recently bought 32K 16-bit (plus
parity) words of MOS memory for $1700, giving a
cost of about 0.3 cents per bit.
- An n-trit word would be able to represent 3n
different values. Similarly, an n-trit address
could address 3n different words. Three state
devices (-1,0,+1) are being used internally in
computers to control input-output bus control.
- If the high-order bit is used to distinguish
between modules A and B, then addresses 0 to 4095
are in one module and addresses 4096 to 8095 are
in the other module. This is known as memory
banks. If the low-order bit is used to
distinguish between modules A and B, then all the
even addresses are in one module and the odd
addresses are in the other module. This is
interleaved memory.
- Dynamic memory must be refreshed at regular
intervals or the electrical charge will decay to
the point that information is lost. Static
memories do not need to be refreshed. Both kinds
of memory are volatile and will lose information
if power is lost.
- Read-only memory is not volatile because it
cannot be changed, either by accident or on
purpose.
1.2 THE COMPUTATION UNIT (PAGES 39-41)
You can easily make up new problems in
addition, subtraction or conversion between
representation for binary, octal, decimal and
hexadecimal by just using random digits in the
appropriate base. One problem you may run into
is the TI programmer's calculator. This
hand-held calculator allows input and display in
either octal, decimal or hexadecimal, and hence
would allow the student to avoid learning the
principles behind the conversions. It would
probably be an unfair advantage on an exam.
- A number is an abstract concept which is
represented by a numeral. A digit is a symbol
which is used to construct numerals.
- The largest 16-bit unsigned number is 216-1 =
65,535.
- The largest unsigned number in n bits is 2n-1,
rather than 2n, because the numbering starts at
0, not at 1.
- 41,972 (base 10) = 0100 0001 1001 0111 0010
(BCD).
- This is a tricky question. Most people don't
think you can answer it and you can't exactly.
However, you can say that it is not base 1, 2,
3, 4, 5, or 6, since the largest digit is 6.
Thus, the correct answer is base b with b > 6.
- Both 0 and 1 have the same representation in
octal, binary, and decimal. For octal and
decimal, the numbers 0, 1, 2, ..., 7 all have
the same representation. In general, for bases
a and b, the numbers 0, ..., min(a,b)-1 have the
same representation.
- Depends on whether they have to be spelled
right. Any word with only the letters A, B, C,
D, E, and F are acceptable, including, A, BE,
CAB, CAD, BABE, ABED, DEAD, DEAF, DEED, BED,
BEAD, ACE, DEB, DAB, CEDE, BADE, FAD, FADE,
FEED, FED, ...
- Addition.
Binary
1011010 1001000111 11101011
+0101011 + 110010 +100100101
-------- ----------- ----------
10000101 1001111001 1000010000
Octal
567674 77651 3472010
+776571 +1623507 + 7743
------- -------- --------
1566465 1723360 3501753
Hexadecimal
143AF F9A8C7B 4FF58
+2E135 + 9A67B +141508
------ -------- -------
424E4 FA432F6 191460
- Conversion
Binary Decimal Octal Hexadecimal
1101 13 15 D
100110010 306 462 132
1100100 100 144 64
10010000 144 220 90
101101 45 55 2D
110010101011 3243 6253 CAB
1010111 87 127 57
101011 43 53 2B
101000100 324 504 144
- 35 bits will require 35 ÷ 3 = 12 octal digits. 12
octal digits can represent 12*3 = 36 bits. The
difference is because with a 35-bit number, the
high-order octal digit will only have two bits,
so we add a leading zero. This will restrict
the high-order octal digit to being only 0, 1,
2, or 3, so it isn't really an arbitrary octal
digit.
- To convert from binary to base four, we group
bits by twos and translate as 00 = 0, 01 = 1, 10
= 2, and 11 = 3. So 0100110110100110 =
10312212.
- Suppose we have a decimal number, 1978. Look
for the largest entry in the table which is
smaller than our decimal number. For 1978, this
is 1536 which is a 3 in digit position 4.
Subtract 1978 - 1536 leaving 442. Now repeat
with 442. We find 1978 = 1536 + 442 = 1536 +
384 + 58 = 1536 + 384 + 56 + 2 = 3672 (octal).
-
abc (base 8) | = | a * 64 + b * 8 + c |
|
| ≤ | a * 100 + b * 10 + c |
|
| = | abc (base 10). |
- A binary number bn...b1b0 = bn*2n + ... + b1*21
+ b0*20. So, 2*bn...b1b0 = bn*2n+1 + ... +
b1*22 + b0*21 + 0*20 = bn...b1b00 (base 2). A
zero should be shifted in as the low-order bit.
Similarly, bn...b1b0 ÷ 2 = bn...b1. The low-order
bit, b0 shifted off the right end is the
remainder.
- Negative number representations.
Sign and Two's Ones'
Number Magnitude Complement Complement Excess 128
------ --------- ---------- ---------- ----------
93 01011101 01011101 01011101 11011101
-93 11011101 10100011 10100010 00100011
-13 10001101 11110011 11110010 01110011
-14 10001110 11110010 11110001 01110010
47 00101111 00101111 00101111 10101111
128 - - - -
-128 - 10000000 - 00000000
0 00000000 00000000 00000000 10000000
-0 10000000 - 11111111 -
- In 8 bits with excess-3, we can represent -3 to
252. Adding the excess 3, changes this to 0 to
255, which is the range of numbers represented
with 8 bits. In general for excess-m with n
bits, we can represent -m to 2n-m-1.
- Negative number representation.
Sign + Ones' Two's Excess
Unsigned Mag. Compl. Compl. -4
-------- ------ ------ ------ ------
000 0 0 0 0 -4
001 1 1 1 1 -3
010 2 2 2 2 -2
011 3 3 3 3 -1
100 4 -0 -3 -4 0
101 5 -1 -2 -3 1
110 6 -2 -1 -2 2
111 7 -3 -0 -1 3
- For ones' complement addition, if a carry is
generated out of the high-order bits (sign
hits), it must be carried around to the
low-order bits and added back in. It is
necessary to correct the answer in this way to
maintain a correct representation.
- An n-bit number is transformed to a 2n-bit
number of the same value by,
- Unsigned: add n high-order 0 bits
- Sign and Magnitude: add n 0 bits between the
number and its sign bit
- Two's Complement: add n high-order bits which
are the same as the sign bit (sign extension)
- Ones' Complement: sign extension
- Excess-m Notation: For arbitrary m, add
22n-2n to the number. For excess-2n-1 to
excess 22n-1 conversion, complement the sign
bit, extend the sign bit, and recomplement the
new sign bit.
- Let x' be the complement of x, then -x = x'+1 so
-(-x) = -(x'+1) = -x'-1 = (x')'+1-1 = (x')' = x.
- For both ones' and two's complement, overflow
occurs if the carry into the sign bit differs
from the carry out, or (equivalently) if the
signs are equal and different from the carry
into the sign bit.
- Division of a double-length dividend by a
single-length divisor may leave a double-length
quotient. (Suppose that we divide by 1?) On the
other hand, the remainder is always less than
the divisor (in absolute value) and so must be
single-length.
- Suppose we have a decimal fraction with n
fractional places. The algorithm to convert to
binary requires repeated multiplication by 2.
Multiplying an n-place decimal fraction by 2
leaves at most an n-place decimal fraction.
There are only 10n possible n-place decimal
fractions. Thus, after at most 10n
multiplications, the same fraction is produced
and the binary fraction will start to repeat.
The repeating part of the number can thus be at
most 10n bits long. (In fact, much less, but I
don't know exactly how much less.)
- Binary fractions can be represented exactly as a
finite number of decimal digits because 2 is a
factor of 10 (=2*5). One way to convert would
be to multiply (in binary) the fraction by ten
and take the integer part as the next digit;
the fractional part is again multiplied by ten.
Each multiplication by ten includes a
multiplication by 2 which makes an n-bit binary
fraction into an integer in at most n
multiplications.
- 13.90625
- Overflow means the result of an operation has
exceeded the range which can be correctly
represented, i.e., the result is either too
positive or too negative to be represented.
Underflow is when a floating point number
becomes so small (close to zero) that the
precision necessary to distinguish it from zero
exceeds the precision of the representation.
This is reflected by the exponent of the number
becoming too negative to be represented.
- For a normalized floating point number, either
(a) the number is zero, or (b) the first digit
of the fraction is non-zero. The fraction digit
is taken in the base of the exponent. Thus, if
the exponent is for base 2, the first bit must
be 1; for an exponent of base 16 (like the IBM
370), at least one of the first four bits must
be 1.
- 3.1416 = 010000101100100100001111.
01111101110000000000000000 = 3 * 259.
- The advantage of an excess notation is that if
the exponent is treated as an unsigned number,
it will compare the same way as if it is treated
as signed. Thus, for normalized numbers with
excess exponents in the high-order bits, we can
use an integer compare instruction to compare
floating point numbers. This eliminates the
need for an explicit floating point compare
instruction.
- Certainly. BCD can be used to represent any
number in either fixed or floating point. For
example, a fixed point 4.36 can be 0100 0011
0110, while .92 * 1014 is just an exponent of
0001 0100 and a fraction of 1001 0010.
1.3 THE INPUT-OUTPUT SYSTEM (PAGE 61)
- Input/Output Devices and Media.
Card Reader | Cards |
Card Punch | Cards |
Line Printer | Paper |
Paper Tape Reader | Paper Tape |
Paper Tape Punch | Paper Tape |
Magnetic Tape Drives | Magnetic Tape |
Disk Drives | Magnetic Disk |
Drum Drive | Magnetic Drum |
CRT | ? |
Teletype Printer | Paper |
Keyboard | ? |
Plotter | Paper |
Floppy Disk Drive | Floppy Disk |
Cassette Tape Drive | Cassette Tape |
Console Switches | ? |
- BCD is a 6-bit code; EBCDIC is an 8-bit code;
ASCII is a 7-bit code, but sometimes 8-bit
(when parity is attached).
- The term "BCD" is used both for a technique for
representing decimal digits and for a class of
character codes used on older computers. There
is no direct relation between these two
meanings of the same term (at least as far as I
know).
- Six-bit codes do not allow enough different
codes to represent both upper and lower case
letters, in addition to numbers and special
characters. A 7-bit code is inconvenient
because it is not a power of two, or even an
even number, and so does not allow an exact
number of characters per word except on
machines with 14, 21, 28, 35, ... bits per
word, and very few computers have word lengths
that are a multiple of seven.
- One inch at 200 bpi is 200 frames, which for a
7-track tape is 200 * (7-1) = 1200 bits.
- 7-track tapes are for use with 6-bit character
codes; 9-track tapes are for use with 8-bit
character codes; 8-track tapes are for your
car tape player.
- We are forced to use 9-track tapes. The extra
bit can be ignored or used for additional
parity checking.
- See pages 45 and 46 for a discussion of parity.
- The sprocket holes tell the tape reader where
the information is on the tape so that it can
recognize how many zero frames are on a piece
of blank tape. With magnetic tape, either odd
parity is used so that every frame has at least
one one-bit (a sprocket bit of sorts); or even
parity, where the all-zero character code is
not allowed, again giving at least one one-bit
per frame. (The other use of the sprocket hole
for some systems is for moving the tape forward
and backward; tapes simply use a different
movement mechanism.)
- With 12 rows, we could represent 4096 different
characters. A column binary representation
would be more efficient. Since Hollerith is
essentially a 6-bit code, two characters could
be put in each column, doubling the number of
characters per card.
- At 200 bpi, an 80-character card image takes
0.4 inches, so 0.4 ÷ (0.4+0.75) = 35% of the tape
is used for information; at 1600 bpi, 6.2% of
the tape is used for information. Even with
these figures, 1600 bpi is better, since its
6.2% stores one-third more characters in the
same amount of tape as 200 bpi. (Less tape is
used but it is used much, much better).
- A disk is shaped like a phonograph record and
records on the flat surfaces of the disk; a
drum is a cylinder which records on the side of
the cylinder. (Thomas Edison used drums for
his original phonograph player.) A moving head
disk or drum has one head that moves back and
forth to read various tracks (like the needle
arm on a phonograph player). A fixed head disk
or drum has a separate head for each track, so
that the heads don't need to be moved.
- For a moving head device, seek time is the time
to move the head to the correct track; latency
is waiting for the correct sector to rotate
under the head; transfer time is the time to
actually read or write information. For a
fixed-head device, there is no seek time;
latency and transfer time are the same as for
moving head devices. -
1.4 THE CONTROL UNIT (PAGE 65)
- The four basic components of a computer are:
the memory system, the computation unit, the
input/output system, and the control unit.
- The function of the control unit is to direct
the other components of the computer (the
memory, cpu and I/O system) to perform their
functions at the right time in the right
order.
- The basic instruction execution cycle is:
- fetch the next instruction
- decode the instruction
- execute the instruction
- A special purpose computer is built to do a
specific task. The sequence of operations
that it performs are built into the computer.
A general purpose computer, on the other hand,
uses a program stored in memory to direct its
execution. The program can be easily changed.
- The program counter (or instruction address)
is used to define which instruction should be
the next instruction to be executed.
- For 37 instructions, we would need 6 bits
(allowing 64 opcodes) since 5 bits (32
opcodes) are not enough. This assumes that we
want a fixed length opcode. An alternative
approach would be to use 5 bits most of the
time, but pick one of them to say that another
3 bits should be used. This allows 31 5-bit
opcodes, and 8 8-bit opcodes, and may be more
effective (but harder to decode).
CHAPTER 2: ANSWERS TO EXERCISES
- The MIX machine has 4000 words of memory,
addressed from 0 to 3999. I
- The registers of the MIX machine are: A, X, I1,
I2, I3, I4, I5, I6, J, CI, and OT.
- A MIX instruction has an opcode field (C), a
field specification (F), an index field (I), and
an address field (+AA).
- The MIX computer can have magnetic tapes (units
0-7), magnetic disks and drums (units 8-15), card
reader (unit 16), typewriter/paper tape device
(unit 17), and line printer (unit 18).
- A MIX 1009B is a binary computer while a MIX
1009D is a decimal computer.
- The six (?) different types of MIX instructions
are: loading, storing, arithmetic, immediate,
comparison, jump, I/O and miscellaneous.
- Symbolic opcodes are easier to remember than
numeric opcodes. Assembly language programs can
have explanatory comments. Numbers can be
written in decimal. Symbolic labels can be used
rather than numeric addresses. Assembly language
is easier to write, read, change, correct, and
debug than machine language.
- MIX is a computer; MIXAL is an assembly
language.
- A HLT instruction is a machine instruction which
is translated into a machine instruction. The
END instruction is a pseudo-instruction which
tells the assembler that there are no more
assembly language statements to be processed.
- LDA =5= copies into the A register from a memory
location which has the number 5 as its contents,
thus putting the number 5 into the A register
(memory reference instruction). ENTA 5 puts the
number 5 directly into the A register (immediate
instruction). LDA 5 loads the A register with
the contents of location 5 in memory. We can't
tell what will be the value put in the A register
unless we know what the contents of location 5 of
memory are. ENTA =5= puts an address in the A
register. The address is the address of a
location in memory whose contents is the number
5.
- A program counter is a hardware register which
tells the MIX computer which instruction to
execute. A location counter is a variable used
by an assembler to indicate what address should
be used for the next assembled assembly language
statement.
- The ENTA n instruction takes only one time unit
to put n into the A register while LDA =n= takes
two time units plus the extra memory to store n.
However, the LDA =n= can load an arbitrary number
(up to 5 bytes plus sign) while the ENTA n is
limited to -4096 < n < +4096.
- The EQU puts the symbol ERG into the symbol table
with the value 1435, while the CON puts the
symbol ERG into the symbol table with the value
of the current location counter, and puts 1435 in
memory at that address.
- The question should ask: "What does the symbol *
mean in MIXAL?" The symbol * may mean (depending
upon its context): (a) multiplication, (b) the
value of the location counter, (c) a comment card
(in column one), or (d) the character * (in an
ALF).
- The error is with the ORIG *+2*N. Since
expressions are evaluated strictly left to right,
this is (*+2)*N, not *+(2*N). Write it as 2*N+*.
- The END card is the last card in an assembly
language program. Thus, all references to a
label on the END card occur before the END card,
and hence are forward references except a
reference in the operand field of the END card.
This would not be a forward reference, so the
book is wrong. (But a LABEL END LABEL card makes
no sense.)
- You would probably lose a friend. The ORIG *-100
card would reset the location counter back 100
and the next 100 statements would be stored in
the same memory locations as the previous 100.
The effect would be about the same as taking 100
cards out of the program.
- We can generate +214503000 by (among others),
NOP 1125,3
CON 17(1:1),37(2:2),3(3:3)
CON 679500(1:3)
ALF P7C
- Some assembly time NOPS would be,
ORIG *
EQU 0 NO LABEL TO PUT IN TABLE
ORIG *+0
MIXAL really doesn't have many interesting
assembly NOPs.
- First, all references to literals are forward
references. Second, it wouldn't make much sense.
With a literal we are saying that we are
concerned about the contents, but not the address
(the assembler can put it anywhere), whereas the
expression is trying to use the address of the
literal.
- Assuming there are no references to 2F or 2B
elsewhere, this is equivalent to MOVE *-1(*)
according to the definitions in this book. One
sticky point: Knuth indicates that a 2B is not
to refer to a label on the same statement as the
reference to 2B. Thus, his interpretation would
make it equivalent to MOVE *-2(10). However, the
point is unlikely to occur in practice and the
obvious implementation of a label such as LABEL
OPCODE is the same as LABEL EQU * followed by an
unlabeled OPCODE, for any OPCODE except EQU.
Knuth's semantics would be much more difficult to
define and implement.
- Pseudo-instructions are directives to the
assembler executed at assembly time, so ORIG,
EQU, ALF, and END are "executed" at assembly
time. JMP, ENTX, HLT, JXN are all machine
language instructions and are executed at
execution time.
- We can use the EQU to write FIND STJ EXIT as
FIND EQU *
STJ EXIT
This will work for all labels on non-EQU
statements (but see the answer to problem 21
above for a different view of local symbols).
- Using a two-pass approach, the symbol table would
be the following symbols and their octal values:
X 144
Y 132
BEGIN 145
and the machine language would be
0145 +0144010210
0146 +0132070530
0147 +0146000240
0150 +0000000205
The starting address is +0145.
- We could reconstruct an assembly language program
but it would not be unique. See problems 18 and
19 for example. The major problems are:
- We don't know which locations are
instructions and which are data (CON or ALF).
- We don't know what symbols were used.
- A possible assembly language program is,
IN BUF(16)
JBUS *(16)
OUT BUF(18)
ENT2 15,4
SLC 0,2
ENTX 2014
HLT
- The contents of the A register is indeterminate
unless we know the value of the X register also.
The problem is that the CON 5 will be loaded into
the A register by the LDA, and then the contents
of the next location will be executed as an
instruction. The next location in memory is the
number 5, which is a NUM instruction. If we
assume the X register to be zero, then the
contents of the A and X register would be NUMed
putting 500,000 in the A register.
(The problem was meant to have the constant be a
CHAR instruction, not a NUM. So if you change
the constant 5 to 69 (0105 octal), then the A
register contents will be +3636363636 octal when
the HLT is executed.)
- There are several ways to do it. See
Section 2.2, pages 75-81 for an example.
- Again, see Section 2.2, for writing machine
language programs. The changes to find the
maximum of four values would be major. The
easiest way to do this problem would be to write
the program in assembly language, assemble it and
copy the answer off the assembly listing. That
is how the program of Section 2.2 was written.
- The MIXAL assembler translates MIXAL assembly
language programs into MIX machine language
programs.
- The default field specification is 0:5 for memory
reference instructions, except STJ for which it
is 0:2. Non-memory reference instruction
defaults are defined by the opcode table (see
Appendix B, C, or D). The default value of all
other fields is zero.
- A fixed-format assembler specifies the columns of
an input card each statement must be put in. A
free-format assembler uses delimiters to separate
fields and identify the values to be put in each
field.
- Undefined opcode. Undefined symbol reference.
Multiply defined symbol. Expression overflow.
Illegal forward reference. No END card. Forward
reference in an ORIG, EQU, or CON statement.
Symbol too long. Literal too long. Reference to
an undefined local symbol (iB).
- A forward reference is a use of a symbol before
it is defined (by being a label on some assembly
language statement).
- They are not both needed. They are both included
to save the programmer from the need to convert
alphanumeric character strings to their MIX code
and express them as decimal numbers. (Or
equivalently, to save the programmer from needing
to convert numeric constants to an equivalent
character string if ALF were provided but not
CON.)
CHAPTER 3: ANSWERS TO EXERCISES
- The X register exists simply to extend the A
register where necessary, such as for the MUL,
NUM, and CHAR instructions.
- After a multiply, the A register has the upper
five bytes of the product; the X register has
the lower five bytes. After a divide, the A
register has the quotient and the X register has
the remainder.
- A load instruction must reference memory for the
value to put in the register, thus needing an
extra memory cycle. An enter has the value to
put in the register in the effective operand.
- A memory reference instruction must reference
memory to fetch its operand, while the immediate
instruction uses the effective address as its
operand. The instruction execution cycle for a
memory reference instruction is,
- fetch instruction
- decode instruction
- calculate effective address
- fetch operand from memory
- execute instruction
For an immediate instruction, step 4 is not
present.
- The index field is composed of two octal digits,
a:b. The effective address is the contents of
the address field plus the contents of index
register a plus the contents of index register b,
where the contents of "index register 0" are
understood to be zero.
- For a constant n, an increment of n is equivalent
to a decrement of -n. Thus, only one instruction
is needed for constant n. However, if we use
indexing, then INC allows addition of index
registers, while DEC allows subtraction.
- One-liners (in MIXAL).
- +0012000265 ENT5 10
- +0000000260 ENTA 0
- +0000030160 DECA 0,3
- +0000010062 INC2 0,1
- +2734000260 ENTA *
- +0000000143 IOC 0(1)
- Double indexing is the technique of using two
index registers in the calculation of the
effective address. An example of a machine
language instruction is +0000120262 (ENT2 0,2:1),
which is an alternative solution to 4 in problem
g above.
- Why is there a NOP instruction? No real reason
that I can think of anymore. It could be just a
filler (we have 32 different opcodes because of a
5-bit opcode field, but can only think of 29
things to do, so we fill the other three with
NOP), or was originally used by programmers who
wrote in machine language and needed to be able
to "patch" absolute programs in such a way as to
remove a previous instruction without moving
everything else in memory. But other NOPs exist
in MIX like JSJ *+1, INC1 0, SLC 10, MOVE 0(0),
so an explicit NOP is really not necessary.
- We would need 8 bits to represent 159 opcodes (27
= 128 < 159 < 256 = 28. The MIX computer has
only a 6-bit opcode because it also uses the
6-bit F field, giving an effective 12-bit opcode
(4096 instructions), most of which are not used.
- For a load, the L:R bytes of memory are copied
into the low order bytes of the register, with
zeros filling the upper bytes. For a store, the
low-order bytes of the register are stored into
the L:R bytes of memory with all other bytes
unchanged. For a compare, the L:R bytes of
memory are compared with the L:R bytes of the
register.
- The immediate instructions use the F field to
distinguish between the different immediate
instructions, so there is no room for a field
specification in the instruction. Also they
probably aren't really needed, and can be
simulated by the immediate instruction and
appropriate pre- and post-shifting.
- The MOVE instruction is not needed. It could be
simulated by a short loop, but would take much
longer.
- See Table 3.12 for MIX instruction execution
times. The MOVE instruction must both read and
write each word, requiring 2n time units, not n
(plus one to fetch and decode the instruction).
CHAPTER 4: ANSWERS TO EXERCISES
- Programming is the creative task of converting a
problem statement into an algorithm which will
solve the problem. Coding is the relatively
straightforward task of converting this algorithm
into a program in a suitable language.
- To compute (J*LOGICAL)/PHYSICAL + BASE,
LDA J
MUL LOGICAL
DIV PHYSICAL
ADD BASE
- To compute Z = Y + 2*(W+V)/4 - 6*(10-W-V),
ENTA 10
SUB W
SUB V
MUL =6=
STX TEMP TEMP=6*(10-W-V)
LDA W
ADD V
SLAX 5
DIV =2= (W+V)/2
ADD Y
ADD TEMP
STA Z
- To put the absolute value of N in the A register,
LDA N(1:5)
- To compute the sign function of P, we can write,
LDA P
JAZ 1F
STA *+1(0:0)
ENTA 1
1H EQU *
but I don't know how to get rid of the jump on
zero.
- A program can be thought of as being made up of
short segments, each segment doing some function
part of the problem. We may want to jump to the
end of a segment, and fall through to the next,
but the label should logically be attached to the
end of one segment, not the first instruction of
the next. In this case, we might use EQU *.
- The major problem is that we need another
register as a counter. Assume that we can use
some temporary locations in memory. Then for a
MOVE ADDR(N), we can write,
STA TEMPA
ST2 TEMP2
ENT2 0
1H CMP2 =N= CHECK FOR END OF MOVE
JGE 2F
LDA ADDR,2
STA 0,1 MOVE ADDR T0 I1
INC1 1
INC2 1
JMP 1B
2H LDA TEMPA RESTORE REGISTERS
LD2 TEMP2
- Some loops are run backwards because they run
faster and take less space.
N=3 N=10 N=n
From 1 to N 17 52 2+5n
From N to 1 12 33 3+3n
- Doesn't really. Arrays probably tend to be one
word per entry more often than tables which have
multiple words per entry, but not always.
- Use the shorter, faster code:
LD1 INDEX
LDA ARRAY-400,1
- A space-time tradeoff if a situation where the
same code can be written two ways. The code can
be faster, but only if it is larger, or made
smaller, but slower.
- Working strictly with positive integers, we can
write the following code for double precision
integers. Assume that each integer is two words:
the first word (P,Q,R) is high-order; the second
(P+1,Q+1,R+1) is low-order.
JOV *+1 MAKE SURE OT IS OFF
LDA P+1
ADD Q+1
STA R+1 R=P+Q (LOW-ORDER)
LDA P
JNOV *+2
INCA 1 ADD 1 IF OVERFLOW
ADD Q
STA R
- To PUSH,
* TOP = LENGTH INITIALLY
LD5 TOP
DEC5 1
JSN OVERFLOW
STA STACK,5
ST5 TOP
To POP,
LD5 TOP
CMP5 =LENGTH=
JGE UNDERFLOW
LDA STACK,5
INC5 1
ST5 TOP
- Let N be the number of nonblank characters per
card. Then space and time are,
Space Time
----- ----
Partial Fields 26 2+16*(4+5*6)-N
= 546-N
Instruction Modification 16 2+16*(5+5*12)-N
= 1042-N
Shifting 14 2+16*(7+5*8)-N
= 754-N
- The difficulty in averaging comes from the lack
of floating point numbers. If we print an
integer average, do we round or truncate? What
if we want to know the average to the nearest
tenth? In that case we multiply times ten before
we divide the sum by the number of numbers. Also
overflow is a problem.
- The percentages probably won't add up to 100 due
to round-off error (if you round) or truncation
problems (if you truncate). The sum of the
percents can be less, equal, or even greater than
100.
CHAPTER 5: ANSWERS TO EXERCISES
- Devices and their physical record size.
Tape | 100 MIX words |
Disk | 100 MIX words |
Drum | 100 MIX words |
Card Reader | 80 characters, 16 MIX words |
Line Printer | 120 characters, 24 MIX words. |
- The X register holds the track and sector address
in bytes 4:5.
- A buffer is an array in memory which is used to
temporarily hold information to be input or
output. It is used because of the differences in
speed between the cpu and I/O devices.
- I/O interlock time is the time that the cpu must
wait before executing an I/O command because the
device is already busy with a previous I/O
command.
- Buffering is used to smooth over the differences
in speed between the cpu and I/O devices.
- Blocking is used to allow efficient utilization
of space on I/O storage devices. It is necessary
because the size of information which we wish to
store may differ from the fixed physical record
size for the device.
- Same question as question n of Section 1.3. Seek
time is the time to move the head over the
desired track, latency is waiting for the disk to
revolve to the right sector; transfer is the
actual time to read or write information.
- The JBUS *(16) causes the computer to wait until
the I/O device has finished its last operation.
Without it the following LDX may load either the
old or new value of BUF, depending upon the speed
of the device.
- Hard to answer. At first shot, it would appear
that we will read a card and the print that card
on the line printer, preceded by the word DATA.
But the programmer has forgotten to wait for the
IN to complete before the OUT, so the input card
will probably not be read in before the OUT
prints the buffer, and so we can't really say.
- You might think so, and most of the time probably
can, but what if there is a reference to WAIT+1?
- Time to copy 1000 cards is 1000 * (50000 + 10 +
1000) microseconds = 51.01 seconds. If we
overlap processing and I/O we can reduce this to
about 50000 (for the first card) + 999 *
max(50000, 10, 1000) + 1000 (for the last card) =
50.001 seconds. Blocking multiple card images
per drum record (without overlap) would take 1000
* (50000+10) + 1000 ÷ 6 * 1000 (for loose blocking)
or 1000 * (50000+10) + 1000 ÷ 6.25 * 1000 (for
tight blocking). Blocking with overlap is the
same as the simple overlap case considered above
since total time is dominated by the card read
time. Specifically with a 50 millisecond per
card input rate, 1000 cards will always take at
least 50 seconds for input, totally independent
of I/O overlap or output blocking.
- In the best case, we would be able to overlap
input and output completely (except for the first
card) and so the time to process n cards would be
50 (for the first read) + (n-1) * max(50,1,50) +
max (50,1) (for the last write to disk or
printer) = (n+1) * 50 milliseconds.
- Blocking factors and wasted words for different
blocking modes.
No Blocking Loose Blocking Tight Blocking
---------------- ---------------- ----------------
Blocking Words Blocking Words Blocking Words
Factor Wasted Factor Wasted Factor Wasted
7 1 93 14 2 14.285 0
13 1 87 7 9 7.692 0
18 1 82 5 10 5.555 0
41 1 59 2 18 2.439 0
500 0.2 0 0.2 0 0.2 0
- ... Left as an exercise to the student ...
- There is very little overlap which can be done
here. The strategy would be to read and copy to
disk or drum, then read it backwards, printing.
The only overlap possible is between input and
disk and disk and output.
- A sequential access device only allows operations
of the form: move forward one, move backward
one, or rewind. A direct access device allows
any arbitrary record to be accessed directly. We
can simulate a sequential access device on a
direct access device easily. If you want to
stretch the definition of "simulate", it is also
possible to simulate a direct access device on a
sequential access device, but only at a very high
cost in I/O and time. (We simply keep track of
what record we are currently on and move the tape
to the next record by lots of movements, one
record at a time.)
- The major performance difference is time: it
takes longer to move information than to switch
pointers. The major programming difference is
that, with the pointers, all accesses must be
made indirectly, either by indexing or indirectly
through a memory location. If information is
moved, we can always access the information from
the same place in memory each time.
- It would be difficult to buffer a direct access
input device, since we do not know which record
will be needed next. A direct access output
device, on the other hand, can be buffered since
we know where each record is to be put and so we
can "buffer behind".
CHAPTER 6: ANSWERS TO EXERCISES
- One reason is to save space. Another reason,
more difficult to motivate but more important, is
to make the program more modular.
- A calling sequence is the method of calling a
subroutine. It involves describing how all of
the registers and memory should be set up to
allow parameters to be passed.
- Only the calling routine knows which registers
are being used, so only the calling routine can
save only those registers which need be saved.
Only the called routine knows which registers
will be used, so only the called routine can save
only the registers which need be saved.
- The J register exists to allow a return address
to be easily passed to a subroutine.
- The parameters can be passed: in registers; in
global variables; in the called routine before
the entry point; in the called routine after the
call; in a table; or on a stack.
- Not very useful. The subroutine always returns
to RTN, therefore it can never really be used as
a subroutine.
- ... Left as an exercise to the student ...
- Call by value passes the value of a parameter;
call by reference passes the address of a
parameter; call by name passes the address of a
thunk which will compute the address of a
parameter every time it is called.
- Call by reference is sufficient, as long as we
can only pass simple variables.
- The values would be,
I J A(1) A(2)
Call by Value 1 2 100 0
Call by Reference 2 2 0 0
Call by Name 2 2 100 0
- Thunks are, because they are needed to correctly
implement the copy rule.
- Self-modifying code is code which stores new
instructions in itself, thus changing the program
as it executes. Serially reentrant code may be
self-modifying but always changes things back
before it quits or after it starts, so each
execution is of the same program. A recursive
routine may call itself. A coroutine is a
routine which will call its caller rather than
returning to its caller like a subroutine. Pure
code is nonself-modifying. Reentrant code is
nonself-modifying.
- A recursive subroutine needs a stack.
- A reentrant subroutine is serially reentrant, and
pure code. It need not be position independent
nor recursive.
- Something like,
- On a computer which is used by many people at the
same time, it may be possible to share pure code
between many users, thus saving memory space.
This is an operating system topic (Section 9.5),
and particularly useful for time shared operating
systems.
CHAPTER 7: ANSWERS TO EXERCISES
- Bootstrap loader, absolute loader, (relocatable
loader, linking loader, relocatable linking
loader), linkage editor, overlay loader.
- Each logical record for an absolute loader must
contain a sequence of values plus the addresses
at which to load them. This is often encoded as:
a number N, an address A, and N values V1, V2,
..., VN, to be loaded at A, A+1, A+N-1,
respectively. A checksum may also be provided.
- Locations within a routine which are used by
another routine are called entry points and
externals. The name used is determined by
whether we are considering the user (external) or
the definer (entry point).
- The purpose of a checksum is to provide a form of
error detection.
- AND and ORR would be of no value for checksum
computation. For AND, x AND 0 = 0, so that all
bits in the checksum would tend to become zero,
independent of the input bit values. For ORR, x
ORR 1 = 1, so all bits in the checksum would tend
to be one. The XOR works fine and is often used
for checksum computation.
- A bootstrap loader is the original loader used to
load the regular system loader. It is needed
when the computer is initially turned on and has
no other system software in its memory. Alt is
generally an absolute loader of the most
primitive type.
- One technique is to use two passes; one to build
the entry point table, the other to actually
link. Another technique is to build an entry
point reference table which simply keeps track of
all addresses where entry points are used (as
externals) and the correct address is filled in
when it becomes available. Another technique is
to use chaining.
- The relocatable statements are STJ EXIT, STZ I,
LDA I, MUL I, OHPX N, JG FOUND, LDA I, STA I, JHP
1B, LDA I, and JMP *. Basically everything
except the two CONs and INCA 1.
- Processing these modules with a linkage editor
would leave FORE and TOD as externals, and would
declare BEGIN, ARY, N, FOD, NIX and CS as entry
points.
- An absolute loader and bootstrap loader are one
pass. A relocatable loader (relocatable linking
loader, we assume) could be either one or two
pass. A linkage editor would just about have to
be two pass.
- Relocated code stored in memory would be,
0432 +0000000000
0435 +0432000510
0436 +0433000530
0437 +0001000260
0440 +0434003430
- Binding time for machine language programs is
when they are written, for absolute assembly when
they are assembled, and for relocatable programs
when they are loaded.
- It would be possible to have a relocatable,
non-linking loader, but it would make little
sense, since the separately relocated parts would
have no way to interact.
- A linking loader inputs relocatable load modules
and links them together in memory. A linkage
editor inputs relocatable load modules, links and
relocates them, but then outputs the result as
another (larger) relocatable load module.
- Bootstrap first, then absolute loader, then
relocatable and linking as one loader, finally a
linkage editor.
- ... Left as an exercise to the student ...
- The assembly language would need to have Entry
Point and External declaration statements. Also
there would be restrictions on the form of
expressions allowed since all expressions need to
be either relocatable or absolute when evaluated.
CHAPTER 8: ANSWERS TO EXERCISES
- Two pass.
+0000000144 MAXSEM EQU 100
+0000 SYMBOL ORIG 2*MAXSYM+2
*
+0312 +0324 00 02 40 SEARCH STJ ENDSEAR
+0313 +0323 00 02 32 ST2 SAVSEAR(0:2)
+0314 +0000 01 05 30 STA 0,1
+0315 +0001 01 05 12 LD2 1,1
+0316 +0002 02 00 62 INC2 2,2
+0317 2H EQU *
+0317 +0002 00 01 62 DEC2 2
+0320 +0000 12 05 70 CMPA 0,1:2
+0321 +0317 00 10 47 JNE 2B
+0322 +0000 02 02 61 ENT1 0,2
+0323 +0323 00 02 62 SAVSEAR ENT2 *
+0324 +0324 00 00 47 ENDSEAR JMP *
- One pass, with symbol table following.
+0000000144 MAXSEM EQU 100
+0000 SYMBOL ORIG 2*MAXSYM+2
*
+0312 +7777 00 02 40 SEARCH STJ ENDSEAR
+0313 +7777 00 02 32 ST2 SAVSEAR(0:2)
+0314 +0000 01 05 30 STA 0,1
+0315 +0001 01 05 12 LD2 1,1
+0316 +0002 02 00 62 INC2 2,2
+0317 2H EQU *
+0317 +0002 00 01 62 DEC2 2
+0320 +0000 12 05 70 CMPA 0,1:2
+0321 +0317 00 10 47 JNE 2B
+0322 +0000 02 02 61 ENT1 0,2
+0323 +0323 00 02 62 SAVSEAR ENT2 *
+0324 +0324 00 00 47 ENDSEAR JMP *
Symbol Table
MAXSYM 0144
SYMBOL 0000
SEARCH 0312
SAVSEAR 0323
ENDSEAR 0324
- For a fixed format assembler, the free format
routine may need to be reformatted; for a free
format assembler, both should work without
reformatting.
- It is possible to relax the requirement for
labels to start in column one, but this may
introduce another problem. How do we distinguish
between a label and an opcode (if no label is
used the first symbol on the card would be an
opcode)? The traditional solution is to mark a
label definition to easily separate labels from
opcodes. Specifically, a label definition is
immediately followed by a colon, as LABEL:
OPCODE OPERAND.
- We would not be able to use opcode mnemonics as
labels (You probably didn't realize that you
could). The search for opcodes and labels would
take longer. Only one search routine would be
needed so the assembler might be smaller.
Another field might be needed with each
symbol/opcode table entry to distinguish user
symbols from opcodes.
- For each symbol table entry, a one-pass assembler
needs to keep (1) the symbol, (2) whether it is
defined or a forward reference, (3) its value if
defined, and (4) the addresses of its use if it
is a forward reference.
- Symbol table searching from the beginning to the
end will find the first definition of a
multiply-defined symbol. Searching from the end
will find the most recently defined value.
- No. This would produce a hash function whose
value would depend upon the time of day when it
was computed. What is needed is a hash function
which depends only on the symbol being hashed
(and the size of the symbol table).
- On pass 1, we must construct the symbol table,
scan for the opcode, process ORIGs, search the
symbol table (for multiply defined labels),
update the location counter, and process the
label field (1,3,5,6,7,10). On pass 2, we must
output code for the loader, print the listing,
search the symbol table (for operands), treat an
LDA different from an STA, and process CONs
(2,4,6,8,9). It is possible to do 3,5,7 on both
pass 1 and pass 2, or we can do them in pass 1
and save the results for pass 2 in the
intermediate text. One could argue that 7 must
be done on both passes, if we say that the
location counter is being updated by reading the
values from the intermediate text.
- An error can be noted. The second symbol
definition can be ignored, or the second
definition can be saved as the symbol table
value.
- On which pass are these errors found?
- Undefined symbols would normally be found on
pass 2, but may be found in pass 1 as illegal
forward references.
- Pass 1
- Pass 1
- Pass 1
- Pass 2
- (1) two passes, (2) chaining with fixup by the
loader, (3) use table, with fixup by the loader.
- At least two. A one-pass assembler will not have
the symbol table until the end of its first pass
through the program and this pass must also
produce the listing, so a one-pass assembler must
produce the listing first, then the symbol table.
- The one-pass would be faster, while the two-pass
could allow more general use of forward
references, would be able to provide a more
complete listing, and would need a simpler
loader.
- A two-pass assembler would not be able to handle
forward references to X, as,
Z EQU X
X EQU Y
Y EQU 0
This would require three passes. In general, one
can define a chain of forward references as a
sequence of forward references in which each
depends ,on the following (except the last which
is defined). The above example is a chain
(Z,X,Y) of length 3. A chain of length N
requires N passes to resolve all symbol values.
- It will not need any additional data structures,
per se, but may be able to make good use of an
entry point table and an external symbol table.
This information can be kept in a symbol table
entry which would need to include, in addition to
the normal information, an indication of whether
the symbol was absolute, relocatable, or
external. For external symbols, we may also need
either a chain address or a list of addresses
where it is used.
- Simply add these new mnemonics to the opcode
table, with the same value as the older
mnemonics.
- ... Left as an exercise to the student ...
CHAPTER 9: ANSWERS TO EXERCISES
- A subroutine is written once and exists at only
one location in memory, out-of-line. A macro is
expanded in-line every time it is called. Macros
will execute faster than subroutines, while
subroutines will take less space.
- A macro is an open, in-line subroutine, which
passes parameters by name (1,2,6).
- First pass: macro expansion; second pass:
symbol table construction; third pass: code
generation.
- What may appear to be errors may be corrected by
the parameters passed to the macro.
- One way is to add additional code which is
executed every time a particular opcode is used.
For example, assume that a program is somehow,
somewhere incorrectly storing the A register in a
variable LC. If we have a free index register 6,
we could write the following macro to replace all
STA instructions without changing our program at
all:
STA MACR P1,P2
ENT6 P1,P2 CALCULATE EFFECTIVE ADDRESS
GMP6 =LC= COMPARE ADDRESS TO LC
JNE *+2
HLT STOP IF THIS IS STORE TO LC
CON 24(5:5),5(4:4),6(3:3) STA 0,6
ENDH
By temporarily adding this macro, every store
will be checked for storing in LC and when the
first culprit is found the program will halt (or
it could output a message and continue or
whatever). When the problem is corrected, the
macro can be removed, leaving everything the
same.
- Conditional assembly is absolutely necessary for
recursive macro calls. Otherwise the recursion
would continue indefinitely (or at least until
the assembler ran out of time or space).
- See pages 332-336.
- Definitions.
- Interpreter or simulator.
- Fortran compiler.
- MIXAL assembler.
- Operating system.
- An interpreter is a program which acts like a
computer, fetching instructions and executing
them.
- At least two (and for a smart system only two),
because a segment may be marked for conditional
loading when it is encountered, and we will not
know if it is going to be used or not. We know
this only at the end of our first pass. By
storing all the segment names, lengths, entry
points, and conditional load bits, we can
determine, after the first pass, which segments
are to be loaded and which (conditionally loaded)
segments are needed, determine starting
addresses, relocation bases, and entry point
addresses, so on pass two everything that is
needed can be loaded.
- Higher-level languages are easier to write, to
read, to use, to move to another machine, and to
debug and test. Assembly languages allow better
code (smaller, faster) to be written and allow
access to and control over all machine features.
Systems programming languages attempt to combine
the advantages of both high level and assembly
languages.
- Interesting question. Probably not, since it is
possible to use any arbitrary symbol (including
opcodes and pseudo-instructions) as labels and
symbolic operands. The closest thing to a
reserved word is probably *.
- The real computer may not exist (like MIX), or
may not exist yet (under construction). Or the
program being interpreted may be under study
because of bugs or poor performance and so more
information may be desired than simply the final
register contents. By interpreting, special
actions can be taken to display the actual
execution of the program for better
understanding.
- Control cards tell the operating system what
steps it would perform for a job.
- Device independent I/O is a program which is
written so that it need not know or take special
actions based on its input or output devices. It
uses logical devices. These logical devices are
mapped onto the physical devices by the operating
system. A file is a logical device which acts
like a magnetic tape or disk or drum.
- Any code which can be generated by a compiler
could also have been written in assembly
language, thus a compiler cannot produce better
code than an assembly language programmer. More
commonly it produces poorer code because it does
not "know" what the program is supposed to do.
Thus, it must translate the statements which are
often overly general. A systems programing
language allows the programmer to more explicitly
control which instructions are generated by the
compiler.
CHAPTER 10: ANSWERS TO EXERCISES
10.2 THE PDP-8 (PAGES 368-369)
- PDP-8 memory consists of 4096 words of 12 bits
each. Addresses are 12-bits. The two registers
are the Accumulator and the Link bit.
- PDP-8 memory is logically organized in pages of
128 words. The effective address calculation is
described on page 357.
- The PDP-8 has only 8 basic instructions, but one
of these opcodes (the Operate instruction) uses
the remaining 9 bits in the instruction to
specify additional instructions. This allows
many more instructions and mnemonics.
- The number of instructions could have been
further reduced. ISZ and JMS could have been
dropped as could the IAC, RTR, RTL, RAL, CLA,
SMA, and RSS instructions (at least). These
additional instructions were added because they
were thought to be generally convenient to use
and not overly expensive to implement.
-
- Complement and Increment: forms the two's
complement.
- Illegal. SMA and IAC are in different
groups.
- ... Missing Answer ...
- Since the A register may have an arbitrary value,
we must clear it first, then TAD TEA, and skip on
zero and minus, as,
CLA
TAD TEA
SMA
JMP NNEG
JMP NGE
- Comparing PDP-8 and MIX code.
PDP-8 MIX
----- ---
ISZ TOPS LDA TOPS
JMP LOOP INCA 1
STA TOPS
JANZ LOOP
- ... Left as an exercise to the student ...
10.3 THE HP 2100 (PAGE 378)
- The HP 2100 has 16-bit words and 15-bit addresses
allowing up to 32K words of memory. The word
size is larger than the address size to allow an
indirect bit to be specified with each address.
- There are four registers: A and B (16 bits), E
(Extend bit), and O (Overflow bit).
- The fundamental difference is the increase from
12 bits to 16 bits per word, allowing an
instruction set with 4 bits for an opcode field
plus a register bit selecting one of two
registers. This allows some additional
instructions such as compares and stores.
- The HP 2100 has DMA transfers and a vectored
priority interrupt system.
- DMA is a method of doing input or output directly
between memory and an I/O device without
requiring program control of every word or
character transferred.
- A vectored interrupt system interrupts by a
subroutine jump to different locations depending
upon the device which is interrupting.
10.4 THE PDP-11 (PAGES 386-387)
- The word size is 16 bits with a 16-bit address,
but since memory is byte addressable, with two
8-bit bytes per word, we can only address 32K
words of memory (64K bytes).
- By making each byte addressable, character
manipulation is easier.
- There are eight user registers plus the condition
code. The first six registers (0 to 5) are
general purpose. Register 6 is the stack
pointer, and register 7 is the program counter.
- It is possible to do many operations without
needing the registers. For example, to assign
the value of a variable P to a variable Q, we
simply MOV P,Q rather than LDA P; STA Q.
- This obviously seems to be a cue to the assembler
used to identify the pseudo-instructions. It
allows the assembler to use separate tables for
pseudo-instructions and machine instructions so
each table is smaller and faster to search.
- Recursive programs need a stack, and the PDP-11
makes it easy to program with a stack.
- The PDP-11 is a priority vectored interrupt
system, like the HP 2100. The major difference
is that the priorities are determined by the
processor status word and hence are not fixed but
can be changed under program control.
10.5 THE IBM SYSTEM 360 AND SYSTEM 370 (PAGE 396)
- The 370 has 8-bit byte addressable memory.
Memory addresses are up to 24 bits allowing a
maximum of 16 megabytes of memory. Each word is
32 bits, four bytes. There are 16 general
purpose registers plus 8 floating point registers
plus the program status word register.
- A model 30 is cheaper, slower, and smaller than
the 370 model 168.
- Effective address calculation is the sum of the
displacement plus the contents of the base
register plus the contents of the index register.
If register 0 is used as base register or index
register, a value of 0 is used.
- See pages 388-393.
- The USING and DROP pseudo-instructions are used
to tell the assembler which registers can be used
as base registers and what their contents are.
This allows the assembler to automatically
compute the correct base register and
displacement for each symbol.
- Input/Output on the 370 is done by a channel.
The advantage of this approach is that
input/output can be completely controlled by the
channel and need not interfere with cpu execution
except to start and complete the I/O
instructions.
10.6 THE BURROUGHS B5500 (PAGES 402-403).
- The memory of the B5500 is up to 32K (15-bit
addresses) of 48 bit words.
- The assembler is not needed and all programming
can be done in a higher-level language with all
of its advantages (see problem k of Chapter 9).
- The bootstrap loader, at least, must be written
in machine language. So must the first program
which translates into machine language. Unless,
of course, another computer exists, in which case
all code (for any computer) can be written on the
existing computer in its existing higher-level
language, translated into machine language for
the new computer, and carried over and run on the
new machine.
- The designers probably felt that the two
different types of applications for which the
machine would be used (character handling and
scientific number manipulation) could best be
done in two different ways.
- Integers are treated as special cases of floating
point numbers with a zero exponent. This
eliminates the need for extra instructions and
hardware to process separately integer and
floating point numbers.
- The program would be something like,
LOAD Y
LITERAL 2
LOAD W
LOAD V
ADD
MUL
LITERAL 4
DIV
ADD
LITERAL 6
LOAD W
SUB
LOAD V
SUB
MUL
SUB
- No. Programs with lots of loads and stores will
tend to not be shorter on stack machines. Other
instruction sets can be just as efficient on many
problems. Register machines are particularly
good on computations with lots of common
subexpressions.
10.7 THE CDC 6600 (PAGE 411)
- The CDC 6600 has 60-bit words. Addresses are 18
bits allowing up to 262K words of memory.
- Loads and stores occur automatically when the A
registers are set to a memory address. Setting
A1, A2, ..., A5 will automatically load into X1,
X2, ..., X5, respectively. Setting A6 or A7 will
store X6 or X7 into the memory address put in A6
or A7.
- The 370 has its general purpose registers which
can be used as index registers.
- The peripheral processors do all I/O, and use
polling to detect I/O completion.
- The 18-bit accumulator is related to the 18-bit
addresses of the central processor.
- The word length for the 370 (32 bits) works well
with hex but would be awkward with octal,
especially with 8-bit EBCDIC. The 6600 could use
hexadecimal for its 60-bit words, but since it
uses a 6-bit character code, octal is more
convenient. However, other machines, like the
PDP-11 and HP 2100 with 16-bit words and 8-bit
character codes use octal instead of hexadecimal.
The choice is strictly personal bias.
10.8 THE INTEL 8080 (PAGE 419)
- There is no memory on an Intel 8080. However, we
have a 16-bit address with an 8-bit word which
allows up to 64K bytes memory to be used if it is
separately provided.
- Cost. A larger word size would require more
interface lines between the cpu chip and the
outside world (i.e., memory) and also would
require more hardware on the chip to store and
process the larger words.
- Multiple precision arithmetic would require
several bytes to be used for each number. To add
a number, the lower bytes would be added plus the
carry (0 or 1) from the low-order bytes. This
would continue until all bytes have been added.
Subtraction is similar. Two's complement
representation would probably be easiest.
Multiplication and division are much harder.
- The 8080 interrupt system is sort of in the
middle of the PDP-8 and the HPZIOO. It has up to
8 levels of interrupts which are separately
vectored, so it is more general than the PDP-8,
but not as general as the HP 2100 which has 64
interrupt vector locations.
- A cross-assembler is an assembler which executes
on one computer (machine A) and assembles
instructions for another computer (machine B).
This is done for the 8080 because the 8080 may be
too small, too slow and too limited in I/O
devices to support its own assembler. Thus, the
assembler is written for and executes on another,
larger computer system.
- The I/O instructions are similar but for the 8080
only one 8-bit byte is transferred, as opposed to
the MIX system which transfers a fixed number
depending on the device. Also there is no
built-in way to check for a device being busy or
done, as there is on the MIX computer. See pages
417-418.
10.9 SUMMARY FOR CHAPTER 10 (PAGES 421-422)
- Computer Corporate Genealogy.
- Amdahl came from IBM.
- Control Data came from Univac.
- Data General came from Digital Equipment.
- Cray came from CDC (which came from Univac).
- See the answer to question j of Chapter 3.
- Instruction cycle for a computer with interrupts.
- Fetch an instruction.
- Decode the instruction.
- Execute the instruction.
- If the interrupt system is on, check if there
are any interrupt requests. If so, enter an
interrupt phase. Then go back to step 1.
- The advantage of a skip instruction is that there
is no need for a (jump) address. This allows the
instruction to be encoded in fewer bits. Also,
whenever the condition controls the execution of
exactly one instruction, it is very efficient.
However, for a more general conditional jump, two
instructions are needed. Thus many designers
feel that the conditional jump is more likely to
be needed and so build it into the instruction
set to try to match the instruction set to its
use.
- A lot of registers allow important values to be
kept easily available at all times. On the other
hand, it requires more time to save and restore
more registers for subroutine calls.
- The 0-address computer and the B5500 (which is a
0-address computer) are stack machines (c,f).
- This code for a stack machine to compute
D=A*(B+C).
- Microprogramming is the technique of building a
simple computer (A), then writing for this simple
computer an interpreter for another computer (B).
The interpreter is put in ROM and the whole
package is sold as if it were the computer B
(which it is). The advantage of this approach is
that the same hardware can be used (with
different interpreters) to build other computers,
and that a complicated computer can be easily
built from simple hardware (with a complicated
interpreter).
- Interrupts allow the computer to be busy doing
some useful work until it is needed to service an
I/O device with no need to constantly check the
state of the I/O device (polling) and no fear of
waiting too long before servicing the device.
- A trap.
- Comparing different computers.
| MIX | PDP-8 | HP 2100 | PDP-11 | IBM 370 | CDC 6600 |
Word Size (in bits) | 31 | 12 | 16 | 16 | 32 | 60 |
Addressable Memory Unit | word | word | word | byte | byte | word |
Integer Number Representation | Sign and
Magnitude | Two's
Complement | Two's
Complement | Two's
Complement | Two's
Complement | Ones'
Complement |
Major Registers | A, X, J,
I1,...,I6 | A | A,B | R0,...,R5,
SP, PC | R0,...,R15,
Floating
0,2,4,6 | A0,...,A7,
B1,...,B7,
X0,...,X7, |
Bits in Address | 12 | 12 | 15 | 16 | 24 | 18 |
Opcode Size (in bits) | 6 | 3 | 4 | 4 or 8 | 8 | 6 |