The following is an HTML version of the Instructor's manual for Computer Organization and Assembly Language Programming by James L. Peterson, published by Academic Press, 1978. The Instructor's manual was originally published in 1980 (ISBN 0-12-552252-5). It is mostly a faithful reproduction of the content of the original printed book, with the exception that it has been modified to match the current HTML version of Computer Organization and Assembly Language Programming. The main difference is that all exercises are identified by number (1, 2, 3, ...) instead of their original mixture of numbers and letters (a, b, c, ...).

THE DESIGN OF A MIX SYSTEM

1.0 INTRODUCTION

MIX was introduced by Knuth [4,5,6,7] in 1968 to aid in the analysis and description of computer algorithms. It was designed to be like most machines, but not to be any particular machine. It has been used throughout the three volumes of The Art of Computer Programming [4,5,6] which have been written, and has also been adopted by some universities for the teaching of assembly language programming [3,11].

1.1 MIX For Teaching

The MIX machine is by no means ideal for teaching introductory machine organization and programming. Its major alternative is the use of a real machine. With the decreasing cost and increasing availability of minicomputer systems, the potential for teaching with a real computer system, such as the PDP-11, HP 2100, or other minicomputer, improves. Most textbooks for this subject are aimed more toward using a real computer than a simulated one. The PDP-11 in particular is a very popular choice.

1.2 Teaching A Real Machine Or Teaching MIX

It has been argued, however, that the use of a real machine has as many potential disadvantages as advantages. Specifically, although the IBM 370 and the PDP-11 computer systems (which are most commonly used for teaching assembly language programming) are quite common machines, a student is just as likely to end up programming on a computer from Univac, CDC, Data General, Interdata, or any of a number of other computer manufacturers, or even other architectures from IBM or DEC (such as the Series 1 or PDP-8 or DEC-10). Thus, what is important is the basic programming techniques and understanding, not a specific programming language. There is also the problem of programming features or quirks peculiar to the specific machine being used.

Accessibility is another potential problem. Either students must have sole use of the machine, or they must share it with other users. Sharing will necessitate the use of an operating system, which will both deny the student access to the real I/O instructions and require that operating system topics, such as system calls, file descriptor blocks, and so on, be discussed rather than the basic I/O concepts of the underlying machine. Thus, use of a large multiprogrammed campus computer is providing a simulated machine to the students; the simulation is defined by and provided by the operating system.

Using a small dedicated minicomputer system to provide the students with the ability to access and control the entire machine without operating system interference may cause other problems. One is the requirement for hardware maintenance, and the disruption of hardware failures. A simulated machine such as MIX can be made available on all computers on campus, allowing the job load to simply shift temporarily to any available backup computer system, even one remote from campus, if necessary. Another problem is caused by the cost of providing sufficient machine resources for the typically large number of students in an undergraduate assembly language course. A simulated machine can, of course, be run in a shared mode on a large fast system.

One last advantage of a simulated machine is the ability to know exactly what the system does in all cases, since the program listing is available for inspection. The only recourse for a real machine may well be an examination of the actual circuit diagrams which may themselves be either incorrect or vague. (I once worked with a machine with an instruction which moved the program counter into an index register. None of the manuals indicated when the program counter was updated, however, and our engineers never were able to decide either. Empirical testing indicated that either it was a rather complex function of the previous, current, and next instruction, or a race condition.)

Other considerations, such as simplicity, textbooks, cost, and so on can also be argued [3].

2.0 MIX SIMULATION SYSTEMS

All of the purported advantages of a simulated MIX computer, however, depend upon the existence of a MIX simulator. Since a MIX system requires a computer, and the computer probably has an assembler but probably does not have a MIX system, it is often easier to teach the available assembly language rather than construct a MIX system. The alternative to constructing your own MIX system is of course to import a MIX system written elsewhere.

Many MIX systems have been written, with various success and various capabilities. A MIX system written in 360 assembly language exists at Stanford [8]. Another system for the 360 is used at UCLA [2] and another at Central Missouri State College [9]. These are written specifically for the IBM 360/370 architecture, and could not then be moved to another computer, but should be able to be moved to another 360/370 site. Another system was written at the University of Texas [3,10] for CDC 6000 series computers, but it is written in a mixture of assembly language and an obsolete systems programing language for a locally developed operating system, so movement of the simulator even to similar computer systems is almost impossible.

Systems developed primarily for their portable nature include a Fortran system at the University of Wisconsin [12] and one at the University of British Columbia [1]. The latter system has several documents which explain the system and how to transport it. Additional systems may also be available.

3.0 MIX SYSTEM WRITING

Writing a MIX system is not a particularly difficult task, however, and we would encourage you to seriously consider writing your own. Knuth has given a portion of the system and discussed the general approach, in Volume One [4]. Writing a system does require some time and care. The resulting system is typically large (some 70 pages for [1] and around 100 pages for [10]). Since it may exist for some time and may need to be moved from machine to machine, it should be well documented, modular, transportable, but also efficient in both time and space. Writing such a program is a good exercise in software engineering.

The most crucial part of a successful simulator system is a good basic design. The basic design of the system is implementation independent. A clear view of the complete design of the MIX simulator is very important for a successful product. This paper presents a top-down design for a MIX system. The design is presented only generally, in a narrative English, supplemented by some occasional Pascal-like data definitions and flow of control. We are mainly interested in the overall system design, and its components. Certain of these components, such as the assembler and the instruction execution simulator, have been adequately covered elsewhere [4,11], and we simply refer you to those references for that portion of the design. However, the entire system is composed of more than these two parts.

It is intended that the design presented here can be and will be freely implemented in many different languages by many different people on many different computers.

4.0 A DESIGN FOR A MIX SYSTEM

The most common problem with MIX systems is an incorrect assumption that the cpu instruction execution is the primary part of a MIX system. To the contrary, instruction execution is a minor portion of the simulated system. Figure 1 shows the hierarchical structure of our system design.

At the highest level, the MIX system consists of several parts. The MIXAL assembler reads an assembly language program and assembles it into MIX memory. The MIX computer system executes the program which is in MIX memory. MIX memory is the interface between the assembler and the computer system.

The flow of control between these two components is the main program of the MIX system. It can be written as simply as,

     BEGIN
          Zero MIX memory;
          Assemble a MIXAL program;
          If there are no assembly errors,
             Then execute the MIX program;
     END.
This simple MIX system allows one MIXAL program to be assembled and executed per job. For many applications this is entirely adequate. Input MIX jobs would consist of a MIXAL program, terminated by an END card, immediately followed by any data to be used by the MIX program in its execution.

4.1 MIX Control Cards

An alternative, more complex but more general, MIX system uses MIX control cards to determine the flow of control between the components of the MIX system, rather than using a fixed flow of assemble/execute/quit. In this case, the general MIX system flow of control would be,

     BEGIN
          Zero MIX memory;
          While there are still more MIX control cards,
             do begin
                     get next control card;
                     Case control card type of
                           Assemble;
                           Execute;
                             ...
                     end Case;
          end While;
     END.

The design of the control card system must obviously be determined at each site, according to the normal usage at each site. Typically control cards are identified by a special character ($, @, ! or such) in the first column of an input line. This is followed by a keyword identifying the next step to be performed, and optionally, some parameters. A typical set of control card functions might include:

Other control cards might also be necessary or desired.

5.0 THE MIXAL ASSEMBLER

The MIXAL assembler has been discussed in great detail elsewhere [11], and we refer the reader to that discussion. Suffice it to say that the MIXAL assembler assembles the input MIXAL program into a MIX machine language program in MIX memory. The assembler produces a listing on the output device and preferably a cross reference listing also. The assembler may be one-pass or two-pass, may be extended to include additional pseudo-instructions, macros, conditional assembly or whatever is desired. The important feature of the assembler is that it is essentially self-contained and interfaces with the remainder of the MIX system through a small number of well-defined variables such as,

  1. MIX memory.
  2. The starting address (may be passed in the program counter).
  3. The number and severity of assembly errors.

6.0 MIX MEMORY AND REGISTERS

MIX memory is an extremely important aspect of a MIX system. It is very important that the representation of the memory be very carefully thought out. We would recommend strongly that information hiding techniques be used to encapsulate the actual representation of memory and hide it from all other routines in the system.

In the abstract, of course, the definition of MIX memory is quite straightforward.

     MIX memory: array [0..3999] of MIX words;
     
     MIX words = record
                     sign: (positive, negative);
                     array [1..5] of byte;
                 end record;
A byte is either a subrange 0..63 or 0..99 depending on the base of the MIX machine. (One possibility is to fix the byte size as a compile time constant; another is to use a run time parameter.)

In fact, efficiency in time and space demand a more complex representation of memory. At the least, bytes should be packed to minimize memory usage. In the Stanford MIX system, a decimal machine, each MIX word occupies a double word (8 bytes, 64 bits). The UT-MIX system stores six 6-bit quantities in one 60-bit word, with the high-order 6-bit byte being either 0 (a positive sign) or 63 (a negative sign). John Baker's MIX system uses the lower 30 bits of a word to represent bytes 1:5 of a binary MIX word and the next bit to be the sign (0=positive; 1=negative). The major problem with all of these is the need for a minimum of 31 bits per word.

On a 16-bit minicomputer or an 8-bit microprocessor, alternative schemes may be needed. One approach is of course simply to use multi-word programming techniques to simulate a 32-bit word from four 8-bit bytes, in which a MIX word is then packed. For a binary MIX machine, this may require considerable shifting and masking, since the five 6-bit bytes will not be conveniently aligned within the four 8-bit bytes. An alternative scheme (also needed for decimal MIX machines) is to store each 6-bit MIX byte in one 8-bit machine byte. This leaves at least one bit per byte left over. One of these can be used as the sign, and others can be used for debugging flags, if desired. The Stanford MIX system uses some extra bits to catch accesses to undefined memory locations, and to locations which are currently being used for input or output.

Probably the most important attribute of the MIX memory should be the limited access allowed to it. Ideally, MIX memory should be treated as an abstract data structure, and all accesses to it should be made through a very small number of routines. The most obvious candidates would be a STORE(address,value) and a LOAD(address,value) pair of routines which would load or store the value to or from memory[address]. An alternative, or complementary, set of routines would be

     LOADBYTE (address,value,L,R)
     STOREBYTE (address,value,L,R)
which would access the L:R byte of memory[address]. The LOAD routine is obviously just a LOADBYTE(address,value,0,5), while the LOADBYTE routine could call LOAD to fetch the entire word and then extract the relevant bytes to return. Thus, these two pairs of routines are complementary and both may not be needed.

With proper design and restraint on the part of the programmer, only these two routines, and the routines to perform arithmetic and comparison operations on MIX words need ever know how memory is represented. Such an approach would thus allow the memory representation to be easily changed, if this were desired.

7.0 THE MIX COMPUTER SYSTEM

As indicated in Figure 1, the MIX computer system is itself composed of several separate parts. The reason for this decomposition is to allow the representation of realistic parallelism between the input/output system and the central processor unit. While it is easiest to write a MIX simulator which ignores I/O-cpu overlap, this prevents the programmer from learning about this very important part of assembly language programming.

The most appropriate programming technique to use for controlling the computer simulation is an event-driven simulation. This is possible because all events in the MIX system are deterministic in their time behavior.

As an example, assume that the cpu is computing along, and executes an IN instruction for the card reader, with the current clock at 1274 MIX time units. The IN instruction takes one MIX unit moving the clock to 1275. If it takes 1000 MIX units to read a card, then the card reader will be busy until 2275. So we post the event "card reader done" to occur at time 2275. The cpu then continues with the next instruction. If the next instruction is an OUT to the line printer which takes 1200 MIX units to print a line, we post the event "line printer done" for time 2476. If the next instruction, after the OUT, were an IOC to tape unit 0, taking 400 MIX units to rewind tape 0, we post another event "tape unit 0 done" for time 1677. Now we continue executing instructions until the first posted event occurs. As the clock ticks, each event must be handled at the appropriate time.

An event driven simulation requires only that every time the clock is advanced, the current clock value is compared against the posted time for all pending events, and if it is time for any event, that event should then be handled. A simple implementation of this uses an event queue. The event queue is a list of all pending events. Posting an event means entering an element into the event queue. Events are taken off the queue when it is time for them to occur. Each event is described by a record which would contain,

  1. the scheduled time for the event
  2. the type of event
  3. other parameters (such as device unit number)
  4. a pointer to the next pending event
The last item is used to construct a linked list of events; it would not be needed if a contiguous queue representation were used.

The queue should be kept sorted by scheduled time of events. This reduces the search time to find the next event to occur. Entry into the queue requires a sorted entry (insertion sort should be adequate). The time of the next pending event is the time of the event at the front of the queue. This allows the general flow of the MIX computer system to be:

While the computer is not halted
   do begin

       while the queue is empty
          or the next event time in the queue > clock
          do execute the next instruction;

       remove the event at the front of the queue;
       case event of ... end case;

     end while;
Several points are important here. Only one instruction is executed between checks of the event queue to see if an event is scheduled to occur. This assures that we do not "overrun" any events. Also, several events may be scheduled to happen at the same time. Both executing instructions (such as I/O instructions) and handling events may cause new events to be put into the queue. The case of halting the computer can be handled by scheduling an event ("halt") to happen after the last I/O device is finished, or by simply advancing the clock to the scheduled time for the last event (or both).

Some problems with this approach can still exist for the purist. The design given above does not allow events to occur except between instruction execution. Consider an event scheduled to occur at time 4201, with the current clock value being 4200. Since it is not yet time for the event, we execute the next instruction. If the next instruction is, say, a MOVE of 24 words (taking 49 time units), the clock value will be 4249 before we check the front of the queue again. Thus the scheduled event may occur late. In most cases this does not matter, but if the late event creates a new event, what time is used for the new event? Does it occur a fixed amount of time after the late event should have occurred (its scheduled time) or its late time? This can be a complicated question.

Consider the case of the event being an input or output of one word to or from memory, such as is suggested below. Words are moved between the I/O system and memory at regular intervals (the transfer time divided by the number of words per record). Thus, without the problem mentioned above transfer would happen at time k, k+t, k+2t, k+3t, ... If the transfer at time k occurs at time k', are all later transfers rescheduled? Consider that for memory reference instructions or MOVES, memory referencing would in fact be delayed, either for the cpu or the I/O device because of memory interference, but for shifts, MUL and DIV, only the first time unit (corresponding to instruction fetch) could cause interference; at any other time cpu processing and memory referencing for I/O would occur in parallel. Also note that on most systems, I/O has priority over cpu memory references.

Correct simulation at this level of detail is rather difficult and requires the MIX memory access routines to check the event queue for I/O transfers whenever access to memory is made by the cpu. It is felt that this level of detail is unnecessary and would overly complicate the simulator.

8.0 INPUT/OUTPUT DEVICE SIMULATION

Each I/O device should be represented by a description in the I/O device array. As defined by Knuth, only 20 devices are possible, although the architecture clearly allows up to 64 devices. A device descriptor would be indexed by the device unit number and probably contain at least the following information:

  1. device name
  2. device type: (card reader, tape, disk, ...)
  3. device record length
  4. device buffer (of the appropriate length). All actual I/O would be between the host machine operating system and this buffer. Words would be transferred between here and MIX memory at appropriate times.
  5. device existence (exists/does not exist for this configuration)
  6. device status (busy/ready)
  7. time when device will be ready
  8. other device state information
We also need a device event routine for each device, for each type of event, including IN, OUT, IOC, JBUS and JRED I/O instructions.

The device event routines will vary from device to device, but they can all be written in much the same form. For an I/O instruction which transfers between MIX memory and an I/O device, the simulator should transfer words one at a time between MIX memory and the device buffer in the descriptor record. Each device would transfer words at regular intervals with a possible start up delay and a stop delay. Suitable values for start-up, stop and word transfer times can be determined by examining times for I/O devices for typical minicomputers.

These three times would also typically correspond to the actions that must be taken by the simulator. For example, for an IN instruction, the first event would be the IN instruction. This would require the appropriate host I/O to read the next MIX record into the device buffer (in the descriptor). We would also need to compute and save the effective address in MIX memory to transfer to, and set the device status busy. This corresponds to the start up delay. An event "transfer ith word into memory[address]" would be posted to occur after the start up delay, with i=1 and the address being the effective address.

When the event "transfer ith word into memory[address]" occurs, the ith word would be transferred from the device buffer into MIX memory. Another event "transfer ith word into memory[address]" would be posted to occur after a transfer delay time with i and the memory address incremented by one, unless the value of i corresponds to the last word in the input record. In this case, we would post an event "device j input is complete" to occur after a stop time delay. The "device j input is complete" event merely clears the device status back to ready and any other necessary clean-up.

Output is similar.

An alternative to this approach is to schedule all of the transfer instructions to occur at their respective times when the IN or OUT instruction is executed. The difference between these two approaches lies in the size of the event queue. In the first approach, only one event per device will be scheduled, but that event may cause a new event to be scheduled. In the second approach, 16 or 24 or 100 events may be scheduled per device at one time, but none of these events will create any new events.

From this discussion, we can see that the following events would probably be sufficient:

"IN instruction", device, address
"OUT instruction", device, address
"IOC instruction", device, address
"move word in", device, word number, address
"move word out", device, word number, address
"I/O operation complete", device
The word number and address are best included in the event queue entry or the device descriptor record. This means that no local information need be retained by the device event handler, and thus, additional devices of the same type can use the same device event handler. (We do not need separate device handlers for tape drive 0 and tape drive 1, although we may need separate device handlers for each type of device.)

The simulation of card reader and line printer devices should present no real problem, but tape and disk drives may require careful thought. A magnetic tape device can be simulated by most sequentially accessed files. Simulating a disk or drum, on the other band, would require both random access files and the language features to access them. If the simulator is implemented in Fortran, such random access files are non-standard, and hence the system will not be transportable. Also, operating system conventions for creating, opening, closing, and deleting the files for these devices may be quite tiresome and clumsy. If at all possible, however, the system should simulate at least a few tape and/or disk drives. I/O device timing is an area which allows some creative programming. As mentioned, most devices can be characterized by a word transfer time and (possibly zero) start and stop times. These times can be determined by guess, or by examining manufacturers' literature for typical small machine peripherals. The IN and OUT instructions are thus quite straightforward. The IOC instruction, on the other hand, can be quite interesting.

  1. Line Printer - IOC 0 means skip to the top of the page. The time for this is a function of the distance from the top of the page, in lines. IOC n (skip n lines) is a function of n. Both suggest the need to count output lines to keep track of the current output line.
  2. Magnetic Tape - IOC 0 (rewind) is a function of the current position of the tape. IOC n or IOC -n (move tape n records forward or back) is a function of n. Notice that IOC n can be difficult to simulate in some systems with sequential files and might then be best simulated with a direct access file.
  3. Disk - An IOC (seek) takes a time which varies according to the current position of the head and the desired position (in the X register) of the head. Introducing rotational delay for IN and OUT instructions as well as head movement delay could be complex.
A final comment about timing is that in real systems, I/O timing is variable. Hence, greater realism can be obtained by perturbing the calculated times by some small random number. For example, in the Wisconsin MIX system [12], I/O for tape units takes 1000 + X time units where X is a random number between -200 and +200.

9.0 THE CPU INSTRUCTION INTERPRETER

The cpu instruction execution unit is a simple standard simulator. Knuth sketches the design of such a simulator in 1.4.3 of [4]. We would remark only that the optional instructions and features (binary instructions, indirect addressing, floating point instructions) can be added to make the system more complete.

10.0 DEBUGGING FACILITIES

What debugging facilities are to be provided for the user of the MIX system? Remember the MIX is basically an educational system, and so careful thought is needed concerning debugging facilities. The following might be considered:

  1. Dumps - Either partial or full dumps of MIX memory should be available whenever a fatal error occurs during execution. It may also be necessary to dump on command from the user. MIX memory is small enough that a complete dump may be reasonable, particularly if some scheme is used to avoid printing sequences of identical lines (like first identical, ..., last identical), and if MIX memory is initialized to zero.
  2. Traces - A line by line trace of instruction, program counter, register contents, clock, effective address, and contents of the effective address (where appropriate) should be available when requested. An interesting possibility used in the Stanford system [8] is to trace any location at most a fixed number of times (default was 2). This prevents long loops from being followed too often but generally gives enough information to find errors. This would require a counter with every instruction in MIX memory, but the counter would also allow an execution profile to be provided showing how often each instruction is executed.
  3. Snaps and Breakpoints - These are dumps and traces triggered by a specific program counter value. For example, a breakpoint at a particular location will cause a line of trace every time that instruction is executed; a snap will dump an area of memory every time a specified instruction is executed.
  4. Memory Monitoring - A common problem is the mysterious store which destroys the contents of a location but the programmer does not know where the store is (typically it is an indexed or indirect store, or an incorrectly modified instruction). Monitoring a particular memory location will produce trace and dump information whenever the specified location is read or written (as selected).
  5. Execution History - Another common problem is the fatal error which terminates the program with the program counter being either a data location or undefined memory. Some help on this can be obtained by providing the user, upon request, with a history of the instructions executed before the fatal error. This can be in the form of a complete backwards trace of the last n instructions, or a list of the last n jump addresses (address of jump and address jumped to) allowing the user to deduce the probable flow of control into the unknown.

All of these debugging aids can obviously be much more effective in a time-shared interactive environment. Notice that the change from batch to interactive use is only a change in the MIX control card input source and output device.

11.0 CONCLUSIONS

We have sketched the basic design of a complete MIX system, giving the major component pieces and mentioning the major problems involved. Still much work will need to be done by the programmer. Actual code has not been given here, since that is very machine and language dependent. However, the overall design should be clear. The remainder then is mere implementation.

12.0 REFERENCES

  1. Baker, J. L., "A MIX System", notes, Department of Computer Science, University of British Columbia, 1974.
  2. Braden, R. T., "MIX Users Guide", Campus Computing Network, UCLA, 1975.
  3. Greenawalt, E. M., and Good, D. I., "The MIX Computer as an Educational Tool", ACM National Conference, 1972.
  4. Knuth, D. E., The Art of Computer Programming, Volume 1: Fundamental Algorithms, Addison-Wesley, 1968, Second Edition, 1973.
  5. Knuth, D. E., The Art of Computer Programming, Volume 2: Seminumerical Algorithms, Addison-Wesley, 1971.
  6. Knuth, D. E., The Art of Computer Programming, Volume 3: Sorting and Searching, Addison-Wesley, 1973.
  7. Knuth, D. E., MIX, Addison-Wesley, 1970.
  8. Knuth, D. E., and Sites, R. L., "MIX/360 Users Guide", STAN-CS-71-197, Computer Science Department, Stanford University, (March, 1971).
  9. Ohlendorf, M. W., "CMSC MIX: A Simulator System for the MIX Computer", Master's Thesis, Department of Computer Sciences, University of Texas, (August, 1972).
  10. Peterson, J. L., "UT-MIX Reference Manual", TR-64, Department of Computer Sciences, University of Texas, (January, 1977).
  11. Peterson, J. L., Computer Organization and Assembly Language Programming, Academic Press, 1978.
  12. "MIX Users Guide", Computer Sciences Department, University of Wisconsin, (January 1970).
  13. Dale, P. B., Margolin, C. G., and Williams, G. H., "MIX and MIXAL (A User's Manual)", Computation Center, Union College, Schenectady, New York, 1975.

MIX System Sources

This is a list of the places and people that I have heard have MIX systems, which may be sources for other places who want to acquire a MIX system. Other systems may exist, and I would be glad to hear of them. Also, some of these people may not be in a position to widely distribute their systems.

  1. Stanford Center for Information Processing; Stanford University; Stanford, California 94305. (MIX for an IBM 360).
  2. Campus Computing Network; Mathematical Sciences Addition; University of California; Los Angeles, California 90024. (MIX for an IBM 360).
  3. Mr. Fred Jacobson; MACC - Madison Academic Computing Center; 1210 W. Dayton Street; Madison, Wisconsin 53706. (Fortran version).
  4. Mr. Kowalik; Academic Services; Washington State University; Pullman, Washington 99163. (Fortran version).
  5. John L. Baker; Department of Computer Science; University of British Columbia; Vancouver, British Columbia U6T 1W5; CANADA.
  6. C. G. Margolin and G. H. Williams; Computation Center; Union College; Schenectady, New York 12308. (MIX for B5700).
  7. Professor Higginbotham; Department of Industrial Engineering; Auburn University; Auburn, Alabama 36830. (IBM 360)
  8. Albert Shpuntoff; Department of Math and. Computer Science; Morningside College; Sioux City, Iowa 51106. (IBM 1130)

CHAPTER 1: ANSWERS TO EXERCISES

1.1 THE MEMORY UNIT (PAGE 10-11)

  1. Core memory and semiconductor memory. The differences discussed on pages 4 to 8, and center on: physical construction, speed, magnetic versus electrical recording techniques, and volatility. An alternate answer would be read-write memory versus read-only memory (ROM), but these are not necessarily physically different, since both would typically be semiconductor memories.
  2. The memory address register defines which word of memory is to be read or written, while the memory data register either has the information to be stored (for a write) or the information which was read (for a read).
  3. 12-bit addresses allow 4096 words of memory; 15 bits allow 32,768 words; 24-bit addresses allow 16,777,216 words.
  4. The cycle time of a memory is at least twice the switching time since every memory cycle must first read the information from memory and then write it back into memory. This defines the read-write cycle and is necessary because of destructive read-outs. Access time on the other hand is the same as switching time since the data can be used when it is read out, before it is re-written. Thus, access time = switching time, and 2 * (switching time) = cycle time.
  5. An n-bit word can represent 2n different bit patterns.
  6. For 4K of 16-bit words, we need 4096 * 16 bits = 65,536 bits. (If we included a parity bit with each word, it would be 4096 * 17 bits = 69,632 bits.) If the memory module cost $3,000, this would be $3000 ÷ 65536 = 4.58 cents per bit. This cost is not unreasonable for old core memory, but we recently bought 32K 16-bit (plus parity) words of MOS memory for $1700, giving a cost of about 0.3 cents per bit.
  7. An n-trit word would be able to represent 3n different values. Similarly, an n-trit address could address 3n different words. Three state devices (-1,0,+1) are being used internally in computers to control input-output bus control.
  8. If the high-order bit is used to distinguish between modules A and B, then addresses 0 to 4095 are in one module and addresses 4096 to 8095 are in the other module. This is known as memory banks. If the low-order bit is used to distinguish between modules A and B, then all the even addresses are in one module and the odd addresses are in the other module. This is interleaved memory.
  9. Dynamic memory must be refreshed at regular intervals or the electrical charge will decay to the point that information is lost. Static memories do not need to be refreshed. Both kinds of memory are volatile and will lose information if power is lost.
  10. Read-only memory is not volatile because it cannot be changed, either by accident or on purpose.

1.2 THE COMPUTATION UNIT (PAGES 39-41)

You can easily make up new problems in addition, subtraction or conversion between representation for binary, octal, decimal and hexadecimal by just using random digits in the appropriate base. One problem you may run into is the TI programmer's calculator. This hand-held calculator allows input and display in either octal, decimal or hexadecimal, and hence would allow the student to avoid learning the principles behind the conversions. It would probably be an unfair advantage on an exam.

  1. A number is an abstract concept which is represented by a numeral. A digit is a symbol which is used to construct numerals.
  2. The largest 16-bit unsigned number is 216-1 = 65,535.
  3. The largest unsigned number in n bits is 2n-1, rather than 2n, because the numbering starts at 0, not at 1.
  4. 41,972 (base 10) = 0100 0001 1001 0111 0010 (BCD).
  5. This is a tricky question. Most people don't think you can answer it and you can't exactly. However, you can say that it is not base 1, 2, 3, 4, 5, or 6, since the largest digit is 6. Thus, the correct answer is base b with b > 6.
  6. Both 0 and 1 have the same representation in octal, binary, and decimal. For octal and decimal, the numbers 0, 1, 2, ..., 7 all have the same representation. In general, for bases a and b, the numbers 0, ..., min(a,b)-1 have the same representation.
  7. Depends on whether they have to be spelled right. Any word with only the letters A, B, C, D, E, and F are acceptable, including, A, BE, CAB, CAD, BABE, ABED, DEAD, DEAF, DEED, BED, BEAD, ACE, DEB, DAB, CEDE, BADE, FAD, FADE, FEED, FED, ...
  8. Addition.

    Binary
         1011010      1001000111       11101011
        +0101011     +    110010     +100100101
        --------     -----------     ----------
        10000101      1001111001     1000010000
    
    Octal
          567674           77651        3472010
         +776571        +1623507       +   7743
         -------        --------       --------
         1566465         1723360        3501753
    
    Hexadecimal
           143AF         F9A8C7B          4FF58
          +2E135        +  9A67B        +141508
          ------        --------        -------
           424E4         FA432F6         191460
    
  9. Conversion
           Binary    Decimal   Octal   Hexadecimal
             1101         13      15             D
        100110010        306     462           132
          1100100        100     144            64
         10010000        144     220            90
           101101         45      55            2D
     110010101011       3243    6253           CAB
          1010111         87     127            57
           101011         43      53            2B
        101000100        324     504           144
    
  10. 35 bits will require 35 ÷ 3 = 12 octal digits. 12 octal digits can represent 12*3 = 36 bits. The difference is because with a 35-bit number, the high-order octal digit will only have two bits, so we add a leading zero. This will restrict the high-order octal digit to being only 0, 1, 2, or 3, so it isn't really an arbitrary octal digit.
  11. To convert from binary to base four, we group bits by twos and translate as 00 = 0, 01 = 1, 10 = 2, and 11 = 3. So 0100110110100110 = 10312212.
  12. Suppose we have a decimal number, 1978. Look for the largest entry in the table which is smaller than our decimal number. For 1978, this is 1536 which is a 3 in digit position 4. Subtract 1978 - 1536 leaving 442. Now repeat with 442. We find 1978 = 1536 + 442 = 1536 + 384 + 58 = 1536 + 384 + 56 + 2 = 3672 (octal).
  13. abc (base 8) = a * 64 + b * 8 + c
     
      a * 100 + b * 10 + c
     
      = abc (base 10).
  14. A binary number bn...b1b0 = bn*2n + ... + b1*21 + b0*20. So, 2*bn...b1b0 = bn*2n+1 + ... + b1*22 + b0*21 + 0*20 = bn...b1b00 (base 2). A zero should be shifted in as the low-order bit. Similarly, bn...b1b0 ÷ 2 = bn...b1. The low-order bit, b0 shifted off the right end is the remainder.
  15. Negative number representations.
             Sign and        Two's        Ones'
    Number  Magnitude   Complement   Complement   Excess 128
    ------  ---------   ----------   ----------   ----------
        93   01011101     01011101     01011101     11011101
       -93   11011101     10100011     10100010     00100011
       -13   10001101     11110011     11110010     01110011
       -14   10001110     11110010     11110001     01110010
        47   00101111     00101111     00101111     10101111
       128      -             -            -            -
      -128      -         10000000         -        00000000
         0   00000000     00000000     00000000     10000000
        -0   10000000         -        11111111         -
    
  16. In 8 bits with excess-3, we can represent -3 to 252. Adding the excess 3, changes this to 0 to 255, which is the range of numbers represented with 8 bits. In general for excess-m with n bits, we can represent -m to 2n-m-1.
  17. Negative number representation.
                   Sign +    Ones'    Two's   Excess
        Unsigned     Mag.   Compl.   Compl.       -4
        --------   ------   ------   ------   ------
    000        0        0        0        0       -4
    001        1        1        1        1       -3
    010        2        2        2        2       -2
    011        3        3        3        3       -1
    100        4       -0       -3       -4        0
    101        5       -1       -2       -3        1
    110        6       -2       -1       -2        2
    111        7       -3       -0       -1        3
    
  18. For ones' complement addition, if a carry is generated out of the high-order bits (sign hits), it must be carried around to the low-order bits and added back in. It is necessary to correct the answer in this way to maintain a correct representation.
  19. An n-bit number is transformed to a 2n-bit number of the same value by,
    • Unsigned: add n high-order 0 bits
    • Sign and Magnitude: add n 0 bits between the number and its sign bit
    • Two's Complement: add n high-order bits which are the same as the sign bit (sign extension)
    • Ones' Complement: sign extension
    • Excess-m Notation: For arbitrary m, add 22n-2n to the number. For excess-2n-1 to excess 22n-1 conversion, complement the sign bit, extend the sign bit, and recomplement the new sign bit.
  20. Let x' be the complement of x, then -x = x'+1 so -(-x) = -(x'+1) = -x'-1 = (x')'+1-1 = (x')' = x.
  21. For both ones' and two's complement, overflow occurs if the carry into the sign bit differs from the carry out, or (equivalently) if the signs are equal and different from the carry into the sign bit.
  22. Division of a double-length dividend by a single-length divisor may leave a double-length quotient. (Suppose that we divide by 1?) On the other hand, the remainder is always less than the divisor (in absolute value) and so must be single-length.
  23. Suppose we have a decimal fraction with n fractional places. The algorithm to convert to binary requires repeated multiplication by 2. Multiplying an n-place decimal fraction by 2 leaves at most an n-place decimal fraction. There are only 10n possible n-place decimal fractions. Thus, after at most 10n multiplications, the same fraction is produced and the binary fraction will start to repeat. The repeating part of the number can thus be at most 10n bits long. (In fact, much less, but I don't know exactly how much less.)
  24. Binary fractions can be represented exactly as a finite number of decimal digits because 2 is a factor of 10 (=2*5). One way to convert would be to multiply (in binary) the fraction by ten and take the integer part as the next digit; the fractional part is again multiplied by ten. Each multiplication by ten includes a multiplication by 2 which makes an n-bit binary fraction into an integer in at most n multiplications.
  25. 13.90625
  26. Overflow means the result of an operation has exceeded the range which can be correctly represented, i.e., the result is either too positive or too negative to be represented. Underflow is when a floating point number becomes so small (close to zero) that the precision necessary to distinguish it from zero exceeds the precision of the representation. This is reflected by the exponent of the number becoming too negative to be represented.
  27. For a normalized floating point number, either (a) the number is zero, or (b) the first digit of the fraction is non-zero. The fraction digit is taken in the base of the exponent. Thus, if the exponent is for base 2, the first bit must be 1; for an exponent of base 16 (like the IBM 370), at least one of the first four bits must be 1.
  28. 3.1416 = 010000101100100100001111.
    01111101110000000000000000 = 3 * 259.
  29. The advantage of an excess notation is that if the exponent is treated as an unsigned number, it will compare the same way as if it is treated as signed. Thus, for normalized numbers with excess exponents in the high-order bits, we can use an integer compare instruction to compare floating point numbers. This eliminates the need for an explicit floating point compare instruction.
  30. Certainly. BCD can be used to represent any number in either fixed or floating point. For example, a fixed point 4.36 can be 0100 0011 0110, while .92 * 1014 is just an exponent of 0001 0100 and a fraction of 1001 0010.

1.3 THE INPUT-OUTPUT SYSTEM (PAGE 61)

  1. Input/Output Devices and Media.
    Card Reader Cards
    Card Punch Cards
    Line Printer Paper
    Paper Tape Reader Paper Tape
    Paper Tape Punch Paper Tape
    Magnetic Tape Drives Magnetic Tape
    Disk Drives Magnetic Disk
    Drum Drive Magnetic Drum
    CRT ?
    Teletype Printer Paper
    Keyboard ?
    Plotter Paper
    Floppy Disk Drive Floppy Disk
    Cassette Tape Drive Cassette Tape
    Console Switches ?
  2. BCD is a 6-bit code; EBCDIC is an 8-bit code; ASCII is a 7-bit code, but sometimes 8-bit (when parity is attached).
  3. The term "BCD" is used both for a technique for representing decimal digits and for a class of character codes used on older computers. There is no direct relation between these two meanings of the same term (at least as far as I know).
  4. Six-bit codes do not allow enough different codes to represent both upper and lower case letters, in addition to numbers and special characters. A 7-bit code is inconvenient because it is not a power of two, or even an even number, and so does not allow an exact number of characters per word except on machines with 14, 21, 28, 35, ... bits per word, and very few computers have word lengths that are a multiple of seven.
  5. One inch at 200 bpi is 200 frames, which for a 7-track tape is 200 * (7-1) = 1200 bits.
  6. 7-track tapes are for use with 6-bit character codes; 9-track tapes are for use with 8-bit character codes; 8-track tapes are for your car tape player.
  7. We are forced to use 9-track tapes. The extra bit can be ignored or used for additional parity checking.
  8. See pages 45 and 46 for a discussion of parity.
  9. The sprocket holes tell the tape reader where the information is on the tape so that it can recognize how many zero frames are on a piece of blank tape. With magnetic tape, either odd parity is used so that every frame has at least one one-bit (a sprocket bit of sorts); or even parity, where the all-zero character code is not allowed, again giving at least one one-bit per frame. (The other use of the sprocket hole for some systems is for moving the tape forward and backward; tapes simply use a different movement mechanism.)
  10. With 12 rows, we could represent 4096 different characters. A column binary representation would be more efficient. Since Hollerith is essentially a 6-bit code, two characters could be put in each column, doubling the number of characters per card.
  11. At 200 bpi, an 80-character card image takes 0.4 inches, so 0.4 ÷ (0.4+0.75) = 35% of the tape is used for information; at 1600 bpi, 6.2% of the tape is used for information. Even with these figures, 1600 bpi is better, since its 6.2% stores one-third more characters in the same amount of tape as 200 bpi. (Less tape is used but it is used much, much better).
  12. A disk is shaped like a phonograph record and records on the flat surfaces of the disk; a drum is a cylinder which records on the side of the cylinder. (Thomas Edison used drums for his original phonograph player.) A moving head disk or drum has one head that moves back and forth to read various tracks (like the needle arm on a phonograph player). A fixed head disk or drum has a separate head for each track, so that the heads don't need to be moved.
  13. For a moving head device, seek time is the time to move the head to the correct track; latency is waiting for the correct sector to rotate under the head; transfer time is the time to actually read or write information. For a fixed-head device, there is no seek time; latency and transfer time are the same as for moving head devices. -

1.4 THE CONTROL UNIT (PAGE 65)

  1. The four basic components of a computer are: the memory system, the computation unit, the input/output system, and the control unit.
  2. The function of the control unit is to direct the other components of the computer (the memory, cpu and I/O system) to perform their functions at the right time in the right order.
  3. The basic instruction execution cycle is:
    1. fetch the next instruction
    2. decode the instruction
    3. execute the instruction
  4. A special purpose computer is built to do a specific task. The sequence of operations that it performs are built into the computer. A general purpose computer, on the other hand, uses a program stored in memory to direct its execution. The program can be easily changed.
  5. The program counter (or instruction address) is used to define which instruction should be the next instruction to be executed.
  6. For 37 instructions, we would need 6 bits (allowing 64 opcodes) since 5 bits (32 opcodes) are not enough. This assumes that we want a fixed length opcode. An alternative approach would be to use 5 bits most of the time, but pick one of them to say that another 3 bits should be used. This allows 31 5-bit opcodes, and 8 8-bit opcodes, and may be more effective (but harder to decode).

CHAPTER 2: ANSWERS TO EXERCISES

  1. The MIX machine has 4000 words of memory, addressed from 0 to 3999. I
  2. The registers of the MIX machine are: A, X, I1, I2, I3, I4, I5, I6, J, CI, and OT.
  3. A MIX instruction has an opcode field (C), a field specification (F), an index field (I), and an address field (+AA).
  4. The MIX computer can have magnetic tapes (units 0-7), magnetic disks and drums (units 8-15), card reader (unit 16), typewriter/paper tape device (unit 17), and line printer (unit 18).
  5. A MIX 1009B is a binary computer while a MIX 1009D is a decimal computer.
  6. The six (?) different types of MIX instructions are: loading, storing, arithmetic, immediate, comparison, jump, I/O and miscellaneous.
  7. Symbolic opcodes are easier to remember than numeric opcodes. Assembly language programs can have explanatory comments. Numbers can be written in decimal. Symbolic labels can be used rather than numeric addresses. Assembly language is easier to write, read, change, correct, and debug than machine language.
  8. MIX is a computer; MIXAL is an assembly language.
  9. A HLT instruction is a machine instruction which is translated into a machine instruction. The END instruction is a pseudo-instruction which tells the assembler that there are no more assembly language statements to be processed.
  10. LDA =5= copies into the A register from a memory location which has the number 5 as its contents, thus putting the number 5 into the A register (memory reference instruction). ENTA 5 puts the number 5 directly into the A register (immediate instruction). LDA 5 loads the A register with the contents of location 5 in memory. We can't tell what will be the value put in the A register unless we know what the contents of location 5 of memory are. ENTA =5= puts an address in the A register. The address is the address of a location in memory whose contents is the number 5.
  11. A program counter is a hardware register which tells the MIX computer which instruction to execute. A location counter is a variable used by an assembler to indicate what address should be used for the next assembled assembly language statement.
  12. The ENTA n instruction takes only one time unit to put n into the A register while LDA =n= takes two time units plus the extra memory to store n. However, the LDA =n= can load an arbitrary number (up to 5 bytes plus sign) while the ENTA n is limited to -4096 < n < +4096.
  13. The EQU puts the symbol ERG into the symbol table with the value 1435, while the CON puts the symbol ERG into the symbol table with the value of the current location counter, and puts 1435 in memory at that address.
  14. The question should ask: "What does the symbol * mean in MIXAL?" The symbol * may mean (depending upon its context): (a) multiplication, (b) the value of the location counter, (c) a comment card (in column one), or (d) the character * (in an ALF).
  15. The error is with the ORIG *+2*N. Since expressions are evaluated strictly left to right, this is (*+2)*N, not *+(2*N). Write it as 2*N+*.
  16. The END card is the last card in an assembly language program. Thus, all references to a label on the END card occur before the END card, and hence are forward references except a reference in the operand field of the END card. This would not be a forward reference, so the book is wrong. (But a LABEL END LABEL card makes no sense.)
  17. You would probably lose a friend. The ORIG *-100 card would reset the location counter back 100 and the next 100 statements would be stored in the same memory locations as the previous 100. The effect would be about the same as taking 100 cards out of the program.
  18. We can generate +214503000 by (among others),
    NOP 1125,3 CON 17(1:1),37(2:2),3(3:3) CON 679500(1:3) ALF P7C
  19. Some assembly time NOPS would be,
    ORIG * EQU 0 NO LABEL TO PUT IN TABLE ORIG *+0
    MIXAL really doesn't have many interesting assembly NOPs.
  20. First, all references to literals are forward references. Second, it wouldn't make much sense. With a literal we are saying that we are concerned about the contents, but not the address (the assembler can put it anywhere), whereas the expression is trying to use the address of the literal.
  21. Assuming there are no references to 2F or 2B elsewhere, this is equivalent to MOVE *-1(*) according to the definitions in this book. One sticky point: Knuth indicates that a 2B is not to refer to a label on the same statement as the reference to 2B. Thus, his interpretation would make it equivalent to MOVE *-2(10). However, the point is unlikely to occur in practice and the obvious implementation of a label such as LABEL OPCODE is the same as LABEL EQU * followed by an unlabeled OPCODE, for any OPCODE except EQU. Knuth's semantics would be much more difficult to define and implement.
  22. Pseudo-instructions are directives to the assembler executed at assembly time, so ORIG, EQU, ALF, and END are "executed" at assembly time. JMP, ENTX, HLT, JXN are all machine language instructions and are executed at execution time.
  23. We can use the EQU to write FIND STJ EXIT as
    FIND EQU * STJ EXIT
    This will work for all labels on non-EQU statements (but see the answer to problem 21 above for a different view of local symbols).
  24. Using a two-pass approach, the symbol table would be the following symbols and their octal values:
         X     144
         Y     132
         BEGIN 145
    
    and the machine language would be
         0145   +0144010210
         0146   +0132070530
         0147   +0146000240
         0150   +0000000205
    
    The starting address is +0145.
  25. We could reconstruct an assembly language program but it would not be unique. See problems 18 and 19 for example. The major problems are:
    1. We don't know which locations are instructions and which are data (CON or ALF).
    2. We don't know what symbols were used.
  26. A possible assembly language program is,
    IN BUF(16) JBUS *(16) OUT BUF(18) ENT2 15,4 SLC 0,2 ENTX 2014 HLT
  27. The contents of the A register is indeterminate unless we know the value of the X register also. The problem is that the CON 5 will be loaded into the A register by the LDA, and then the contents of the next location will be executed as an instruction. The next location in memory is the number 5, which is a NUM instruction. If we assume the X register to be zero, then the contents of the A and X register would be NUMed putting 500,000 in the A register. (The problem was meant to have the constant be a CHAR instruction, not a NUM. So if you change the constant 5 to 69 (0105 octal), then the A register contents will be +3636363636 octal when the HLT is executed.)
  28. There are several ways to do it. See Section 2.2, pages 75-81 for an example.
  29. Again, see Section 2.2, for writing machine language programs. The changes to find the maximum of four values would be major. The easiest way to do this problem would be to write the program in assembly language, assemble it and copy the answer off the assembly listing. That is how the program of Section 2.2 was written.
  30. The MIXAL assembler translates MIXAL assembly language programs into MIX machine language programs.
  31. The default field specification is 0:5 for memory reference instructions, except STJ for which it is 0:2. Non-memory reference instruction defaults are defined by the opcode table (see Appendix B, C, or D). The default value of all other fields is zero.
  32. A fixed-format assembler specifies the columns of an input card each statement must be put in. A free-format assembler uses delimiters to separate fields and identify the values to be put in each field.
  33. Undefined opcode. Undefined symbol reference. Multiply defined symbol. Expression overflow. Illegal forward reference. No END card. Forward reference in an ORIG, EQU, or CON statement. Symbol too long. Literal too long. Reference to an undefined local symbol (iB).
  34. A forward reference is a use of a symbol before it is defined (by being a label on some assembly language statement).
  35. They are not both needed. They are both included to save the programmer from the need to convert alphanumeric character strings to their MIX code and express them as decimal numbers. (Or equivalently, to save the programmer from needing to convert numeric constants to an equivalent character string if ALF were provided but not CON.)

CHAPTER 3: ANSWERS TO EXERCISES

  1. The X register exists simply to extend the A register where necessary, such as for the MUL, NUM, and CHAR instructions.
  2. After a multiply, the A register has the upper five bytes of the product; the X register has the lower five bytes. After a divide, the A register has the quotient and the X register has the remainder.
  3. A load instruction must reference memory for the value to put in the register, thus needing an extra memory cycle. An enter has the value to put in the register in the effective operand.
  4. A memory reference instruction must reference memory to fetch its operand, while the immediate instruction uses the effective address as its operand. The instruction execution cycle for a memory reference instruction is,
    1. fetch instruction
    2. decode instruction
    3. calculate effective address
    4. fetch operand from memory
    5. execute instruction
    For an immediate instruction, step 4 is not present.
  5. The index field is composed of two octal digits, a:b. The effective address is the contents of the address field plus the contents of index register a plus the contents of index register b, where the contents of "index register 0" are understood to be zero.
  6. For a constant n, an increment of n is equivalent to a decrement of -n. Thus, only one instruction is needed for constant n. However, if we use indexing, then INC allows addition of index registers, while DEC allows subtraction.
  7. One-liners (in MIXAL).
    1. +0012000265 ENT5 10
    2. +0000000260 ENTA 0
    3. +0000030160 DECA 0,3
    4. +0000010062 INC2 0,1
    5. +2734000260 ENTA *
    6. +0000000143 IOC 0(1)
  8. Double indexing is the technique of using two index registers in the calculation of the effective address. An example of a machine language instruction is +0000120262 (ENT2 0,2:1), which is an alternative solution to 4 in problem g above.
  9. Why is there a NOP instruction? No real reason that I can think of anymore. It could be just a filler (we have 32 different opcodes because of a 5-bit opcode field, but can only think of 29 things to do, so we fill the other three with NOP), or was originally used by programmers who wrote in machine language and needed to be able to "patch" absolute programs in such a way as to remove a previous instruction without moving everything else in memory. But other NOPs exist in MIX like JSJ *+1, INC1 0, SLC 10, MOVE 0(0), so an explicit NOP is really not necessary.
  10. We would need 8 bits to represent 159 opcodes (27 = 128 < 159 < 256 = 28. The MIX computer has only a 6-bit opcode because it also uses the 6-bit F field, giving an effective 12-bit opcode (4096 instructions), most of which are not used.
  11. For a load, the L:R bytes of memory are copied into the low order bytes of the register, with zeros filling the upper bytes. For a store, the low-order bytes of the register are stored into the L:R bytes of memory with all other bytes unchanged. For a compare, the L:R bytes of memory are compared with the L:R bytes of the register.
  12. The immediate instructions use the F field to distinguish between the different immediate instructions, so there is no room for a field specification in the instruction. Also they probably aren't really needed, and can be simulated by the immediate instruction and appropriate pre- and post-shifting.
  13. The MOVE instruction is not needed. It could be simulated by a short loop, but would take much longer.
  14. See Table 3.12 for MIX instruction execution times. The MOVE instruction must both read and write each word, requiring 2n time units, not n (plus one to fetch and decode the instruction).

CHAPTER 4: ANSWERS TO EXERCISES

  1. Programming is the creative task of converting a problem statement into an algorithm which will solve the problem. Coding is the relatively straightforward task of converting this algorithm into a program in a suitable language.
  2. To compute (J*LOGICAL)/PHYSICAL + BASE,
    LDA J MUL LOGICAL DIV PHYSICAL ADD BASE
  3. To compute Z = Y + 2*(W+V)/4 - 6*(10-W-V),
    ENTA 10 SUB W SUB V MUL =6= STX TEMP TEMP=6*(10-W-V) LDA W ADD V SLAX 5 DIV =2= (W+V)/2 ADD Y ADD TEMP STA Z
  4. To put the absolute value of N in the A register,
    LDA N(1:5)
  5. To compute the sign function of P, we can write,
    LDA P JAZ 1F STA *+1(0:0) ENTA 1 1H EQU *
    but I don't know how to get rid of the jump on zero.
  6. A program can be thought of as being made up of short segments, each segment doing some function part of the problem. We may want to jump to the end of a segment, and fall through to the next, but the label should logically be attached to the end of one segment, not the first instruction of the next. In this case, we might use EQU *.
  7. The major problem is that we need another register as a counter. Assume that we can use some temporary locations in memory. Then for a MOVE ADDR(N), we can write,
    STA TEMPA ST2 TEMP2 ENT2 0 1H CMP2 =N= CHECK FOR END OF MOVE JGE 2F LDA ADDR,2 STA 0,1 MOVE ADDR T0 I1 INC1 1 INC2 1 JMP 1B 2H LDA TEMPA RESTORE REGISTERS LD2 TEMP2
  8. Some loops are run backwards because they run faster and take less space.
                    N=3      N=10     N=n
    From 1 to N      17        52     2+5n
    From N to 1      12        33     3+3n
    
  9. Doesn't really. Arrays probably tend to be one word per entry more often than tables which have multiple words per entry, but not always.
  10. Use the shorter, faster code:
    LD1 INDEX LDA ARRAY-400,1
  11. A space-time tradeoff if a situation where the same code can be written two ways. The code can be faster, but only if it is larger, or made smaller, but slower.
  12. Working strictly with positive integers, we can write the following code for double precision integers. Assume that each integer is two words: the first word (P,Q,R) is high-order; the second (P+1,Q+1,R+1) is low-order.
    JOV *+1 MAKE SURE OT IS OFF LDA P+1 ADD Q+1 STA R+1 R=P+Q (LOW-ORDER) LDA P JNOV *+2 INCA 1 ADD 1 IF OVERFLOW ADD Q STA R
  13. To PUSH,
    * TOP = LENGTH INITIALLY LD5 TOP DEC5 1 JSN OVERFLOW STA STACK,5 ST5 TOP
    To POP,
    LD5 TOP CMP5 =LENGTH= JGE UNDERFLOW LDA STACK,5 INC5 1 ST5 TOP
  14. Let N be the number of nonblank characters per card. Then space and time are,
                              Space        Time
                              -----        ----
    Partial Fields             26      2+16*(4+5*6)-N
                                     = 546-N
    
    Instruction Modification   16      2+16*(5+5*12)-N
                                     = 1042-N
    
    Shifting                   14      2+16*(7+5*8)-N
                                     = 754-N
    
  15. The difficulty in averaging comes from the lack of floating point numbers. If we print an integer average, do we round or truncate? What if we want to know the average to the nearest tenth? In that case we multiply times ten before we divide the sum by the number of numbers. Also overflow is a problem.
  16. The percentages probably won't add up to 100 due to round-off error (if you round) or truncation problems (if you truncate). The sum of the percents can be less, equal, or even greater than 100.

CHAPTER 5: ANSWERS TO EXERCISES

  1. Devices and their physical record size.
    Tape 100 MIX words
    Disk 100 MIX words
    Drum 100 MIX words
    Card Reader 80 characters, 16 MIX words
    Line Printer 120 characters, 24 MIX words.
  2. The X register holds the track and sector address in bytes 4:5.
  3. A buffer is an array in memory which is used to temporarily hold information to be input or output. It is used because of the differences in speed between the cpu and I/O devices.
  4. I/O interlock time is the time that the cpu must wait before executing an I/O command because the device is already busy with a previous I/O command.
  5. Buffering is used to smooth over the differences in speed between the cpu and I/O devices.
  6. Blocking is used to allow efficient utilization of space on I/O storage devices. It is necessary because the size of information which we wish to store may differ from the fixed physical record size for the device.
  7. Same question as question n of Section 1.3. Seek time is the time to move the head over the desired track, latency is waiting for the disk to revolve to the right sector; transfer is the actual time to read or write information.
  8. The JBUS *(16) causes the computer to wait until the I/O device has finished its last operation. Without it the following LDX may load either the old or new value of BUF, depending upon the speed of the device.
  9. Hard to answer. At first shot, it would appear that we will read a card and the print that card on the line printer, preceded by the word DATA. But the programmer has forgotten to wait for the IN to complete before the OUT, so the input card will probably not be read in before the OUT prints the buffer, and so we can't really say.
  10. You might think so, and most of the time probably can, but what if there is a reference to WAIT+1?
  11. Time to copy 1000 cards is 1000 * (50000 + 10 + 1000) microseconds = 51.01 seconds. If we overlap processing and I/O we can reduce this to about 50000 (for the first card) + 999 * max(50000, 10, 1000) + 1000 (for the last card) = 50.001 seconds. Blocking multiple card images per drum record (without overlap) would take 1000 * (50000+10) + 1000 ÷ 6 * 1000 (for loose blocking) or 1000 * (50000+10) + 1000 ÷ 6.25 * 1000 (for tight blocking). Blocking with overlap is the same as the simple overlap case considered above since total time is dominated by the card read time. Specifically with a 50 millisecond per card input rate, 1000 cards will always take at least 50 seconds for input, totally independent of I/O overlap or output blocking.
  12. In the best case, we would be able to overlap input and output completely (except for the first card) and so the time to process n cards would be 50 (for the first read) + (n-1) * max(50,1,50) + max (50,1) (for the last write to disk or printer) = (n+1) * 50 milliseconds.
  13. Blocking factors and wasted words for different blocking modes.
                  No Blocking     Loose Blocking     Tight Blocking
             ----------------   ----------------   ----------------
             Blocking   Words   Blocking   Words   Blocking   Words
             Factor    Wasted   Factor    Wasted   Factor    Wasted
    
      7             1      93         14       2     14.285       0
     13             1      87          7       9      7.692       0
     18             1      82          5      10      5.555       0
     41             1      59          2      18      2.439       0
    500           0.2       0        0.2       0      0.2         0
    
  14. ... Left as an exercise to the student ...
  15. There is very little overlap which can be done here. The strategy would be to read and copy to disk or drum, then read it backwards, printing. The only overlap possible is between input and disk and disk and output.
  16. A sequential access device only allows operations of the form: move forward one, move backward one, or rewind. A direct access device allows any arbitrary record to be accessed directly. We can simulate a sequential access device on a direct access device easily. If you want to stretch the definition of "simulate", it is also possible to simulate a direct access device on a sequential access device, but only at a very high cost in I/O and time. (We simply keep track of what record we are currently on and move the tape to the next record by lots of movements, one record at a time.)
  17. The major performance difference is time: it takes longer to move information than to switch pointers. The major programming difference is that, with the pointers, all accesses must be made indirectly, either by indexing or indirectly through a memory location. If information is moved, we can always access the information from the same place in memory each time.
  18. It would be difficult to buffer a direct access input device, since we do not know which record will be needed next. A direct access output device, on the other hand, can be buffered since we know where each record is to be put and so we can "buffer behind".

CHAPTER 6: ANSWERS TO EXERCISES

  1. One reason is to save space. Another reason, more difficult to motivate but more important, is to make the program more modular.
  2. A calling sequence is the method of calling a subroutine. It involves describing how all of the registers and memory should be set up to allow parameters to be passed.
  3. Only the calling routine knows which registers are being used, so only the calling routine can save only those registers which need be saved. Only the called routine knows which registers will be used, so only the called routine can save only the registers which need be saved.
  4. The J register exists to allow a return address to be easily passed to a subroutine.
  5. The parameters can be passed: in registers; in global variables; in the called routine before the entry point; in the called routine after the call; in a table; or on a stack.
  6. Not very useful. The subroutine always returns to RTN, therefore it can never really be used as a subroutine.
  7. ... Left as an exercise to the student ...
  8. Call by value passes the value of a parameter; call by reference passes the address of a parameter; call by name passes the address of a thunk which will compute the address of a parameter every time it is called.
  9. Call by reference is sufficient, as long as we can only pass simple variables.
  10. The values would be,
                         I   J  A(1)  A(2)
    Call by Value        1   2   100   0
    Call by Reference    2   2     0   0
    Call by Name         2   2   100   0
    
  11. Thunks are, because they are needed to correctly implement the copy rule.
  12. Self-modifying code is code which stores new instructions in itself, thus changing the program as it executes. Serially reentrant code may be self-modifying but always changes things back before it quits or after it starts, so each execution is of the same program. A recursive routine may call itself. A coroutine is a routine which will call its caller rather than returning to its caller like a subroutine. Pure code is nonself-modifying. Reentrant code is nonself-modifying.
  13. A recursive subroutine needs a stack.
  14. A reentrant subroutine is serially reentrant, and pure code. It need not be position independent nor recursive.
  15. Something like,

  16. On a computer which is used by many people at the same time, it may be possible to share pure code between many users, thus saving memory space. This is an operating system topic (Section 9.5), and particularly useful for time shared operating systems.

CHAPTER 7: ANSWERS TO EXERCISES

  1. Bootstrap loader, absolute loader, (relocatable loader, linking loader, relocatable linking loader), linkage editor, overlay loader.
  2. Each logical record for an absolute loader must contain a sequence of values plus the addresses at which to load them. This is often encoded as: a number N, an address A, and N values V1, V2, ..., VN, to be loaded at A, A+1, A+N-1, respectively. A checksum may also be provided.
  3. Locations within a routine which are used by another routine are called entry points and externals. The name used is determined by whether we are considering the user (external) or the definer (entry point).
  4. The purpose of a checksum is to provide a form of error detection.
  5. AND and ORR would be of no value for checksum computation. For AND, x AND 0 = 0, so that all bits in the checksum would tend to become zero, independent of the input bit values. For ORR, x ORR 1 = 1, so all bits in the checksum would tend to be one. The XOR works fine and is often used for checksum computation.
  6. A bootstrap loader is the original loader used to load the regular system loader. It is needed when the computer is initially turned on and has no other system software in its memory. Alt is generally an absolute loader of the most primitive type.
  7. One technique is to use two passes; one to build the entry point table, the other to actually link. Another technique is to build an entry point reference table which simply keeps track of all addresses where entry points are used (as externals) and the correct address is filled in when it becomes available. Another technique is to use chaining.
  8. The relocatable statements are STJ EXIT, STZ I, LDA I, MUL I, OHPX N, JG FOUND, LDA I, STA I, JHP 1B, LDA I, and JMP *. Basically everything except the two CONs and INCA 1.
  9. Processing these modules with a linkage editor would leave FORE and TOD as externals, and would declare BEGIN, ARY, N, FOD, NIX and CS as entry points.
  10. An absolute loader and bootstrap loader are one pass. A relocatable loader (relocatable linking loader, we assume) could be either one or two pass. A linkage editor would just about have to be two pass.
  11. Relocated code stored in memory would be,
    0432 +0000000000
    0435 +0432000510
    0436 +0433000530
    0437 +0001000260
    0440 +0434003430
    
  12. Binding time for machine language programs is when they are written, for absolute assembly when they are assembled, and for relocatable programs when they are loaded.
  13. It would be possible to have a relocatable, non-linking loader, but it would make little sense, since the separately relocated parts would have no way to interact.
  14. A linking loader inputs relocatable load modules and links them together in memory. A linkage editor inputs relocatable load modules, links and relocates them, but then outputs the result as another (larger) relocatable load module.
  15. Bootstrap first, then absolute loader, then relocatable and linking as one loader, finally a linkage editor.
  16. ... Left as an exercise to the student ...
  17. The assembly language would need to have Entry Point and External declaration statements. Also there would be restrictions on the form of expressions allowed since all expressions need to be either relocatable or absolute when evaluated.

CHAPTER 8: ANSWERS TO EXERCISES

  1. Two pass.
               +0000000144  MAXSEM  EQU  100
    +0000                   SYMBOL  ORIG 2*MAXSYM+2
                            *
    +0312  +0324  00 02 40  SEARCH  STJ  ENDSEAR
    +0313  +0323  00 02 32          ST2  SAVSEAR(0:2)
    +0314  +0000  01 05 30          STA  0,1
    +0315  +0001  01 05 12          LD2  1,1
    +0316  +0002  02 00 62          INC2 2,2
                     +0317  2H      EQU  *
    +0317  +0002  00 01 62          DEC2 2
    +0320  +0000  12 05 70          CMPA 0,1:2
    +0321  +0317  00 10 47          JNE  2B
    +0322  +0000  02 02 61          ENT1 0,2
    +0323  +0323  00 02 62  SAVSEAR ENT2 *
    +0324  +0324  00 00 47  ENDSEAR JMP  *
    
  2. One pass, with symbol table following.
               +0000000144  MAXSEM  EQU  100
    +0000                   SYMBOL  ORIG 2*MAXSYM+2
                            *
    +0312  +7777  00 02 40  SEARCH  STJ  ENDSEAR
    +0313  +7777  00 02 32          ST2  SAVSEAR(0:2)
    +0314  +0000  01 05 30          STA  0,1
    +0315  +0001  01 05 12          LD2  1,1
    +0316  +0002  02 00 62          INC2 2,2
                     +0317  2H      EQU  *
    +0317  +0002  00 01 62          DEC2 2
    +0320  +0000  12 05 70          CMPA 0,1:2
    +0321  +0317  00 10 47          JNE  2B
    +0322  +0000  02 02 61          ENT1 0,2
    +0323  +0323  00 02 62  SAVSEAR ENT2 *
    +0324  +0324  00 00 47  ENDSEAR JMP  *
    
    Symbol Table
    MAXSYM  0144
    SYMBOL  0000
    SEARCH  0312
    SAVSEAR 0323
    ENDSEAR 0324
    
  3. For a fixed format assembler, the free format routine may need to be reformatted; for a free format assembler, both should work without reformatting.
  4. It is possible to relax the requirement for labels to start in column one, but this may introduce another problem. How do we distinguish between a label and an opcode (if no label is used the first symbol on the card would be an opcode)? The traditional solution is to mark a label definition to easily separate labels from opcodes. Specifically, a label definition is immediately followed by a colon, as LABEL: OPCODE OPERAND.
  5. We would not be able to use opcode mnemonics as labels (You probably didn't realize that you could). The search for opcodes and labels would take longer. Only one search routine would be needed so the assembler might be smaller. Another field might be needed with each symbol/opcode table entry to distinguish user symbols from opcodes.
  6. For each symbol table entry, a one-pass assembler needs to keep (1) the symbol, (2) whether it is defined or a forward reference, (3) its value if defined, and (4) the addresses of its use if it is a forward reference.
  7. Symbol table searching from the beginning to the end will find the first definition of a multiply-defined symbol. Searching from the end will find the most recently defined value.
  8. No. This would produce a hash function whose value would depend upon the time of day when it was computed. What is needed is a hash function which depends only on the symbol being hashed (and the size of the symbol table).
  9. On pass 1, we must construct the symbol table, scan for the opcode, process ORIGs, search the symbol table (for multiply defined labels), update the location counter, and process the label field (1,3,5,6,7,10). On pass 2, we must output code for the loader, print the listing, search the symbol table (for operands), treat an LDA different from an STA, and process CONs (2,4,6,8,9). It is possible to do 3,5,7 on both pass 1 and pass 2, or we can do them in pass 1 and save the results for pass 2 in the intermediate text. One could argue that 7 must be done on both passes, if we say that the location counter is being updated by reading the values from the intermediate text.
  10. An error can be noted. The second symbol definition can be ignored, or the second definition can be saved as the symbol table value.
  11. On which pass are these errors found?
    1. Undefined symbols would normally be found on pass 2, but may be found in pass 1 as illegal forward references.
    2. Pass 1
    3. Pass 1
    4. Pass 1
    5. Pass 2
  12. (1) two passes, (2) chaining with fixup by the loader, (3) use table, with fixup by the loader.
  13. At least two. A one-pass assembler will not have the symbol table until the end of its first pass through the program and this pass must also produce the listing, so a one-pass assembler must produce the listing first, then the symbol table.
  14. The one-pass would be faster, while the two-pass could allow more general use of forward references, would be able to provide a more complete listing, and would need a simpler loader.
  15. A two-pass assembler would not be able to handle forward references to X, as,
    Z EQU X X EQU Y Y EQU 0
    This would require three passes. In general, one can define a chain of forward references as a sequence of forward references in which each depends ,on the following (except the last which is defined). The above example is a chain (Z,X,Y) of length 3. A chain of length N requires N passes to resolve all symbol values.
  16. It will not need any additional data structures, per se, but may be able to make good use of an entry point table and an external symbol table. This information can be kept in a symbol table entry which would need to include, in addition to the normal information, an indication of whether the symbol was absolute, relocatable, or external. For external symbols, we may also need either a chain address or a list of addresses where it is used.
  17. Simply add these new mnemonics to the opcode table, with the same value as the older mnemonics.
  18. ... Left as an exercise to the student ...

CHAPTER 9: ANSWERS TO EXERCISES

  1. A subroutine is written once and exists at only one location in memory, out-of-line. A macro is expanded in-line every time it is called. Macros will execute faster than subroutines, while subroutines will take less space.
  2. A macro is an open, in-line subroutine, which passes parameters by name (1,2,6).
  3. First pass: macro expansion; second pass: symbol table construction; third pass: code generation.
  4. What may appear to be errors may be corrected by the parameters passed to the macro.
  5. One way is to add additional code which is executed every time a particular opcode is used. For example, assume that a program is somehow, somewhere incorrectly storing the A register in a variable LC. If we have a free index register 6, we could write the following macro to replace all STA instructions without changing our program at all:
    STA MACR P1,P2 ENT6 P1,P2 CALCULATE EFFECTIVE ADDRESS GMP6 =LC= COMPARE ADDRESS TO LC JNE *+2 HLT STOP IF THIS IS STORE TO LC CON 24(5:5),5(4:4),6(3:3) STA 0,6 ENDH
    By temporarily adding this macro, every store will be checked for storing in LC and when the first culprit is found the program will halt (or it could output a message and continue or whatever). When the problem is corrected, the macro can be removed, leaving everything the same.
  6. Conditional assembly is absolutely necessary for recursive macro calls. Otherwise the recursion would continue indefinitely (or at least until the assembler ran out of time or space).
  7. See pages 332-336.
  8. Definitions.
    1. Interpreter or simulator.
    2. Fortran compiler.
    3. MIXAL assembler.
    4. Operating system.
  9. An interpreter is a program which acts like a computer, fetching instructions and executing them.
  10. At least two (and for a smart system only two), because a segment may be marked for conditional loading when it is encountered, and we will not know if it is going to be used or not. We know this only at the end of our first pass. By storing all the segment names, lengths, entry points, and conditional load bits, we can determine, after the first pass, which segments are to be loaded and which (conditionally loaded) segments are needed, determine starting addresses, relocation bases, and entry point addresses, so on pass two everything that is needed can be loaded.
  11. Higher-level languages are easier to write, to read, to use, to move to another machine, and to debug and test. Assembly languages allow better code (smaller, faster) to be written and allow access to and control over all machine features. Systems programming languages attempt to combine the advantages of both high level and assembly languages.
  12. Interesting question. Probably not, since it is possible to use any arbitrary symbol (including opcodes and pseudo-instructions) as labels and symbolic operands. The closest thing to a reserved word is probably *.
  13. The real computer may not exist (like MIX), or may not exist yet (under construction). Or the program being interpreted may be under study because of bugs or poor performance and so more information may be desired than simply the final register contents. By interpreting, special actions can be taken to display the actual execution of the program for better understanding.
  14. Control cards tell the operating system what steps it would perform for a job.
  15. Device independent I/O is a program which is written so that it need not know or take special actions based on its input or output devices. It uses logical devices. These logical devices are mapped onto the physical devices by the operating system. A file is a logical device which acts like a magnetic tape or disk or drum.
  16. Any code which can be generated by a compiler could also have been written in assembly language, thus a compiler cannot produce better code than an assembly language programmer. More commonly it produces poorer code because it does not "know" what the program is supposed to do. Thus, it must translate the statements which are often overly general. A systems programing language allows the programmer to more explicitly control which instructions are generated by the compiler.

CHAPTER 10: ANSWERS TO EXERCISES

10.2 THE PDP-8 (PAGES 368-369)

  1. PDP-8 memory consists of 4096 words of 12 bits each. Addresses are 12-bits. The two registers are the Accumulator and the Link bit.
  2. PDP-8 memory is logically organized in pages of 128 words. The effective address calculation is described on page 357.
  3. The PDP-8 has only 8 basic instructions, but one of these opcodes (the Operate instruction) uses the remaining 9 bits in the instruction to specify additional instructions. This allows many more instructions and mnemonics.
  4. The number of instructions could have been further reduced. ISZ and JMS could have been dropped as could the IAC, RTR, RTL, RAL, CLA, SMA, and RSS instructions (at least). These additional instructions were added because they were thought to be generally convenient to use and not overly expensive to implement.
    1. Complement and Increment: forms the two's complement.
    2. Illegal. SMA and IAC are in different groups.
  5. ... Missing Answer ...
  6. Since the A register may have an arbitrary value, we must clear it first, then TAD TEA, and skip on zero and minus, as,
    CLA TAD TEA SMA JMP NNEG JMP NGE
  7. Comparing PDP-8 and MIX code.
                 PDP-8             MIX
                 -----             ---
               ISZ  TOPS         LDA  TOPS
               JMP  LOOP         INCA 1
                                 STA  TOPS        
                                 JANZ LOOP        
    
  8. ... Left as an exercise to the student ...

10.3 THE HP 2100 (PAGE 378)

  1. The HP 2100 has 16-bit words and 15-bit addresses allowing up to 32K words of memory. The word size is larger than the address size to allow an indirect bit to be specified with each address.
  2. There are four registers: A and B (16 bits), E (Extend bit), and O (Overflow bit).
  3. The fundamental difference is the increase from 12 bits to 16 bits per word, allowing an instruction set with 4 bits for an opcode field plus a register bit selecting one of two registers. This allows some additional instructions such as compares and stores.
  4. The HP 2100 has DMA transfers and a vectored priority interrupt system.
  5. DMA is a method of doing input or output directly between memory and an I/O device without requiring program control of every word or character transferred.
  6. A vectored interrupt system interrupts by a subroutine jump to different locations depending upon the device which is interrupting.

10.4 THE PDP-11 (PAGES 386-387)

  1. The word size is 16 bits with a 16-bit address, but since memory is byte addressable, with two 8-bit bytes per word, we can only address 32K words of memory (64K bytes).
  2. By making each byte addressable, character manipulation is easier.
  3. There are eight user registers plus the condition code. The first six registers (0 to 5) are general purpose. Register 6 is the stack pointer, and register 7 is the program counter.
  4. It is possible to do many operations without needing the registers. For example, to assign the value of a variable P to a variable Q, we simply MOV P,Q rather than LDA P; STA Q.
  5. This obviously seems to be a cue to the assembler used to identify the pseudo-instructions. It allows the assembler to use separate tables for pseudo-instructions and machine instructions so each table is smaller and faster to search.
  6. Recursive programs need a stack, and the PDP-11 makes it easy to program with a stack.
  7. The PDP-11 is a priority vectored interrupt system, like the HP 2100. The major difference is that the priorities are determined by the processor status word and hence are not fixed but can be changed under program control.

10.5 THE IBM SYSTEM 360 AND SYSTEM 370 (PAGE 396)

  1. The 370 has 8-bit byte addressable memory. Memory addresses are up to 24 bits allowing a maximum of 16 megabytes of memory. Each word is 32 bits, four bytes. There are 16 general purpose registers plus 8 floating point registers plus the program status word register.
  2. A model 30 is cheaper, slower, and smaller than the 370 model 168.
  3. Effective address calculation is the sum of the displacement plus the contents of the base register plus the contents of the index register. If register 0 is used as base register or index register, a value of 0 is used.
  4. See pages 388-393.
  5. The USING and DROP pseudo-instructions are used to tell the assembler which registers can be used as base registers and what their contents are. This allows the assembler to automatically compute the correct base register and displacement for each symbol.
  6. Input/Output on the 370 is done by a channel. The advantage of this approach is that input/output can be completely controlled by the channel and need not interfere with cpu execution except to start and complete the I/O instructions.

10.6 THE BURROUGHS B5500 (PAGES 402-403).

  1. The memory of the B5500 is up to 32K (15-bit addresses) of 48 bit words.
  2. The assembler is not needed and all programming can be done in a higher-level language with all of its advantages (see problem k of Chapter 9).
  3. The bootstrap loader, at least, must be written in machine language. So must the first program which translates into machine language. Unless, of course, another computer exists, in which case all code (for any computer) can be written on the existing computer in its existing higher-level language, translated into machine language for the new computer, and carried over and run on the new machine.
  4. The designers probably felt that the two different types of applications for which the machine would be used (character handling and scientific number manipulation) could best be done in two different ways.
  5. Integers are treated as special cases of floating point numbers with a zero exponent. This eliminates the need for extra instructions and hardware to process separately integer and floating point numbers.
  6. The program would be something like,
    LOAD Y LITERAL 2 LOAD W LOAD V ADD MUL LITERAL 4 DIV ADD LITERAL 6 LOAD W SUB LOAD V SUB MUL SUB
  7. No. Programs with lots of loads and stores will tend to not be shorter on stack machines. Other instruction sets can be just as efficient on many problems. Register machines are particularly good on computations with lots of common subexpressions.

10.7 THE CDC 6600 (PAGE 411)

  1. The CDC 6600 has 60-bit words. Addresses are 18 bits allowing up to 262K words of memory.
  2. Loads and stores occur automatically when the A registers are set to a memory address. Setting A1, A2, ..., A5 will automatically load into X1, X2, ..., X5, respectively. Setting A6 or A7 will store X6 or X7 into the memory address put in A6 or A7.
  3. The 370 has its general purpose registers which can be used as index registers.
  4. The peripheral processors do all I/O, and use polling to detect I/O completion.
  5. The 18-bit accumulator is related to the 18-bit addresses of the central processor.
  6. The word length for the 370 (32 bits) works well with hex but would be awkward with octal, especially with 8-bit EBCDIC. The 6600 could use hexadecimal for its 60-bit words, but since it uses a 6-bit character code, octal is more convenient. However, other machines, like the PDP-11 and HP 2100 with 16-bit words and 8-bit character codes use octal instead of hexadecimal. The choice is strictly personal bias.

10.8 THE INTEL 8080 (PAGE 419)

  1. There is no memory on an Intel 8080. However, we have a 16-bit address with an 8-bit word which allows up to 64K bytes memory to be used if it is separately provided.
  2. Cost. A larger word size would require more interface lines between the cpu chip and the outside world (i.e., memory) and also would require more hardware on the chip to store and process the larger words.
  3. Multiple precision arithmetic would require several bytes to be used for each number. To add a number, the lower bytes would be added plus the carry (0 or 1) from the low-order bytes. This would continue until all bytes have been added. Subtraction is similar. Two's complement representation would probably be easiest. Multiplication and division are much harder.
  4. The 8080 interrupt system is sort of in the middle of the PDP-8 and the HPZIOO. It has up to 8 levels of interrupts which are separately vectored, so it is more general than the PDP-8, but not as general as the HP 2100 which has 64 interrupt vector locations.
  5. A cross-assembler is an assembler which executes on one computer (machine A) and assembles instructions for another computer (machine B). This is done for the 8080 because the 8080 may be too small, too slow and too limited in I/O devices to support its own assembler. Thus, the assembler is written for and executes on another, larger computer system.
  6. The I/O instructions are similar but for the 8080 only one 8-bit byte is transferred, as opposed to the MIX system which transfers a fixed number depending on the device. Also there is no built-in way to check for a device being busy or done, as there is on the MIX computer. See pages 417-418.

10.9 SUMMARY FOR CHAPTER 10 (PAGES 421-422)

  1. Computer Corporate Genealogy.
    1. Amdahl came from IBM.
    2. Control Data came from Univac.
    3. Data General came from Digital Equipment.
    4. Cray came from CDC (which came from Univac).
  2. See the answer to question j of Chapter 3.
  3. Instruction cycle for a computer with interrupts.
    1. Fetch an instruction.
    2. Decode the instruction.
    3. Execute the instruction.
    4. If the interrupt system is on, check if there are any interrupt requests. If so, enter an interrupt phase. Then go back to step 1.
  4. The advantage of a skip instruction is that there is no need for a (jump) address. This allows the instruction to be encoded in fewer bits. Also, whenever the condition controls the execution of exactly one instruction, it is very efficient. However, for a more general conditional jump, two instructions are needed. Thus many designers feel that the conditional jump is more likely to be needed and so build it into the instruction set to try to match the instruction set to its use.
  5. A lot of registers allow important values to be kept easily available at all times. On the other hand, it requires more time to save and restore more registers for subroutine calls.
  6. The 0-address computer and the B5500 (which is a 0-address computer) are stack machines (c,f).
  7. This code for a stack machine to compute D=A*(B+C).
  8. Microprogramming is the technique of building a simple computer (A), then writing for this simple computer an interpreter for another computer (B). The interpreter is put in ROM and the whole package is sold as if it were the computer B (which it is). The advantage of this approach is that the same hardware can be used (with different interpreters) to build other computers, and that a complicated computer can be easily built from simple hardware (with a complicated interpreter).
  9. Interrupts allow the computer to be busy doing some useful work until it is needed to service an I/O device with no need to constantly check the state of the I/O device (polling) and no fear of waiting too long before servicing the device.
  10. A trap.
  11. Comparing different computers.
       MIX   PDP-8   HP 2100   PDP-11   IBM 370   CDC 6600 
    Word Size (in bits) 31 12 16 16 32 60
    Addressable Memory Unit word word word byte byte word
    Integer Number Representation Sign and Magnitude Two's Complement Two's Complement Two's Complement Two's Complement Ones' Complement
    Major Registers A, X, J, I1,...,I6 A A,B R0,...,R5, SP, PC R0,...,R15, Floating 0,2,4,6 A0,...,A7, B1,...,B7, X0,...,X7,
    Bits in Address 12 12 15 16 24 18
    Opcode Size (in bits) 6 3 4 4 or 8 8 6