On the Formatting of Pascal Programs




James L. Peterson




Department of Computer Sciences
The University of Texas at Austin
Austin, Texas 78712




December 1977




This paper has been published. It should be cited as

James L. Peterson, ``On the Formatting of Pascal Programs'', SIGPLAN Notices, Volume 12, Number 12, (December 1977), pages 83-86.


Part of the current discussion of programming techniques concerns programming style. The early assumption that programs would have very short lifetimes and be read only by compilers has proved incorrect. Programs can have very long lifetimes, and so they must be read by many programmers as maintenance, changes, and improvements are made over years of changing environments, uses, and specifications.

One aspect of programming style which affects the usefulness of programs is their readability. A program is readable if a programmer can pick it up, read it, and understand it. Many aspects of style affect readability, including variable names, commenting, modularity, and formatting. It is this last aspect of readability that we discuss here.

Formatting, and all other aspects of readability, are strictly matters of personal taste. There is (as yet, at least) no provably best format for writing programs. In this paper, we simply present a suggested formatting approach, with informal arguments for the adoption of these formatting rules. Notice that most programmers do have at least implicit formatting conventions. It is our purpose to make these implicit rules explicit.

Pascal is used as the example for describing the formatting rules, although similar formatting rules could be defined for other block-structured languages such as Algol or Simula.

We begin with the simple statements and then proceed to compound statements.

The simple statements in Pascal are either assignment statements or procedure calls. In either case, these are simply written, one per line, in a linear manner.

OUTPOINTER := OUTPOINTER + 1;
OUTCARD[OUTPOINTER] := C;
GETNEXTCHAR(C);
UPPER(C);

The compound statement is formed by a BEGIN-END pair and a sequence of statements, S1, S2, ..., Sn. The important semantic information which should be conveyed by the formatting is the inclusion of a statement Si in the compound statement. To accomplish this, the component statements are indented between the enclosing BEGIN-END delimiters. The indentation should be at least five characters, to account for the length of the BEGIN keyword.

BEGIN
     S1;
     S2;
     ...
     Sn;
END;

This format allows a programmer to glance to the left of a statement and then straight up and down to the first non-blank in order to find the enclosing BEGIN-END delimiters.

BEGIN
     J := J + 1;
     OUTCARD[J] := C;
END;

BEGIN
      INFO := I;
      LEFT := L;
      RIGHT := R;
END;

The remaining non-declarative statements in Pascal are control structures. The important idea here is that each control structure has a controlling expression. The component statements are subservient to the controlling expression and this is indicated by indenting the component statements under the controlling statement.

IF <expression>
   THEN <statement>
   ELSE <statement>

WHILE <expression>
   DO <statement>

REPEAT
      <statement>
UNTIL <expression>

WITH <record list>
  DO <statement>

FOR <variable> := <expression> TO <expression>
 DO <statement>

Again, this format shows explicitly, by indentation, that the <statement> is controlled. The controlling statement is found at the first non-blank to the left and above.

IF A < B
   THEN MIN := A
   ELSE MIN := B;

WHILE CH = EOL
   DO GETNEXT(CH);

REPEAT
       GETLINE;
       PROCESSLINE;
       OUTPUTLINE;
UNTIL EOF(INPUT);

WITH TREE[TOP]
  DO N := N - 1;

FOR I := 1 TO N
 DO POSITION[I] := 0;

Each of the component statements may, of course, be a compound statement. In this case, the compound statement is indented under the control structure and the components of the compound statement are indented within the BEGIN-END delimiters as would be expected from the above descriptions.

IF (N > NLIMIT) OR (M > MLIMIT)
   THEN MOREPOSITIONS := FALSE
   ELSE BEGIN
              POSITION[1] := M;
              FOR K := 2 TO N+1
               DO POSITION[K] := 0;
        END;

WHILE POSITION[K] = POSITION[K-1]
   DO BEGIN
            POSITION[K] := 0;
            K := K - 1;
      END;

WITH TREE[FREE]
  DO BEGIN
           INFO := I;
           LEFT := L;
           RIGHT := R;
     END;
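
The rules compose in the same way when control structures nest more than one level deep. The following is only a sketch of how that might look, not an excerpt from any existing program: the program name NESTING and the identifiers LIMIT, TOTAL, and OVERFLOWS are hypothetical, and the PROGRAM heading and declarations are included only so that the example compiles on its own.

```pascal
PROGRAM NESTING (OUTPUT);
CONST
       N = 5;
       LIMIT = 3;
VAR
       I:         INTEGER;
       TOTAL:     INTEGER;
       OVERFLOWS: INTEGER;
       POSITION:  ARRAY [1..N] OF INTEGER;
BEGIN
      TOTAL := 0;
      OVERFLOWS := 0;
      (* HYPOTHETICAL TEST DATA: POSITION[I] = I *)
      FOR I := 1 TO N
       DO POSITION[I] := I;
      (* AN IF WITH A COMPOUND BRANCH, NESTED INSIDE A FOR *)
      FOR I := 1 TO N
       DO IF POSITION[I] > LIMIT
             THEN BEGIN
                        POSITION[I] := 0;
                        OVERFLOWS := OVERFLOWS + 1;
                  END
             ELSE TOTAL := TOTAL + POSITION[I];
      WRITELN(TOTAL, OVERFLOWS);
END.
```

From any statement in the innermost BEGIN-END pair, a glance to the left and up passes first the BEGIN, then the THEN, then the IF, and finally the FOR, so each level of control remains visible from the indentation alone.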

Declarative statements and procedure declarations are simply listed. Declarations are listed one variable per line. The main body of a procedure is a compound statement, and follows the rules listed above.

CONST
       BLANK = ' ';
VAR
       CH:    CHAR;
       TREE:  RECORD
                     INFO:  0..127;
                     LEFT:  INDEX;
                     RIGHT: INDEX;
              END;

PROCEDURE GETNEXTCHAR (VAR CH: CHAR);
BEGIN
      REPEAT
             IF EOLN(INPUT)
                THEN READLN;
             READ(CH);
      UNTIL CH <> BLANK;
END;

These formatting rules, plus others such as triple spacing between sections (CONST, TYPE, VAR) and between procedure declarations, double spacing before and after comments, and so on, will result in a well-formatted program. Note, of course, that although proper formatting is necessary for good readability, it is by no means sufficient. Proper formatting is simply a matter of deciding upon and using a consistent set of formatting conventions, such as those presented here. The larger problems of identifier names, modularity, and proper commenting are perhaps more important for good readability.
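
The spacing conventions mentioned above might be applied as in the following sketch. It assumes that triple spacing means two blank lines and that double spacing means one blank line, which is one reading of those terms and not something the rules above spell out. The program name SPACING and all of its identifiers (MAXINDEX, INDEX, TABLE, TOTAL, FILLTABLE, SUMTABLE) are hypothetical and serve only to make the example complete.

```pascal
PROGRAM SPACING (OUTPUT);
CONST
       MAXINDEX = 10;


TYPE
       INDEX = 1..MAXINDEX;


VAR
       TOTAL: INTEGER;
       TABLE: ARRAY [INDEX] OF INTEGER;


PROCEDURE FILLTABLE;
VAR
       K: INDEX;
BEGIN
      FOR K := 1 TO MAXINDEX
       DO TABLE[K] := K;
END;


PROCEDURE SUMTABLE;
VAR
       K: INDEX;
BEGIN
      TOTAL := 0;

      (* ADD UP EVERY ELEMENT OF TABLE. *)

      FOR K := 1 TO MAXINDEX
       DO TOTAL := TOTAL + TABLE[K];
END;


BEGIN
      FILLTABLE;
      SUMTABLE;
      WRITELN(TOTAL);
END.
```

The two blank lines separate the CONST, TYPE, and VAR sections and the procedure declarations into visually distinct blocks, while the single blank lines set the comment apart from the statements around it.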