ECS 40 Fall 1997, Program Standards

Computer programs are meant for two audiences: the computer that compiles and runs them, and the people who must read, modify, or evaluate them. Most programming shops have "in-house standards" for programming style, since a standard style tends to make programs more readable. Engineering and Computer Science 40 is no exception; its rules are listed in this section. Part of your homework grade reflects how well you meet these standards.

Code Indentation and Formatting must follow the standard C style, which is illustrated in the text. It's not the prettiest style but it is quite standard. Please indent eight spaces (a tab) for each level of indentation.

Capitalization rules in C are also fairly standardized, although a bit unpleasant. Every part of your code must be in lowercase, except for names of macros, defined types, and enumerated constants, which are entirely UPPERCASE.
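As a small illustration of the two rules above, the fragment below uses lowercase for functions and variables, UPPERCASE for a macro, a defined type, and enumerated constants, and one tab per level of indentation. All of the names in it (PASSMARK, GRADE, FAIL, PASS, classify, count_passing) are hypothetical and are not taken from the text.

```c
#define PASSMARK	60		/* macro: UPPERCASE */

typedef int GRADE;			/* defined type: UPPERCASE */

enum result { FAIL, PASS };		/* enumerated constants: UPPERCASE */

/* decide whether a single grade passes */
enum result classify(GRADE g)
{
	if (g >= PASSMARK)
		return PASS;
	return FAIL;
}

/* count how many of the first n grades pass */
int count_passing(GRADE grades[], int n)
{
	int i;			/* i     = index into grades */
	int count = 0;		/* count = passing grades seen so far */

	for (i = 0; i < n; i++) {
		if (classify(grades[i]) == PASS)
			count++;
	}
	return count;
}
```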

Comments should be used to make programs absolutely clear. Use them liberally but not so much that the meaning gets submerged.

You should comment your program as you write it. It is actually much faster than going back later, and comments are very useful when debugging. (If what the comment says the code does is different from what it really does, you have found a bug.) Graders, teaching assistants, and instructors reserve the right to refuse to look at uncommented programs.

ECS 40 "house standards" call for four kinds of comments; we'll describe each in turn.

Start-of-file Comments

/*
 *
 * hashtest.c      Sam Bent          date unknown
 *
 * Test hash table routines. Loop endlessly, requesting keys from
 * stdin. If key doesn't exist, insert it in the table with value
 * n (where it's the n-th key added to table). If key does exist,
 * print its stored value of n.
 *
 * Modification History:
 *
 *     1 Jan 94   Matt Bishop; changed to conform to ECS 40 style
 *    29 Nov 90   Chip Elliott; changed to conform to CS 23 style
 *                and use ANSI C.
 *
 */
The comments at the beginning of a file, collectively called its "header," must contain your name in addition to other general information about the program and what it does. The date when the program was written and a history of major modifications are required.

Complex data structures used by a file should be illustrated with a figure, as shown below.

 * HASHTABLE       hashstruct
 *           +-------------------+     Allocated Block
 *    ht  -> |     ht_ID         |       of HASHSLOTs
 *           +-------------------+    +-----------------+
 *           |     ht_size       | +->|      ....       |
 *           +-------------------+ |  +-----------------+
 *           |     ht_cnt        | |  |      ....       |
 *           +-------------------+ |  +-----------------+
 *           |     ht_curslot    | |  |      ....       |
 *           +-------------------+ |  +-----------------+
 *           |     ht_slots      |-+  |     HASHSLOT    |----+
 *           +-------------------+    +-----------------+    |
 *                                           ....            |
 *                                           ....            |
 *           +-----------------------------------------------+
 *           |
 *           |        hashentry
 *           |   +----------------+    +-----------------+
 *           +-> |      h_key     |--> |    key's text   |
 *               +----------------+    +-----------------+
 *               |      h_val     |
 *               +----------------+

Start-of-function Comments

/*
 * ht_enter
 *
 * Enter a key into a hashtable and return its HASHSLOT.
 * If necessary, expand the table to hold the new key.
 * If the key is already there, just return its HASHSLOT.
 *
 * Entered by:  ht   -- hash table
 *             *key  -- key to enter
 *
 * Exits with:  key in hashtable
 *                 if newly created, its val.p field is
 *                 set to NULL
 *              ht possibly expanded
 *              returns key's HASHSLOT
 *
 * Exceptions:  "parameter is not a hash table"  (fatal)
 *              "not enough memory to enter item into 
 *                                  hash table"  (fatal)
 *
 */
HASHSLOT ht_enter(HASHTABLE ht, char *key)
{
Each function must have its own header. The reader should be able to determine what a function does and how to use it by reading only this header information. Things like global variables referenced or modified, assumptions about the input, or anything else that the reader should know before "lifting" a function and using it in another program should go here. So should the algorithm used, if it's not absolutely clear.

One thing that you may not be familiar with is input and output assertions. "Input assertion" is a fancy way of saying "I assume this about the input." For example, for a square root routine sqrt(x), the usual input assertion is "x >= 0." Output assertions are similar -- they describe what is true after the function has been called. Writing these assertions is a good habit to get into. It helps you specify exactly what the routine does, and this is precisely the information that someone needs in order to lift your routine and use it elsewhere.
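One way to make such assertions checkable is to state them in the function header and also test them in the code. The sketch below assumes the standard assert macro from <assert.h> is acceptable for this purpose. The routine isqrt is hypothetical and is not part of the hash table code.

```c
#include <assert.h>

/*
 * isqrt
 *
 * Return the integer square root of x.
 *
 * Entered by:  x -- number whose root is wanted; x >= 0
 *
 * Exits with:  returns r such that r*r <= x < (r+1)*(r+1)
 *
 * Exceptions:  a negative x violates the input assertion
 *              and aborts the program
 */
int isqrt(int x)
{
	int r = 0;		/* r = candidate root */

	/* input assertion */
	assert(x >= 0);

	/*
	 * step r up while (r+1) squared is still no bigger than x;
	 * dividing instead of multiplying avoids integer overflow
	 */
	while (r + 1 <= x / (r + 1))
		r++;

	/* output assertion: r*r <= x < (r+1)*(r+1) */
	assert(r * r <= x && x / (r + 1) < r + 1);
	return r;
}
```

For example, isqrt(17) returns 4. Compiling with NDEBUG defined removes the checks, so the header comment must still state the assertions in words.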

You should also include a list of Exceptions, i.e., unusual/error cases and what the function does with them.

If you borrow code from another source (a fine idea, by the way) you should cite the author, book, and -- if possible -- page numbers. Be specific enough so another reader can find the reference.

Paragraph Comments

	/*
	 * update ht's HASHSTRUCT contents
	 */
	/* first, copy size and count */
	ht->ht_size  = newht->ht_size;
	ht->ht_cnt   = newht->ht_cnt;
	/* release old hash slots, start using new ones */
	free(ht->ht_slots);
	ht->ht_slots = newht->ht_slots;
	/* free newht's HASHTABLE */
	free(newht);
When you write pseudo-code or describe how a subroutine accomplishes its task, you usually break it down into a series of steps. (To insert an item into an array you find where it belongs, move everything after it down one place, and finally copy the item into the correct position.) These steps sometimes become subroutines with descriptive names, but often they become half a dozen lines of code. In this case, it helps to write a comment that explains the purpose of the next section of code. The first comment in the example above ("update ht's HASHSTRUCT contents") is this kind of comment.
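The array insertion described in parentheses above can be sketched as follows, with one paragraph comment per step. This is only an illustration: the function insert_sorted and its parameters are hypothetical, and it assumes the array is already sorted in ascending order and has room for one more item.

```c
/*
 * insert_sorted
 *
 * Insert item into the sorted array list, keeping it sorted.
 *
 * Entered by:  list -- array sorted in ascending order
 *              n    -- number of items now in list; n >= 0
 *              item -- value to insert
 *
 * Exits with:  list holds n+1 items, still sorted
 *              returns the position where item was placed
 *
 * Exceptions:  none; the caller must ensure list has room
 *              for n+1 items
 */
int insert_sorted(int list[], int n, int item)
{
	int pos;		/* pos = where item belongs */
	int i;			/* i   = index of slot being filled */

	/*
	 * find where the item belongs
	 */
	pos = 0;
	while (pos < n && list[pos] <= item)
		pos++;

	/*
	 * move everything after it down one place
	 */
	for (i = n; i > pos; i--)
		list[i] = list[i - 1];

	/*
	 * copy the item into the correct position
	 */
	list[pos] = item;
	return pos;
}
```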

Sometimes a paragraph comment is completely boxed, like this:

	/***********************************
	 * update ht's HASHSTRUCT contents *
	 ***********************************/
These are also called boxed comments. You can use this style if you prefer.

Line Comments

main()
{
	HASHTABLE ht;		/* ht  = hashtable */
	HASHSLOT *hp;		/* hp  = ptr to hash table entry */
	int n = 0;		/* n   = key number (put into ht) */
	char key[100];		/* key = entry's key */


	/*
	 * create a new hash table
	 */
	ht = ht_create(0);

	/*
	 * loop, reading input and acting accordingly
	 */
	for(;;){
	    /* prompt and get input */
	    printf("Key: ");
	    scanf("%99s", key);

	    /*
	     * if "dump", show what's in the hash table
	     */
	    if (strcmp(key, "dump") == 0){
	        /* bingo -- dump the hash table */
	        ht_dump(ht);
	        continue;
	    }

	    /*
	     * enter the key into the hash table
	     * announcing if it is already there
	     */
	    hp = ht_enter(ht, key);
	    if (hp->h_data == NULL){
	        /* key is new -- say so, and store it in entry */
	        printf("%s is a new key, number %d. ", key, ++n);
	        hp->h_data = (caddr_t) n;
	    }
	    else{
	        /* key already exists -- report its stored number */
	        printf("%s is key %d. ", key, (int) hp->h_data);
	    }
	}
}
In declarations. Explain the use of constants, types, and variables. Part of this is choosing descriptive names. Avoid single-letter names (except for i, j, k, and their ilk as loop variables when nothing more descriptive is useful). Most of the time you should supplement this with a more descriptive comment when the constant, type, or variable is declared. Sum is a fine variable name for a partial sum, but its declaration should state whether it is summing the grades for an exam, the cost of books for a given term, or whatever. Put these comments at the end of each declaration line, and align the beginning of the ordinary comments in a set of declarations, as shown. If the comment takes more than one line, align the second line too.
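The point about Sum might look like this in practice. The function exam_average and its parameters are hypothetical; the sketch only shows declaration comments that say what is being summed, with the second line of a long comment aligned under the first.

```c
/*
 * exam_average
 *
 * Return the average of n exam grades.
 *
 * Entered by:  grades -- the exam grades
 *              n      -- number of grades; n > 0
 *
 * Exits with:  returns the arithmetic mean of the grades
 */
double exam_average(int grades[], int n)
{
	int i;			/* i   = index of current grade */
	int sum = 0;		/* sum = total of the exam grades
				 *       seen so far */

	for (i = 0; i < n; i++)
		sum += grades[i];
	return (double) sum / n;
}
```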

In code. Comments should also be placed above a line of code if they will clarify what the code does. These comments should explain how that particular line of code fits into the general scheme of things. The comment:

/* Assign item to ith position of list */
list[i] = item;
is worthless. It simply repeats what the code says. Assume that the reader can read the code, and explain the purpose of the statement. The comment:
/* Put item to be inserted into empty spot */
list[i] = item;
is much better. These comments are especially useful if the code is somewhat tricky or non-intuitive, or if an innocent-looking statement has consequences not obvious from the statement itself.
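As a sketch of an innocent-looking statement with a less obvious consequence, consider the hypothetical routine chop below. The assignment of '\0' looks like a change to one character, but it actually shortens the string, and the comment says so.

```c
#include <string.h>

/*
 * chop
 *
 * Remove a trailing newline from line, if there is one.
 *
 * Entered by:  line -- NUL-terminated string, modifiable
 *
 * Exits with:  line has no trailing newline
 *              returns the new length of line
 */
size_t chop(char *line)
{
	size_t len = strlen(line);	/* len = current length of line */

	if (len > 0 && line[len - 1] == '\n') {
		/* end the string one character sooner, dropping the newline */
		line[len - 1] = '\0';
		len--;
	}
	return len;
}
```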

Align all these ordinary comments with the beginning of the line they explain. Otherwise they are messy and distracting.

Structured Programming

All the comments in the world will not make a 10-page monolithic block of spaghetti code intelligible. You should already know a little about breaking your programs into functions and subroutines. We will be discussing methods for decomposing problems in this class, and suggest that you use them. A common question is "how long should a function be?" Most functions end up being short. If a function is longer than a page, it is usually (but not always) better broken into separate functions.

We would also like to comment on the infamous goto statement. The uncontrolled use of goto can make programs totally unintelligible, and many of you have probably been convinced that a goto statement is programming's worst sin.

We consider goto statements justifiable in one case: "error bailout." In some cases, functions call other functions (often recursively) and it is possible to get 20 or 30 nested calls. If at this point an error is discovered, that error must be passed back up through the 20 or 30 calls. Each function must deal with the question "what should I do if this function I am about to call finds an error?" This is certainly doable and can sometimes lead to better error recovery, but functions must pass and test error flags everywhere. A goto back to the top level, if carefully used and properly commented, can greatly simplify life. We will discuss both approaches later in class.
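In C, a goto cannot leave the function that contains it, so a jump back to the top level across nested calls is normally written with setjmp and longjmp from <setjmp.h>. The sketch below assumes that this is the mechanism meant here. The names bailout, descend, and run are hypothetical.

```c
#include <setjmp.h>

static jmp_buf bailout;		/* bailout = where to resume after an error */

/*
 * descend
 *
 * Recurse depth levels deep; at the bottom, report an error
 * if fail is nonzero. No level passes or tests an error flag.
 */
static void descend(int depth, int fail)
{
	if (depth > 0) {
		descend(depth - 1, fail);
		return;
	}
	if (fail) {
		/* error bailout: jump straight back to run() */
		longjmp(bailout, 1);
	}
}

/*
 * run
 *
 * Top level. Returns 0 on success, -1 if an error was
 * reported somewhere in the nested calls.
 */
int run(int depth, int fail)
{
	/*
	 * mark the top level; setjmp returns 0 now and nonzero
	 * when longjmp brings control back here
	 */
	if (setjmp(bailout) != 0)
		return -1;

	descend(depth, fail);
	return 0;
}
```

Here run(30, 1) returns -1 without any of the 30 intermediate calls testing a flag. The price is that those calls never get a chance to clean up: anything they allocated is not freed by the jump, which is one reason the bailout must be carefully used and properly commented.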


Send email to cs153@csif.cs.ucdavis.edu.

Department of Computer Science
University of California at Davis
Davis, CA 95616-8562



Page last modified on 2/17/98