Consolidated lecture notes for CS223 as taught in the Spring 2005 semester at Yale by Jim Aspnes.

Contents

  1. C/Variables
  2. Machine memory
  3. Variables
    1. Variable declarations
    2. Variable names
  4. Using variables
  5. C/IntegerTypes
  6. Integer types
    1. Integer constants
    2. Integer operations
    3. Alignment
  7. C/Statements
  8. Simple statements
  9. Compound statements
    1. Conditionals
    2. Loops
      1. The while loop
      2. The do..while loop
      3. The for loop
      4. Loops with break, continue, and goto
  10. C/AssemblyLanguage
  11. C/Functions
  12. Function definitions
  13. Calling a function
  14. The return statement
  15. Function declarations and modules
  16. Static functions
  17. Local variables
  18. Mechanics of function calls
  19. C/InputOutput
  20. Character streams
  21. Reading and writing single characters
  22. Formatted I/O
  23. Rolling your own I/O routines
  24. File I/O
  25. C/Pointers
  26. Memory and addresses
  27. Pointer variables
    1. Declaring a pointer variable
    2. Assigning to pointer variables
    3. Using a pointer
  28. The null pointer
  29. Pointers and functions
  30. Pointer arithmetic and arrays
    1. Arrays and functions
  31. Void pointers
  32. Run-time storage allocation
  33. C/DynamicStorageAllocation
  34. Dynamic storage allocation in C
  35. Overview
  36. Allocating space with malloc
    1. Testing the return value of malloc
  37. Freeing space with free
    1. Storage allocation discipline
    2. Perils of buffer overflows
    3. The lingering curse of referencing freed blocks
  38. Dynamic arrays using realloc
  39. Example of building a data structure with malloc
      1. malloc2d.c
      2. Makefile
      3. Output of make test
  40. C/Strings
  41. String processing in general
  42. C strings
  43. String constants
  44. String buffers
  45. Operations on strings
  46. Finding the length of a string
    1. The strlen tarpit
  47. Comparing strings
  48. Formatted output to strings
  49. Dynamic allocation of strings
  50. argc and argv
  51. C/Structs
  52. Structs
  53. Unions
  54. Bit fields
  55. AbstractDataTypes
  56. Abstraction
  57. Example of an abstract data type
    1. Interface
      1. sequence.h
    2. Implementation
      1. sequence.c
    3. Compiling and linking
      1. main.c
      2. Makefile
  58. Designing abstract data types
    1. Parnas's Principle
    2. When to build an abstract data type
  59. C/Definitions
  60. Naming types
  61. Naming constants
  62. Naming values in sequences
  63. Other uses of #define
  64. AsymptoticNotation
  65. Definitions
  66. Motivating the definitions
  67. Proving asymptotic bounds
  68. Asymptotic notation hints
    1. Remember the difference between big-O, big-Ω, and big-Θ
    2. Simplify your asymptotic terms as much as possible
    3. Remember the limit trick
  69. Variations in notation
    1. Absolute values
    2. Abusing the equals sign
  70. More information
  71. LinkedLists
  72. Stacks and linked lists
    1. Implementation
    2. A more complicated implementation
    3. Building a stack out of an array
  73. Looping over a linked list
  74. Looping over a linked list backwards
  75. Queues
  76. Deques and doubly-linked lists
  77. Circular linked lists
  78. What linked lists are and are not good for
  79. Further reading
  80. C/Debugging
  81. Assertions
  82. gdb
    1. My favorite gdb commands
    2. Debugging strategies
  83. valgrind
    1. Compilation flags
    2. Valgrind versions
    3. Automated testing
  84. Not recommended: debugging output
  85. TestingDuringDevelopment
  86. Setting up the test harness
    1. Module interface
      1. stack.h
    2. Test code
      1. test-stack.c
    3. Makefile
      1. Makefile
  87. Stub implementation
      1. stack.c
  88. Bounded-space implementation
      1. stack.c
  89. First fix
  90. Final version
      1. stack.c
  91. Moral
  92. Appendix: Test macros
      1. tester.h
      2. tester.c
  93. LinkedLists
  94. Stacks and linked lists
    1. Implementation
    2. A more complicated implementation
    3. Building a stack out of an array
  95. Looping over a linked list
  96. Looping over a linked list backwards
  97. Queues
  98. Deques and doubly-linked lists
  99. Circular linked lists
  100. What linked lists are and are not good for
  101. Further reading
  102. HashTables
  103. Basics of hashing
  104. Resolving collisions
    1. Chaining
    2. Open addressing
  105. Choosing a hash function
    1. Division method
    2. Multiplication method
    3. Universal hashing
  106. Maintaining a constant load factor
  107. Examples
    1. A low-overhead hash table using open addressing
    2. A string to string dictionary using chaining
  108. C/FunctionPointers
  109. Basics
  110. Function pointer declarations
  111. Applications
    1. Callbacks
    2. Dispatch tables
    3. Iterators
  112. Closures
  113. Objects
  114. C/PerformanceTuning
  115. Timing under Linux
  116. Profiling with gprof
  117. AlgorithmDesignTechniques
  118. Basic principles of algorithm design
  119. Classifying algorithms by structure
  120. Example: Finding the maximum
  121. Example: Sorting
  122. DynamicProgramming
    1. A more complicated example
    2. Memoization
    3. Preserving alternatives
    4. Knapsack
    5. Non-linear structures
      1. Trees
      2. Treewidth
        1. Examples
        2. Dynamic programming
    6. All-pairs shortest paths
    7. Longest common subsequence
  123. BinaryTrees
  124. Tree basics
  125. Binary tree implementations
  126. The canonical binary tree algorithm
  127. Nodes vs leaves
  128. Special classes of binary trees
  129. Heaps
  130. Priority queues
  131. Expensive implementations of priority queues
  132. Heaps
  133. Packed heaps
  134. Bottom-up heapification
  135. More information
  136. BinarySearchTrees
  137. Searching for a node
  138. Inserting a new node
  139. Costs
  140. BalancedTrees
  141. The basics: tree rotations
  142. AVL trees
  143. 2-3 trees
  144. Red-black trees
  145. B-trees
  146. Splay trees
  147. Skip lists
  148. Implementations
  149. RadixSearch
  150. Tries
    1. Searching a trie
    2. Inserting a new element into a trie
    3. Implementation
  151. Patricia trees
  152. Ternary search trees
  153. More information
  154. C/BitExtraction
  155. C/Graphs
  156. Graphs
  157. Why graphs are useful
  158. Operations on graphs
  159. Representations of graphs
    1. Adjacency matrices
    2. Adjacency lists
  160. An implementation
  161. Searching for paths in a graph
    1. Depth-first and breadth-first search
    2. Other variations on the basic algorithm
  162. DepthFirstSearch
  163. Depth-first search in a tree
  164. Depth-first search in a directed graph
  165. DFS forests
  166. Running time
  167. A more complicated application: strongly-connected components
  168. BreadthFirstSearch
  169. ShortestPath
  170. Single-source shortest paths
    1. Relaxation
    2. Dijkstra's algorithm
    3. Bellman-Ford
  171. All-pairs shortest paths
  172. Implementations
  173. C/FloatingPoint
  174. Floating point basics
  175. The IEEE-754 floating-point standard
  176. Error
  177. Reading and writing floating-point numbers
  178. Non-finite numbers in C
  179. C/Persistence
  180. Persistent data
  181. A simple solution using text files
  182. Using a binary file
  183. A version that updates the file in place
  184. An even better version using mmap
  185. Concurrency and fault-tolerance issues: ACIDitiy
  186. C/WhatNext
  187. What's wrong with C
  188. What C++ fixes
  189. Other C-like languages to consider
  190. Scripting languages

1. C/Variables

2. Machine memory

Basic model: machine memory consists of many bytes of storage, each of which has an address which is itself a sequence of bits. Though the actual memory architecture of a modern computer is complex, from the point of view of a C program we can think of as simply a large address space that the CPU can store things in (and load things from), provided it can supply an address to the memory. Because we don't want to have to type long strings of bits all the time, the C compiler lets us give names to particular regions of the address space, and will even find free space for us to use.

3. Variables

A variable is a name given in a program for some region of memory. Each variable has a type, which tells the compiler how big the region of memory corresponding to it is and how to treat the bits stored in that region when performing various kinds of operations (e.g. integer variables are added together by very different circuitry than floating-point variables, even though both represent numbers as bits). In modern programming languages, a variable also has a scope (a limit on where the name is meaningful, which allows the same name to be used for different variables in different parts of the program) and an extent (the duration of the variable's existence, controlling when the program allocates and deallocates space for it).

3.1. Variable declarations

Before you can use a variable in C, you must declare it. Variable declarations show up in three places:

  • Outside a function. These declarations declare global variables that are visible throughout the program (i.e. they have global scope). Use of global variables is almost always a mistake.

  • In the argument list in the header of a function. These variables are parameters to the function. They are only visible inside the function body (local scope), exist only from when the function is called to when the function returns (bounded extent---note that this is different from what happens in some garbage-collected languages like Scheme), and get their initial values from the arguments to the function when it is called.

  • At the start of any block delimited by curly braces. Such variables are visible only within the block (local scope again) and exist only when the containing function is active (bounded extent). The convention in C is generally to declare all such local variables at the top of a function; this is different from the convention in C++ or Java, which encourage variables to be declared when they are first used.

Variable declarations consist of a type name followed by one or more variable names separated by commas and terminated by a semicolon (except in argument lists, where each declaration is terminated by a comma). I personally find it easiest to declare variables one per line, to simplify documenting them. It is also possible for global and local variables (but not function arguments) to assign an initial value to a variable by putting in something like = 0 after the variable name. It is good practice to put a comment after each variable declaration that explains what the variable does (with a possible exception for conventionally-named loop variables like i or j in short functions). Below is an example of a program with some variable declarations in it:

   1 #include <stdio.h>
   2 #include <ctype.h>
   3 
   4 /* This program counts the number of digits in its input. */
   5 
   6 /*
   7  *This global variable is not used; it is here only to demonstrate
   8  * what a global variable declaration looks like.
   9  */
  10 unsigned long SpuriousGlobalVariable = 127;
  11 
  12 int
  13 main(int argc, char **argv)
  14 {
  15     int c;              /* character read */
  16     int count = 0;      /* number of digits found */
  17 
  18     while((c = getchar()) != EOF) {
  19         if(isdigit(c)) {
  20             count++;
  21         }
  22     }
  23 
  24     printf("%d\n", count);
  25 
  26     return 0;
  27 }

3.2. Variable names

The evolution of variable names:

11101001001001
Physical addresses represented as bits.
#FC27
Typical assembly language address represented in hexadecimal to save typing (and because it's easier for humans to distinguish #A7 from #B6 than to distinguish 10100111 from 10110110.)
A1$
A string variable in BASIC, back in the old days where BASIC variables were one uppercase letter, optionally followed by a number, optionally followed by $ for a string variable and % for an integer variable.
IFNXG7
A typical FORTRAN variable name, back in the days of 6-character all-caps variable names. The I at the start means it's an integer variable. The rest of the letters probably abbreviate some much longer description of what the variable means.
i, j, c, count, top_of_stack, accumulatedTimeInFlight
Typical names from

modern C programs. Note that there are two different conventions for representing multi-word names: the first is to replace spaces with underscores, and the second is to capitalize the first letter of each word (possibly excluding the first letter), a style called "camel case" (C2find:CamelCase). You should pick one of these two conventions and stick to it.

prgcGradeDatabase

An example of Hungarian notation, a style of variable naming in which the type of the variable is encoded in the first few character. See http://web.umr.edu/~cpp/common/hungarian.html or C2find:HungarianNotation. Not clearly an improvement on standard naming conventions, but it is popular in some programming shops.

In C, variable names are called identifiers.1 An identifier in C must start with a lower or uppercase letter or the underscore character _. Typically variables starting with underscores are used internally by system libraries, so it's dangerous to name your own variables this way. Subsequent characters in an identifier can be letters, digits, or underscores. So for example a, ____a___a_a_11727_a, AlbertEinstein, aAaAaAaAaAAAAAa, and ______ are all legal identifiers in C, but $foo and 01 are not.

The basic principle of variable naming is that a variable name is a substitute for the programmer's memory. It is generally best to give identifiers names that are easy to read and describe what the variable is used for, i.e., that are self-documenting. None of the variable names in the preceding list are any good by this standard. Better names would be total_input_characters, dialedWrongNumber, or stepsRemaining. Non-descriptive single-character names are acceptable for certain conventional uses, such as the use of i and j for loop iteration variables, or c for an input character. Such names should only be used when the scope of the variable is small, so that it's easy to see all the places where it is used at once.

C identifiers are case-sensitive, so aardvark, AArDvARK, and AARDVARK are all different variables. Because it is hard to remember how you capitalized something before, it is important to pick a standard convention and stick to it. The traditional convention in C goes like this:

  • Ordinary variables and functions are lowercased or camel-cased, e.g. count, countOfInputBits.

  • User-defined types (and in some conventions global variables) are capitalized, e.g. Stack, TotalBytesAllocated.

  • Constants created with #define or enum are put in all-caps: MAXIMUM_STACK_SIZE, BUFFER_LIMIT.

4. Using variables

Ignoring pointers (C/Pointers) for the moment, there are essentially two things you can do to a variable: you can assign a value to it using the = operator, as in:

   1     x = 2;      /* assign 2 to x */
   2     y = 3;      /* assign 3 to y */

or you can use its value in an expression:

   1     x = y+1;    /* assign y+1 to x */

The assignment operator is an ordinary operator, and assignment expressions can be used in larger expressions:

    x = (y=2)*3; /* sets y to 2 and x to 6 */

This feature is usually only used in certain standard idioms, since it's confusing otherwise.

There are also shorthand operators for expressions of the form variable = variable operator expression. For example, writing x += y is equivalent to writing x = x + y, x /= y is the same as x = x / y, etc.

For the special case of adding or subtracting 1, you can abbreviate still further with the ++ and -- operators. These come in two versions, depending on whether you want the result of the expression (if used in a larger expression) to be the value of the variable before or after the variable is incremented:

   1     x = 0;
   2     y = x++;    /* sets x to 1 and y to 0 (the old value) */
   3     y = ++x;    /* sets x to 2 and y to 2 (the new value) */

The intuition is that if the ++ comes before the variable, the increment happens before the value of the variable is read (a preincrement; if it comes after, it happens after the value is read (a postincrement). This is confusing enough that it is best not to use the value of preincrement or postincrement operations except in certain standard idioms. But using x++ by itself as a substitute for x = x+1 is perfectly acceptable style.


CategoryProgrammingNotes

5. C/IntegerTypes

6. Integer types

In order to declare a variable, you have to specify a type, which controls both how much space the variable takes up and how the bits stored within it are interpreted in arithmetic operations.

The standard C integer types are:

Name

Typical size

Signed?

char

8 bits

Unspecified

short

16 bits

Yes

int

32 bits

Yes

long

32 bits

Yes

The typical size is for 32-bit architectures like the Intel i386. Some 64-bit machines will have 64-bit ints and longs, and some prehistoric computers had 16-bit ints. Particularly bizarre architectures might have even wilder bit sizes, but you are not likely to see this unless you program vintage 1970s supercomputers. Some compilers also support a long long type that is usually twice the length of a long (e.g. 64 bits on i386 machines); this may or may not be available if you insist on following the ANSI specification strictly. The general convention is that int is the most convenient size for whatever computer you are using and should be used by default.

Whether a variable is signed or not controls how its values are interpreted. In signed integers, the first bit is the sign bit and the rest are the value in 2's complement notation; so for example a signed char with bit pattern 11111111 would be interpreted as the numerical value -1 while an unsigned char with the same bit pattern would be 255. Most integer types are signed unless otherwise specified; an n-bit integer type has a range from -2n-1 to 2n-1-1 (e.g. -32768 to 32767 for a short.) Unsigned variables, which can be declared by putting the keyword unsigned before the type, have a range from 0 to 2n-1 (e.g. 0 to 65535 for an unsigned short).

For chars, whether the character is signed (-128..127) or unsigned (0..255) is at the whim of the compiler. If it matters, declare your variables as signed char or unsigned char. For storing actual characters that you aren't doing arithmetic on, it shouldn't matter.

6.1. Integer constants

Constant integer values in C can be written in any of four different ways:

  • In the usual decimal notation, e.g. 0, 1, -127, 9919291, 97.

  • In octal or base 8, when the leading digit is 0, e.g. 01 for 1, 010 for 8, 0777 for 511, 061 for 97.

  • In hexadecimal or base 16, when prefixed with 0x. The letters a through f are used for the digits 10 through 15. For example, 0x61 is another way to write 97.

  • Using a character constant, which is a single ASCII character or an escape sequence inside single quotes. The value is the ASCII value of the character: 'a' is 97.2 Unlike languages with separate character types, C characters are identical to integers; you can (but shouldn't) calculate 972 by writing 'a'*'a'. You can also store a character anywhere.

Except for character constants, you can insist that an integer constant is unsigned or long by putting a u or l after it. So 1ul is an `unsigned long version of 1.  By default integer constants are (signed) int`s.

6.2. Integer operations

The usual + (addition), - (negation or subtraction), and * (multiplication) operators work on integers pretty much the way you'd expect. The only caveat is that if the result lies outside of the range of whatever variable you are storing it in, it will be truncated instead of causing an error:

   1     unsigned char c;
   2 
   3     c = -1;             /* sets c = 255 */
   4     c = 255 + 255;      /* sets c = 254 */
   5     c = 256 * 1772717;  /* sets c = 0 */

This can be a source of subtle bugs if you aren't careful. The usual giveaway is that values you thought should be large positive integers come back as random-looking negative integers.

Division (/) of two integers also truncates: 2/3 is 0, 5/3 is 1, etc. For positive integers it will always round down. If either the numerator or denominator is negative, the behavior is unpredictable and depends on what your processor does---in practice this means you should never use / if one or both arguments might be negative. There is also a remainder operator % with e.g. 2%3 = 2, 5%3 = 2, 27 % 2 = 1, etc. It is also unpredictable when fed negative values, although y == x*(y/x) + y%x should always be true.

In addition to the arithmetic operators, integer types support bitwise logical operators that apply some Boolean operation to all the bits of their arguments in parallel. What this means is that the i-th bit of the output is equal to some operation applied to the i-th bit(s) of the input(s). The bitwise logical operators are ~ (bitwise negation: used with one argument as in ~0 for the all-1's binary value), & (bitwise AND), '|' (bitwise OR), and '^' (bitwise XOR, i.e. sum mod 2). These are mostly used for manipulating individual bits or small groups of bits inside larger words, as in the expression x & 0x0f, which strips off the bottom four bits stored in x. Two additional logical operators are the shift operators << and >>; see KernighanRitchie for more about these.

To add to the confusion, there are also three logical operators that work on the truth-values of integers, where 0 is defined to be false and anything else is defined by be true. These are && (logical AND), ||, (logical OR), and ! (logical NOT). The result of any of these operations is always 0 or 1 (so !!x, for example, is 0 if x is 0 and 1 if x is anything else). The && and || operators evaluate their arguments left-to-right and ignore the second argument if the first determines the answer (this is the only place in C where argument evaluation order is specified); so

   1     0 && execute_programmer();
   2     1 || execute_programmer();

is in a very weak sense perfectly safe code to run.

Yet another logical operator is the ternary operator ?:, where x ? y : z equals the value of y if x is nonzero and z if x is zero. Like && and ||, it only evaluates the arguments it needs to:

   1     fileExists(badFile) ? deleteFile(badFile) : createFile(badFile);

Most uses of ?: are better done using an if-then-else statement (C/ControlStructures).

Logical operators usually operate on the results of relational operators or comparisons: these are '==' (equality), '!=' (inequality), '<=' (less than or equal to) and '>=' (greater than or equal to). So, for example,

    if(size >= MIN_SIZE && size <= MAX_SIZE) {
        puts("just right");
    }

tests if size is in the (inclusive) range [MIN_SIZE..MAX_SIZE].

Beware of confusing == with =. The code

   1     /* DANGER! DANGER! DANGER! */
   2     if(x = 5) {
   3         ...

is perfectly legal C, and will set x to 5 rather than testing if it's equal to 5. Because 5 happens to be nonzero, the body of the if statement will always be executed.

6.3. Alignment

Modern CPU architectures typically enforce alignment restrictions on multi-byte values, which mean that the address of an int or long typically has to be a multiple of 4. This is an effect of the memory being organized as groups of 32 bits that are written in parallel instead of 8 bits. Such restrictions are not obvious when working with integer-valued variables directly, but will come up when we talk about pointers in C/Pointers.


CategoryProgrammingNotes

7. C/Statements

The bodies of C functions (including the main function) are made up of statements. These can either be simple statements that do not contain other statements, or compound statements that have other statements inside them. Control structures are compound statements like if/then/else, while, for, and do..while that control how or whether their component statements are executed.

8. Simple statements

The simplest kind of statement in C is an expression (followed by a semicolon, the terminator for all simple statements). Its value is computed and discarded. Examples:

   1     x = 2;              /* an assignment statement */
   2     x = 2+3;            /* another assignment statement */
   3     2+3;                /* has no effect---will be discarded by smart compilers */
   4     puts("hi");         /* a statement containing a function call */
   5     root2 = sqrt(2);    /* an assignment statement with a function call */

Most statements in a typical C program are simple statements of this form.

Other examples of simple statements are the jump statements return, break, continue, and goto. A return statement specifies the return value for a function (if there is one), and when executed it causes the function to exit immediately. The break and continue statements jump immediately to the end of a loop (or switch; see below) or the next iteration of a loop; we'll talk about these more when we talk about loops. The goto statement jumps to another location in the same function, and exists for the rare occasions when it is needed. Using it in most circumstances is a sin.

9. Compound statements

Compound statements come in two varieties: conditionals and loops.

9.1. Conditionals

These are compound statements that test some condition and execute one or another block depending on the outcome of the condition. The simplest is the if statement:

   1     if(houseIsOnFire) {
   2         /* ouch! */
   3         scream();
   4         runAway();
   5     }

The body of the if statement is executed only if the expression in parentheses at the top evaluates to true (which in C means any value that is not 0).

The braces are not strictly required, and are used only to group one or more statements into a single statement. If there is only one statement in the body, the braces can be omitted:

   1     if(programmerIsLazy) omitBraces();

This style is recommended only for very simple bodies. Omitting the braces makes it harder to add more statements later without errors.

   1     if(underAttack)
   2         launchCounterAttack();   /* executed only when attacked */
   3         hideInBunker();          /* executed always */

In the example above, the lack of braces means that the hideInBunker() statement is not part of the if statement, despite the misleading indentation. This sort of thing is why I generally always put in braces in an if.

An if statement may have an else clause, whose body is executed if the test is false (i.e. equal to 0).

   1     if(happy) {
   2         smile();
   3     } else {
   4         frown();
   5     }

A common idiom is to have a chain of if and else if branches that test several conditions:

   1     if(temperature < 0) {
   2         puts("brrr");
   3     } else if(temperature < 100) {
   4         puts("hooray");
   5     } else {
   6         puts("ouch!");
   7     }

This can be inefficient if there are a lot of cases, since the tests are applied sequentially. For tests of the form <expression> == <small constant>, the switch statement may provide a faster alternative. Here's a typical switch statement:

   1     /* print plural of cow, maybe using the obsolete dual number construction */
   2     switch(numberOfCows) {
   3     case 1:
   4         puts("cow");
   5         break;
   6     case 2:
   7         puts("cowen");
   8         break;
   9     default:
  10         puts("cows");
  11         break;
  12     }

This prints the string "cow" if there is one cow, "cowen" if there are two cowen, and "cows" if there are any other number of cows. The switch statement evaluates its argument and jumps to the matching case label, or to the default label if none of the cases match. Cases must be constant integer values.

The break statements inside the block jump to the end of the block. Without them, executing the switch with numberOfCows equal to 1 would print all three lines. This can be useful in some circumstances where the same code should be used for more than one case:

   1     switch(c) {
   2     case 'a':
   3     case 'e':
   4     case 'i':
   5     case 'o':
   6     case 'u':
   7         type = VOWEL;
   8         break;
   9     default:
  10         type = CONSONANT;
  11         break;
  12     }

or when a case "falls through" to the next:

   1     switch(countdownStart) {
   2     case 3:
   3         puts("3");
   4     case 2:
   5         puts("2");
   6     case 1:
   7         puts("1")
   8     case 0:
   9         puts("KABLOOIE!");
  10         break;
  11     default:
  12         puts("I can't count that high!");
  13         break;
  14     }

Note that it is customary to include a break on the last case even though it has no effect; this avoids problems later if a new case is added. It is also customary to include a default case even if the other cases supposedly exhaust all the possible values, as a check against bad or unanticipated inputs.

   1     switch(oliveSize) {
   2     case JUMBO():
   3         eatOlives(SLOWLY);
   4         break;
   5     case COLLOSSAL:
   6         eatOlives(QUICKLY);
   7         break;
   8     case SUPER_COLLOSSAL:
   9         eatOlives(ABSURDLY);
  10         break;
  11     default:
  12         /* unknown size! */
  13         abort();
  14         break;
  15     }

Though switch statements are better than deeply nested if/else-if constructions, it is often even better to organize the different cases as data rather than code. We'll see examples of this when we talk about function pointers C/FunctionPointers.

Nothing in the C standards prevents the case labels from being buried inside other compound statements. One rather hideous application of this fact is Duff's_device.

9.2. Loops

There are three kinds of loops in C.

9.2.1. The while loop

A while loop tests if a condition is true, and if so, executes its body. It then tests the condition is true again, and keeps executing the body as long as it is. Here's a program that deletes every occurence of the letter e from its input.

   1 #include <stdio.h>
   2 
   3 int
   4 main(int argc, char **argv)
   5 {
   6     int c;
   7 
   8     while((c = getchar()) != EOF) {
   9         switch(c) {
  10         case 'e':
  11         case 'E':
  12             break;
  13         default:
  14             putchar(c);
  15             break;
  16         }
  17     }
  18 
  19     return 0;
  20 }

Note that the expression inside the while argument both assigns the return value of getchar to c and tests to see if it is equal to EOF (which is returned when no more input characters are available). This is a very common idiom in C programs. Note also that even though c holds a single character, it is declared as an int. The reason is that EOF (a constant defined in stdio.h) is outside the normal character range, and if you assign it to a variable of type char it will be quietly truncated into something else. Because C doesn't provide any sort of exception mechanism for signalling unusual outcomes of function calls, designers of library functions often have to resort to extending the output of a function to include an extra value or two to signal failure; we'll see this a lot when the null pointer shows up in C/Pointers.

9.2.2. The do..while loop

The do..while statement is like the while statement except the test is done at the end of the loop instead of the beginning. This means that the body of the loop is always executed at least once.

Here's a loop that plays a slot machine until it recovers its inital losses (if ever). If we changed the do..while loop to a while loop, it would never pull the lever the first time, because winnings would not be less than 0 to start with.

   1     winnings = 0;
   2     do {
   3         winnings += pullSlotMachineLever();
   4     } while(winnings < 0);

The do..while loop is used much less often in practice than the while loop. Note that it is always possible to convert a do..while loop to a while loop by making an extra copy of the body in front of the loop.

9.2.3. The for loop

The for loop is a form of syntactic sugar that is used when a loop iterates over a sequence of values stored in some variable (or variables). Its argument consists of three expressions: the first initializes the variable and is called once when the statement is first reached. The second is the test to see if the body of the loop should be executed; it has the same function as the test in a while loop. The third sets the variable to its next value. Some examples: