factorial(5) = 5 * factorial(4)

More generally, if n is greater than zero, one can write

factorial(n) = n * factorial(n-1).

This is a very special calculation because it is circular in nature. The strategy for computing factorial(n) requires finding factorial(n-1) first. But how does one compute factorial(n-1)? The answer is that one must compute factorial(n-2), and so forth.
But the formula

factorial(n) = n * factorial(n-1)

fails if n = 0. In the case n = 0, we write

factorial(n) = 1.

This special situation corresponds to what is often called the base case or the halting case. In fact, the general definition of factorial is

factorial(n) = if n = 0 then 1 else n * factorial(n-1)

This is called a recursive calculation because the function being defined is used in the definition. Here is a general format for such calculations:
A method for doing computation C on data D to obtain result R.
if the calculation is trivial then
    do it and return the result R
else {
    divide D into two parts D1 and D2
    do part of the calculation on D1 to obtain R1 (possibly using C)
    do part of the calculation on D2 to obtain R2 (possibly using C)
    combine R1 and R2 to compute and return the result R
}

This can be illustrated by showing how it works on factorial:

A method for computing the factorial of n to obtain result f.
if (n == 0) {
    f = 1
    return f
} else {
    separate n into parts D1 = n and D2 = n-1
    do part of the calculation on D1 to obtain R1 = D1
    compute the factorial of D2 to obtain R2
    combine R1 and R2 to obtain f = R1 * R2
    return f
}

Here is the factorial method:
int factorial(int n)
{
    int D1, D2, R1, R2;
    if (n == 0)
    {
        R1 = 1;                // base case: 0! = 1
        return R1;
    }
    else
    {
        D1 = n;                // the part of the data handled directly
        D2 = n - 1;            // the part passed to the recursive call
        R1 = D1;
        R2 = factorial(D2);    // compute factorial(n-1)
        return R1 * R2;        // combine the results: n * factorial(n-1)
    }
}
This, of course, uses more data locations than are necessary. The program
can be shortened to this:
int factorial(int n)
{
    if (n == 0)
    {
        return 1;
    }
    else
    {
        return n * factorial(n - 1);
    }
}
(Note: If this program is run, it will overflow the int variables for values of n greater than 12, since 13! is already larger than the largest value an int can hold.)
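For larger arguments, one remedy (a sketch only, not part of the original program) is to carry out the arithmetic with Java's arbitrary-precision java.math.BigInteger class; the recursive structure is unchanged:

// Assumes: import java.math.BigInteger; at the top of the file.
BigInteger factorial(int n)
{
    if (n == 0)
    {
        return BigInteger.ONE;                               // base case: 0! = 1
    }
    else
    {
        // n * factorial(n-1), computed with arbitrary-precision integers
        return BigInteger.valueOf(n).multiply(factorial(n - 1));
    }
}

With this version a call such as factorial(20) returns 2432902008176640000 exactly, a value that would overflow an int.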
You can better understand this program for factorial if you carry through a computation by hand. Here is a trace of its major actions when it computes the factorial value for 3. All computations from a single call are indented equally.
Call factorial(3)
    n - 1 = 2
    Call factorial(2)
        n - 1 = 1
        Call factorial(1)
            n - 1 = 0
            Call factorial(0)
                return 1
            return 1 * (value returned by the recursive call) = 1 * 1 = 1
        return 2 * (value returned by the recursive call) = 2 * 1 = 2
    return 3 * (value returned by the recursive call) = 3 * 2 = 6
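If you would like the computer to print such a trace for you, one possibility (a sketch; the method tracedFactorial and its depth parameter are ours, not from the text) is to carry along an extra parameter that records how deep the recursion is:

// A traced variant of the factorial program; the extra parameter "depth"
// controls how far each printed line is indented.
int tracedFactorial(int n, int depth)
{
    String indent = "";
    for (int i = 0; i < depth; i++)
    {
        indent = indent + "    ";               // four spaces per level of recursion
    }
    System.out.println(indent + "Call factorial(" + n + ")");
    if (n == 0)
    {
        System.out.println(indent + "    return 1");
        return 1;
    }
    else
    {
        int r = tracedFactorial(n - 1, depth + 1);
        System.out.println(indent + "    return " + n + " * " + r + " = " + (n * r));
        return n * r;
    }
}

The call tracedFactorial(3, 0) then prints a trace essentially like the one shown above.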
Imagine that a visitor has come to America from another country and is learning English. So, he makes heavy use of a dictionary. The dictionary provides an excellent analogy to a recursive method. Let's see how that goes.
Let's assume that our visitor needs to look up the meaning of prime as in a "prime number". The dictionary gives us (simplifying a bit) "Divisible by no number except itself or one."
That seems straightforward to us, but our visitor doesn't know the meaning of the word divisible. Is he stuck now? No, he has the dictionary! And he uses it recursively to look up that word. He may have to use it several times before he finally understands what prime means. However, it should be clear that, in general, the use of a dictionary is a recursive process.
The dictionary also illustrates a problem we may encounter with poorly formulated recursive algorithms. If our visitor's knowledge of English is extremely limited, he might find that a definition he has come to, after several steps, includes the word he was originally looking up. In other words, he has extracted a completely circular definition. Unfortunately, he is stuck now. A dictionary is inherently circular, defining English words in terms of English words. What is needed to make it work is a basic vocabulary that will allow us to halt after one, or maybe a few, references.
Recursive algorithms always require one or more base cases, or halting cases, for which recursion is not used. Otherwise the algorithm is completely circular, and its implementation will lead to infinite recursion. For our factorial example, the base case is n == 0.
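To see why the base case matters, consider what happens if it is omitted, as in this deliberately broken sketch (the method badFactorial is ours, not from the text):

// WARNING: this version has no base case.  Every call makes another call,
// so the recursion never stops; in Java the program eventually aborts
// with a StackOverflowError instead of returning an answer.
int badFactorial(int n)
{
    return n * badFactorial(n - 1);    // no halting case for n == 0
}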
The general format given earlier also describes the well-known quicksort method for sorting:

A method for sorting a list L.
if L has length 1 or less then
    it is already sorted, so do nothing
else {
    rearrange L so that some entry, called the pivot, has all smaller entries before it and all larger entries after it
    sort the part of L before the pivot (using this method)
    sort the part of L after the pivot (using this method)
}

Coded in Java, the method looks like this:
void quicksort(int[] list, int first, int last)
{
    int pivot;
    if (first < last)                           // sublists of length 0 or 1 are already sorted
    {
        pivot = rearrange(list, first, last);   // partition; pivot is the final position of the pivot entry
        quicksort(list, first, pivot - 1);      // sort the entries before the pivot
        quicksort(list, pivot + 1, last);       // sort the entries after the pivot
    }
}
There are many strategies for coding rearrange and its
auxiliary routine exchange, and we will not
discuss them here. The following is one Java version of these routines you
may wish to study:
void exchange(int[] list, int x, int y)
{
    int temp;
    temp = list[x];
    list[x] = list[y];
    list[y] = temp;
}

int rearrange(int[] list, int first, int last)
{
    int pivot, pval, p, k;
    pivot = (first + last) / 2;          // find the middle position
    exchange(list, first, pivot);        // move the pivot value to the front of the sublist
    pval = list[first];
    p = first;
    k = first + 1;
    while (k <= last)
    {
        if (list[k] <= pval)
        {
            p = p + 1;
            exchange(list, p, k);
        }
        k = k + 1;
    }
    exchange(list, first, p);            // put the pivot value into its final position
    return p;
}
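To try quicksort out, a small test routine such as the following can be added to the same class (a sketch; the method testQuicksort and its sample data are ours, not from the text):

// Sort a small array with quicksort and print the result.
void testQuicksort()
{
    int[] list = {7, 2, 9, 4, 1, 8, 3, 6, 5, 0};
    quicksort(list, 0, list.length - 1);        // sort positions 0 through 9
    for (int k = 0; k < list.length; k++)
    {
        System.out.print(list[k] + " ");        // prints: 0 1 2 3 4 5 6 7 8 9
    }
    System.out.println();
}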
In conclusion, the strategy of dividing a problem into simpler parts
and solving each separately was explored earlier in this chapter. In this
section, the approach for computing function C divides the data into parts,
calculates partial results using C, and combines those results to obtain the
final answer. The methodology is called recursion, and it is both subtle
and powerful. Some writers refer to this methodology of splitting the data into parts and solving each part separately as the divide-and-conquer methodology.
Recursion can often be used as an alternative to writing ordinary looping programs. For example, you can probably write a program to compute factorial easily without recursion. However, programs for some other problems are extremely difficult to write unless you use recursion. An example of such a problem is given in exercise 4 below: the computation of all of the orderings of a set of symbols. Other examples of problems that need recursion for a straightforward solution are the tree-searching problems described in the chapter on artificial intelligence. For the three symbols a, b, c, the orderings to be computed are:
a,b,c
a,c,b
b,a,c
b,c,a
c,a,b
c,b,a
Use a recursive strategy to solve this problem.
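One possible recursive strategy (a sketch of one approach only; the method permutations and its two string parameters are ours, not from the exercise) is to choose each remaining symbol in turn as the next one in the ordering and recurse on the symbols that are left:

// Print every ordering of the characters of "rest", each one preceded by
// the characters already chosen.  Start the process with permutations("", "abc").
void permutations(String chosen, String rest)
{
    if (rest.length() == 0)
    {
        System.out.println(chosen);            // base case: nothing left to arrange
    }
    else
    {
        for (int k = 0; k < rest.length(); k++)
        {
            // Pick the k-th remaining symbol, then arrange the rest recursively.
            permutations(chosen + rest.charAt(k),
                         rest.substring(0, k) + rest.substring(k + 1));
        }
    }
}

The call permutations("", "abc") prints the six orderings listed above (without the commas).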
This chapter has illustrated both strategies in a series of examples. The first was the database problem, where the original statement of the task seemed to involve programming far beyond the grasp of novice programmers. Yet with careful structuring of the solution, a way was found to achieve the target behaviors with a single loop program and relatively little additional complexity.
Java and most other programming languages provide syntactic support to the decomposition process through the subroutine feature. A subroutine or method is a module of programming that solves a part of the total problem. Whenever the solution to any task seems complex, we section off parts of it into subroutines that will be written later. Each part of the solution should be cut down to such a proportion that it seems straightforward and obviously correct.
The use of the subroutine feature requires care in designing communication between the subroutine and its calling program. This communication is carried out through the parameter list. The higher-level program needs to have a job done on its data structures, and it calls the subroutine to do it. The subroutine references only the needed structures in the calling program through the parameter list. If the subroutine needs additional memory locations to do its work, it should have them declared internally.
The database program developed here introduces the concept of a database and shows how it organizes data. An excellent way to retrieve information from such a database is to search for data patterns where some but not all of the fields are specified. One of the shortcomings of the program studied here is its inability to do inferences when the needed facts are not explicitly in the memory. A later chapter will reexamine this shortcoming and describe a method for overcoming it.
In the early days of computing, an interesting competition arose related to the power of computer languages. One researcher would show that his or her language could be used to compute every function that some other language could compute. Then someone else would show that the second language could compute everything that the first could. The implication was that neither language could compute anything more than the other; both were capable of computing the same class of functions. Such comparisons occurred many times with the same surprising result: any nontrivial computer language that one can invent is apparently capable of computing no more and no fewer functions than all the other nontrivial programming languages. This is known as the Church-Markov-Turing Thesis, and it applies to the part of Java described in this book. You have learned all the computational features needed to compute any function that can be programmed by any language yet invented. Additional features provide convenience but not increased computational capability.
The Church-Markov-Turing Thesis has even more profound implications. Consider the portion of the English language appropriate for discussing computation in very concrete terms. This widely used sublanguage includes such sentences as "Find the largest number in this finite set that is prime" and "Sort the set of records on the fifth field." Let us call this portion of English C-English and think of it as just another programming language. As far as anyone can tell, the power of C-English is similar to that of any other programming language; it is capable of describing any calculation that any standard programming language can do, and no more. The Church-Markov-Turing Thesis in its extended form thus asserts that any algorithm that anyone can describe in concrete English (or any other natural language) can be translated to any modern programming language, including the Java presented in this book. This thesis is embraced by most computer scientists, and it implies a kind of completeness to modern programming languages. It says that if you can specify an algorithm concretely in English, you can program it. Modern computer languages are powerful enough; no new languages will enable us to compute more functions than are now possible unless some currently unimaginable new paradigm for computing is created.
You should now be able to solve many problems of moderate size using the Java language. However, this is only a brief introduction to programming, and substantial additional study is needed in order to become an accomplished programmer. If you wish to become an expert, learn the rest of the Java features, as well as one or two other languages. Then you need to undertake an extensive study of data structures and the multitude of ways that can be used to represent data in computers. Associated with this study, learn to analyze the execution times of programs so that you can create efficient, as well as correct, programs. Also, you must learn methodologies for proving the correctness of programs, so that they can be guaranteed to meet the specifications that are claimed for them. Finally, you should gain experience applying all of these methodologies to the analysis and coding of a variety of problems.