In addition to the original specifications, your team must add the following functionality to your program. These features emphasize specific design issues and are meant to help you think about what it means to "hard-code" things within your program and how to allow the user to gain access to those things when using your program.
A typical architecture for many computer programs divides a program's execution into three stages: input, in which data is provided to the program; process, in which that data is transformed; and output, in which the program displays the results of the transformation. This input/process/output (IPO) model of programming is used in simple programs like this one as well as in million-line programs that forecast the weather or predict stock market fluctuations. Your final program should be clearly separated into three independent modules, each containing one or more classes that make it flexible enough to accommodate a variety of options without requiring either of the other modules to change. To do this, you must think carefully about what each step produces so that the next step can safely receive it.
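As a rough illustration of that separation, the sketch below wires three stages together through plain data. It is only one possible shape, not the required design: the function names (`read_words`, `build_concordance`, `format_text`), the tuple layout passed between stages, and the uppercase keyword marking are all assumptions made for this example, and each function stands in for what would be a class or group of classes in your program.

```python
import io

def read_words(source):
    """Input stage: turn any file-like object into a plain list of words."""
    return source.read().split()

def build_concordance(words, before=3, after=3, min_length=3):
    """Process stage: produce (left context, keyword, right context) tuples.

    The tuple layout is an assumption of this sketch; what matters is that
    it is the only thing the output stage needs to know about.
    """
    entries = []
    for i, word in enumerate(words):
        if len(word) >= min_length:
            left = words[max(0, i - before):i]
            right = words[i + 1:i + 1 + after]
            entries.append((left, word, right))
    # Alphabetical by keyword, ignoring case.
    return sorted(entries, key=lambda entry: entry[1].lower())

def format_text(entries):
    """Output stage: render one line per entry, keyword in upper case."""
    return "\n".join(
        " ".join(left + [keyword.upper()] + right)
        for left, keyword, right in entries
    )

def run(source):
    """Chain the stages; each can be swapped without touching the others."""
    return format_text(build_concordance(read_words(source)))
```

Because `read_words` only requires an object with a `read()` method, an `io.StringIO`, an open file, or a network response could be passed in without changing the other two stages; likewise a hypothetical `format_html` could replace `format_text` as long as it accepts the same tuples.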
The requirements for each module are described below:
Your program should be able to read text files from a variety of sources. For example,
Your program should be flexible in how it orders and chooses its keywords. For example,
Your program should be flexible in the formatting of the output. For example,
Your team may implement all of these options or additional options to distinguish itself from the masses (i.e., for extra credit). However, note that the amount of extra credit will be in proportion to the amount of intellectual effort needed to implement the option. For example, accepting regular expressions in addition to exact words would be worth a lot of credit because it would require learning about regular expressions and mastering the available implementation. On the other hand, adding yet another way to set apart the keyword on a line would not be worth very much. Of course, a well-tested, perfectly working program that has fewer features (but plenty of clear paths to easy expansion) is always worth more than the leaky kitchen sink.
In short, to maximize your grade, you should implement enough variety in your program to clearly demonstrate that your design supports such extensions.
By default, your program should work as described in the original specifications. However, if there is a file called kwic.properties in the directory where the program is being run, then it should be able to customize the output based on the following options. You can add other options for extra credit.
| Option Format | Default | Description |
|---|---|---|
| before=<int> | 3 | maximum number of words of context to print before the keyword |
| after=<int> | 3 | maximum number of words of context to print after the keyword |
| order=<string(s)> | alphabetical | output order: length means by word length; number means by most occurrences; chronological means by first appearance; reverse means opposite order. Any number of these values may appear in this option, separated by spaces. The order in which they appear determines their order of importance. For example, given the option "length number alphabetical", the output should be sorted first by the length of the word, then by number of appearances, and finally, if both of those are equal, by alphabetical order. |
| offset=<string> | none | output keyword should be surrounded immediately before and after by the string given in the option |
| color=<#hexcolor> | none | output keyword in the given browser color (only valid in HTML output) |
| aligned=<boolean> | true | output keywords such that they are aligned in a column |
| min=<int> | 3 | minimum number of letters in a word to be considered a keyword in the concordance |
| exclude=<filename> | none | exclude words from the given file from being a keyword in the concordance |
| include=<reg_exp> | all | exclude all words from being keywords except those that match the given expression |
| max=<int> | all | maximum number of occurrences of keyword to print at a time |
| output=<string> | text | output format: either html or text |
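
To make the defaults and the multi-criteria `order` option concrete, here is a minimal sketch under stated assumptions. The names `DEFAULTS`, `parse_options`, and `sort_keywords` are hypothetical. The parser handles only plain `key=value` lines and `#` comments, not full properties-file syntax. The sketch also settles points the table leaves open: it treats `length` as shortest first, `number` as most occurrences first, and `reverse` as flipping the whole combined ordering. Your team may reasonably decide otherwise.

```python
from collections import Counter

# Defaults copied from the options table; options whose default is
# "none" or "all" are simply absent. This dictionary is illustrative.
DEFAULTS = {
    "before": "3", "after": "3", "order": "alphabetical",
    "aligned": "true", "min": "3", "output": "text",
}

def parse_options(text):
    """Parse key=value lines, keeping the default for any key not given."""
    options = dict(DEFAULTS)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep:
            options[key.strip()] = value.strip()
    return options

def sort_keywords(words, order):
    """Return the distinct words sorted by the criteria listed in `order`.

    Criteria are applied in the order given, so earlier names matter more.
    """
    counts = Counter(words)
    first_seen = {}
    for position, word in enumerate(words):
        first_seen.setdefault(word, position)
    criteria = {
        "alphabetical": lambda w: w,
        "length": lambda w: len(w),
        "number": lambda w: -counts[w],  # assumption: most frequent first
        "chronological": lambda w: first_seen[w],
    }
    names = order.split()
    reverse = "reverse" in names  # assumption: reverses the combined order
    keys = [criteria[n] for n in names if n != "reverse"]
    if not keys:
        keys = [criteria["alphabetical"]]
    # A tuple key compares element by element, giving the tie-breaking
    # behavior described for the order option.
    return sorted(counts, key=lambda w: tuple(k(w) for k in keys), reverse=reverse)
```

For the word list `pear fig apple fig kiwi pear fig`, calling `sort_keywords` with `"length number alphabetical"` yields `fig, pear, kiwi, apple`: `fig` is shortest, and `pear` precedes `kiwi` because both have four letters but `pear` occurs more often.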