The Data Compression Conference did not like the paper [about the transform], perhaps put off by my rather pragmatic presentation. However, Mark Nelson drew attention to it in a Dr. Dobbs article, and that was enough to ensure its survival. I was embarrassed to see that Mark Nelson called it the "Burrows-Wheeler Transform", even though I tried to make it clear that the algorithm was David's. Of course, David was far too polite to mention this.Interestingly both Wheeler and Burrows are more well-known for other accomplishments than the transform.
Wheeler is credited with inventing the subroutine while working on the EDSAC, one of the first computers. Subroutines were both difficult to implement, though conceptually simple; without them we wouldn't be able to design and implement large programs at all.
Burrows co-invented Alta Vista arguably the predecessor of Google and the first large-scale successful search engine. For this and other work he won the Mark Weiser Award for "contributions that are highly creative, innovative, and possibly high-risk, in keeping with the visionary spirit of Mark Weiser."
From his London Times obituary we find the following about Wheeler:
Wheeler was an inspiring teacher who helped to develop computer science teaching at Cambridge from its inception in 1953, when the Diploma in Computer Science was launched as the world's first taught course in computing. Many of his research students now occupy senior positions in major computer companies, or have made their own significant contributions to computer science, including the development of new programming languages.
From an online-article about Burrows that is no longer accessible from Stanford, we find the following:
Mike is not just another hip, smart young Googler. He's one of the pioneers of the information age. His invention of Alta Vista helped open up an entire new route for the information highway that is still far from fully explored. His work history, intertwined with the development of the high-tech industry over the past two decades, is distinctly a tale of scientific genius.Mike knew he wanted to be a scientist early on and focused on it at school. "I strongly prefer the British educational system," he says. "We begin to specialize at age 14. That means after age 14 I didn't have to study history, geography or any social sciences I was bad at. I was able to focus on the subjects I liked. Early specialization gives you more time to concentrate and makes it easier for you to succeed."
char[]
,
an array of char values (though all in the range 0-255).
char[]
for
the transformed data.
ByteArrayInputStream
and
Huffman Coding as described in the howto pages.
Write the compressed data and the first index to the output
file/stream.
int
value representing the first
index for the compressed chunk.
ByteArrayOutputStream
for uncompressing.
byte[]
to a char[]
so that
you can then undo the move-to-front transform.
char[]
from the previous step.
If you snarf the assignment or browse the code
directory you'll get a new GUI HuffViewer
class with an
additional menu item to transform an input file to an output file using
BWT. The GUI will now call a method transform
specified by
the
IHuffProcessor
to encode/transform via BWT and Huffman
coding (with move-to-front). You'll need to write this
transform code in the class
you have that implements the IHuffProcessor
interface. There's information in the howto pages on this.
In the code you'll write you'll use methods from the
BurrowsWheeler
class --
there are three comments labeled TODO in that class, you'll need
to write code in those places in addition to the code you write in your
IHuffProcessor.transform
method.
To uncompress a file compressed by BWT you'll need to look at the
magic-number, than call a method you write in your
IHuffProcessor
class to unhuff/untransform the
data. There's more in the howto pages,
but basically you'll need to reorganize your Huffman
compressing/uncompressing code so you can call it appropriately
when using BWT.