Dave Reed, statement CS2 workshop

The debate between abstraction and implementation in CS2 is clearly one of degrees. I think that everyone would agree that students should be comfortable working with abstract data types. Being able to understand the interface to and general behavior of an abstraction is becoming an increasingly important skill in software engineering. The use of abstractions also allows for more interesting and complex applications in a CS2 course, since part of the complexity of a problem can be encapsulated into classes that are supplied to the students. At the same time, actually implementing data types can help the students to develop programming skills and better understand/appreciate the tradeoffs between alternative structures.

In the past, the balance between abstraction and implementation was largely decided for us by the languages we used. For example, Pascal did not provide strings as a standard data type. As such, a common exercise for students in Pascal was to implement strings as arrays of characters and to define useful operations on those strings. Similarly, the effective use of C strings required a relatively in-depth understanding of their implementation. This naturally led to having students implement their own string type, or at least to study the existing C implementation. With the standardization of string classes in C++ and Java, implementing your own string class makes less sense. The supplied class is far more robust and complete than any a student would define, and it has the advantage of being standardized across platforms. The question then becomes: what is lost by the students using the string abstraction provided for them as opposed to building their own implementation? In the right context, I would venture to say that nothing is lost. There is nothing inherently important or useful about the string implementation. What was important was that by designing and implementing the data type, the students achieved a better appreciation of abstraction, gained experience manipulating arrays, and better understood the significance of a linear, direct-access structure. As long as these specific skills are covered in a different application, it is perfectly reasonable and even logical to adopt the string abstraction.

With the advent of standardized libraries of data structures (STL in C++ or JFC in Java), the question of abstraction vs. implementation can be applied to classic data structures such as stacks, queues, and trees. Again, I would argue that there is nothing inherently important about the implementation of these particular data structures. It is important that students understand the behavior of these structures and appreciate their applicability to certain types of problems. However, it would seem reasonable to utilize the standardized implementations of some or even all of these data structure in CS2. It is important that students gain some low-level experience implementing and then using data types, but the choice as to which types is not so important. It is the skills and experiences obtained by the implementation that must be the focus.

What we must do as instructors is identify those skills and experiences that we believe are important in a CS2 class. We must then ensure that each of them is highlighted via one or more applications in the course. For example, I would claim that CS2 students should have experience building a dynamically linked structure and using recursion to manipulate that structure. A natural application in which to obtain that experience would be in implementing a tree class (e.g., a binary search tree). However, these same skills could be achieved via a different application (e.g., searching a graph). Similarly, I feel that students better appreciate implementation tradeoffs if they actually implement a structure in different ways and experimentally compare performance. This type of learning experience could be achieved by having the students implement a queue using different techniques (e.g., vector, linked-list, circularly linked-list) and then perform experiments. It could also be obtained in other settings, such as implementing a dictionary class using alternative structures (e.g., vector, binary search tree, trie). Again, the issue is providing the students with the appropriate experiences, not following a check list of data structures.

A common argument against utilizing predefined data structures is that it provides a disservice to students who may move on to other languages where these structures are not provided. This argument has some merit, although its force is weakening as the popularity of standardized languages such as C++ and Java grows. Ideally, this argument can be rebutted as long as the appropriate skills and experiences have been conveyed to the students. Once a skill or approach has been mastered, students should be able to apply it to numerous applications. For example, students who have previously implemented a graph as a dynamically-linked structure should be able to apply their knowledge to implementing a binary search tree. Our goal as instructors, then, is to teach an appreciation of data structures as abstractions and enough implementation skills to enable students to apply their knowledge across applications.