Working with Sets
Terminology
- Definition
- A set is an unordered collection of distinct objects. The objects in a set are also called the elements or members of the set.
- Common Operations
- The union of two sets A and B is the set
containing elements that are either in A or in B.
- The intersection of two sets A and B is the set containing elements that are both A and B.
- The difference between two sets A and B is the set containing elements that are in A but not B.
- The complement of a set A is the set of all elements of the universal set (i.e. set containing all objects under consideration) that are not in A.
- The intersection of two sets A and B is the set containing elements that are both A and B.
Using Sets
Suppose A is a Set of String objects representing the names of first-year students at Duke and B is a Set of String objects representing students taking CompSci 6 at Duke. Express each of the following sets in terms of operations on sets A and B.
- the set of first-year students taking CompSci 6
- the set of first-year students who are not taking CompSci 6
- the set of students who either are first-year students or are taking CompSci 6
- the set of students who either are not in their first-year or are not taking CompSci 6
Using TreeSet
In Java, the TreeSet class,
is used to model the concept of a set given above. In the remainder of this
activity, you will use one or more TreeSet objects to discover several interesting statistics about words in a file: how many distinct words exist in the file, how many appear in two separate files, how many are common to both files, and finally, how many appear in one file but not the other. Sets prove to be a useful tool to explore these kinds of statistics because each is a common use of sets: creation, union, intersection, and difference, respectively.
There are two classes defined for this activity: SetAlgorithms and TestSets.
The first is the class you will be implementing, the second simply tests
the first by calling its methods and printing the results for your inspection.
You should start by reading over the code in SetAlgorithms
and making sure you understand it. Note, you cannot access items in a set
directly, as was possible with an ArrayList, so instead you must access each
item in turn using a separate Iterator object.
The output of TestSets should indicate that two methods that create and print sets are properly implemented. Implementing the set operation methods is your exercise today. After you have verified the current implementation, you should implement the following additional methods and test them by inspecting their output.
- Complete the method,
union, that creates a newTreeSetthat is the union of the twoSets passed as parameters. The union of two sets is the set that contains all of the elements of both sets. This method should not modify either of the sets passed as parameters, but instead create and return a new set. Additionally, it should not use or modify any instance variables. - Complete the method,
intersection, that creates a newTreeSetthat is the intersection of the twoSets passed as parameters. The intersection of two sets is the set that contains only those elements that are contained in both sets. This method should not modify either of the sets passed as parameters, but instead create and return a new set. Additionally, it should not use or modify any instance variables. - Complete the method,
difference, that creates a newTreeSetthat is the difference of the twoSets passed as parameters. The difference of two sets is the set that contains only those elements that are in the first set, but not the second set. This method should not modify either of the sets passed as parameters, but instead create and return a new set. Additionally, it should not use or modify any instance variables. - Complete the overloaded method,
union, that creates a newTreeSetthat is the union of any number ofSets passed as an list. This method should not modify either of the sets passed as parameters, but instead create and return a new set. Additionally, it should not use or modify any instance variables. - Complete the overloaded method,
intersection, that creates a newTreeSetthat is the intersection of any number ofSets passed as an list. This method should not modify either of the sets passed as parameters, but instead create and return a new set. Additionally, it should not use or modify any instance variables.
What additional test files might be useful to determine if each method works correctly?