Working with Sets


Terminology

Definition
A set is an unordered collection of distinct objects. The objects in a set are also called the elements or members of the set.
Common Operations
The union of two sets A and B is the set containing elements that are either in A or in B.
The intersection of two sets A and B is the set containing elements that are both A and B.
The difference between two sets A and B is the set containing elements that are in A but not B.
The complement of a set A is the set of all elements of the universal set (i.e. set containing all objects under consideration) that are not in A.

Using Sets

Suppose A is a Set of String objects representing the names of first-year students at Duke and B is a Set of String objects representing students taking CompSci 6 at Duke. Express each of the following sets in terms of operations on sets A and B.

  1. the set of first-year students taking CompSci 6
    
    
    
    
    
    
  2. the set of first-year students who are not taking CompSci 6
    
    
    
    
    
    
  3. the set of students who either are first-year students or are taking CompSci 6
    
    
    
    
    
    
  4. the set of students who either are not in their first-year or are not taking CompSci 6
    
    
    
    
    
    

Using TreeSet

In Java, the TreeSet class, is used to model the concept of a set given above. In the remainder of this activity, you will use one or more TreeSet objects to discover several interesting statistics about words in a file: how many distinct words exist in the file, how many appear in two separate files, how many are common to both files, and finally, how many appear in one file but not the other. Sets prove to be a useful tool to explore these kinds of statistics because each is a common use of sets: creation, union, intersection, and difference, respectively.

There are two classes defined for this activity: SetAlgorithms and TestSets. The first is the class you will be implementing, the second simply tests the first by calling its methods and printing the results for your inspection. You should start by reading over the code in SetAlgorithms.java and making sure you understand it.

The output of TestSets.java should indicate that two methods that create and print sets are properly implemented. Implementing the set operation methods is your exercise today.

After you have verified the current implementation, you should implement the following additional methods and test them by inspecting their output.

  1. Complete the method, union, that creates a new TreeSet that is the union of the two TreeSets passed as parameters. The union of two sets is the set that contains all of the elements of both sets. This method should not modify either of the sets passed as parameters, but instead create and return a new set. Additionally, it should not use or modify any instance variables.
     
  2. Complete the method, intersection, that creates a new TreeSet that is the intersection of the two TreeSets passed as parameters. The intersection of two sets is the set that contains only those elements that are contained in both sets. This method should not modify either of the sets passed as parameters, but instead create and return a new set. Additionally, it should not use or modify any instance variables.
     
  3. Complete the method, difference, that creates a new TreeSet that is the difference of the two TreeSets passed as parameters. The difference of two sets is the set that contains only those elements that are in the first set, but not the second set. This method should not modify either of the sets passed as parameters, but instead create and return a new set. Additionally, it should not use or modify any instance variables.
     
  4. Write a new version of the method, union, that creates a new TreeSet that is the union of any number of TreeSets passed as an array parameter. This method should not modify either of the sets passed as parameters, but instead create and return a new set. Additionally, it should not use or modify any instance variables.
     
  5. Write a new version of the method, intersection, that creates a new TreeSet that is the intersection of any number of TreeSets passed as an array parameter. This method should not modify either of the sets passed as parameters, but instead create and return a new set. Additionally, it should not use or modify any instance variables.
     

What additional test files might be useful to determine if each method works correctly?








Comments?