CPS 206 Fall 1999, Problem 11. Work alone.

Fibers: Final Project

 

You are to complete one or more of the following suggested projects, all related to the Fiber system of program execution. Each project should include determining the "useful instruction efficiency" of the Fiber program built, should assume that all input is read before timing begins, and all output is printed after timing ends (from an array or list left in memory). Each project should include a comparison of the Fiber program's results with those obtained from a sequential implementation of the same algorithm, to demonstrate correctness. Report the efficiency obtained on at least two test cases of your choosing:

  1. Build an efficient Fiber program to execute one alternative approach to Dijkstra's Single Source Shortest Path Algorithm. Test cases should be directed graphs on N nodes, with about N2/20 arcs. Each arc-length should be an integer in the range 1..20. This algorithm was outlined in a lecture on PROLOG as follows:

Assume the data base contains the representation of a directed graph, as follows:

Each node i in the graph is accompanied by the facts: arcs(i, dl), where dl is a list of [ node, length] sublists. If [j, x] appears in dl, the graph has an arc of length dl from node i to node j. The problem is to find a path of least total distance (sum of its arc lengths) from node 0 to node N: path(N,L, D) is true when L is such a path (a list of nodes), and D is the total distance of path L.

A method that works well is due to Dijkstra: Label those nodes whose minimum distance from 0 is known with that distance. A new node not yet labeled can be labeled if that node is one whose direct distance from any labeled node is smallest. The currently-labeled nodes are maintained in a set H using facts that are added to the data base: inH(i) indicates that i is in H. Each node has a distance: d(i, distance) and these facts are updated during the computation. The graph's node set consists of all nodes numbered 0..N. A computation stage occurs as a new node is added to H. First, the arcs leaving that node are followed, and the tentative distance for arcs at each arc tip are adjusted. Then, the set of nodes not in H are searched for a node with smallest distance among these nodes. One node at minimum distance is selected and added to H, beginning the next stage. Once all nodes reachable from 0 have been added to H, the computation ends (it can end sooner, when N enters H). The list returned can be discovered from the d() facts in the data base, or it can be computed backwards from N, by a trail t(j, prev) of facts entered when node "prev" minimizes j's distance. During the computation, it may be helpful to maintain also the set of nodes that are reachable, and are not yet in H.

Modify the problem specification to allow the graph to be read from input in the form of triples of integers: From To Length.

There are at least 3 alternative approaches to the implementation of the algorithm. You should implement one of them:

      1. Place the length of the arc from node i to node j in element A[i, j], and assume that every node is connected to every other node; when node i enters H, examine A[i, x] for all x not in H, updating the Distance(x), the distance from 0 to x using only nodes currently in H. Find the minimum Distance(x), and enter x into H.

        1. The minimum can be found using a linear search over all nodes not in H, or
        2. By maintaining a data structure called a "priority queue" that allows elements to be added, and at any time, the least element to be removed, both operations taking O(log N) time, when there are N elements in the queue. (This option uses fewer total instructions asymptotically, but is much harder to implement.)

      1. Sort arcs Ar whose "tail node", Ar.i, is in H by the value of T(Ar) = Distance(Ar.i)+Ar.Length. These can be sorted using a "bucket sort" by placing all arcs A with T(A)=K into a list headed by Bucket[K]. The arcs in Bucket[Cur] can be removed, and examined to see if their "head node" can now be added to H. After all arcs in this bucket have been examined, Cur is increased by 1, to begin work on the next longest paths.

  1. My fiber programs seem to use several instructions when one fiber completes, and wants to communicate its results to another fiber, for further processing. I have 2 suggestions for improving this situation: (a) Implement User Semaphores, with operations up(), and down(); use them to allow one fiber to wait for another to signal completion. (b) Implement a class Channel. A channel object C has operations C.send(x) and y=C.rcv() defined, which implement the ability for one fiber to send an integer to another, with the receiving fiber delaying on the rcv() until the matching send() has been issued. up() and down() can be implemented in 1 (useless) cycle each, but send() and rcv() take 5 useless cycles each. Use each of these schemes in at least two Fiber programs, and determine which yields faster (or more efficient) implementations. You can use the "Find Maximum" algorithm, modified to make use of a "tree" of fibers to combine results from several fibers. You can choose the second algorithm yourself, as long as it makes effective use of multiple fibers, whose results must be combined in some way that requires synchronization. In your report, discuss how the fibers can decide one common "channels" or "semaphores" to use for this synchronization, using as few extra instructions for the purpose as possible. If you can think of some compiler algorithms that produce efficient implementations of the code you have to write to use the Java fiber system, explain them.
  2. Radix sort: Suppose an array A of N integers, each randomly chosen from {0,…,4095} is given. Place those integers whose leading 4 bits are I into the I-th row of an array Kb[][], sort each of the rows of that array separately into non-decreasing order, and finally place all the integers back into A, now sorted into non-decreasing order. Write an efficient Fiber algorithm for this problem.
  3. The Knapsack problem: A list of N objects [(W1, V1), …, (WN, VN)], along with a weight-limit W. The i-th object's weight is Wi, and its value is Vi. The object is to pack a knapsack such that, without exceeding the weight-limit W, the objects in the knapsack have provably maximum value, among all combinations of objects that can be chosen to fill the knapsack. This problem can be solved using Dynamic Programming, without examining every possible combination of objects that can be chosen: Let KN(j, u) be the maximum value that can be achieved by selecting only objects numbered 1, …, j, and using a weight limit of u. (Assume all weights are integers). Then KN(j, u) = max ( KN(j-1,u), Vj+KN(j-1, u-Wj ) ), if u>= Wj, and KN(j-1,u), otherwise. The base cases are: KN(0, u)=0. KN(j,0)=0 also, but hopefully is implied by the definition of the general case for KN(j, u). The original problem has value KN(N,W). To solve the problem quickly, compute an array Kna[j, u] = KN(j, u), and instead of calling KN(j-1,x) recursively, reference the previously-computed value Kna[j-1,x]. The N-th object is in the knapsack if Kna[N,W] != Kna[N-1,W], and the presence of the object with next smaller index can be found by examining either Kna[N-1, W] or Kna[N-1, W-WN]. Write an efficient Fiber program to solve this problem. What part of the problem's solution is least efficient?
  4. Does the fiber scheme work on recursive programs? Consider the Prolog program you wrote for computing all the partitions of a number S into I parts, where each of the I integers in a partition is positive. Implement some variant of that program in Fibers, to execute efficiently. Is it better to redesign the program to work iteratively, or can Fibers be added directly to a recursive version of the program?