v22.0310-003: Homework 5

Due date: Wed December 8

You have more than 2 weeks to do this homework. On the other hand, it is a big one, so do not leave everything to the last few days. At least get the programming part out of the way first, as this cannot be done under time pressure. This has the usual programming part and written part, to be submitted as usual.


Look here for additional information posted after the original assignment:
  1. Prove the correctness of the topological sort algorithm on page 205. We suggest that you first state an appropriate property about the "tophelp" method, and prove that this property is true.
    REMEMBER: a proof is really like an essay which has a well-defined structure, and must be carefully thought out. If you want to use a proof by contradiction, say this in your first sentence! Then tell us what you are trying to contradict. Or if you want to use induction, say this first. Then tell us what the induction hypothesis is. Review Chapter 2 on proofs.

  2. Problem 7.7 (page 221) in Text: simulation of Dijkstra's Algorithm.

  3. Problem 7.8 (page 221) in Text: modification of Dijkstra's Algorithm.
    Please use the algorithm in Figure 7.16 (p.212) as your basis. IMPORTANT: You must explain how your modified algorithm answers the question.

  4. Problem 7.9 (page 221) in Text: root of a DAG.
    First explain in English how you wish to solve this problem. Then present your solution in the form of valid Java code. REQUIREMENTS: you must comment your code, and it must be possible to compile and execute your code (so you need a class and a main method).
    HINT: Although we do not require it, we suggest that you actually compile and run your code to test it.

  5. Problem 8.5 (page 263): modified Selection Sort.

  6. Problem 8.7 (page 263): modified Quicksort.

  7. Problem 8.12 (page 265) in Text: nuts and bolts.
    Instead of solving this entire problem, we ask you to devise a decision tree for n=3. Let the input be {N1, N2, N3, B1, B2, B3}. Draw this decision tree.
    HINT: You may exploit symmetry to draw only one part of the tree (just say that "the other part is symmetrical"). E.g. you may begin by comparing (B1:N1). This is the root of your decision tree. There are three outcomes:
    	B1=N1, B1<N1, B1>N1.
    	
    Continue from there.

  8. Programming Question.
    In this problem, we explore how we can make experimental comparisons between different algorithms. In this particular example we will try to compare the usual multiply routine from the BigInteger package with the multiply routine you (should) have written in hw4.

    First note that we are interested in getting the asymptotic performance of the respective algorithms. For this purpose, assume our algorithm has polynomial time complexity and we want to determine the exponent which occurs in the complexity expression. Thus, for example, for Karatsuba's algorithm, the complexity of multiplying two n bit numbers is

    	Theta(n^(lg 3))
    	
    So the exponent is
    	lg 3 = 1.58...
    	
    We want an experimental method to obtain this exponent. In practice, we expect to get an approximation to this exponent because of various system-related factors which are out of our control. This quibbling aside, what measurements can we perform to find out the exponent?

    First we can express a polynomial running time T(n) (approximately) as follows:

    	T(n) = C * n^e,
    	
    for some constants C and e. We want to find e experimentally. Taking logarithms,
    	log T(n) = log C  +  e * (log n)
    	
    Thus, the log-log plot of T against n is a straight line, whose slope is the value e we want. We can estimate this slope in many ways (a well-known one is the "least squares" method). Also, you need to repeat the measurements several times and average them to get a more accurate reading.
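For reference, the least-squares slope mentioned above can be sketched as follows. This is only an illustration, not part of timing.java: the class name LogLogFit, the method name slope, and the synthetic data in main are all assumptions made for the example.

```java
/** Illustrative sketch: least-squares slope of log T against log n. */
final class LogLogFit {

    /**
     * Returns the slope of the least-squares line through the points
     * (log n[i], log t[i]). Assumes both arrays have the same length,
     * at least two distinct values of n, and strictly positive entries.
     */
    static double slope(double[] n, double[] t) {
        int m = n.length;
        double sumX = 0, sumY = 0, sumXX = 0, sumXY = 0;
        for (int i = 0; i < m; i++) {
            double x = Math.log(n[i]);   // x = log n
            double y = Math.log(t[i]);   // y = log T(n)
            sumX += x;
            sumY += y;
            sumXX += x * x;
            sumXY += x * y;
        }
        // Standard closed form for the slope of a simple linear regression.
        return (m * sumXY - sumX * sumY) / (m * sumXX - sumX * sumX);
    }

    public static void main(String[] args) {
        // Synthetic "measurements" generated from T(n) = 3 * n^1.58,
        // so the fitted slope should come out close to 1.58.
        double[] n = {256, 512, 1024, 2048};
        double[] t = new double[n.length];
        for (int i = 0; i < n.length; i++) {
            t[i] = 3.0 * Math.pow(n[i], 1.58);
        }
        System.out.println("estimated exponent: " + slope(n, t));
    }
}
```

With real timings the points will not lie exactly on a line, which is why a fit over all measurements is usually more stable than a slope taken from two points.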

    In this program, instead of calculating the slope by the least-squares estimation described above, we simply take the first measurement as correct, and measure the slope with respect to that point. The slope is called dimension in the code.
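A minimal sketch of this two-point slope is shown below. It assumes that (n0, t0) is the first measurement and (n, t) is a later one; the class name DimensionSketch and the method signature are hypothetical and need not match what timing.java uses.

```java
/** Illustrative sketch: slope of the log-log plot relative to the first measurement. */
final class DimensionSketch {

    /**
     * Estimates e in T(n) = C * n^e from two points, using
     * e = (log t - log t0) / (log n - log n0).
     * Assumes n != n0 and that all four values are strictly positive.
     */
    static double dimension(double n0, double t0, double n, double t) {
        return (Math.log(t) - Math.log(t0)) / (Math.log(n) - Math.log(n0));
    }

    public static void main(String[] args) {
        // Made-up numbers: doubling n from 100 to 200 quadruples the time,
        // which corresponds to an exponent of 2.
        System.out.println(dimension(100, 10, 200, 40));
    }
}
```

Note that any error in the first measurement affects every exponent computed this way, so the first bit length deserves a generous number of trials.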

    In the following program timing.java we provide code to process the arguments in a nice way. (You are welcome to write this stuff too). You have to fill in code which

    1. measures the average time taken for multiplication for each bit length; both for Karatsuba's algorithm, as well as for the usual BigInteger.multiply.
    2. calculates the exponent for each value of the bit length.

    The approach is quite straightforward. We keep a variable called numBits, which, as should be clear from the name, stores the bit length of the two BigIntegers we are multiplying at that point. We let this bit length vary from some minimum value to some maximum value, with some step length. For each bit length, we do a certain number of trials. Each trial consists of randomly picking two BigIntegers, and multiplying them together. We measure the time taken to do this multiplication.
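The loop just described might look roughly like the sketch below. It is not the provided timing.java: the class name TimingSketch, the method averageMillis, and the hard-coded values of minBits, maxBits, step and trials are assumptions for illustration, and only BigInteger.multiply is timed (a Karatsuba routine from hw4 would be timed the same way).

```java
import java.math.BigInteger;
import java.util.Random;

/** Illustrative sketch of the measurement loop; all names here are hypothetical. */
final class TimingSketch {

    /** Average wall time in milliseconds of one multiplication of two random numBits-bit numbers. */
    static double averageMillis(int numBits, int trials, Random rnd) {
        long total = 0;
        for (int i = 0; i < trials; i++) {
            // BigInteger(numBits, rnd) picks a value uniformly from 0 .. 2^numBits - 1.
            BigInteger a = new BigInteger(numBits, rnd);
            BigInteger b = new BigInteger(numBits, rnd);
            long start = System.currentTimeMillis();
            BigInteger product = a.multiply(b);
            total += System.currentTimeMillis() - start;
            // Use the result so the multiplication cannot be discarded as dead code.
            if (product.signum() < 0) {
                throw new IllegalStateException("product of non-negative numbers was negative");
            }
        }
        return (double) total / trials;
    }

    public static void main(String[] args) {
        // Hypothetical parameter values; the real program reads them from the command line.
        int minBits = 256, maxBits = 2048, step = 256, trials = 20;
        Random rnd = new Random();
        for (int numBits = minBits; numBits <= maxBits; numBits += step) {
            System.out.println(numBits + " bits: " + averageMillis(numBits, trials, rnd) + " ms");
        }
    }
}
```

For small bit lengths a single multiplication takes far less than a millisecond, so many of the readings will be 0; large bit lengths or many trials are needed before the averages become meaningful.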

    All the relevant parameters can be set from the command line.

    What are the other problems which might arise?

    1. We measure time using System.currentTimeMillis(), which returns the number of milliseconds elapsed since some reference date. However, a time computed from this value is merely the so-called "wall time", which might not have any relation to the actual time taken by the CPU; remember that in any decent multiprocessing system there are scores of other processes running in the system.
    2. The wall time also includes the time the process spends sitting in various OS queues, waiting for input, etc., none of which is time spent on the multiplication itself.
    3. The JVM imposes timing issues of its own. As in any other garbage-collected language, the runtime might start a garbage collection, which might take an arbitrary amount of time. Even otherwise, when the runtime has to allocate (to itself) more memory from the underlying OS, it does so by making a malloc (or some such) call. This, however, can take an indeterminate amount of time, as the OS might not have a free page to map to the process, etc.
    4. Conclusion: It would be nice if you can find some way of calculating the time accurately.
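One possible direction, offered only as a sketch: on a JVM that provides the java.lang.management API, the CPU time consumed by the current thread can be read instead of the wall time. The class name CpuTimer and the method cpuNanos are hypothetical, and the code falls back to System.nanoTime() (wall time) when thread CPU time is not available. It does not remove the garbage-collection effects described in item 3.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.math.BigInteger;
import java.util.Random;

/** Illustrative sketch: measure per-thread CPU time rather than wall time. */
final class CpuTimer {

    /**
     * Runs the task once and returns the CPU time it used, in nanoseconds.
     * Falls back to elapsed wall time if the JVM cannot report thread CPU time.
     */
    static long cpuNanos(Runnable task) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        boolean cpuAvailable =
                bean.isCurrentThreadCpuTimeSupported() && bean.isThreadCpuTimeEnabled();
        if (!cpuAvailable) {
            long start = System.nanoTime();          // wall-clock fallback
            task.run();
            return System.nanoTime() - start;
        }
        long start = bean.getCurrentThreadCpuTime(); // CPU time of this thread only
        task.run();
        return bean.getCurrentThreadCpuTime() - start;
    }

    public static void main(String[] args) {
        Random rnd = new Random();
        BigInteger a = new BigInteger(4096, rnd);
        BigInteger b = new BigInteger(4096, rnd);
        long nanos = cpuNanos(() -> a.multiply(b));
        System.out.println("CPU time for one multiplication: " + nanos + " ns");
    }
}
```

The resolution of the CPU-time counter depends on the platform, so it is still advisable to time a batch of multiplications and divide, rather than a single call.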