PROGRAMMING WITH INVARIANTS (i.e. FINITE DIFFERENCING)

Idea: To avoid a costly computation f(x1,...,xn) in a program region R of
high execution frequency, we can sometimes maintain the equality
c = f(x1,...,xn) cheaply within R, so that all occurrences of f(x1,...,xn)
can be replaced by the stored value c.

EXAMPLE 1 (Topological Sorting)

Given a set S of tasks and a prerequisites relation, schedule all the tasks.

[comment - order a sequence of courses needed for a masters degree program
consistent with the course prerequisites; in general everyone has to solve
this problem just to get through the day - even animals have to schedule
their activities (we all exist in linearly ordered time)]

Solution: Starting with an empty schedule, repeat (*) exhaustively:

(*) Augment the schedule with an unscheduled task that has no unscheduled
    prerequisites.

In pictures:

   --------------------------|---.--------------------------------------------
               <---------\   |   .
                          \  |   .
               <-----------  + <-.-----------  %
               <---------/   |   .
                             |   .
      scheduled              |   .       unscheduled
                             |   .
   --------------------------|-^-.--------------------------------------------
                               |___ schedulable (prerequisites are all scheduled)

[comment - We illustrate Knuth's ingenious topological sorting algorithm with
pictures. The snapshot above portrays an intermediate state of Knuth's
algorithm in which the tasks are partitioned into three regions (which
represent invariants - i.e., each region represents the value of some complex
expression). We repeatedly add an arbitrary schedulable task to the set of
scheduled tasks until all tasks have been scheduled. Maintaining the
schedulable tasks as a small separate part of the unscheduled tasks restricts
the search for the next task to be scheduled, and also makes it easier to
make the transition from one state to the next. How these three regions
change corresponds to how these invariants are maintained.
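As a concrete sketch, here is one way the algorithm might be coded in Python, maintaining the schedulable set via the inverse prerequisite relation and per-task reference counts (the function and variable names here are my own, not from the notes):

```python
from collections import defaultdict

def topological_schedule(tasks, prereqs):
    """Schedule tasks so every task appears after all its prerequisites.

    prereqs[t] is the set of prerequisites of task t.  Two invariants are
    maintained cheaply:
      count[t]    = number of unscheduled prerequisites of t
      schedulable = the unscheduled tasks with count[t] == 0
    """
    # Invert the prerequisite relation once, at the start.
    dependents = defaultdict(list)      # task -> tasks that require it
    count = {}                          # task -> # unscheduled prereqs
    for t in tasks:
        count[t] = len(prereqs.get(t, ()))
        for p in prereqs.get(t, ()):
            dependents[p].append(t)

    schedulable = [t for t in tasks if count[t] == 0]
    schedule = []
    while schedulable:
        t = schedulable.pop()           # any schedulable task will do
        schedule.append(t)
        for d in dependents[t]:         # maintain the invariants
            count[d] -= 1
            if count[d] == 0:           # all of d's prereqs now scheduled
                schedulable.append(d)

    if len(schedule) != len(tasks):
        raise ValueError("cyclic prerequisites: no valid schedule")
    return schedule
```

Each task is moved once and each prerequisite edge is examined once, so the whole schedule costs time linear in tasks plus edges.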
If the task labelled + (whose 3 prerequisites are all scheduled) is added to
the scheduled region, then all the prerequisites of the task labelled %
become scheduled, which makes the % task schedulable. The % task is located
efficiently using the inverse of the prerequisite relation, which can be
constructed at the beginning of the algorithm. The inverse relation allows us
to efficiently locate all tasks that have + as a prerequisite. For each such
task we decrement a reference count of the number of its unscheduled
prerequisites (since + moves from the unscheduled to the scheduled region).
When the reference count for a task goes to zero, we know that its
prerequisites are all scheduled, and that the task is now schedulable.

EXAMPLE 2 (Linear Polynomial Tabulation)

An idea known from antiquity.

    x := x0
    repeat
        print(x*c)
        x +:= e
    end

=>  invariant t = x*c

    t  := x0*c
    t1 := e*c
    repeat
        print(t)
        t +:= t1
    end

EXAMPLE 3 (Polynomial Tabulation by Difference Polynomials)

An idea discovered by Henry Briggs in the early 17th century. Briggs was a
human calculator who worked on Napier's logarithm project. He generalized the
method of antiquity for computing tables of single-variable degree-d
polynomials

    f(x) = a0 + x(a1 + x(a2 + ... + x(ad-1 + x*ad)...))

Briggs's Idea: Difference Polynomials

    f1(x) = f(x + e) - f(x)          has degree no more than d - 1
    fi(x) = fi-1(x + e) - fi-1(x)    has degree no more than d - i,  i = 1,...,d
    fd(x) is a constant, possibly zero

Polynomial Tabulation by Finite Differencing

    x := x0
    repeat
        print(f(x))      -- d additions and d products using Horner's Rule
        x +:= e
    end

=>  Invariants: ti = fi(x) for i = 0,...,d

    t0 := f(x0)
    t1 := f1(x0)
    ...
    td := fd(x0)
    repeat
        print(t0)
        t0 +:= t1
        t1 +:= t2
        ...
        td-1 +:= td      -- d additions and 0 products
    end

Babbage's Difference Engine was designed to implement the main loop of
Briggs's method.
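A small Python sketch of Briggs's tabulation may help (the names are mine). The starting values t0,...,td are established by evaluating f at d+1 equally spaced points and taking repeated differences; the main loop then uses only additions. The method of antiquity in EXAMPLE 2 is the special case d = 1:

```python
def horner(coeffs, x):
    """Evaluate a0 + a1*x + ... + ad*x**d by Horner's rule."""
    acc = 0
    for a in reversed(coeffs):
        acc = acc * x + a
    return acc

def tabulate(coeffs, x0, e, n):
    """Return [f(x0), f(x0+e), ..., f(x0+(n-1)*e)] by Briggs's method.

    Invariants maintained: t[i] = fi(x) for i = 0,...,d, where f0 = f
    and fi(x) = fi-1(x+e) - fi-1(x).
    """
    d = len(coeffs) - 1
    # Establish the invariants: evaluate f at d+1 equally spaced points,
    # then difference repeatedly; the leading element of each difference
    # row is the starting value of t[i].
    vals = [horner(coeffs, x0 + k * e) for k in range(d + 1)]
    t = []
    for _ in range(d + 1):
        t.append(vals[0])
        vals = [b - a for a, b in zip(vals, vals[1:])]

    table = []
    for _ in range(n):
        table.append(t[0])
        for i in range(d):          # d additions, 0 products
            t[i] += t[i + 1]
        # t[d] is the constant fd(x); it never changes
    return table
```

With integer coefficients, spacing, and start point, the loop is exact (no rounding), which is part of what made the method attractive for hand and mechanical computation.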
EXAMPLE 4 (Spreadsheets)

Consider the compound interest formula, p * (1+i)**y, which gives the value
of an initial principal p after y years at interest rate i. If we want to
create a spreadsheet with columns corresponding to principals p + j*delta,
j = 0,1,...,m, and rows corresponding to years y = 0,1,...,n, then we can
evaluate this (m+1) x (n+1) matrix efficiently by maintaining certain
crucial invariants.

1. column computation - invariants determined by subexpression decomposition

 . invariant 1: t1 = 1 + i
     established by assignment t1 = 1 + i
     maintained with no work
     formula is simplified to p * t1**y

 . invariant 2: t2 = t1**y
     established by assignment t2 = t1
     maintained when y is incremented by assignment t2 = t2 * t1
       (based on the law of exponents, t1**n = (t1**(n-1)) * t1)
     formula is simplified to p * t2

 . invariant 3: t3 = p * t2
     established by assignment t3 = p * t1 (since t2 = t1 initially)
     maintained when t2 is modified by assignment t3 = t3 * t1
       (based on the associative law of products)
     formula is simplified to t3

Observe that only one product is needed to compute each new column entry t3,
because maintenance of t2 is not needed at all.

2. row computation from a given column - based on the same idea

 . invariant 4: t4 = delta * t2
     established by assignment t4 = delta * t2
     maintained with no work (t2 is fixed within a row)
     a new row entry t3 is maintained by assignment t3 = t3 + t4
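A Python sketch of the whole table under these invariants (names mine). Two deviations from the notes, both hedged: I start the exponent at y = 0, so the maintained quantities begin at t3 = p and t4 = delta; and I maintain t4 directly by t4 = t4 * t1 at each new row, which makes t2 unnecessary, in the spirit of the observation above. Each entry then costs one addition (along a row) or one product (down the first column):

```python
def interest_table(p, i, delta, m, n):
    """table[y][j] = (p + j*delta) * (1+i)**y  for y = 0..n, j = 0..m.

    Invariants across iterations:
      t1 = 1 + i                (never changes)
      t3 = p * t1**y            (first entry of row y)
      t4 = delta * t1**y        (step between adjacent entries of row y)
    """
    t1 = 1 + i
    t3 = p
    t4 = delta
    table = []
    for y in range(n + 1):
        row = [t3]
        entry = t3
        for j in range(m):
            entry += t4         # one addition per row entry
            row.append(entry)
        table.append(row)
        t3 *= t1                # one product per year: maintain t3
        t4 *= t1                # one product per year: maintain t4
    return table
```

So the whole (m+1) x (n+1) table costs about m*n additions plus 2n products, versus an exponentiation per cell if each entry were computed from the formula directly.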