Sarah: the current version of the proof is: http://cs.nyu.edu/cs/faculty/shasha/papers/tappJournal.pdf with the full zip file here: http://cs.nyu.edu/cs/faculty/shasha/papers/forSarah.zip That zip file has the climbing version of the algorithm. Here is the descending version. \paragraph{Goal of Algorithm.} The LiquidClimber algorithm ends with a configuration that is lexicographically maximum (e.g. maximize packages P1, P2 in that order) and that satisfies the query constraints on versions (e.g. only use versions 4, 5, and 6 of package P7). The query constraints are used to form sourcemap but play no other role in the algorithm. The order of packages to maximize are encoded in todolist. The variable {\em memo} keeps tracks of calls that have failed, e.g. Pi.vi calling Pj.vj has failed. If a configuration contains a pair of package-versions that have failed, then that configuration will fail so will not need to be executed. \begin{small} \begin{verbatim} current := package version pairs in the initial configuration (we know this works) todolist := packages requiring maximization in descending order of priority constraints := some query constraints on versions of various packages sourcemap := for each package the versions that exist and satisfy query constraints memo := calls for some Pi.vi to some Pj.vj that fail. Initially this is empty LiquidClimber() for each package p in todolist in descending order of priority update current to the highest version of p that works and that satisfies the constraints # use git binary on sourcemap if current has less than highest version of p in sourcemap then versionstodo := find all greater versions of p or major releases in DESCENDING order of version number within sourcemap for each v in versionstodo in order temp := current with p set to v ret := TryToMakeWork(p, temp) if ret is not null then # we've had success current:= ret # adjust temp to make it work # p is the package we're trying to push TryToMakeWork(p,temp) execute temp unless we know from memo that temp will fail if the execution of temp fails Find the first call, say from Pi.vi' to Pj.vj', that fails Record in memo that Pi.vi' calling Pj.vj' fails keepfixed := {x | x == p or a package earlier than p in the todolist} if Pi in keepfixed then possible(Pi) := (vi') else possible(Pi) := (all versions of Pi in sourcemap sorted from greatest to least) end if if Pj in keepfixed then possible(Pj) := {vj'} else possible(Pj) := (all versions of Pj in sourcemap sorted from greatest to least) end if for each untried configuration c that can be constructed from the cross-product of possible(Pi) and possible(Pj) such that c does not include some pair (Pk.vk, Pm.vm) in memo ret := TryToMakeWork(p,c) if (ret is not null) then return ret end for return null else return temp # this configuration works \end{verbatim} \end{small} \textbf{Theorem}: The algorithm finds the lexicographically maximum configuration that works. \textbf{Proof sketch}: The query constraints are taken into account by the construction of sourcemap. Todolist is ordered from most important package to least. Suppose that there is a another configuration C’’=P1.v1'', P2.v2'', ...., Pn.vn'' that is lexicographically greater and that also works. We prove that P1.v1'', P2.v2'', ...., Pn.vn'' must have been tried by our algorithm. Let Px be the first package in Q for which Px.Vx’’ $>$ Px.Vx’. By construction of versionstodo, Px.vx’’ was tried by the algorithm. If TryToMakeWork did not succeed to make it work, there was some call from Pi.vi to Pj.vj that failed. But TryToMakeWork will try all combinations of versions between Pi and Pj as long as those combinations aren't in memo and don't affect previously optimized packages or the package that is currently pushed (for which other versions will be tried in the for each loop LiquidClimber). The recursive call continues this exploration in a depth first manner. Done. \textbf{Theorem}: If there are a maximum of M versions per package and n packages, then the number of tests $ \le n \times (n-1) \times M \times M $ \textbf{Proof}: The number of possible calling pairs $\leq n \times (n-1)$. Equality holds only in the case that every package calls every other package. For each package Pi that calls package Pj, in the worst case, we might have to test each version of Pi calling each version of Pj. Each time such a call of the form Pi.vi calling Pj.vj causes an error, we record that fact in memo. Once recorded, no configuration that includes Pi.vi and Pj.vj will be executed. So, for every such pair Pi and Pj, there need be at most $M \times M$ executions. Done. We have started here with the most minimal assumption. Stronger assumptions could be considered (e.g., if Pi.vi calls Pj.vj and fails, then any version greater than vi will fail on any version less than vj). Considering such stronger assumptions is part of our ongoing work. \subsection{LiquidClimber -- optimized } {\em Sarah: In this pass, we need to make our proof sketches solid proofs in this section and elsewhere} The previous section assumed that there was no particular semantics to version order. That of course is false in practice. In real software, if Pi.vi calls Pj.vj and fails then Pj.vj is missing a function that Pi.vi needs. Functions are ofren added to packages (Pj in this case) as versions increase, but functions are normally not removed. This model is encoded in the CUDF dependencies of the form Pi version 8 depends on Pj version 10 or greater.\footnote{ NB. Consider a dependency set D = that Pi.1 calls and is successful on Pj.1 and Pi.3 calls and is successful on Pj.3, but there are no other compatibilities. D would not satisfy the CUDF model, because we wouldn't have Pi.1 being able to call Pj.3 successfully. } Further we assume that newer versions of a caller will depend on newer versions of a callee. To continue our example from the previous paragraph, Pi version 9 depends on Pj version 13 or greater is more reasonable than Pi version 9 depends on Pj version 7 or greater. We incorporate all this in the following assumption. Dependency monotonicity: Suppose that Pi.vi succeeds on a call to Pj.vj. If Pi.vi' depends on vj' or greater of Pj and vi $<$ vi', then Pi.vi will succeed on a call to Pj.y for all y $\ge$ vj'. \textbf{Lemma 1}: If Pi.vi fails on a call to Pj.vj, then for all x $\ge$ vi, y $\le$ vj, Pi.x will fail on a call to Pj.y. \textbf{Proof}: Consider x $>$ vi, for Pi.x to succeed on a call to Pj.y, there must be an x $\ge$ x' $>$ vi and a y' $\le$ y such that Pi.x' depends on y' or greater. But in that case by dependency monotonicity and the fact that x $\ge$ x' $>$ vi and y' $\le$ y $\le$ vj, Pi.vi will succeed on Pj.vj. Contradiction. Done. Our algorithm doesn't know whether or not these dependencies hold. However, if the (hidden) dependencies satisfy dependency monotonicity, the following will achieve a lexicographically maximum configuration. Algorithm: StrongMono LiquidClimber {\em Sarah: the simplest way to state the algorithm is as a variant of the previous one with a few changes marked below as CHANGED} \begin{small} \begin{verbatim} current := package version pairs in the initial configuration todolist := packages requiring maximization in descending order of priority constraints := some query constraints on versions of various packages sourcemap := for each package the versions that exist and satisfy query constraints memo := calls for some Pi.vi to some Pj.vj that fail. Initially this is empty LiquidClimber() for each package p in todolist in descending order of priority update current to the highest version of p that works # use git binary on sourcemap if current has less than highest version of p in sourcemap then versionstodo := find all greater versions of p or major releases in DESCENDING order of version number within sourcemap for each v in versionstodo in order temp := current with p set to v ret := TryToMakeWork(p, temp) if ret is not null then # we've had success current:= ret # adjust temp to make it work # p is the package we're trying to push # Knowing that a version will fail # makes use of strong monotonicity: # if Pk.vk calling Pm.vm fails then for all x >= vk and y <= vm, # Pk.x will fail on Pm.y. TryToMakeWork(p,temp) execute temp unless we know from memo and strong monotonicity that temp will fail # CHANGED if the execution of temp fails Find the first call, say from Pi.vi' to Pj.vj', that fails Record in memo that Pi.vi' calling Pj.vj' fails keepfixed := {x | x == p or a package earlier than p in the todolist} if Pi in keepfixed then possible(Pi) := (vi') else possible(Pi) := (all versions of Pi in sourcemap that are less than or equal to vi sorted from greatest to least) # CHANGED end if if Pj in keepfixed then possible(Pj) := {vj'} else possible(Pj) := (all versions of Pj in sourcemap that are less than or equal to vj sorted from greatest to least) # CHANGED end if for each untried configuration c that can be constructed from the cross-product of possible(Pi) and possible(Pj) such that c does not include some pair (Pk.vk, Pm.vm) that strong monotonoicity would eliminate based on memo # CHANGED ret := TryToMakeWork(p,c) if (ret is not null) then return ret end for return null else return temp # this configuration works \end{verbatim} \end{small} \textbf{Complexity}: Every execution advances the version of at least one package (either Pi or Pj) so complexity is linear with the number of versions but still quadratic in the number of packages. (Dennis thinks) \textbf{Lemma 2}: Under the two assumptions of local/global and strong monotonicity, any configuration that is ignored will fail. \textbf{Proof Sketch}: The main thing to justify is that the set of versions that are still tried can be ignored. Let us say that Pi.vi has called Pj.vj and fails. Then for vi' $\ge$ vi, Pi.vi' will also fail on Pj.y for y $\le$ vj, by lemma 1. Done. \textbf{Theorem}: Under the two assumptions of local/global and strong monotonicity, the algorithm finds the lexicographically maximum configuration that works. \textbf{Proof Sketch}: LiquidClimber will go in priority order through the packages to be maximized. Any configuration that is skipped will fail by lemma 2. Overall Algorithm Start with the strong monotonicity algorithm and record all successful executions as well as all unsuccessful pairs (not the inferred ones, just the ones directly tried). Then if in the order of the todolist, we have achieved the maximum out of the first L $\le$ K, take those as fixed and call the earlier version of LiquidClimber. So, if strong monotonicity holds, we have a linear algorithm, but even if it doesn't, we have an algorithm that works.