Title: Genomics via Optical Mapping IV: Sequence Validation via Optical Map Matching
Author(s): Antoniotti, Marco; Anantharaman, Thomas; Paxia, Salvatore; Mishra, Bud
This paper describes the unerlying mathematical model and the dynamic programming algorithm technique for the valicdation of a (DNA) sequence against a (DNA) map. The sequence can be obtained from a variety of sources (r,g, GenBAnk, Sanger's Lab, or Celera P.E.) and it is assumed to be written out as a string of nucleotides. The map is an ordered restriction map obtained through an optical mapping process and is augmented with statistical information which will ne used to place (or not) the sequence in the genome.
Our approach has many other applications beyond validation: e.g. map-based sequence assembly, phasing sequence contigs, detecting and closing gaps and annotation of partially sequenced genomes to find open reading frames, genes and synteny groups.
We tested our system by checking various maps against publicly available sequence data for Plasmodium falciparum.
Title: Knowledge Discovery in Databases for Intrusion Detection, Disease Classification and Beyond
Candidate: Berger, Gideon
Advisor(s): Mishra, Bud
As the number of networked computers grows and the amount of sensitive information available on them grows as well there is an increasing need to ensure the security of these systems. The security of computer networks is not a new issue. We have dealt with the need for security for a long time with such measures as passwords and encryption. These will always provide an important initial line of defense. However, given a clever and malicious individual these defenses can often be circumvented. Intrusion detection is therefore needed as another way to protect computer systems. This thesis describes a novel three stage algorithm for building classification models in the presence of non-stationary, temporal, high dimensional data, in general, and for detecting network intrusion detections, in particular. Given a set of training data records the algorithm begins by identifying "interesting'' temporal patterns in this data using a modal logic. This approach is distinguished from other work in this area where frequent patterns are identified. We show that when frequency is replaced by our measure of "interestingness'' the problem of finding temporal patterns is NP-complete. We then offer an efficient heuristic approach that has proven effective in experiments. Having identified interesting patterns, these patterns then become the predictor variables in the construction of a Multivariate Adaptive Regression Splines (MARS) model. This approach will be justified, after surveying other methods for solving the classification problem, by its ability to capture complex nonlinear relationships between the predictor and response variables which is comparable to other heuristic approaches such as neural networks and classification trees, while offering improved computational properties such as rapid convergence and interpret-ability. After considering a variety of approaches to the problems of over-fitting which is inherent when modeling high dimensional data and non-stationarity, we describe our approach to addressing these issues through the use of truncated Stein shrinkage. This approach is motivated by showing the inadmissibility of the maximum likelihood estimator (MLE) in the high dimensional (dimension >= 3) data. We then discuss the application of our approach as participants in the 1999 DARPA Intrusion Detection Evaluation where we were able to exhibit the benefits of our approach. Finally, we suggest another area of research where we believe that our work would meet with similar success, namely, the area of disease classification.
Title: DisCo: A Distribution Infrastructure for Securely Deploying Decomposable Services in Partly Trusted Environments
Author(s): Freudenthal, Eric; Keenan, Edward; Pesin, Tracy; Port, Lawrence; Karamcheti, Vijay
The growing popularity of network-based services and peer-to-peer networks has resulted in situations where components of a distributed application often need to execute in environments that are only partly trusted by the application's owner. Such deployment into partial or unstable trust environments exacerbates the classical problems of distributing decomposable services: authentication and access control, trust management, secure communication, code distribution and installation, and process rights management. Unfortunately, the application developer's burden of coping with these latter issues often dominates the benefits of service distribution. The DisCo infrastructure is specifically targeted to the development of systems and services deployed into coalition environments: networks of users and hosts administered by multiple authorities with changing trust relationships. The DisCo infrastructure provides application-neutral support for the classical problems of distributed services, thereby relieving the developer of the burden of independently managing these features. DisCo also includes support for continuously monitoring established connections, enabling corrective action from an application to cope with changing trust relationships. Our experience with building a secure video distribution service using the DisCo toolkit indicates that the latter permits distributed secure deployment into a partly trusted environment with minimal application developer effort, affording the advantages of natural expression and convenient deployment without compromising on efficiency.
Title: dRBAC: Distributed Role-based Access Control for Dynamic Environments
Author(s): Freudenthal, Eric; Pesin, Tracy; Port, Lawrence; Keenan, Edward; Karamcheti, Vijay
Distributed Role-Based Access Control (dRBAC) is a scalable, decentralized trust-management and access-control mechanism for systems that span multiple administrative domains. dRBAC represents controlled actions in terms of roles , which are defined within the trust domain of one entity and can be transitively delegated to other roles within a different trust domain. dRBAC utilizes PKI to identify all entities engaged in trust-sensitive operations and to validate delegation certificates. The mapping of roles to authorized name spaces obviates the need to identify additional policy roots. dRBAC distinguishes itself from previous trust management and role-based access control approaches in its support for three features: (1) third-party delegations , which improve expressiveness by allowing an entity to delegate roles outside its namespace when authorized by an explicit delegation of assignment ; (2) valued attributes , which modulate transferred access rights via mechanisms that assign and manipulate numerical values associated with roles; and (3) credential subscriptions , which enable continuous monitoring of established trust relationships using a pub/sub infrastructure to track the status of revocable credentials. This paper describes the dRBAC model, its scalable implementation using a graph-based model of credential discovery and validation, and its application in a larger security context.
Title: Credentialed Secure Communication "Switchboards"
Author(s): Freudenthal, Eric; Port, Lawrence; Keenan, Edward; Pesin, Tracy; Karamcheti, Vijay
Software development in distributed computation is complicated by the extra overhead of communication between connected, dispersed hosts in dynamically changing, multiple administrative domains. Many disparate technologies exist for trust management, authentication, secure communication channels, and service discovery, but composing all of these elements into a single system can outweigh principal development efforts. The NYU Disco Switchboard consolidates these connectivity issues into a single convenient, extensible architecture, providing an abstraction for managing secure, host-pair communication with connection monitoring facilities. Switchboard extends the secure authenticated communication channel abstraction provided by standard interfaces such as SSL/TLS with mechanisms to support trust management, key sharing, service discovery, and connection liveness and monitoring. We present an extensible architecture which is particularly useful in dynamically changing, distributed coalition environments. Applications that utilize Switchboard benefit from the availability of authentication, trust management, cryptography, and discovery, while retaining the simplicity of a common interface.
Title: Title: Automatic Deployment of Transcoding Components for Ubiquitous, Network-Aware Access to Internet Services
Author(s): Fu, Xiadong; Shi, Weisong; Karamacheti, Vijay
Advances in wireless communication together with the growing number of mobile end devices hold the potential of ubiquitous access to sophisticated internet services; however, such access must cope with an inherent mismatch between the low-bandwidth, limited-resource characteristics of mobile devices and the high-bandwidth expectations of many content-rich services. One promising way of bridging this gap is by deploying application-specific components on the path between the device and service, which perform operations such as protocol conversion and content transcoding. Although several researchers have proposed infrastructures allowing such deployment, most rely on static, hand-tuned deployment strategies restricting their applicability in dynamic situations.
In this paper, we present an automatic approach for the dynamic deployment of such transcoding components, which can additionally be dynamically reconfigured as required. Our approach relies on three components: (a) a high-level integrated type-based specification of components and network resources, essential for "late binding" components to paths; (b) an automatic path creation strategy that selects and maps components so as to optimize a global metric; and (c) system support for low-overhead path reconfiguration, consisting of both restrictions on component interfaces and protocols satisfying application semantic continuity requirements. We comprehensively evaluate the effectiveness of our approach over a range of network and end-device characteristics using both a web-access scenario where client performance is for reduced access time, and a streaming scenario where client preference is for increased throughput. Our results verify that (1) automatic path creation and reconfiguration is achievable and does in fact yield substantial performance benefits; and (2) that despite their flexibility, both path creation and reconfiguration can be supported with low run-time overhead.
Title: Algorithms for Rendering in Artistic Styles
Candidate: Hertzmann, Aaron
We describe new algorithms and tools for generating paintings, illustrations, and animation on a computer. These algorithms are designed to produce visually appealing and expressive images that look hand-painted or hand-drawn. In many contexts, painting and illustration have many advantages over photorealistic computer graphics, in aspects such as aesthetics, expression, and computational requirements. We explore three general strategies for non-photorealistic rendering:
First, we describe explicit procedures for placing brush strokes. We begin with a painterly image processing algorithm inspired by painting with real physical media. This method produces images with a much greater subjective impression of looking hand-made than do earlier methods. By adjusting algorithm parameters, a variety of styles can be generated, such as styles inspired by the Impressionists and the Expressionists. This method is then extended to processing video, as demonstrated by painterly animations and an interactive installation. We then present a new style of line art illustration for smooth 3D surfaces. This style is designed to clearly convey surface shape, even for surfaces without predefined material properties or hatching directions.
Next, we describe a new relaxation-based algorithm, in which we search for the painting that minimizes some energy function. In contrast to the first approach, we ideally only need to specify what we want, not how to directly compute it. The system allows as fine user control as desired: the user may interactively change the painting style, specify variations of style over an image, and/or add specific strokes to the painting.
Finally, we describe a new framework for processing images by example, called ``image analogies.'' Given an example of a painting or drawing (e.g. scanned from a hand-painted source), we can process new images with some approximation to the style of the painting. In contrast to the first two approaches, this allows us to design styles without requiring an explicit technical definition of the style. The image analogies framework supports many other novel image processing operations.
Title: Fast Solvers and Domain Decomposition Preconditioners for Spectral Element Discretizations of Problems in H(curl)
Author(s): Hientzsch, Bernhard
For problems with piecewise smooth solutions, spectral element methods hold great promise. They combine the exponential convergence of spectral methods with the geometric flexibility of finite elements. Spectral elements are well-established for scalar elliptic problems and problems of fluid dynamics, and recently the first methods for problems in H(curl) and H(div) were proposed. In this dissertation we study spectral element methods for a model problem. We first consider Maxwell's equation and derive the model problem in H(curl). Then we introduce anisotropic spectral Nédélec element discretizations with variable numerical integration for the model problem. We discuss their structure, and their convergence and approximation properties. We also obtain results on the norm of the Nédélec interpolants between Nédélec and Raviart-Thomas spaces of different degree, needed for the computation of the splitting constant for the domain decomposition preconditioner and the numerical analysis of nonlinear equations. We also prove a Friedrichs-like inequality for the model problem for the spectral case.
We present fast direct solvers for the model problem on separable domains, taking advantage of the tensor product discretization and fast diagonalization methods. We use those fast solvers as local solvers in domain decomposition methods for problems that are too large to be solved directly, or posed on non-separable domains, and use them to compute and subassemble the Schur complement system corresponding to the interface. We also apply them in the direct solution of the Schur complement system for general domains.
As an example for the domain decomposition methods that can be implemented with these tools, we introduce overlapping Schwarz methods, both one-level and two-level versions.
We extend the theory for overlapping Schwarz methods to the spectral Nédélec element case. We reduce the proof of the condition number estimate to three basic estimates, and present theoretical and numerical results on those estimates. The technique of the proof works in both the two-dimensional and three-dimensional case.
We also present numerical results for one-level and two-level methods in two dimensions.
Title: Region-based Register Allocation for EPIC Architectures
Candidate: Kim, Hansoo
Advisor(s): Palem, Krishna
Instruction-level parallelism(ILP) is a family of processor and compiler design techniques that speed up execution by allowing individual machine operations. Explicitly Parallel Instruction computing (EPIC) processors evolved in an attempt to achieve high levels of ILP without the hardware complexity. In EPIC processors most of the functions to extract ILP are performed by the compiler. To take advantage higher level of ILP of these architectures, the ILP compiler must use aggressive ILP technique. This opportunity for improved performance comes at the price of increased compilation time.
As the size of the compilation unit is limited, the compilation time can be reduced. But the limited scope of compilation may restrict the scope of optimization. As a result, the compiler may generate less efficient quality of code. Ideally, we want to get smaller compilation time and the same or better execution time as that obtained using the global approach.
In this thesis, we address the problem of the compilation time and execution performance trade-off in region-based compilation within the context of the key optimization of register allocation . We demonstrate that schemes designed for region-based allocation perform as well as or even better than schemes designed for global based allocation while having smaller compilation time. To achieve this goal, we propose several innovative techniques which form the core of this thesis.
We show considerable compilation time savings with comparable execution time performance by synthesizing our techniques in a region-based register allocation. We also explore the relation between the performance of the register allocation and the region size and quantify it. Our research shows selecting the right size of region has the important impact to the performance of register allocation. We proposed the concept of restructuring the regions based on register pressure and discussed how we can estimate the register pressure in order to improve compilation time while maintaining the execution time.
Title: Overlapping Schwarz Algorithms using Discontinuous Iterates for Poisson's Equation
Author(s): Kimn, Jung-Han
A new type of overlapping Schwarz methods, the overlapping Schwarz algorithms using discontinuous iterates is constructed from the classical overlapping Schwarz algorithm. It allows for discontinuities at each artificial interface. The new algorithm, for Poisson's equation, can be considered as an overlapping version of Lions' Robin iteration method for which little is known concerning the convergence. Since overlap improves the performance of the classical algorithms considerably, the existence of a uniform convergence factor is the fundamental question for our new algorithm.
The first part of this thesis concerns the formulation of the new algorithm. A variational formulation of the new algorithm is derived from the classical algorithms. The discontinuity of the iterates of the new algorithm is the fundamental distinction from the classical algorithms. To analyze this important property, we use a saddle-point approach. We show that the new algorithm can be interpreted as a block Gauss-Seidel method with dual and primal variables.
The second part of the thesis deals with algebraic properties of the new algorithm. We prove that the fractional steps of the new algorithm are nonsymmetric. The algebraic systems of the primal variables can be reduced to those of the dual variables. We analyze the structure of the dual formulation algebraically and analyze its numerical behavior.
The remaining part of the thesis concerns convergence theory and numerical results for the new algorithm. We first extend the classical convergence theory, without using Lagrange multipliers, in some limited cases. A new theory using Lagrange multiplier is then introduced and we find conditions for the existence of uniform convergence factors of the dual variables, which implies convergence of the primal variables, in the two overlapping subdomain case with any Robin boundary condition. Our condition shows a relation between the given conditions and the artificial interface condition. The numerical results for the general case with cross points are also presented. They indicate possible extensions of our results to this more general case.
Title: Dual-Primal FETI Methods for Three-dimensional Elliptic Problems with Heterogeneous Coefficients
Author(s): Klawonn, Axel; Widlund, Olof; Dryja, Maksymilian
In this paper, certain iterative substructuring methods with Lagrange multipliers are considered for elliptic problems in three dimensions. The algorithms belong to the family of dual--primal FETI methods which have recently been introduced and analyzed successfully for elliptic problems in the plane. The family of algorithms for three dimensions is extended and a full analysis is provided for the new algorithms. Particular attention is paid to finding algorithms with a small primal subspace since that subspace represents the only global part of the dual--primal preconditioner. It is shown that the condition numbers of several of the dual--primal FETI methods can be bounded polylogarithmically as a function of the dimension of the individual subregion problems and that the bounds are otherwise independent of the number of subdomains, the mesh size, and jumps in the coefficients. These results closely parallel those for other successful iterative substructuring methods of primal as well as dual type.
Title: Adversarial Reasoning: A Logical Approach for Computer Go
Candidate: Klinger, Tamir
Advisor(s): Davis, Ernest
Go is a board game with simple rules but complex strategy requiring ability in almost all aspects of human reasoning. A good Go player must be able to hypothesize moves and analyze their consequences; to judge which areas are relevant to the analysis at hand; to learn from successes and failures; to generalize that knowledge to other ``similar'' situations; and to make inferences from knowledge about a position.
Unlike computer chess, which has seen steady progress since Shannon's  and Turing's  original papers on the subject, progress on computer Go remains in relative infancy. In computer chess, minimax search with [IMAGE ] - [IMAGE ] pruning based on a simple evaluation function can beat a beginner handily. No such simple evaluation function is known for Go. To accurately evaluate a Go position requires knowledge of the life and death status of the points on the board. Since the player with the most live points at the end of the game wins, a small mistake in this analysis can be disastrous.
In this dissertation we describe the design, performance, and underlying logic of a knowledgebased program that solves life and death problems in the game of Go. Our algorithm applies life and death theory coupled with knowledge about which moves are reasonable for each relevant goal in that theory to restrict the search space to a tractable size. Our results show that simple depth-first search armed with a goal theory and heuristic move knowledge yields very positive results on standard life and death test problems - even without sophisticated move ordering heuristics.
In addition to a description of the program and its internals we present a modal logic useful for describing strategic theories in games and use it to give a life and death theory and to formally state the rules of Go. We also give an axiomatization for this logic using the modal [IMAGE ] calculus  and prove some basic theorems of the system.
Title: Machine Level Optimizations for High Level Languages
Candidate: Leung, Allen
Advisor(s): Palem, Krishna
Two machine instruction level compiler optimization problems are considered in this work.
The first problem is time-constrained instruction scheduling, i.e., finding optimal schedules for machine code in the presence of time constraints such as release-times and deadlines. These types of time constraints appear naturally in embedded applications, and also as a side effect of many other compiler optimization problems. While the general problem is NP-hard, we have developed a new algorithm which can optimally handle many P-time solvable sub-instances. In fact, we show that almost all previous algorithms in this related area can be seen as an instance of the priority computation scheme that we have developed. Our work extends and unifies many algorithmic results in classical deterministic scheduling theory related to release-times, deadlines and pipeline latencies.
The second problem that we investigate in this work is scalar optimizations in machine code. We present a new framework that utilizes static single assignment form (SSA) at the level of individual machine instructions. Complementing the framework, we have also developed new SSA construction algorithms which are faster than previous algorithms, and are very simple to implement.
Title: Exact Geometric Computation: Theory and Applications
Candidate: Li, Chen
Abstract:Exact Geometric Computation: Theory and Applications Abstract This dissertation explores the theory and applications of Exact Geometric Computation (EGC), a general approach to robust geometric computing. The contributions of this thesis are organized into three parts. A fundamental task in EGC is to support exact comparison of algebraic expressions. This leads to the problem of constructive root bounds for algebraic expressions. Such root bounds determine the worst-case complexity of exact comparisons. In the first part, we present a new constructive root bound which, compared to previous bounds, can give dramatically better performance in many common computations involving divisions and radical roots. We also improve the well-known degree-measure bound by exploiting the sharing of common sub-expressions. In the second part, we discuss the design and implementation of the Core Library, a C++ library which embraces the EGC approach to robust numerical and geometric computation. Our design emphasizes ease of use and facilitates the rapid development of robust geometric applications. It allows non-specialist programmers to add robustness into new or existing applications with little extra effort. A number of efficiency and implementation issues are investigated. Although focused on geometric computation, the EGC techniques and software we developed can be applied to other areas where it is critical to guarantee numerical precision. In the third part, we introduce a new randomized test for the vanishing of multivariate radical expressions. With this test, we develop a probabilistic approach to proving elementary geometry theorems about ruler-and-compass constructions. A probabilistic theorem prover based on this approach has been implemented using the Core Library. We present some empirical data.
Title: A Dual-Primal FETI Method for Incompressible Stokes Equations
Author(s): Li, Jing
In this paper, a dual-primal FETI method is developed for incompressible Stokes equation approximated by mixed finite elements with discontinuous pressures. The domain of the problem is decomposed into nonoverlapping subdomains, and the continuity of the velocity across the subdomain interface is enforced by introducing Lagrange multipliers. By a Schur complement procedure, solving the indefinite Stokes problem is reduced to solving a symmetric positive definite problem for the dual variables, i.e., the Lagrange multipliers. This dual problem is solved by a Krylov space method with a Dirichlet preconditioner. At each step of the iteration, both subdomain problems and a coarse problem on the course subdomain mesh are solved by a direct method. It is proved that the condition number of this preconditioned problem is independent of the number of subdomains and bounded from above by the product of the inverse of the inf-sup constant of the discrete problem and the square of the logarithm of the number of unknowns in the individual subdomain problems. Illustrative results are presented by solving a lid driven cavity problem.
Title: An On-Line Handwriting Recognizer with Fisher Matching, Hypotheses Propagation Network and Context Constraint Models
Candidate: Oh, Jong
Advisor(s): Geiger, Davi
We have developed an on-line handwriting recognition system. Our approach integrates local bottom-up constructs with a global top-down measure into a modular recognition engine. The bottom-up process uses local point features for hypothesizing character segmentations and the top-down part performs shape matching for evaluating the segmentations. The shape comparison, called Fisher segmental matching, is based on Fisher's linear discriminant analysis. The component character recognizer of the system uses two kinds of Fisher matching based on different representations and combines the information to form the multiple experts paradigm.
Along with an efficient ligature modeling, the segmentations and their character recognition scores are integrated into a recognition engine termed Hypotheses Propagation Network (HPN), which runs a variant of topological sort algorithm of graph search. The HPN improves on the conventional Hidden Markov Model and the Viterbi search by using the more robust mean-based scores for word level hypotheses and keeping multiple predecessors during the search.
We have also studied and implemented a geometric context modeling termed Visual Bigram Modeling that improves the accuracy of the system's performance by taking the geometric constraints into account, in which the component characters in a word can be formed in relation with the neighboring characters. The result is a shape-oriented system, robust with respect to local and temporal features, modular in construction and has a rich range of opportunities for further extensions.
Title: Continuous Model for Salient Shape Selection and Representation
Candidate: Pao, Hsing-Kuo (Kenneth)
Advisor(s): Geiger, Davi
We propose a new framework for shape representation and salient shape selection. The framework is considered as a low- to middle-level vision process. The framework can be applied to various topics, including figure/ground separation, searching of the shape axis, junction detection and illusory figure finding. The model construction is inspired by the Gestalt studies. They suggest that proximity, convexity, similarity, good continuation, closure, symmetry, etc, are useful for figure/ground separation and visual organization construction. First, we quantify those attributes for (completed or partial) shapes by our distributed systems. The shape will be evaluated and represented by those results. In particular, the shape convexity, rather than other shape attributes like the symmetry axis or size which were well-studied before, will be emphasized in our discussion. Our problem is proposed in a continuous manner. For the shape convexity, unlike the conventional mathematical definition, we are aimed at deriving a definition to describe a shape ``more convex'' or ``less convex'' than the other. To search the shape axis, more than a binary information telling a point on or off any axis, a continuous information will be obtained. We distinguish axes with ``stronger'' or ``weaker'' declarations. An Easy and natural scheme of pruning can be applied by such representation. For the junction detection, we do not assume any artificial threshold. Instead, the transition from low-curvature to high-curvature curves or curves with discontinuities will be shown by our representation. The model is based on a variational approach, provided by the minimization of the data fitting error as well as the neighborhood discrepancy. Two models will be proposed, the decay diffusion process and the orientation diffusion process.
Title: Title: Balancing Neumann-Neumann Methods for Incompressible Stokes Equations
Author(s): Pavarino, Luca; Widlund, Olof
This paper describes the unerlying mathematical model and the Balancing Neumann-Neumann methods are introduced and studied for incompressible Stokes equations discretized with mixed finite or spectral elements with discontinuous pressures. After decomposing the original domain of the problem into nonoverlapping subdomains, the interior unknowns, which are the interior velocity component and all except the constant pressure component, of each subdomain problem are implicitly eliminated. The resulting saddle point Schur complement is solved with a Krylov space method with a balancing Neumann-Neumann preconditioner based on the solution of a coarse Stokes problem with a few degrees of freedom per subdomain and on the solution of local Stokes problems with natural %Neumann velocity and essential boundary conditions on the subdomains. This preconditioner is of hybrid form in which the coarse problem is treated multiplicatively while the local problems are treated additively. The condition number of the preconditioned operator is independent of the number of subdomains and is bounded from above by the product of the square of the logarithm of the local number of unknowns in each subdomain and the inverse of the inf-sup constants of the discrete problem and of the coarse subproblem. Numerical results show that the method is quite fast; they are also fully consistent with the theory.
Title: Modeling Object Characteristics of Dynamic Web Content
Author(s): Shi, Weisong; Collins, Eli; Karamcheti, Vijay
Requests for dynamic and personalized content increasingly dominate current-day Internet traffic, driven both by a growth in dynamic web services and a ``trickle-down'' effect stemming from the effectiveness of caches and content-distribution networks at serving static content. To efficiently serve this trend, several server-side and cache-side techniques have recently been proposed. Although such techniques, which exploit different forms of reuse at the sub-document level, appear promising, a significant impediment to their widespread deployment is (1) the absence of good models describing characteristics of dynamic web content, and (2) the lack of effective synthetic content generators, which reduce the effort involved in verifying the effectiveness of a proposed solution.
This paper addresses both of these shortcomings. Its primary contribution is a set of models that capture the characteristics of dynamic content both in terms of independent parameters such as the distributions of object sizes and their freshness times, as well as derived parameters such as content reusability across time and linked documents. These models are derived from an analysis of the content from six representative news and e-commerce sites, using both size-based and level-based splitting techniques to infer document objects. A secondary contribution is a Tomcat-based dynamic content emulator, which uses these models to generate ESI-based dynamic content and serve requests for whole document and separate objects. To validate both the models and the design of the content emulator, we compare the bandwidth requirements seen by an idealized cache simulator that is driven by both the real trace and emulated content. Our simulation results verify that the output of the content emulator effectively and efficiently models real content.
Title: Language Support for Program Generation Reasoning, Implementation, and Applications
Candidate: Yang, Zhe
Advisor(s): Danvy, Olivier; Goldberg, Benjamin
This dissertation develops programming languages and associated techniques for sound and efficient implementations of algorithms for program generation.
First, we develop a framework for practical two-level languages. In this framework, we demonstrate that two-level languages are not only a good tool for describing program-generation algorithms, but a good tool for reasoning about them and implementing them as well. We pinpoint several general properties of two-level languages that capture common proof obligations of program-generation algorithms:
In addition, to justify concrete implementations, we use a native embedding of a two-level language into a one-level language.
We present two-level languages with these properties both for a call-by-name object language and for a call-by-value object language with computational effects, and demonstrate them through two classes of non-trivial applications: one-pass transformations into continuation-passing style and type-directed partial evaluation for call-by-name and for call-by-value.
Next, to facilitate implementations, we develop several general approaches to programming with type-indexed families of values within the popular Hindley-Milner type system. Type-indexed families provide a form of type dependency, which is employed by many algorithms that generate typed programs, but is absent from mainstream languages. Our approaches are based on type encodings, so that they are type safe. We demonstrate and compare them through a host of examples, including type-directed partial evaluation and printf-style formatting.
Finally, upon the two-level framework and type-encoding techniques, we recast a joint work with Bernd Grobauer, where we formally derived a suitable self application for type-directed partial evaluation, and achieved automatic compiler generation.
Title: Enforcing Resource Sharing Agreements among Distributed Server Cluster
Author(s): Zhao, Tao; Karamcheti, Vijay
Future scalable, high throughput, and high performance applications are likely to execute on platforms constructed by clustering multiple autonomous distributed servers, with resource access governed by agreements between the owners and users of these servers. As an example, application service providers (ASPs) can pool their resources together according to pre-specified sharing agreements to provide better services to their customers. Such systems raise several new resource management challenges, chief amongst which is the enforcement of agreements to ensure that, despite the distributed nature of both requests and resources, user requests only receive a predetermined share of the aggregate resource and that the resources of a participant are not misused. Current solutions only enforce such agreements at a coarse granularity and in a centralized fashion, limiting their applicability for general workloads.
This paper presents an architecture for the distributed enforcement of resource sharing agreements. Our approach exploits a uniform application-independent representation of agreements, and combines it with efficient time-window based coordinated queuing algorithms running on multiple nodes. We have successfully implemented this general strategy in two different network layers: a layer-7 HTTP redirector and a layer-4 packet redirector, which redirect connection requests from distributed clients to a cluster of distributed servers. Our measurements of both implementations verify that our approach is general and effective: different client groups receive service commensurate with their agreements.