Portrait

     
  Azer Bestavros
Professor
Department of Computer Science
College of Arts & Sciences
Boston University
 

Office: 111 Cummington Street, MCS-276, Boston, MA 02215
Tel: 617.353.9726 / Fax: 617.353.6457 / Email: best @bu.edu

 

TEACHING
[CS350] [CS697] [DWE]

RESEARCH
[Papers] [Talks] [Projects] [Students]

GROUPS
[Wing] [RTCC]  [NRG] [iBench] [Sensorium]

PROFESSIONAL
[Vita]
[Service] 

PERSONAL
 [Coptic] [Egypt] [Facebook] [Salon]

 

(C) Azer Bestavros

 

Active Research Projects


Past Research Projects


Supervised PhD Theses


MASS: Diagnosis and Control of Network Variability by MASS Servers

    PIs: Azer Bestavros, John Byers, and Mark Crovella
    Participants: Khaled Harfoush, Shudong Jin, Jun Liu.

    Massively Accessed Scalable Servers (a.k.a. Mass servers) are popular Internet servers, which produce a substantial fraction of the traffic flowing through the network. Mass servers are uniquely positioned (a) to observe and diagnose network conditions by tracking the flows that they generate, and (b) to manage and control network resources by better regulating and scheduling the traffic they inject into the network.

    The MASS Research Group in the Department of Computer Science at Boston University is pursuing a number of projects to achieve these goals over a wide spectrum of time scales. Over shorter time scales, a Mass server can minimize packet loss by smoothing the (otherwise bursty) process of injecting packets into the network. Over longer time scales, a Mass server can perform aggregate congestion management by bundling like connections to avoid the burstiness that results from competition among flows.

    A key component of the Mass Servers Research Group work is implementation and prototyping. To that end, members of the group are developing three key elements of Mass Servers, namely: (1) Beacon: A collection of network measurement and diagnosis tools, (2) TurnPike: A collection of network management and control protocols and services, and (3) BackBay: A platform that supports the integration of Beacon and TurnPike functionality into a high-performance web server architecture.

    More information on this project is available from the MASS Project Home Page


ITM: Internet Traffic Managers

    PIs: Ibrahim Matta, Azer Bestavros, and Mark Crovella
    Participants:  Mina Guirguis, Liang Guo

    The scalability of the Internet hinges on our ability to tame the unpredictability associated with its open architecture. This project investigates the development of basic control strategies for reducing traffic burstiness and improving network utilization. Such strategies can be applied through Traffic Managers (TMs)---special network elements strategically placed in the Internet (e.g., in front of clients/servers or at exchange/peering points between administrative domains). We believe that the incorporation of such control functionalities will be key to the ability of the network infrastructure to sustain its own growth and to nurture the Quality-of-Service (QoS) needs of emerging applications.

    More information on this project is available from the ITM Project Home Page


Commonwealth: Scalable Web Server Architectures And Protocols

    PIs: Azer Bestavros and Mark Crovella
    Participants: Paul Barford, Jun Liu, Adam Bradley, David Martin (Phd'1999), Robert Frangioso, Jorge Londono (MA'1999), Luis Aversa (MA'1998), Naomi Katagai (MA'1998)

    The phenomenal growth of the WWW is imposing considerable strain on Internet resources and Web servers, prompting numerous concerns about the Web's continued viability. The success of high-performance Web servers in alleviating these performance problems is ultimately limited, unless Web services are designed to be inherently scalable. The Commonwealth project is designing, implementing, and evaluating a prototype architecture and a set of associated protocols for scalable Web services. The Commonwealth architecture for hosting scalable Web services allows scalability through (1) parallel processing on tightly-coupled nodes within a Web site, and (2) load distribution across loosely-coupled Web sites. Commonwealth's underlying philosophy is to achieve a WEALTH of performance through the use of COMMON components, and to do so along an incremental upgrade path.

    The Commonwealth project is pursuing basic research in four main areas: (1) Data placement (caching, prefetching, and dissemination protocols); (2) Resource management (scheduling, load balancing, and admission control protocols); (3) Networking (routing, packet redirection, and connection migration) and (4) Middleware Services (replication and resource discovery).

    These problems are being addressed with particular attention to the issue of high workload variability, which---in previous work by the Oceans Group at Boston University---was shown to pose significant problems for the design of Internet-based systems.

    More information on this project is available from The CommonWealth Project Home Page


ATCP: Aggregating TCP State Across Multiple Connections
 

    PI: Azer Bestavros
    Participants: Olivier Hartman, Khaled Harfoush, Thomas Gschwendtner

    In this project we investigate the design of a stateful TCP stack that allows multiple connections that share a common congested path to share the same state. In that respect we investigate two possible scenarios. In the first, we investigated an extension of the TCP stack that allows a sequence of TCP connections between the same machines to share the congestion window. Our Linux implementation of this scenario shows significant improvement in performance, particularly when the individual connections are short-lived. Such a behavior is common on the web, due to the nature of the HTTP protocol and the distribution of file sizes. In the second scenario, we investigate an extension of the TCP stack that allows a set of concurrent TCP connections between the same machines to be controlled in an aggregate manner. Work on this scenario is still on-going, but initial results suggest improved performance.


DCR: Distributed Connection Routing for Scalable Internet Servers

    PIs: Azer Bestavros and Mark Crovella
    Participants: Jun Liu, David Martin (PhD'1999), Jorge Londono (MA'1999), Luis Aversa (MA'1998)

    To construct high performance Web servers, system builders are increasingly turning to distributed designs. An important challenge that arises in distributed Web servers is the need to direct incoming connections to individual hosts. Previous methods for connection routing have employed a centralized node which handles all incoming requests. In this project we investigate the use of fully-distributed, all-software techniques for connection routing and distribution amongst cluster hosts. In that respect we have implemented and tested two protocols: DPR and CM. Distributed Packet Rewriting (DPR) operates at the TCP/IP layer, allowing connection routing using packet rewriting. Connection Migration (CM) is a kernel functionality presented to the application layer thorugh an API that allows applications (e.g. web servers) to "migrate" TCP connections. Both of these techniques are elements of the Commonwealth Scalable Web Server Architecture project.


WebHint: Client/Server-based WWW Prefetching Protocols

    PI: Azer Bestavros
    Participants: Carlos Cunha (PhD'1998), Martin Mroz (MA'1998), Chau Anh Nguyen (MA'1998)

    In this project we investigate the potential performance improvements possible through prefetching of Web documents. In that respect we investigate two possible policies. The first policy is server-based. It relies on analysis of reference patterns at the server, constructing Markov models (e.g. first-order and second-order, etc.) to capture the traversal and embedding dependencies that exist between documents, and using these Markov models to speculatively service documents, or simply provide prefetching hints to clients. The second policy is client-based. It relies on analysis of the client's logs, constructing Markov models (e.g. first-order and second-order, etc.) to capture the traversal and embedding dependencies that exist between documents, and using these Markov models to aggressively prefetch documents.

    On the server side, we used extensive trace simulations based on the logs of our departmental HTTP server to show that both server load and service time could be reduced considerably, if speculative service is used. This is above and beyond what is currently achievable using client-side caching, prefetching, and server-side dissemination.

    On the client side, we used extensive trace simulations based on logs we collected from over 400 users in our departmental lab to show that service time could be reduced, if client-initiated prefetching is used. This reduction, however, is quite dependent on what we term as the user "surfing" or "working" behavior, which turns out to be quite predictable.

    As part of this project we have implemented server-assisted prefetcing by modifying the NCSA HTTP server to provide hints based on Markov models constructed for the server's local documents, and by modifying the NCSA Mosaic browser to interpret these "hints" and perform the necessary prefetching. Also, as part of this project we have implemented client-initiated prefetching by modifying the NCSA Mosaic browser to construct Markov models of the user access patterns and use these to perform the necessary prefetching. We are working on combining our server-assisted prefetching and client-initiated prefetching in a single framework that allows switching between the two regimes appropriately.


WebSeed: Content Replication Protocols for the WWW

    PI: Azer Bestavros
    Participants: Carlos Cunha (PhD'1998), Shudong Jin

    Research on replication techniques to reduce traffic and minimize the latency of information retrieval in a distributed system has concentrated on client-based caching, whereby recently/frequently accessed information is cached at a client (or at a proxy thereof) in anticipation of future accesses. We believe that such myopic solutions---focussing exclusively on a particular client or set of clients---are likely to have a limited impact. Instead, we offer a solution that allows the replication of information to be done on a global supply/demand basis.

    The primary goal of this project is to develop data dissemination mechanisms that allow information to propagate from its producers to servers that are closer to its consumers. This idea relies on a particular future model of the Internet where in addition to clients and servers, service proxies offer their storage and bandwidth capacities ``for rent''.

    Our dissemination techniques reduce network traffic and balance load amongst servers by exploiting geographic and temporal locality of reference properties exhibited in client access patterns to decide on replication and placement strategies. The level of dissemination depends on the relative popularity of documents, and on the expected reduction in traffic that results from such dissemination.


AFTER: Adaptive Forward Timely Erasure Recovery in High-Speed Networks
 

    PI: Azer Bestavros
    Participants: Gitae Kim (PhD'1998), Jaehee Yoon

    A variety of real-time control protocols have been developed to cope with communication failures (namely, an entire or partial packet loss) due to the limitations of network resources in distributed computing systems. These protocols make use of temporal and/or spatial redundancies to meet preset reliability and synchronization constraints---often at an expensive cost, resulting from inefficient or inflexible resource usage.

    Protocols that employ spatial redundancy, such as the techniques based on bandwidth-reservation and Forward Error correction (FEC), fail to boost data reliabilility when bit-corruptions hit a packet (or cell) in such a way that the protocols cannot mask such corruptions. Furthermore, the protocols introduced so far in the literature are based on static, proactive schemes, whereby the level of redundancy is fixed at the time of protocol initiation or throughout the protocol's lifespan. In order to guarantee the quality of service (QoS) requested by an application, they need to reserve enough redundancy in preparation for a worst-case scenario that happens rarely, if ever. In particular, these methods lack the ability to adapt to the ever-varying data rates and/or communication failure rates, making them inefficient in their use of network bandwidth.

    Protocols that employ temporal redundancy for masking communication failures provide a high degree of reliability, yet tend to suffer from high packet delays, as well as large jitter. These protocols use reactive schemes that rely on packet retransmissions when failures are detected. Automatic Repeat Request (ARQ) techniques uniquely belong to this group of protocols. Because of their high latencies due to recovery time, ARQ methods are not suitable for real-time applications that require stringent delay bounds and a high degree of data integrity. Examples of such applications include on-line financial data feeds and group simulations.

    In this project, we introduce the notion of Adaptive Forward Timely Erasure Recovery (AFTER), which allows the use of a feedback-based dynamic redundancy control scheme to provide an effective transport framework, from which a variety of transport applications can benefit. AFTER uses efficient encoding techniques (e.g. Reed Solomon or Tornado codes) that make dynamic redundancy control possible. When it is implemented for reliable data transfers, AFTER's dynamic control scheme allows us to optimize the use of communication bandwidth by dynamically balancing the use of spatial redundancy (through FEC) and temporal redundancy (through retransmission) to achieve the level of packet delay and delay variance preset by the application's requested QOS. AFTER can be implemented for both reliable and unreliable data transports, under various network environments ranging from highly reliable high-speed communication channels, such as Constant Bit Rate (CBR) services in ATM network, to low-cost best-effort data communication channels, such as conventional packet-switched networks. To that end, we have proposed and implemented an AAL-layer Lazy Packet Discard (LPD) Policy for TCP/IP over ATM. Also, we are investigating techniques for using AFTER as a mechanism for reliable multicast.


SRMS: Statistical Rate Monotonic Scheduling for Real-time Computing and Communication Systems

    PI: Azer Bestavros
    Participants: Alia Atlas (PhD'1999), Adrian Prezioso (MA'1999)

    Rate Monotonic Scheduling (RMS) is a preemptive static priority-based scheduling algorithm for real-time periodic task systems. Using each task's utilization, RMS determines whether the system can be scheduled so that 100 percent of the deadlines are met. For periodic tasks with non-deterministic resource utilization requirements, RMS must use the peak utilization requirements. This may be extremely wasteful of system resources, especially for applications with highly-variable resource utilization requirements (e.g., MPEG video communication).

    In this project, we consider SRMS---a modified RMS where the tasks have random resource utilization requirements and do not require all of their deadlines to be met. Instead, statistical guarantees are made for the percentage of deadlines which will be met per task. To permit this, admission control is added for each job of a task. The current algorithm works for tasks whose periods are multiples of each other. Each task is given a guaranteed resource utilization allowance for the period of the next lowest task. Using this allowance, each job is examined at its release time to determine if the required resource utilization is available from the allowance. In addition, we allow any unused allowance to be inherited by lower tasks.

    Our SRMS algorithm allows an efficient use of system resources while providing quality of service guarantees for task systems with random resource utilization requirements. Moreover, it allows us to decouple a task's priority from its period, thus allowing different statistical guarantees to be provided to each task independent of that task's period. These capabilities have been established both analytically and through simulations.

    More information on this project is available from the home pages of the The SRMS Workbench and The SRMS NT Service projects.


SCC: Speculative Concurrency Control for Real-time Databases

    PI: Azer Bestavros
    Participants: Spyridon Braoudakis (PhD'1994), Sue Nagy (PhD'1997), Euthimios Panagos (PhD'1996), Benjamin Mandler (MA'1994), Biao Wang (MA'1994)

    In order for multiple transactions to operate concurrently on a shared database, a protocol must be adopted to coordinate their activities. Such a protocol -- called a concurrency control algorithm -- aims at insuring a consistent state of the database system, while allowing the maximum possible concurrency among transactions.

    Traditional concurrency control algorithms can be broadly classified as either pessimistic or optimistic. Pessimistic Concurrency Control (PCC) algorithms avoid any concurrent execution of transactions as soon as conflicts that may result in future inconsistencies are detected. On the contrary, Optimistic Concurrency Control (OCC) algorithms allow such transactions to proceed at the risk of having to restart them in case these suspected inconsistencies materialize.

    For conventional database management systems with limited resources, performance studies of concurrency control methods have concluded that PCC locking protocols perform better than OCC techniques. The main reason for this good performance is that PCC's blocking-based conflict resolution policy results in resource conservation, whereas OCC with its restart-based conflict resolution policy wastes more resources. For Real-Time DataBase Systems (RTDBS), where transactions execute under strict timing constraints, maximum concurrency (or throughput) ceases to be an expressive measure of performance. Rather, the number of transactions completed before their set deadlines becomes the decisive performance measure. Most real-time concurrency control schemes considered in the literature are based on Two-Phase Locking (2PL), which is a PCC strategy.

    Despite its widespread use in commercial systems, 2PL has some properties such as the possibility of deadlocks and long and unpredictable blocking times that damage its appeal for real-time environments, where the primary performance criterion is meeting time constraints and not just preserving consistency requirements. A recent evaluation of the behavior of both PCC and OCC schemes in a real-time environment concluded that for a RTDBS with firm deadlines (where late transactions are immediately discarded) OCC outperforms PCC, especially when resource contention is low.

    However, a disadvantage of the classical OCC is that when a conflict is detected the transaction being validated is always the one to be aborted, without respect to transactions' priorities or deadlines. An even more serious problem of classical OCC is that transaction conflicts are not detected until the validation phase, at which time it may be too late to restart. PCC two-phase locking algorithms do not suffer from this problem because they detect potential conflicts as they occur. They suffer, however, from the possibility of unnecessarily missing set deadlines as a result of unbounded waiting due to blocking.

    We propose a categorically different approach to concurrency control for RTDBS, which relies on the use of redundant processes that speculate on alternative schedules, once conflicts that threaten the consistency of the database are detected. These alternative schedules are adopted only if the suspected inconsistencies materialize; otherwise, they are abandoned. Due to its nature, we have termed this approach Speculative Concurrency Control (SCC). SCC algorithms use redundancy to combine the advantages of both PCC and OCC algorithms, while avoiding their disadvantages. On the one hand, SCC resembles PCC in that potentially harmful conflicts are detected as early as possible, allowing a head-start for alternative schedules, and thus increasing the chances of meeting the set timing constraints -- should these alternative schedules be needed (due to restart as in OCC). On the other hand, SCC resembles OCC in that it allows conflicting transactions to proceed concurrently, thus avoiding unnecessary delays (due to blocking as in PCC) that may jeopardize their timely commitment.


ACCORD: Admission Control and Capacity Overload management for RTDBs

    PI: Azer Bestavros
    Participant: Susan Nagy (PhD'1998)

    Admission control and overload management techniques are central to the design and implementation of Real-Time Database Systems. In this project, we investigate various such mechanisms using ACCORD: a novel Admission Control and Capacity Overload management framework for Real-time Databases.

    The main challenge involved in scheduling transactions in a Real-Time DataBase management (RTDB) system is that the resources needed to execute a transaction are not known a priori. For example, the set of objects to be read (written) by a transaction may be dependent on user input (as in a stock market application) or dependent on sensory inputs (as in a process control application). Therefore, the a priori reservation of resources (e.g., read/write locks on data objects) to guarantee a particular Worst Case Execution Time (WCET) becomes impossible---and the non-deterministic delays associated with the on-the-fly acquisition of such resources pose the real challenge of integrating real-time scheduling with other database protocols. To deal with this challenge, most RTDB systems make two assumptions: (1) they relax the transaction deadline semantics by allowing only soft and firm (but not hard) deadlines; and (2) they adopt time-cognizant, best-effort algorithms that optimize the system performance in the presence of such flexible deadlines.

    To illustrate this state-of-affairs, consider the huge body of research on real-time concurrency control, where complex time-cognizant concurrency control techniques are proposed for the sole purpose of maximizing the number of transactions that meet their deadlines (or other metrics thereof). A careful evaluation of these elaborate techniques reveals that their superiority is materialized only when the RTDB system is overloaded. However, when the system is not overloaded, the performance of these techniques becomes comparable to that of much simpler techniques (e.g., 2PL-PA). It is important to observe that when a RTDB system is overloaded, a large percentage of transactions end up missing their deadlines. This observation leads to the following question: How better would the performance of the system be if these same transactions (that ended up missing their deadlines) were not allowed into the system in the first place? The answer is obviously ``much better'' because with hindsight, the limited resources in the system would not have been wasted on these transactions to start with. While such a clairvoyant scheduling of transactions is impossible in a real system, admission control and overload management techniques could be used to achieve the same goal. In this project, we introduce and evaluate such techniques.

    At the heart of this project is ACCORD, an Admission Control and Capacity Overload management Real-time Database framework---an architecture and a transaction model---for hard deadline RTDB systems. The system architecture consists of admission control and scheduling components which provide early notification of failure to submitted transactions that are deemed not valuable or incapable of completing on time. The transaction model consists of two components: a primary task and a compensating task. Transactions which are admitted to the system are guaranteed, by the deadline of the transaction, one of two outcomes: either the primary task will successfully commit or the compensating task will safely terminate. Our admission control mechanisms permit transactions to fail at the earliest possible point in time (i.e. at submission time) rather than at a later time. Also as a system becomes overloaded, our admission control techniques allow for the utilization of system resources in the most profitable way.


CLEOPATRA: Language and Tools for Embedded Real-Time Computing

    PI: Azer Bestavros
    Participants: Robert Popp (MA'1992), Devora Reich (MA'1992)

    Predictability---the ability to foretell that an implementation will not violate a set of specified reliability and timeliness requirements ---is a crucial, highly desirable property of responsive embedded systems. In this project we proposed and implemented a development methodology for responsive systems, which enhances predictability by eliminating potential hazards resulting from physically-unsound specifications.

    The backbone of our methodology is the Time-constrained Reactive Automaton (TRA) formalism, which adopts a fundamental notion of space and time that restricts expressiveness in a way that allows the specification of only reactive, spontaneous, and causal computation. Using the TRA model, unrealistic systems---possessing properties such as clairvoyance, caprice, infinite capacity, or perfect timing---cannot even be specified. I argue that this "ounce of prevention" at the specification level is likely to spare a lot of time and energy in the development cycle of responsive systems---not to mention the elimination of potential hazards that would have gone, otherwise, unnoticed.

    The TRA model is presented to system developers through the CLEOPATRA programming language. CLEOPATRA features a C-like imperative syntax for the description of computation, which makes it easier to incorporate in applications already using C. It is event-driven, and thus appropriate for embedded process control applications. It is object-oriented and compositional, thus advocating modularity and reusability. CLEOPATRA is semantically sound; its objects can be transformed, mechanically and unambiguously, into formal TRA automata for verification purposes, which can be pursued using model-checking or theorem proving techniques. Since 1989, an ancestor of CLEOPATRA has been in use as a specification and simulation language for embedded time-critical robotic processes.


Phd Thesis: Reduction of Quality Attacks on Adaptation Mechanisms

    Mina Guirguis, PhD 2006

    One important consideration in realizing dependable computing systems and networks is to uncover vulnerabilities in their designs to adversarial attacks. Currently, the designs of these systems employ different forms of adaptation mechanisms in order to optimize their performance by ensuring that desirable properties, such as stability, efficiency and fairness, are not compromised. This thesis discovers and studies a new type of adversarial attacks that target such adaptation mechanisms by exploiting their dynamics of operation { i.e., the characteristics of their transient behavior. We coin this new breed of adversarial attacks, Reduction of Quality (RoQ) attacks. The premise of RoQ attacks is to keep an adaptive mechanism constantly in a transient state, effectively depriving the system from much of its capacity and significantly reducing its service quality. In this thesis we develop a general control-theoretic framework that provides a unified approach to modeling and vulnerability assessment of the dynamics underlying RoQ exploits. Within this framework, we introduce and formalize the notion of an attack "Potency" that capitalizes on the attacker's best incentive: maximizing the marginal utility of its attack traffic. Unlike traditional brute-force Denial of Service attacks that aim to take down a system at any cost, RoQ attacks aim to maximize the damage inflicted on a system through consuming an innocuous, small fraction of that system's hijacked capacity. We instantiate our framework using detailed analytical models and associated metrics on a series of adaptation mechanisms that are commonly used in networking protocols, end-system admission controllers and load balancers. We assess the impact of RoQ attacks using analysis, simulations, and Internet experiments. We identify key factors that expose the tradeoffs between resilience and susceptibility to RoQ attacks. These factors could be used to harden adaptation mechanisms against RoQ exploits, in addition to developing new forms of countermeasures and defense mechanisms.


Phd Thesis:  A Type-Disciplined Approach to Developing Resources and Applications for the World-Wide Web

    Adam Bradley, PhD 2004

    Programming of stand-alone computer systems has long benefited from formal methods for system specification, programming, and execution; such formalisms lend themselves to precise mechanical verification of a system's desired properties. Type systems particularly have proven effective at reining in the unbounded expressive power of programming languages without excessively burdening the programmer. It is the position of this thesis that such techniques can be adapted to suit the programming of open networked systems. Thereby, benefits can be realized to the correctness and stability of the individual components of said systems, to the precision with which such systems are specified, and to the correctness, reliability, predictability, and interoperability of networked programs and systems as a whole.

    In support of this concept, five formal methods (the P, B, and X type systems, the L type model, and the CHAIN methodology) are developed which reflect the promotion of ``flows'' to first-class programming language citizens accessible to type systems and other formal correctness tools. We employ ``flow'' as a generic abstraction for mechanisms which affect the transaction of state among components of a networked system to achieve its desired end. These five systems largely reflect novel applications of existing technologies and systems to various forms of flows; two also employ a novel type construct, the ``stacked type syntax'', which captures nesting relationships within data representations.

    The P type system enforces constraints upon the flow of system state changes by restricting program side-effects to predictable classes. The B type systems identify flows of information from server back-end sources (basis data) to representations thus identifying potential representational inconsistencies within the system. The X type systems enforce XML well-formedness upon the output streams (flows) of programs. The L type model imposes structure upon the specification of the HTTP protocol's content model and supports a more precise declaration of the semantics of its syntax (i.e., its interpretive flow). The CHAIN methodology offers a systematic approach to assessing correctness of systems which construct communication channels (flows) by composing arbitrarily long sequences of intermediaries, and are thus not amenable to direct global verification.


Phd Thesis: Scalability of Multicast-based Streaming Delivery Mechanisms on the Internet

    Shudong Jin, PhD 2003

    This thesis examines the scalability of multicast-based streaming delivery through analytical and empirical evaluation methodologies. Scalability is assessed with respect to two metrics: server bandwidth and network cost. The first metric quantifies the amount of server resources needed to serve a large number of concurrent clients. The second metric quantifies the amount of network resources needed to serve these clients over the Internet. Scalability along these metrics is evaluated subject to two types of characteristic properties: access patterns and Internet topology. Access patterns assumed in this thesis allow clients to be synchronous or asynchronous, and allow them to access content sequentially or randomly. Internet topologies considered in this thesis exhibit power-law vertex degree distributions and small-world behavior. The findings of this thesis show that the server bandwidth requirement of multicast-based delivery techniques–such as stream merging and periodic broadcasting–largely depends on client access patterns. In particular, for asynchronous clients, if access is sequential, then the lower bound on server bandwidth grows logarithmically with the request arrival rate, but if client access is random, then the lower bound grows as the square root of the request arrival rate. The thesis also shows that the network cost of multicast-based streaming delivery depends mainly on the Internet topological properties. Specifically, both Internet power-law vertex degree distribution and small-world behavior affect the scaling behavior of IP multicast and of end system multicast mechanisms.


Phd Thesis: Metric-Induced Network Topologies

    Khaled Harfoush, PhD 2002

    Interest in building compact, accurate and efficient end-to-end network models increases as network-aware applications and services over the Internet are deployed. These models could be used to optimize the utilization of network resources, improve the quality of content delivery and help analyze network performance. In this talk, I will summarize the work I have done so far and the work yet to be completed as part of my PhD thesis on a framework for inferring Metric-Induced Network Topologies.

    The main contributions of the proposed thesis are as follows. First, this thesis presents MINT---a framework for the characterization of Metric-Induced Network Topologies. MINT is a general framework that uses correlations between end-to-end measurements across multiple flows emanating from a single host to model the network properties between this host and the flows' end points. A salient feature of MINT is its ability to compress the representation of the network, subject to specific sensitivity thresholds. Second, this thesis characterizes a broad class of metrics, for which MINT is applicable. For these metrics, mechanisms for integrating network models (snapshots) obtained at different points in time and/or from different hosts are presented. Third, this thesis instantiates MINT for a variety of metrics---namely loss, delay, and bandwidth---in the context of unicast messaging. In that regard, it presents novel unicast end-to-end active probing techniques that enable the correlation of observations collected from end-points. The potential of passive probing by using feedback from established connections is also evaluated. Extensive simulation experiments show the effectiveness of these novel probing approaches as well as their robustness in terms of accuracy and convergence over a wide range of network conditions. Fourth, this thesis presents an implementation of the MINT framework through a Linux Application Programming Interface called the NetScope API. The value of the MINT framework and of the NetScope API are demonstrated through Internet deployment. Finally, to demonstrate the utility of the MINT framework, this thesis uses the NetScope API to enable the following applications at Massively accessed Internet servers: (1) use inference of shared congestion between a set of flows to enable shared congestion control, (2) optimize server selection in end-system multicast, (3) provide end-to-end statistical QoS (loss and delay) guarantees for a group of flows sharing the same bottleneck links.


Phd Thesis: Statistical Rate Monotonic Scheduling: Quality of Service through Management of Variability in Real-time Systems

    Alia Atlas, PhD 1999

    Interest in real-time scheduling increases as applications with quality of service (QoS) and timeliness constraints proliferate. The classical real-time task model, used in the optimal Rate Monotonic Scheduling (RMS), assumes constant resource requirements and hard deadlines. For the many applications with variable resource requirements, RMS uses pessimistic worst-case values and results in severe resource underutilization. To eliminate such underutilization, this dissertation examines how the variability of resource requirements should be considered in the problem of scheduling periodic tasks with statistical QoS constraints on the percentage of missed deadlines. To solve this problem, two on-line algorithms and an oracle are introduced, simulated and evaluated using two novel metrics. To show applicability, a computer-aided design tool and a design and implementation in KURT Linux are presented.

    The primary contribution, Statistical Rate Monotonic Scheduling (SRMS), is proposed with associated analysis for the calculation of statistical QoS guarantees and, given QoS requirements, proper resource allocation. SRMS assumes that variability can be smoothed through aggregation. It consists of a QoS calculator, a feasibility test, a scheduler and a constant-time job admission controller. Extensions provide time aggregation across tasks and a second chance for rejected jobs.

    Additional algorithms -- Slack Stealing Job Admission Control (SSJAC) and an omniscient off-line oracle --- are introduced to permit comparison of SRMS with previous research and with theoretical performance bounds. Different value functions enable the oracle to yield solutions optimal according to different metrics --- completion count, effective processor utilization (EPU), and job failure rate (JFR). The value function for the latter is introduced to provide a metric which considers all tasks of equal value.

    Via simulation, the performance of the algorithms is examined with JFR, EPU and two novel metrics. DeltaQoS evaluates the accuracy of QoS calculations. Intertask unfairness evaluates how unfair an algorithm is to different priority tasks. Experiments show that SRMS has superior performance during overload when the adjacent period ratio is at least two.

    To facilitate application development, the SRMS Workbench, a computer-aided design tool and simulator, and a design and implementation of SRMS in KURT Linux are provided. An API is introduced to support soft/firm-deadline and design-to-time tasks. The SRMS Workbench implements simple QoS negotiation and calculation of system specifications for requested QoS.

    The Thesis is available as a BUCS Technical Report


Phd Thesis: A Framework for Adaptive Forward Timely Erasure Recovery (AFTER)

    Gitae (Keith) Kim, PhD 1998

    A variety of real-time protocols have been developed to deal with communication failures (an entire or partial packet loss) in networked computing systems. These protocols rely on temporal and/or spatial redundancies to accomplish their goals --- often at an expensive cost, resulting from inefficient resource usage. Protocols that employ spatial redundancy, such as the techniques based on resource-reservation and information redundancy (eg FEC), while improving the level of responsiveness, are prone to suffer from low resource utilization, with possible degradation on reliability. On the other hand, protocols that employ temporal redundancy provide a high level of resource utilization and reliability, yet tend to suffer from high level of message delay and delay variance. Due to the high latencies during failure recoveries, these schemes are not suitable, especailly for those that require stringent real-time guarantee. Nevertheless, applications that require a high level of data integrity (e.g., on-line financial data feeds) need to use temporal redundancies at an expense of increased latencies, in order to provide guaranteed reliability.

    In this dissertation, we introduce the notion of Adaptive Forward Timely Erasure Recovery (AFTER) scheme, that takes a dynamic redundancy control approach to provide an effective transport framework, from which a variety of lower layers in the network system can benefit. The main idea behind AFTER is to provide a flexible resource control mechanism for those clients that require different level of reliability and responsiveness, by dynamically balancing the levels between spatial and temporal redundancy in handling communication failures. AFTER is an eleboration of the adaptive redundancy control scheme, the notion introduced in the Adaptive Information Dispersal Algorithm (AIDA). AFTER can be implemented for both reliable and unreliable data transport, under various network environments ranging from high-speed B-ISDN networks to low-cost best-effort data communication channels.

    To demonstrate the flexibility and superiority of AFTER, we implement it in two different scenarios: TCP/IP over ATM (i.e., TCP-Boston) using ATM's ABR service, and multimedia file transfers via ATM's CBR service class. TCP-Boston, a TCP/IP protocol especially suitable for ATM network, shows AFTER's adaptability to a network with small-sized transfer unit, for best-effort traffic using reliable transport method. AFTER's suitability under reliable communication channel is tested through the experiment for multimedia file transfers. For each application, we present a high-level implementation details of our protocol, evaluate the performance using both simulation and analytic methods, and show that AFTER-based protocols improve their performance over their counterparts.


PhD Thesis: Trace Analysis and its Applications to Performance Enhancements of Distributed Information Systems

    Carlos Cunha, PhD 1997

    The increasing importance of large-scale distributed information systems calls for a better understanding of the nature of their use. Such an understanding is critical to enable the design of high-performance and scalable information retrieval protocols.

    Over the last few years, the World-Wide Web (WWW or Web) has emerged as a unifying infrastructure for large-scale distributed information systems. This dissertation has two goals: (1) to perform a detailed analysis and characterization of WWW usage patterns and (2) to explore the performance enhancements that are possible to achieve as a result of such an analysis. The analysis of WWW usage can be done both at the client and at the server sides, enabling performance enhancements both at the client and at the server sides. On the client side, this dissertation explores prefetching techniques that alleviate the problem of long response times, by trading in network bandwidth for timeliness. On the server side, this dissertation explores replica allocation techniques that alleviate the problems of server load balancing and network bandwidth.

    The contributions of this dissertation are: (1) the presentation of a large database of actual client traces that has already proven to be crucial for studies of characterizing WWW traffic and client caching algorithms; (2) the identification of relationships between documents that indicate probable document sequences, which can be used to circumvent long retrieval delays through pre-fetching; (3) the identification of user behavior models to help in reducing the amount of bandwidth required for pre-fetching; (4) the establishment of a simplified Internet model based on routing structures for use in problems of resource allocation; (5) the demonstration of the stability of such structures; and (6) the introduction of various replica allocation algorithms, and the evaluation of their performance.


Phd Thesis: Admission Control and Scheduling Strategies for Real-time Database Systems

    Susan Nagy, PhD 1997

    The proliferation of Real-Time DataBase (RTDB) systems as repositories of information used by time-critical applications has been tremendous during the last decade. Many such systems continue to admit transactions to the point of overload which results in degraded performance. By the appropriate use of admission control and overload management techniques, the performance of such systems may be enhanced. Moreover, for some safety-critical applications (such as command and control systems), safety constraints require the early notification of transaction failure. Failure to do so results in wasting precious system resources, which could have been used by other admitted transactions, not to mention wasting precious time which could have been used to attempt alternative options for the failing transaction.

    In this dissertation, we propose ACCORD, an Admission Control and Capacity Overload management Real-time Database framework---an architecture and a transaction model---for hard deadline RTDB systems. The system architecture consists of admission control and scheduling components which provide early notification of failure to submitted transactions that are deemed not valuable or incapable of completing on time. The transaction model consists of two components: a primary task and a compensating task. Transactions which are admitted to the system are guaranteed, by the deadline of the transaction, one of two outcomes: either the primary task will successfully commit or the compensating task will safely terminate. Our admission control mechanisms permit transactions to fail at the earliest possible point in time (i.e. at submission time) rather than at a later time. Also as a system becomes overloaded, our admission control techniques allow for the utilization of system resources in the most profitable way.

    The contributions of this dissertation are: (1) the novel ACCORD framework for RTDB systems including a system architecture and a transaction model, (2) value-cognizant admission control mechanisms based upon workload, (3) value-cognizant admission control mechanisms based upon the level of concurrency conflicts, and (4) new scheduling algorithms suitable for ACCORD. These contributions are validated by an extensive experimental evaluation of ACCORD, through simulation, which confirms the performance benefits of admission control, overload management, and early failure notification.


PhD Thesis: Client-Based Logging: A New Paradigm For Distributed Transaction Management

    Euthimios Panagos, PhD 1996

    The proliferation of inexpensive workstations and networks has created a new era in distributed computing. At the same time, non-traditional applications such as computer-aided design (CAD), computer-aided software engineering (CASE), geographic- information systems (GIS), and office-information systems (OIS) have placed increased demands for high-performance transaction processing on database systems. The combination of these factors gives rise to significant challenges in the design of modern database systems. In this thesis, we propose novel techniques whose aim is to improve the performance and scalability of these new database systems. These techniques exploit client resources through client-based transaction management.

    Client-based transaction management is realized by providing logging facilities locally even when data is shared in a global environment. This thesis presents several recovery algorithms which utilize client disks for storing recovery related informa- tion (i.e., log records). Our algorithms work with both coarse and fine-granularity locking and they do not require the merging of client logs at any time. Moreover, our algorithms support fine-granularity locking with multiple clients permitted to con- currently update different portions of the same database page. The database state is recovered correctly when there is a complex crash as well as when the updates performed by different clients on a page are not present on the disk version of the page, even though some of the updating transactions have committed.

    This thesis also presents the implementation of the proposed algorithms in a memory-mapped storage manager as well as a detailed performance study of these algorithms using the OO1 database benchmark. The performance results show that client- based logging is superior to traditional server-based logging. This is because client-based logging is an effective way to reduce dependencies on server CPU and disk resources and, thus, prevents the server from becoming a performance bottleneck as quickly when the number of clients accessing the database increases.

    The Thesis is available as BUCS Technical Report TR-96-010


PhD Thesis: Concurrency Control Protocols for Real-Time Databases

    Spyridon Braoudakis, PhD 1994

    Concurrency control methods developed for traditional database systems are not appropriate for real-time database systems (RTDBS), where, in addition to database consistency requirements, satisfying timing constraints is an integral part of the correctness criterion. Most real-time concurrency control protocols considered in the literature combine time-critical scheduling with traditional concurrency control methods to conform to transaction timing constraints. These methods rely on either transaction blocking or restarts, both of which are inappropriate for real-time concurrency control because of the unpredictability they introduce. Moreover, RTDBS performance objectives differ from those of conventional database systems in that maximizing the number of transactions that complete before their deadlines becomes the decisive performance objective, rather than merely maximizing concurrency (or throughput). Recently, Speculative Concurrency Control (SCC) was proposed as a categorically different approach to concurrency control for RTDBS. SCC relies on the use of redundant processes (shadows), which speculate on alternative schedules, once conflicts that threaten the consistency of the database are detected. SCC algorithms utilize added system resources to ensure that correct (serializable) executions are discovered and adopted as early as possible, thus increasing the likelihood of the timely commitment of transactions.

    This dissertation starts by reviewing the Order-Based SCC (SCC-OB) algorithm which associates almost as many shadows as there are serialization orders of transactions. After demonstrating SCC-OB's excessive use of redundancy, a host of novel SCC-based protocols is introduced. Conflict-Based SCC (SCC-CB) reduces the number of shadows that a running transaction needs to keep by maintaining one shadow per uncommitted conflicting transaction. It is shown that the quadratic number of shadows maintained by SCC-CB is optimal, covering all serialization orders produced by SCC-OB. SCC-CB's correctness is established by showing that it admits only serializable histories. Next, the trade-off between the number of shadows and timeliness is considered. A generic SCC algorithm (SCC-kS) that operates under a limited redundancy assumption is presented; it allows no more than a constant number $k$ of shadows to coexist on behalf of any uncommitted transaction. Next, a novel technique is proposed that incorporates additional information such as deadline, priority and criticalness within the SCC methodology. SCC with Deferred Commit (SCC-DC) utilizes this additional information to improve the timeliness through the controlled deferment of transaction commitments. A probabilistic Value Induced Shadow Allocation (VISA) policy is developed which aims at preserving the most valuable shadows for each system transaction. The thesis of this dissertation is that SCC-based algorithms offer a new dimension, redundancy, to improve the timeliness of RTDBS. SCC-based algorithms are efficient (quadratic number of shadows is optimal), scalable (redundancy can be traded-off for timeliness), and easily amendable (deadline and priority information can be incorporated).


PhD Thesis: Real-Time Scheduling for Multimedia Services Using Network Delay Estimation