View Related Documents

Abstract

We will investigate a number of algorithmic problems that arise in large networks such as the world-wide web. We will mostly concentrate on the following problems.
General Caching Problems: Caching is a very well-studied problem. Consider a two-level memory system consisting of a small fast memory, that can store up to k bits, and a large slow memory, that can store potentially infinitely many bits. The goal is to serve a sequence of memory accesses with low total cost. In standard caching it is assumed that all memory pages have a uniform size and a uniform fault cost. In a general caching problem, however, pages or documents have varying sizes and varying costs. This problem arises, among other places, in the cache design for networked file systems or the world-wide web. In the web, for instance, the transmission time of a web page depends on the location of the corresponding server and also on transient conditions such as the server load or network congestion.
The offline variant of general caching problems is NP-hard. Irani [7] recently presented polynomial time algorithms that achieve an approximation ratio of O(log(k/s)), where s is the size of the smallest document ever requested. We present polynomial time constant factor approximation algorithms that use a small amount of additional memory. The values of the approximation ratios depend on the exact cost model assumed. In the most general setting we achieve a (4 + ε)-approximation, for any ε > 0. The main idea of our solutions is to formulate caching problems as integer programs and solve linear relaxations. A fractional solution is transformed into a feasible solution using a new rounding technique. Our results were published in [2].
Management of persistent TCP connections: Communication between clients and servers in the web is performed using HTTP (Hyper Text Transfer Protocol), which in turn uses TCP (Transmission Control Protocol) to transfer data. If data has to be transmitted between two network nodes, then there has to exist an open TCP connection between these two nodes. While the earlier HTTP/1.0 opened and closed a separate TCP connection for each transmission request, the new HTTP/1.1. permits persistent connections, i.e. a TCP connection may be kept open and idle until a new transmission request arrive or the connection is explicitly closed by the server or the client. The problem is to maintain a limited number of open TCP connections at each network node, not knowing which connections will be required in the future.
Cohen et al. [4,5] recently initiated the study of connection caching in the web and presented optimal competitive online algorithms assuming that the establishment cost is uniform for all the connections. They also analyzed various models of communication. We investigate the setting where connections may incur different establishment costs and present online algorithms that achieve an optimal competitiveness. Our algorithms use extra communication between network nodes while managing open connections. We can develop refined algorithms that allow tradeoffs between extra communication and competitive performance. We also consider problem extensions where connections may have time-out values or asymmetric establishment costs. The results appeared in [1]
Dynamic TCP acknowledgment: Consider a sequence of data packets that arrives at a network node over an open TCP connection. The node has to acknowledge the receipt of the packets by sending acknowledgments to the sending site. Most implementations of TCP use some acknowledgment delay mechanism, i.e. multiple incoming data packets are acknowledged with a single acknowledgment. A reduction of the number of acknowledgments sent leads to a smaller network congestion and to a smaller overhead incurred in sending and receiving acknowledgments. On the other hand, by sending fewer acknowledgments, we increase the latency of the TCP connection. The goal is to acknowledge dynamically a sequence of data packets, that arrives over time, so that the number of acknowledgments and the acknowledgment delays for the individual packets is as small as possible.
The study of dynamic TCP acknowledgment was initiated by Dooly et al. [6]. They considered the objective function of minimizing the sum of the number of acknowledgments and the total acknowledgment delays for all the packets. They presented deterministic online algorithms that achieve an optimal competitive ratio of 2. Recently Karlin et al. [8] gave randomized algorithms that achieve a competitiveness of e/(e™1) ≈ 1.58. We consider a different objective function that penalizes long acknowledgment delays. In practice if a data packet is not acknowledged within a certain amount of time, the packet is resent by the sending site, which increases again the network congestion. We investigate an objective function that minimizes the sum of the number of acknowledgments and the maximum delay that ever occurs for any of the data packets. We present optimal online algorithms [3].

Fulltext Preview

Image of the first page of the fulltext document