Abstract: Differentiated Services (DiffServ) is a highly scalable architecture for delivering QoS in the Internet. This scalability is achieved by maintaining per-class, rather than per-flow, state at the core routers. However, the same property makes end-to-end QoS very difficult to ensure, since one has to verify that the local resources allocated to each class are sufficient to support the requirements of all the flows in that class. This is particularly challenging for cumulative requirements such as delay or loss. This talk addresses two aspects of this problem. First, we address the translation of the end-to-end delay requirements of flows into a specific resource allocation at the different routers. The required end-to-end QoS level is usually specified in a Service Level Agreement (SLA). The problem is thus to maximize the number of flows for which the delay specified in the SLA is satisfied under congested conditions. The work evaluates how important two mechanisms are in maximizing this number: local algorithms that guarantee a constant ratio between the local waiting times of packets belonging to different service classes, and global probing techniques that allow packets to be classified into the different service classes according to their SLA-defined delay. We then concentrate on the probing mechanism, trying to minimize the overhead it requires. We propose a new probing technique called Adaptive Probing, in which the routers along the flow path are responsible for discovering a violation of the SLA and notifying a centralized entity when one occurs (or is about to occur). We study the performance of this new algorithm through theoretical analysis and extensive simulations.
Our results indicate that Adaptive Probing dramatically reduces the load on the centralized entity, requires a relatively small amount of network traffic, and scales well with both the number of hops and the network load. This is joint work with Alex Dvorkin, Constantine Elster, and Ran Wolff.
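
As a rough illustration of the two ideas in the abstract, the Python sketch below treats delay as a cumulative metric, assigns a flow to the lowest-priority class whose end-to-end delay still meets its SLA, and lets a router decide locally whether to notify the centralized entity. It is only a sketch under stated assumptions, not the algorithm presented in the talk: the function names, the fixed per-class delay ratios, the per-hop delay budget, and the 0.9 warning margin are all hypothetical.

```python
from typing import List, Optional


def end_to_end_delay(per_hop_delays: List[float]) -> float:
    """Delay is cumulative: the end-to-end value is the sum over all hops."""
    return sum(per_hop_delays)


def class_delays(base_hop_delays: List[float], ratios: List[float]) -> List[float]:
    """End-to-end delay per class, assuming (hypothetically) that every hop
    keeps the same fixed delay ratio between classes.
    ratios[0] is 1.0 for the highest-priority class."""
    base = end_to_end_delay(base_hop_delays)
    return [base * r for r in ratios]


def pick_class(sla_delay: float, delays: List[float]) -> Optional[int]:
    """Return the lowest-priority class (largest index) whose delay meets the
    SLA bound, or None if no class can satisfy it."""
    for idx in range(len(delays) - 1, -1, -1):
        if delays[idx] <= sla_delay:
            return idx
    return None


def hop_should_notify(local_delay: float, local_budget: float,
                      margin: float = 0.9) -> bool:
    """Router-side check (assumed rule): report to the centralized entity once
    the local delay reaches a fraction `margin` of this hop's delay budget,
    so that a violation that is about to occur is also caught."""
    return local_delay >= margin * local_budget


# Example: three hops, three classes with assumed delay ratios 1 : 2 : 4.
delays = class_delays([2.0, 3.0, 5.0], [1.0, 2.0, 4.0])
chosen = pick_class(25.0, delays)
```

In this sketch the centralized entity hears from a router only when `hop_should_notify` returns true, which is the intuition behind moving the monitoring work to the routers along the path.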