Chapter 3 - QoS Technologies

Article
12/09/2009

Traffic Handling Mechanisms

In this section we discuss the more significant traffic handling mechanisms. Underlying any traffic handling mechanism is a set of queues and the algorithms for servicing these queues. (In Appendix A of this whitepaper, we discuss some general approaches to queuing and queue servicing.) Traffic handling mechanisms include:

802.1p
Differentiated service per-hop-behaviors (diffserv)
Integrated services (intserv)
ATM, ISSLOW and others

Each of these traffic-handling mechanisms is appropriate for specific media or circumstances and is described in detail below.

802.1p

Most local area networks (LANs) are based on IEEE 802 technology. These include Ethernet, token-ring, FDDI and other variations of shared media networks. 802.1p is a traffic-handling mechanism for supporting QoS in these networks [1]. QoS in LANs is of interest because these networks make up a large percentage of the networks in use on university campuses, corporate campuses and office complexes.

802.1p [2] defines a field in the layer-2 header of 802 packets that can carry one of eight priority values. Typically, hosts or routers sending traffic into a LAN will mark each transmitted packet with the appropriate priority value. LAN devices, such as switches, bridges and hubs, are expected to treat the packets accordingly (by making use of underlying queuing mechanisms). The scope of the 802.1p priority mark is limited to the LAN. Once packets are carried off the LAN, through a layer-3 device, the 802.1p priority is removed.
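As the footnote on 802.1p notes, the priority field is defined together with the 802.1Q VLAN fields. The following minimal sketch shows how one of the eight priority values fits into the 16-bit tag control field of an 802.1Q tag: three bits of priority, one drop-eligibility bit, and twelve bits of VLAN ID. The function names are illustrative and not part of any standard API.

```python
def build_tci(priority: int, vlan_id: int, dei: int = 0) -> int:
    """Pack priority (3 bits), DEI (1 bit) and VLAN ID (12 bits) into a 16-bit TCI."""
    if not 0 <= priority <= 7:
        raise ValueError("802.1p priority must be in the range 0..7")
    if dei not in (0, 1):
        raise ValueError("DEI must be 0 or 1")
    if not 0 <= vlan_id <= 0xFFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    return (priority << 13) | (dei << 12) | vlan_id


def parse_priority(tci: int) -> int:
    """Extract the 3-bit priority value from a 16-bit TCI."""
    return (tci >> 13) & 0x7


if __name__ == "__main__":
    tci = build_tci(priority=5, vlan_id=100)
    print(f"TCI = 0x{tci:04X}, priority = {parse_priority(tci)}")
```

Because the field is only three bits wide, a switch can do no more than sort traffic into eight classes, which is why 802.1p is an aggregate rather than a per-conversation mechanism.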

Differentiated Services (Diffserv)

Diffserv [3] is a layer-3 QoS mechanism that has been in limited use for many years, although there was little effort to standardize it until very recently. Diffserv defines a field in the layer-3 header of IP packets, called the diffserv codepoint (DSCP) [4]. Typically, hosts or routers sending traffic into a diffserv network will mark each transmitted packet with the appropriate DSCP. Routers within the diffserv network use the DSCP to classify packets and apply specific queuing or scheduling behavior (known as a per-hop behavior or PHB) based on the results of the classification.
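To make the marking step concrete, the sketch below converts between a six-bit DSCP and the full eight-bit header byte that carries it (the DSCP occupies the six high-order bits). The value 46 is the codepoint standardized for expedited forwarding. The helper names are our own, and the `mark_socket` function assumes a platform on which the `IP_TOS` socket option is honored; treat it as an illustration rather than a portable recipe.

```python
import socket

EF_DSCP = 46  # codepoint standardized for the expedited-forwarding PHB


def dscp_to_header_byte(dscp: int, low_bits: int = 0) -> int:
    """Place a 6-bit DSCP in the six high-order bits of the 8-bit header field."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must fit in six bits")
    if not 0 <= low_bits <= 3:
        raise ValueError("the two low-order bits must be in the range 0..3")
    return (dscp << 2) | low_bits


def header_byte_to_dscp(value: int) -> int:
    """Recover the DSCP from the 8-bit header field."""
    return (value & 0xFF) >> 2


def mark_socket(sock: socket.socket, dscp: int) -> None:
    """Ask the host stack to mark outgoing IPv4 packets (platform dependent)."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_header_byte(dscp))


if __name__ == "__main__":
    print(hex(dscp_to_header_byte(EF_DSCP)))
```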

An example of a PHB is the expedited-forwarding (or EF) PHB. This behavior is defined to assure that packets are transmitted from ingress to egress (at some limited rate) with very low latency. Other behaviors may specify that packets are to be given a certain priority relative to other packets, in terms of average throughput or in terms of drop preference, but with no particular emphasis on latency. PHBs are implemented using underlying queuing mechanisms.
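As an illustration of how a PHB can rest on an underlying queuing mechanism, the following sketch services a set of queues in strict priority order, so that packets classified to the first class (for example, EF) always leave before packets in lower classes. This is only one possible queuing discipline, chosen here for brevity; the class names are hypothetical, and a real EF implementation would also limit the rate of the high-priority queue so that it cannot starve the others.

```python
from collections import deque


class StrictPriorityScheduler:
    """Service queues in the order given; earlier classes always go first."""

    def __init__(self, classes):
        # dicts preserve insertion order, so the first class has highest priority
        self.queues = {name: deque() for name in classes}

    def enqueue(self, phb: str, packet) -> None:
        self.queues[phb].append(packet)

    def dequeue(self):
        """Return the next packet to transmit, or None if all queues are empty."""
        for queue in self.queues.values():
            if queue:
                return queue.popleft()
        return None


if __name__ == "__main__":
    sched = StrictPriorityScheduler(["EF", "best-effort"])
    sched.enqueue("best-effort", "data-1")
    sched.enqueue("EF", "voice-1")
    sched.enqueue("best-effort", "data-2")
    print([sched.dequeue() for _ in range(3)])
```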

PHBs are individual behaviors applied at each router. PHBs alone make no guarantees of end-to-end QoS. However, by concatenating routers with the same PHBs (and limiting the rate at which packets are submitted for any PHB), it is possible to use PHBs to construct an end-to-end QoS service. For example, a concatenation of EF PHBs, along a pre-specified route, with careful admission control, can yield a service similar to leased-line service, which is suitable for interactive voice. Other concatenations of PHBs may yield a service suitable for video playback, and so forth.

Integrated Services (Intserv)

Intserv is a service framework. At this time, two services are defined within this framework: the guaranteed service and the controlled load service. The guaranteed service promises to carry a certain traffic volume with a quantifiable, bounded latency. The controlled load service agrees to carry a certain traffic volume with the 'appearance of a lightly loaded network'. These are quantifiable services in the sense that they are defined to provide quantifiable QoS to a specific quantity of traffic. (As we will discuss in depth later, certain diffserv services, by comparison, may not be quantifiable.)

Intserv services are typically (but not necessarily) associated with the RSVP signaling protocol, which will be discussed in detail later in this whitepaper. Each of the intserv services defines admission control algorithms that determine how much traffic can be admitted to an intserv service class at a particular network device without compromising the quality of the service. Intserv services do not define the underlying queuing algorithms to be used in providing the service.

ATM, ISSLOW and Others

ATM is a link layer technology that offers high quality traffic handling. ATM fragments packets into link layer cells, which are then queued and serviced using queue-servicing algorithms appropriate for the particular ATM service. ATM traffic is carried on virtual circuits (VCs), each of which supports one of the numerous ATM services. These include constant-bit-rate (CBR), variable-bit-rate (VBR), unspecified-bit-rate (UBR) and others. ATM actually goes beyond a strict traffic handling mechanism in the sense that it includes a low level signaling protocol that can be used to set up and tear down ATM VCs.

Because ATM fragments packets into relatively small cells, it can offer very low latency service. If it is necessary to transmit a packet urgently, the ATM interface can always be cleared for transmission in the time it takes to transmit one cell. By comparison, consider sending normal TCP/IP data traffic on slow modem links without the benefit of the ATM link layer. A typical 1500-byte packet, once submitted for transmission on a 28.8 Kbps modem link, will occupy the link for roughly 420 msec (12,000 bits at 28,800 bits per second) until it is completely transmitted, preventing the transmission of any other packets on the same link. Integrated Services Over Slow Link Layers (ISSLOW) addresses this problem. ISSLOW is a technique for fragmenting IP packets at the link layer for transmission over slow links such that the fragments never occupy the link for longer than some threshold.
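The arithmetic behind this example is simple enough to check directly. The sketch below computes how long a packet occupies a link and, going the other way, the largest fragment that stays under a given delay threshold. The 20 msec threshold used in the example is an assumption chosen for illustration and is not taken from the ISSLOW work.

```python
def serialization_delay_ms(size_bytes: int, link_bps: float) -> float:
    """Time in milliseconds that a packet of the given size occupies the link."""
    return size_bytes * 8 / link_bps * 1000


def max_fragment_bytes(link_bps: float, max_delay_ms: float) -> int:
    """Largest whole number of bytes that can be sent within max_delay_ms."""
    bits = link_bps * max_delay_ms / 1000
    return int(bits // 8)


if __name__ == "__main__":
    link = 28_800  # bits per second, the modem link from the text
    print(f"1500-byte packet: {serialization_delay_ms(1500, link):.1f} msec")
    print(f"53-byte ATM cell: {serialization_delay_ms(53, link):.1f} msec")
    print(f"Fragment size for a 20 msec bound: {max_fragment_bytes(link, 20)} bytes")
```

On this link, a fragment would have to be far smaller than a full-size packet to keep an urgent packet from waiting longer than the chosen bound.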

Other traffic handling mechanisms have been defined for various media, including cable modems, hybrid fiber coax (HFC) plants, P1394, and so on. These may use low level, link-layer specific signaling mechanisms (such as UNI signaling for ATM).

Per-Conversation vs. Aggregate Traffic Handling Mechanisms

An important general categorization of traffic handling mechanisms is that of per-conversation mechanisms vs. aggregate mechanisms. This categorization refers largely to the classification associated with the mechanism and can have a significant effect on the QoS experienced by traffic subjected to the mechanism.

Per-conversation traffic handling mechanisms handle each conversation as a separate flow. In this context, a conversation includes all traffic between a specific instance of a specific application on one host and a specific instance of the peer application on a peer host. In the case of IP traffic, the source and destination IP addresses, the source and destination ports, and the protocol (together known as a 5-tuple) uniquely identify a conversation. Traditionally, intserv mechanisms are provided on a per-conversation basis.

In aggregate traffic handling mechanisms, some set of traffic, from multiple conversations, is classified to the same flow and is handled in aggregate. Aggregate classifiers generally look at some aggregate identifier in packet headers. Diffserv and 802.1p are examples of aggregate traffic handling mechanisms at layer-3 and at layer-2, respectively. In both these mechanisms, packets corresponding to multiple conversations are marked with the same DSCP or 802.1p mark.

When traffic is handled on a per-conversation basis, resources are allotted on a per-conversation basis. From the application perspective, this means that the application's traffic is granted resources completely independent of the effects of traffic from other conversations in the network. While this tends to enhance the quality of the service experienced by the application, it also imposes a burden on the network equipment. Network equipment is required to maintain independent state for each conversation and to apply independent processing for each conversation. In the core of large networks, where it is possible to support millions of conversations simultaneously, per-conversation traffic handling may not be practical.

When traffic is handled in aggregate, the state maintenance and processing burden on devices in the core of a large network is reduced significantly. On the other hand, the quality of service perceived by an application's conversation is no longer independent of the effects of traffic from other conversations that have been aggregated into the same flow. As a result, in aggregate traffic handling, the quality of service perceived by the application tends to be somewhat compromised. Allocating excess resources to the aggregate traffic class can offset this effect. However, this approach tends to reduce the efficiency with which network resources are used.
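The difference in state burden can be seen by classifying the same packets both ways. In the sketch below, a per-conversation classifier keys on the 5-tuple while an aggregate classifier keys only on the DSCP. The `Packet` record and the sample addresses are hypothetical.

```python
from collections import namedtuple

Packet = namedtuple("Packet", "src_ip dst_ip src_port dst_port protocol dscp")


def per_conversation_key(p: Packet):
    """5-tuple classification: one flow per conversation."""
    return (p.src_ip, p.dst_ip, p.src_port, p.dst_port, p.protocol)


def aggregate_key(p: Packet):
    """Aggregate classification: one flow per DSCP value."""
    return p.dscp


def count_flows(packets, key) -> int:
    """Number of distinct flows (and so state entries) a device must keep."""
    return len({key(p) for p in packets})


if __name__ == "__main__":
    packets = [
        Packet("10.0.0.1", "10.0.1.1", 5004, 5004, "udp", 46),
        Packet("10.0.0.2", "10.0.1.1", 5006, 5004, "udp", 46),
        Packet("10.0.0.3", "10.0.1.9", 40000, 80, "tcp", 0),
        Packet("10.0.0.1", "10.0.1.1", 5004, 5004, "udp", 46),
    ]
    print("per-conversation flows:", count_flows(packets, per_conversation_key))
    print("aggregate flows:", count_flows(packets, aggregate_key))
```

The aggregate count is bounded by the number of codepoints in use, however many conversations there are, whereas the per-conversation count grows with the traffic.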

Provisioning and Configuration Mechanisms

To provide network QoS effectively, the traffic handling mechanisms described above must be provisioned and configured consistently across multiple network devices. Provisioning and configuration mechanisms include:

Resource Reservation Protocol (RSVP) signaling and the Subnet Bandwidth Manager (SBM)
Policy mechanisms and protocols
Management tools and protocols

These are described in detail in the paragraphs below.

Provisioning vs. Configuration

In this whitepaper, we use the term provisioning to refer to more static and longer term management tasks. These may include selection of network equipment, replacement of network equipment, interface additions or deletions, link speed modifications, topology changes, capacity planning, and so forth. We use the term configuration to refer to more dynamic and shorter term management tasks. These include such management tasks as modifications to traffic handling parameters in diffserv networks. The distinction between provisioning and configuration is not sharp, and we use it as a general guideline rather than a strict categorization. Unless otherwise specified, we often use the two terms interchangeably.

Top-Down vs. Signaled Mechanisms

It is important to note the distinction between top-down QoS configuration mechanisms and signaled QoS configuration mechanisms. Top-down mechanisms typically 'push' configuration information from a management console down to network devices. Signaled mechanisms typically carry QoS requests (and implicit configuration requests) from one end of the network to the other, along the same path traversed by the data that requires QoS resources. Top-down configuration is typically initiated on behalf of one or more applications by a network management program. Signaled configuration is typically initiated by an application's changes in resource demands.

RSVP and the SBM

RSVP is a signaled QoS configuration mechanism. It is a protocol by which applications can request end-to-end, per-conversation QoS from the network, and can indicate QoS requirements and capabilities to peer applications. RSVP is a layer-3 protocol, suited primarily for use with IP traffic. As currently defined, RSVP uses intserv semantics to convey per-conversation QoS requests to the network. However, RSVP per se is limited neither to per-conversation usage nor to intserv semantics. In fact, currently proposed extensions to RSVP enable it to be used to signal information regarding traffic aggregates. Other extensions enable it to be used to signal requirements for services beyond the traditional guaranteed and controlled load intserv services. In this section we discuss RSVP in its traditional per-conversation, intserv form. Later in this whitepaper we will discuss its applicability to aggregated services and to services which are not traditionally intserv.

Since RSVP is a layer-3 protocol, it is largely independent of the various underlying network media over which it operates. Therefore, RSVP can be considered an abstraction layer between applications (or host operating systems) and media-specific QoS mechanisms.

There are two significant RSVP messages, PATH and RESV. Transmitting applications send PATH messages towards receivers. These messages describe the data that will be transmitted and follow the path that the data will take. Receivers send RESV messages. These follow the path seeded by the PATH messages, back towards the senders, indicating the profile of traffic that particular receivers are interested in. In the case of multicast traffic flows, RESV messages from multiple receivers are 'merged', making RSVP suitable for QoS with multicast traffic.

As defined today, RSVP messages carry the following information:

How the network can identify traffic on a conversation (classification information)
Quantitative parameters describing the traffic on the conversation (data rate, etc.)
The service type required from the network for the conversation's traffic
Policy information (identifying the user requesting resources for the traffic and the application to which it corresponds)

Classification information is conveyed using IP source and destination addresses and ports. In the conventional intserv use of RSVP, an intserv service type is specified and quantitative traffic parameters are expressed using a token-bucket model. Policy information is typically a secure means for identifying the user and/or the application requesting resources. Network administrators use policy information to decide whether or not to allocate resources to a conversation.
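For readers unfamiliar with the token-bucket model, the following minimal sketch may help. Tokens accumulate at a sustained rate up to a maximum depth, and a packet conforms to the traffic profile only if enough tokens are available when it is sent. The sketch uses just two parameters (rate and depth) and explicit timestamps to keep it deterministic; the full intserv traffic specification carries additional parameters that are omitted here.

```python
class TokenBucket:
    """Two-parameter token bucket: sustained rate and bucket depth."""

    def __init__(self, rate_bytes_per_s: float, depth_bytes: float):
        self.rate = rate_bytes_per_s
        self.depth = depth_bytes
        self.tokens = depth_bytes  # the bucket starts full
        self.last = 0.0            # time of the previous update, in seconds

    def conforms(self, size_bytes: int, now: float) -> bool:
        """Return True and consume tokens if the packet fits the profile."""
        elapsed = now - self.last
        self.tokens = min(self.depth, self.tokens + elapsed * self.rate)
        self.last = now
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate_bytes_per_s=1000, depth_bytes=1500)
    for t in (0.0, 0.5, 1.5):
        print(t, bucket.conforms(1500, now=t))
```

The depth bounds the size of a burst, and the rate bounds the long-term average, which is what allows a network device to reason about how much traffic it is agreeing to carry.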

How RSVP Works
PATH messages wind their way through all network devices en route from sender to receivers. RSVP-aware devices in the data path note the messages and establish state for the flow described by the message. (Other devices pass the messages through transparently.)

When a PATH message arrives at a receiver, the receiver responds with a RESV message (if the receiving application is interested in the traffic flow offered by the sender). The RESV message winds its way back towards the sender, following the path established by the incident PATH messages. As the RESV message progresses toward the sender, RSVP-aware devices verify that they have the resources necessary to meet the QoS requirements requested. If a device can accommodate the resource request, it installs classification state corresponding to the conversation and allocates resources for the conversation. The device then allows the RESV message to continue upstream toward the sender. If a device cannot accommodate the resource request, it rejects the RESV message and sends a rejection back to the receiver.

In addition, RSVP-aware devices in the data path may extract policy information from PATH messages and/or RESV messages, for verification against network policies. Devices may reject resource requests based on the results of these policy checks by preventing the message from continuing on its path and sending a rejection message.

When requests are not rejected for either resource availability or policy reasons, the incident PATH message is carried from sender to receiver, and a RESV message is carried in return. In this case, a reservation is said to be installed. An installed reservation indicates that RSVP-aware devices in the traffic path have committed the requested resources to the appropriate flow and are prepared to allocate these resources to traffic belonging to the flow. This process of approving or rejecting RSVP messages is known as admission control and is a key QoS concept.
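The exchange described above can be sketched as a small simulation. Each device records path state as the PATH message travels from sender to receiver, and then performs a resource check as the RESV message travels back. The class and function names are hypothetical, capacity is reduced to a single number in Kbps, policy checks are omitted, and the immediate rollback on rejection is a simplification of how reservation state is actually cleaned up.

```python
class RsvpDevice:
    """An RSVP-aware device with a single pool of reservable capacity."""

    def __init__(self, name: str, capacity_kbps: int):
        self.name = name
        self.available = capacity_kbps
        self.path_state = set()
        self.reservations = {}

    def on_path(self, flow_id: str) -> None:
        self.path_state.add(flow_id)  # note the flow described by the PATH message

    def on_resv(self, flow_id: str, kbps: int) -> bool:
        if flow_id not in self.path_state or kbps > self.available:
            return False              # admission control rejects the request
        self.available -= kbps
        self.reservations[flow_id] = kbps
        return True

    def release(self, flow_id: str) -> None:
        self.available += self.reservations.pop(flow_id, 0)


def reserve(path, flow_id: str, kbps: int):
    """Return (True, None) if installed, else (False, name of rejecting device)."""
    for device in path:               # PATH: sender toward receiver
        device.on_path(flow_id)
    admitted = []
    for device in reversed(path):     # RESV: receiver back toward sender
        if not device.on_resv(flow_id, kbps):
            for earlier in admitted:  # simplification: undo partial reservations
                earlier.release(flow_id)
            return False, device.name
        admitted.append(device)
    return True, None


if __name__ == "__main__":
    path = [RsvpDevice("r1", 500), RsvpDevice("r2", 200), RsvpDevice("r3", 500)]
    print(reserve(path, "voice", 128))
    print(reserve(path, "video", 100))
```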

The SBM
The SBM is based on an enhancement to the RSVP protocol, which extends its utility to shared networks. In shared sub-networks or LANs (which may include a number of hosts and/or routers interconnected by a switch or hub), standard RSVP falls short. The problem arises because RSVP messages may pass through layer-2 (RSVP-unaware) devices in the shared network, implicitly admitting flows that require shared network resources. RSVP-aware hosts and routers admit or reject flows based on availability of their private resources, but not based on availability of shared resources. As a result, RSVP requests destined for hosts on the shared subnet may result in the over-commitment of resources in the shared subnet.

The SBM solves this problem by enabling intelligent devices that reside on the shared network to volunteer their services as a 'broker' for the shared network's resources. Eligible devices are (in increasing order of suitability):

Attached SBM-capable hosts
Attached SBM-capable routers
SBM-capable switches which comprise the shared network

These devices automatically run an election protocol that results in the most suitable device(s) being appointed designated SBMs (DSBM). When eligible switches participate in the election, they subdivide the shared network between themselves based on the layer-2 network topology. Hosts and routers that send into the shared network discover the closest DSBM and route RSVP messages through the device. Thus, the DSBM sees all messages that will affect resources in the shared subnet and provides admission control on behalf of the subnet.
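The outcome of the election can be pictured as picking the most suitable candidate on a segment. The sketch below is a deliberate simplification: it ranks candidates by the suitability order listed above (host, then router, then switch), and it breaks ties by the numerically highest address, which is an assumption made for this illustration. It elects a single DSBM for one segment and does not model the election messages themselves or the subdivision of the network among several switches.

```python
import ipaddress

# Hypothetical ranking that mirrors the suitability order given in the text.
SUITABILITY = {"host": 1, "router": 2, "switch": 3}


def elect_dsbm(candidates):
    """Pick the candidate with the highest suitability, then the highest address.

    Each candidate is a (name, kind, address) tuple.
    """
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda c: (SUITABILITY[c[1]], int(ipaddress.ip_address(c[2]))),
    )


if __name__ == "__main__":
    segment = [
        ("pc-1", "host", "10.0.0.5"),
        ("gw-1", "router", "10.0.0.1"),
        ("sw-1", "switch", "10.0.0.2"),
        ("sw-2", "switch", "10.0.0.3"),
    ]
    print(elect_dsbm(segment))
```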

Policy Mechanisms and Protocols

Network administrators configure QoS mechanisms subject to certain policies. Policies determine which applications and users are entitled to varying amounts of resources in different parts of the network.

Policy components include:

A data-store, which contains the policy data itself, such as user names, applications, and the network resources to which these are entitled.
Policy decision points (PDPs), which translate network-wide higher layer policies into specific configuration information for individual network devices. PDPs also inspect resource requests carried in RSVP messages and accept or reject them based on a comparison against policy data.
Policy enforcement points (PEPs), which act on the decisions made by PDPs. These are typically network devices that either do or do not grant resources to arriving traffic.
Protocols between the data-store, PDPs and PEPs.

Policy Data Store - Directory Services
Policy mechanisms rely on a set of data describing how resources in various parts of the network can be allocated to traffic that is associated with specific users and/or applications. Policy schemas define the format of this information. Two general types of schemas are required. One type describes the resources that should be allocated in a top-down provisioned manner. The other describes resources that can be configured via end-to-end signaling. This information tends to be relatively static and (at least in part) needs to be distributed across the network. Consequently, directories tend to be suitable data stores.

Policy Decision Points and Policy Enforcement Points
Policy decision points (PDPs) interpret data stored in the schemas and control policy enforcement points (PEPs) accordingly. Policy enforcement points are the switches and routers through which traffic passes. These devices have the ultimate control over which traffic is allocated resources and which is not. In the case of top-down provisioned QoS, the PDP 'pushes' policy information to PEPs in the form of classification information (IP addresses and ports) and the resources to which classified packets are entitled.

In the case of signaled QoS, RSVP messages transit through the network along the data path. When an RSVP message arrives at a PEP, the device extracts a policy element from the message, as well as a description of the service type required and the traffic profile. The policy element generally contains authenticated user and/or application identification. The router then passes the relevant information from the RSVP message to the PDP for comparison of the resources requested against those allowable for the user and/or application (per policy in the data-store). The PDP makes a decision regarding the admissibility of the resource request and returns an approval or denial to the PEP.
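The division of labor between the PEP and the PDP in the signaled case can be sketched as follows. The PEP extracts the user, application and requested rate from an RSVP message and hands them to the PDP, which compares the request against the policy data and returns a decision. Everything here is hypothetical: the in-memory dictionary stands in for the directory, the message is a plain dictionary, and the direct method call stands in for whatever protocol connects the two components.

```python
# Hypothetical policy data: (user, application) -> maximum rate in Kbps.
POLICY_STORE = {
    ("alice", "voip"): 128,
    ("bob", "video"): 1500,
}


class PolicyDecisionPoint:
    """Compares requested resources against those allowed by policy."""

    def __init__(self, store):
        self.store = store

    def decide(self, user: str, app: str, requested_kbps: int) -> bool:
        allowed = self.store.get((user, app), 0)  # unknown pairs get nothing
        return requested_kbps <= allowed


class PolicyEnforcementPoint:
    """Extracts the policy element and outsources the decision to the PDP."""

    def __init__(self, pdp: PolicyDecisionPoint):
        self.pdp = pdp

    def handle_rsvp(self, message: dict) -> str:
        approved = self.pdp.decide(
            message["user"], message["app"], message["rate_kbps"]
        )
        return "admit" if approved else "reject"


if __name__ == "__main__":
    pep = PolicyEnforcementPoint(PolicyDecisionPoint(POLICY_STORE))
    print(pep.handle_rsvp({"user": "alice", "app": "voip", "rate_kbps": 64}))
    print(pep.handle_rsvp({"user": "alice", "app": "video", "rate_kbps": 64}))
```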

In certain cases, the PEP and the PDP can be co-located in the network device. In other cases, the PDP may be separated from the PEP in the form of a policy server. A single policy server may reside between the directory and multiple PEPs. Although many policy decisions can be made trivially by co-locating the PDP and the PEP, there are certain advantages that can be realized by the use of a policy server.

Use of Policy Protocols
When RSVP messages transit RSVP-aware network devices, they cause the configuration of traffic handling mechanisms in PEPs, including classifiers and queuing mechanisms, that provide intserv services. However, in many cases, RSVP cannot be used to configure these mechanisms. Instead, more traditional, top-down mechanisms must be used.

Top-down configuration protocols and interfaces include Simple Network Management Protocol (SNMP), the command line interface (CLI), Common Open Policy Service (COPS) and others. SNMP has been in use for many years, primarily for the purpose of monitoring network device functionality from a central console. It can also be used to set or configure device functionality. The CLI is a text-based interface used initially to configure and monitor Cisco network equipment. Due to its popularity, a number of other network vendors provide CLI-like configuration interfaces to their equipment. COPS is a protocol that has been developed in recent years in the context of QoS. It was initially targeted as an RSVP-related policy protocol but has recently been pressed into service as a general diffserv configuration protocol. All of these are considered top-down because, traditionally, a higher level management console uses them to push configuration information down to a set of network devices.

In the case of signaled QoS (as opposed to top-down QoS), detailed configuration information is generally carried to the PEP in the form of RSVP signaling messages. However, the PEP must outsource to the PDP the decision whether or not to honor the configuration request. COPS was initially developed to pass the relevant information contained in the RSVP message from the PEP to the PDP, and to pass a policy decision in response. When the PEP and PDP are co-located, no such protocol is required.

A protocol is also required for communication between the PDP and the policy data-store. Since the data-store tends to take the form of a distributed directory, LDAP is commonly used for this purpose.

[1] Since LAN resources tend to be less costly than WAN resources, 802.1p QoS mechanisms are often considered less important than their WAN-related counterparts. However, with the increasing usage of multimedia applications on LANs, delays through LAN switches do become problematic. 802.1p tackles these delays.
[2] 802.1p is often defined together with 802.1Q. The two define various VLAN (virtual LAN) fields, as well as a priority field. For the purpose of this discussion, we are interested only in the priority field.
[3] Also known as class of service.
[4] The DSCP is a six-bit field, spanning the fields formerly known as the type-of-service (TOS) fields and the IP precedence fields.