Appendix A - Queuing and Scheduling Hardware/Software

Queuing and scheduling are the building blocks of the QoS traffic handling mechanisms. These are available both in standalone network devices as well as in host network components. The simplest network devices forward traffic from the source (ingress) interface to the destination (egress) interface in first-in-first-out (FIFO) order. More sophisticated devices are able to provide QoS by using intelligent queuing and scheduling schemes. We present an overview of these schemes in this section. Each of the queuing mechanisms described may be used to handle traffic on a per-conversation basis or on an aggregate basis.

Work-Conserving Queue Servicing

Traffic passing through a network device is classified to different queues within the device. A variety of queue-servicing schemes can then be used to remove traffic from the queues and forward it to egress interfaces. Most queue servicing schemes currently in use, are work-conserving. That is - they do not allow interface resources to go unused. So long as there is capacity to send traffic and there is traffic to be sent, work-conserving schemes will forward packets to the egress interface. If the interface is not congested then these schemes amount to first-in-first-out (FIFO) queuing. However, if the interface is congested, then packets will accumulate in queues in device memory, awaiting capacity on the interface. When capacity becomes available, the device must decide which of the queued packets should be sent next. In general, packets from certain queues will be given priority over packets from other queues. Thus, under congestion traffic is not serviced in a FIFO manner, but rather according to some alternate queue-servicing scheme.

Many work-conserving queue-servicing algorithms have been defined. Examples are weighted fair queuing (WFQ), deficit round robin (DRR), stochastic fair queuing (SFQ), round robin (RR), strict priority, etc. These all try to allocate some minimum share of the interface's capacity to each queue (during congestion), while allowing additional capacity to be allocated when there is no traffic queued on higher priority flows. These servicing schemes also try to minimize the latency experienced by packets on some or all flows.

Non-Work-Conserving Queue Servicing

A different type of queuing scheme is non-work-conserving. This type of scheme may allow interface capacity to go unused. These schemes are often referred to as packet-shaping schemes. A packet-shaping scheme limits the rate at which traffic on a certain flow can be forwarded through the outgoing interface. Packet-shaping is often used for multimedia traffic flows. In this case, there is no advantage to sending traffic sooner than necessary (voice data will generally not be played any faster than it was recorded) and downstream resources may be spared by limiting data transmission to the rate at which it can be consumed. Non-work-conserving schemes require a real-time clock to pace the transmission of traffic on the shaped queue.

It is possible to combine both work-conserving and non-work-conserving schemes on the same interface. In this case, work-conserving schemes may make use of capacity that is not used by non-work-conserving schemes.


A special queuing mechanism is optimized for slow network interfaces. This mechanism is referred to as ISSLOW (integrated services over slow links). The purpose of this scheme is to dramatically reduce the latency that would be experienced by certain packets (typically, small audio packets) when the capacity of the interface is very low. It is specifically targeted at interfaces that forward onto relatively slow modem links. A typical 1500-byte data packet, once forwarded onto a typical modem link, may occupy the link for almost half a second. Other packets that have the misfortune of being queued behind the data packet will experience a significant latency. This is unacceptable for latency-intolerant (such as telephony) traffic. To avert this problem, ISSLOW scheduling mechanisms break the data packet into smaller packets (link-layer fragmentation), such that they do not occupy the link for as long a period of time. Higher priority, latency-intolerant packets can then be interspersed between these smaller packets.


ATM interfaces fragment packets into very small cells. These cells are typically queued and scheduled for transmission by hardware on the ATM interface. Due to the small cell size, it is possible to schedule traffic very precisely and with low latency. ATM interfaces implement both work-conserving and non-work-conserving schemes. These do not require ISSLOW since the small cell size and typically high media rates do not present the latency problems observed on slow links.