Quicksort is a sorting algorithm whose worst-case running time is (*n*^{2}) on an input array of *n* numbers. In spite of this slow worst-case running time, quicksort is often the best practical choice for sorting because it is remarkably efficient on the average: its expected running time is (*n* lg *n*), and the constant factors hidden in the (*n* lg *n*) notation are quite small. It also has the advantage of sorting in place (see page 3), and it works well even in virtual memory environments.

Quicksort, like merge sort, is based on the divide-and-conquer paradigm introduced in Section 1.3.1. Here is the three-step divide-and-conquer process for sorting a typical subarray *A*[*p . . r*].

**Conquer: **The two subarrays *A*[*p . . q*] and *A*[*q* + 1 . . *r*] are sorted by recursive calls to quicksort.

The following procedure implements quicksort.

QUICKSORT(A,p,r)

1ifp<r

2thenqPARTITION(A,p,r)

3 QUICKSORT(A,p,q)

4 QUICKSORT(A,q+ 1,r)

To sort an entire array *A*, the initial call is QUICKSORT(*A*, 1, *length*[*A*]).

The key to the algorithm is the PARTITION procedure, which rearranges the subarray *A*[*p . . r*] in place.

PARTITION(A,p,r)

1xA[p]

2ip- 1

3jr+ 1

4whileTRUE

5do repeatjj- 1

6untilA[j]x

7repeatii+ 1

8untilA[i]x

9ifi<j

10thenexchangeA[i]A[j]

11else returnj

What value of *q* does PARTITION return when all elements in the array *A*[*p . . r*] have the same value?

Give a brief argument that the running time of PARTITION on a subarray of size *n* is (*n*).

How would you modify QUICKSORT to sort in nonincreasing order?

The running time of quicksort depends on whether the partitioning is balanced or unbalanced, and this in turn depends on which elements are used for partitioning. If the partitioning is balanced, the algorithm runs asymptotically as fast as merge sort. If the partitioning is unbalanced, however, it can run asymptotically as slow as insertion sort. In this section, we shall informally investigate how quicksort performs under the assumptions of balanced versus unbalanced partitioning.

T(n) =T(n- 1) + (n).

To evaluate this recurrence, we observe that *T*(1) = (1) and then iterate:

T(n) = 2T(n/2) + (n),

T(n) =T(9n/10) +T(n/10) +n

Suppose that the splits at every level of quicksort are in the proportion 1 - to , where 0 < 1/2 is a constant. Show that the minimum depth of a leaf in the recursion tree is approximately ep1g *n*/lg and the maximum depth is approximately - lg *n*/lg(1 - ). (Don't worry about integer round-off.)

Argue that for any constant 0 < 1/2, the probability is approximately 1 - 2 that on a random input array, PARTITION produces a split more balanced than 1 - to . For what value of are the odds even that the split is more balanced than less balanced?

In exploring the average-case behavior of quicksort, we have made an assumption that all permutations of the input numbers are equally likely. When this assumption on the distribution of the inputs is valid, many people regard quicksort as the algorithm of choice for large enough inputs. In an engineering situation, however, we cannot always expect it to hold. (See Exercise 8.2-3.) This section introduces the notion of a randomized algorithm and presents two randomized versions of quicksort that overcome the assumption that all permutations of the input numbers are equally likely.

We call an algorithm * randomized* if its behavior is determined not only by the input but also by values produced by a

By modifying the PARTITION procedure, we can design another randomized version of quicksort that uses this random-choice strategy. At each step of the quicksort algorithm, before the array is partitioned, we exchange element *A*[*p*] with an element chosen at random from *A*[*p* . . *r*]. This modification ensures that the pivot element *x* = *A*[*p*] is equally likely to be any of the *r* - *p* + 1 elements in the subarray. Thus, we expect the split of the input array to be reasonably well balanced on average. The randomized algorithm based on randomly permuting the input array also works well on average, but it is somewhat more difficult to analyze than this version.

The changes to PARTITION and QUICKSORT are small. In the new partition procedure, we simply implement the swap before actually partitioning:

RANDOMIZED-PARTITION(A,p,r)

1iRANDOM(p,r)

2 exchangeA[p]A[i]

3returnPARTITION(A,p,r)

We now make the new quicksort call RANDOMIZED-PARTITION in place of PARTITION:

RANDOMIZED-QUICKSORT(A,p,r)

1ifp<r

2thenqRANDOMIZED-PARTITION(A,p,r)

3 RANDOMIZED-QUICKSORT(A,p,q)

4 RANDOMIZED-QUICKSORT(A,q+ 1,r)

We analyze this algorithm in the next section.

Why do we analyze the average-case performance of a randomized algorithm and not its worst-case performance?

During the running of the procedure RANDOMIZED-QUICKSORT, how many calls are made to the random-number generator RANDOM in the worst case? How does the answer change in the best case?

Describe an implementation of the procedure RANDOM*(a, b)* that uses only fair coin flips. What is the expected running time of your procedure?

Give a (*n*)-time, randomized procedure that takes as input an array *A*[1 . . *n*] and performs a random permutation on the array elements.

Section 8.2 gave some intuition for the worst-case behavior of quicksort and for why we expect it to run quickly. In this section, we analyze the behavior of quicksort more rigorously. We begin with a worst-case analysis, which applies to either QUICKSORT or RANDOMIZED-QUICKSORT, and conclude with an average-case analysis of RANDOMIZED-QUICKSORT.

We saw in Section 8.2 that a worst-case split at every level of recursion in quicksort produces a (*n*^{2}) running time, which, intuitively, is the worst-case running time of the algorithm. We now prove this assertion.

Continuing with our bounding of *T*(*n*)*,* we obtain

T(n)cn^{2}-2c(n- 1) + (n)

cn^{2 },

We have already given an intuitive argument why the average-case running time of RANDOMIZED-QUICKSORT is (*n *1g *n*): if the split induced by RANDOMIZED-PARTITION puts any constant fraction of the elements on one side of the partition, then the recursion tree has depth (1g *n*) and (*n*) work is performed at (1g *n*) of these levels. We can analyze the expected running time of RANDOMIZED-QUICKSORT precisely by first understanding how the partitioning procedure operates. We can then develop a recurrence for the average time required to sort an *n*-element array and solve this recurrence to determine bounds on the expected running time. As part of the process of solving the recurrence, we shall develop tight bounds on an interesting summation.

We first make some observations about the operation of PARTITION. When PARTITION is called in line 3 of the procedure RANDOMIZED-PARTITION, the element *A*[*p*] has already been exchanged with a random element in *A*[*p . . r*]. To simplify the analysis, we assume that all input numbers are distinct. If all input numbers are not distinct, it is still true that quick-sort's average-case running time is *O*(*n* lg *n*), but a somewhat more intricate analysis than we present here is required.

Our first observation is that the value of *q* returned by PARTITION depends only on the rank of *x = A*[*p*] among the elements in *A*[*p . . r*]*.* (The * rank* of a number in a set is the number of elements less than or equal to it.) If we let

We show below that the summation in the last line can be bounded by

It remains to prove the bound (8.5) on the summation

Since each term is at most *n* lg *n*, we have the bound

if *n* 2. This is the bound (8.5).

Show that quicksort's best-case running time is (*n*1g*n*).

Show that RANDOMIZED-QUICKSORT's expected running time is (*n *1g *n*).

The running time of quicksort can be improved in practice by taking advantage of the fast running time of insertion sort when its input is "nearly" sorted. When quicksort is called on a subarray with fewer than *k* elements, let it simply return without sorting the subarray. After the top-level call to quicksort returns, run insertion sort on the entire array to finish the sorting process. Argue that this sorting algorithm runs in *O*(*nk + n* 1g(*n*/*k*)) expected time. How should *k* be picked, both in theory and in practice?

Consider modifying the PARTITION procedure by randomly picking three elements from array *A* and partitioning about their median. Approximate the probability of getting at worst an *-to-(1 - *) split, as a function of * in the range 0 < * < 1.

Give a careful argument that the procedure PARTITION in Section 8.1 is correct. Prove the following:

**a****. **The indices *i* and *j* never reference an element of *A* outside the interval [*p *. . *r*].

**b****.** The index *j* is not equal to *r* when PARTITION terminates (so that the split is always nontrivial).

8-2 Lomuto's partitioning algorithm

Consider the following variation of PARTITION, due to N. Lomuto. To partition *A*[*p *. . *r*], this version grows two regions, *A*[*p* . . *i*] and *A*[*i* + 1 . . *j*], such that every element in the first region is less than or equal to *x* = *A* [*r*] and every element in the second region is greater than *x*.

LOMUTO-PARTITION(A, p, r)

1xA[r]

2ip- 1

3forjptor

4do ifA[j]x

5thenii+ 1

6 exchangeA[i]A[j]

7ifi<r

8then returni

9else returni- 1

* a. *Argue that LOMUTO-PARTITION is correct.

**c****.** Argue that LOMUTO-PARTITION, like PARTITION, runs in (*n*) time on an *n*-element subarray.

Professors Howard, Fine, and Howard have proposed the following "elegan" sorting algorithm:

The QUICKSORT algorithm of Section 8.1 contains two recursive calls to itself. After the call to PARTITION, the left subarray is recursively sorted and then the right subarray is recursively sorted. The second recursive call in QUICKSORT is not really necessary; it can be avoided by using an iterative control structure. This technique, called* tail recursion*, is provided automatically by good compilers. Consider the following version of quicksort, which simulates tail recursion.

QUICKSORT'(A,p,r)

1whilep<r

2doPartition and sort left subarray

3qPARTITION(A,p,r)

4 QUICKSORT'(A,p,q)

5pq+ 1

**a****. **Argue that QUICKSORT'(*A*, 1, *length*[*A*]) correctly sorts the array *A*.

* b.* Describe a scenario in which the stack depth of QUICKSORT' is (

**c****. **Modify the code for QUICKSORT' so that the worst-case stack depth is (1g *n*).

One way to improve the RANDOMIZED-QUICKSORT procedure is to partition around an element *x* that is chosen more carefully than by picking a random element from the subarray. One common approach is the * median-of-3* method: choose