Querying for the Second-Highest or -Lowest Value in a Group
This content is no longer actively maintained. It is provided as is, for anyone who may still be using these technologies, with no warranties or claims of accuracy with regard to the most recent product version or service release.
Querying for the Second-Highest or -Lowest Value in a Group
by Susan Sales Harkins
In the September 2000 issue of Inside Microsoft Access, we showed you how to return the second-highest or -lowest value in a recordset (see the article "Querying for the second-highest or -lowest value"). That technique is useful enough, but it's limited to single-record data. Grouped data requires another solution. This month, we'll show you a different query technique submitted by Stephen Bond of New Zealand that returns the second-highest or -lowest value for each group.
For this approach, we'll be working with two queries again. The first query is a simple Totals query that identifies the highest value in each group. Then, we'll base a query on the original table of data and the Totals query. By relating fields and specifying criteria, we're able to eliminate the records returned by the first query from the results of the second. In addition, this second query is also a Totals query that returns the highest value in each group. By eliminating the highest value in each group via criteria, the query can return the second-highest value. In last month's article "Overcome query limitations by storing criteria in an Access table," we used a table to establish query criteria for calculating students' letter grades. The student score example also lends itself well to the technique we'll look at this month, but keep in mind that you can apply the technique for any instance in which you need to return the second-highest or -lowest value from a group.
First things first
Let's suppose you're maintaining a group of students' various test scores in a database. Your table might resemble the one shown in Figure A, named tblScores. We listed three scores for each of four students identified by the numbers 1 through 4. With such a small number of records we can clearly determine the second-highest score for each student as follows:
- Student 1 80
- Student 2 75
- Student 3 70
- Student 4 98
Figure A: Our data contains three scores for four students.
The first query
Creating the first query is a quick task as it's a simple Totals query. First, create the tblScores table and populate it with the data shown in Figure A. Then, select tblScores in the Database window and choose Query from the New Object button's dropdown list. Double-click on Design View in the New Query dialog box to open a new query design grid. Add both fields to the grid and then choose View | Totals from the menu bar. Select the Max aggregate function from the Score field's Total cell, as shown in Figure B. Running this query produces the results shown in Figure C. Save the query as qryStepOne and close it.
Figure B: This Totals query will return the highest score for each student.
Figure C: The Max function returns just the highest student scores.
The second query
Our second query is more complex, but it's not difficult to create. First, base a second query on tblScores. Then, add qryStepOne by clicking the Show Table button on the Query Design toolbar, clicking on the Queries tab in the Show Table dialog box, and then double-clicking on qryStepOne. Next, click Close and drag the following fields to the grid: tblScores.StudentCode, qryStepOne.StudentCode and tblScores.Score.
Building the relationships
Now we need to relate the two data sources. First, drag StudentCode from the tblScores list box to StudentCode in the qryStepOne list box. Similarly, drag Score in tblScores to MaxOfScore in qryStepOne.
By default, we just created two INNER joins, and we need LEFT joins. That's because we want our query to return all the records from tblScores, not just those records where the related data is equal. Right-click on a relationship line and choose Join Properties from the shortcut menu. If the resulting menu doesn't have that command, close it and try again. In the Join Properties dialog box, select the second option button, and click OK. Be sure to do this for both relationship lines and ensure that both lines have an arrow pointing to the appropriate fields in qryStepOne, as shown in Figure D.
Figure D: The second query requires LEFT joins to properly return the results we want.
Adding the criteria
Running the query now would return all the records in tblScores, as shown in Figure E. That's why we need to filter the query results. To do so, add the criteria expression
to the qryStepOne.StudentCode field's Criteria cell. As a result, the query omits the records with the highest scores for each student, as shown in Figure F. In Figure E we saw that the maximum value records contained data in the qryStepOne.StudentCode field, therefore the Is Null operator omits them from the query's results.
Figure E: Without a criteria expression, the query returns all the records from tblScores.
Figure F: At this point, the query returns all but the maximum value record for each group.
Finishing the job
At this point, our query still returns more than one record for each student--we've eliminated only the highest value for each student. Fortunately, the last step is simple. We'll make the query a Totals query and return the maximum value in each group, which is, of course, the second highest value for each student. With the query still in Design view, simply choose View | Totals from the menu bar. Now, select the Max aggregate function from the tblScores.Score field's Total cell. Finally, run the query to produce the results shown in Figure G. As you can see, each student has just one record, and the score is the second highest for that student. We saved our second query as qryStepTwo. There's no need to run both queries--simply run qryStepTwo.
Figure G: The completed query returns the second-highest score for each student.
Straightening a kink
As is, the query technique works well. However, it might pose a problem if you have just one record in a group. For instance, if you acquired a new student and that student has as yet only one score, the qryStepTwo doesn't return a score for that student. This may or may not be what you want. The nature of such a query won't include a single record group, because there can't be a second-highest or -lowest value when there's only one record. There can be only a highest or lowest value--the only value in the group (which of course is both the highest and the lowest value).
Realistically, you'll probably want to include these values anyway, as it isn't likely you'll want to just drop these records. Fortunately, our existing queries can accommodate this need with just a few additional fields. First, add a new record--the first record for Student 5--and enter a score of 85. Then, open qryStepOne in Design view and add a field that counts the records in each group, as shown in Figure H. To do so, simply drag the StudentCode field to the grid again and choose the Count aggregate function in the Totals cell. Then, save and close the query.
Figure H: The Count aggregate counts the number of records in each group.
Now, open qryStepTwo in Design view and add the qryStepOne.CountOfStudentCode field to the grid. Next, enter the value 1 in that field's or cell, as shown in Figure I. Be sure you use the or cell and not the Criteria cell, or the query won't return the correct results. The results of this query are shown in Figure J. As you can see, the query now includes the single record for our fifth student, even though that student has only one score so far.
Figure I: Now the query will return records with a count of 1 in the CountOfStudentCode field.
Figure J: Our query now returns a record for each student even when there's only one record for that student.
Returning the second-lowest value
We promised you a technique that returns both the second-highest or -lowest value and our example returns just the second highest. You can return the second lowest by using the Min aggregate function instead of the Max aggregate function in both queries. You can substitute the Min function for the Max function in the existing queries, but you'll have to replace a field and re-create the relationship, and doing so can get confusing; it's easier to just create two new queries. In addition, keep in mind that with our sample data the second-lowest value is also the second-highest value. If you do try the technique to return the second-lowest value, don't let the fact that the resulting records are the same make you think you've made a mistake--they should be the same.
Most of the time, we tend to think of queries as a one-step process, but as we've shown, sometimes two is better than one. In the case of both our second-highest and -lowest techniques, it would be impossible to return the same results with one fixed query. You might resort to VBA code, but unless you just want a development solution, code isn't necessary.
Copyright © 2001 Element K Content LLC. All rights reserved. Reproduction in whole or in part in any form or medium without express written permission of Element K Content LLC is prohibited. Element K is a service mark of Element K LLC.