INF: How to Dynamically Number Rows in a Select Statement

ID: Q186133

The information in this article applies to:

Microsoft SQL Server versions 4.2x, 6.0, 6.5

SUMMARY

Row numbering or ranking is a typical, procedural issue. The solutions are generally based on loops and temporary tables and, therefore, SQL Server loops and cursors. This article describes how to dynamically rank rows when performing a SELECT statement. It shows a flexible method, which may be the only possible solution. It is faster than the procedural solution.

MORE INFORMATION

This technique is based on an AUTO JOIN. The chosen relationship is generally "is greater than." The goal is to count how many times each element of a given set of data fulfills the relationship "is greater than" when the set is compared to itself. The following examples are based on database pubs.

Example 1

Set 1 is authors.
Set 2 is authors.
The relationship is "last and first names are greater than."
We avoid the duplicate problem by comparing the first + last names to the other first + last names.
We count the number of times the relationship is fulfilled by count(*).

Query:


   select rank=count(*), a1.au_lname, a1.au_fname
   from authors a1, authors a2
   where a1.au_lname + a1.au_fname >= a2.au_lname + a2.au_fname
   group by a1.au_lname, a1.au_fname
   order by 1

Result:


   Rank        Au_Lname              Au_Fname
   ----        --------------        -----------
     1         Bennet                Abraham
     2         Blotchet-Halls        Reginald
     3         Carson                Cheryl
     4         DeFrance              Michel
     5         del Castillo          Innes
     6         Dull                  Ann
     7         Greene                Morningstar
     8         Green                 Marjorie
     9         Gringlesby            Burt
    10         Hunter                Sheryl
    11         Karsen                Livia
    12         Locksley              Charlene
    13         MacFeather            Stearns
    14         McBadden              Heather
    15         O'Leary               Michael
    16         Panteley              Sylvia
    17         Ringer                Albert
    18         Ringer                Anne
    19         Smith                 Meander
    20         Straight              Dean
    21         Stringer              Dirk
    22         White                 Johnson
    23         Yokomoto              Akiko

   (23 row(s) affected)

Example 2

We want to rank stores by the number of books they have sold.

Set 1 is the number of books sold by store: select stor_id, qty=sum(qty) from sales group by stor_id.
Set 2 is the number of books sold by store: select stor_id, qty=sum(qty) from sales group by stor_id.
The relationship is "the number of books is greater than".
To avoid duplicates we could compare price*qty instead of qty for instance.

Query:


   select rank=count(*), s1.stor_id, qty=sum(s1.qty)
   from (select stor_id, qty=sum(qty) from sales group by stor_id) s1,
        (select stor_id, qty=sum(qty) from sales group by stor_id) s2
   where s1.qty >= s2.qty
   group by s1.stor_id
   order by 1

Result:


   Rank     Stor_Id    Qty
   ----     -------    ---
   1         6380        8
   2         7896      120
   3         8042      240
   4         7067      360
   5         7066      625
   6         7131      780

   (6 row(s) affected)

This example and some that follow only work in Microsoft SQL Server version 6.5, because they use derived tables in the FROM clause.

Example 3

We want to rank the publishers by their earnings.

Set 1 is the total sales by publisher:


        select t.pub_id, sales=sum(s.qty*t.price)
        from sales s, titles t
        where s.title_id=t.title_id
          and t.price is not null
        group by t.pub_id

Set 2 is the total sales by publisher:


        select t.pub_id, sales=sum(s.qty*t.price)
        from sales s, titles t
        where s.title_id=t.title_id
          and t.price is not null
        group by t.pub_id

The relationship is "earns more money than."

Query:


   select rank=count(*), s1.pub_id, sales=sum(s1.sales)
   from    (select t.pub_id, sales=sum(s.qty*t.price)
           from sales s, titles t
           where s.title_id=t.title_id
             and t.price is not null
           group by t.pub_id) s1,
           (select t.pub_id, sales=sum(s.qty*t.price)
           from sales s, titles t
           where s.title_id=t.title_id
             and t.price is not null
           group by t.pub_id) s2
   where s1.sales>= s2.sales
   group by s1.pub_id
   order by 1

Result:


   Rank     Pub_Id   Sales
   ----     ------   --------
   1         0736    1,961.85
   2         0877    4,256.20
   3         1389    7,760.85

   (3 row(s) affected)

Drawbacks

Because of the CROSS JOIN, this is not designed for working with a large number of rows. It works well for tens or hundreds of rows. On large tables, be sure to use an index to avoid large scans.

This does not work well with duplicate values. In other words, when comparing duplicate values, discontinuous row numbering occurs. If this is not the desired behavior, you can avoid it by hiding the rank column when you insert the result in a spreadsheet; use the spreadsheet numbering instead.

Example:


   select rank=count(*), s1.title_id, qty=sum(s1.qty)
   from (select title_id, qty=sum(qty) from sales group by title_id) s1,
        (select title_id, qty=sum(qty) from sales group by title_id) s2
   where s1.qty >= s2.qty
   group by s1.title_id
   order by 1

Result:


   Rank    Title_Id    Qty
   ----    --------    ----
   1       MC2222        10
   4       BU1032        60
   4       BU7832        60
   4       PS3333        60
   7       PS1372       140
   7       TC4203       140
   7       TC7777       140
   10      BU1111       250
   10      PS2106       250
   10      PS7777       250
   11      PC1035       330
   12      BU2075       420
   14      MC3021       560
   14      TC3218       560
   15      PC8888       750
   16      PS2091      1728

   (16 row(s) affected)

Benefits

These queries can be used in views and result formatting. The lower-ranked data can be shifted farther to the right.

Example 1:


   CREATE VIEW v_pub_rank
   AS
   select rank=count(*), s1.title_id, qty=sum(s1.qty)
   from (select title_id, qty=sum(qty) from sales group by title_id) s1,
        (select title_id, qty=sum(qty) from sales group by title_id) s2
   where s1.qty >= s2.qty
   group by s1.title_id

Query:


   select  publisher=convert(varchar(20),replicate (' ', power(2,rank)) +
           pub_id +
           replicate(' ', 15-power(2,rank))+': '),
           earnings=qty
   from v_pub_rank

Result:


   Publisher       Earnings
   -------------   --------
     0736          : 1,961.85
       0877        : 4,256.20
           1389    : 7,760.85

Example 2:


   CREATE VIEW v_title_rank
   AS
   select rank=count(*), s1.title_id, qty=sum(s1.qty)
   from (select title_id, qty=sum(qty) from sales group by title_id) s1,
        (select title_id, qty=sum(qty) from sales group by title_id) s2
   where s1.qty >= s2.qty
   group by s1.title_id

Query:


   select  Book=convert(varchar(45),replicate (' ', 2*rank) +
           title_id +
           replicate(' ', 35-2*rank)+': '),
           qty
   from v_title_rank
   order by rank

Result:


   Book                                          Qty
   -------------------------------------------   ----
     MC2222                                 :      10
           BU1032                           :      60
           BU7832                           :      60
           PS3333                           :      60
                 PS1372                     :     140
                 TC4203                     :     140
                 TC7777                     :     140
                       BU1111               :     250
                       PS2106               :     250
                       PS7777               :     250
                         PC1035             :     330
                           BU2075           :     420
                               MC3021       :     560
                               TC3218       :     560
                                 PC8888     :     750
                                   PS2091   :    1728

   (16 row(s) affected)

Additional query words: prodsql join rank formatting query

Keywords : SSrvTran_SQL
Version : WinNT:4.2x,6.0,6.5
Platform : winnt
Issue type : kbhowto