MDAC 2.5 SDK - OLE DB Programmer's Reference
Chapter 26: Mapping MDX to SQL Statements


 

EXCEPT Function

This section explains how the EXCEPT function handles duplicates.

Removing Duplicates

Consider EXCEPT(S1, S2), where:

S1 = {Kansas, Buffalo, Buffalo, Topeka, Topeka, Wichita, Canada, BC}

S2 = {Buffalo, Topeka, USA, Quebec}

As shown in the section "Literal Sets," create two tables—S1 and S2—that contain the fully qualified member names in the sets S1 and S2, respectively.

The following query does the EXCEPT operation:

SELECT Name, Newrank AS Rank
FROM 
         (((SELECT Name, Rank FROM S1
         WHERE S1.Name NOT IN (SELECT Name FROM S2))
      RANKTOP 1 LEVELS RANKORDER BY Rank RANKGROUP BY Name)
   RANK ROWS AS NewRank RANKORDER BY Rank)
ORDER BY Rank

In English, the query does the following:

  1. Selects the rows of S1 that are not in S2.

  2. The RANKTOP operation groups duplicates into separate groups by using the GROUP BY clause, orders each group by Rank, and then picks up the first one. This ensures that duplicates are eliminated from the tail.

  3. The RANK operation orders the filtered rows by Rank and assigns a new rank to each row so that input ordering is preserved.

Note   You cannot use the EXCEPT operation of SQL-92 because duplicate elimination is not guaranteed to remove duplicates from the tail.

Retaining Duplicates

Consider EXCEPT(S1, S2, [ALL]). As in the preceding section, create the tables S1 and S2. Then use the following query:

SELECT Name, Newrank AS Rank
FROM 
      ((SELECT Name, Rank FROM S1
      WHERE S1.Name NOT IN (SELECT Name FROM S2))
   RANK ROWS AS NewRank RANKORDER BY Rank)
ORDER BY Rank

This is simpler because there is no need to remove duplicates. After the query that finds the difference, the RANK operation generates a rank value for each of the rows of the resulting table. The outer SELECT produces this table, ordering it by the NewRank (renamed as Rank) column.

Note   You cannot use EXCEPT ALL from SQL-92 because its semantics are as follows: If a row r appears n times in S1 and m times in S2, then r appears exactly p times in the result, where p is the greater of m – n and 0. Moreover, the order of these p rows is not preserved.

Generalizing for Sets with Arbitrary Dimensionality

Each tuple has more than one element for these sets. Replace the NAME column in the select list of each SELECT statement above with a list of columns (equivalent to the number of dimensions in the input sets). Add as many NOT IN clauses in the WHERE as there are columns, one for each column.