Monday, January 10, 2011

Domain Function Example: "Difference Between" in Query


So far in this series on Domain Functions, I've discussed the general syntax (Domain Functions Demystified) and problems involved in building criteria expressions (Domain Functions Demystified: Criteria Expressions). Unfortunately, many of the examples I've given are relatively trivial. So for my next few blog posts, I thought I'd give what I consider truly useful applications of domain functions.

Other Examples:
  1. Simulate AutoNumber with DMax
  2. Running Sum with DSum
  3. Numbered Query with DCount
  4. Rolling Average with DAvg and DCount
  5. Begin Date and End Date from Effective Date
Earlier, I looked at the Running Sum query. A similar (but opposite) problem is the Difference Between query. Unlike a running sum, which adds the value of a field in a record to the value of the same field in the previous record, the "difference between" query subtracts the value of the previous record from the value of the current record to show the difference.

For instance, suppose I wanted to display the difference in days between orders in the table below:
 
Figure1: Need to calculate the difference between records.

The difference in days between records 1 and 2 is 4, between 2 and 3 is 7, between 3 and 4 is -22, and so forth.

The problem is that SQL does not have positional notation like Excel does. There's no way to simply point to the record above the one you're on. The only way to do it is to somehow identify the previous record in terms of a Where condition. Since this Where condition must be evaluated for each line, I can do this with a domain aggregate functions (DMax & DLookup).

For this method to work, I must have a unique record ID. The Autonumber field is ideal for this. It doesn't matter if there are gaps in the sequence, but I have to sort on this field, so there cannot be duplicates and they must be in the order I need displayed. In the above sample, OrderDetailsID fits the bill.

The technique is similar to creating a Numbered Query or a Running Sum in a query, but instead of just counting or summing all the records above the current record, I have to find just the previous record. This adds an additional complication, which requires an additional domain function.

Specifically, I need three steps:
  1. Use a domain function (DMax) to find the unique identifier the previous row.
  2. Then feed that value in to another domain function (DLookup) that identifies the previous value,
  3. And then subtract the previous value from the current value.
One way to show the difference between records is over the whole recordset. Later, I'll look at showing the difference between over groups of records.

Difference Between - Over All
Domain Aggregate functions are an Access-only method to return statistical information about a specific set of records, whether from a table or query. They have three arguments: 1) an expression that identifies a field, 2) a string expression that identifies a domain (that is, the table or query), and 3) a Criteria, which is essentially an SQL Where clause without the word WHERE.

Here are the steps:
  1. I need a DMax function to return the OrderDetailID of the previous record in the table:
    DMax("OrderDetailID","tblOrderDetails","OrderDetailID < " & [OrderDetailID])
  2. Next, I'll feed that to a DLookup function, which will return the date value from the previous row given number of records.
    DLookUp("OrderDate","tblOrderDetails","OrderDetailID = " & DMax("OrderDetailID","tblOrderDetails","OrderDetailID < " & [OrderDetailID]))
  3. Lastly, I'll subtract the this value from [OrderDate]of the current record and give the column an alias:[OrderDate]-DLookUp("OrderDate","tblOrderDetails","OrderDetailID = " & DMax("OrderDetailID","tblOrderDetails","OrderDetailID < " & [OrderDetailID])) AS DaysBetween
The full query, looks like this:

SELECT tblOrderDetails.OrderDetailID, tblOrderDetails.OrderID, tblOrderDetails.OrderDate, nz([OrderDate]-DLookUp("OrderDate","tblOrderDetails","OrderDetailID = " & nz(DMax("OrderDetailID","tblOrderDetails","OrderDetailID < " & [OrderDetailID]),0)),0) AS DaysBetween
FROM tblOrderDetails
ORDER BY tblOrderDetails.OrderDetailID;

The two NZ() functions are needed to display a zero on the first line. Otherwise, it would return an ERROR.

The Order By clause in the query is important. This will sort the query on the OrderDetailID field. I'll need to have that order to use the criteria argument in the DMax.

It is not necessary that the Order By field is an unbroken sequence. As long as that field has unique values and is sorted, it will work.

Figure 2: Shows the time difference in days between subsequent records.


Difference Between - Over Group
 SELECT tblOrderDetails.OrderDetailID, tblOrderDetails.OrderID, tblOrderDetails.OrderDate,
Nz([OrderDate]-DLookUp("OrderDate","tblOrderDetails","OrderDetailID = " & Nz(DMax("OrderDetailID","tblOrderDetails","OrderID = " & [OrderID] & " And OrderDetailID < " & [OrderDetailID]),0)),0) AS DaysBetween
FROM tblOrderDetails
ORDER BY tblOrderDetails.OrderDetailID;

Figure 3: As the OrderID changes, the DaysBetween resets to zero.

The only difference between this query and the OverAll query is the addition of

"OrderID = " & [OrderID]

to the "Where" condition of the DMax function. This sets the value of the first record of each group to zero.

Subquery Method

As I said, these can also be done with a correlated subquery, which I may discuss at a later date. However, you can find both methods on my website in this sample: DaysBetween.mdb

No comments: