Monday, April 4, 2011

Querying Repeated Columns: Multiple IIFs

In the first post in this series (The Problem of Repeated Columns), I defined repeated columns and talked about the data integrity problems associated with them. If you haven't read that yet, it would be worthwhile to read first. Similarly, if you are not familiar with the concept of Normalization, you should read my blog series What is Normalization?

What I'm going to concentrate in the next few post are specific problems with querying data stored in repeated columns. This time, it's Multiple IIF statements.

Others in this series:

Multiple IIF Statements

Another sort of repeated columns can be seen by the use of multiple Yes/No fields, each of which are just values of a specific category. For instance, suppose I had a patient table with a listing of symptoms as Yes/No fields. Like this:

Figure 1: Patient table with Yes/No fields representing symptoms.


If I wanted to create a simple list of patient's symptoms, in addition to the multiple Unions, I need multiple IIF statements to convert the field names into values. Something like this:

SELECT Patient, IIf([cough] = True,"Cough") AS Symptom
FROM Patient_RepeatedColumns WHERE [Cough] = True
UNION ALL
SELECT Patient, IIf([Sneeze] = True,"Sneeze") AS Symptom
FROM Patient_RepeatedColumns WHERE [Sneeze] = True
UNION ALL
SELECT Patient, IIf([Fever] = True,"Fever") AS Symptom
FROM Patient_RepeatedColumns WHERE [Fever] = True
UNION ALL
SELECT Patient, IIf([Body_Aches] = True,"Body aches") AS Symptom
FROM Patient_RepeatedColumns WHERE [Body_Aches] = True
UNION ALL
SELECT Patient, IIf([Nausea] = True,"Nausea") AS Symptom
FROM Patient_RepeatedColumns WHERE [Nausea] = True
ORDER BY Patient, Symptom;


By contrast, if I normalized the table to convert the repeated columns to repeated rows, it might look something like this:

Figure 2: Repeated columns converted to repeated rows in Symptoms table.


Creating the patient/symptom list is now simple:

SELECT Patient, Symptom
FROM Symptoms INNER JOIN Patient ON Symptoms.PatientID = Patient.PatientID
ORDER BY Patient, Symptom;


Or in the query builder:

Figure 3: Normalized query in the Query Builder.


The results of the queries are identical except the normalized query is updateable:

Figure 4: Results of both queries


Once again, querying repeated columns proves much more complicated and much less flexible than querying a normalized table structure.
















2 comments:

Tim Green said...

Roger,

I just stumbled across your blog looking for something else, but by coincidence I've dealt with a similar situation to the one you describe in the last week. I have an Excel table with products along row 1, dates in column 1 and populated with prices.

Many times the database designer has no control over data inputs - my spreadsheet comes from an industry website, for example, updated weekly.

Some discussion on efficient ways to normalise this kind of data might be helpful. My way was to write a vba function that imports the non-normal spreadsheet then iterates through the columns executing an append query into my normalised prices table.

I would rather saw off my right arm than write the kind of union query you described! I guess you would agree...

Thanks,

Tim

Roger Carlson said...

Tim,

That's a great idea for a follow-up series. Thanks.

--Roger