Tuesday, June 14, 2011

Normalizing Repeated Columns: Yes/No Fields (Part2)

Last time, in Normalizing Repeated Columns: Yes/No Fields (Part1) I talked about Normalizing data in repeated columns from a spreadsheet, like this:


Into a structure like this:





Using a One-Time process that creates the tables and relationships in addition to copying the data.  This time, I’ll talk about appending data to existing tables.
 

Appending to Existing Tables

The process for normalizing repeated column data into Existing Tables is similar to the One-Time process. It does, however, require you to create the tables and relationships correctly ahead of time. Because I've just created them in the previous section, I'll simply delete the data and re-append it from the linked spreadsheet.
The two processes are very similar, but I'll be using Append queries rather than Make-Table queries.
Step 1: Remove the common Patient elements.
Since I've already got a table, I'll use an Append query rather than the Make-Table query I used earlier:

INSERT INTO Patient ( Patient )
SELECT Patient
FROM Patient_RC
GROUP BY Patient;


Or in the Query Builder:


The resulting table looks like this:


Since I already have an autonumber primary key defined on the table, I don't need to do anything further.
Step 2: Create Symptom_Temp table
This step is exactly the same as Step 2 above. I'll use a Make-Table query to create a temporary table that will be used for creating the relationships later.

SELECT Patient, IIf([cough],"Cough") AS Symptom
FROM Patient_RC WHERE [Cough] = True
UNION ALL
SELECT Patient, IIf([Sneeze],"Sneeze") AS Symptom
FROM Patient_RC WHERE [Sneeze] = True
UNION ALL
SELECT Patient, IIf([Fever],"Fever") AS Symptom
FROM Patient_RC WHERE [Fever] = True
UNION ALL
SELECT Patient, IIf([Body_Aches],"Body aches") AS Symptom
FROM Patient_RC WHERE [Body_Aches] = True
UNION ALL SELECT Patient, IIf([Nausea],"Nausea") AS Symptom
FROM Patient_RC WHERE [Nausea] = True
ORDER BY Patient, Symptom;


As I showed in the previous post, there are two ways to make this into an Append query: 1) a Stacked Query, and 2) All In One.

Stacked Query

A stacked query is simply a query which uses another query in the FROM clause instead of a table. So if I save the above query as "qryTemp", I can use that query in the FROM clause of an Append query:

INSERT INTO Symptom_temp ( Patient, Symptom )
SELECT Patient, Symptom
FROM qryTemp;


Or in the Query Builder:



All-in-one Query

It's also possible to it in a single query. To do that, I surround the SQL of qryTemp in parentheses and use it in the From clause of the Append query. Like this:

INSERT INTO Symptom_temp ( Patient, Symptom )
SELECT qryTemp.Patient, qryTemp.Symptom
FROM
(SELECT Patient, IIf([cough],"Cough") AS Symptom
FROM Patient_RC WHERE [Cough] = True
UNION ALL
SELECT Patient, IIf([Sneeze],"Sneeze") AS Symptom
FROM Patient_RC WHERE [Sneeze] = True
UNION ALL
SELECT Patient, IIf([Fever],"Fever") AS Symptom
FROM Patient_RC WHERE [Fever] = True
UNION ALL
SELECT Patient, IIf([Body_Aches],"Body aches") AS Symptom
FROM Patient_RC WHERE [Body_Aches] = True
UNION ALL
SELECT Patient, IIf([Nausea],"Nausea") AS Symptom
FROM Patient_RC WHERE [Nausea] = True
ORDER BY Patient, Symptom) AS qryTemp;


The result will be a table that looks like this:


Step 3: Remove Duplicates from Symptom_Temp
To fulfill 3NF, I need to create a query that removes the duplicate Symptom values into the Symptom table. The GROUP BY clause works well for that. . As before, I'll use an Append write the data to my existing Software table:

INSERT INTO Symptoms ( Symptom )
SELECT Symptom
FROM Symptom_temp
GROUP BY Symptom;


Or in the Query Builder:



With the resulting table:



Again, this table already has an autonumber primary key, so I don't need to create one.
Step 4: Creating the Linking Table: Patient_Symptoms
So now, I've got my two "One-Side" tables: Patient and Symptoms. Now I just need to fill the linking table.

I can do that with an Append Query and a simple join of Patient, Symptoms and Symptom_Temp:

INSERT INTO Patient_Symptoms ( PatientID, SymptomID )
SELECT Patient.PatientID, Symptoms.SymptomID
FROM (Patient INNER JOIN Symptom_temp ON Patient.Patient = Symptom_temp.Patient)
INNER JOIN Symptoms ON Symptom_temp.Symptom = Symptoms.Symptom;


Or in the Query Builder:



The final result will look like this:



It is important to note that the order in which the data is moved is vital. Data must be written into the "one-side" tables (Patient and Symptoms) first, and then data can be moved into the "linking" tables (Patient_ Symptom).

This two-part post is available as a single download (with sample database) here:
http://www.rogersaccesslibrary.com/forum/topic564.html











































No comments: