5th Normal Form Applied

Banana

split with a cherry atop.
Local time
Today, 14:29
Joined
Sep 1, 2005
Messages
6,318
I've been reading up on 4th and 5th normal form, and two days ago, I had a Eureka! moment regarding how to design my database, which does need to be in fifth normal form. For those interested, those were my sources:

Simple Guide to Five Normal Forms
Wikipedia on Database Normalization (Note that 4th and 5th normal has their own article linked within that article)
University of Texas Data Modeling

Now, the trouble with all of those sources is they discuss about what we want to *see*, not the *actual* structure of fourth or fifth normal form. The articles talks about what the result would be like, presumably using a query, but doesn't describe the structure needed to achieve 5NF.

My assumption was that because we obviously need to join each pair to make "add one data once" possible, so we need three junction tables for each possible pair of the threesome. Let's consider a group:

Contract- All different contracts my company have and will have
Service - All service provided by the contract.
Detail - All requirements needed by the service to satisfy the contract.

Mind you, a same service can be used in two different contract, but has a entirely different set of detail because detail (aka requirement) depends on the contract's stipulation. So we have three pairwise junction:

Contract-Service
Service-Detail
Contract-Detail

The threesome junction table (Contract-Service-Detail) will have children, where I can actually input real data (not just metadata describing what kind of data I need to have, which the threesome junction table is for).

But I'm starting to confuse myself because I need the data to be in a table, not a query as it will have children. But if it's going to be in a table, do I still need to have pairwise junctions?

I made myself an "administrative form" to help with managing the three tables. The way I had it set up was to start with Contract, then have Service as a subform, which in turn would have the threesome junction table as recordsource for second subform but showing only the Detail. In AfterUpdate, I then add the Detail to both Contract-Detail and Service-Detail.

I did thought that I could have made it easier and just use that threesome junction table as a recordsource and add records directly to the table, but I realized that it would defeat the point of 5NF; you only need to add one data for a pair, and the other pair will take care of themselves. I am bit concerned that the present setup as this 1) tells me that I could be overcomplicating it as I have three simple one-many relationship in a hierarchical fashion (but that cannot be correct because my intention is to be able to re-use service for different contract, and allow different details for same service for different contracts), and 2) I'm thinking I'm supposed to set my form up so there's three datasheets for each pairwise junction tables, as this would be exactly the benefit of 5NF; add one data to one place and it will "propogate" to other junction tables; yet I need the threesome junction in a table....

I also did a test; I made a query pulling all records from the threesome junction table and another query pulling records based on the three tables and three pairwise junction tables. Both had different results. I know that one of source cautioned that set of tables not in 5NF will produce spurious results. Now, I'd probably expect that from the threesome junction table, but that's not the case. The query of three tables/three pairwise junction tables returns 23 records while the query of threesome table returns 22 results. Did that mean I did something wrong here?

I know this is quite a mouthful, but if you're thinking "WTF is he talking about?", please say so and let me know how I can clarify.
 
For me, you don't have to clarify the 5th normal form. I 've worked for an insurance comapny and they had a database designed in the 5th NF. It was extremely flexible, you can store everything, all possible links are easily set.
However...
It is also vey difficult to maintain, it slows your application down because you have to join all the tables to get the information you want. Having done that once it is very tedious to update or change.
I never get passed the Boyce-Codd NF (BCNF), which is the NF between the third and the fourth.

Enjoy!
 
Hm, I wasn't trying to teach anyone about 5NF. I was asking some questions because I was uncertain about some points. I threw in sources to give a idea of what I've been reading, in case the sources could be in error, or did not address a point that is tripping me up. Furthermore, I'm not talking about joining the tables. I need a *actual* table of the joins because this table will have children, but the articles doesn't discuss about the table structure. Here are my questions edited out from the OP.

But I'm starting to confuse myself because I need the data to be in a table, not a query as it will have children. But if it's going to be in a table, do I still need to have pairwise junctions?

I am bit concerned that the present setup as this 1) tells me that I could be overcomplicating it as I have three simple one-many relationship in a hierarchical fashion (but that cannot be correct because my intention is to be able to re-use service for different contract, and allow different details for same service for different contracts), and 2) I'm thinking I'm supposed to set my form up so there's three datasheets for each pairwise junction tables, as this would be exactly the benefit of 5NF; add one data to one place and it will "propogate" to other junction tables; yet I need the threesome junction in a table....

That paragraph didn't sound like a question; rephrasing...

If I could just make a form using the threeway junction table as the recordsource, doesn't that make point of having a pairwise junction table moot?

I also did a test; I made a query pulling all records from the threesome junction table and another query pulling records based on the three tables and three pairwise junction tables. Both had different results. I know that one of source cautioned that set of tables not in 5NF will produce spurious results. Now, I'd probably expect that from the threesome junction table, but that's not the case. The query of three tables/three pairwise junction tables returns 23 records while the query of threesome table returns 22 results. Did that mean I did something wrong here?
 
I also did a test; I made a query pulling all records from the threesome junction table and another query pulling records based on the three tables and three pairwise junction tables. Both had different results. I know that one of source cautioned that set of tables not in 5NF will produce spurious results. Now, I'd probably expect that from the threesome junction table, but that's not the case. The query of three tables/three pairwise junction tables returns 23 records while the query of threesome table returns 22 results. Did that mean I did something wrong here?

Examining things closer with a fresh mind, it became apparent that there was one extra record in the query of three tables/three pairwise junction because I had deleted one record from the three-way junction table, which in turns delete the record from one pairwise junction table because I had Cascade Delete enabled. But this does not delete the record from other pairwise junction table containing the record that I just deleted.

Reflecting on this fact, I now understand that my administrative form was set up incorrectly. This is because I have it set up as a form with two nested subform:

1st Level -- tblContract
2nd Level -- tblContractService
3rd Level -- tblContractServiceDetail (with VBA to add new records to tblContractDetail and tblServiceDetail)

While this was a great setup for viewing how everything is structured, this simply doesn't work wtih 5NF's proposition that you should add/delete from one pairwise junction table which will take care of the remaining joins. For adding/updating/deleting, I would need the form to be set up as three sibling subform (e.g. not nested but set side by side)

tblContractService -- tblServiceDetail -- tblContractDetail

But this creates a new problem: How do I keep the three way junction table up to date? As I said earlier, I can't use a query because I need to have children based on that the three way join.

Any insights?
 
If I could just make a form using the threeway junction table as the recordsource, doesn't that make point of having a pairwise junction table moot?
Apparently i have to do some reading because i don't know what a threeway junction table is and a pairwisw junction table. I'll check out your links. Because i am interested in the subject.
 
Ok, just so you know, the terms aren't in the sources I referred. As I said earlier, the sources talks about what we want to "see", as in a query, rather the actual table structure. Therefore I had to infer based on what they showed (as a query) that I would need three pairwise junction... Here's an example.

Let's say I want to track which salesman sells what manufacturer's car body type, even though manufacturer can make more than one kind of body type, and not all salesman may sell all of a manufacturer's models.

So that's three entities:

Salesman
Manufacturer
Car Body Type

To achieve 5NF, we need to have three pairwise junction tables:

Salesman-Manufacturer
Manufactuer-Car Body Type
Salesman-Car Body Type

But for final analysis, when we want to know who sells what kind of car from what manufacturer, we use a query that joins all three things together:

Salesman-Manufacturer-Car Body Type.

The three way joins ensure that only records for car body type that are true for *both* salesman (e.g. he does sell this kind), and manufacturer (e.g. they actually make it), and with the pairwise junction tables above, you only need to add single instance of record if anything changes. Say salesman now sells truck among with his car. You add it to the Salesman-Car Body Type. The query then will update correctly to show all available trucks he sells based on what manufacturer he sells for among with cars he already sells.

But that's where the sources stop. There's no talk about keeping the 5NF in a three way junction table. And I need my three way junction table to have children. Hence my confusion.

~~~~~~~~~​

Now, after considering the problem, it occurred to me that 5NF may not be suitable because 5NF, unlike 1NF-4NF is all about join dependency, not function dependency. It's talking about how much you can decompose your tables to a point where you can reconstruct everything using join. 4NF is about separating our multi-valued function dependency so there's no update/delete anomaly.

Now going back to my administrative form, I built it with two nested subforms because that was exactly what I wanted to see.

Code:
Contract
   Service
       Detail

It struck me that if the form is *correct* in representing what I want to see, I need my table to be structured thus:

tblContract M:M tblService = tblContractService

tblDetail M:M tblContractService = tblContractServiceDetail

Now we have something that we can use to base children off, and not suffer from data anomaly, and I don't need to do any VBA to update the two other pairwise junction table (e.g. ContractDetail and ServiceDetail), because the Detail's meaning compared with each is uninteresting. It's only with the *pair* of Contract and Service does it come to mean something.

A tangent: Does it makes sense to design the form first before structuring tables in future, to see what you want to structure tables like? When I did that, it became a bit more clearer that 5NF isn't what I need here. This is maybe just 3NF or 4NF (not sure quite).
 
Yet another thought provoking thread from Banana. It seems to me that normally we need a db to be at least 3NF and after that then it depends on the complexity of the data and application as to how much further you need to go. Reading some of the articles referenced by in the OP it seems that the difference between 4nF and 5NF is small and may be neglected depending on the application. My advice is to make sure that the structure is taylored to the application but then I am a pragmatist.
 
Yep. It also state that most of time, a form in 4NF is already in 5NF save only for exceptional cases.
 

Users who are viewing this thread

Back
Top Bottom