I've been reading up on 4th and 5th normal form, and two days ago, I had a Eureka! moment regarding how to design my database, which does need to be in fifth normal form. For those interested, those were my sources:
Simple Guide to Five Normal Forms
Wikipedia on Database Normalization (Note that 4th and 5th normal has their own article linked within that article)
University of Texas Data Modeling
Now, the trouble with all of those sources is they discuss about what we want to *see*, not the *actual* structure of fourth or fifth normal form. The articles talks about what the result would be like, presumably using a query, but doesn't describe the structure needed to achieve 5NF.
My assumption was that because we obviously need to join each pair to make "add one data once" possible, so we need three junction tables for each possible pair of the threesome. Let's consider a group:
Contract- All different contracts my company have and will have
Service - All service provided by the contract.
Detail - All requirements needed by the service to satisfy the contract.
Mind you, a same service can be used in two different contract, but has a entirely different set of detail because detail (aka requirement) depends on the contract's stipulation. So we have three pairwise junction:
Contract-Service
Service-Detail
Contract-Detail
The threesome junction table (Contract-Service-Detail) will have children, where I can actually input real data (not just metadata describing what kind of data I need to have, which the threesome junction table is for).
But I'm starting to confuse myself because I need the data to be in a table, not a query as it will have children. But if it's going to be in a table, do I still need to have pairwise junctions?
I made myself an "administrative form" to help with managing the three tables. The way I had it set up was to start with Contract, then have Service as a subform, which in turn would have the threesome junction table as recordsource for second subform but showing only the Detail. In AfterUpdate, I then add the Detail to both Contract-Detail and Service-Detail.
I did thought that I could have made it easier and just use that threesome junction table as a recordsource and add records directly to the table, but I realized that it would defeat the point of 5NF; you only need to add one data for a pair, and the other pair will take care of themselves. I am bit concerned that the present setup as this 1) tells me that I could be overcomplicating it as I have three simple one-many relationship in a hierarchical fashion (but that cannot be correct because my intention is to be able to re-use service for different contract, and allow different details for same service for different contracts), and 2) I'm thinking I'm supposed to set my form up so there's three datasheets for each pairwise junction tables, as this would be exactly the benefit of 5NF; add one data to one place and it will "propogate" to other junction tables; yet I need the threesome junction in a table....
I also did a test; I made a query pulling all records from the threesome junction table and another query pulling records based on the three tables and three pairwise junction tables. Both had different results. I know that one of source cautioned that set of tables not in 5NF will produce spurious results. Now, I'd probably expect that from the threesome junction table, but that's not the case. The query of three tables/three pairwise junction tables returns 23 records while the query of threesome table returns 22 results. Did that mean I did something wrong here?
I know this is quite a mouthful, but if you're thinking "WTF is he talking about?", please say so and let me know how I can clarify.
Simple Guide to Five Normal Forms
Wikipedia on Database Normalization (Note that 4th and 5th normal has their own article linked within that article)
University of Texas Data Modeling
Now, the trouble with all of those sources is they discuss about what we want to *see*, not the *actual* structure of fourth or fifth normal form. The articles talks about what the result would be like, presumably using a query, but doesn't describe the structure needed to achieve 5NF.
My assumption was that because we obviously need to join each pair to make "add one data once" possible, so we need three junction tables for each possible pair of the threesome. Let's consider a group:
Contract- All different contracts my company have and will have
Service - All service provided by the contract.
Detail - All requirements needed by the service to satisfy the contract.
Mind you, a same service can be used in two different contract, but has a entirely different set of detail because detail (aka requirement) depends on the contract's stipulation. So we have three pairwise junction:
Contract-Service
Service-Detail
Contract-Detail
The threesome junction table (Contract-Service-Detail) will have children, where I can actually input real data (not just metadata describing what kind of data I need to have, which the threesome junction table is for).
But I'm starting to confuse myself because I need the data to be in a table, not a query as it will have children. But if it's going to be in a table, do I still need to have pairwise junctions?
I made myself an "administrative form" to help with managing the three tables. The way I had it set up was to start with Contract, then have Service as a subform, which in turn would have the threesome junction table as recordsource for second subform but showing only the Detail. In AfterUpdate, I then add the Detail to both Contract-Detail and Service-Detail.
I did thought that I could have made it easier and just use that threesome junction table as a recordsource and add records directly to the table, but I realized that it would defeat the point of 5NF; you only need to add one data for a pair, and the other pair will take care of themselves. I am bit concerned that the present setup as this 1) tells me that I could be overcomplicating it as I have three simple one-many relationship in a hierarchical fashion (but that cannot be correct because my intention is to be able to re-use service for different contract, and allow different details for same service for different contracts), and 2) I'm thinking I'm supposed to set my form up so there's three datasheets for each pairwise junction tables, as this would be exactly the benefit of 5NF; add one data to one place and it will "propogate" to other junction tables; yet I need the threesome junction in a table....
I also did a test; I made a query pulling all records from the threesome junction table and another query pulling records based on the three tables and three pairwise junction tables. Both had different results. I know that one of source cautioned that set of tables not in 5NF will produce spurious results. Now, I'd probably expect that from the threesome junction table, but that's not the case. The query of three tables/three pairwise junction tables returns 23 records while the query of threesome table returns 22 results. Did that mean I did something wrong here?
I know this is quite a mouthful, but if you're thinking "WTF is he talking about?", please say so and let me know how I can clarify.