Given decomposition how to find it's in BCNF

darek_dade · Apr 9, 2009

Hello there

I am having a tough time figuring out how to do one of the questions on my assignment.

Given schema R with attributes {A,B,C,D,E,F,G,H} and functional dependencies:
{
G->B
AB->D
DE->C
GF->H
BD->E
AH->F
CE->A
AD->H
GF->D
}

I have to prove that the following decomposition:
R(A,B,C,D), S(C,D,E,F), and T(E,F,G,H) is in BCNF

First I need to find FD's for R, S and R and then check whether some FD violates BCNF

This is how I approached it
R(A,B,C,D; {AB->CD})
AB->CD because (AB)+ (closure) = ABDECHF (has C and D)

S(C,D,E,F; {DE->CF})
DE->CF because (DE)+ = DECAHF

T(E,F,G,H; {}) - here I have no idea how to find any FD

Does my solution for R and S seems plausable? How would you find a FD for T?

Any help is greatly appreciated!

gemma-the-husky · Apr 9, 2009

Can I be first?

I suspect most developers are are competent at normalisation empirically.

I may be wrong, suspect very few of us (myself included) are mathematically trained to a point where they could produce a formal proof of a generalised theorem in this way.

This sounds like a very advanced level course. What level is it?

darek_dade · Apr 9, 2009

gemma-the-husky said:
Can I be first?

Sure

gemma-the-husky said:
I suspect most developers are are competent at normalisation empirically.

I may be wrong, suspect very few of us (myself included) are mathematically trained to a point where they could produce a formal proof of a generalised theorem in this way.

This sounds like a very advanced level course. What level is it?

Well, the course is called Introduction to Databases so my guess is it's not supposed to be very advanced

.

Len Boorman · Apr 9, 2009

I am not sure but I would suggest that the normal exercise in functional dependancies would normally result in 3NF. Now a relation in BCNF is not necessarily in 3NF. You will remember that 3NF removes transient dependancies and that BCNF requires 2NF as a base and then that every attribute is a candidate key. Brain is reaching places it has not been for a while so I may not be 100% here.

So my approach would be to examine the result you have been given and
1) Establish it is in 2NF
2) Prove there is a transient dependancy and therefore not in 3NF
3) Make a bold statement that therefore it is in BCNF

I do believe it is a case of proving two situations which then purely leaves a single logical deduction

Cannot do the business for you cos the brain is a bit tired after the above explanation but you being fresh into the subject I am sure has the yiung brain cells to do the business

Len

Had a little look
Not sure about this bit its a maybe

AB>D

Therefore in R(ABCD) D is dependant on A and B. If A is the PK then there is a transient dependancy and of D on A and B. On the other hand I could be talking rubbish

stopher · Apr 10, 2009

This stuff was certainly part of my studies although I wasn't very good at it. Must be tens years since I saw questions like that. You'll be glad to know you don't need to draw on this in the real world (hence Dave's comment).

Like Len I don't know if I'm on the right track here but here's my take...

I agree with AB->CD but not with the argument.

1 G->B
2 AB->D
3 DE->C
4 GF->H
5 BD->E
6 AH->F
7 CE->A
8 AD->H
9 GF->D

From 2 & 5
10 AB->DE

From 10 & 3
11 AB->C

So from 10 & 11
12 AB->CD

I’d do the FD for T next:

From 1 & 9
13 GF->BD

From 13 & 5
14 GF->E

From 14 & 4
15 GF->EH

Does that help?

I haven't worked out the FD for S yet but maybe what I say here will help. I may be completely off track though!

Chris

dportas · Apr 18, 2009

Len Boorman said:
Now a relation in BCNF is not necessarily in 3NF.

That's incorrect.

A relation is in 3NF iff, for every non-trivial FD A->B, either A is a superkey OR B is a subset of some candidate key.

A relation is in BCNF iff, for every non-trivial FD A->B, A is a superkey.

Therefore BCNF is "stronger" than 3NF (and actually much more important in principle because it excludes all FDs not implied by superkeys). If you ensure your schema is in BCNF then you don't need to think about 2NF or 3NF because they are already satisfied.

Banana · Apr 18, 2009

Just to point one thing out-

Most of time, a 3NF is probably already in BCNF. Or put it other way, BCNF is a special case of 3NF.

Same applies to higher forms. One thing I take issue with some people saying that we need to normalize to 3NF and we're done because we don't need BCNF/4NF/5NF/6NF; it would be more correct to say that most time we can go for 3NF, and find that we're already BCNF/HNF as well.

dportas · Apr 18, 2009

Banana said:
BCNF is a special case of 3NF.

Surely 3NF is a special (degenerate) case of BCNF. That 3NF exists at all seems to be a historical accident and mistake. 3NF looked strong enough to deal with FD-related anomalies until it was proven otherwise and BCNF became the newly improved result.

Banana · Apr 18, 2009

I think we're looking at it differently.

Yes you are correct that when the 3NF was originally written, they missed out the cases that BCNF covers and BCNF should be the preferred form because we don't want any functional dependencies.

That said, my point was simply that for most business models, 3NF is *already* in BCNF as well. You would have to work hard to find a case where we have a table that's in 3NF but not in BCNF. Same applies to HNFs. If we have specific constraints on the business models, 4NF may be required for the job but change the constraint a little bit, and we may end up back in 3NF/BCNF. For that reason, people tend to say that we don't have to worry about HNF, but I disagree because that's the wrong reason. The correct reason is because most time, they already satisfy HNFs.

Let's consider the case that Wikipedia used for 4NF; a listing of pizza joints with different offerings and different parts of cities they can deliver to. Since the offerings and delivery area both are dependent on pizza joints, but not each other, the table isn't in 4NF, though it satisfy the 3NF/BCNF because all FD are against the key (in that case, the pizza joint name). But since offerings and delivery areas aren't dependent on each other, we break them out into a separate table.

BUT!

If we allowed for the case where a pizza franchise could have different menu in different areas, or maybe they just won't deliver their special pizzas as far as they would with their regular areas. Therefore, the three tables we used to satisfy 4NF with old requirements now over-constrains the new requirement and we need to put them back together into one table because offerings is now dependent on the joint and the delivery area.

This example should illustrate why it's usually practical to shoot for 3NF/BCNF for most parts. I'd wager that some time developers actually normalize to 4NF or 5NF without even knowing that they were doing so. But at the end of day, it's the business rules that dictates the model, and they usually come with loads of exceptions that would cause a HNF design to overconstrain the model.

Back to BCNF, the same situation applies here- though I would prefer to normalize up to BCNF, it's not that often that I'd run in case where I have a 3NF but not BCNF table.

Len Boorman · Apr 20, 2009

dportas said:
That's incorrect.

A relation is in 3NF iff, for every non-trivial FD A->B, either A is a superkey OR B is a subset of some candidate key.

A relation is in BCNF iff, for every non-trivial FD A->B, A is a superkey.

Therefore BCNF is "stronger" than 3NF (and actually much more important in principle because it excludes all FDs not implied by superkeys). If you ensure your schema is in BCNF then you don't need to think about 2NF or 3NF because they are already satisfied.

Yes I stand corrected. never too old to learn.

It does appear however that it is possible to be in 3NF and not be able to decompose to BCNF.

Len

darek_dade · May 1, 2009

Thanks guys for all your inputs. This is the correct solution:

To prove R1(A,B,C,D), R2(C,D,E,F), R3(E,F,G,H) are all in BCNF, compute the closure for all possible
subsets of Ri.
• For R1(A,B,C,D)
• Closures of attribute sets:
– (A)+
F = A
– (B)+
F = B
– (C)+
F = C
– (D)+
F = D
– (AB)+
F = ABCDEFH
– (AC)+
F = AC
– (AD)+
F = ADH
– (BC)+
F = BC
– (BD)+
F = ABCDEFH
– (CD)+
F = CD
– (ABC)+
F = ABCDEFH
– (ABD)+
F = ABCDEFH
– (ACD)+
F = ACDH
– (BCD)+
F = ABCDEFH
• nontrial FDs in F1:
– AB ! CD
– BD ! AC
The LHS of both FDs is a candidate key for R1(A,B,C,D); so R1(A,B,C,D) is in BCNF.
• For R2(C,D,E,F)
3
• Closures of attribute sets:
– (C)+
F = C
– (D)+
F = D
– (E)+
F = E
– (F)+
F = F
– (CD)+
F = CD
– (CE)+
F = ACE
– (CF)+
F = CF
– (DE)+
F = ACDEFH
– (DF)+
F = DF
– (EF)+
F = EF
– (CDE)+
F = ACDEFH
– (CDF)+
F = CDF
– (CEF)+
F = ACEF
– (DEF)+
F = ACDEFH
• nontrial FDs in F1:
– DE ! CF
The LHS of both FDs is a candidate key for R2(C,D,E,F); so R2(C,D,E,F) is in BCNF.
• For R3(E,F,G,H)
• Closures of attribute sets:
– (E)+
F = E
– (F)+
F = F
– (G)+
F = BG
– (H)+
F = H
– (EF)+
F = EF
– (EG)+
F = BEG
– (EH)+
F = EH
– (FG)+
F = ABCDEFGH
– (FH)+
F = FH
– (GH)+
F = BGH
– (EFG)+
F = ABCDEFGH
– (EFH)+
F = EFH
– (EGH)+
F = EGH
– (FGH)+
F = ABCDEFGH
• nontrial FDs in F1:
– FG ! EH
The LHS of both FDs is a candidate key for R3(E,F,G,H); so R3(E,F,G,H) is in BCNF.

I copied it from .pdf file so it looks kinda messy but I am sure it's readable

Given decomposition how to find it's in BCNF

darek_dade

New member

gemma-the-husky

Super Moderator

darek_dade

New member

Len Boorman

Back in gainfull employme

stopher

AWF VIP

dportas

Registered User.

Banana

split with a cherry atop.

dportas

Registered User.

Banana

split with a cherry atop.

Len Boorman

Back in gainfull employme

darek_dade

New member

Similar threads

Users who are viewing this thread