Academic vs practical design question

williamlove

Registered User.
Local time
Yesterday, 16:52
Joined
Feb 9, 2006
Messages
37
Consider two tables:

tblSongs
-SongID (primary key)
-SongTitle
-ArtistID (foreign key)

tblArtists
-ArtistID
-ArtistName

Is tblSongs normalized? In 3NF? It is possible to wind up with this anomoly:

SongID SongTitle ArtistID
131 Lifes Good 5387
221 lifes Good 5387
258 Lifes Good 5387

To cast more light on my question consider a customer table where the social security number is not allowed so the primary key is some arbitrary unique value

tlbCustomers
-CustID
-FullName
-City

The same question applies...is this normalized? If it is not, you'd have to have tables for the FullName and City, but that would be a lot of work and unwieldy. What would an academic do? A practical designer? I'm very curious.
 
interesting question

i dont think its a distinction between academic and practical designs - i think its more a matter of understanding a real world situation, and making appropriate design decisions

ie given three people named John Smith, its identifying whether these really are different people or not. A social security number is no guarantee incidentally - in the UK there are more numbers than people!

i think you might use a lookup for city, but i think its less likely you would use one for names. The only reason for having a name lookup would be to save space in the database, i think. Indeed, if you enter John Smith, and John and Smith are both lookups, you cant just change the name in the lookup table if you find the name is incorrect, (which is what you would normally do with a lookup) since that will affect other items using the same lookup.

--------
in your songs example all three songs are named the same, arent they (just capitalized differently), so it depends whether these ARE the same song or not - they may be different recordings of the same song. So its not yet a normalization issue per se, its more one of design decision. If they ARE the same song, then you may need extra information in the table, to distinguish the versions. Again, I dont think you would have a lookup table of song names, which could be misleading

ie is The Power of Love by Jennifer Rush, Huey Lewis, and Frankie all the same song (i think one was a different tune wasnt it)

does this help?
 
interesting question

i dont think its a distinction between academic and practical designs - i think its more a matter of understanding a real world situation, and making appropriate design decisions
I think it is really a division between Pragmatists and Pedants. Pedants will stick to a literal interpretation of the rules while pragmatists will do what works.
 
I think it is really a division between Pragmatists and Pedants. Pedants will stick to a literal interpretation of the rules while pragmatists will do what works.
Although, just doing "what works" doesn't guarantee that it is efficient or scalable, while doing what's right gives you the greater liklihood that it will be efficient and/or scalable.
 
Although, just doing "what works" doesn't guarantee that it is efficient or scalable, while doing what's right gives you the greater liklihood that it will be efficient and/or scalable.
Bob, as usual you are so right. There is no substitute for doing things properly but sometimes corners need to be cut because of other pressures.
 
Definitely I'd want to store a CityID instead of the City as a text string. Names of people I would almost always store as a text string. My reasons?

Cities are entities in the real world, and there're some obvious business reasons for wanting to be able to query data either aggregating or sorting using that entity.
Perhaps you might want to make a list of customers organised by city for your sales representatives. Perhaps for targetted mailers aimed at customers in certain cities. Given that there is a business reason for tracking these entities, you then have a need to ensure that minor differences in spelling, capitalization, abbreviations etc do not spoil the ability of your database to know that, for example, Los Angeles is the same city as L.A. Direct text entry allows potential differences in user data entry conventions to mess up your information. By storing cities in a table, you force the user to choose from a pre-existing list or add a new recrod to the city table. This reduces the likelihood of accidental double-entry of the same city using slightly different data entry conventions.

Further, there may be information about specific cities that you want to store in which case you'd need a table of cities to store that information in.

Is there the same business usefulness argument true for tracking names as entities also?

I don't think so....unless you're developing a database specifically relating to information about names (perhaps if you're tracking the origin and derivation of people's names maybe?). But for most databases, a customer's name is somewhat incidental information. (Not that it's unimportant to have the name stored, but the actual names themselves are relatively unimportant in how you will use the data)

You COULD still store each name as an id from a table of names. It would be the most efficient use of file space, certainly. But do you gain any business value by tracking customer names as entities? Will you ever need to obtain a list of customers with the first name 'Bob'? Is there any information about the first name 'Bob' that you need to store? If you answer No to those questions then I think there's no compelling business reason to create a table of names instead of just storing the names as attribute information in the customer table. If file size becomes a limiting factor then maybe you'd revisit that decision.

For what it's worth, that's my thinking anyways. :)
 
please let me modify my scenario and solicit more comments

I am teaching myself Access database programming using a fairly advanced book (Access 2002 Desktop Developer’s Handbook—Litwin, Getz, Gunderloy). They have a section early in the book on normalization. In order to understand it, I have to analyze my tables and think hard. Otherwise the subject just won’t “stick” for me. So if you will give me what you consider rigorous answers, it will serve a purpose for me at this stage in my development.

I would like to drop the secondary example of customers and cities and stick with tblSongs. I would also like to change my ArtistID field to Artist and not have a separate Artist table, and present a scenario for consideration.

If God assigned a unique number every time a music artist on earth recorded a song and revealed the list then we would have

tblSongs
-SongID (primary key)
-SongTitle
-Artist

If we found two records with the same Artist and SongTitle, we would know by definition this means the artist recorded a new version of the song. That new, unique version is mapped to the SongID. The SongTitle and Artist may be the same but it is a new recording with a new SongID.

I could word it without invoking God and just say that in the abstract, every song recording event can be mapped to a SongID, and I believe this abstraction is correct. If that is so, then I can proceed. If you feel it is not true, then we will have to discuss why SongID is not a necessary and sufficient primary key. I am eager to hear opinions on that.

If SongID is a necessary and sufficient primary key, I believe the value of SongTitle that would be entered in the row is only dependent on SongID because this relationship is established the moment the song is recorded. I believe the same is true of the value of Artist entering in the row. Since both are only dependent on the primary key, they are mutually independent, and therefore the table is by definition in 3NF.

My lingering conceptual problem is that if I decide to change the spelling of the Artist or some other aspect of Artist (maybe changing “Crosby, Stills and Nash” to “CSN”), I have to edit many rows of the table. That seems a bit of a violation of the spirit of 3NF. I still believe the table is in 3NF. But clearly the table can be broken out as follows:

tblSongs
-SongID (primary key)
-SongTitleID
-ArtistID

tblTitles
-SongTitleID
-Title

tblArtists
-ArtistID
-Name

I would like to know if my original table is, as I believe, in 3NF, and if so, what are the implications of the fact that I can expand the design with more tables?
 
its enterprise rules/business rules - there are no rigorous answers
---------

this is rhetorical, but the atomic structure (ie the level at which you no longer need to split attributes from entities) is what you need in order to model the thing you are trying to model. You therefore have to be God yourself, in your terms. And if your original design doesnt quite work, you can go back and change it.

-------

so if you need a songtitle table then have one - if you dont, dont have one. I think most people would not have a songtitle table, but pretty well all would have an artist table (it is definitely not normalized to have the multiple versions of the same artist name in the song table - as you identify with the Crosby Stills and Nash example) - music is especially fuzzy thought, as the same song (ie the samesongid) could be allocated to multiple artists (eg not only Crosby Stills and Nash as a group but also those artisits individually - perhaps you can get round this by linking artists making up a group to the group - but then you may also get issues with changes in group personnel)

--------
MS does something similar with its error messages - the messages definitely seem to be constructed from standard substrings,with parameters being infilled as appropriate. (which is why you sometimes get strange messages with vertical lines etc (place holders for something or other, I would think)

---------

I would think you are clearly understanding normalization very well to be asking these questions.

You are using a good book - the book I use most is the Access Cookbook (O'Reilly)- it isnt a primer at all, its a consideration of solutions for many practical real world problems. Its dog eared compared to my many reference books.

I would also try to use vba code, rather than macros, when you get to needing them. You do need code - you can only get so far in producing good databases without it, and most books dont cover vba well enough.
 
In the situation you are talking about here I think there's more to consider and Gemma started to talk about it with Cosby Stills and Nash being simultaneously individuals and band-members.

Or to take an example I'm more familiar with, Phil Collins, Mike Rutherford, and Peter Gabriel all have their own solo careers, but each have also been part of the band 'Genesis' and some being associated with the band at different periods of time. Mike Rutherford also is a member/leader of the band 'Mike and the Mechanics'.

So here, a single individual can be involved in many bands (over time, or at the same time).
A single band has many individuals.

The composition of a band can change over time.

A song can be performed by a 'band'...or by an individual, or by any combination of bands and individuals. (For example the band U2 and the individual BB King collaborating on a single song). Or a mish-mash of individuals and bands working together on the Live-Aid anthem 'We are the world' (way back in the 80's).

And don't forget that people can quit a band, and rejoin later....and have multiple roles within a band, or just one.

So, how can we handle all these relationships? How about this...

tblSongs
-SongID (pk, auto)
-SongTitle
-SongDate

tblPeople
-PeopleID (pk, auto)
-PeopleFirstname
-PeopleLastName

tblSongPeople
-SongPersonID (pk, auto)
-SongID (FK)
-PeopleID (FK)

tblBands
-BandID (pk, auto)
-BandName (text)

tblBandRoles
-BandRoleID (pk, auto)
-BandRole (text) eg, singer, drummer, lead guitar, hammer dulcimer, whatever

tblBandPeople
-BandPersonID (pk, auto)
-BandID (FK)
-PeopleID (FK)

tblBandPeopleRoles
-BandPeopleRoleID (pk auto)
-BandPeopleID (FK)
-BandRoleID (FK)

tblBandPeoplePeriods
-BandPeoplePeriodID (pk, auto)
-BandPeopleID (FK)
-StartPeriodDate
-EndPeriodDate

This structure allows you to associate a song with whoever the individual artist, or artists, are. It keeps track of which band or bands an artist belongs to, and when they were involved with a band. It tracks the role or roles each artist has within a band (although it assumes that this does not change over time).

You might have to use vba to simplify data entry so that you can choose from a list of bands and have the db automatically enter each of the individuals who were in that band at the time of the song, or allow you to enter individuals one at a time.

Personally, I would not treat SongTitles as entities (ie give them their own table) because I don't see any advantage to this from a business Point of View other than a minimal savings in file space. I would prefer to use a like operator search on a text field to find songs with titles that are similar, if not identical.
 
Thanks to both of you for taking the time to reply. I appreciate your input very much! The table design you suggested does seem to be rather rigorous and complete and it was interesting to consider the normalization aspects as I looked them over.

Regarding the suggestion to use VBA...I am doing that, the VBA handbook does not cover macros and in fact assumes VBA knowledge and jumps into objects...I am still trying to figure those out.
 
i think a lot of users here would rarely use macros - with a couple of minor exceptions anything you can do in a macro, you can also do with code (access actually provides a conversion utility to change a macro into code). I think code is generally more manageable and easier to analyse/debug

with regard to normalisation, it is interesting deciding how far to take it - Craig's last response was particularly rigirius - it all depends on how far you are trying to take it - if you want to trace the discography of say the beatles or the rolling stones, with all the attendant groups that each member was in, you probably need something like this

if its just a modest CD collection database, you perhaps dont.

Personally i find classification of music/or films really irritating - you know when you are on a interent DVD site, and you are asked to select bewteen genres ..... its the same issue really - normalisation!
 
I think code is generally more manageable and easier to analyse/debug

Gemma,

Depends. Macros are better if you rmake changes on a regular basis and email them. If some of us are using different forms, which we do, then sending the form is no good for obvious reasons. Macros are much easier to change with someone on the phone. You certainly can put the code in modules and send a mini db with the module. However, if a module is called from within a form I don't think you can see the module from the form in design view. The macro also has the plus of the drop down lists for form, table, query names and of course the various actions.

From the point of view of getting othe people to change code over the phone (could be form name etc) Access 95 was better than A2003. In A95 a click on the 3 dots only opened the code attached to that particular button or label as opposed to everything on the form, unless there is another way to do it in A2003. But macros are just plain easier to change when on the phone.

As a side note, coonversion of macro to code does not always work with big macros and especially where there are a lot of conditions. The conversion fouls up. Also, if a form as lots of macros, then doing a bulk conversion can foul up. Click on label or button after the conversion and everything just freezes.

One area where code is better is if you have a typo in macro conditions or other areas where you type in field or form names in the macro, it can be a ral pain on a big macro to find where it is. You get to use use StopMacro a lot:D

Of course code can peform functions that a macro can't even come close to. However, if a macro will do I would just about always use a macro.
 
Definition is the key

Surely the first problem is the definition of 'song'. Before you can normalise a database you need to have precise definitions. The arguments about 'are these two records referring to the same thing' does not occur if the record (as it should) contains the right data to uniquely define the entity. It seems to me that you want tblSongs records to each represent a recording of a song, not just a song, so there is not sufficient information. Even adding date of recording and record label wouldn't be enough, because there are releases with three different 'mixes' of the same song on one CD.

The important thing to remember about normalisation, and all other aspects of database design, is its purpose: to ensure that the data means what you intend it to, as efficiently as possible. 'Efficient' doesn't just mean economy of disc space. Saving a smidgeon of space at the expense of man-years of nasty programming isn't efficient.
 
>Surely the first problem is the definition of 'song'.

That in a nut shell is the problem with the discussion. The model is
not even closed to being well defined and therefore no where near
complete. You can't discuss normalizing an incomplete model.

The fact is that in the example from the original post both tables
consist of only fields that would make up the natural primary key
of the table and are therefore in every normal form there is.
 

Users who are viewing this thread

Back
Top Bottom