Database Planning

wallis1905 · Mar 13, 2007

Hi All,

I'm Looking for a bit of advise. I'm about to try and build a new database that will have about 40 tables. One of these will be the main table. This table will have a primary key TAGNUM and several other fields containing data associated with each tagnum. Most of the other tables will then contain more detailed info about each tagnum.
So the obvious aim is to have all these secondary tables connected somehow to the main table so that when a new tagnum is added/deleted, the corresponding row in any effected secondary tables is added/deleted also.

My attempts so far have involved setting up a relationship beween tagnum in the primary table and tagnum in each secondary table. This appears to work ok but I'm trying to be sure that my setup is good. I have a one to one relationship in each case, I have selected referential integrity, and the cascade changes etc. My join type is "All records from main table and only those... secondary table". Is this correct?

The further complication is that I need to have Lookup?? tables (not sure if it's the correct word) tied to each of the secondary tables and 3 tied to the main table. The user needs to be able to select items from the fields bound to these lookup tables via a dropdown box on the field in a query. When the dropdown box is selected a descriptor for each option available need be seen The non bound fields in the lookup tables only needs to be available to a report bar the showing of the Description field during selection. Any ideas?

Sorry about the size of this and annoying you with what is i'm sure not difficult stuff! Pretty new to this.

MicroE · Mar 14, 2007

Everything sounds good so far. However, without seeing the table structure, it is difficult to comment (40 tables sounds high).

Check Out - Suggested Readings

The_Doc_Man · Mar 14, 2007

wallis - having 40 tables and many of them one//one tables at that - is very high. Not that it never happens. There are good reasons. But generally those reasons don't apply.

Reasons you would split a single table into one-to-one tables:

1. If you don't, the record is too long or the number of fields is too many for one table. So you split into at least THREE tables.

2. The split has to be based on fields that are never used together, so they CAN be separated.

3. There can be differences in security between the elements of some of the tables. Like, one table contains privacy act data while another only holds things in public domain.

4. The split was done for issues of speed, doing the primary search on the shortened main table and then JOINing the result set to the other tables.

If none of these apply, splitting tables to produce one//one separate tables is not such a great idea.

In any case, your description of the relationship actually sounds like a one//many case. Which really doesn't matter that much if you never drive the operation from one of the secondary tables. Here's a little secret. Choosing a one//many relationship when you really only meant one//one is still going to work OK, and might actually let you get away with cases where there is an enry in the main table but the corresponding entry in one of the secondary tables hasn't been filled in yet. Because one//many really means one//(one, many, or NONE). So it ought to work OK anyway.

wallis1905 · Mar 14, 2007

Ok Thanks for that,
The reason these tables have been split is because the data to be contained in the secondary tables is to be entered by people in a different departments. Typically, a few tables per department, with each table representing a recordset of tagnums that all have the same specification number . People in one department dont need to see another departments data. Only the fields in the main table will be accessed by all. Most fields in secondary tables will be unique to that table. The secondary tables typically have about 20 fields. Therefore the master table would need hundreds of fields if it was not split out. Does this make sense?
Can you tell me why you say to split into at least three tables?

neileg · Mar 14, 2007

Users don't see data in tables. You use forms based on queries to enter/view the data. I would do this by giving each user a different front end that only gives them access to their forms.

Having said that, if you want to prevent users from seeing other department's data using Access security (which will be more secure than the above), this is a valid reason for using separate tables, as the Doc Man already said.

wallis1905 · Mar 14, 2007

Ok,
Yes the users will be entering data through queries, and security isn't really a concern. The problem is that there will be more than 255 fields required and therefore the only option is to break out into tables? And the best way to link these tables is to use relationships? By the way, nobody will be using forms. All data entry is done through a query.
Presuming that I must use many tables do you think my first post gave a proper set up? Also I've heard of Lookup tables being dangerous in some way. Is this a big deal for the average user?

Thanks again for the help

The_Doc_Man · Mar 15, 2007

NOW the alarm bells are going off BIG TIME. I find it very hard to believe that a properly normalized table stores more than 255 data attributes about a single entity such that one//one relationships make sense. (On the theory that the main table and its linkees describe a single entity of the class of things described by that table.) Whether we are talking people, cars, computers, or buildings.

When you have that many fields and none of them are duplicates, you are already way out of the league in which Access plays.

If you have not had someone tell you this before, it is time. You drastically need to step back to analyze your tables and design concept. If you have not done so yet, you need to immediately find references on normalization. A Google search will probably turn up 30-50 pages of hits. Pick some articles from various .EDU sites, which usually have pretty good explanations online.

I can understand that each department has different responsibility for adding data, so you split the tables to limit what can be changed by a given department member. But when you say you are using QUERIES to update the tables, that is SO dangerous and so hit-and-miss that there is no way you could really want to do that. It is just not a good approach.

You said there was no security. Well, trust me - with this design, your statement is ABSOLUTELY correct. Using this design, security left the building long ago, just after Elvis.

wallis1905 · Mar 15, 2007

Oh Oh

Oh Now you have me worried!
Just to note that some of those fields will not be unique to a given secondary table. They will appear in multiple tables, but the records that table relates to will not be in another table and therefore each instance in this column is only stored in one place?
Can you tell me what is so dangerous about using queries to update data? It's standard practise where I work. All data entry is done through a query which views The Tagnum and a few other fields from the main table as well as all the fields from whichever secondary table it is used to populate. When a new tagnum with a certain spec number is entered and you run the query, you will see a Tagnum record, but all the fields from the secondary table are blank. When you eneter data in any of these fields, the secondary table gets a new record and it's TAgnum column( Which is what is bound by the relationship) gets updated to be the same as in the main table. Is his really such a bad idea?

wallis1905 · Mar 15, 2007

Normalized???

Hi again,
I've been reading up on this normalization business. It seems a bit tricky, but It seems to me that maybe I am OK. It talks a lot about attributes being dependant on others. My database has one Primary/Candidate key. This is tagnum. All data is dependant on that field. The secondary tables have primary keys too, but they are all Tagnum also (which forms the relationship with tagnum in the main table), so I guess that this is effectivley the same as having one key only? When a new tagnum is entered then the rest of the fields are simply filled out manually. Am I correct in saying that if you only have one candidate key and then your table is normalized or am I being optimistic?
Whats the diffreence between a primary and candidate key?

neileg · Mar 15, 2007

Primary keys are uniqe to the table. If you hold a primary key in another table, it is held as a foreign key (which may be the same concept as a candidate key, but I'm not familair with the term).

Once the idea of normalisation jells with you, you'll wonder why you find it so hard. I'm not a trained programmer, so let me give you a layman's introduction.
1. Tables should be long and narrow. If yours are short and wide, you are probably going wrong.
2. One to one relationships in real life are rare.
3. If your design leads to lots of columns, only some of whch are ever populated, you should have lots of records instead of lots of columns.
4. Forget everything you ever knew about spreadsheets. Access is not a spreadsheet.
5. In general, tables are for storing data, queries for manipulating data, forms for entering/viewing data and reports for printing it out. So users never see tables or queries.

wallis1905 · Mar 15, 2007

Normalized?

Ok,
But I'm not sure how I make my tables long and skinny! My main table will be long and skinny because it has about 4000 records and 20 fields. Howevwer, some of the secondary tables will only have a dozen or so records, but up to 30 or 40 fields. How do you make such a table long and narrow and still have it related to the main table?
I think my relationship must be one to one. Each tagnum in the primary table relates to one tagnum in a secondary table and never more than one.

neileg · Mar 15, 2007

Take a screenshot of your relationships window and post it up here.

wallis1905 · Mar 15, 2007

Relationships

Ok Dude,
Here it is. It aint pretty but it seems to work ok. See what you think.

The_Doc_Man · Mar 15, 2007

OK, something is beginning to gel here in my mind.

If the tables strongly resemble each other (which is implied by "some of those fields will not be unique to a given secondary table"), you have a problem of major design implications. You are giving each department a separate table and you have a sparse overlap structure, if I read between the lines well enough to form the image. I'll bet you expect that TAGNUM will appear only once among all the non-main tables where it could appear. Not once per table, but once per set of all non-main tables. Did I read that right?

If there is duplication of data in the sub-tables, the potential duplicate data should not be there. It should be in the main table. (Otherwise, here's the big question: What do you do if somehow, whether intentional or not, two sub-tables have the same tagnum and their overlapping fields are different? Which one is right?)

In this design, if you intended that every one of your bunch of non-main tables has a TAGNUM but ONLY one non-main table ever actually has the matching TAGNUM, then your deparment separation is eating your socks for you. You suggest that SOME of the departmental data really IS unique. Move the non-unique stuff to the main table sooner than soonest. Then go back and analyze the department tables to see if there is a way to overlap their data somehow so that you could make a SINGLE table with a department code and your one//one relationship.

If you allow direct data entry from a query, you have NO WAY TO PREVENT THE ABOVE FROM HAPPENING. You have NO CONTROL WHATSOEVER over what data gets entered, when, and how.

Only through a form could you enforce the relationship for uniqueness among the non-main tables. ('cause there you could run VBA code underneath the form to do some extra validation.) Only in a form can you do advanced validation rules that look elsewhere.

And part of this problem is that I'll guarantee in writing that your main table might or might be normalized, but the non-main tables are definitely NOT normalized or optimized or in any other way structurally correct for the overall statement of the problem.

As an aside:

The term "candidate key" is a part of the analysis of a table to choose a primary key. Suppose (just for example) that you have an employee record with (in USA) Social Security Number, Employee Number, and several other fields. The odds are that both SSN and EmpNumb are unique, and therefore either is a candidate for Prime Key. So you need to decide a tie-breaker to determine which is dependent and which is independent. In the USA, by law, private companies may not use SSN, so EmpNumb wins - and candidate key EmpNumb becomes Primary Key. Then candidate key SSN holds unique but non-key data that truly depends on the chosen PK so there is no violation of normalization rules. It is at this point immaterial to the discussion as to whether an index appears on SSN because now it is just data.

wallis1905 · Mar 15, 2007

OK, something is beginning to gel here in my mind.

If the tables strongly resemble each other (which is implied by "some of those fields will not be unique to a given secondary table"), you have a problem of major design implications. You are giving each department a separate table and you have a sparse overlap structure, if I read between the lines well enough to form the image. I'll bet you expect that TAGNUM will appear only once among all the non-main tables where it could appear. Not once per table, but once per set of all non-main tables. Did I read that right?

If there is duplication of data in the sub-tables, the potential duplicate data should not be there. It should be in the main table. (Otherwise, here's the big question: What do you do if somehow, whether intentional or not, two sub-tables have the same tagnum and their overlapping fields are different? Which one is right?)

In this design, if you intended that every one of your bunch of non-main tables has a TAGNUM but ONLY one non-main table ever actually has the matching TAGNUM, then your deparment separation is eating your socks for you. You suggest that SOME of the departmental data really IS unique. Move the non-unique stuff to the main table sooner than soonest. Then go back and analyze the department tables to see if there is a way to overlap their data somehow so that you could make a SINGLE table with a department code and your one//one relationship.

If you allow direct data entry from a query, you have NO WAY TO PREVENT THE ABOVE FROM HAPPENING. You have NO CONTROL WHATSOEVER over what data gets entered, when, and how.

Only through a form could you enforce the relationship for uniqueness among the non-main tables. ('cause there you could run VBA code underneath the form to do some extra validation.) Only in a form can you do advanced validation rules that look elsewhere.

And part of this problem is that I'll guarantee in writing that your main table might or might be normalized, but the non-main tables are definitely NOT normalized or optimized or in any other way structurally correct for the overall statement of the problem.

As an aside:

The term "candidate key" is a part of the analysis of a table to choose a primary key. Suppose (just for example) that you have an employee record with (in USA) Social Security Number, Employee Number, and several other fields. The odds are that both SSN and EmpNumb are unique, and therefore either is a candidate for Prime Key. So you need to decide a tie-breaker to determine which is dependent and which is independent. In the USA, by law, private companies may not use SSN, so EmpNumb wins - and candidate key EmpNumb becomes Primary Key. Then candidate key SSN holds unique but non-key data that truly depends on the chosen PK so there is no violation of normalization rules. It is at this point immaterial to the discussion as to whether an index appears on SSN because now it is just data.

On your first point, no I expect there to be a tagnum in all secondary(ie not main and not lookup) tables and of course the main table too. And as I have enforced referential integrity I expect a tag change In either the main table, or any secondary table will be translated to it's counterpart(ie secondary to main or vise versa). And note that I also require the ability to map these tagnum changes to tables that contain tagnums, but where all other fields are unique to that table. ( I guess in some ways another primary table?)

In answer to your second point. All tables will only ever be filled through queries. Hopefully there will be a way to allow users see only those data entry queries and not the tables. These Queries will contain filters on the specnum column, ensuring that only instruments of a certain specnum end up in each table. If a spec number has been assigned incorrectly, and the tagnum ends up in the wrong table it will be a problem. If somebody spots the problem( Which will be obvious by the Tagnum, as its descriptive string and you could guess the spec num almost) they can change the spec number in the main table and delete the offending record in the secondary table. OOps Will this delete the record completely? What happens if it gets into the correct table before this? S**t!
Anyway I was going to use a delete query with a specnum filter to clean up any that were missed, but that can cause the same problems?

Right, Sorry I'm reading paragraphs and answering them! Ok, So this ain't good. What if I say that those non unique fields in the secondary tables do not occur in every other table. Most will occur in a handfull of tables only. We already have the most common non unique fields in the main table but perhaps on inspection some more will merit a place there. However, what do I do with fields that only occur in two tables out of 40?
So perhaps I can do as you say and basically make two very large tables or one smaller and one bigger. I'm sure I'll meet resistance if I try this. Everybody wants there tables.

About the data entry through a form, I don't think I'll be allowed pull that one off. Where I am everybody has been entering data through queries for years and wont budge.

Thanks for the help boss, much appreciated. I'm learing a lot at least!

neileg · Mar 16, 2007

wallis1905 said:
Ok Dude,
Here it is. It aint pretty but it seems to work ok. See what you think.

Mmm....

As usual, the Doc Man has some wise things to say.

Since the shot doesn't show the full picture, it only helps a bit. Your structure looks pretty much the way you would implement this in Excel.

I guess that at least part of the problem is that you seem to be dealing with a range of items that have very little in common other than a unique Tagnum! I don't think I can give you any more specific advise without understanding the business model.

Having said that, this phrase worries me Everybody wants there tables. Users don't see tables, so how would they know?

wallis1905 · Mar 16, 2007

Still in trouble?

Well what I mean by that is that people are used to running custom queries from these secondary tables. Prior to now, people have beeen copying and pasting records into these secondary tables, and updating them that way. But yes, I suppose now that I want them to only see the queries I set up instead of the tables they better get used to not seeing them.
And yes each tagnum record will have about 20 fields that all other records have also. Then each record will also have a varying amount of other unique and lets call them "Semi Unique" fields also (Appears in more than one secondary table but not all).
As for the screen shot, what I have given you is the heart of it. Ie it shows the main table "DesignIndex" connected to some of the secondary tables.
How bad a thing is it not to be normalised anyway? I just realised that if I untick the cascade deletions box in all my relationships my delete query method may work? ie the record will only be removed from the offending secondary table, and stay in the main table. It can be accordingly updated into the correct secondary table. Does this mean I can "fix" normalization errors or am I talking complete shite? Don't be shy tell me!

neileg · Mar 16, 2007

It's true that in an un-normalised database, enforcing referential integrity can make life impossible. On the whole, un-normalised applications can start off Ok but as complexity is added, they grind to a halt. Saving the data is not the issue, it's managing and reporting on it that causes the headaches.

If you users are used to seeing the data in tables, that suggests a fairly falt file structure with meanings embedded into the table names and the field names, ie just what you've got. If you move away from this, you gain in database design terms but you loose the 'look and see' aspect. Here's a for instance.

Let's say you have four different attributes that might relate to a Tagnum. For attribute A the diameter needs to be recorded, for B the length, for C the mass and for D the hardness. In your multi table set up, that means four tables, one field in each. However, if you have just one table, you would need one field to record the numeric value and a second to hold the unit you are recording. You need to look at both fields to derive the full meaning, but it's all there and you have reduced the complexity and reduce the number of empty fields.

The_Doc_Man · Mar 16, 2007

wallis, I am seriously concerned about two-way referential integrity, because it has a chance to cascade to Hell and gone real quick.

You said something earlier about being able to delete from EITHER the main table or the secondary table and have it work right. BUT... If you delete something from a secondary table and it deletes something from the main AND that something shares a tag number with other tables (which you say it will), you lose ALL RECORDS IN ALL RELATED TABLES with that tag number when all you did to start was you just deleted a secondary table's entry.

The question about normalization is moot if your structure needs to look more like Excel than Access. And as Neil has mentioned, that is exactly what it sounds like. The biggest issue that comes about with improper or incomplete normalization is that at some point you will be unable to implement a given Access feature due to issues with your keys - or lack of proper keys. If you ever want to do data mining, you are going to have trouble, too.

What bothers me more is the idea that in your one//one table relationship, there are multiple one//one tables that can be updated separately with respect to the main table. Structurally, this makes sense ONLY for security reasons (which you have said really don't apply) or because you have too much data for the item reflected in the main table. Again, the implications of the latter statement are staggering. In a normalized table, I would have thought that you couldn't reach this situation even in Access unless you stored many massive memo fields for a single item.

This sort of thing flies in the face of all the design methodologies normally taught for relational databases. If you wonder why we seem to be harping on these issues, it is because you are presenting us with one of two possible situations: (A) a truly anomalous object model, or (B) an abysmally badly designed database. Given the odds of finding true anomalies, some of us gravitate to (B) as a matter of course, which might be unfair to you. But I want you to know why you are getting this kind of flak.

To me, this problem calls for a step-back-and-regroup action. You strongly need to go back to paper, pencil, white-board-and-dry-marker, whatever else is involved, and do a full-blown analysis of your business model. In order to do that, you need to look for books on database design, business modeling, and the like. You need to read up on normalization in depth. For Access, we strongly urge that you shoot for tables to 3rd normal form. You could go deeper than that (4th, 5th, and some special-case normal forms exist), but if you get to 3rd-normal, Access will probably do most of what you want correctly most of the time. Which is good enough for most of us.

I won't say you should sh|t-can your design because it didn't come about by accident, whether it looks that way or not. But if I were in your shoes, I would definitely step back for a really hard look.

The_Doc_Man · Mar 16, 2007

Everybody wants there tables.

No they don't. They want their reports and they want data. If they give a crap about actual table structure, they need to be educated. And if you let them drive the process of correcting the design to that level, YOU need to be educated. (Sorry, not trying to be offensive, but the constraints of forum posting require me to get to the point.)

( I guess in some ways another primary table?)

Whether you wanted one or not, with multiple layers of enforcing referential integrity in this structure, I don't know WHAT you've got, but it is not the normal paradigm for a relational database.

However, what do I do with fields that only occur in two tables out of 40?

Neil touched on this. What you REALLY need seems to be a table for your main data and a table of descriptive data that links to the main tag numbers.

The descriptive data can be very simple - three fields per entry, then have one table for each CLASS of data. Or you could do it with one table and some imagination, plus an "attribute names" table.

tblMAIN
TagNum, PK, whatever it is...
data that belongs in main table.

tblOURGROUPS
Group code, PK, arbitrary number
Group name, text
other data about the business groups who participate in the operation.

tblATTRNAMES
AttrCode, PK, numeric (probably need either word or long for this one)
AttrFmt, a code number used to represent the nature of the attribute - and you could use some Access symbols for this. Things like, BYTE, WORD, LONG, SINGLE, DOUBLE, CURRENCY, Yes/No, DATE, TIME, DATE/TIME combined, etc etc etc.
AttrName, text, the name of the attribute being recorded

tblATTRIBUTES
TagNum, FK
AttrCode, FK
RespGroup, FK, group code
AttribValue, text, the presentation / display value of the attribute, size can be an issue but won't be a problem.

How do you use this? Not that easily, but... it is flexible to the point that you can do the whole schmeer with these three tables. You might wish to do a layered JOIN to bring all three of these together in a query, but what the heck, it should still work.

Join the main table to the attributes table on the tag number. Join the resulting query to the attribute name table on the attribute number. You can join THAT to the group table. Now you will have a query that presents a LINEAR recordsource that looks like this:

TagNumber, Attribute Code, Attribute Name, Attribute Format Code, Attribute Value, Group Code, Group Name

You can use THIS recordsource to build reports that are grouped by Tag Number and alphabetized or otherwise sorted by attributes, responsible group, etc. This implies that one and only one group is responsible for each attribute. You probably will be forced to build a form to handle data input, but the point is that by making this change, you are no longer bound by size issues. You can have as many attributes as you need to complete your description.

Database Planning

Registered User.

Registered User.

Immoderate Moderator

Registered User.

AWF VIP

Registered User.

Immoderate Moderator

Registered User.

Registered User.

AWF VIP

Registered User.

AWF VIP

Registered User.

Attachments

Immoderate Moderator

Registered User.

AWF VIP

Registered User.

AWF VIP

Immoderate Moderator

Immoderate Moderator

Similar threads

Users who are viewing this thread