How to normalize and optimize this database?

pheidlauf · Jul 24, 2014

I was just given the project of creating a database to store and access information that the various engineers involved in design and manufacturing can use to create updated instructions and load information into their modeling software efficiently.

I have *some* experience with database design, but have no actual education in the area of study apart from my personal internet research. I really want to start this project off on the right foot by making sure this database is normalized (and follows all of the important major concepts of a relational database) and organized to match most generic database standards (my primary key is a user-defined text field, is that acceptable?). I have attached a photo of the Database Relationships I've drafted thus far.

I would love any advice from the experienced members here on what I need to do to this database before I can actually start developing the content and functionality of it.

Note: The blacked out table names are processes that every manufactured part goes through. I plan on filling in those tables once the database is normalized.

Any and all help or advice is welcome!

plog · Jul 24, 2014

All those redacted tables shouldn't exist. I'm guessing you didn't name them after curse words and censored them for that reason. My guess is you instead trying to protect sensitive data. That's an error in table design--table nor field names should hold values, they should use generic terms. Add to that they all have the same structure and it means those 6 tables should be combined into one.

That new table will basically take all the current tables' names and make them values in one field. So the structure of that new table would look like this:

NewTable
PartID, Type

Where Type would hold what is now the name of each table.

Also, you should use autonumber primary keys instead of text values. Take Customers for example, you shouldn't use CustomerName as the primary key, you should instead add a primary key autonumber. Let's call it CustID. Then when you need to link to Customers you place that field in any outside table. So Engines would have CustID instead of CustomerName.

Those are the big ones I can discern from your screenshot. Make those changes then post back with some sample data and I can look further into your data.

Thales750 · Jul 24, 2014

Unless those new fields would have null values. But even with that I agree with you.

Just use zeros for null values. There are many query criteria conditions that will return incorrect values if nulls are present.

pheidlauf · Jul 24, 2014

plog said:
All those redacted tables shouldn't exist. I'm guessing you didn't name them after curse words and censored them for that reason. My guess is you instead trying to protect sensitive data. That's an error in table design--table nor field names should hold values, they should use generic terms. Add to that they all have the same structure and it means those 6 tables should be combined into one.

That new table will basically take all the current tables' names and make them values in one field. So the structure of that new table would look like this:

NewTable
PartID, Type

Where Type would hold what is now the name of each table.

Also, you should use autonumber primary keys instead of text values. Take Customers for example, you shouldn't use CustomerName as the primary key, you should instead add a primary key autonumber. Let's call it CustID. Then when you need to link to Customers you place that field in any outside table. So Engines would have CustID instead of CustomerName.

Those are the big ones I can discern from your screenshot. Make those changes then post back with some sample data and I can look further into your data.

Those "redacted" table names are simply the names of the processes the parts go through during manufacturing. In the picture I uploaded, they are empty only because I had not filled in the column names yet. The reason I had separated them was that they were each going to hold details about that specific process. For instance, the table named "Casting" was going to hold values like MeltTemperature and Speed, where the "PostProcessing" table would hold information about FlashCutOff, Finishing, and Coating or such.

Are you saying that the better database design would have one big table holding all of those values together?

Thanks for the help!

plog · Jul 24, 2014

Yes, 1 big table may be the way to go. 1-1 relationships are usually unnecessary, so most likely all that data would go in the Parts table. It really depends on the data you have for each process and if each part has each process.

Could you post sample data?

pheidlauf · Jul 24, 2014

Here is an updated image of what I have at the moment, hopefully it will be more clear. I've changed some names and values to clarify the relationships.

I also made all of the PK's into autonumbers. I have not yet combined the "redacted" tables (now named Materials, Shell, Casting, and PostProcessing) into one.

plog · Jul 24, 2014

Another sign of a bad structure is that has loops. There should be only 1 way to get from one table to another. From Users to Parts there are 2 paths, that's not right. Without sample data I don't know how to fix this, but I am certain this is wrong.

All those redacted tables should just be fields within Parts, unless some parts don't use them.

pheidlauf · Jul 24, 2014

plog said:
Another sign of a bad structure is that has loops. There should be only 1 way to get from one table to another. From Users to Parts there are 2 paths, that's not right. Without sample data I don't know how to fix this, but I am certain this is wrong.

All those redacted tables should just be fields within Parts, unless some parts don't use them.

I've attached the updated database with (dummy) sample data.
I agree that that loop in the relations shouldn't be there. However, I don't know how to get rid of it. I simply want to create a record of which user has made edits to the part files (kind of like the comments for a "Commit" button if you're using a version control system). Is this a bad idea altogether, is there a better way to record it?

Thanks!

plog · Jul 24, 2014

That' didn't really help, i wanted actual data as sample data.

I think I'd just need and explanation of the actual process. Without using any database jargon (tables, queries, forms), can you tell me what real life process this data is to model?

pheidlauf · Jul 24, 2014

plog said:
That' didn't really help, i wanted actual data as sample data.

I think I'd just need and explanation of the actual process. Without using any database jargon (tables, queries, forms), can you tell me what real life process this data is to model?

Sorry that that data didn't help. Basically, this is modeling a casting process. Each part has it's own unique process of die/mold construction, heating, and metal pouring, and 20 other numbers, file names, or variables that the casting equipment needs to run (all of that information can go right into the [Parts] table).

My company works for several [Customers], each of whom has several different [Engines]. Each engine has at least one, if not multiple [Parts] my company produces. Honestly, the [Customers] and [Engines] tables are simply there for organization and reference--there will probably be only 5-10 records in each table.

However, each of the [Plants] my company uses produces a given set of [Parts]. In some instances, the same part is made in multiple [Plants] (many to many relationship I think?).

The [Users] table refers to those people who will be using the database to input data or pull information from the [Parts] table to populate their modelling software. Whenever a user makes a change to the [Parts] table, the only place in which important data manipulation will occur, I want to automatically generate a time stamp showing which [User] from which [Plant] edited which [Part].

The purpose of this database as a whole is to automate and consolidate data entry to a single source to cut down on redundant data entry into different software.

plog · Jul 24, 2014

Your user section isn't related to the rest of your data. So while it may share some tables that the rest of your data uses, its not really part of that relationship. You essentially have 2 sets of relationships using your data--the Parts side and the Database Administration side.

Attached is an image of what I think your relationships and tables should be.

ButtonMoon · Jul 25, 2014

plog said:
Another sign of a bad structure is that has loops. There should be only 1 way to get from one table to another.

I wouldn't say that. There are plenty of situations which are accurately modelled by loops in an ER diagram. When you see a loop in an ER model (especially if it appears to be a circular dependency) it's certainly worth taking a second look, but the presence of loops doesn't automatically mean the design is "bad" or "wrong" or in need of modification.

Frothingslosh · Jul 25, 2014

Also keep in mind that Access' relationship diagram will combine duplicate references to the same table once you save it and re-open it. Even if the OP sets it up the way plog showed, once the relationships are saved and the window closed and re-opened, the loop will reappear.

pheidlauf · Jul 25, 2014

Is there an alternate way I could implement the timestamp feature within the database that would avoid this issue altogether?

plog · Jul 25, 2014

What is "this issue" that you want to avoid? Also, big picture: what's the point of capturing every change a user makes? What's the cost of implementing such a system compared to what it actually does for you? I've never seen a situation where it was justifyable for an Access developer to implement such a system. To be sure, every manager gets a chubby for the idea of holding everyone responsible for everything , but in practice its not worth it.

pheidlauf · Jul 25, 2014

plog said:
What is "this issue" that you want to avoid? Also, big picture: what's the point of capturing every change a user makes? What's the cost of implementing such a system compared to what it actually does for you? I've never seen a situation where it was justifyable for an Access developer to implement such a system. To be sure, every manager gets a chubby for the idea of holding everyone responsible for everything , but in practice its not worth it.

"This issue" is the circular relationship that seems to be inherently related to the timestamp feature. This feature isn't required for the database to function at the most basic level.

The main benefit of this will be the ability to track down which engineers have added changes to the processes. As the database will be used to store critical data needed for the modelling software we use, knowing why and when a particular change was made to any given part will be very useful. Engineer Y will be able to ask engineer X which value he needs to run in his model. All the while, manager Z gets a chubby.

jdraw · Jul 25, 2014

Since you are in design mode, and I am supposing you have some test data or at least some specifications/requirements from the Engineers, I recommend you work with the latest version of your evolving model and vet the model with your test data. We used to play what we called "stump the model". This involved people working with the design, users and support personnel. A picture of the model was on display; business rules were "actioned" against the model; "What-if scenarios" from users and support personnel were also 'tested' against the model. This was a test of the model, not the modeller. If any scenario could not be proven/answered with the model, things stopped and a reconciliation/rationalization occurred. This included identifying WHAT was "wrong" - was it the model, was it a business rule, was there a hidden entity/attribute/relationship, was the scenario itself not logical for the business; was it bad test data? This had to be resolved. This was repeated- often involving a few different representatives from the various participant groups. A revised model, or new data etc would be prepared and another "stump the model" session would take place. All anomalies were reconciled. At some point there would be agreement/acceptance of design. To make this work efficiently, you should have preliminary specifications and any modifications;definitions of tables and fields; keep samples of model versions (to show changes), a current list of business rules (with any modifications) and a summary of the "stump the model" sessions.
It is interesting to see what happens in such sessions. Users who know the process intimately are often surprised with what happens next or how X relates to Y-- you get the"I've been doing this for 5 years and I didn't know it did that"; or "I wasn't aware that they used that data I thought it was local to our group.."

You and the engineers etc know the situation better than any reader here. Readers will help as best they can if you post specific questions.

Good luck.

plog · Jul 25, 2014

The dual relationship image I provided would be the way to go. This eliminates the circular references.

However, like Frothing pointed out, if you set it up in the Relationship manger like I've described, Access will reconfigure it and put the circular reference back. So to avoid this, only put in the Products relationship and not the Change Tracking relationship into Access.

Setting up relationships in Access is helpful but not necessary, so leaving the Change Tracking one out won't hurt anything. If needed, you could always use the Relationship Tool to set up the Change Tracking relationship, save a screenshot of it and then delete it. That way you have it if needed, but its not being enforced in the database.

Lastly, you may consider a 'Notes' field in the Parts table. This will allow people to attach any necessary notes to each part so that its accessible by others.

pheidlauf · Jul 25, 2014

jdraw said:
You and the engineers etc know the situation better than any reader here. Readers will help as best they can if you post specific questions.

I really like your "Stump the model" idea. I usually debug my own applications that way on my own, but I'm sure the group dynamic will make that much more effective.

Specific Question:
Is there a common solution for tracking changes in a relational database? If so, where could I find information about it/them? There should be no malicious users of this database, all tracking will be for the purpose of communication.

jdraw · Jul 25, 2014

pheidlauf,

Where do you keep any documentation now? Is there a logical spot for these changes? Access doesn't have a data dictionary.

I managed a corporate data management group where we had data dictionaries/repositories. If a change was made to the database design, it was recorded with database/tables/fields and also on the logical/physical database model as a modification/comment.

If the change was made to a process, it was recorded with the program/procedure/function.

I've been retired for a few years.

Re: The stump the model approach

It takes the you vs me, or us and them away from the issue (hopefully). All participants have (or should have) an interest in seeing a well designed database/system/application from all aspects -- ease of use, ease of maintenance, and it must do the job required by business.

Sometimes the best approach is to make some statement that you know is false, then watch the users etc correct your "mistake". It gets the dialog flowing. That is the key --get those who have a concept of what this system should be, and those who will be involved in operations talking to each other and using the model as the talking point.

The modeller should never be in a position of defending the model -it's a learning opportunity/experience to make things better (for all). If the model doesn't represent some business rule properly, then make the point, record it and have the model updated.

Documentation is not solely for the designers, When changes are required, or if you merge departments, or expand your service or products, or for communications... you should be able to know what exists without having to re-invent such documentation after all the players have moved on to other jobs or companies.

Good luck.

How to normalize and optimize this database?

Registered User.

Attachments

Banishment Pending

Formerly Jsanders

Registered User.

Banishment Pending

Registered User.

Attachments

Banishment Pending

Registered User.

Attachments

Banishment Pending

Registered User.

Banishment Pending

Attachments

Registered User.

Premier Pale Stale Ale

Registered User.

Banishment Pending

Registered User.

Super Moderator

Banishment Pending

Registered User.

Super Moderator

Similar threads

Users who are viewing this thread