Normalising.....how far do you go?

Mike375

I am about to start helping someone build a database, and since it is a "start from scratch" job I am going to avoid the errors that are in my own database. In both cases our occupation is insurance.

One obviously important part of the data is the recording/description of policy benefits. In my own database I have all of this in one table, which results in null fields in just about every record.

There are basically four types of policy, and they all have some common fields or data, which are:

Policy Type.... Huge repeater in the record set
Insurance Co...Huge repeater in the record set
Policy Number
Application Date
Commence Date
Benefit Amount
Mode of Payment......Huge repeater in the record set
Payment Amount
Policy Status......Huge repeater in the record set

There would be no null fields in any of the above except for a period of time following application. The Commencement Date can be anywhere from a couple of weeks to several months after the Application Date, but all the other fields have data from about day 1.

There are then fields that apply only to each policy type, and they also repeat a great deal throughout the record set.

For example, Income Replacement insurance will have

Waiting Period....98% of the policies will be 2, 4 or 13 weeks

Benefit Period Sickness....virtually all will be Age 65 or Life or 2 years
Benefit Period Accident...same range as above, but the period could be different. Could have Sickness for 2 years and Accident to Age 65.

Approximately 2/3rds of the records would have null fields for the above.

Life Insurance, Total and Permanent Disability and Trauma Cover also have a couple of fields that only apply to each type of cover.

There are four fields where, for the vast majority of records, three of them will be null.

Issued Standard Y or N. Obviously a huge repeater.

If not a standard issue (a minority of policies), then these fields get entries:

Premium Loading. Nearly every record will be null, and those with an entry will only have a few different values, e.g. 150%, 200%, etc.

Policy Conditions Altered Y or N

Altered Policy Conditions....Paragraph 27 changed to whatever, etc.

How far down would you break it? If it is all in one table there are 26 fields (just counted mine :D).

There are only 5 fields that apply to every policy type and where the data is not hugely repeated through the record set:

Policy Number
Application Date
Commence Date
Benefit Amount
Payment Amount
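
To make that concrete, here is a rough sketch of the sort of split I am weighing up - one table for the common fields plus a table per policy type for the type-specific ones. Field names and types are only illustrative, not the real ones:

-- Common fields shared by every policy type
CREATE TABLE Policies (
    PolicyNumber    VARCHAR(20) PRIMARY KEY,
    PolicyType      VARCHAR(30) NOT NULL,    -- huge repeater, candidate for a lookup table
    InsuranceCo     VARCHAR(50) NOT NULL,    -- likewise
    ApplicationDate DATE NOT NULL,
    CommenceDate    DATE,                    -- null until commencement
    BenefitAmount   DECIMAL(12,2),
    ModeOfPayment   VARCHAR(20),
    PaymentAmount   DECIMAL(12,2),
    PolicyStatus    VARCHAR(20)
);

-- Fields that only apply to Income Replacement policies, joined 1:1 on PolicyNumber
CREATE TABLE IncomeReplacementDetails (
    PolicyNumber          VARCHAR(20) PRIMARY KEY REFERENCES Policies (PolicyNumber),
    WaitingPeriodWeeks    INTEGER,       -- 98% will be 2, 4 or 13
    BenefitPeriodSickness VARCHAR(20),   -- 'Age 65', 'Life' or '2 years'
    BenefitPeriodAccident VARCHAR(20)
);

Life, TPD and Trauma would each get a similar small table, and the alternative is simply to leave all 26 fields in the one table and live with the nulls.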
 
I forgot to include something about the table to hold the policy holder details.

Details of the person would include the following, which means plenty of null fields and repeated entries:

Occupation........all records
Occupation Classification....hugely repeating entries
Employee/Self Employed
Qualifications....only some records
Year of Graduation....only some records
Height
Weight
Smoker/NonSmoker
SourceOfLead......hugely repeating entries
Male/Female

plus several other fields that relate to further defining the person for insurance purposes.

Would you keep prospects in the same table as clients? Initially prospects only have very brief data...name, address, phone. They either end up becoming clients or going into the Waste of Time department.
 
3NF should suffice in most cases, but then again that is only my opinion. As for your redundant data (repeating info), think of functional dependency. Does it depend on the entity? What I mean is, does all the stuff about the policy depend on the customer? Or does a lot of it depend just on policy data? If not, then stick it in another table and give that table a primary key.

Basically your customer table should have a customerID as a primary key and all your obligatory customer info, and if possible a policyID as a foreign key. It depends how you want your application to go: you could then have a policy table listing all the types of policy, but the IDs here would be only for a type of policy. You would need a third, intermediary table to take care of generating/holding the policy number (as a number of policies may be identical, but belong to different customers). Understand where I am coming from?

The intermediary table (policyNumber?) would hold the customerID (as a foreign key linked to the customer table, from customer to intermediary table as a 1-to-M relationship, as a customer may have more than one policy), policyNumber (the policyNumber table's own primary key) AND policyID (another foreign key linked to the policy table as a 1-to-M relationship), and all three fields in this table would have to be concatenated as a three-field primary key, to ensure that each record is unique. Oh, the joys of database development. Got stuck? Let me know.
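
Sketched as rough SQL, with names and types that are just placeholders to show the shape rather than a finished design:

-- One row per customer
CREATE TABLE Customers (
    customerID INTEGER PRIMARY KEY,    -- AutoNumber in Access
    firstName  VARCHAR(50),
    lastName   VARCHAR(50)
    -- ...the rest of the obligatory customer info
);

-- One row per *type* of policy
CREATE TABLE Policy (
    policyID    INTEGER PRIMARY KEY,
    description VARCHAR(100)
);

-- Intermediary table: resolves customers-to-policy-types and holds the actual policy number
CREATE TABLE PolicyNumber (
    customerID   INTEGER NOT NULL REFERENCES Customers (customerID),   -- 1:M from Customers
    policyID     INTEGER NOT NULL REFERENCES Policy (policyID),        -- 1:M from Policy
    policyNumber VARCHAR(20) NOT NULL,
    PRIMARY KEY (customerID, policyID, policyNumber)   -- the three-field key described above
);

(If the policy number is unique on its own it could carry the primary key by itself; the three-field key is the belt-and-braces version described above.)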
 
Here's my 2 cents worth.

If you have fields that are specific to a certain type of policy (e.g. income replacement), then I would separate those fields into their own table. This could get complicated depending on how many different types of policies there are that will need unique fields.

Keep anything that is common between all types of policies in one table (start date, waiting period, etc.).

You can even do this for your prospects versus customers table. List all "people" in one table; I would probably make this a contact table that retains people who become customers, but have a separate customer table for any specific data that only pertains to customers (possibly age, smoker). Though I don't know if I would designate this as a "customer" table or an "applicant" table, because you may need all the pertinent "customer" information when someone fills out an application, rather than when they sign the contract and pay for the insurance.

What I would end up doing on the user-interface end is (obviously) having a sub-form specific to each type of policy you have designated above, and using a tab control to switch between them. This would allow you to look at each type of policy separately.

As far as structuring the tables goes, you could store the "person" ID in each table where a person has that type of policy and make the policy number the unique key field. You might not want to make person/policyID the primary key, because a person may have more than one of an individual type of policy.
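
A rough sketch of that layout (table and field names are only placeholders):

-- Everyone starts as a contact/prospect
CREATE TABLE Contacts (
    ContactID INTEGER PRIMARY KEY,    -- AutoNumber in Access
    FirstName VARCHAR(50),
    LastName  VARCHAR(50),
    Phone     VARCHAR(20)
);

-- Extra data that only applies once someone becomes an applicant/customer
CREATE TABLE Customers (
    ContactID  INTEGER PRIMARY KEY REFERENCES Contacts (ContactID),
    Smoker     CHAR(1),
    Occupation VARCHAR(50)
    -- ...other applicant-only fields
);

-- Common policy data: the policy number is the unique key and the person ID is
-- just a foreign key, so one person can hold several policies of the same type
CREATE TABLE Policies (
    PolicyNumber  VARCHAR(20) PRIMARY KEY,
    ContactID     INTEGER NOT NULL REFERENCES Contacts (ContactID),
    CommenceDate  DATE,
    BenefitAmount DECIMAL(12,2)
    -- ...other fields common to all policy types
);

The type-specific tables (income replacement and so on) then hang off Policies by policy number, and each gets its own sub-form on the tab control.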

Have I totally confused you, or just myself? :-) We could possibly exchange files if you wanted to show what you have so far, and I can show you how I would do it based on what you have. Or, if there is PII in your database, maybe post a screenshot of your tables/relationships window.
 
Mike, here's my zero cents worth:

Just from looking at the title of this thread, I would say that normalizing a database is not really necessary.

Do you know how many people learn this stuff because they have to? Like, right away? In that case, most people don't have the time to learn about the "proper" process, nor do they really care (when starting out, that is). If I had to judge this subject, I would say that a potential sale doesn't have the time to "wait" on proper process and technique.

That's really too bad... :(
 
I feel it's not quite a question of deciding how far to go in normalising.

It's more a question of REALLY REALLY understanding the processes that your data will go through, and the information you need to store/retrieve.

You will then come up with a certain set of entities that will be related to one another, and attributes that will logically form part of a particular entity.

E.g. at a minimum you will have:
Customers (ref no, name, address, dob, status, etc.)

Insurance companies (ref no, name, address, phone)

Policy types (ref no, description)

Policies (insurance company, customer, policy type, date taken out, expiry date, etc.)

As an aside, note that in each table there is a ref-no identifier. These are usually numeric (easier for Access to handle, and often generated as autonumber fields - since the purpose is purely to provide a unique identifier to permit items to be related to one another).
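
As a bare-bones sketch (names are illustrative; the ref-no fields would be AutoNumbers in Access):

CREATE TABLE Customers (
    RefNo   INTEGER PRIMARY KEY,   -- AutoNumber
    Name    VARCHAR(100),
    Address VARCHAR(200),
    DOB     DATE,
    Status  VARCHAR(20)
);

CREATE TABLE InsuranceCompanies (
    RefNo   INTEGER PRIMARY KEY,   -- AutoNumber
    Name    VARCHAR(100),
    Address VARCHAR(200),
    Phone   VARCHAR(20)
);

CREATE TABLE PolicyTypes (
    RefNo       INTEGER PRIMARY KEY,   -- AutoNumber
    Description VARCHAR(100)
);

CREATE TABLE Policies (
    RefNo            INTEGER PRIMARY KEY,   -- AutoNumber
    InsuranceCompany INTEGER REFERENCES InsuranceCompanies (RefNo),
    Customer         INTEGER REFERENCES Customers (RefNo),
    PolicyType       INTEGER REFERENCES PolicyTypes (RefNo),
    DateTakenOut     DATE,
    ExpiryDate       DATE
);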

Where it becomes awkward is when you start thinking about your data at a more advanced (less superficial) level. E.g. you need to decide how you want to deal with, say, policy types. If you want to search for which insurance company offers motor policies, or employers' liability, or public liability etc., then you need another table joining insurance company and policy type.

If a policy has multiple policy holders then again you will see that (perhaps!) the policy table as it stands is not sufficient. You will either need to store multiple policy holders in the main policy table (this is not normalised but may be sufficient for your purposes) or have one table for policies, and a second table for policyholders, which links policies and customers.

Doing it the second way is more work, but if you then need to find all the policies held by a given individual it is a lot easier.
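
Building on the tables above, one way those two cases might look (again, names are illustrative):

-- Which companies offer which policy types
CREATE TABLE CompanyPolicyTypes (
    InsuranceCompany INTEGER NOT NULL REFERENCES InsuranceCompanies (RefNo),
    PolicyType       INTEGER NOT NULL REFERENCES PolicyTypes (RefNo),
    PRIMARY KEY (InsuranceCompany, PolicyType)
);

-- Multiple holders per policy: a linking table between policies and customers
CREATE TABLE PolicyHolders (
    Policy   INTEGER NOT NULL REFERENCES Policies (RefNo),
    Customer INTEGER NOT NULL REFERENCES Customers (RefNo),
    PRIMARY KEY (Policy, Customer)
);

-- All policies held by a given individual is then a single join
SELECT p.*
FROM Policies AS p
INNER JOIN PolicyHolders AS ph ON ph.Policy = p.RefNo
WHERE ph.Customer = 123;   -- that customer's RefNo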

So, you need to do a careful analysis of your data first. Doing this first and getting it right in the first place is far easier than getting it wrong and then trying to fix it. Offhand, I would have thought an insurance policies database in a commercial environment was quite complex.
 
I don't understand what Adam is saying...

Since I'm not a trained programmer, I take a pragmatic approach - if it works, it works. However, I have learned the hard way that a normalised database makes difficult tasks much easier and renders possible what would otherwise be impossible. I have found that normalisation is now something I do instinctively. I don't know whether I achieve 3NF or not, since I don't know how that is defined. What I do know is that I have constructed a lot of databases that do some complex things with a minimum of code. If I can achieve what I want using SQL then I believe the design is pretty good.
 

Attachment: motor_policy_data_model.gif
One of the problems with normalisation is that too many people see it as an IT "techie" issue; it isn't. Normalisation exists because businesses change. Imagine a business selling 10 products in each of its 12 shops: you could design a simple flat-file DB, but you would be up shit creek without a paddle when it adds another 3 shops and 14 products.

I too have ignored normalisation on simple, in-a-hurry, one-off dbs that exist merely to get out the results of a survey; I made a point of "deleting" them after the report has run. The quotes are because I actually hide them from the users, just in case I want them. :D

Brian
 
What we basically did was as follows:

We changed the main table I have into three tables: one with prospects, one for clients and one for personal/non-insurance-related people. The gain we got from doing that was to allow more prospect records to be held while running over a wireless network. With too many records, mine bogs down over the wireless network, partly because of the very large number of fields. A prospect either goes into the reject table or goes onto the client table, so an Append/Delete combination.

For policy details he retained what I had, and that was for a practical reason. We get policy updates each month from insurance companies (new premium with age change etc.) and they have all types of policy benefit in the one table, actually Excel. Of course we could have broken the table into smaller tables and joined them back with a query, but that would have been slower than the table. In 99% of cases we look at policies based on commencement dates, individual clients' policies, occupational groups etc., as opposed to looking at only certain policy types. This is probably the reason the insurance companies also keep policy benefit details for all policy types in the one table.
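
For what it is worth, the query we decided against would have looked something along these lines (table and field names are made up for illustration); the LEFT JOIN keeps every policy even when there is no matching type-specific row:

SELECT p.*,
       ir.WaitingPeriodWeeks,
       ir.BenefitPeriodSickness,
       ir.BenefitPeriodAccident
FROM Policies AS p
LEFT JOIN IncomeReplacementDetails AS ir
       ON ir.PolicyNumber = p.PolicyNumber;
-- (With more than one join, Access also wants parentheses around each pair of joined tables.)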

Mike
 
Adam

That is not really a problem, as there is a gradual move from either a card system, some other database, or Excel.

As to practical sales situations: with a database that is used in a very strong and fast-moving sales environment, especially telemarketing, the screens and other forms that are opened need to be able to be changed, often more than once during the day. For that reason I prefer macros to code, partly because I won't be the one making the change. It is simply far easier and more foolproof for the person using the database to change to another macro than to alter code.

Getting the tables right is the key, because changing them down the track is one real big issue.

A couple of other tables we changed were the ones that hold details of appointments made and the table that has a record for each cold call made to the person. In my case I have the person's phone number, name and so on go into the record, and so my tables are much bigger. In fact, just about all my many-side tables contain at the very least the person's first and last name, and usually more. I could just never bring myself to have those tables without at least some identification. That is just a personal hang-up I have, probably because of a mistrust of computers. Also, in my experience tables are the least trouble of any of the Access objects. In short, I don't like trusting a query to tell me that a policy benefit, an appointment, a file note etc. belongs to John Andrews :D

Mike
 
Be aware that running Access over a wireless network is asking for trouble. Even a momentary loss of connection can lead to corruption.

I would certainly have normalised the database further than you have. Not trusting a query to retrieve data is a curious concept. Perhaps you should go back to oak gall ink and parchment?
 

We do get drop-outs. So far the only problem we have had is failure to complete a record from a cold call. I was not aware that corruption could occur, although we do make a copy of the actual database about every two to three hours.

I would not mind at all being able to go back to a feather and ink :D Actually, for telemarketing I would like to have a system where we printed each person's details on a sheet of paper, and then after the calling was done the computer could be updated so that the names could be managed and ready for the next printout. For us, cold calling from paper gives approximately 15% better results than from the screen, but that is on the basis of results measured against calls. When results are measured against time then the computer wins, mainly because of the management of callbacks etc.

The pen/paper calling falls down with one telemarketer making calls for five different salesmen. However, when we call high-dollar names where the number of prospects is not limitless, such as private-practice medical specialists, then we use paper and pen.

Maybe a monitor that sat flat on the desk like a piece of paper and a mouse that was like a pen would be the ideal.

Mike
 
