How to normalize contacts in Access (1 Viewer)

twgonder

Member
Local time
Today, 00:40
Joined
Jul 27, 2022
Messages
177
Here’s my first post to this forum, and it’s going to be a doozy.
I’m coming from a mini/mainframe environment, where things weren’t normalized very well.

But as I go down the rabbit hole of Access, I wonder which of two approaches might be best for flexibility and usability.
I’ve created these tables (mostly in my head) for organizing people and their attributes and foreign attributes:
tblEntity A living or legally created being (corporation, LLC, etc.)
tblPoc A point of contact (hence POC), a real location (maybe virtual) like address, phone, fax, GPS coordinate, etc.

These two tables have a many to many relationship, requiring a third table tblEntityPoc.

For example, my son who is seven lives with me at my address. We share a land-line phone. We don’t share my cell number. We share a FAX machine, but not emails.
We are two different humans in tblEntity.
My cat José is in tblEntity too. He shares my address but nothing else in tblPoc

If I put all the types of fields for a POC in one record, but only for one fldType of POC then I have a lot of empty fields, but they don’t use much space in a variable length file system like Access. I’m not sure how SQL Server or other flavors fare, if they are fixed length or variable length fields.

For example, if I create a street address for myself (and my son and José) in tblPoc that record will have a fldType of “addr” and the record won’t have phone, cell, FAX, etc. Each of those would be in its own record.

The other choice is to create a half-dozen or so tables very specific to the fldType of POC. In other words, a tbl for land-lines won’t include cell phone numbers, because in many countries they don’t share the same formatting (and some different attributes, i.e., the land-line could link to an address in tblPoc, but a cell is mobile, so it can’t), true for some Faxes (formatting) too. Or maybe I have one tbl for all types of phones or fax, but then I have to have special formatting for each field (which I would have to do anyways if there was just one big tbl holding all POC types), and it might cause a problem with the next reality below.

Imagine I have one land-line and two cell phones, but only one of those can be a preferred contact for phone calls.

Has anyone been down this rabbit hole before that can offer some insights for dealing with this in Access?
 
Last edited:

Ranman256

Well-known member
Local time
Today, 01:40
Joined
Apr 9, 2015
Messages
4,339
People are the worst to normalize. I have :
a tPerson table
tAddress tbl -many persons can share the same address
tPhone tbl -many phones or emails

the person can have multiple phones, emails
the tPerson table holds the keys from the other tables.
 

The_Doc_Man

Immoderate Moderator
Staff member
Local time
Today, 00:40
Joined
Feb 28, 2001
Messages
27,001
Look up in this forum the "Entity-Attribute-Value" concept where you have variable numbers of attributes, not all of which apply to all possible entities. I use this in my genealogy database a lot. Ancestry.COM's GEDCOM files that convey my genealogy have that format to 4 levels of attributes and sub-attributes. BUT they are sparse... i.e. if they don't know, they DON'T send me an "I don't know" record; they just don't send anything with that attribute. Things like Mother, Father, Birthdate, Birthplace, Deathdate, Deathplace, Burialdate, BurialPlace, Address with subs of DateofAddresss, ActualAddress, ... At last count, over 100 possible genealogical attributes.

That's what I've been working with for a few years. In your case, you might have an attributes table with three fields: The entity (usually a foreign key to the entity table), the attribute (usually a code to a table of recognized attributes), and a value, which will usually be text that, for inherently numeric entities, is the CStr() of the value. Phone numbers and addresses have non-digit characters so ARE text.

So you have the attributes table that is a child of the entities table. You have a "known attribute" table. The values in the attributes table are literal, not encoded or looked up. That's three tables (one of which is merely a "translation" table - code to attribute name). The translation table would include some indicator of whether the value in question is numeric, date, string, boolean, or hyperlink or whatever else you've got in your bag of tricks to be recorded. That includes indicators for special formats like phone number, social security number, etc.

When you have multiple possible phones, your "translation" table might have "Cell phone" "Home land line" "Pager" "Burner phone for the bookie", etc. IF your attributes were such that they had to be unique, then you would have to choose which of the conflicting attributes to keep and drop any competitors. BUT if you allow for multiple values, you add a fourth field for the attribute, "Preferred" (as a Y/N field) and then have as many items of the same type as needed.

If you don't allow dups, then your key for the attributes table has the compound unique key Entity/Attribute. If you DO allow dups for attributes, then your key is merely the compound key Entity/Attribue (Dups allowed).
 

CJ_London

Super Moderator
Staff member
Local time
Today, 05:40
Joined
Feb 19, 2013
Messages
16,553
but then I have to have special formatting for each field
don't agree - formatting is a presentation property and does not belong in tables. Having said that, contact details are always text - phone numbers often contain a preceding zero and spaces so they are not 'numeric'. An extension number might be numeric but for convenience, store as text. Point is you can set the text 'format' at time of entry.

For addresses, in the UK there are potentially 41 values that go into making an address - not all will be used all the time. I don't know about other countries.

We are two different humans in tblEntity.
My cat José is in tbl_Entity too. He shares my address but nothing else in tbl_Poc
which implies the requirement of a many to many table or you treat an address as a type of entity which has its own associated contact methods (such as land line). Whether you include it in your entity table or a separate table really depends on your requirement. Values such as addresses can saved in one field, using a pipe separator or similar so when the address is viewed you can convert to an array or use the replace function to replace the pipe with vbcrlf.

But in a business you might have one address and multiple phone numbers for different departments

Imagine I have one land-line and two cell phones, but only one of those can be a preferred contact for phone calls.
I use two flags 'preferred' and 'do not use'. The second one came about for an app I developed for a telemarketing company. They would buy in peoples details. If you deleted the record not to be used, it could be repopulated the next time data was uploaded.

Basically I suggest 4 tables

Entities - can include a parentFK
methodtypes (address, landline, fax, cell, email ,etc) which contains the format rule required
Contacttypes (office, home, sales, private, etc) contains FK to methodtypes). The type description could be in a 5th table
ContactMethod FK to entitities, FK to Contact types, value and the two flags as above

Other considerations can be subject - e.g. this person is not to be contacted about house maintenance (you wouldn't want your son to receive a call about fixing a leaking roof)

You might want a business rule that says that landlines can only be associated with an address type entity
 
Last edited:

twgonder

Member
Local time
Today, 00:40
Joined
Jul 27, 2022
Messages
177
People are the worst to normalize. I have :
a tPerson table
tAddress tbl -many persons can share the same address
tPhone tbl -many phones or emails

the person can have multiple phones, emails
the tPerson table holds the keys from the other tables.
Why put email addresses with phone numbers? Following your design, shouldn't emails be in their own table?
 

twgonder

Member
Local time
Today, 00:40
Joined
Jul 27, 2022
Messages
177
don't agree - formatting is a presentation property and does not belong in tables. Having said that, contact details are always text - phone numbers often contain a preceding zero and spaces so they are not 'numeric'. An extension number might be numeric but for convenience, store as text. Point is you can set the text 'format' at time of entry.
I'm planning to use formatting in tables because I am writing the software for Spanish, Portuguese, French, Italian in addition to English. No one format will serve all countries, so formats become dependent on the country and type of POC it is. Like my foreign language translations, they will be loaded when the form/report opens.

Until recently here in Colombia, there were two completely different formats for land-line vs. cell. I once traveled to a country (don't remember which it was) that had different formats yet for Fax numbers! Aggghhh!

For addresses, in the UK there are potentially 41 values that go into making an address - not all will be used all the time. I don't know about other countries.

I would be very interested to see that, one government site here went to about 15 fields that is supposed to cover everything, but 41, that I have to see (and prepare for). As with prompt/heading translations, each of these different formats has to be loaded with a specific prompt for how the field is used in that country as well as being translated to another language. Aggghhh! I showed an example of two reports using language translation back in my "here is me" post. There you can see language, color and print formatting (press a print button, and report colors are converted to gray-scale) that came from tables.

Basically I suggest 4 tables

There will be more than four for all the types of tables that relate to tblEntity (but I'm trying to keep it reasonable, hence the original question, one table of many just for the addr/phone/email/GPS data. For this discussion, I tried to keep it simple. But I get what you're saying. Thanks for your ideas.
 

CJ_London

Super Moderator
Staff member
Local time
Today, 05:40
Joined
Feb 19, 2013
Messages
16,553
I'm planning to use formatting in tables
no one format will serve all countries
but tables can only have one format (per column) and some formats, particularly numbers are impacted by the windows settings. So guess you are talking about a format spec field

On revisiting a past project think my 41 was for the whole thing, we used 16 for the basic address, another 6 for postcode breakdown and the rest was to do with a persons name (can't remember the full list but would have included things like title, subtitle, first second third names, honours, qualifications etc)

see this link for the address and postcode part
 

Pat Hartman

Super Moderator
Staff member
Local time
Today, 01:40
Joined
Feb 19, 2002
Messages
42,981
I’m coming from a mini/mainframe environment, where things weren’t normalized very well.
Don't blame it on being mini/mainframe, blame it on people not understanding normalization, period. Even in the 80's when I was working with IMS, IBM's hierarchical database, our schema was normalized. Even my ISAM and VSAM files from the 70's were normalized as best they could be given that we didn't have queries to join tables we had to do multi-file matches to work with our parent and child tables which were of course not called that. The parent table was a "master". The worst database I've ever seen was designed by a web team. Those people really don't understand relational database design:(

The m-m relationship between entities and POC is a real conundrum. The decision on whether to implement this as 1-m or m-m revolves around how much space a m-m might save and how complicated the update rules have to be. A pure implementation would be the m-m. The practical implementation frequently ends up as 1-m. You need to go through the logic for managing the m-m. Are your data entry people going to be up to the task? If contact 1 calls to change his address, is someone going to call up his address and just change it or are they going to dig deeper to determine if only he is moving or are others assigned to this address all moving? Are some moving? Is only he moving? Strangely enough, for smaller applications such as personal contacts or schools, it is more likely that all will be moving rather than 1 or some so the m-m is more efficient but with other types of applications, where my contacts are sales people or project managers, it is the opposite. It is more likely that only one person will be moving.

I would separate addresses from other types of contacts like phone and email. Addresses have a lot of discrete fields as CJ mentioned. The UK has 41. The US has a lot also but I haven't worked in direct mail for a while but I thought we had closer to 15. I would put addresses in one table and all the other types in a different table and for those use the entity attribute method described by Doc. The other types of contact are single fields although they have different format requirements. You can tag each type with a format so you can properly validate these fields.

In the US, unless a company is in the direct mail business, the address standard is just a few fields - AddressLine1, AddressLine2, City, State, Zip Sometimes Suite. So 6 at max. The reason that direct mail companies worry about the details is because they can reduce their postage expense by doing presorts that the PO would otherwise have to do. And that requires very precise data entry. For example the compass designation used in cities with a grid layout might be before or after the street name and that matters for the presort. They also are much more concerned about duplicate addresses. It costs a lot of money to send out junk mail. They really don't want to send 3 pieces to the same address because they have three slightly different versions of an address. People in the UK and other countries can comment on whether the 6 fields I listed work for them. Of course State is sometimes called Province and Zip is called PostalCode. Don't let the different names confuse the issue. The only reason for two address lines is because sometimes the second line is used for Suite or other more detailed information.
 

CJ_London

Super Moderator
Staff member
Local time
Today, 05:40
Joined
Feb 19, 2013
Messages
16,553
I would say the standard in the UK is 5 fields, sometimes 6 - the 6th being 'house name or number' plus a 7th for country if required. but field lengths are long because sometimes you just can't fit the basic address on 2 lines for each part
 

twgonder

Member
Local time
Today, 00:40
Joined
Jul 27, 2022
Messages
177
Don't blame it on being mini/mainframe, blame it on people not understanding normalization, period. Even in the 80's when I was working with IMS, IBM's hierarchical database, our schema was normalized. ...
Yes, I studied Cobb in university, and I know some mainframes from IBM had started implementing the ideas. I ended up working with Reality/PICK/D3, which in another forum I stated was pretty basic. Here's a file structure, here's a query language, here's a BASIC that works with the query, good luck! As Dick Pick would say, "I gave you the rope, now try not to hang yourself!"

I've read that referential integrity breaks in Access when one splits a backend db. So, obviously, the jet or whatever it is these days isn't very sophisticated. I have the same issue with a lack of synchronization between various backend dbs at remote locations. My post asking about this in another forum was met with silence.

I have design rule I've used for decades. A simple flag field (I'll use a date because Access has problems with nulls in queries on yes/no fields as I understand it) shows when a record was first touched by a foreign table. If that flag is there, the record can't be deleted. A simple data entry error can be deleted as long as the record hasn't been used elsewhere.

Because certain departments of the government want data passed in little bits of fields, I can't use the Addr1 Addr2 model. I'm going to have to have some15-30 fields just for address excluding city, st, zip and person. The big problem, there doesn't seem to be any official standard on what those 15+ fields are and how they are to be referenced. That's in one country, now imagine what happens in the 250 countries I have in my tblCountry!

As to the person moving issue, it's the same, once an address is entered, it doesn't change. The location doesn't move with the person, it stays put. The same with phone numbers/email. In practical terms, this means that my data entry forms will have to have a very powerful search feature for lots of different textboxes when adding or changing an address/phone/etc. that allows for finding/using an existing address (not changing it) and adding something (like an address) if it doesn't exist.

For the first time in a long time I was impressed with a web order I made at Walgreens. The form first confirmed my country, then asked for a zip code and then with just a few keystrokes filled in the rest of my street address. Some design person at Walgreens is on the ball. My old system confirmed country then asked for a postal code. For the USA and Canada the user was after that presented with a default city and state/province. It seems logical to take that one step further with street addresses too, as Walgreens has done.
 

CJ_London

Super Moderator
Staff member
Local time
Today, 05:40
Joined
Feb 19, 2013
Messages
16,553
I've read that referential integrity breaks in Access when one splits a backend db.
provide a link to what you've read because that statement is absolute rubbish. RF is maintained in the BE, not the FE and you cannot apply RF to tables in different databases - and that applies to any db, not just ACE

My post asking about this in another forum was met with silence.
you are too impatient - here's the link https://www.accessforums.net/showthread.php?t=86355
 

twgonder

Member
Local time
Today, 00:40
Joined
Jul 27, 2022
Messages
177
provide a link to what you've read because that statement is absolute rubbish. RF is maintained in the BE, not the FE and you cannot apply RF to tables in different databases - and that applies to any db, not just ACE


you are too impatient - here's the link https://www.accessforums.net/showthread.php?t=86355
I think you missed the part where I said, "when one splits a backend db". But you did confirm what I said.
You're right, I hadn't checked for your great message, one hour or so, before I wrote my question here (then again, it wasn't my original question, it was more of an tangential observation).
 
Last edited:

Pat Hartman

Super Moderator
Staff member
Local time
Today, 01:40
Joined
Feb 19, 2002
Messages
42,981
I've read that referential integrity breaks in Access when one splits a backend db
When you establish RI in a monolithic app and then split the app, if you use the built in splitter tool, the schema is transferred entirely including RI. If you do it yourself, of course YOU are then responsible for everything. If you are talking about trying to establish RI between two BE's, can't do it. RI is enforced by the BE for only those tables within it. Same for SQL Server.
A simple flag field (I'll use a date because Access has problems with nulls in queries on yes/no fields as I understand it) shows when a record was first touched by a foreign table.
Keep in mind that you are actually talking about Jet or ACE, NOT Access. The Yes/No datatype is a bit data type and does not support null. It is a 2-value data type yes or no, null is not an option. If you need null, you need to use a small integer. Know thy tool.
Some design person at Walgreens is on the ball.
Apparently you've never used their prescription refill fiasco of a "system"

Managing street addresses that way in an application is a big bite to chew especially when you are dealing with multiple countries. I'm sure you could buy a database from some company that maintains them. Might be a good idea to search one out and find out what format they use for addresses. That would end up being too big for ACE. You'll need SQL Server or some other RDBMS
 
Last edited:

twgonder

Member
Local time
Today, 00:40
Joined
Jul 27, 2022
Messages
177
When you establish RI in a monolithic app and then split the app, if you use the built in splitter tool, the schema is transferred entirely including RI. If you do it yourself, of course YOU are then responsible for everything. If you are talking about trying to establish RI between two BE's, can't do it. RI is enforced by the BE for only those tables within it. Same for SQL Server.
I started my old application on 10 mb and in production had to go to 30mb. I can't imagine needing 2GB unless ACE handles variable length fields poorly. But some client may want the entire world of addresses, you gotta plan for the worst. That's why I'm thinking ahead for a possible split be.

Keep in mind that you are actually talking about Jet or ACE, NOT Access. The Yes/No datatype is a bit data type and does not support null. It is a 2-value data type yes or no, null is not an option. If you need null, you need to use a small integer. Know thy tool.
My old manual says yes/no uses integer, so it's two bytes if that's still true with ACE and Access. I used to do bits in Assembly, but I doubt VBA does.

Apparently you've never used their prescription refill fiasco of a "system"
Nope, it was just a regular product purchase. But it did make me go Whoa! Look, they took something I did 40 years ago to the next level!

Managing street addresses that way in an application is a big bite to chew especially when you are dealing with multiple countries. I'm sure you could buy a database from some company that maintains them. Might be a good idea to search one out and find out what format they use for addresses. That would end up being too big for ACE. You'll need SQL Server or some other RDBMS
I did it once before, 40 years ago for the USA and Canada, so I can probably do it again in Access with a little more knowledge and expansion. Once I know my tool. What I would like is a resource for this. I don't know of an ISO standard for addresses as of yet. I can't imagine a worse system than I've seen here in Colombia, but you never know. Wait, I take that back. Costa Rica took the cake years ago, but I hear they've modernized their address method. You used to address a package something like this: "Ebram, two blocks from the kid's basketball court, house with big tree and green door." Even my tico friends were lost three blocks from their own home.
 

Pat Hartman

Super Moderator
Staff member
Local time
Today, 01:40
Joined
Feb 19, 2002
Messages
42,981
The BE should ALWAYS be split, even for single user applications.

When I lived in Kuwait, most houses didn't have street addresses. My house was known by its name "Medina House". Probably because it was once occupied by a member of the royal family before it was turned into 3 apartments and a business office. Anyway, even though my street was named and had only four houses on it, every one I met knew it by its name.
 

twgonder

Member
Local time
Today, 00:40
Joined
Jul 27, 2022
Messages
177
The BE should ALWAYS be split, even for single user applications.

When I lived in Kuwait, most houses didn't have street addresses. My house was known by its name "Medina House". Probably because it was once occupied by a member of the royal family before it was turned into 3 apartments and a business office. Anyway, even though my street was named and had only four houses on it, every one I met knew it by its name.
Now I got to add a field for name and not number on the street. Here we go deeper in the rabbit hole.
 

Pat Hartman

Super Moderator
Staff member
Local time
Today, 01:40
Joined
Feb 19, 2002
Messages
42,981
I do that in my investment property analysis database. Having the house number with the street name makes it harder to find properties on the same street. But I define the house number as text so it can include letters. I don't need to break the "number" into multiple fields where the letter can come before the number or after it.

Common sense is a very good attribute to have when it comes to design work.
 

twgonder

Member
Local time
Today, 00:40
Joined
Jul 27, 2022
Messages
177
Common sense is a very good attribute to have when it comes to design work.
Oh, please come and tell that to the government agencies I have to work with.
And not just the government. My bank has me logon to my account here over the WEB with my four-digit ATM PIN. How hard is that to crack?
 

GaP42

Active member
Local time
Today, 16:40
Joined
Apr 27, 2020
Messages
310
Some other considerations :
1. We needed to retain address history - to support disease spread / tracking - ie epidemiology - datawarehouse
2. Commercial software was used/ integrated with the application to validate address data on entry, and to supply standardised output - including geocodes
3. Even so, an entity may have a physical and a postal address - do you want both?

What is key is being fully aware of the business needs and being able to support those through storing data in a properly normalised form. When you have a scope covering over 100 countries - don't expect much standardisation - the presentation format for addresses varies considerably, and terminology for these elements/attributes too. It will take considerable time to research and analyse to come up with a resilient db structure, without getting into the other items to manage contact data.
Given the scope of this work I presume you are going to be requesting data sets to load - which is where you will be setting your data standards/definitions and quality constraints, in the first instance - and finding out if the data suppliers can comply.
 

strive4peace

AWF VIP
Local time
Today, 00:40
Joined
Apr 3, 2020
Messages
1,003
hi @twgonder

I have 2 contact management systems with source code in Access that you may want to look at to get ideas.

http://msaccessgurus.com/tools.htm#Contacts

The Contact Management Template is older and more robust (and also has many extra tables so just import what you need). MyContacts is newer and has better naming but isn't split into FE/BE and is easier to learn.

ps, Welcome to AWF!
 
Last edited:

Users who are viewing this thread

Top Bottom