Correct schema for validating City-Country-Region? (1 Viewer)

Gopher__

Registered User.
Local time
Today, 06:12
Joined
Jul 23, 2012
Messages
13
Hi folks,

My first post here :)

I am currently in the process of building my first serious DB (generic customer orders). I am beginning by building the tables for storing customer addresses.

My professional background is as a data/database/quality management analyst in Finance and I am generally very picky when it comes to data validation and definitely subscribe to the "no data is better than bad data" ethic.

So all this leads to me trying to create an address system where data is fully validated at table level while I want forms to be dynamic so that, if you selected 'United Kingdom' as the country, you will get a filtered options list for cities currently stored in the DB belonging to the United Kingdom.

The actual form query through event handling is not a problem for me (I should at this point perhaps state that I have used PHP/SQL a fair amount, have built dynamic websites and I would consider myself pretty much expert in Excel, including VBA for Excel). However, I dont know if I am entirely sure of the table structure for this.

I have thought it through and have come to the conclusion of: -

1) A table for countries with an autonumber PK

2) A table for cities with three fields - Autonumber PK, City and CountryID (FK). The City and CountryID fields would be a compound index (no duplicates).

3) A table for cities with three fields - Autonumber PK, Region and CountryID (FK). The Region and CountryID fields would be a compound index (no duplicates).

This would allow for the form to requery on change events to return only the relevant data. Where no city or region exists for that country, the user would be given the option of opening a form that will allow them to edit the relevant table to add the missing data, then the parent will requery on close.

Hoping I have explained that properly, does that make any sense?
 

MarkK

bit cruncher
Local time
Yesterday, 22:12
Joined
Mar 17, 2004
Messages
8,180
You could make one table that self-references, so ...
tblGeo
GeoID (PK)
ParentID (FK linked to the the GeoID in this same table)
Name (Canada, British Columbia, Vancouver)
Abbrev (CA, BC)
Type (Country, Province, City, Neighborhood, State, Electoral District, Street, River, Mountain, etc...)
But you're not Google Maps. How much time and energy do you want to spend on this data, which is really only a side-show in terms of what your database is for?
Are users going to find this feature useful, or does this satisfy more the perfectionist in you? Will users be annoyed they can't just type or copy in the address?
 

Gopher__

Registered User.
Local time
Today, 06:12
Joined
Jul 23, 2012
Messages
13
Hi,

Thanks for the response.

I dont want to spend a terrific amount of time on the data. The database is actually for one of my businesses (an Ebay-type business) but I am generally just trying to work out what the best sort of schema to use for this type of system.

I am a bit of a perfectionist, yes. But I all want something that is user friendly and will be applicable to up-scaled data models for SMEs etc.

Personally, I dont see how users will be annoyed if they cant just type in the address. Certainly, it is more time consuming to have to add in a new Country-Region or Country-City but, generally speaking, there needs to be some kind of best-practice involved here or data integrity will go out the window.
 

MarkK

bit cruncher
Local time
Yesterday, 22:12
Joined
Mar 17, 2004
Messages
8,180
My 2c is that there is no "best schema" except for the one that serves your purposes most completely.

I have recorded addresses as a single text field for the whole thing, which is easiest by a long shot, and I've written code to return validated addresses from Google's API at the other extreme. One is cheap and easy, the other, expensive and hard, and there's everything in between.

The easiest, if you'll be collecting data from online users, is get them to enter their own addresses in text fields that you don't validate. It would be easy enough to swipe one of those drop-down lists of countries from an existing web site, which will be in the page source somewhere.

The other things is if you want your data to conform to standards you have just because you feel like it, that's cool too. I like a self-referencing table because I like treeviews and the programming doesn't scare me. Other people will prefer to have, as your original post suggests a table for countries, for cities, and so on.

Hope that helps, and welcome to the forum.
Mark
 

Gopher__

Registered User.
Local time
Today, 06:12
Joined
Jul 23, 2012
Messages
13
So would this schema be OK for the table layout I mentioned in the original post?

CountryID-Region and CountryID-City are both unique composite indexes.

i46. tinypic. com/2czx7b6.jp g
 

Gopher__

Registered User.
Local time
Today, 06:12
Joined
Jul 23, 2012
Messages
13
Sorry about not being able to post pics - my post count is not high enough yet. I hope I havent bothered the mods.
 

MarkK

bit cruncher
Local time
Yesterday, 22:12
Joined
Mar 17, 2004
Messages
8,180
I give every record in every table a unique key. Your unique index makes sense.
If it makes sense to you to store the city as a child of the country then do so. I've more commonly stored it as a child of the region, but again, this is my main point, that for address data there are many 'right' ways to approach the problem.
 

Gopher__

Registered User.
Local time
Today, 06:12
Joined
Jul 23, 2012
Messages
13
Thank you very much for your input. It's kinda laborious trying to understand best-practice for a new technology and a lot of the Access DBs I have come across often seem to me to be too simplified for my liking and dont allow proper data-validation.

I think the main problem I have with the examples I have seen so far as things such as derived-data for historical prices etc. In my mind, it makes much more sense and is far more professional to keep a relationship instead of derived data as simply overwriting the current price leads to massive problems regarding price histories of products etc, as it would only be possible to trace this via the derived data in order histories which, of course, is not a best practice for this.

Of course, all this is down to what you actually want to do with the DB - some people will want something minimal whereas anyone wanting any kind of business intelligence out of it will want something very thorough. I think what I am really aiming to do is produce solutions that are thorough yet user-friendly and one where I have known schemas that I can copy over to new projects.
 

Gopher__

Registered User.
Local time
Today, 06:12
Joined
Jul 23, 2012
Messages
13
I should probably correct that - the derived data is not what would cause the problem with the price history, the overwriting of the price would. There should be a table holding the prices (active and inactive) for a certain product and a form would be used for the user to change the price, the with VBA behind the form setting the old price to active, logging the date the old price ended and creating the new active price. That sounds right, I think?
 

MarkK

bit cruncher
Local time
Yesterday, 22:12
Joined
Mar 17, 2004
Messages
8,180
See, and in this case with prices, I think there is a best practice that I don't think exists for address data.

I would definitely recommend as a best practice that any data subject to change be stored with the object that requires a reference to that historical data. So for instance, in an invoicing system, the Product table contains a price, but that price is subject to change over time. As a result, the price must also be stored with each invoice detail record, which captures the price at the invoice date, and the price as it might be subject to discounts and so on. In this case, this is a discrete dimension of the invoice detail record as distinct from the base price that might appear in the Product table.

Address data doesn't have requirements like this. Address data you can legitimately store as one long string or break it up by street, city, region and so on. In both cases you are storing all the data. In the Price/InvoiceDetail issue above, if you fail to store the price with the invoice, you've made an omission and failed to store important data.

They're different.

Cheers,
Mark
 

Gopher__

Registered User.
Local time
Today, 06:12
Joined
Jul 23, 2012
Messages
13
Thank you. I think I follow. The data in a current invoice can be derived via a domain function (DLookup) or it can have a one-to-many relationship to the current price. And, if the second option is used, the price will retain as, when the price is updated, the field holding the price will still relate to the now historical price. That sound right?

Here is a question regarding best-practice for addresses I would like some help with...

Regarding customer/client addresses, the current schema I have is: -

One table (tbl_customers) holding just the basic, entirely static, information about the customer. For example, the customer name and possibly other information, such as registered number etc, if applicable.

A second table (tbl_customers_addresses) as a junction table to link the customer to their address(es).

A third table (tbl_addresses), holding all addresses whether they be contact addresses for that customer or billing/delivery address for the customer.

Now, here is the tricky bit (tricky for me at least!)... where do I put the field that states whether or not the address is the default for the contact or type of address (delivery, billing etc) for the customer? Lets say it's for the billing/delivery etc address. Do I have a boolean field (still cant get used to calling them "yes/no" fields ;)) in the junction table for each option (billing, delivery etc)? And simply have the VBA behind the form update the table to set those selections to true and make sure the others are set to false? The code itself wouldn't be hard I wouldn't imagine... I have done much VBA for Access but I know you can simply do it with an SQL update statements - one to set all possibilities to false then one to update the chosen fields to true. I would imagine this could also be done with DOMAIN functions. But, right now, I am more interested in the theory.
 

MarkK

bit cruncher
Local time
Yesterday, 22:12
Joined
Mar 17, 2004
Messages
8,180
A record in an invoice detail table looks like this
tblInvoiceDetail
InvoiceDetailID
InvoiceID
ProductID
Quantity
Price
Discount (maybe)
The price of that line item is not linked to a value that is subject to change, but notice that the product description is. The product description will be looked up from the product table, so if you reprint an invoice years later, the most recent product description will appear. But the price, in the invoice record, can't be subject to change. It must be stored with the invoice detail record, because it's mission-critical to the fundamental purpose of an invoice, to record the financial reality of the sales event.

I would not have a customer address table. Store all identically structured objects in the same table. If you need to distinguish those things from each other in that table, add a field, so maybe your address table needs an IsCustomer field. Then you write a qCustomerAddress query with a WHERE clause WHERE IsCustomer = true. Much simpler.

You can call yes/no fields boolean. Boolean is completely correct.

So a default, yeah, you need to make sure everything else is not the default. You can do a very efficient update of data in a table like...
Code:
CurrentDb.Execute _
  "UPDATE tAddress " & _
  "SET IsDefault = False " & _
  "WHERE CustomerID = 42", dbFailOnError
...and then set the specific address...
Code:
CurrentDb.Execute _
  "UPDATE tAddress " & _
  "SET IsDefault = True " & _
  "WHERE AddressID = 109", dbFailOnError
Cheers,
 

Gopher__

Registered User.
Local time
Today, 06:12
Joined
Jul 23, 2012
Messages
13
Forgive me if I am being a little slow here, still trying to get my head around things.

The idea behind my addressing schema would be so that the addresses are all held in a single table (as you suggest above) and that the addresses are related as a customer delivery/billing (lets call this business type for reference) etc address or as a customer contact address through a junction table.

With that example, it means that the addresses would be identified as customer contact or business type address through the junction table it uses. And the junction table for the business type can then have Boolean fields to identify it as delivery, billing etc.

Does that make sense?
 

Gopher__

Registered User.
Local time
Today, 06:12
Joined
Jul 23, 2012
Messages
13
I think my suggested schema is pretty much what is described in the schema in the link that jdraw posted above.
 

jdraw

Super Moderator
Staff member
Local time
Today, 01:12
Joined
Jan 23, 2006
Messages
15,378
Gopher,

You might wish to watch these free videos, that are available to anyone with access to the internet. There are other videos, these are just a sample.

http://www.youtube.com/watch?v=IiVq8M5DBkk Logical data modeling

http://www.youtube.com/watch?v=BGMwuOtRfqU Candidate key

http://www.youtube.com/watch?v=ZiB-BKCzS_I Normalization

http://www.youtube.com/watch?v=pJ47btpjAhA Normalization example

http://www.youtube.com/watch?v=q3Wg2fZENK0 1st Normal form

http://www.youtube.com/watch?v=vji0pfliHZI 2nd Normal form

http://www.youtube.com/watch?v=HH-QR7t-kMo 3rd Normal form

http://www.youtube.com/watch?v=q1GaaGHHAqM E_R Diagramming

You can go back and watch these as often as you want.
 

Gopher__

Registered User.
Local time
Today, 06:12
Joined
Jul 23, 2012
Messages
13
Thanks jdraw. I have actually seen those videos before and have watched a lot of the DB/Access stuff on youtube. I will probably watch them again though just to recap.
 

Gopher__

Registered User.
Local time
Today, 06:12
Joined
Jul 23, 2012
Messages
13
Now should I be as pendantic about storing phone, email addresses etc :rolleyes:
 

Gopher__

Registered User.
Local time
Today, 06:12
Joined
Jul 23, 2012
Messages
13
Regarding table and form-level validation. Given that all user interaction with the DB should be through forms, is there any need to have table or form-level validation through these objects properties when it can just simply be done in VBA on event? Is this just a case of preference? If so, I would much rather handle it all in VBA.
 

Users who are viewing this thread

Top Bottom