Atomic Data…?

ChrisO

Registered User.
Local time
Tomorrow, 07:10
Joined
Apr 30, 2003
Messages
3,202
A few general questions about the nature of atomic data.

What exactly does it mean?

If a field in a table defaults to Now() does it not store both Date and Time in the one field? Sometimes we want just the Date and sometimes we want just the Time.

If a telephone number requires Country Code, Area Code, Number and Extension is it not placing four (4) atomic chunks of data in the one field? If a database requires knowing each individual digit, perhaps for hardware routing purposes, should not all digits be stored in a different field and maybe also in related tables?

If a person has more than one Title, should the titles go into more than one field?
If a person has more than one Initial, should the initials go into more than one field?

If a Surname may be O’Brien, le Chartier or Du Pont, should not the surname field be broken down to more fields to allow for easier criteria for surname grouping?

Atoms were once thought to be atomic. (Perhaps a Quark of fate? :o )

In other words, if we cannot satisfy the requirements for atomic data (First Normal Form), then what claim do we have on the remainder?

Regards,
Chris.
 
1. Date/Time - You are thinking of these as separate entities when in fact a Date/Time field actually stores a point in time. The display format of that is variable depending on what you want to see but the storage value is a Double Precision number with the integer part representing the number of days before (negative) or after (positive) Dec, 30, 1899 and the decimal part representing the number of milliseconds since midnight. VBA and SQL include numerious functions to work with this very well defined datatype.
2. Telephone numbers - Again, a phone number is the address of a particular phone. It is pretty variable though and should be stored in at least four parts - international access code, area code, number, extension. Phone numbers and addresses fall into the same category of attributes. These are multi-field attributes where very few applications actually care about the constituient parts. The phone company cares about the details of a phone number and some demographics collectors would care about country and area codes. But the vast majority of systems would lump the first three fields together and have a separate field for extension. With addresses, most applications isolate, city, state, postal code, and country and then use 1, 2, or 3 text fields to hold the other parts of an address. Some applications need to have the address line details broken down into their atomic parts because they need to collect demographics but most just need the address to facilitate mail delivery and so don't break the fields down.
3. If you are talking about titles in the way the nobility uses them, they should be defined as any other repeating group - a many-side table so that each title can be stored uniquely.
4. An initial field is usually defined as a single character. Middle name would be more appropriate since it can store multiple initials if necessary. Middle name(s) is unimportant in the greater scheme of things. We need first and last names to be separated so we can properly personalize communications and so we can properly sort name lists but I personally have never seen a use for middle name. In cultures where people generally have more than three names, applications would need more than the three standard name parts used in the Western world.
5. A surname is a surname and shouldn't be separated. If you need to also store a familial relationship such as "ne San Bernado", then use a separate field.

Atoms can be further divided but not into pieces that stand alone and the same goes for any table column. A table column of more than one character in length could be divided into multiple single character fields. And even those could technically be divided into bits. But we have no way to store a single bit and who wants to concatenate multiple single character fields to get something that makes sense. Above all, relational database theory makes sense. Use common sense when breaking attributes into their atomic parts. Don't forget to plan ahead either. You may not have a requirement today to send letters to the people who's names are stored in your tables, but you may tomorrow. So, don't make non-atomic fields simply because it seems easier. Laziness will surely be punished.
 
Thanks for taking the time to put that together Pat.

Discussions on database theory sometimes seem to miss your point…
“Above all, relational database theory makes sense. Use common sense when breaking attributes into their atomic parts.”

Finding that “common sense” limit at times seems to vary from database to database.

With only slight reference to the atomic nature of the data: -
Names of people are my biggest problem at the moment because of the fact that I get data, from another system, in just three fields… FirstName Surname and PreferredName all in uppercase, sometimes hyphenated and sometimes without the PreferredName.

The biggest problem is the uppercase since all information about proper case has been lost and can’t easily be regained.

I guess that’s the point with the Date/Time field as well… once the information is thrown away it can’t be regained. It’s the two pieces of data together that specify a single point in time.

We could separate the data into Date and Time fields, which may better serve the atomic nature of data, but as you say…would it make sense to do so?

Thanks again and regards,
Chris.
 
We could separate the data into Date and Time fields, which may better serve the atomic nature of data, but as you say…would it make sense to do so?
Absolutely Not, it does not make any sense to separate date and time into individual fields. They are together in a single field because that is what you need when you want to compare them or find elapsed times. Write the code to compare two date/time values when the data is stored in separate fields and write the code to calculate the elapsed time when the data is stored in separate fields. THEN you'll understand why date and time BELONG in the same field.

Converting upper case names into proper case is fraught with problems. Search the archives for code that handles the odd cases better than the StrConv() function.
 
Hi Chris & Pat,

Nice discussion.

Time is a purely "atomic" data element. Its units are static and
doing operations like DateDiff, DateAdd and so on are very easy.
You can display it in a multitude of ways, but it's stored "atomically".

To me "atomic" is breaking down:

9 feet
3 yards
108 inches

You'll kill yourself trying to parse them.

Good thread!

Wayne
 
G’day and thanks Pat and Wayne

Now for a short explanition…

The original and subsequent posts of mine were somewhat rhetorical.
I knew I could not pull this off without the help of others.

My aim is to open a discussion on the practical application of normalization.
(Sometimes the no-bodies require the help of the some-bodies to do things.)

I did not ask Pat or Wayne to reply but that was my hope, and that you did superbly, so much so that I think it requires an entry in the FAQ.

Normalization and Common Sense
subheading Atomic Data

next subheading …Storing Calculated Values
(Believe me I will get around to it.)

I hope you forgive me, Pat and Wayne, for the apparent waste of your time, but the postings that you make are never a waste of time.

Regards,
Chris.
 
ChrisO said:
We could separate the data into Date and Time fields, which may better serve the atomic nature of data, but as you say…would it make sense to do so?

Let me reiterate...according to Pat's post if you created a seperate field for date and time, you would actually be storing the same value twice; it would just be formatted differently.

And to summarize Pat's original post...,"It depends on what you need to do." Part of good system development is understanding the requirements. If your users need a slice of bread, don't give 'em a farm.
 

Users who are viewing this thread

Back
Top Bottom