Data or information?

lmcc007

Hi everyone,

I’m trying to understand how to build databases correctly, so I am going back over the Running Microsoft Access 2000 book.

I’m having some trouble understanding a section in the book on Building a Database, page 86.

“Data or information?

You need to know the difference between data and information before you proceed any further. This bit of knowledge makes it easier to determine what you need to store in your database.

The difference between data and information is that data is the set of static values you store in the tables of the database, while information is data that is retrieved and organized in a way that is meaningful to the person viewing it. You store data and you retrieve information. The distinction is important because of the way that you construct a database application. You first determine the tasks that are necessary (what information you need to be able to retrieve), and then you determine what must be stored in the database to support those tasks (what data you need in order to construct and supply the information).

Whenever you refer to or work with the structure of your database or the items stored in the tables, queries, macros, or code, you’re dealing with data. Likewise, whenever you refer to or work with query records, filters, forms, or reports, you’re dealing with information. The process of designing a database and its application becomes clearer once you understand this distinction....”

To me “data” is synonymous with information, facts, figures, numbers and so on. I guess this is why I am having a hard time understanding the above paragraphs.

Can anyone explain it in a way that a dummy can understand it?

Thanks!
 
Doc could (and hopefully will) explain this better than I, but in a sense you are correct. The book, however, is trying to describe the distinction between the checks, deposits and bank charges on your bank account (data) and the current account balance (information).
 
If I may add: it depends on the type of database you wish to build, but take Sales Orders as an example.

The data you would need to store is how much was sold and at what price.

The information you would retrieve is Total Sales Value (which would be calculated from your data).

Doc (and others) can give far better explanations than this, but I think this is the gist of it - you STORE data, but CALCULATE information.
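
A minimal sketch of that idea, using Python's built-in sqlite3 module as a stand-in for Access. The table name, column names and sample rows are made up for illustration; only quantity and price are stored, and the total is worked out at query time.

```python
import sqlite3

# Hypothetical table: the post only says "how much was sold and at what price".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_lines (item TEXT, qty INTEGER, price REAL)")
conn.executemany(
    "INSERT INTO order_lines VALUES (?, ?, ?)",
    [("pencil", 5, 0.50), ("pen", 2, 1.25), ("eraser", 3, 0.75)],
)

# Data: the stored rows. Information: the total, calculated when asked for.
(total_sales,) = conn.execute(
    "SELECT SUM(qty * price) FROM order_lines"
).fetchone()
print(total_sales)
```

Note that no "total" column exists anywhere in the table; it only appears in the query result.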
 
Thanks guys, but I'm still not getting it. Let me think about it some more and reread this section again and then maybe I'll get it.
 
What I'm driving at (in my inelegant manner) is that in constructing your tables, you should ensure that you STORE the minimum amount of data required. Smaller runs faster.

For the Sales Order example, the Customer name would need to be stored, the sales order number would be autogenerated from PK, Order Item, Qty & Price would need to be stored.

Most other Information required (Sales reports, Profit Margins, Sales trends etc) can be calculated in queries from the basic stored data

Hope this helps
 
Here's my stab...

Data is raw. Irrespective of application to computers, data is the raw facts/figures/measurements/scores that you record. In the case of a sales order it could be:
Customer name
Required date
Products required
Quantities required
Price charged

At this stage nothing has happened to the details recorded on the order other than that they have been recorded on a piece of paper or on a screen. So we would consider it data.

Now consider the case where you have 100 such orders. Again all you have is a pile of data. The only meaning you can immediately see from the data is from the individual words/numbers.

What we want to see is information. Information is data that has been processed.
In our example we want to know:
- the total value of each order
- total number of orders
- the total sales for each month
- the total sales by customer.
To gain information you have to perform some task on the data (usually some calculations).

So when applying the Data/Information philosophy to databases, Data is the stuff you store (input) and Information is the stuff you get out (output) after having performed some task on the data.
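
As a rough sketch, the four questions above could be answered with queries like these. This uses Python's sqlite3 module rather than Access, and the table layout, column names and sample orders are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout: one row per product line on an order (the raw data).
conn.execute(
    "CREATE TABLE order_lines ("
    "order_id INTEGER, customer TEXT, required_date TEXT, "
    "product TEXT, qty INTEGER, price REAL)"
)
conn.executemany(
    "INSERT INTO order_lines VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "Acme", "2008-01-15", "pencil", 10, 0.5),
        (1, "Acme", "2008-01-15", "pen", 4, 1.5),
        (2, "Bolt", "2008-01-20", "eraser", 8, 0.25),
        (3, "Acme", "2008-02-03", "pen", 2, 1.5),
    ],
)

def query(sql):
    """Run a query and return all result rows."""
    return conn.execute(sql).fetchall()

# Information: each result is computed from the stored data, not stored itself.
value_per_order = query(
    "SELECT order_id, SUM(qty * price) FROM order_lines "
    "GROUP BY order_id ORDER BY order_id"
)
order_count = query("SELECT COUNT(DISTINCT order_id) FROM order_lines")[0][0]
sales_per_month = query(
    "SELECT substr(required_date, 1, 7) AS ym, SUM(qty * price) "
    "FROM order_lines GROUP BY ym ORDER BY ym"
)
sales_per_customer = query(
    "SELECT customer, SUM(qty * price) FROM order_lines "
    "GROUP BY customer ORDER BY customer"
)
```

The same pile of rows feeds all four answers; only the grouping changes.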

In terms of database design you have to consider data to be stored and information required. Suppose you were designing a database to work out how much to pay people each week. The information you want is the total hours (which you expect your database to calculate). The data you input (store) could be the start/finish time each day or it could be the total hours each day depending on what data is available/collectable.
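
A small sketch of the first option, where start/finish times are stored and the total hours are calculated. The sample shifts and the function name are invented for illustration.

```python
from datetime import datetime

# Hypothetical stored data: one (start, finish) pair per day worked.
shifts = [
    ("2008-03-03 09:00", "2008-03-03 17:30"),
    ("2008-03-04 09:00", "2008-03-04 17:00"),
    ("2008-03-05 08:30", "2008-03-05 12:30"),
]

def hours_worked(start, finish):
    """Return the hours between two 'YYYY-MM-DD HH:MM' timestamps."""
    fmt = "%Y-%m-%d %H:%M"
    delta = datetime.strptime(finish, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds() / 3600

# Information: the weekly total, derived from the stored times.
total_hours = sum(hours_worked(start, finish) for start, finish in shifts)
print(total_hours)
```

If only daily totals were collectable, the stored data would be the per-day hours and the calculation would shrink to a plain sum.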

hth
Chris
 
Personally, I don't agree with the distinction. The author is trying to impose meaning on words that is more restrictive than common usage of the words. I believe that he is trying to point out that how facts are stored in a table is not necessarily the way they are presented to or collected from the user. So why labour the point?
 
For my two pennies' worth:

Data - normally statistical information, numbers, etc.
Information is the non-analytical stuff.

So:
5 pencils
Both "5" and "pencils" are data, but you can only add up the number of pencils and group by pencils (data). However, if we have an info field attached:
5 pencils - wooden with lead in them to write with
5 pens - plastic with ink in them to write with
5 erasers - rubber to erase pencil writing

The first two fields are data; the last field is not, it's information.
5 and pencils (pens and erasers) are data.
You can group and add data; information you cannot.
 
My 2 cents! Data is the facts. Information is the relevant part of the data.
 
Hi guys,

My view regarding "Data or Information" is this: "Data" is the RAW facts of Information, while Information is a collection of related data.

Been teaching that stuff for over 8 years back in my home country :)

Thanks.
 
First and foremost, this is an author's attempt at defining his terms. As such, using the concept of "literary license," s/he is permitted to make the distinction for purposes of discussion - as long as s/he is consistent in that usage. The unfortunate part is that s/he has chosen to use words that have "baggage." Hence the confusion. Nevertheless, there is merit in making this distinction.

Many of you have heard of my "Old Programmer's Rules" including this little gem: Access can never tell you anything you didn't tell it first, or explain to it how to compute it.

Working backwards, "Information" (as I read the passage) is the author's term for what you want OUT of the database, presumably organized in some way. But to get that result, you must put in some supporting "Data" (and/or formulas, procedures, and queries) to provide the framework that leads to the result you wanted.

Neileg, you are right. The words CAN be taken as nearly identical in meaning. Many people do so. It is a matter of usage.

Jepoysaipan, you are ALSO using terms that carry baggage, just as the author of the quoted article is. Again, it is FAIR GAME TO DO SO - but be aware that, like Humpty Dumpty from Through the Looking-Glass, you are making words have the meanings you want, not their "natural" meanings. To someone in the IT industry, the distinction is fairly clear. To a person of limited experience such as our original questioner, this SEEMINGLY artificial distinction is tough to understand.

I take a more pragmatic view. Anything is data and anything is information depending on the question you asked. It is quite possible for the same exact field in the same exact record in the same exact database to be BOTH data and information - or NEITHER - based on how you accessed or didn't access that datum.

It is ALL relative. With one exception... if you COMPUTED it from the contents of a database, it is not an ELEMENTARY entity, whatever it is, and DATUM (plural: DATA) is generally considered elementary. Whereas if it just sits there, unused and unwanted by anyone, nobody CARES whether it is elementary or not.

Which leads me to the philosophical question, "If a datum gets displayed on a screen when nobody is there to see it, is it information?" (This is not so off-subject as you might think, since "information," per the quoted article, implies intent and usage.)

Back to the point at hand. Data vs. information is an ARTIFICIAL USAGE of two words that to the common man are absolutely synonymous. Only when you become uncommonly used to such artificial distinctions will you feel better about such usages. You will encounter such a thing again and again. Get used to it and realize that SOMETIMES we really DON'T have the words we need to express concepts in an ever-changing world. This is why dictionaries grow. You can bet that the ORIGINAL meanings of both words were nigh on to identical - but one of them has grown more than the other through variant USAGE. Just like data becomes information through usage.

Hope that clears up the matter.
 
Well, when I reread the section I'm thinking that "subjects" are tables and "data" are fields where the data is inputted and what you see on the screen is information. But now I'm confused again. Let me read it again and then sleep on it.
 
Who remembers the Data Processing Department?

Another way of looking at "data": in the old days there were no Foreign Keys, so you could input the United Kingdom also as GB / Great Britain / UK etc. These were free-format text fields and had little or no input control. Nowadays we have a relational database, and this is far more important than distinguishing between data and information. Today we would use a Country table. This may be considered static data.
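
A minimal sketch of a Country table with a foreign key, again using Python's sqlite3 module in place of Access. The table names, column names and customer names are hypothetical. Note that SQLite only enforces foreign keys once the `foreign_keys` pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite

# Static lookup data: one agreed spelling per country.
conn.execute("CREATE TABLE country (code TEXT PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE customer ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, "
    "country_code TEXT NOT NULL REFERENCES country(code))"
)
conn.execute("INSERT INTO country VALUES ('GB', 'United Kingdom')")

# A value that exists in the Country table is accepted.
conn.execute(
    "INSERT INTO customer (name, country_code) VALUES ('Example Ltd', 'GB')"
)

# A free-text variant is rejected instead of being silently stored.
try:
    conn.execute(
        "INSERT INTO customer (name, country_code) "
        "VALUES ('Other Ltd', 'Great Britain')"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

With the old free-text field, both inserts would have gone through and the two customers would not group together in a report by country.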

Databases should be led by the useful business information that they produce, albeit from the data that has been collected!

Simon
 
In the context of lmcc007's book the distinction is clear. Data is what you put in - Information is what you get out. As Doc says, this is an artificial constraint on the everyday meaning of these words, but the author may or may not have a good reason for this.

Perhaps another way of looking at it is that Data is the raw facts that are input and Information is the organised subset that you see when you query the DB.
 
I'm thinking that "subjects" are tables and "data" are fields where the data is inputted and what you see on the screen is information.

Yes, no, and maybe ... and even maybe not. I can display raw data on a screen, too. What I see on the screen is information if and only if I have in some way organized the underlying data. Just because it is on a screen somewhere doesn't mean it qualifies as information. There is no linkage between the two concepts ("viewed" = "information") at all.

Here is another, slightly farther-out viewpoint. If data elements are raw, isolated entities that come in with limited semblance of order or meaning, with no reasonable odds of predictability, then... you need to apply work to organize them, to provide that semblance of order. (Don't confuse ORDER in this discussion as being limited to an "ORDER BY" clause. SELECTs, JOINs, and UNIONs are also useful here. Other types of organization also apply, such as table normalization, establishment of formal relationships, imposition of validation edits, etc.)

This work that you do changes the net entropy of the data set as a whole. Now ORDER (in the sense of predictability) is considered a form of information. Organizing a data set changes its apparent randomness and thus creates information in the entropy-related sense of the term. Not all changes create information through work, however.

In the formal sense of entropy-related definitions of information, you have created information if and only if you can make better predictions of what you see after you apply the work than before you did so. If you make no change in the predictability of what you see, you have not changed the amount of information in the data set. In thermodynamics terms, work that doesn't change predictability is "lost or wasted energy." In the office it is called "lost or wasted effort."
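
One way to put a number on "predictability", assuming Shannon entropy of the observed values is an acceptable stand-in for the measure meant here (the post does not name one). The sketch compares a free-text country field with the same field after a validation edit; the sample values and the mapping are invented for illustration.

```python
import math
from collections import Counter

def shannon_entropy(values):
    """Shannon entropy, in bits, of the observed value distribution."""
    counts = Counter(values)
    total = len(values)
    return sum((n / total) * math.log2(total / n) for n in counts.values())

# Raw input: four spellings of the same country, each seen once.
free_text = ["UK", "GB", "Great Britain", "United Kingdom"]

# The "work": a validation edit that maps every variant to one code.
canonical = {"UK": "GB", "GB": "GB", "Great Britain": "GB", "United Kingdom": "GB"}
validated = [canonical[value] for value in free_text]

entropy_before = shannon_entropy(free_text)  # four equally likely spellings
entropy_after = shannon_entropy(validated)   # one certain value
print(entropy_before, entropy_after)
```

Lower entropy here means the next value is easier to predict. Work that left the two numbers equal would not have changed predictability in this sense.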

SO according to the laws of thermodynamics as applied to stellar fusion dynamics and other randomizing influences, "information" occurs when you impose some form of order (or perhaps you prefer "organization") on raw data.

Don't forget, though, that just because you organized it AND changed its predictability STILL doesn't mean it is USEFUL information. For instance, USA Baseball statistics from before 1920 seem a bit... useless. Nobody is still alive to whom those statistics apply, so far as I know. But there are those who swear that such things are the most important bits of information since the invention of sliced bread.

Which is why in thermodynamics we have not less than six different ways of computing both energy and entropy in the same systems. There is the Gibbs-Helmholtz view, the classical Newtonian view, the "free energy" view (the latter being important in batteries), etc. etc. etc.
 
Hmmm... I see I'm getting a bit far-out in that last little sortie into the world of entropy as a measure of information. Time to increase my medication again.
 
We always enjoy you expanding our general knowledge Doc!
 
I didn't get it, The Doc Man. Too technical for me. I need basic stuff.

I'm thinking that what you type in is what you get--no data, no information--data = input, information = output.

The author wrote: "You need to know the difference between data and information before you proceed any further. This bit of knowledge makes it easier to determine what you need to store in your database." Did the author overstate this?
 
We used to be Data Processing Managers; we now have the grand title of Information Technologists!

Simon
 
