Store in Rows or Columns

LordLoki

Registered User.
Local time
Today, 23:54
Joined
Jun 28, 2014
Messages
28
Hi together,

I hope the Title is not to shabby.
i need to store a lot of Values for a tool that generates statistics for outbound Campaings and i am not sure if it is better to store my different Values in a Column or Row based structure.

To give you a better understanding here is an Example:

Here what i call row based: (KPi would be 30 different entrys)

Year,Month,Week,ProjectiD,Partner,KPI,Value

And Column based would be the same but instead of KPI i would have 30 Columns and the value would be stored in that column.

I am not sure what are the advantages or disadvantages of each way. I assume the Row based storing would be slower cause you have a lot more rows in your table and if you want to get a specific KPI u would have to use a where clause and process all the rows. but i am not sure what is best practice

Thanks for your comments and greets Loki
 
Rows. That's the normalised way.
If you add a new KPI it's simple. If you don't need to record one, you are only storing it when you need to
Your forms will be simpler to maintain and not need redesigning if your KPI's Change.
 
hi Minty,

Thanks for the Quick reply.
Good to hear that cause my initial planning was based on Rows but i started to doubt it and wanted to have a second opinion.

And thanks for pointing out that its the normalised way that key word gave me something to google for and i got some pretty good articles about structuring Tables and Databases that helped me improve my Concept.

Now i can start building it knowing im not making it terribly wrong :D
 
I think to some extent it depends on how you intend to use the values.

It is much easier to access the values if they are in a single row with multiple columns

Logically, if they are independent of each other, but are all dependent on other items in the table, then they "normally" would go in columns. That's 3rd normal form I think.

The KPI's may be different data types - strings, dates, numbers etc. which is another indication that they belong in columns, not rows.

out of interest, you don't need the year, month, week. Date would be sufficient.

so you might have

projectID, date, partner, mailings sent, replies received, other kpi stats.
 
Would each row have just one of the KPIs filled in for each entry.

Ie KPI A done on 1/1/2000 for project 1
Ie KPI B done on 1/1/2001 for project 1


and different project have different KPIS, maybe KPI A can be done many times for project 1.

I'm definately leaning towards rows as more flexible.
 
if you go for rows then each row would only have one kpi available to store. That's the point.

either

project, kpi1, kpi2, kpi3, kpi4, kp5, ... , kpi30

OR

project kpinumber (between 1 and 30), kpivalue
 
if you go for rows then each row would only have one kpi available to store. That's the point.

either

project, kpi1, kpi2, kpi3, kpi4, kp5, ... , kpi30

OR

project kpinumber (between 1 and 30), kpivalue

Sorry I meant if all KPIs where in one row, would one column be filled in, ie the main row has a date field, it implies that a new row would be added for a new date when another KPI column would be filled in.

So you would end up with multiple columns and row for each project.
Adding new KPIs would be problematic.
The data type in the row option is a problem though.
 
if you have all the kpis in one row you would end up with this. wide and short

kpis for Week12 x, y, z, 1, 2, 3 etc.
kpis for Week13 b, c, d, 11, 12, 13 etc.
kpis for Week14 e, f, g, 12, 23.6, 14 etc.


I doubt if you would have random kpis appearing on different days, but you know what data you have.

if you have them vertically you get this - long and thin

kpis for week 12 - kpi 1, x
kpis for week 12 - kpi 2, y
kpis for week 12 - kpi 3, z
kpis for week 12 - kpi 4, 1
kpis for week 12 - kpi 5, 2
kpis for week 12 - kpi 6, 3
etc
kpis for week 13 - kpi 1, b
kpis for week 13 - kpi 2, c
kpis for week 13 - kpi 3, d
kpis for week 13 - kpi 4, 11
kpis for week 13 - kpi 5, 12
kpis for week 13 - kpi 6, 13
etc


the same data, just stored differently.

normally data is stored long and thin, but in your case, it isn't wrong to store short and wide. The reason is that with the first method, the data IS still normalised, since all values in columns are dependent on the key column. It wouldn't be normalised if some of the columns were dependent on other columns.

It's a matter of taste as to which method you prefer. I would probably go for the former, as the latter involves a lot of work to extract all the kpis for a given week, say.
 
Dave's (Gemma) wide short example is the relational approach and is normalised. The tall thin example is what is called EAV (entity-attribute value model) - google EAV database design for loads on this.

There's pros and cons for both of which many have already been mentioned. I think if your KPIs are all of the same thing then I would certainly recommend the EAV (tall,thin) approach e.g. say all the KPIs are about call answer times KPI1 = time to answer software calls, KPI2= time to answer hardware calls, KPI3=time to answer Network calls (these are all "time to answer KPIs").

But if they are different KPIs then the relational (wide/short) approach e.g. KPI1= time taken to answer call, KPI2=time taken to deal with call, KPI3= time taken to do call wrap-up (although these are all KPIs about time durations, they are for different types of KPIs)

It's been mentioned that the relational method is harder to add more KPIs. But consider the case of modelling a person and a persons attributes (hair colour, date of birth, town of birth etc). This can be dealt with by either of the above models but most of us would naturally choose the relational model (wide/short) - that would entail adding more fields if you wanted more attributes. Do your KPIs change often?

hth
Chris
 
Chris. thanks for identifying the technical terminology!
 

Users who are viewing this thread

Back
Top Bottom