Validity period as key attribute possible?

Desafinada · Aug 20, 2009

Hello everyone,

I have a table Salary which contains the following fields:
EmployeeID, DateValidFrom, DateValidTo and SalaryAmount.

Right now the table has uses Autovalue SalaryID as key, but only because I don't know how to define the range spanned by the DateValidFrom and DatevalidTo to be key attributes along with EmployeeID.

Is that possible and when yes, how?

namliam · Aug 20, 2009

Yes possible, in table design simply click and drag or use CTRL + click to select multiple columns before clicking the "key" button.

I would still use the ID auto number as a PK though, because
1) You are going to need code to ensure the validity of the periods anyway
2) having a single column key is much easier than multy column

Desafinada · Aug 20, 2009

What a badly formulated question from my side. Thank you namliam, it's not that I didn't know how to set a more than one attributes to be key. I only thought I wouldn't need to any write validation code if I could somehow define the time range & EmployeeID to be PK together.

namliam · Aug 20, 2009

well having these 3 as a PK, you would allow:
ID 1
Start 01-Jan-2009
End 31-Dec-2009

ID 1
Start 01-Jun-2009
End 15-Feb-2009

Generaly speeking in things like this you dont want periods to overlap, having it as PK would not prevent this.

The_Doc_Man · Aug 21, 2009

I would strongly avoid using date-fields like that as PK components. I would stay with the synthetic key (autonumber) and then select the correct salary with something like, pick a date and then find the records with the picked data BETWEEN the effective dates. And for the current salary, put a way-into-the-future date for the ending date.

Something about the idea of date-based fields as PK components bothers me with regard to "pure" normalization. If I figure out what about it bothers me, I'll maybe be able to post back and explain it. Something is nagging me about the relationship of the data to the key, which may make it a structural issue. Were I to do this, I would not store it this way. I would use:

tblPayHist
- EmplID, FK to employee table
- EffDate, date field
- NewAmount, dollars per hour, per paycheck, per whatever you are using
- Terminate-flag, Yes/No - TRUE when the employee is terminated and no longer draws a salary or wage. (And set the NewAmount to 0 just to be consistent.)

Then you pick the salary according to the DMax of the date that is less than or equal to the targeted date. You can make that into a VBA function, for example.

On the pragmatic side, I know for a fact what else bothers me - the bigger the key field, the more overhead you induce in its maintenance. Adding two dates (=16 bytes) to a LONG for Employee ID gives you a 20 byte key vs. a 4 byte key. Which quickly reduces the rate at which you can do searches.

Desafinada · Aug 23, 2009

Thank you, The_Doc_Man, for a very good response. Now that you've said it, naturally a three-column keys has a high overhead, especially for indexing I guess. I always thought artificial keys only create additional, useless data but now I know better.

gemma-the-husky · Aug 23, 2009

surely all you need is employee plus effective date as the key for the salary
(assuming you are storing salary changes in the table) Why do you need the salary end date at all - indeed you wont know the enddate until you apply a new salary.)

surely a salary is effective until its replaced by a later one.

then just have employeeid plus effectivedate as the PK for the table (or as a noduplicates secondary index)

------
as pointed out, it still might not be sufficient on its own to ensure table integrity - ie when you change a salary, you need to make sure that the sequence of dates is what you expect - which is likely to need some code perhaps

Desafinada · Aug 23, 2009

Hi Dave, I didn't mention it in the original question, bu I need to produce financial reports for the company's productivity, so all the relevant data for me are past data. Among other things, the amount of salary paid to each employee for any particular month is of interest. Therefore I need to keep the history.

gemma-the-husky · Aug 24, 2009

i dont mean that

i mean if a guy gets 15000 salary from 1/Jan/09
and then gets 16000 from 1/July/09

you dont need to store 30th June 09 as the final date for the 15000 salary.

he is always earning the salary based on the record equal to the max date <= the date checked

its a lot easier for you, if you dont have to be modfiying end dates all the while

Desafinada · Sep 2, 2009

Hello Dave, sorry for responding late, I was on leave. Thank you for the clarification and I believe you are right.

To everyone:
another related problem bothers me, and I hope The_Doc_Man could reply as well.
Remember my table Salary: Salary ID, Personal ID, Date Valid From, Amount
It is important that there are no two entries with the same Personal ID and Date Valid From, but aside from letting them acting as a key together, I don't know how to enforce this. Could you help?

namliam · Sep 2, 2009

Well you have to enforce it yourselve before/during data entry, you can also make a unique index on it to have the database do (some of) the work for you.

Having a "start" and "end" date on things like this is not that strange not even that bad... This allows you to enforce sequential changes
i.e. For the same employee
1/1 1.000
3/1 1.100
5/1 1.125
2/1 1.050

Usually you dont want this to happen...

Desafinada · Sep 2, 2009

Hello Namliam,

On using Startdate and Enddate: I've decided to go for the only startdate variety because it reduces the risk of error during a query. I shudder to think what would happen if there are missing periods inside the reporting period, on which my query depends. For me, the responsibility to enter the correct source data should be born by the user.

I have a form that allows the user to enter a new salary record or to edit an existing one. I can control what the user puts in there but what if the user enters the data directly in the table? Is there a way I can set up a validation rule that does not check only the content of that particular column but also checks that this date for this person doesn't exist in the table already?

You mentioned a unique index, what do you mean by this?

Thanks in advance

namliam · Sep 2, 2009

Desafinada said:
Hello Namliam,

On using Startdate and Enddate: I've decided to go for the only startdate variety because it reduces the risk of error during a query. I shudder to think what would happen if there are missing periods inside the reporting period, on which my query depends. For me, the responsibility to enter the correct source data should be born by the user.

It is offcourse the responsibilty of the user... but if the app can help...

Desafinada said:
I have a form that allows the user to enter a new salary record or to edit an existing one. I can control what the user puts in there but what if the user enters the data directly in the table? Is there a way I can set up a validation rule that does not check only the content of that particular column but also checks that this date for this person doesn't exist in the table already?

users are not to use the table directly... Always use form(s)

Desafinada said:
You mentioned a unique index, what do you mean by this?

Thanks in advance

You can create a index in the table design and you can make that index to be Unique... Thus using that to enforce your unique requirement.
Breaking that index will trhow an ugly access error though... One that your users will probably not understand

The_Doc_Man · Sep 6, 2009

It finally hit me when rereading this thread after a couple of days of doing something else. I said I would re-post if I ever figured out what was wrong with having two dates as a PK or as part of the PK

The dates by themselves are not candidate keys. They have nothing to do with the person. I.e. two people hired on the same day and then promoted on the first anniversary of their hires would both have the same dates, differing only by employee ID. The compound key of employee ID and start date would be a candidate. Adding a second date to the key doesn't enhance uniqueness because the problem description says that the salary started on that date for that person. You would never have two records with the same person and same starting date and different end date because a person will not have two different salaries starting at the same time but with different end dates. The second date is dependent, not independent, and therefore cannot be a candidate key. Nor does it add value to the putative compound key because the person it respresents is single-valued. (I.e. you are modeling an employee's work history and pay history, not multiple employees.)

Therefore, the problem I felt but couldn't articulate before is that the two-date key wouldn't represent reality. Less a matter of having two dates as part of a compound key and more that for this business model, there is no reason to further discriminate beyond employee ID and beginning salary date to obtain uniqueness. So in this case, if you HAD made the compound key of EmpID, SalaryStart, SalaryEnd, it would have been reducible with no loss of uniqueness.

Validity period as key attribute possible?

Desafinada

Registered User.

namliam

The Mailman - AWF VIP

Desafinada

Registered User.

namliam

The Mailman - AWF VIP

The_Doc_Man

Immoderate Moderator

Desafinada

Registered User.

gemma-the-husky

Super Moderator

Desafinada

Registered User.

gemma-the-husky

Super Moderator

Desafinada

Registered User.

namliam

The Mailman - AWF VIP

Desafinada

Registered User.

namliam

The Mailman - AWF VIP

The_Doc_Man

Immoderate Moderator

Similar threads

Users who are viewing this thread