How to use the Double datatype.

jal

Registered User.
Local time
Yesterday, 23:49
Joined
Mar 30, 2007
Messages
1,709
I have a table with a column called Frequency (numeric). The user wants to do a simple search, for example,

WHERE Frequency BETWEEN 500 AND 600

The table is very large (4 million records) and I have to do some joins. I found out, the hard way, that the search runs 10 times faster if the Frequency column is Double rather than Decimal.

I've been afraid to use Double because I got burned on it once, about a year ago. I was comparing the "same" number against a Double column, that is,

WHERE Amount = 10

but it proved false because the Double column was storing the Amount as 10.0000001 (or something like that ). I suppose one solution would be to do a rounding to 2 digits:


WHERE Round(Amount, 2) = 10

but the problem with rounding is, Don't I lose the speed of using the index? I can't afford to forfeit index speed, since the table is so large.


So my question is this. Is there a safe way to use Double (i.e. safe from errors such as 10 <> 10.0000001) without losing index speed?
 
Given that the BETWEEN ... AND ... is inclusive, you should be able to get all 500.000001s and 600.00000s (but not 600.0000001, possibly). If you want to include those, you can modify just in time so 500 and 600 is changed into 500.00000000 and 600.0009999 for instance and still enjoy the index optimization.

For a selection of a single row, you would probably use the same BETWEEN ... AND to provide you the tolerance so if an user want a frequency of 500, change the criteria of Frequency = 500 to Frequency BETWEEN 500.000000 AND 500.9999999 and see if this is acceptable way to get the needed data?
 
Ahh....That helps me in two ways. First it seems to confirm that the problem I mentioned is real (I was wondering, as I don't see this topic raised very often, if maybe it was all a brain fart on my part).

Second it gives me a practical solution. Thanks !!!
 
The reason for this is that Double are not accurate. There is no good way to store for instance, 1/10 in binary (this generates a infinitely repeating fraction, just like how 1/3 cannot be written without repeating infinitely in decimal base.) So for those certain numbers we cannot represent in binary, we use the closest approximation. When programmers deal with doubles, they have to use different approaches of which you can google on. Here's one starting point.

A while ago, I found an awesome page that enumerated different techniques of comparing doubles and the performance ramifications - that was for a different programming language (probably C++) but that should not us from studying & applying the techniques into SQL or C# or VBA. Too bad I didn't think to bookmark this page. :(


EDIT: After rewording a bit, I found the page I was thinking of. Link :)
 

Users who are viewing this thread

Back
Top Bottom