Cashing in on Cache

NauticalGent

Pristine Curmudgeon
Joined
Apr 27, 2015
Messages
2,405
Greetings AWF,

While trying to assist a member with performance tips, I came across this on the FMS website:

Take Advantage of the Cache
Use cache memory wherever possible when using external SQL data sources. Microsoft Access forms and reports have automatic caching mechanisms. When using recordsets in your Visual Basic code, use the CacheStart, CacheEnd and FillCache methods to maximize cache effectiveness.
I did a quick Google search on them but couldn't really find anything useful and, more importantly, no sample code!

Is anyone here familiar with these methods?
 

Micron

AWF VIP
Joined
Oct 20, 2018
Messages
1,744
I admit I've never used it. Did you see this and find it not useful? I realize there's no code example, but it seems straightforward.
Look through the recordset properties list to see the various cache properties.
 

NauticalGent

Pristine Curmudgeon
Joined
Apr 27, 2015
Messages
2,405
Thanks Micron, I did in fact see that article but I failed to follow the links within it. It is the FillCache method that makes the difference.

When you go to that link, it has some sample code that shows you how to use all three methods together.
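For anyone landing on this thread later, here is a minimal sketch of the three members used together. (Note that in DAO, CacheStart and CacheSize are actually properties and only FillCache is a method; the linked ODBC table name dbo_Orders and the OrderID field are placeholders, not anything from the linked article.)

```vba
Dim db As DAO.Database
Dim rs As DAO.Recordset

Set db = CurrentDb
' Caching only applies to dynaset recordsets on ODBC data sources
Set rs = db.OpenRecordset("SELECT * FROM dbo_Orders", dbOpenDynaset)

rs.CacheSize = 100              ' number of records to hold in the local cache
rs.MoveFirst
rs.CacheStart = rs.Bookmark     ' bookmark of the first record to cache
rs.FillCache                    ' fetch that range from the server in one go

' Records within the cached range are now read from local memory
' instead of being re-fetched from the server one at a time
Do While Not rs.EOF
    Debug.Print rs!OrderID
    rs.MoveNext
Loop

rs.Close
Set rs = Nothing
```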

Thanks for making me read it!
 

Micron

AWF VIP
Joined
Oct 20, 2018
Messages
1,744
Vunderbarf!

(tongue in cheek everyone, so don't correct my spelling!)
 

isladogs

CID Moderator
Staff member
Joined
Jan 14, 2017
Messages
12,326
Just noticed this thread.

I ran these caching tests last year as one of my many speed comparison tests but never got around to posting them until now.

The effect of caching was, unsurprisingly, more noticeable with a large SQL table.
Less expected (to me) was that the cache size itself was only significant for smaller tables.

 

Attachments

vba_php

Banned
Joined
Oct 6, 2019
Messages
1,974
wow! if there was a ranking here of informative and useful posts, that one by you Colin would sure as he$$ be on the top of the list! nice one.
 

isladogs

CID Moderator
Staff member
Joined
Jan 14, 2017
Messages
12,326
Thanks. I've plenty more speed tests both here and on my website

I also ran more tests to find out why the cache size limit is stated as 1200.
With my large table of 116,262 records, I increased the cache size to 1500 as a test.



As you can see, it still worked, but the cache time was significantly slower:
1200 cache - almost 13 sec
1500 cache - nearly 19 sec

EDIT:
Just done a final one - this time with cache size = 3000. I expected it to crash, but not so.



So it seems the quoted cache size limit of 1200 from the MS article is out of date. However, as it becomes so slow with a large cache, it seems counterproductive to use such large values.
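For anyone wanting to repeat this kind of comparison, a rough sketch of a timing loop (the table name dbo_BigTable is a placeholder; this is not Colin's actual test code):

```vba
Dim db As DAO.Database
Dim rs As DAO.Recordset
Dim t As Single

Set db = CurrentDb
Set rs = db.OpenRecordset("SELECT * FROM dbo_BigTable", dbOpenDynaset)

rs.CacheSize = 1500            ' vary this: e.g. 100, 1200, 1500, 3000
rs.MoveFirst
rs.CacheStart = rs.Bookmark

t = Timer
rs.FillCache                   ' time how long filling the cache takes
Debug.Print "Cache fill took "; Timer - t; " seconds"

rs.Close
Set rs = Nothing
```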
 

Attachments

Last edited:

NauticalGent

Pristine Curmudgeon
Joined
Apr 27, 2015
Messages
2,405
I was going to suggest another Myth Buster candidate, but it seems you got there before I did! Well done, Col.
 

isladogs

CID Moderator
Staff member
Joined
Jan 14, 2017
Messages
12,326
You're welcome. Hope you find it useful.
As I said, I did the first set of tests some months ago.
Not sure what led me to that site in the first place ….
 

isladogs

CID Moderator
Staff member
Joined
Jan 14, 2017
Messages
12,326
I'd be amazed if someday someone finds something that you don't know (regarding Access). If that day ever comes....
Ha!
There's a lot of threads I don't answer...
There's a good reason for that. If I don't know, I keep quiet (usually...!)
 

isladogs

CID Moderator
Staff member
Joined
Jan 14, 2017
Messages
12,326
As you probably realised, I wrote that comment as a ‘seed’ to encourage exactly that response. Thanks!


Sent from my iPhone using Tapatalk
 

vba_php

Banned
Joined
Oct 6, 2019
Messages
1,974
If I don't know, I keep quiet (usually...!)
Unlike me, with regard to my stupid Google search result responses.
Ditto...

Others here could benefit from using this strategy....
I have a feeling this hints at me, yet again. If it doesn't, thank goodness. If it does, this is one exception where Tony's suggestion can be proven incorrect, even though the OP probably did not use my help directly:

https://www.access-programmers.co.uk/forums/showpost.php?p=1659235&postcount=3
 

isladogs

CID Moderator
Staff member
Joined
Jan 14, 2017
Messages
12,326
Adam
Of course it was aimed at you
The wise response would have been not to reply, but instead to stop posting whenever you do not know the answer. Sadly, that is the case very frequently.

Please do not reply to this post. There is no benefit to yourself or the forum as a whole in you doing so.
If it encourages further pointless replies, I will regret posting this.

 
Last edited:

The_Doc_Man

Happy Retired Curmudgeon
Joined
Feb 28, 2001
Messages
15,533
With regard to caching, there was an old study in the IBM Journal of Research and Development back in the days when caching and virtual memory were becoming a "thing" because memory management hardware had crossed an ability threshold. "Back in the days" means late 1950s.

Caching of this type is a precursor to the more modern O/S ability called "virtual demand paging." However, it has similar properties and abilities, just maybe a little clunkier. The question of efficiency depends on some factors that I cannot recall in fine detail, but I can remember enough to have a comment of sorts. Forgive me if I get slightly pedantic on this one. {You may, if you wish, cough at my use of "slightly." I won't be offended.} You might also wish to break out your college-level calculus a little bit.

In essence, caching at any level removes the immediate need to pause what you are doing while the O/S does a disk fetch. If your information is in memory, no disk input is required, and any writeback can be deferred - for a while. Two major factors crop up when analyzing the speed effects.

1. The percent of benefit depends on how much randomness is associated with the pattern of data access. In essence, caching is NEARLY useless in an isolated linear scan of the cached memory. It becomes HIGHLY useful when the pattern of data reference is localized to a small number of records. It also has value when you have several scans of the same data to be performed, one after the other.

2. The amount of memory available for caching reaches a point where adding more memory has limited effect on speed. This is the "law of diminishing returns" as applied to computer memory.

It is all about probability. I'm going to "rough-draw" some relevant graphs to illustrate the point. The graphs represent size of memory cache vs. effect on speed. The experimental results were, basically, "sigmoid" curves where you bop along seeing no particular effect (or a small one at best) until suddenly there is a steep rise in the effect, after which it levels out again. (See chart labeled "1".) Obviously, you get the least effect with the least memory and the best effect with the most memory. In theory, the maximum effect is if you can hold ALL of your data in the cache at once.

Where there is a linear access pattern, imagine grabbing both ends of that chart and stretching it out a bit. The "s-curve" becomes more gradual and shallow. Where there is a highly localized data usage pattern, the opposite occurs and the s becomes steeper because the effect becomes more profound. There is another factor - just how localized we are talking. THAT factor moves the s-curve left or right. The narrower the focus, the farther to the left on that chart. The broader the focus, the more to the right.

The two factors are thus where on the memory chart you see that flex point, and how pronounced it is. "How pronounced" depends on degree of localization of your program's memory reference pattern. I.e. are you 10% more likely or 30% more likely or 50% more likely to touch one certain spot? "Where" depends on how big (or how heavily populated) is your most commonly referenced area. I.e. is there a single "hot spot" or do you have a few "hot spots" or do you have MANY hot spots?

You see this as a clear probability if you take the 1st derivative of chart 1 to get to chart 2 - the familiar Gaussian bell curve. It is hard to put this into words, but in essence chart 2 is a chart of the probability that adding more memory to the cache enabled it to bring in something useful for speed effects. Here, more common statistical terms apply - skew and kurtosis. When you have a small area of interest, skew is to the left. When the difference in interest from one area to another is small, your kurtosis flattens the graph.

To find the point at which you have the maximum effect, i.e. when does the "law of diminishing returns" kick in, take the next derivative (see chart 3). The point at which you might as well give up adding more memory is where the arrow is pointing. Again, it is difficult to put that in words because it is now the 2nd derivative of the original graph. It is basically a point at which you are no longer getting enough benefit from adding memory to offset the cost of memory.
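The three charts can be sketched symbolically. Taking a logistic curve as a stand-in for the measured s-curve of chart 1 (an assumption for illustration; the real experimental data need not be exactly logistic), with cache size m:

```latex
% Chart 1: speed benefit vs cache size m (the s-curve),
% with A the maximum benefit, m_0 the flex point, k the steepness
S(m) = \frac{A}{1 + e^{-k(m - m_0)}}

% Chart 2: its first derivative, a bell curve peaked at the
% inflection point m_0 - the probability that one more unit of
% cache brought in something useful
S'(m) = \frac{A\,k\,e^{-k(m - m_0)}}{\left(1 + e^{-k(m - m_0)}\right)^2}

% Chart 3: the second derivative S''(m) crosses zero at m_0;
% past that point it is negative, i.e. each extra unit of cache
% buys less benefit than the last (the law of diminishing returns)
```

Localization of the access pattern shifts m_0 left or right and scales k up or down, exactly as described above.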

Now, the oddity shown in Colin's study in which for some cases, adding memory slowed things down. What happens here is that there is an enemy to caching. It is called "thrashing." (Yeah, they rhyme. No, I didn't make it up. And no, it wasn't my idea. I'm old, but not THAT old...)

Remember I said that with caching you can defer write-back. Windows does that for you. But what happens when you increase the amount of data you keep and start updating things? Eventually you need to write that stuff back because you need to flush the cache to make room for something else. This is most common for linear access cases but also happens - badly - when you DO have multiple hot spots and not quite enough cache to handle them all. In essence, you added another action and changed the access pattern when you get to the point that you have to thrash the cache.

EDIT: Per a later question, in this discussion "adding more memory" is the same in purpose as "increasing cache size."

Hope this helps put things in perspective.

In summary, use caching to greatest effect if you expect a few hot spots among your data set. Don't expect a big difference if everything you do is absolutely linear and flat in access patterns.
 

Attachments

Last edited:

isladogs

CID Moderator
Staff member
Joined
Jan 14, 2017
Messages
12,326
Thanks for that info Doc.
Just to clarify: when you mentioned me adding more 'memory', you were, I believe, referring to me increasing the cache size? Yes?
 

The_Doc_Man

Happy Retired Curmudgeon
Joined
Feb 28, 2001
Messages
15,533
Correct, though in the original research it literally WAS adding memory. But that was in the late 1950s.
 
