Caching Redux

Last week I got some interesting questions in my inbox about how to build good middle-tier caches.  I cleaned up the responses a little bit and I'm posting them here because they're actually pretty general.  I've written about this before, but some things merit repeating :)

Here's what I wrote:

If I had a dime for every person who thought caching was the answer but then didn’t actually build a cache…

First, consider your cache policy carefully.  As I’ve often written, caching implies policy.

And as I told Raymond: a cache with bad policy is another name for a memory leak.

Raymond turns this into some excellent recommendations, including instrumentation and observation, which lead to a quantitative approach to cache design.

If I had a dime for everyone who built a cache because they thought it was a good idea but then did not measure the efficacy of what they had built…
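
To make "measure the efficacy" concrete, here is a minimal sketch in Python of a cache that counts its own hits and misses.  The names (MeasuredCache, CacheStats, loader) are hypothetical and not taken from any particular library.  It deliberately has no retirement policy at all, so by the definition above it is still a leak; the only point is that the counters are cheap to add.

```python
from dataclasses import dataclass


@dataclass
class CacheStats:
    hits: int = 0
    misses: int = 0

    @property
    def hit_rate(self) -> float:
        # Fraction of lookups served from the cache; 0.0 before any lookup.
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


class MeasuredCache:
    """Dictionary-backed cache that counts hits and misses.

    Illustration only: nothing is ever retired from this cache.
    """

    def __init__(self, loader):
        self._loader = loader  # hypothetical function: key -> value
        self._data = {}
        self.stats = CacheStats()

    def get(self, key):
        if key in self._data:
            self.stats.hits += 1
            return self._data[key]
        # Miss: fetch from the backing source and remember the result.
        self.stats.misses += 1
        value = self._loader(key)
        self._data[key] = value
        return value
```

If the hit rate under a realistic workload is low, the cache is mostly costing memory, and that is worth knowing before it ships.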

Explore the space: try rough experiments at different layers and try different policies.  Often very aggressive policies (fast retirement of cache data) are effective, but you must understand not only how data gets into the cache (that is obvious) but also how it gets OUT.  Actively or passively?  Based on limits, or hit rate, or something else?
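
As one illustration of making the "how does it get out" question explicit, here is a rough Python sketch with two exits: an active one driven by a size limit and a passive one driven by entry age.  The class name, the parameters, and the choice of least-recently-used ordering are all assumptions made for the example, not a recommendation; the right limits come from your own measurements.

```python
import time
from collections import OrderedDict


class PolicyCache:
    """Cache with two explicit ways out: a size limit and a maximum age."""

    def __init__(self, max_entries, max_age_seconds, clock=time.monotonic):
        self._max_entries = max_entries
        self._max_age = max_age_seconds
        self._clock = clock  # injectable so experiments can control time
        self._data = OrderedDict()  # key -> (value, stored_at), least recent first
        self.evicted_by_limit = 0
        self.expired_by_age = 0

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        value, stored_at = entry
        if self._clock() - stored_at > self._max_age:
            # Passive retirement: a stale entry is dropped when it is touched.
            del self._data[key]
            self.expired_by_age += 1
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._data[key] = (value, self._clock())
        self._data.move_to_end(key)
        while len(self._data) > self._max_entries:
            # Active retirement: enforce the limit, least recently used first.
            self._data.popitem(last=False)
            self.evicted_by_limit += 1
```

Counting each exit separately shows which part of the policy is actually doing the work, which makes it easier to try a more aggressive setting and compare.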

Whatever you do, be sure you do it on the basis of measurements.  Any kind of automatic “magic” caching layer that somehow knows about new business objects immediately sounds like a disaster to me.  It’s not a question of knowing the business objects; it’s a question of knowing usage patterns and policy.  I don’t know how to do that automatically, but maybe your particular problem has patterns you can leverage.  I also know that (e.g.) SQL Server already has a good cache at the data level, and if you do your job right that is often all you need.

Policy is everything.