Every CDN Ignores Your Cache Retention Preferences

Update: Cloudflare's new Cache Reserve feature stores your files in cache indefinitely, thereby solving this problem. It’s a paid service, but small to medium-sized sites will barely see any cost. This puts Cloudflare way above all other CDN providers.

After years of working with different CDN providers, I’ve reached a disturbing conclusion: all CDNs ignore your cache retention preferences. They flush your data long before you want them to, which drags down your cache hit ratio and slows down your site. And I have the data to prove it.

Ideally, you want your CDN to keep your files in its cache for as long as possible. This way, more of your customers get fast loading times, and there’s less load on your origin server, which can be used to serve HTML instead. All CDNs let you control the cache retention of your files, usually in one of the following ways:

  1. Through the “cache-control” response header from your origin server
  2. Via explicit instructions on the CDN settings
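
As a rough illustration of the first option, here is a minimal Python sketch that builds a “cache-control” value asking caches to keep a static file for one year. The helper name build_cache_control and its parameters are my own, made up for this example, and are not part of any CDN's API. The max-age and s-maxage directives themselves are standard HTTP.

```python
from typing import Optional


def build_cache_control(browser_ttl: int, edge_ttl: Optional[int] = None) -> str:
    """Build a Cache-Control header value for a static asset.

    browser_ttl -> max-age  (seconds; applies to browsers and shared caches)
    edge_ttl    -> s-maxage (seconds; applies only to shared caches such as CDNs)
    """
    if browser_ttl < 0 or (edge_ttl is not None and edge_ttl < 0):
        raise ValueError("TTLs must be non-negative")
    directives = ["public", f"max-age={browser_ttl}"]
    if edge_ttl is not None:
        directives.append(f"s-maxage={edge_ttl}")
    return ", ".join(directives)


ONE_YEAR = 365 * 24 * 60 * 60  # one year expressed in seconds

# Hypothetical usage: the value your origin would send with a CSS file.
print("Cache-Control:", build_cache_control(browser_ttl=ONE_YEAR))
```

How you attach this header depends on your origin server. Keep in mind that these values are upper limits on how long a copy may be reused, not a promise that a cache will actually hold on to the file for that long.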

Cloudflare even allows you to explicitly set the “Edge Cache TTL” for individual pages via its Page Rules section like this:

[Screenshot: Cloudflare page rules specifying cache retention policies. Cloudflare allows you to specify page rules, but doesn’t always follow them.]

“Wonderful!” you think. “I’ll just tell the CDNs to store my data for one year and be done with it. That way, my static files will never be served from my origin server.” Pretty cool, right?

Not so fast.

All CDNs Flush Your Data After 2 Days if Not Accessed

My site gets infrequent traffic from certain countries. But that traffic is important. It pays the bills. So imagine my surprise when my cache hit ratio was very low for these countries. I became suspicious and tested out three CDNs for myself:

  1. Cloudflare
  2. KeyCDN
  3. BunnyCDN

I found that, without exception, they all flush your files from cache if they’re not accessed for 2-3 days. It doesn’t matter what your cache retention policies are. It doesn’t matter if you explicitly set the TTL for certain pages. None of it matters. They will flush infrequently accessed data from their cache, no matter how badly you want them to retain it.

CDN Cache Retention: Proof with Data

All three CDNs include a response header that tells us if they’re serving cached content. Here they are for each of them:

CDN         Response Header
Cloudflare  cf-cache-status
KeyCDN      X-Cache
BunnyCDN    cdn-cache
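
To make the check concrete, here is a small Python sketch that reads the cache status out of a set of response headers. The function name cache_status and the lookup table are my own illustration. Note that Cloudflare's header is spelled out in full as cf-cache-status, and that HTTP header names are case-insensitive, which is why the lookup lowercases them.

```python
from typing import Mapping, Optional

# Cache status header per provider, stored in lowercase for matching.
CACHE_STATUS_HEADERS = {
    "Cloudflare": "cf-cache-status",
    "KeyCDN": "x-cache",
    "BunnyCDN": "cdn-cache",
}


def cache_status(provider: str, headers: Mapping[str, str]) -> Optional[str]:
    """Return the provider's cache status (e.g. "HIT" or "MISS"), or None if absent."""
    wanted = CACHE_STATUS_HEADERS[provider]
    for name, value in headers.items():
        if name.lower() == wanted:
            return value.strip().upper()
    return None


# Hypothetical headers, as they might come back from a request to a CDN URL.
print(cache_status("KeyCDN", {"Content-Type": "text/css", "X-Cache": "HIT"}))
```

Some providers report more values than HIT and MISS, so I treat anything other than HIT as "not served from cache" for the purposes of this test.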

Based on this, I created 3 dummy CSS files and primed each of the caches. Then I accessed them at different times to test whether or not they were still cached. And here are the results:

Start Date: 5:00 pm 13th May 2019

Elapsed Time  Cloudflare  KeyCDN  BunnyCDN
5 hours       HIT         HIT     HIT
17 hours      HIT         HIT     HIT
29 hours      HIT         HIT     HIT
49 hours      HIT         HIT     HIT

My cache retention settings on all three CDNs were at least one week for my new CSS files. Despite that, each provider flushed my files from its cache after 2 days of inactivity, or sometime before day 3 in any case. What’s the point of a CDN allowing you to specify cache retention policies if it’s only going to ignore them?
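
If you want to repeat the measurement yourself, the procedure could be scripted along these lines. This is only a sketch: the probe function, the stubbed fetcher, and the example URL are placeholders I made up, and the real fetch_headers call needs network access to a file on your own CDN.

```python
import urllib.request
from datetime import datetime
from typing import Callable, Mapping, Tuple

HeaderFetcher = Callable[[str], Mapping[str, str]]


def fetch_headers(url: str) -> Mapping[str, str]:
    """Send a HEAD request and return the response headers (needs network)."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return dict(response.headers.items())


def probe(url: str, status_header: str, primed_at: datetime, now: datetime,
          fetch: HeaderFetcher = fetch_headers) -> Tuple[int, str]:
    """Return (whole hours since priming, cache status) for one test file."""
    headers = {name.lower(): value for name, value in fetch(url).items()}
    hours = round((now - primed_at).total_seconds() / 3600)
    status = headers.get(status_header.lower(), "UNKNOWN").strip().upper()
    return hours, status


# Offline demonstration with a stubbed fetcher; the URL is a placeholder.
primed_at = datetime(2019, 5, 13, 17, 0)
stub = lambda url: {"X-Cache": "HIT"}
print(probe("https://cdn.example.com/test.css", "X-Cache",
            primed_at, datetime(2019, 5, 14, 10, 0), fetch=stub))
# prints (17, 'HIT')
```

Bear in mind that every probe is itself a request, so it may refresh the file's last-access time at the POP that serves it.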

Honestly, I was surprised that Cloudflare’s free plan was able to match the paid products of the other two. I expected KeyCDN and BunnyCDN to keep the files for longer, but I was wrong. This is why, after much consideration, I feel that Cloudflare is still the best CDN even though it’s free.

Also, my anecdotal experience is that Cloudflare clears its HTML cache far more frequently, even if there’s a page rule set to explicitly keep it for longer. I’ve heard that Incapsula has long cache retention policies, but they’re too expensive for me to just test out.

It’s Particularly Bad for Low-Traffic Sites

Sites whose content is infrequently accessed are the biggest losers here. There’s a good chance that certain content won’t be accessed for more than 2 days on certain POP servers, and that will cause a delay for the few visitors who do finally visit.
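
To put a rough number on that, here is a back-of-the-envelope Python sketch. It assumes requests for a file reach a given POP at random, independently of each other (a Poisson model, which is my simplification and not something I measured), and that the POP evicts a file after 2 idle days as in my tests. Under those assumptions, the share of requests that arrive after a gap longer than the idle window is exp(-rate × window).

```python
import math


def idle_probability(requests_per_day: float, idle_window_days: float = 2.0) -> float:
    """Probability that no request reaches a POP within the idle window.

    Assumes Poisson arrivals at the given average rate (an illustrative model).
    """
    if requests_per_day < 0 or idle_window_days < 0:
        raise ValueError("rate and window must be non-negative")
    return math.exp(-requests_per_day * idle_window_days)


# Hypothetical per-POP request rates for a single file.
for rate in (0.1, 0.5, 1.0, 5.0):
    chance = idle_probability(rate)
    print(f"{rate} requests/day -> {chance:.1%} chance of a gap over 2 days")
```

For a file that one POP sees about once every two days (0.5 requests per day), that works out to roughly a 37% chance that any given visitor arrives after the file has already been flushed.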

As far as I know, there’s no way to change this behavior. CDNs flush their cache frequently to avoid overburdening their servers with useless storage. One workaround would perhaps be for a CDN to charge its users a fee for pull storage POPs, so they could keep the content indefinitely. But such a provider doesn’t exist.

Push Zones are Not a Solution

Push zones are where you manually upload your data onto a CDN instead of waiting for it to be passively pulled from your origin server. However, this doesn’t mean that the edge POPs will store the data. The data still has to be downloaded from the push zone the first time a request comes through.

There’s no easy solution here. CDNs are incredibly useful if you can trust them to store your files in cache and serve them to customers when needed. But if it’s cleared out regularly, then it defeats the purpose. If you have some better insight into this problem, let me know in the comments below!

About Bhagwad Park

I've been writing about web hosting and WordPress tutorials since 2008. I also create tutorials on Linux server administration, and have a ton of experience with web hosting products. Contact me via e-mail!
