Caching is the first thing most engineers reach for when an application is slow. Add Redis, throw a TTL on it, move on. Except three months later, customers are seeing stale prices, the support team is drowning in complaints about outdated inventory counts, and every time you deploy, the site goes down for two minutes while the cache warms up.
The problem is not caching itself. The problem is that most caching implementations are built without a strategy. No invalidation plan. No understanding of access patterns. No monitoring. Just a vague hope that putting data closer to the application will make things faster. Sometimes it does. Often, it creates an entirely new category of bugs that are harder to diagnose than the original performance problem.
Let us walk through why caching goes wrong, what the common mistakes are, and how to build a caching strategy that actually works in production.
Why Caching Goes Wrong
Caching fails for predictable reasons, and nearly all of them come down to a lack of planning.
No invalidation strategy. This is the root cause of most caching bugs. Data changes in the database, but the cache still serves the old version. In e-commerce, this means customers see prices from yesterday. In SaaS applications, it means users see permissions they no longer have. The classic computer science joke — "there are only two hard things: cache invalidation and naming things" — exists because invalidation is genuinely difficult. But difficulty is not an excuse for ignoring it entirely.
Caching everything instead of hot paths. Not all data benefits equally from caching. A product page viewed 10,000 times per hour benefits enormously. An admin settings page viewed twice a day does not. When you cache indiscriminately, you fill memory with data nobody is requesting, evict data that is actually hot, and increase the surface area for stale data bugs — all while gaining almost nothing in performance.
No TTL planning. Setting every cache key to expire in 3600 seconds is not a strategy. Different data has different volatility. Stock prices change every second. A company's "About Us" page changes once a year. Your TTLs should reflect how often the underlying data actually changes, not some arbitrary default.
Filesystem cache instead of in-memory. We still encounter WordPress installations using the default filesystem-based object cache. This is barely faster than hitting the database directly, because disk I/O is the bottleneck in both cases. If you are going to cache, cache in memory. That means Redis or Memcached — not /tmp/cache/.
Cache-aside vs. write-through confusion. In a cache-aside (lazy-loading) pattern, the application checks the cache first, and on a miss, fetches from the database and populates the cache. In write-through, the application writes to both the cache and the database simultaneously. Mixing these patterns — or not choosing one deliberately — leads to race conditions where the cache and database disagree about the current state of the data.
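To make the distinction concrete, here is a minimal sketch of both patterns in Python. The `cache` and `db` objects are hypothetical stand-ins for your Redis client and data layer, not a real API:

```python
import json

def get_product(cache, db, product_id, ttl=300):
    """Cache-aside (lazy loading): check the cache, fall back to the database."""
    key = f"product:{product_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)           # cache hit
    product = db.fetch_product(product_id)  # cache miss: go to the origin
    cache.set(key, json.dumps(product), ttl)
    return product

def update_product(cache, db, product_id, fields):
    """Write-through: update the database and the cache in the same operation."""
    product = db.update_product(product_id, fields)
    cache.set(f"product:{product_id}", json.dumps(product), 300)
    return product
```

The important part is picking one pattern per data type and sticking to it: if reads use cache-aside but writes never touch the cache, a stale window of up to the full TTL is guaranteed after every write.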
No monitoring of hit rates. If you do not know your cache hit rate, you do not know if your cache is working. A hit rate below 80% means a large share of your traffic is still hitting the origin, and every miss pays for the cache lookup on top of the database query. We have seen production systems where the cache hit rate was 12% — at that point, the cache was actively making things slower.
Common Mistakes
Beyond the strategic failures, there are specific tactical mistakes that we see repeatedly.
Using WordPress object cache without Redis. WordPress has an object cache API, but the default implementation stores objects in PHP memory — meaning it only lasts for a single request. Without a persistent backend like Redis, calling wp_cache_set() is essentially useless for cross-request performance. If your WordPress site is running slow under load, especially on WooCommerce, this is often the first thing to check.
Caching authenticated content. Full-page caching for logged-in users is dangerous. If User A's account page gets cached and served to User B, you have a data breach. Varnish, Nginx FastCGI cache, and CDNs must be configured to bypass caching when session cookies are present. We have seen this misconfigured on production sites serving financial data.
TTLs set too high. A WooCommerce store with dynamic pricing cached product pages for 24 hours. Prices changed at 9 AM, but customers saw yesterday's prices until 9 AM the next day. The fix was not to remove caching — it was to reduce the TTL to 300 seconds and implement event-based purging on price updates via woocommerce_product_set_price hooks.
No cache warming strategy. After a deployment or a cache flush, every single request is a cache miss. If your site handles 5,000 requests per minute, that means 5,000 simultaneous database queries in the first minute — a cache stampede. This can bring down the origin server entirely. Cache warming — pre-populating the cache with known hot data before traffic arrives — is not optional for high-traffic sites.
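Beyond warming, a stampede can be contained with a recomputation lock: on a miss, only one request rebuilds the value while the others wait briefly for it to appear. A minimal single-node sketch, assuming a hypothetical cache client with Redis-style SET NX semantics (`set_nx`) — not any particular library's API:

```python
import json
import time

def get_with_stampede_lock(cache, key, recompute, ttl=300, lock_ttl=10):
    """On a miss, only one caller recomputes; the rest poll the cache.

    `cache` is assumed to expose get/set/set_nx/delete, a thin Redis wrapper.
    """
    value = cache.get(key)
    if value is not None:
        return json.loads(value)
    lock_key = f"lock:{key}"
    if cache.set_nx(lock_key, "1", lock_ttl):  # we won the lock: recompute
        try:
            fresh = recompute()
            cache.set(key, json.dumps(fresh), ttl)
            return fresh
        finally:
            cache.delete(lock_key)
    # someone else is recomputing: wait briefly, then re-check the cache
    for _ in range(50):
        time.sleep(0.1)
        value = cache.get(key)
        if value is not None:
            return json.loads(value)
    return recompute()  # gave up waiting; fall back to the origin
```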
Caching database queries without understanding patterns. Caching the result of SELECT * FROM products WHERE category_id = 5 ORDER BY price ASC only helps if that exact query runs frequently. If users are filtering by different categories, sort orders, and pagination offsets, you end up with thousands of cache keys that are each hit once. Object-level caching (cache each product by ID, assemble the list in application code) is often more effective than query-level caching.
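A sketch of that object-level approach: one cache key per product, one batched query for the misses, and the list assembled in application code. As before, `cache` and `db` are hypothetical stand-ins:

```python
import json

def get_products(cache, db, product_ids, ttl=300):
    """Object-level caching: per-product keys instead of per-query keys."""
    found, missing = {}, []
    for pid in product_ids:
        raw = cache.get(f"product:{pid}")
        if raw is not None:
            found[pid] = json.loads(raw)
        else:
            missing.append(pid)
    if missing:
        # one batched query for all misses, e.g. WHERE id IN (...)
        for product in db.fetch_products(missing):
            pid = product["id"]
            cache.set(f"product:{pid}", json.dumps(product), ttl)
            found[pid] = product
    return [found[pid] for pid in product_ids]  # preserve requested order
```

With this layout, a user filtering category 5 by price and another filtering it by name share the same cached product objects; only the cheap list of IDs differs per query.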
What Actually Works
An effective caching strategy operates on multiple layers, each with a specific purpose and invalidation approach.
Layer 1: Object Cache (Redis). Redis as the object cache backend with proper key namespacing. Every cache key should include a version prefix, the object type, and the identifier: v3:product:48291. This makes bulk invalidation possible — increment the version prefix to effectively invalidate all keys. Configure Redis with maxmemory-policy allkeys-lru so the least recently used keys are evicted automatically when memory is full. For WordPress/WooCommerce, use the Redis Object Cache plugin with WP_REDIS_PREFIX set to the site identifier.
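The versioned-key scheme takes only a few lines; a sketch in Python for illustration, where `cache` stands in for a Redis client (not a specific library's API):

```python
def current_version(cache):
    """Read the global cache version; a missing key counts as version 1."""
    return int(cache.get("cache_version") or 1)

def make_key(cache, obj_type, obj_id):
    """Build a versioned key like 'v3:product:48291'."""
    return f"v{current_version(cache)}:{obj_type}:{obj_id}"

def invalidate_all(cache):
    """Bump the version prefix. Old keys are never read again and are
    evicted over time by allkeys-lru, with no SCAN/DEL pass needed."""
    cache.set("cache_version", current_version(cache) + 1)
```

The trade-off: bulk invalidation is instant, but every old key lingers in memory until LRU eviction reclaims it, which is exactly why the allkeys-lru policy matters here.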
```conf
# redis.conf - production settings
maxmemory 2gb
maxmemory-policy allkeys-lru
save 900 1
save 300 10
tcp-keepalive 300
timeout 0
```

Layer 2: Full-Page Cache (Varnish or Nginx FastCGI). For anonymous users, cache the entire rendered HTML page. This eliminates PHP execution and database queries entirely. With Nginx FastCGI cache:
```nginx
# nginx.conf
fastcgi_cache_path /var/cache/nginx levels=1:2 keys_zone=WORDPRESS:256m inactive=60m max_size=4g;
fastcgi_cache_key "$scheme$request_method$host$request_uri";
fastcgi_cache_valid 200 301 10m;
fastcgi_cache_valid 404 1m;
fastcgi_cache_bypass $skip_cache;
fastcgi_no_cache $skip_cache;
```

The $skip_cache variable should be set to 1 when cookies indicate a logged-in user, when the request is a POST, or when query strings are present (for WooCommerce cart/checkout). Smart purging means using a plugin or hook system that sends PURGE requests to Nginx when content changes — not waiting for the TTL to expire.
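One common way to populate $skip_cache for a WordPress/WooCommerce site is the snippet below; the cookie names and URL patterns are typical defaults, so adjust them to your installation:

```nginx
# in the server block, before the location that passes requests to PHP
set $skip_cache 0;

# POST requests and query strings go straight to the origin
if ($request_method = POST) { set $skip_cache 1; }
if ($query_string != "") { set $skip_cache 1; }

# logged-in users and commenters bypass the cache
if ($http_cookie ~* "comment_author|wordpress_[a-f0-9]+|wp-postpass|wordpress_no_cache|wordpress_logged_in") {
    set $skip_cache 1;
}

# never cache admin, cart, checkout, or account pages
if ($request_uri ~* "/wp-admin/|/cart/|/checkout/|/my-account/") { set $skip_cache 1; }
```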
Layer 3: CDN for Static Assets. CSS, JavaScript, images, and fonts should be served from a CDN with long TTLs (1 year) and versioned URLs. Instead of /style.css, use /style.css?v=a3f8e9c1 or /style.a3f8e9c1.css. This means the browser caches aggressively but gets the new version instantly on deploy. Cloudflare, Fastly, or AWS CloudFront all support this pattern. This is a fundamental part of improving your server's overall performance profile.
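Generating that content-hash suffix at build time is straightforward; a sketch (the `hashed_asset_name` helper and the hash length are illustrative choices, not a standard API):

```python
import hashlib
from pathlib import Path

def hashed_asset_name(path):
    """Return a content-hashed filename like 'style.a3f8e9c1.css'.

    Any change to the file changes the hash, so a 1-year browser/CDN TTL
    is safe: each deploy produces new URLs for changed assets automatically.
    """
    p = Path(path)
    digest = hashlib.sha256(p.read_bytes()).hexdigest()[:8]
    return f"{p.stem}.{digest}.{p.suffix.lstrip('.')}"
```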
Cache warming for critical paths. After deployment, run a script that requests the top 500 URLs (from your access logs or analytics) to populate the cache before real users hit cache misses. A simple approach:
```bash
#!/bin/bash
# warm-cache.sh - run after deployment
SITEMAP_URL="https://yoursite.com/sitemap.xml" # or build a URL list from access logs
curl -s "$SITEMAP_URL" | \
  grep -oP '<loc>\K[^<]+' | \
  head -500 | \
  xargs -P 10 -I {} curl -s -o /dev/null -w "%{http_code} %{time_total}s {}\n" {}
```

Monitor hit rates and eviction rates. Redis provides INFO stats with keyspace_hits and keyspace_misses. Calculate hit rate as hits / (hits + misses) * 100. Set up alerts: a hit rate below 90% means something is wrong. A high eviction rate (evicted_keys) means Redis needs more memory or your caching strategy is too broad.
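That formula translates directly into a monitoring check; a sketch assuming a stats dict shaped like the counters Redis reports under INFO stats (as returned by, for example, redis-py's `info("stats")`):

```python
def hit_rate(stats):
    """Compute the cache hit rate (%) from Redis INFO 'stats' counters."""
    hits = stats.get("keyspace_hits", 0)
    misses = stats.get("keyspace_misses", 0)
    total = hits + misses
    if total == 0:
        return None  # no traffic yet: hit rate is undefined
    return hits / total * 100

def should_alert(stats, threshold=90.0):
    """True when the hit rate has dropped below the alert threshold."""
    rate = hit_rate(stats)
    return rate is not None and rate < threshold
```

Note that these counters are cumulative since the last restart, so for alerting you typically compute the rate over deltas between scrapes rather than over the raw totals.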
Real-World Scenario: WooCommerce Store With 50,000 Products
A client running WooCommerce with approximately 50,000 SKUs was experiencing 3.2-second average page load times and frequent 502 errors during peak traffic (around 2,000 concurrent sessions). The database server was at 95% CPU utilization, with MySQL running over 800 queries per page load.
The root cause: no object cache, no page cache, and the theme was making unoptimized database queries with no transient caching. Every single page load hit the database for menu items, widget content, product data, and sidebar queries.
What we implemented:
- Redis Object Cache — installed Redis 7.x with 2GB allocated memory and the allkeys-lru eviction policy. Connected via Unix socket for lower latency (WP_REDIS_PATH: /var/run/redis/redis.sock). This immediately reduced database queries per page from 800+ to approximately 120.
- Nginx FastCGI Cache — configured with 4GB disk cache, 10-minute TTL for product pages, 60-minute TTL for category pages. Bypassed for logged-in users and cart/checkout URLs. This eliminated PHP execution for 85% of requests.
- Cloudflare CDN — static assets moved to Cloudflare with 1-year cache headers and content-hash filenames. Reduced origin bandwidth by 70%.
- Cache Warming — a post-deploy script that warms the top 1,000 product pages and all category pages. Total warm-up time: approximately 45 seconds with 20 concurrent requests.
- Monitoring — Redis hit rate tracked via Prometheus + Grafana. Alert fires if hit rate drops below 92%.
Results:
- Database CPU: 95% → 15%
- Average page load: 3.2s → 0.6s
- Database queries per page (cache miss): 800 → 120
- Database queries per page (cache hit, no page cache): 120 → 35
- Requests hitting origin server: 100% → 15%
- Redis hit rate: 94.7%
- 502 errors during peak: eliminated
The 5-Step Implementation Approach
If you are starting from scratch or fixing a broken caching setup, follow this sequence:
1. Measure first. Before adding any cache layer, profile the current state. Enable the MySQL slow query log (slow_query_log = 1, long_query_time = 0.5). Use New Relic, Datadog, or even SHOW PROCESSLIST to understand where time is spent. You cannot optimize what you have not measured.
2. Start with object caching. Install Redis and connect your application's object cache to it. For WordPress: install the Redis Object Cache plugin, then add WP_REDIS_HOST and WP_REDIS_PORT to wp-config.php. Monitor the immediate impact on database query count.
3. Add full-page caching. Configure Nginx FastCGI cache or Varnish for anonymous users. Define bypass rules carefully — every cookie, every POST request, every checkout URL must skip the cache. Test by logging in as a user and verifying you do not see another user's data.
4. Implement cache invalidation. Hook into your application's write events. In WordPress: save_post, woocommerce_update_product, wp_update_nav_menu. On each event, purge the relevant cache keys and the relevant page cache URLs. This is the hardest step and the most important one.
5. Monitor and iterate. Track hit rates, eviction rates, and cache-related latency. Adjust TTLs based on real data. Review monthly to ensure the strategy still matches your traffic patterns.
Caching is not something you set up once and forget. It is an ongoing operational concern that requires monitoring, tuning, and periodic review — exactly like database performance or security.
If your cache hit rate is below 90%, your caching strategy needs work. Let us analyze your infrastructure and fix it.