Avoiding Nginx restart when rsyncing cache across machines
quintinpar at gmail.com
Thu Sep 13 18:45:43 UTC 2018
Thank you for this. GEM all over. I didn’t know curl had --resolve.
This is more of a generic question: how does one ensure cache consistency on
all edges? Do people resort to a combination of expiry + background update
+ stale responses? What if one edge and the origin were updated to the
latest version, and I now want all the other 1,000 edges updated within a
minute, but the content expiry is 100 days?
On Wed, Sep 12, 2018 at 11:39 PM Lucas Rolff <lucas at lucasrolff.com> wrote:
> > The cache is pretty big and I want to limit unnecessary requests if I
> 30 GB of cache and ~400k hits isn’t a lot.
> > Cloudflare is in front of my machines and I pay for load balancing,
> firewall, Argo among others. So there is a cost per request.
> Doesn’t matter if you pay for load balancing, firewall, Argo etc –
> implementing a secondary caching layer won’t increase your costs on the
> CloudFlare side of things, because you’re not communicating via CloudFlare
> but rather between machines – you’d connect your X amount of locations to a
> smaller number of locations, doing direct traffic between your DigitalOcean
> instances – so no CloudFlare costs involved.
> Communication between your CDN servers and your origin server also (IMO)
> shouldn’t go via any CloudFlare-related products, so additional hits on the
> origin will be “free” at the expense of a bit higher load – however, since
> it would be only a subset of locations that would request via the origin,
> and they then serve as the origin for your other servers, you’re
> effectively decreasing the origin traffic.
> You should easily be able to get a 97-99% offload of your origin (in my
> own setup, it’s at 99.95% at this point), even without using a secondary
> layer, and performance can get improved by using stuff such as:
> Nginx is smart enough to do a sub-request in the background to check
> whether the origin content was updated (using Last-Modified or ETag
> headers, for example) – this way the origin communication stays minimal
> anyway.
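The background revalidation behaviour described above maps onto a handful of proxy_cache directives. A minimal sketch follows — the zone name, sizes, hostnames, and cache path are placeholders, not values from the thread:

```nginx
# Hypothetical cache zone; adjust path, zone size, and max_size to taste.
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=cdn_cache:50m
                 max_size=30g inactive=100d;

server {
    listen 80;
    server_name cdn.yourdomain.com;              # placeholder hostname

    location / {
        proxy_cache cdn_cache;
        proxy_pass  http://origin.yourdomain.com;  # placeholder origin

        # Serve the stale entry immediately while a single background
        # sub-request refreshes it from the origin.
        proxy_cache_use_stale updating error timeout;
        proxy_cache_background_update on;

        # Revalidate with If-Modified-Since / If-None-Match, so an
        # unchanged object costs the origin only a 304, not a full body.
        proxy_cache_revalidate on;
    }
}
```

With this in place, most expired entries cost the origin a conditional request rather than a full transfer, and clients never wait on the refresh.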
> The only Load Balancer / Argo / Firewall costs you should have are for the
> “CDN Server -> end user” traffic, and that won’t increase or decrease by
> doing a normal proxy_cache setup or a setup with a secondary cache layer.
> You also won’t increase costs by doing a warmup of your CDN servers – you
> could do something as simple as:
> curl -o /dev/null -k -I --resolve cdn.yourdomain.com:80:127.0.0.1 http://cdn.yourdomain.com/
> You could do the same with python or another language if you’re feeling
> more comfortable there.
> However, using a method like the above will keep your warmup “local”,
> since you’re resolving cdn.yourdomain.com to localhost; requests that are
> not yet cached will use whatever is configured in your proxy_pass in the
> nginx config.
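Extending that single curl command to a whole URL list is a short loop. A dry-run sketch — the hostname and paths are placeholders; it prints the curl invocations instead of executing them, so you can inspect the plan and pipe the output to `sh` to actually warm the cache:

```shell
#!/bin/sh
# Hypothetical warmup sketch. CDN_HOST and the path list are assumptions;
# substitute your own. Resolving the CDN hostname to 127.0.0.1 keeps every
# request on the local machine, so only genuine cache misses fall through
# to whatever proxy_pass points at.
CDN_HOST="cdn.yourdomain.com"

WARMUP_CMDS=""
for path in /index.html /assets/app.css /assets/app.js; do
    # -s silences progress, -o /dev/null discards the body, -I asks for
    # headers only -- enough to populate the cache entry for HEAD-safe
    # setups; drop -I to pull full bodies into the cache.
    cmd="curl -s -o /dev/null -I --resolve ${CDN_HOST}:80:127.0.0.1 http://${CDN_HOST}${path}"
    WARMUP_CMDS="${WARMUP_CMDS}${cmd}
"
done

printf '%s' "$WARMUP_CMDS"
```

Running the printed commands on each edge keeps the warmup traffic off CloudFlare entirely, exactly as described above.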
> > Admittedly I have a not-so-complex cache architecture, i.e. all cache
> machines in front of the origin, and it has worked so far
> I would say it’s complex if you have to sync your content – many
> pull-based CDNs simply do a normal proxy_cache + proxy_pass setup, not
> syncing content, and then use some of the nifty features (such as
> proxy_cache_background_update and proxy_cache_use_stale_updating) to
> decrease the origin traffic, or possibly implement a secondary layer if
> they’re still doing a lot of origin traffic (e.g. because of having a lot
> of “edge servers”) – if you’re running ~10 servers, I wouldn’t even
> consider a secondary layer unless your origin is under heavy load and
> can’t handle 10 possible clients (the CDN servers).
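The secondary-layer topology sketched above is just two proxy_cache tiers chained together. A minimal illustration — all hostnames and zone names here are hypothetical:

```nginx
# --- Edge tier (the many servers) ---
# Misses go to the mid tier instead of the real origin.
location / {
    proxy_cache edge_cache;                       # assumed keys_zone
    proxy_pass  http://mid-tier.yourdomain.com;   # placeholder mid-tier host
}

# --- Mid tier (a small number of servers) ---
# Only these few machines ever talk to the origin, so the origin
# sees a handful of clients regardless of how many edges exist.
location / {
    proxy_cache mid_cache;                        # assumed keys_zone
    proxy_pass  http://origin.yourdomain.com;     # placeholder origin
}
```

Each tier needs its own proxy_cache_path / keys_zone declaration; the point of the sketch is simply that an edge’s proxy_pass can target another caching nginx rather than the origin itself.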
> Best Regards,
> Lucas Rolff
> nginx mailing list
> nginx at nginx.org