Surviving Digg?
Igor Sysoev
is at rambler-co.ru
Wed Apr 30 11:08:52 MSD 2008
On Tue, Apr 29, 2008 at 01:38:13PM -0700, Neil Sheth wrote:
> We hit the front page of digg the other night, and our servers didn't
> handle it well at all. Here's a little of what happened, and perhaps
> someone has some suggestions on what to tweak!
>
> Basic setup, nginx 0.5.35, serving up static image content, and then
> passing php requests to 2 backend servers running apache, all running
> red hat el4.
>
> Looking at the nginx error log -
>
> First, we saw a lot of entries like the following:
> socket() failed (24: Too many open files) while connecting to upstream
> accept() failed (24: Too many open files) while accepting new connection
> open() "/var/www/html/images/imagefile.jpg" failed (24: Too many open files)
>
> Running ulimit -n showed 1024, so set that to 32768 on all 3 servers.
> Also raised limit in /etc/security/limits.conf.
You need to tune your OS to raise the limits on open files, sockets, etc.
I cannot speak for Linux, but here is my tuning for FreeBSD/amd64, 4G,
for a large number of sockets, etc.:
http://lists.freebsd.org/pipermail/freebsd-net/2008-April/017737.html
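
As a rough illustration of the per-process limit behind the "24: Too many
open files" errors, the following Python sketch reads the descriptor limit
and raises the soft limit toward a target, capped at the hard limit. The
function names are hypothetical, and the value 32768 is only the figure
mentioned in the quoted message. The change applies to the calling process
and its children only, so a server that is already running keeps its old
limit until it is restarted under the new one.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) limits on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_limit(target):
    """Raise the soft descriptor limit toward target, capped at the hard limit.

    Returns the (soft, hard) pair in effect afterwards. The limit is never
    lowered, and an unlimited soft limit is left untouched.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if soft != resource.RLIM_INFINITY and target > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    # Only report here; call raise_soft_limit(32768) to attempt the change.
    print("open file limits (soft, hard):", nofile_limits())
```
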
> Now, we started seeing the following:
> upstream timed out (110: Connection timed out) while connecting to upstream
>
> So, perhaps the 2 backend servers couldn't handle the load? We were
> serving the page mostly out of memcache at this point. In any case,
> couldn't figure out why that wasn't sufficient, so we replaced the
> page with a static html one.
Yes, it seems that your backend cannot handle the load.
> This seemed to help, but we were now seeing a lot of these:
> connect() failed (113: No route to host) while connecting to upstream
> no live upstreams while connecting to upstream
>
> This wasn't on every request, but a significant percentage. This, we
> couldn't figure out. Why couldn't it connect to the backend servers?
> We ended up rebooting both of the backend servers, and these errors
> stopped.
>
> Any thoughts / comments anyone has? Thanks!
The "113: No route to host" is a network error; it might have appeared while
the backend was rebooting.
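
One way to tell a slow backend from an unreachable one is to attempt a plain
TCP connection from the proxy host and look at the error. This is a minimal
Python sketch, not part of nginx; the `probe` helper and the address in the
example are hypothetical placeholders. On Linux, a kernel-level timeout shows
up as ETIMEDOUT (110) and an unreachable host as EHOSTUNREACH (113), matching
the numbers in the log lines quoted above.

```python
import errno
import socket


def probe(host, port, timeout=3.0):
    """Try one TCP connect; return "ok", "timeout", or the errno name."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except socket.timeout:
        # Our own timeout expired before the kernel reported anything.
        return "timeout"
    except OSError as exc:
        # Map the numeric errno (e.g. 113) to its symbolic name when known.
        return errno.errorcode.get(exc.errno, str(exc))


if __name__ == "__main__":
    # Placeholder address; replace with the real backend servers.
    for host, port in [("127.0.0.1", 80)]:
        print(host, port, probe(host, port))
```
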
--
Igor Sysoev
http://sysoev.ru/en/