Surviving Digg?
Neil Sheth
nsheth at gmail.com
Wed Apr 30 03:01:02 MSD 2008
On Tue, Apr 29, 2008 at 2:07 PM, Aleksandar Lazic <al-nginx at none.at> wrote:
> Hi Neil,
>
>
> On Tue 29.04.2008 13:38, Neil Sheth wrote:
>
> >
> > We hit the front page of digg the other night, and our servers didn't
> > handle it well at all. Here's a little of what happened, and perhaps
> > someone has some suggestions on what to tweak!
> >
> > Basic setup, nginx 0.5.35, serving up static image content, and then
> > passing php requests to 2 backend servers running apache, all running
> > red hat el4.
> >
>
> What were/are the network settings on the machines?
What specific settings are you asking about?
>
>
> > Now, we started seeing the following:
> > upstream timed out (110: Connection timed out) while connecting to
> > upstream
> >
>
>
> What was the load on the backends?
> What are the settings of apache?
> Have you taken a look at
>
> netstat -nt
>
> How many connections in FIN* states do you have?
Right now it shows about 60. I'm not sure what the count of FIN-state
connections was at the time of the digg. I did run the following (found
in a forum somewhere; it gives connection counts by IP):
netstat -ntu | awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -nr
This showed the number of connections to the backend servers to be
almost 1000 each.
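For the FIN* question, a per-state tally is probably more useful than a per-IP one. The following is only a rough sketch: `count_by_state` is a made-up helper name, and it assumes the Linux `netstat -nt` column layout, where the connection state is the sixth field.

```shell
# Count TCP connections per state from `netstat -nt` output read on stdin.
# Assumption: Linux net-tools layout, i.e.
#   Proto Recv-Q Send-Q Local-Address Foreign-Address State
# Header lines are skipped because their first field does not start with "tcp".
count_by_state() {
    awk '$1 ~ /^tcp/ { n[$6]++ } END { for (s in n) print s, n[s] }' | sort
}

# Usage (run on each backend and on the nginx box):
#   netstat -nt | count_by_state
```

Running this during a traffic spike, rather than afterwards, would show whether connections are piling up in FIN_WAIT or TIME_WAIT states.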
>
>
>
> > So, perhaps the 2 backend servers couldn't handle the load? We were
> > serving the page mostly out of memcache at this point. In any case,
> > couldn't figure out why that wasn't sufficient, so we replaced the page
> > with a static html one.
> >
> > This seemed to help, but we were now seeing a lot of these:
> > connect() failed (113: No route to host) while connecting to upstream
> > no live upstreams while connecting to upstream
> >
>
> Have you put names or ip-addresses into the nginx config?
IP addresses
>
>
> > This wasn't on every request, but a significant percentage. This, we
> > couldn't figure out. Why couldn't it connect to the backend servers?
> > We ended up rebooting both of the backend servers, and these errors
> > stopped.
> >
>
> Again load and netstat?!
Load didn't actually look that bad, if I recall. It probably peaked
around 4 while this was occurring, but was generally lower.
> Cheers
>
> Aleks
>
>
Thanks for the help!