nginx, php-fpm and 502 errors

Igor Sysoev is at rambler-co.ru
Tue Nov 20 19:16:39 MSK 2007


On Tue, Nov 20, 2007 at 04:17:00PM +0100, Jure Pečar wrote:

> I'm trying to understand why some of our production nginx/php-fpm servers frequently return 502 errors. When this happens, "writev() failed (107: Transport endpoint is not connected) while sending request to upstream" is written to the error log.
> 
> Running strace -e connect,writev on an nginx worker process frequently shows:
> 
> connect(75, {sa_family=AF_FILE, path="/tmp/php-fpm.sock"}, 110) = 0
> writev(75, [{"\1\1\0\1\0\10\0\0\0\1\0\0\0\0\0\0\1\4\0\1\6\260\0\0\17"..., 1752}], 1) = 1752
> connect(759, {sa_family=AF_FILE, path="/tmp/php-fpm.sock"}, 110) = 0
> writev(759, [{"\1\1\0\1\0\10\0\0\0\1\0\0\0\0\0\0\1\4\0\1\5\377\1\0\17"..., 1576}], 1) = 1576
> connect(940, {sa_family=AF_FILE, path="/tmp/php-fpm.sock"}, 110) = -1 EAGAIN (Resource temporarily unavailable)
> connect(996, {sa_family=AF_FILE, path="/tmp/php-fpm.sock"}, 110) = -1 EAGAIN (Resource temporarily unavailable)
> connect(391, {sa_family=AF_FILE, path="/tmp/php-fpm.sock"}, 110) = -1 EAGAIN (Resource temporarily unavailable)
> connect(1120, {sa_family=AF_FILE, path="/tmp/php-fpm.sock"}, 110) = -1 EAGAIN (Resource temporarily unavailable)
> writev(996, [{"\1\1\0\1\0\10\0\0\0\1\0\0\0\0\0\0\1\4\0\1\6O\1\0\0172S"..., 1656}], 1) = -1 ENOTCONN (Transport endpoint is not connected)
> writev(758, [{"HTTP/1.1 502 Bad Gateway\r\nServer"..., 157}], 1) = 157
> writev(940, [{"\1\1\0\1\0\10\0\0\0\1\0\0\0\0\0\0\1\4\0\1\6\320\0\0\17"..., 1784}], 1) = -1 ENOTCONN (Transport endpoint is not connected)
> writev(658, [{"HTTP/1.1 502 Bad Gateway\r\nServer"..., 157}], 1) = 157
> 
> Could these EAGAIN results from connect() be related to the ENOTCONN from writev()?

Partially. Usually a connect() to a Unix stream socket completes at once
because it is local. It seems there is a shortage of some resource.

By the way, it's strange that Linux returns EAGAIN instead of EINPROGRESS.
It's also strange that Linux does not report the ENOTCONN error via
getsockopt(SO_ERROR).

> The frequency of these increases if I decrease the number of php-cgi processes, and vice versa. If I have enough php-cgi processes, they do not occur at all. But this "enough" number is way too high for my taste (I ended up with 64).
> 
> What exactly happens that makes writev lose the connection in the middle of writing?

It's not the middle. It's the first FastCGI packet: "\1\1\0\1\0\10\0\0..."
The scenario is as follows:

   connect() returns EAGAIN,
   nginx adds the socket to epoll,
   epoll reports some event (possibly an error) on the socket,
   nginx writev()s the FastCGI request and the writev() returns ENOTCONN.
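
The sequence above can be approximated outside nginx. The following is a minimal Python sketch, not nginx's code: it assumes the missing resource is the accept queue of the listening socket (the thread does not confirm this), and the function name reproduce, the socket file demo.sock, and the 4-byte payload are all made up for illustration. A listener with a small backlog never calls accept(); non-blocking clients connect until one connect() fails, and a write is then attempted on that same socket.

```python
import errno
import os
import socket
import tempfile


def reproduce(backlog=1, limit=64):
    """Fill the accept queue of a Unix listener that never calls accept().

    Returns (queued, connect_result, so_error, write_result):
      queued         - number of connects that succeeded before the first failure
      connect_result - errno name of the first failed connect, or "ok"
      so_error       - value of getsockopt(SO_ERROR) on that socket, or None
      write_result   - errno name of the write on that socket, "ok", or None
    """
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "demo.sock")  # hypothetical socket path
    listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    held = []  # keep client sockets open so the queue stays full
    try:
        listener.bind(path)
        listener.listen(backlog)  # assumption: a deliberately tiny backlog
        for _ in range(limit):
            s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
            s.setblocking(False)
            held.append(s)
            rc = s.connect_ex(path)  # returns 0 or an errno value
            if rc == 0:
                continue
            connect_result = errno.errorcode.get(rc, str(rc))
            so_error = s.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
            try:
                # Write on the socket whose connect() did not complete.
                s.send(b"\x01\x01\x00\x01")
                write_result = "ok"
            except OSError as e:
                write_result = errno.errorcode.get(e.errno, str(e.errno))
            return len(held) - 1, connect_result, so_error, write_result
        return len(held), "ok", None, None
    finally:
        for s in held:
            s.close()
        listener.close()
        if os.path.exists(path):
            os.unlink(path)
        os.rmdir(tmpdir)


if __name__ == "__main__":
    print(reproduce())
```

On a Linux kernel that behaves as in the strace output above, one would expect the failed connect to report EAGAIN and the write to report ENOTCONN. Other systems may refuse the connection with a different error, so the result is best read as a diagnostic rather than a guarantee.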


-- 
Igor Sysoev
http://sysoev.ru/en/




