upstream prematurely closed connection

BaHbka bahbka at gmail.com
Fri Jul 18 09:28:16 MSD 2008


DHCP
Да, кстати, в логах php-fpm, в логах на бэкендах и на фронтенде ничего подозрительного нет.
Еще раз повторюсь, что соедниения обрываются со всеми бэкендами _одновременно_.

Вот на графике http://img91.imageshack.us/img91/3523/prematurelyclosedes9.png кол-во таких строчек в логе nginx'а за 5 минут.

Valery Kholodkov пишет:
> На бездисковых бакэндах IP статический или DHCP?
> 
> BaHbka wrote:
>> Здравствуйте.
>>
>> Имеются 1 фронтэнд, 4 бэкенда и 2 сервера mysql
>> Никак не могу справиться с одной проблемой: периодически в логе 
>> ngnix'а появляются строки следующего вида
>>
>> 2008/07/17 14:34:40 [error] 14415#0: *3460073 upstream prematurely 
>> closed connection while reading response header from upstream, ... 
>> upstream: "fastcgi://192.168.0.1:9000"
>> 2008/07/17 14:34:40 [error] 14415#0: *3460070 upstream prematurely 
>> closed connection while reading response header from upstream, ... 
>> upstream: "fastcgi://192.168.0.3:9000"
>> 2008/07/17 14:34:40 [error] 14415#0: *3460069 upstream prematurely 
>> closed connection while reading response header from upstream, ... 
>> upstream: "fastcgi://192.168.0.4:9000"
>> 2008/07/17 14:34:40 [error] 14410#0: *3460247 upstream prematurely 
>> closed connection while reading response header from upstream, ... 
>> upstream: "fastcgi://192.168.0.4:9000"
>>
>> после чего зачастую проскакивает no live upstreams while connecting to 
>> upstream
>>
>> иногда бывают такие:
>>
>> 2008/07/17 14:54:21 [error] 14412#0: *3760898 kevent() reported about 
>> an closed connection (54: Connection reset by peer) while reading 
>> response header from upstream, ... upstream: "fastcgi://192.168.0.2:9000"
>> 2008/07/17 14:54:21 [error] 14412#0: *3760900 kevent() reported about 
>> an closed connection (54: Connection reset by peer) while reading 
>> response header from upstream, ... upstream: "fastcgi://192.168.0.3:9000"
>> 2008/07/17 14:54:21 [error] 14412#0: *3746025 kevent() reported about 
>> an closed connection (54: Connection reset by peer) while reading 
>> response header from upstream, ... upstream: "fastcgi://192.168.0.3:9000"
>>
>> Причем появляются эти строчки пачками в одну секунду штук по 5-50 с 
>> периодичностью раз в 5-10 минут (в зависимости от нагрузки), в 
>> основном со всех бэкендов.
>> Утром например, когда кол-во посетителей небольшое этого не происходит.
>> Причем в реквесте часто бывают скрипты, которые вообще не используют 
>> mysql. В одном из таких тестовых скриптов вообще пхпшного кода нет 
>> (скрипт из одной строчки вида <input id=asd value=1>) :) Но бэкенд все 
>> равно разрывает соединение.
>> Я думаю (но могу ошибаться) mysql тут ни при чем, а раз в одну секунду 
>> разрывается соединение со всеми бэками, то и они ни при чем.
>> Подозреваю, что клиенты в этот момент получают 502-ю, т.к. сам бывало 
>> на нее нарывался.
>>
>> В дебаг логе nginx'а следующее (не уверен, все ли что нужно захватил)
>>
>> 2008/07/16 16:17:52 [debug] 35120#0: *405143 http init upstream, 
>> client timer: 0
>> 2008/07/16 16:17:52 [debug] 35120#0: *405143 kevent set event: 534: 
>> ft:-2 fl:0025
>> 2008/07/16 16:17:52 [debug] 35120#0: *405143 http script copy: ""
>> ...
>> 2008/07/16 16:17:52 [debug] 35120#0: *405143 fastcgi param: 
>> "X_FORWARDED_FOR: 123.123.123.123"
>> 2008/07/16 16:17:52 [debug] 35120#0: *405143 http script copy: ""
>> 2008/07/16 16:17:52 [debug] 35120#0: *405143 fastcgi param: 
>> "SSL_CLIENT_SERIAL: 0"
>> 2008/07/16 16:17:52 [debug] 35120#0: *405143 http cleanup add: 
>> 00000000006EDF40
>> 2008/07/16 16:17:52 [debug] 35120#0: *405143 get rr peer, try: 4
>> 2008/07/16 16:17:52 [debug] 35120#0: *405143 get rr peer, current: 0 2
>> 2008/07/16 16:17:52 [debug] 35120#0: *405143 socket 237
>> 2008/07/16 16:17:52 [debug] 35120#0: *405143 connect to 
>> 192.168.0.2:9000, fd:237 #405294
>> 2008/07/16 16:17:52 [debug] 35120#0: *405143 kevent set event: 237: 
>> ft:-1 fl:0025
>> 2008/07/16 16:17:52 [debug] 35120#0: *405143 kevent set event: 237: 
>> ft:-2 fl:0025
>> 2008/07/16 16:17:52 [debug] 35120#0: *405143 http upstream connect: -2
>> 2008/07/16 16:17:52 [debug] 35120#0: *405143 event timer add: 237: 
>> 75000:1216210747795
>> 2008/07/16 16:17:58 [debug] 35120#0: *405143 http upstream check 
>> client, write event:1, "/blablabla.php"
>> 2008/07/16 16:17:58 [debug] 35120#0: *405143 http upstream process header
>> 2008/07/16 16:17:58 [debug] 35120#0: *405143 malloc: 
>> 000000000110E000:32768
>> 2008/07/16 16:17:58 [debug] 35120#0: *405143 recv: eof:1, avail:0, err:0
>> 2008/07/16 16:17:58 [error] 35120#0: *405143 upstream prematurely 
>> closed connection while reading response header from upstream, .....
>> 2008/07/16 16:17:58 [debug] 35120#0: *405143 http next upstream, 2
>> 2008/07/16 16:17:58 [debug] 35120#0: *405143 free rr peer 4 4
>> 2008/07/16 16:17:58 [debug] 35120#0: *405143 free rr peer failed: 0 -5
>> 2008/07/16 16:17:58 [debug] 35120#0: *405143 close http upstream 
>> connection: 237
>> 2008/07/16 16:17:58 [debug] 35120#0: *405143 event timer del: 237: 
>> 1216210747795
>> 2008/07/16 16:17:58 [debug] 35120#0: *405143 get rr peer, try: 3
>> 2008/07/16 16:17:58 [debug] 35120#0: *405143 socket 140
>> 2008/07/16 16:17:58 [debug] 35120#0: *405143 connect to 
>> 192.168.0.3:9000, fd:140 #406707
>> 2008/07/16 16:17:58 [debug] 35120#0: *405143 kevent set event: 140: 
>> ft:-1 fl:0025
>> 2008/07/16 16:17:58 [debug] 35120#0: *405143 kevent set event: 140: 
>> ft:-2 fl:0025
>> 2008/07/16 16:17:58 [debug] 35120#0: *405143 http upstream connect: -2
>>
>> или вот, например
>>
>> 2008/07/16 16:17:49 [debug] 35117#0: *404981 connect to 
>> 192.168.0.3:9000, fd:560 #405020
>> 2008/07/16 16:17:49 [debug] 35117#0: *404981 kevent set event: 560: 
>> ft:-1 fl:0025
>> 2008/07/16 16:17:49 [debug] 35117#0: *404981 kevent set event: 560: 
>> ft:-2 fl:0025
>> 2008/07/16 16:17:49 [debug] 35117#0: *404981 http upstream connect: -2
>> 2008/07/16 16:17:49 [debug] 35117#0: *404981 event timer add: 560: 
>> 75000:1216210744998
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 post event 0000000000DA58A0
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 post event 0000000000AFAA20
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 post event 0000000000DA6A20
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 delete posted event 
>> 0000000000DA6A20
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 http upstream send 
>> request handler
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 http upstream send request
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 chain writer buf fl:0 s:1320
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 chain writer in: 
>> 000000000106EF68
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 writev: 1320 of 1320
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 chain writer out: 
>> 0000000000000000
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 event timer del: 560: 
>> 1216210744998
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 event timer add: 560: 
>> 180000:1216210859644
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 http upstream process header
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 malloc: 
>> 00000000006E9000:32768
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 recv: eof:1, avail:0, err:0
>> 2008/07/16 16:17:59 [error] 35117#0: *404981 upstream prematurely 
>> closed connection while reading response header from upstream, .....
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 http next upstream, 2
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 free rr peer 4 4
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 free rr peer failed: 2 -11
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 close http upstream 
>> connection: 560
>>
>> ktrace'ом бэкендов ничего не смог выловить. Видимо просто не умею его 
>> готовить :)
>>
>> Схема следующая:
>> ---
>> один фронтэнд - FreeBSD 6.2-RELEASE-p7 amd64
>> nginx 0.6.32 (до этого был nginx 0.5.34 - ситуация такая же) ставился 
>> из портов
>>
>> netstat -Lan
>> tcp4  0/0/4096       123.123.123.123.80
>>
>> netstat -m
>> 6652/9503/16155 mbufs in use (current/cache/total)
>> 2249/7867/10116/32768 mbuf clusters in use (current/cache/total/max)
>> 2231/6351 mbuf+clusters out of packet secondary zone in use 
>> (current/cache)
>> 6161K/18109K/24270K bytes allocated to network (current/cache/total)
>>
>> ngnix.conf:
>>
>> user                    www www;
>> worker_processes        8;
>>
>> ...
>>
>> events {
>>         worker_connections  25000;
>>         use kqueue;
>> }
>>
>> http {
>> ...
>>         client_header_timeout           1m;
>>         client_body_timeout             1m;
>>         send_timeout                    1m;
>>
>>         client_header_buffer_size       32k;
>>         large_client_header_buffers     16 16k;
>>
>>         gzip                            on;
>>         gzip_min_length                 1100;
>>         gzip_buffers                    4 8k;
>>         gzip_types                      text/plain text/css 
>> application/x-javascript;
>>
>>         output_buffers                  1 32k;
>>         postpone_output                 1460;
>>
>>         sendfile                        on;
>>         sendfile_max_chunk              128k;
>>         tcp_nopush                      on;
>>         tcp_nodelay                     on;
>>         send_lowat                      12000;
>>
>>         keepalive_timeout               75 20;
>>
>>         fastcgi_hide_header             x-powered-by;
>>         fastcgi_ignore_client_abort     off;
>>         fastcgi_next_upstream           error;
>>
>>         upstream backend
>>         {
>>                 server  192.168.0.1:9000        weight=10;
>>                 server  192.168.0.2:9000        weight=10;
>>                 server  192.168.0.3:9000        weight=10;
>>                 server  192.168.0.4:9000        weight=10;
>>         }
>>
>>         server {
>>
>>                 listen                  123.123.123.123:80;
>>                 server_name             www.***.ru ***.ru;
>>                 error_log               
>> /var/log/nginx/nginx-error.log  error;
>>
>>                 client_body_buffer_size 1024k;
>>                 client_max_body_size    4m;
>>
>> ...
>>
>>                location ~ \.php$
>>                 {
>>                         access_log                              off;
>>                         error_log                               
>> /var/log/nginx/php-error.log error;
>>                         fastcgi_pass                            backend;
>>                         fastcgi_index                           
>> index.php;
>>                         fastcgi_connect_timeout                 75;
>>                         fastcgi_send_timeout                    180;
>>                         fastcgi_read_timeout                    180;
>>                         fastcgi_buffer_size                     32k;
>>                         fastcgi_buffers                         8 256k;
>>                         fastcgi_busy_buffers_size               256k;
>>                         fastcgi_temp_file_write_size            256k;
>>                         fastcgi_param   SCRIPT_FILENAME         
>> /var/www/www.***.ru/php/$fastcgi_script_name;
>> ...
>>
>> ---
>> четыре бездисковых бэкенда FreeBSD 7.0-RELEASE-p1 amd64 (раньше было 
>> FreeBSD 6.2, php 5.2.4 без php-fpm патча - такая же ситуация)
>> php 5.2.6 + php-fpm 0.5.8-rc1
>>
>> netstat -Lan
>> tcp4  0/0/4096       *.9000
>>
>> netstat -m
>> 262/3578/3840 mbufs in use (current/cache/total)
>> 260/3624/3884/25600 mbuf clusters in use (current/cache/total/max)
>> 260/2812 mbuf+clusters out of packet secondary zone in use 
>> (current/cache)
>> 585K/9030K/9616K bytes allocated to
>>  network (current/cache/total)
>>
>> php-fpm.conf:
>> ...
>>                 <section name="pool">
>>                         <value name="name">first</value>
>>                         <value name="listen_address">0.0.0.0:9000</value>
>> ...
>>                                 <value name="max_children">100</value>
>> ...
>>                         <value 
>> name="request_execution_timeout">601s</value>
>>                         <value name="rlimit_files">2048</value>
>>                         <value name="rlimit_core">0</value>
>> ...
>>                         <value name="max_requests">500</value>
>> ...
>>
>> Где копать? На фронтэнде или все таки бэкенды выеживаются?
>>
>> Заранее благодарю :)
>>
>>
>>
> 
> 





More information about the nginx-ru mailing list