upstream prematurely closed connection
BaHbka
bahbka at gmail.com
Fri Jul 18 09:28:16 MSD 2008
DHCP
Да, кстати, в логах php-fpm, в логах на бэкендах и на фронтенде ничего подозрительного нет.
Еще раз повторюсь, что соедниения обрываются со всеми бэкендами _одновременно_.
Вот на графике http://img91.imageshack.us/img91/3523/prematurelyclosedes9.png кол-во таких строчек в логе nginx'а за 5 минут.
Valery Kholodkov пишет:
> На бездисковых бакэндах IP статический или DHCP?
>
> BaHbka wrote:
>> Здравствуйте.
>>
>> Имеются 1 фронтэнд, 4 бэкенда и 2 сервера mysql
>> Никак не могу справиться с одной проблемой: периодически в логе
>> ngnix'а появляются строки следующего вида
>>
>> 2008/07/17 14:34:40 [error] 14415#0: *3460073 upstream prematurely
>> closed connection while reading response header from upstream, ...
>> upstream: "fastcgi://192.168.0.1:9000"
>> 2008/07/17 14:34:40 [error] 14415#0: *3460070 upstream prematurely
>> closed connection while reading response header from upstream, ...
>> upstream: "fastcgi://192.168.0.3:9000"
>> 2008/07/17 14:34:40 [error] 14415#0: *3460069 upstream prematurely
>> closed connection while reading response header from upstream, ...
>> upstream: "fastcgi://192.168.0.4:9000"
>> 2008/07/17 14:34:40 [error] 14410#0: *3460247 upstream prematurely
>> closed connection while reading response header from upstream, ...
>> upstream: "fastcgi://192.168.0.4:9000"
>>
>> после чего зачастую проскакивает no live upstreams while connecting to
>> upstream
>>
>> иногда бывают такие:
>>
>> 2008/07/17 14:54:21 [error] 14412#0: *3760898 kevent() reported about
>> an closed connection (54: Connection reset by peer) while reading
>> response header from upstream, ... upstream: "fastcgi://192.168.0.2:9000"
>> 2008/07/17 14:54:21 [error] 14412#0: *3760900 kevent() reported about
>> an closed connection (54: Connection reset by peer) while reading
>> response header from upstream, ... upstream: "fastcgi://192.168.0.3:9000"
>> 2008/07/17 14:54:21 [error] 14412#0: *3746025 kevent() reported about
>> an closed connection (54: Connection reset by peer) while reading
>> response header from upstream, ... upstream: "fastcgi://192.168.0.3:9000"
>>
>> Причем появляются эти строчки пачками в одну секунду штук по 5-50 с
>> периодичностью раз в 5-10 минут (в зависимости от нагрузки), в
>> основном со всех бэкендов.
>> Утром например, когда кол-во посетителей небольшое этого не происходит.
>> Причем в реквесте часто бывают скрипты, которые вообще не используют
>> mysql. В одном из таких тестовых скриптов вообще пхпшного кода нет
>> (скрипт из одной строчки вида <input id=asd value=1>) :) Но бэкенд все
>> равно разрывает соединение.
>> Я думаю (но могу ошибаться) mysql тут ни при чем, а раз в одну секунду
>> разрывается соединение со всеми бэками, то и они ни при чем.
>> Подозреваю, что клиенты в этот момент получают 502-ю, т.к. сам бывало
>> на нее нарывался.
>>
>> В дебаг логе nginx'а следующее (не уверен, все ли что нужно захватил)
>>
>> 2008/07/16 16:17:52 [debug] 35120#0: *405143 http init upstream,
>> client timer: 0
>> 2008/07/16 16:17:52 [debug] 35120#0: *405143 kevent set event: 534:
>> ft:-2 fl:0025
>> 2008/07/16 16:17:52 [debug] 35120#0: *405143 http script copy: ""
>> ...
>> 2008/07/16 16:17:52 [debug] 35120#0: *405143 fastcgi param:
>> "X_FORWARDED_FOR: 123.123.123.123"
>> 2008/07/16 16:17:52 [debug] 35120#0: *405143 http script copy: ""
>> 2008/07/16 16:17:52 [debug] 35120#0: *405143 fastcgi param:
>> "SSL_CLIENT_SERIAL: 0"
>> 2008/07/16 16:17:52 [debug] 35120#0: *405143 http cleanup add:
>> 00000000006EDF40
>> 2008/07/16 16:17:52 [debug] 35120#0: *405143 get rr peer, try: 4
>> 2008/07/16 16:17:52 [debug] 35120#0: *405143 get rr peer, current: 0 2
>> 2008/07/16 16:17:52 [debug] 35120#0: *405143 socket 237
>> 2008/07/16 16:17:52 [debug] 35120#0: *405143 connect to
>> 192.168.0.2:9000, fd:237 #405294
>> 2008/07/16 16:17:52 [debug] 35120#0: *405143 kevent set event: 237:
>> ft:-1 fl:0025
>> 2008/07/16 16:17:52 [debug] 35120#0: *405143 kevent set event: 237:
>> ft:-2 fl:0025
>> 2008/07/16 16:17:52 [debug] 35120#0: *405143 http upstream connect: -2
>> 2008/07/16 16:17:52 [debug] 35120#0: *405143 event timer add: 237:
>> 75000:1216210747795
>> 2008/07/16 16:17:58 [debug] 35120#0: *405143 http upstream check
>> client, write event:1, "/blablabla.php"
>> 2008/07/16 16:17:58 [debug] 35120#0: *405143 http upstream process header
>> 2008/07/16 16:17:58 [debug] 35120#0: *405143 malloc:
>> 000000000110E000:32768
>> 2008/07/16 16:17:58 [debug] 35120#0: *405143 recv: eof:1, avail:0, err:0
>> 2008/07/16 16:17:58 [error] 35120#0: *405143 upstream prematurely
>> closed connection while reading response header from upstream, .....
>> 2008/07/16 16:17:58 [debug] 35120#0: *405143 http next upstream, 2
>> 2008/07/16 16:17:58 [debug] 35120#0: *405143 free rr peer 4 4
>> 2008/07/16 16:17:58 [debug] 35120#0: *405143 free rr peer failed: 0 -5
>> 2008/07/16 16:17:58 [debug] 35120#0: *405143 close http upstream
>> connection: 237
>> 2008/07/16 16:17:58 [debug] 35120#0: *405143 event timer del: 237:
>> 1216210747795
>> 2008/07/16 16:17:58 [debug] 35120#0: *405143 get rr peer, try: 3
>> 2008/07/16 16:17:58 [debug] 35120#0: *405143 socket 140
>> 2008/07/16 16:17:58 [debug] 35120#0: *405143 connect to
>> 192.168.0.3:9000, fd:140 #406707
>> 2008/07/16 16:17:58 [debug] 35120#0: *405143 kevent set event: 140:
>> ft:-1 fl:0025
>> 2008/07/16 16:17:58 [debug] 35120#0: *405143 kevent set event: 140:
>> ft:-2 fl:0025
>> 2008/07/16 16:17:58 [debug] 35120#0: *405143 http upstream connect: -2
>>
>> или вот, например
>>
>> 2008/07/16 16:17:49 [debug] 35117#0: *404981 connect to
>> 192.168.0.3:9000, fd:560 #405020
>> 2008/07/16 16:17:49 [debug] 35117#0: *404981 kevent set event: 560:
>> ft:-1 fl:0025
>> 2008/07/16 16:17:49 [debug] 35117#0: *404981 kevent set event: 560:
>> ft:-2 fl:0025
>> 2008/07/16 16:17:49 [debug] 35117#0: *404981 http upstream connect: -2
>> 2008/07/16 16:17:49 [debug] 35117#0: *404981 event timer add: 560:
>> 75000:1216210744998
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 post event 0000000000DA58A0
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 post event 0000000000AFAA20
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 post event 0000000000DA6A20
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 delete posted event
>> 0000000000DA6A20
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 http upstream send
>> request handler
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 http upstream send request
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 chain writer buf fl:0 s:1320
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 chain writer in:
>> 000000000106EF68
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 writev: 1320 of 1320
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 chain writer out:
>> 0000000000000000
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 event timer del: 560:
>> 1216210744998
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 event timer add: 560:
>> 180000:1216210859644
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 http upstream process header
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 malloc:
>> 00000000006E9000:32768
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 recv: eof:1, avail:0, err:0
>> 2008/07/16 16:17:59 [error] 35117#0: *404981 upstream prematurely
>> closed connection while reading response header from upstream, .....
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 http next upstream, 2
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 free rr peer 4 4
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 free rr peer failed: 2 -11
>> 2008/07/16 16:17:59 [debug] 35117#0: *404981 close http upstream
>> connection: 560
>>
>> ktrace'ом бэкендов ничего не смог выловить. Видимо просто не умею его
>> готовить :)
>>
>> Схема следующая:
>> ---
>> один фронтэнд - FreeBSD 6.2-RELEASE-p7 amd64
>> nginx 0.6.32 (до этого был nginx 0.5.34 - ситуация такая же) ставился
>> из портов
>>
>> netstat -Lan
>> tcp4 0/0/4096 123.123.123.123.80
>>
>> netstat -m
>> 6652/9503/16155 mbufs in use (current/cache/total)
>> 2249/7867/10116/32768 mbuf clusters in use (current/cache/total/max)
>> 2231/6351 mbuf+clusters out of packet secondary zone in use
>> (current/cache)
>> 6161K/18109K/24270K bytes allocated to network (current/cache/total)
>>
>> ngnix.conf:
>>
>> user www www;
>> worker_processes 8;
>>
>> ...
>>
>> events {
>> worker_connections 25000;
>> use kqueue;
>> }
>>
>> http {
>> ...
>> client_header_timeout 1m;
>> client_body_timeout 1m;
>> send_timeout 1m;
>>
>> client_header_buffer_size 32k;
>> large_client_header_buffers 16 16k;
>>
>> gzip on;
>> gzip_min_length 1100;
>> gzip_buffers 4 8k;
>> gzip_types text/plain text/css
>> application/x-javascript;
>>
>> output_buffers 1 32k;
>> postpone_output 1460;
>>
>> sendfile on;
>> sendfile_max_chunk 128k;
>> tcp_nopush on;
>> tcp_nodelay on;
>> send_lowat 12000;
>>
>> keepalive_timeout 75 20;
>>
>> fastcgi_hide_header x-powered-by;
>> fastcgi_ignore_client_abort off;
>> fastcgi_next_upstream error;
>>
>> upstream backend
>> {
>> server 192.168.0.1:9000 weight=10;
>> server 192.168.0.2:9000 weight=10;
>> server 192.168.0.3:9000 weight=10;
>> server 192.168.0.4:9000 weight=10;
>> }
>>
>> server {
>>
>> listen 123.123.123.123:80;
>> server_name www.***.ru ***.ru;
>> error_log
>> /var/log/nginx/nginx-error.log error;
>>
>> client_body_buffer_size 1024k;
>> client_max_body_size 4m;
>>
>> ...
>>
>> location ~ \.php$
>> {
>> access_log off;
>> error_log
>> /var/log/nginx/php-error.log error;
>> fastcgi_pass backend;
>> fastcgi_index
>> index.php;
>> fastcgi_connect_timeout 75;
>> fastcgi_send_timeout 180;
>> fastcgi_read_timeout 180;
>> fastcgi_buffer_size 32k;
>> fastcgi_buffers 8 256k;
>> fastcgi_busy_buffers_size 256k;
>> fastcgi_temp_file_write_size 256k;
>> fastcgi_param SCRIPT_FILENAME
>> /var/www/www.***.ru/php/$fastcgi_script_name;
>> ...
>>
>> ---
>> четыре бездисковых бэкенда FreeBSD 7.0-RELEASE-p1 amd64 (раньше было
>> FreeBSD 6.2, php 5.2.4 без php-fpm патча - такая же ситуация)
>> php 5.2.6 + php-fpm 0.5.8-rc1
>>
>> netstat -Lan
>> tcp4 0/0/4096 *.9000
>>
>> netstat -m
>> 262/3578/3840 mbufs in use (current/cache/total)
>> 260/3624/3884/25600 mbuf clusters in use (current/cache/total/max)
>> 260/2812 mbuf+clusters out of packet secondary zone in use
>> (current/cache)
>> 585K/9030K/9616K bytes allocated to
>> network (current/cache/total)
>>
>> php-fpm.conf:
>> ...
>> <section name="pool">
>> <value name="name">first</value>
>> <value name="listen_address">0.0.0.0:9000</value>
>> ...
>> <value name="max_children">100</value>
>> ...
>> <value
>> name="request_execution_timeout">601s</value>
>> <value name="rlimit_files">2048</value>
>> <value name="rlimit_core">0</value>
>> ...
>> <value name="max_requests">500</value>
>> ...
>>
>> Где копать? На фронтэнде или все таки бэкенды выеживаются?
>>
>> Заранее благодарю :)
>>
>>
>>
>
>
More information about the nginx-ru
mailing list