upstream - behavior on pool exhaustion

Ruslan Ermilov ru at nginx.com
Tue Apr 25 14:44:37 UTC 2017


On Sat, Apr 22, 2017 at 01:52:48AM +0200, B.R. via nginx wrote:
> I do not know if your detailed explanation was aimed to me, or to the list
> in general, but I got all that already as far as I am concerned.
> 
> ​To me, when an attempt is made to an upstream group where no peer can be
> selected, a 502 should be returned for that request, and no upstream having
> been selected, no $upstream_* variable​ should contain anything. An error
> message should be generated in the error log.
> I fail to see the benefit of having the group name being considered as an
> upstream peer.... For cons, that name might get confused with a domain
> name, and the grammar of the $upstream_addr variable is complexified.
> Not that, as stated before, the docs merely say the $upstream_addr should
> contains IP addresses and ports of peers, no mention of the upstream group
> name there.
> 
> Well, it seems your design decisions are made, and even though I see things
> differently, I understand what I did not get before.
> Is my vision broke, ie some benefit I am failing to see about your
> implementation?

I agree with you that it's not logical, I was just trying to explain
in details why it happens the way it is.  I've prepared a patch that
addresses this and other inconsistencies with $upstream_* variables.
The only assertion I've preserved is that the number of elements
always stays equal to the number of attempts made to select a peer.
What this means for $upstream_addr is that instead of the upstream
group name it'll now output `-' as one of the possible values, much
like it's always the case for $upstream_status and $upstream_response_time
(see the answer to your other question below).

> Another linked question:
> I got some cases in which $upstream_response_time was '-' for some peers
> (and not a numeric value like 0.000).
> What is the meaning of that? Connection failed? I am not logging the
> $upstream_status variable, not $upstream_connect_time, thus have limited
> information.
> Could that '-' appear anywhere in the list?

This happens when the client aborts the connection before the upstream
server sends the status line with the status code.  The same holds
true for $upstream_status.  And while it seems logical for $upstream_status
not to output it when the status code is unknown, I see no reason not to
output the actual time spent on obtaining the response in this case.
The server may have responded with "HTTP/1.1 ", then $upstream_bytes_received
will be 9, but $upstream_response_time is `-'.  That's another inconsistency
that my patch fixes.

Also, $upstream_status no longer outputs 502 for the case when no live
upstream could be selected.  It'll similarly output `-' to mean "not
applicable" (for this particular attempt).

Just in case you want to give it a try, I've attached the patch against
the current nginx version.

diff --git a/src/http/ngx_http_upstream.c b/src/http/ngx_http_upstream.c
--- a/src/http/ngx_http_upstream.c
+++ b/src/http/ngx_http_upstream.c
@@ -1481,6 +1481,7 @@ ngx_http_upstream_connect(ngx_http_reque
     u->state->response_time = ngx_current_msec;
     u->state->connect_time = (ngx_msec_t) -1;
     u->state->header_time = (ngx_msec_t) -1;
+    u->state->selected = 1;
 
     rc = ngx_event_connect_peer(&u->peer);
 
@@ -1496,6 +1497,7 @@ ngx_http_upstream_connect(ngx_http_reque
     u->state->peer = u->peer.name;
 
     if (rc == NGX_BUSY) {
+        u->state->selected = 0;
         ngx_log_error(NGX_LOG_ERR, r->connection->log, 0, "no live upstreams");
         ngx_http_upstream_next(r, u, NGX_HTTP_UPSTREAM_FT_NOLIVE);
         return;
@@ -4146,7 +4148,9 @@ ngx_http_upstream_next(ngx_http_request_
         return;
     }
 
-    u->state->status = status;
+    if (ft_type != NGX_HTTP_UPSTREAM_FT_NOLIVE) {
+        u->state->status = status;
+    }
 
     timeout = u->conf->next_upstream_timeout;
 
@@ -5223,7 +5227,7 @@ ngx_http_upstream_addr_variable(ngx_http
 
     for (i = 0; i < r->upstream_states->nelts; i++) {
         if (state[i].peer) {
-            len += state[i].peer->len + 2;
+            len += (state[i].selected ? state[i].peer->len : 1) + 2;
 
         } else {
             len += 3;
@@ -5241,7 +5245,13 @@ ngx_http_upstream_addr_variable(ngx_http
 
     for ( ;; ) {
         if (state[i].peer) {
-            p = ngx_cpymem(p, state[i].peer->data, state[i].peer->len);
+
+            if (state[i].selected) {
+                p = ngx_cpymem(p, state[i].peer->data, state[i].peer->len);
+
+            } else {
+                *p++ = '-';
+            }
         }
 
         if (++i == r->upstream_states->nelts) {
@@ -5368,7 +5378,7 @@ ngx_http_upstream_response_time_variable
     state = r->upstream_states->elts;
 
     for ( ;; ) {
-        if (state[i].status) {
+        if (state[i].selected) {
 
             if (data == 1 && state[i].header_time != (ngx_msec_t) -1) {
                 ms = state[i].header_time;
@@ -5445,12 +5455,17 @@ ngx_http_upstream_response_length_variab
     state = r->upstream_states->elts;
 
     for ( ;; ) {
-
-        if (data == 1) {
-            p = ngx_sprintf(p, "%O", state[i].bytes_received);
+        if (state[i].selected) {
+
+            if (data == 1) {
+                p = ngx_sprintf(p, "%O", state[i].bytes_received);
+
+            } else {
+                p = ngx_sprintf(p, "%O", state[i].response_length);
+            }
 
         } else {
-            p = ngx_sprintf(p, "%O", state[i].response_length);
+            *p++ = '-';
         }
 
         if (++i == r->upstream_states->nelts) {
diff --git a/src/http/ngx_http_upstream.h b/src/http/ngx_http_upstream.h
--- a/src/http/ngx_http_upstream.h
+++ b/src/http/ngx_http_upstream.h
@@ -65,6 +65,7 @@ typedef struct {
     off_t                            bytes_received;
 
     ngx_str_t                       *peer;
+    ngx_uint_t                       selected;  /* unsigned selected:1 */
 } ngx_http_upstream_state_t;
 
 


More information about the nginx mailing list