status-line trailing SP is missing ?

Maxim Dounin mdounin at mdounin.ru
Thu Aug 31 10:28:39 UTC 2023


Hello!

On Wed, Aug 30, 2023 at 04:20:15PM +0400, Sergey Kandaurov wrote:

> > On 29 Aug 2023, at 08:33, Maxim Dounin <mdounin at mdounin.ru> wrote:
> > 
> > On Mon, Aug 28, 2023 at 08:59:28PM +0400, Sergey Kandaurov wrote:
> > 
> >>> On 26 Aug 2023, at 18:21, Jérémy Lal <kapouer at melix.org> wrote:
> >>> 
> >>> Hi,
> >>> 
> >>> https://bugs.debian.org/1050571
> >>> reports that the Status line doesn't always end with space, which seems contradictory to RFC9112 which states:
> >>> "A server MUST send the space that separates the status-code from the reason-phrase even when the reason-phrase is absent (i.e., the status-line would end with the space)."
> >>> 
> >>> Is it a documented nginx 1.24.0 behavior ?
> >>> 
> >> 
> >> Note that the response line with empty reason phrase
> >> is properly generated since nginx 1.5.6.
> >> The exception remains is proxying FastCGI responses
> >> as there is no distinguished response line in CGI syntax.
> >> 
> >> The reason is that Status is a CGI header field, and hence
> >> it is parsed by a generic routine that cuts trailing spaces.
> >> But it can have a trailing space per RFC 3875, section 6.3.3.
> >> So it needs a special treatment to preserve SP before empty reason
> >> phrase.  The below patch should help; although it doesn't look
> >> efficient and can be polished, I think this is quite enough for
> >> valid use cases.
> > 
> > I very much doubt that RFC 3875 properly defines whitespace 
> > handling, see my response to the original report.  In this 
> > particular case, it seems to define a header which cannot be 
> > correctly parsed if reason-phrase is empty.
> > 
> 
> Yes, it is quite dubious how this can be parsed correctly.
> Although it is valid to have a trailing space in Status,
> this contradicts to header field value syntax per RFC 3875:
>       field-content   = *( token | separator | quoted-string )

Note that field-value syntax does no apply to the "Status" header, 
its syntax is defined separately.

On the other hand, the "Status" header syntax does not allow any 
spaces after the colon, which rules out headers like "Status: 200 OK",
and makes the "Status" header syntax highly questionable.

As already suggested in my response to the original report, I tend 
to think that the best available option is to ignore RFC 3875 idea 
about headers syntax, and use HTTP instead.

> 
> >> 
> >> # HG changeset patch
> >> # User Sergey Kandaurov <pluknet at nginx.com>
> >> # Date 1693238094 -14400
> >> #      Mon Aug 28 19:54:54 2023 +0400
> >> # Node ID f621c60dfa277774ab5fadb3c8b805957ca3f281
> >> # Parent  2880f60a80c3e2706151dc7b48dc1267e39c47a9
> >> FastCGI: preserved SP for empty Status header reason phrase.
> >> 
> >> diff --git a/src/http/modules/ngx_http_fastcgi_module.c b/src/http/modules/ngx_http_fastcgi_module.c
> >> --- a/src/http/modules/ngx_http_fastcgi_module.c
> >> +++ b/src/http/modules/ngx_http_fastcgi_module.c
> >> @@ -2048,7 +2048,25 @@ ngx_http_fastcgi_process_header(ngx_http
> >>                     }
> >> 
> >>                     u->headers_in.status_n = status;
> >> -                    u->headers_in.status_line = *status_line;
> >> +
> >> +                    if (status_line->len == 3) {
> >> +                        /* preserve SP for empty reason phrase */
> >> +
> >> +                        u->headers_in.status_line.data = ngx_pnalloc(r->pool,
> >> +                                                                     5);
> >> +                        if (u->headers_in.status_line.data == NULL) {
> >> +                            return NGX_ERROR;
> >> +                        }
> >> +
> >> +                        ngx_memcpy(u->headers_in.status_line.data,
> >> +                                   status_line->data, 3);
> >> +                        u->headers_in.status_line.data[3] = ' ';
> >> +                        u->headers_in.status_line.data[4] = '\0';
> >> +                        u->headers_in.status_line.len = 4;
> >> +
> >> +                    } else {
> >> +                        u->headers_in.status_line = *status_line;
> >> +                    }
> >> 
> >>                 } else if (u->headers_in.location) {
> >>                     u->headers_in.status_n = 302;
> > 
> > Do you think we really need this complexity here?  I tend to think 
> > that
> 
> See above comment about efficiency.
> I doubt empty reason phrase is seen often in the wild,
> so this is a reasonable cost for kinda abusing protocol.
> But this complexity also reflects the ugliness, that's
> why I preferred the second version, also for its brevity.
> 
> > 
> >    if (status_line->len > 3) {
> >        u->headers_in.status_line = *status_line;
> >    }
> > 
> > would be enough, so nginx will generate status line by itself for 
> > such headers.
> 
> This is a nice trade-off between handling empty reason phrase
> with extra allocation and dropping it altogether, I like it.

Please see my response to the original report for the full patch.

> > 
> > The only downside I can see is that it will provide no way to 
> > actually generate a response with well-known status code, yet with 
> > an empty reason phrase.
> > 
> > (Also note that uwsgi and scgi modules needs to be patched 
> > similarly.)
> 
> These modules use a distinct routine to parse status line,
> that is ngx_http_parse_status_line, which respects spaces,
> and they expect the first line to look like a status line
> (similar to the proxy module).
> So I don't see immediate necessity to patch them as well.

Sure, but these are distinct parsing modes: parsing status line as 
in raw HTTP responses (which, in particular, are used by NPH scripts), 
or parsing the Status header, as in CGI responses.  Both should 
work correctly.

In particular, current SCGI specification 
(https://python.ca/scgi/protocol.txt) uses "Status: ..." in the 
example (whereas original version used "HTTP/1.0 ...", 
https://web.archive.org/web/20020403050958/http://python.ca/nas/scgi/protocol.txt).

I see no reasons why "Status: 404 " should work correctly in 
FastCGI, but shouldn't in SCGI.

> >> Alternatively, the reason-phrase value from FastCGI response 
> >> can be ignored, nginx will translate it with its own value:
> >> 
> >> # HG changeset patch
> >> # User Sergey Kandaurov <pluknet at nginx.com>
> >> # Date 1693241054 -14400
> >> #      Mon Aug 28 20:44:14 2023 +0400
> >> # Node ID f1a452cbd57ff4fba1caf3f36cb3624bd45411ea
> >> # Parent  2880f60a80c3e2706151dc7b48dc1267e39c47a9
> >> FastCGI: preserved SP for empty Status header reason phrase.
> >> 
> >> As per RFC 3875 sec 6.3.3, the status code is always followed by SP.
> >> The header is parsed by the common routine, and it can be challenging
> >> to properly translate empty reason phrase to HTTP/1.1 status line, as
> >> that would require an additional memory allocation.  For simplicity,
> >> the reason phrase is now ignored; the header filter will take care of
> >> it by inserting its own reason phrase for well known status codes.
> >> 
> >> diff --git a/src/http/modules/ngx_http_fastcgi_module.c b/src/http/modules/ngx_http_fastcgi_module.c
> >> --- a/src/http/modules/ngx_http_fastcgi_module.c
> >> +++ b/src/http/modules/ngx_http_fastcgi_module.c
> >> @@ -2048,7 +2048,6 @@ ngx_http_fastcgi_process_header(ngx_http
> >>                     }
> >> 
> >>                     u->headers_in.status_n = status;
> >> -                    u->headers_in.status_line = *status_line;
> >> 
> >>                 } else if (u->headers_in.location) {
> >>                     u->headers_in.status_n = 302;
> > 
> > I don't think it's a good idea, since this always drops the 
> > reason phrase provided by the upstream server.  It can contain 
> > some meaningful information which will be lost as a result, most 
> > notably for non-standard error codes.
> 
> It is little to no benefit to carry reason phrase,
> also this is consistent with HTTP/1.1:
> : A client SHOULD ignore the reason-phrase content because
> : it is not a reliable channel for information
> 
> Note also that HTTP/2 and then HTTP/3 do not define a way
> to carry the reason phrase that is included in HTTP/1.1.

In my practice, it is quite usable when debugging and analyzing 
the protocol flow.  HTTP/2 decision to drop it looks highly 
questionable to me, and generally reflects the fact that HTTP/2 is 
not designed for debugging.

> Personally I'd proceed with your approach for fastcgi module
> only, though I won't insist to patch other modules as well.
> Although this would introduce an extra condition in these
> other modules without a reason (as per my above analysis),
> this would keep sources in sync between them, which is fine,
> and will allow to stop using ngx_http_parse_status_line().
> 
> For the latter, uwsgi seems to use HTTP syntax in response,
> so it is ok to call ngx_http_parse_status_line() here, but
> scgi seems to follows CGI syntax where status line is
> carried in the Status header field, so the call would
> always return NGX_ERROR.  Luckily this is already handled
> by falling back to ngx_http_scgi_process_header().

SCGI does not define a response syntax at all, and at least 
"Status: ..." and "HTTP/1.0 ..." were seen in the different 
versions of the specification, see above.  As such, I tend to 
think that both variants needs to be (correctly) supported.

(On the other hand, for uwsgi I'm not sure if "Status: ..." is 
allowed, it seems to contradict the protocol (see 
https://uwsgi-docs.readthedocs.io/en/latest/Protocol.html, "Every 
uwsgi request generates a response in the uwsgi format").  We 
might consider removing Status header parsing in uwsgi, but this 
is probably out of the scope of this issue.)

-- 
Maxim Dounin
http://mdounin.ru/


More information about the nginx mailing list