status-line trailing SP is missing ?

Sergey Kandaurov pluknet at nginx.com
Thu Aug 31 11:45:06 UTC 2023


> On 31 Aug 2023, at 14:28, Maxim Dounin <mdounin at mdounin.ru> wrote:
> 
> Hello!
> 
> On Wed, Aug 30, 2023 at 04:20:15PM +0400, Sergey Kandaurov wrote:
> 
>>> On 29 Aug 2023, at 08:33, Maxim Dounin <mdounin at mdounin.ru> wrote:
>>> 
>>> On Mon, Aug 28, 2023 at 08:59:28PM +0400, Sergey Kandaurov wrote:
>>> 
>>>>> On 26 Aug 2023, at 18:21, Jérémy Lal <kapouer at melix.org> wrote:
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> https://bugs.debian.org/1050571
>>>>> reports that the Status line doesn't always end with space, which seems contradictory to RFC9112 which states:
>>>>> "A server MUST send the space that separates the status-code from the reason-phrase even when the reason-phrase is absent (i.e., the status-line would end with the space)."
>>>>> 
>>>>> Is it a documented nginx 1.24.0 behavior ?
>>>>> 
>>>> 
>>>> Note that the response line with empty reason phrase
>>>> is properly generated since nginx 1.5.6.
>>>> The exception remains is proxying FastCGI responses
>>>> as there is no distinguished response line in CGI syntax.
>>>> 
>>>> The reason is that Status is a CGI header field, and hence
>>>> it is parsed by a generic routine that cuts trailing spaces.
>>>> But it can have a trailing space per RFC 3875, section 6.3.3.
>>>> So it needs a special treatment to preserve SP before empty reason
>>>> phrase.  The below patch should help; although it doesn't look
>>>> efficient and can be polished, I think this is quite enough for
>>>> valid use cases.
>>> 
>>> I very much doubt that RFC 3875 properly defines whitespace 
>>> handling, see my response to the original report.  In this 
>>> particular case, it seems to define a header which cannot be 
>>> correctly parsed if reason-phrase is empty.
>>> 
>> 
>> Yes, it is quite dubious how this can be parsed correctly.
>> Although it is valid to have a trailing space in Status,
>> this contradicts to header field value syntax per RFC 3875:
>>      field-content   = *( token | separator | quoted-string )
> 
> Note that field-value syntax does no apply to the "Status" header, 
> its syntax is defined separately.

Well, per RFC 3875 BNF, Status is CGI-field, which if generalized
to other-field, consists of field-content.

> 
> On the other hand, the "Status" header syntax does not allow any 
> spaces after the colon, which rules out headers like "Status: 200 OK",
> and makes the "Status" header syntax highly questionable.

Again, if generalize the "Status" header to header-field, then
there's a note about whitespace between the ":" and the field-value.

> 
> As already suggested in my response to the original report, I tend 
> to think that the best available option is to ignore RFC 3875 idea 
> about headers syntax, and use HTTP instead.

I don't mind.

[..]

>>> 
>>> (Also note that uwsgi and scgi modules needs to be patched 
>>> similarly.)
>> 
>> These modules use a distinct routine to parse status line,
>> that is ngx_http_parse_status_line, which respects spaces,
>> and they expect the first line to look like a status line
>> (similar to the proxy module).
>> So I don't see immediate necessity to patch them as well.
> 
> Sure, but these are distinct parsing modes: parsing status line as 
> in raw HTTP responses (which, in particular, are used by NPH scripts), 
> or parsing the Status header, as in CGI responses.  Both should 
> work correctly.
> 
> In particular, current SCGI specification 
> (https://python.ca/scgi/protocol.txt) uses "Status: ..." in the 
> example (whereas original version used "HTTP/1.0 ...", 
> https://web.archive.org/web/20020403050958/http://python.ca/nas/scgi/protocol.txt).
> 

Ahh ok, didn't know that.
That explains why all three modules should be patches.

> I see no reasons why "Status: 404 " should work correctly in 
> FastCGI, but shouldn't in SCGI.
> 
>>>> Alternatively, the reason-phrase value from FastCGI response 
>>>> can be ignored, nginx will translate it with its own value:
>>>> 
>>>> # HG changeset patch
>>>> # User Sergey Kandaurov <pluknet at nginx.com>
>>>> # Date 1693241054 -14400
>>>> #      Mon Aug 28 20:44:14 2023 +0400
>>>> # Node ID f1a452cbd57ff4fba1caf3f36cb3624bd45411ea
>>>> # Parent  2880f60a80c3e2706151dc7b48dc1267e39c47a9
>>>> FastCGI: preserved SP for empty Status header reason phrase.
>>>> 
>>>> As per RFC 3875 sec 6.3.3, the status code is always followed by SP.
>>>> The header is parsed by the common routine, and it can be challenging
>>>> to properly translate empty reason phrase to HTTP/1.1 status line, as
>>>> that would require an additional memory allocation.  For simplicity,
>>>> the reason phrase is now ignored; the header filter will take care of
>>>> it by inserting its own reason phrase for well known status codes.
>>>> 
>>>> diff --git a/src/http/modules/ngx_http_fastcgi_module.c b/src/http/modules/ngx_http_fastcgi_module.c
>>>> --- a/src/http/modules/ngx_http_fastcgi_module.c
>>>> +++ b/src/http/modules/ngx_http_fastcgi_module.c
>>>> @@ -2048,7 +2048,6 @@ ngx_http_fastcgi_process_header(ngx_http
>>>>                    }
>>>> 
>>>>                    u->headers_in.status_n = status;
>>>> -                    u->headers_in.status_line = *status_line;
>>>> 
>>>>                } else if (u->headers_in.location) {
>>>>                    u->headers_in.status_n = 302;
>>> 
>>> I don't think it's a good idea, since this always drops the 
>>> reason phrase provided by the upstream server.  It can contain 
>>> some meaningful information which will be lost as a result, most 
>>> notably for non-standard error codes.
>> 
>> It is little to no benefit to carry reason phrase,
>> also this is consistent with HTTP/1.1:
>> : A client SHOULD ignore the reason-phrase content because
>> : it is not a reliable channel for information
>> 
>> Note also that HTTP/2 and then HTTP/3 do not define a way
>> to carry the reason phrase that is included in HTTP/1.1.
> 
> In my practice, it is quite usable when debugging and analyzing 
> the protocol flow.  HTTP/2 decision to drop it looks highly 
> questionable to me, and generally reflects the fact that HTTP/2 is 
> not designed for debugging.
> 
>> Personally I'd proceed with your approach for fastcgi module
>> only, though I won't insist to patch other modules as well.
>> Although this would introduce an extra condition in these
>> other modules without a reason (as per my above analysis),
>> this would keep sources in sync between them, which is fine,
>> and will allow to stop using ngx_http_parse_status_line().
>> 
>> For the latter, uwsgi seems to use HTTP syntax in response,
>> so it is ok to call ngx_http_parse_status_line() here, but
>> scgi seems to follows CGI syntax where status line is
>> carried in the Status header field, so the call would
>> always return NGX_ERROR.  Luckily this is already handled
>> by falling back to ngx_http_scgi_process_header().
> 
> SCGI does not define a response syntax at all, and at least 
> "Status: ..." and "HTTP/1.0 ..." were seen in the different 
> versions of the specification, see above.  As such, I tend to 
> think that both variants needs to be (correctly) supported.
> 

Sure, it makes sense then to keep both variants.

> (On the other hand, for uwsgi I'm not sure if "Status: ..." is 
> allowed, it seems to contradict the protocol (see 
> https://uwsgi-docs.readthedocs.io/en/latest/Protocol.html, "Every 
> uwsgi request generates a response in the uwsgi format").  We 
> might consider removing Status header parsing in uwsgi, but this 
> is probably out of the scope of this issue.)

-- 
Sergey Kandaurov


More information about the nginx mailing list