status-line trailing SP is missing ?
Sergey Kandaurov
pluknet at nginx.com
Thu Aug 31 11:45:06 UTC 2023
> On 31 Aug 2023, at 14:28, Maxim Dounin <mdounin at mdounin.ru> wrote:
>
> Hello!
>
> On Wed, Aug 30, 2023 at 04:20:15PM +0400, Sergey Kandaurov wrote:
>
>>> On 29 Aug 2023, at 08:33, Maxim Dounin <mdounin at mdounin.ru> wrote:
>>>
>>> On Mon, Aug 28, 2023 at 08:59:28PM +0400, Sergey Kandaurov wrote:
>>>
>>>>> On 26 Aug 2023, at 18:21, Jérémy Lal <kapouer at melix.org> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> https://bugs.debian.org/1050571
>>>>> reports that the Status line doesn't always end with space, which seems contradictory to RFC9112 which states:
>>>>> "A server MUST send the space that separates the status-code from the reason-phrase even when the reason-phrase is absent (i.e., the status-line would end with the space)."
>>>>>
>>>>> Is it a documented nginx 1.24.0 behavior ?
>>>>>
>>>>
>>>> Note that the response line with empty reason phrase
>>>> is properly generated since nginx 1.5.6.
>>>> The exception remains is proxying FastCGI responses
>>>> as there is no distinguished response line in CGI syntax.
>>>>
>>>> The reason is that Status is a CGI header field, and hence
>>>> it is parsed by a generic routine that cuts trailing spaces.
>>>> But it can have a trailing space per RFC 3875, section 6.3.3.
>>>> So it needs a special treatment to preserve SP before empty reason
>>>> phrase. The below patch should help; although it doesn't look
>>>> efficient and can be polished, I think this is quite enough for
>>>> valid use cases.
>>>
>>> I very much doubt that RFC 3875 properly defines whitespace
>>> handling, see my response to the original report. In this
>>> particular case, it seems to define a header which cannot be
>>> correctly parsed if reason-phrase is empty.
>>>
>>
>> Yes, it is quite dubious how this can be parsed correctly.
>> Although it is valid to have a trailing space in Status,
>> this contradicts to header field value syntax per RFC 3875:
>> field-content = *( token | separator | quoted-string )
>
> Note that field-value syntax does no apply to the "Status" header,
> its syntax is defined separately.
Well, per RFC 3875 BNF, Status is CGI-field, which if generalized
to other-field, consists of field-content.
>
> On the other hand, the "Status" header syntax does not allow any
> spaces after the colon, which rules out headers like "Status: 200 OK",
> and makes the "Status" header syntax highly questionable.
Again, if generalize the "Status" header to header-field, then
there's a note about whitespace between the ":" and the field-value.
>
> As already suggested in my response to the original report, I tend
> to think that the best available option is to ignore RFC 3875 idea
> about headers syntax, and use HTTP instead.
I don't mind.
[..]
>>>
>>> (Also note that uwsgi and scgi modules needs to be patched
>>> similarly.)
>>
>> These modules use a distinct routine to parse status line,
>> that is ngx_http_parse_status_line, which respects spaces,
>> and they expect the first line to look like a status line
>> (similar to the proxy module).
>> So I don't see immediate necessity to patch them as well.
>
> Sure, but these are distinct parsing modes: parsing status line as
> in raw HTTP responses (which, in particular, are used by NPH scripts),
> or parsing the Status header, as in CGI responses. Both should
> work correctly.
>
> In particular, current SCGI specification
> (https://python.ca/scgi/protocol.txt) uses "Status: ..." in the
> example (whereas original version used "HTTP/1.0 ...",
> https://web.archive.org/web/20020403050958/http://python.ca/nas/scgi/protocol.txt).
>
Ahh ok, didn't know that.
That explains why all three modules should be patches.
> I see no reasons why "Status: 404 " should work correctly in
> FastCGI, but shouldn't in SCGI.
>
>>>> Alternatively, the reason-phrase value from FastCGI response
>>>> can be ignored, nginx will translate it with its own value:
>>>>
>>>> # HG changeset patch
>>>> # User Sergey Kandaurov <pluknet at nginx.com>
>>>> # Date 1693241054 -14400
>>>> # Mon Aug 28 20:44:14 2023 +0400
>>>> # Node ID f1a452cbd57ff4fba1caf3f36cb3624bd45411ea
>>>> # Parent 2880f60a80c3e2706151dc7b48dc1267e39c47a9
>>>> FastCGI: preserved SP for empty Status header reason phrase.
>>>>
>>>> As per RFC 3875 sec 6.3.3, the status code is always followed by SP.
>>>> The header is parsed by the common routine, and it can be challenging
>>>> to properly translate empty reason phrase to HTTP/1.1 status line, as
>>>> that would require an additional memory allocation. For simplicity,
>>>> the reason phrase is now ignored; the header filter will take care of
>>>> it by inserting its own reason phrase for well known status codes.
>>>>
>>>> diff --git a/src/http/modules/ngx_http_fastcgi_module.c b/src/http/modules/ngx_http_fastcgi_module.c
>>>> --- a/src/http/modules/ngx_http_fastcgi_module.c
>>>> +++ b/src/http/modules/ngx_http_fastcgi_module.c
>>>> @@ -2048,7 +2048,6 @@ ngx_http_fastcgi_process_header(ngx_http
>>>> }
>>>>
>>>> u->headers_in.status_n = status;
>>>> - u->headers_in.status_line = *status_line;
>>>>
>>>> } else if (u->headers_in.location) {
>>>> u->headers_in.status_n = 302;
>>>
>>> I don't think it's a good idea, since this always drops the
>>> reason phrase provided by the upstream server. It can contain
>>> some meaningful information which will be lost as a result, most
>>> notably for non-standard error codes.
>>
>> It is little to no benefit to carry reason phrase,
>> also this is consistent with HTTP/1.1:
>> : A client SHOULD ignore the reason-phrase content because
>> : it is not a reliable channel for information
>>
>> Note also that HTTP/2 and then HTTP/3 do not define a way
>> to carry the reason phrase that is included in HTTP/1.1.
>
> In my practice, it is quite usable when debugging and analyzing
> the protocol flow. HTTP/2 decision to drop it looks highly
> questionable to me, and generally reflects the fact that HTTP/2 is
> not designed for debugging.
>
>> Personally I'd proceed with your approach for fastcgi module
>> only, though I won't insist to patch other modules as well.
>> Although this would introduce an extra condition in these
>> other modules without a reason (as per my above analysis),
>> this would keep sources in sync between them, which is fine,
>> and will allow to stop using ngx_http_parse_status_line().
>>
>> For the latter, uwsgi seems to use HTTP syntax in response,
>> so it is ok to call ngx_http_parse_status_line() here, but
>> scgi seems to follows CGI syntax where status line is
>> carried in the Status header field, so the call would
>> always return NGX_ERROR. Luckily this is already handled
>> by falling back to ngx_http_scgi_process_header().
>
> SCGI does not define a response syntax at all, and at least
> "Status: ..." and "HTTP/1.0 ..." were seen in the different
> versions of the specification, see above. As such, I tend to
> think that both variants needs to be (correctly) supported.
>
Sure, it makes sense then to keep both variants.
> (On the other hand, for uwsgi I'm not sure if "Status: ..." is
> allowed, it seems to contradict the protocol (see
> https://uwsgi-docs.readthedocs.io/en/latest/Protocol.html, "Every
> uwsgi request generates a response in the uwsgi format"). We
> might consider removing Status header parsing in uwsgi, but this
> is probably out of the scope of this issue.)
--
Sergey Kandaurov
More information about the nginx
mailing list