[PATCH] HTTP: Add new uri_normalization_percent_decode option

Michael Kourlas Michael.Kourlas at solace.com
Mon Mar 27 16:19:27 UTC 2023


Hello,

Thanks for your comments! Sorry for the delay in responding -- I was on
vacation.

> As far as I understand, it will irreversibly corrupt URIs with
> double-encoded reserved characters.  For example, "%252F" will
> become "%2F" when proxying in the following configuration:
>
>     location /foo/ {
>         proxy_pass http://upstream/foo/;
>     }

I believe you are correct. When the "all-except-reserved" option is used, nginx
will not recognize, during the re-encoding step, that the "%" in "%2F" was
originally percent-encoded and not part of a percent-encoded "/".

However, I think this issue can be addressed by renaming "all-except-reserved"
to "all-except-percent-and-reserved" and adding "%" to the corresponding set of
characters that is not decoded or encoded automatically when that option is
used. "%252F" will then be left untouched throughout the entire process.

> Further, requests to static files with (properly escaped) reserved
> characters will simply fail, because nginx won't decode these
> characters.  For example, in the following trivial configuration a
> request to "/foo%3Fbar" won't be decoded to match "/foo?bar" file
> under the document root:
>
>     location / {
>         # static files
>     }

Yes, I agree it doesn't make sense to keep "%" and reserved characters
percent-encoded when performing a filesystem lookup, because at that point we
can be certain they're not being used as delimiters or anything other than
data.

However, I think this issue can be addressed by decoding any remaining
percent-encoded characters at the point of filesystem lookup, as well as any
other place where you think it is appropriate. (Of course, this should only be
done when using the "all-except-percent-and-reserved" option, and only with "%"
and reserved characters, to avoid double-decoding double-encoded characters.)

> Please also note that the configuration directive you've
> introduced in this patch applies to URI parsing from not-yet-final
> server block (see [1] for details), but the configuration from the
> final server block will be used for URI escaping.  These
> configuration can be different, and this might result in various
> additional issues.

I think this issue can be addressed by making the new directive available only
in http blocks. This ensures that all virtual servers share the same
configuration.

If you agree with these suggestions, I'd be happy to submit an updated patch.

Reserved characters are often used as delimiters in APIs, and I believe it is
important that nginx be able to distinguish between usages as delimiters and
usages as data.

Thanks,

Michael Kourlas
________________________________
 Confidentiality notice

This e-mail message and any attachment hereto contain confidential information which may be privileged and which is intended for the exclusive use of its addressee(s). If you receive this message in error, please inform sender immediately and destroy any copy thereof. Furthermore, any disclosure, distribution or copying of this message and/or any attachment hereto without the consent of the sender is strictly prohibited. Thank you.


More information about the nginx-devel mailing list