[PATCH] HTTP: Add new uri_normalization_percent_decode option

Michael Kourlas Michael.Kourlas at solace.com
Thu Mar 30 17:19:08 UTC 2023


Hello,

Thanks again for your comments.

> This implies, basically, that there are 3 forms of the request
> URI: 1) fully encoded, as in $request_uri, 2) fully decoded, as in
> $uri now, and 3) "all-except-percent-and-reserved". To implement this
> correctly, it needs clear definition when each form is used, and
> it is going to be a non-trivial task to do this safely.

I agree. A simple way to do this would be to make percent-decoding customizable
on a per-directive basis. The core use case I was hoping to support is
preserving encoded reserved characters in location matching (essentially what
was proposed in [1]), so that is what I would like to focus on in a reworked
version of this patch.

I propose the following:

(1) The addition of a new variable called $uri_encoded_percent_and_reserved. As
discussed, this variable is a variant of the normalized URI ($uri) in which
any percent-encoded "%" or reserved character is left encoded rather than
decoded.

(2) Every transformation applied to $uri (e.g. from the "rewrite" directive,
internal redirects, etc.) is automatically applied to
$uri_encoded_percent_and_reserved as well.

If this raises performance concerns, a new flag could be added to control
whether $uri_encoded_percent_and_reserved is maintained at all.

(3) The addition of a new optional parameter to the URI form of "location"
blocks called "match-source":

location [ = | ~ | ~* | ^~ ] uri [match-source=uri|uri-encoded-percent-and-reserved] {
    ...
}

For example:

location ~ ^/api/objects/[^/]+/subobjects(/.*)?$ match-source=uri-encoded-percent-and-reserved {
    ...
}

"match-source=uri" is the default and matches the current behaviour. When
"match-source=uri-encoded-percent-and-reserved" is used, location matching for
that block uses $uri_encoded_percent_and_reserved rather than $uri. Nested
location blocks are not affected unless they also specify
"match-source=uri-encoded-percent-and-reserved".
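
To make the intended difference concrete, here is a rough Python sketch (not
nginx code) of how the two forms would behave against the regex from the
example above. The function name, the sample request path, and the choice of
the RFC 3986 reserved set are assumptions made for illustration; the sketch
covers only percent-decoding and ignores the other normalization steps applied
to $uri (such as resolving "." and ".." segments).

```python
import re
from urllib.parse import unquote_to_bytes

# Assumption: "reserved" means the RFC 3986 gen-delims and sub-delims.
# "%" is added because an encoded "%" must also stay encoded.
KEEP_ENCODED = set(b":/?#[]@!$&'()*+,;=%")

_PCT = re.compile(rb"%([0-9A-Fa-f]{2})")


def decode_except_percent_and_reserved(raw: bytes) -> bytes:
    """Decode %XX escapes unless they encode "%" or a reserved character."""
    def repl(m):
        byte = int(m.group(1), 16)
        if byte in KEEP_ENCODED:
            return m.group(0)   # leave the escape exactly as written
        return bytes([byte])    # decode everything else
    return _PCT.sub(repl, raw)


# Regex from the example location block.
LOCATION_RE = re.compile(rb"^/api/objects/[^/]+/subobjects(/.*)?$")

# Hypothetical request path: the object name is "a/b", sent as "a%2Fb".
raw = b"/api/objects/a%2Fb/subobjects"

fully_decoded = unquote_to_bytes(raw)                # stands in for $uri
preserved = decode_except_percent_and_reserved(raw)  # stands in for the new variable

print(LOCATION_RE.match(fully_decoded) is not None)  # False
print(LOCATION_RE.match(preserved) is not None)      # True
```

With the fully decoded form, the encoded slash becomes a real path separator
and "[^/]+" can no longer cover the whole object name, so the location does not
match. With the preserved form, "a%2Fb" stays a single segment and the location
matches as intended.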

In future it would be possible to use a similar pattern with other directives
that use $uri, such as "proxy_pass", but that can be done as part of a separate
patch.

If you think this is a sensible approach, I will submit a revised patch
implementing it.

Thanks,

Michael Kourlas

[1] https://trac.nginx.org/nginx/ticket/2225