Matching of special characters in location
Francis Daly
francis at daoine.org
Tue Nov 10 00:19:34 UTC 2020
On Tue, Nov 10, 2020 at 12:11:28AM +0100, Grzegorz Kulewski wrote:
> On 09.11.2020 at 21:10, Sergey A. Osokin wrote:
> > On Mon, Nov 09, 2020 at 03:47:13PM +0100, Grzegorz Kulewski wrote:
Hi there,
> >> Is there any (sane) way to match things like: %e2%80%8b in URL in location?
> > here is the code snippet (not tested):
> >
> > location ~ ^/\xE2\x80\x8B {
> > return 200 "%e2%80%8b matched\n";
> > }
>
> Thank you. It works.
>
> The key seems to be using regexp match. Regular match doesn't seem to understand escapes. Not sure if (where) it is documented.
>
Regex match is straightforward here -- you use whatever your regex-engine
supports to match the octets, which probably includes a straight swap
of \x for % from the url.
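As a rough illustration of that "straight swap", here is a small Python
sketch that rewrites each %XX in a url fragment as \xXX. The function
name is made up for this example, and it assumes your regex engine
accepts \xHH escapes; it does not escape any other regex metacharacters
in the input.

```python
import re


def percent_to_regex_escapes(encoded: str) -> str:
    """Rewrite each %XX as \\xXX; every other character is left untouched."""
    return re.sub(
        r"%([0-9A-Fa-f]{2})",
        lambda m: "\\x" + m.group(1).upper(),
        encoded,
    )


# Prints the escape sequence to paste after "location ~ ^/" in the config.
print(percent_to_regex_escapes("%e2%80%8b"))
```

The printed value is what goes into the regex; the rest of the location
block stays as you would normally write it.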
Non-regex match does work too, though; the key there is that nginx does
that match against the non-url-encoded characters.
%e2%80%8b is the url-encoding of three octets; in a utf-8 world, they
represent the utf-8 encoding of the unicode code point U+200B (ZERO
WIDTH SPACE).
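If you want to check that decoding for yourself, a short Python sketch
using only the standard library is below. The variable names are just
for illustration.

```python
import unicodedata
from urllib.parse import unquote_to_bytes

# Undo the url-encoding to get the raw octets, then decode them as utf-8.
raw = unquote_to_bytes("%e2%80%8b")
char = raw.decode("utf-8")

print(len(raw), "octets:", raw.hex())
print(f"U+{ord(char):04X}", unicodedata.name(char))
```

It should report three octets and a single code point, which is the
character nginx sees when it does the non-regex match.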
So if you want to prefix-match on a string including that character,
you'll need to include that character directly in your config file. Your
text editor should have some way of letting you do that -- for example,
in "vim" in insert mode, the six-character sequence control-V, u, 2, 0,
0, b will do the right thing.
In my case, it displays as <200b> and represents a single character.
So my config file can include (e.g.)
location ^~/<200b> { return 200 "match /zwsp ($uri, $request_uri)\n"; }
(except there is only one "character" between the / and the space);
and then any request that starts with /%e2%80%8b should be handled in
that location; and any request that does not, should not be.
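If your editor makes it awkward to type the character, one alternative
is to generate the line with a script and paste or redirect the output.
The sketch below is only that, a sketch: the function names are
invented for this example, and the prefix check is a simplified
stand-in for what nginx does (decode the request path, then compare
against the location prefix), not nginx's actual code.

```python
from urllib.parse import unquote

ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE


def zwsp_location_line() -> str:
    """Build the location line with the literal character after the slash."""
    return (
        "location ^~/" + ZWSP
        + ' { return 200 "match /zwsp ($uri, $request_uri)\\n"; }'
    )


def would_prefix_match(request_path: str) -> bool:
    """Simplified model: decode the request path, then prefix-compare."""
    return unquote(request_path).startswith("/" + ZWSP)


print(zwsp_location_line())
print(would_prefix_match("/%e2%80%8b/page"))  # request starting with the zwsp
print(would_prefix_match("/page"))            # ordinary request
```

Make sure the output is saved as utf-8, so that the config file ends up
with the three octets e2 80 8b between the / and the space.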
Cheers,
f
--
Francis Daly francis at daoine.org