Matching of special characters in location

Francis Daly francis at daoine.org
Tue Nov 10 00:19:34 UTC 2020


On Tue, Nov 10, 2020 at 12:11:28AM +0100, Grzegorz Kulewski wrote:
> W dniu 09.11.2020 o 21:10, Sergey A. Osokin pisze:
> > On Mon, Nov 09, 2020 at 03:47:13PM +0100, Grzegorz Kulewski wrote:

Hi there,

> >> Is there any (sane) way to match things like: %e2%80%8b in URL in location?

> > here is the code snippet (not tested):
> > 
> > location ~ ^/\xE2\x80\x8B {
> >     return 200 "%e2%80%8b matched\n";
> > }
> 
> Thank you. It works.
> 
> The key seems to be using regexp match. Regular match doesn't seem to understand escapes. Not sure if (where) it is documented.
> 

Regex match is straightforward here -- you use whatever your regex-engine
supports to match the octets, which probably includes a straight swap
of \x for % from the url.
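
As a rough illustration of that swap, here is a small Python sketch (not
part of nginx; the helper name is made up for this example). It assumes
the path contains only percent-encoded octets and plain characters, with
no literal "%" and no regex metacharacters that would need escaping.

```python
# Hypothetical helper: turn a percent-encoded path into a regex fragment
# by swapping each "%" for "\x". Assumes no regex metacharacters in the input.
def percent_to_regex_hex(encoded: str) -> str:
    return encoded.replace("%", "\\x")


pattern = "^" + percent_to_regex_hex("/%e2%80%8b")

# Prints a location line of the same shape as the regex example quoted above.
print("location ~ " + pattern + " { ... }")
```

The hex digits come out in lower case here; PCRE accepts either case
after \x.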

Non-regex match does work too, though; the key there is that nginx does
that match against the url-decoded characters, not the %-encoded form.

%e2%80%8b is the url-encoding of three octets; in a utf-8 world, they
represent the utf-8 encoding of the unicode code point U+200B (ZERO
WIDTH SPACE).
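
If you want to check that decoding yourself, a few lines of Python using
only the standard library will do it. This is just a verification aid,
not anything nginx runs.

```python
import unicodedata
from urllib.parse import unquote_to_bytes

encoded = "%e2%80%8b"

# Undo the url-encoding: three "%xx" groups become three octets.
octets = unquote_to_bytes(encoded)

# Interpret those octets as utf-8: they form a single code point.
char = octets.decode("utf-8")

print(len(octets), "octets:", octets.hex(" "))
print("code point: U+%04X" % ord(char), unicodedata.name(char))
```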

So if you want to prefix-match on a string including that character,
you'll need to include that character directly in your config file. Your
text editor should have some way of letting you do that -- for example,
in "vim" in insert mode, the six-character sequence control-V, u, 2, 0,
0, b will do the right thing.

In my case, it displays as <200b> and represents a single character.

So my config file can include (e.g.)

  location ^~/<200b> { return 200 "match /zwsp ($uri, $request_uri)\n"; }

(except there is only one "character" between the / and the space);
and then any request that starts with /%e2%80%8b should be handled in
that location; and any request that does not, should not be.
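
If your editor makes it awkward to enter the character, a script can
produce the line instead. The Python sketch below builds the same
location line with the real U+200B in it, and also gives a very rough
model of the prefix test (decode the request path, then compare). The
function names are made up for this example, and the model ignores the
other normalisation nginx does on the uri, so treat it as an
illustration only.

```python
from urllib.parse import unquote

ZWSP = "\u200b"      # written as an escape here so that it is visible
PREFIX = "/" + ZWSP  # the prefix that the location should match


def location_line(prefix: str) -> str:
    # Hypothetical helper: the config line with the literal character in it.
    return ('location ^~' + prefix +
            ' { return 200 "match /zwsp ($uri, $request_uri)\\n"; }')


def would_prefix_match(request_path: str, prefix: str = PREFIX) -> bool:
    # Simplified model: url-decode the path, then do a plain prefix compare.
    return unquote(request_path).startswith(prefix)


line = location_line(PREFIX)

# The character sits in the file as the three octets e2 80 8b.
print(line.encode("utf-8"))
print(would_prefix_match("/%e2%80%8b/page"))
print(would_prefix_match("/page"))
```

Write the line out as utf-8 and include it in your config however suits
you.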

Cheers,

	f
-- 
Francis Daly        francis at daoine.org

