URL rewriting failure in case of UTF-8 URLs

Francis Daly francis at daoine.org
Thu Dec 5 19:09:35 UTC 2013

On Wed, Dec 04, 2013 at 07:42:17PM -0500, omidr wrote:

Hi there,

> As an example I want to redirect "/آرایشگر" to "/استخدام آرایشگر" and I am
> using a rewrite rule like the one below but things go wrong and I get a
> 404.

Can I suggest you try to simplify the test case, so that you can see
what exactly is happening?

Use "rewrite A B permanent;", and if the url matches A you will see a
redirection to B in the http response headers; and if it does not match
A you will not see a redirection to B.

That way you will know whether the request that you made matched this
rewrite or not.
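
As a rough sketch of that check, a small shell function can reduce the
response headers to just the status code and the Location value. The
helper name show_redirect and the localhost URL are assumptions for
illustration only. With "permanent", a request that matches should show
a 301 followed by the target; a request that does not match will show
whatever status nginx returned instead.

```shell
# Hypothetical helper: read HTTP response headers on stdin and print
# the status code, followed by the Location header value if present.
show_redirect() {
  tr -d '\r' | awk '
    NR == 1                    { status = $2 }
    tolower($1) == "location:" { loc = $2 }
    END { if (loc != "") print status, loc; else print status }
  '
}

# Usage against a hypothetical local server (uncomment to run):
# curl -sI 'http://localhost/A' | show_redirect
```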

What matters to nginx are the actual bytes written inside the config file,
and the actual bytes received in the request.

> rewrite_rule: rewrite ^/آرایشگر/$ /استخدام آرایشگر;

So, make that be (say)

  rewrite ^/گر$ /match-گر permanent;

and then do something like

  grep permanent nginx.conf | xxd

so that you can see the bytes that are in the file on that line.
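
To have something to compare the xxd output against, it can help to
print the bytes you expect for a given path. The following is only a
sketch; hex_bytes is a made-up helper name, and it assumes your terminal
and shell pass the argument through as UTF-8.

```shell
# Hypothetical helper: print the bytes of its argument as lowercase hex.
# -v stops od from collapsing repeated lines into "*".
hex_bytes() {
  printf '%s' "$1" | od -An -v -tx1 | tr -d ' \n'
}

# Example (uncomment to run); with UTF-8 input this prints 2f6f6dc3ad642f
# hex_bytes '/omíd/'
```

If the bytes in the config file differ from the bytes printed here for
the same visible text, the file was probably saved in a different
encoding or contains a different code point than you expect.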

Then issue your single request for /گر, and watch the http response --
do you get the permanent redirect, or something else?

It is probably worth watching the network traffic using tcpdump, or some
other means, so that you can see what bytes are sent by the browser. If
the rewrite doesn't match when you think it should, the tcpdump output
might give an indication of why that was.

> And what do you mean by config? Do you mean nginx settings or configurations
> at compile time?

I mean "enough information so that it is easy for me (or anyone else)
to do what you are doing so as to be able to see the problem that you
are reporting".

I first tested using a line

  rewrite ^/omíd/$ /omídreza permanent;

where the character between "om" and "d" is the two bytes c3 ad, which
is the UTF-8 representation of "LATIN SMALL LETTER I WITH ACUTE".

I tested using "curl -i" asking for /omíd/ and also /om%C3%ADd/. I saw
the expected redirection.

So nginx is correctly handling a rewrite of a UTF-8 URL.
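
The two request forms above carry the same bytes, one raw and one
percent-encoded. A minimal sketch for producing the percent-encoded form
of a path follows. The name pct_encode_path is hypothetical, and this is
not a complete RFC 3986 encoder: it simply leaves printable ASCII other
than "%" alone and encodes every other byte.

```shell
# Hypothetical helper: percent-encode spaces, '%', control bytes and all
# non-ASCII bytes of a path; other printable ASCII is passed through.
pct_encode_path() {
  for b in $(printf '%s' "$1" | od -An -v -tx1); do
    n=$((0x$b))
    if [ "$n" -gt 32 ] && [ "$n" -lt 127 ] && [ "$n" -ne 37 ]; then
      # Emit the byte itself via an octal escape.
      printf '%b' "\\0$(printf '%03o' "$n")"
    else
      # Emit the byte as %XX with uppercase hex digits.
      printf '%%%s' "$(printf '%s' "$b" | tr 'a-f' 'A-F')"
    fi
  done
}

# Usage against a hypothetical local server (uncomment to run):
# curl -i "http://localhost$(pct_encode_path '/omíd/')"
```

With UTF-8 input, pct_encode_path '/omíd/' gives /om%C3%ADd/, which is
the second form used in the curl test.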

Can you test with that same thing, and see if your nginx responds
differently? If it does respond differently, then there is something
significantly different between my nginx and your nginx, so the "nginx -V"
output will probably matter.

Francis Daly        francis at daoine.org
