problem with PCRE matching, utf-8, Greek, rewrite

tmanolat nginx-forum at nginx.us
Thu Jul 1 19:33:49 MSD 2010


Dear all,
I am trying to implement some rewrites using regular expressions, and my
URIs will contain Greek characters.

The REs work as expected when tested with pcretest:

[code]
[root@localhost ~]# pcretest
PCRE version 8.10 2010-06-25

  re> #^[\x{0386}-\x{03FF}]+$#8
data> bv
No match
data> Τηλέ
 0: \x{3a4}\x{3b7}\x{3bb}\x{3ad}

[/code]
Note the 8 modifier, which tells pcretest to compile the pattern in UTF-8
mode (the PCRE_UTF8 option).


With the same RE in nginx.conf, nginx complains:
[code]
[emerg]: pcre_compile() failed: character value in \x{...} sequence is
too large in 
[/code]
which I guess means that nginx calls pcre_compile() without the PCRE_UTF8
option flag, so any \x{...} value above 0xFF is rejected.

Am I right? How can I implement these Greek character URL rewrites?
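
To make the question more concrete, here is a minimal sketch of a byte-level
fallback, assuming PCRE really is running without UTF-8 mode and that the URI
is matched as raw UTF-8 bytes. It is written in Python only to check the idea;
the names GREEK_BYTES and is_greek are hypothetical. The range U+0386..U+03FF
encodes in UTF-8 as CE 86..CE BF followed by CF 80..CF BF, so the character
class can be spelled out with two-byte alternatives and no \x{...} value
above 0xFF:

```python
import re

# Hypothetical names, for illustration only.
# U+0386..U+03BF encode as CE 86..CE BF, U+03C0..U+03FF as CF 80..CF BF,
# so this byte-level pattern covers the same range as [\x{0386}-\x{03FF}].
GREEK_BYTES = re.compile(rb"^(?:\xCE[\x86-\xBF]|\xCF[\x80-\xBF])+$")


def is_greek(segment: str) -> bool:
    """Return True if every character of segment lies in U+0386..U+03FF."""
    return GREEK_BYTES.match(segment.encode("utf-8")) is not None


# Cross-check the byte-level pattern against the code-point range.
for cp in range(0x0300, 0x0480):
    expected = 0x0386 <= cp <= 0x03FF
    assert is_greek(chr(cp)) == expected

print(is_greek("Τηλέ"))
print(is_greek("bv"))
```

If that reasoning holds, the same expression,
^(?:\xCE[\x86-\xBF]|\xCF[\x80-\xBF])+$, might work in a rewrite as it stands,
but I have not verified this against nginx 0.8.42. The PCRE documentation also
describes a (*UTF8) prefix at the start of a pattern (PCRE 7.9 and later) that
enables UTF-8 mode from within the pattern itself; whether nginx passes it
through unchanged is an assumption on my part.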

The system environment is:

* CentOS 5.4
* PCRE 8.10 with UTF-8 and Unicode properties support enabled
* nginx 0.8.42


Cheers
Tilemahos

Posted at Nginx Forum: http://forum.nginx.org/read.php?2,104357,104357#msg-104357



