location regular expression not filtering some characters

CM Fields cmfileds at gmail.com
Tue Jun 19 17:07:17 UTC 2012

I believe this was a mistake on my side. While testing I noticed that
the (#) and (?) were allowed through, but the resulting URL was not
what I was expecting.

When the pound (#) is used nginx converts the URI from

          and cuts off the pound sign and anything after it to this....

before my regular expression is ever used. The pound (#) is a
location-specific tag, so this is expected and fine.

The question mark (?) is still passed to my regular expression and
allowed through.

       get passed through the regular expression unchanged

Not sure why the question mark is special yet.
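
A likely explanation, offered as a rough model rather than a description
of nginx internals: a "location" regular expression is matched against
the path part of the URI only, and the question mark starts the query
string, which is kept separately. Under that assumption the regular
expression only ever sees "/data/12", which is valid. The Python sketch
below imitates this with the standard library; the helper names
location_matches and full_uri_matches are hypothetical and exist only
for illustration.

```python
import re
from urllib.parse import urlsplit

# Same pattern as the location block; re.IGNORECASE stands in for "~*".
LOCATION_RE = re.compile(r"^/data/[\w.]+$", re.IGNORECASE)


def location_matches(url):
    """Hypothetical helper: test only the path, as a location match would."""
    path = urlsplit(url).path  # drops "?query" and "#fragment"
    return LOCATION_RE.search(path) is not None


def full_uri_matches(url):
    """Hypothetical stricter check: test path plus query and fragment."""
    parts = urlsplit(url)
    uri = parts.path
    if parts.query:
        uri += "?" + parts.query
    if parts.fragment:
        uri += "#" + parts.fragment
    return LOCATION_RE.search(uri) is not None


for url in ("http://example.com/data/1234.txt",
            "http://example.com/data/12?34.txt",
            "http://example.com/data/12#34.txt"):
    print(url, location_matches(url), full_uri_matches(url))
```

In this model the path-only check accepts all three test URLs, while the
check against the full URI accepts only the first. In nginx itself the
query string is available separately in the $args variable, so that is
probably the place to look when rejecting a "?".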

On Tue, Jun 19, 2012 at 11:46 AM, CM Fields <cmfileds at gmail.com> wrote:
> I am looking to filter all characters other than those specified in the
> "location" regular expression. For example, [\w.]+$ should only allow one or
> more letters, numbers, underscore and period just like [a-zA-Z0-9_.]
> location ~* ^/data/[\w.]+$  {...}
> When I test the url with wget I find the pound (#) and question mark (?) are
> allowed through. For example...
> This URL is valid and is allowed through
>   wget "http://example.com/data/1234.txt"
> This URL with the additional "#" should not be allowed through, but it is.
>   wget "http://example.com/data/12#34.txt"
> Adding a question mark also gets through when it is supposed to be blocked like
> the pound "#" above.
>   wget "http://example.com/data/12?34.txt"
> Are pound (#) and question mark (?) matches being overridden in Nginx and thus
> getting past my regular expression?
> Does anyone know of a way to block the "#" or "?" that I am missing?
> Just for clarity, I have no need for the "#" or "?" in my script and I can do
> checks in the script to exclude these characters if necessary.
