location regular expression not filtering some characters
CM Fields
cmfileds at gmail.com
Tue Jun 19 17:07:17 UTC 2012
I believe this was a mistake on my side. While testing I noticed that
the pound (#) and question mark (?) were allowed through, but the
resulting URL was not what I was expecting.
When the pound (#) is used nginx converts the URI from
http://example.com/data/12#34.txt
by cutting off the pound sign and anything after it, to this:
http://example.com/data/12
before my regular expression is ever used. The pound (#) marks a location
within the page (a fragment), so this is expected and fine.
The question mark (?) is still passed to my regular expression and
allowed through.
http://example.com/data/12?34.txt
gets passed through the regular expression unchanged
http://example.com/data/12?34.txt
Not sure why the question mark is special yet.
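A likely explanation is that nginx matches a "location" against the decoded
path only, and everything after the first "?" is treated as the query string,
so the regular expression only ever sees /data/12. Below is a rough Python
sketch of that behaviour, not nginx's actual code. The function name
location_matches is made up for illustration, and re.ASCII is used on the
assumption that \w in the nginx regex covers ASCII word characters only.

```python
import re
from urllib.parse import urlsplit, unquote

# Same pattern as the location block: ~* means case-insensitive.
LOCATION_RE = re.compile(r"^/data/[\w.]+$", re.IGNORECASE | re.ASCII)


def location_matches(url):
    """Approximate which part of a URL a location regex is tested against.

    Hypothetical helper: drops the query string and fragment, decodes
    %XX escapes in the path, then applies the regex to the path alone.
    """
    parts = urlsplit(url)
    path = unquote(parts.path)  # the query and fragment are not part of the path
    return bool(LOCATION_RE.match(path))


if __name__ == "__main__":
    for url in (
        "http://example.com/data/1234.txt",
        "http://example.com/data/12#34.txt",
        "http://example.com/data/12?34.txt",
        "http://example.com/data/12%2334.txt",
    ):
        print(url, "->", urlsplit(url).path, location_matches(url))
```

In this sketch the first three URLs match, because the path being tested is
/data/1234.txt or /data/12. Only the last one is rejected, since the encoded
%23 decodes to a literal "#" inside the path.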
On Tue, Jun 19, 2012 at 11:46 AM, CM Fields <cmfileds at gmail.com> wrote:
> I am looking to filter all characters other than those specified in the
> "location" regular expression. For example, [\w.]+$ should only allow one or
> more letters, numbers, underscore and period just like [a-zA-Z0-9_.]
>
> location ~* ^/data/[\w.]+$ {...}
>
> When I test the url with wget I find the pound (#) and question mark (?) are
> allowed through. For example...
>
> This URL is valid and is allowed through
> wget "http://example.com/data/1234.txt"
>
> This URL with the additional "#" should not be allowed through, but it is.
> wget "http://example.com/data/12#34.txt"
>
> Adding a question mark also gets through when it is supposed to be blocked like
> the pound "#" above.
> wget "http://example.com/data/12?34.txt"
>
>
> Are pound (#) and question mark (?) matches being overridden in Nginx and thus
> getting past my regular expression?
>
> Does anyone know of a way to block the "#" or "?" that I am missing?
>
>
> Just for clarity, I have no need for the "#" or "?" in my script and I can do
> checks in the script to exclude these characters if necessary.