Rewrite regex with percent signs

redrobes nginx-forum at forum.nginx.org
Sun May 22 17:18:19 UTC 2016


Francis Daly Wrote:
-------------------------------------------------------
> On Sun, May 22, 2016 at 07:16:35AM -0400, redrobes wrote:
> 
> Hi there,
> 
> > For example, we have a url of the following:
> > /members/redrobes-albums-2d%20vs%203d%20?-picture12345-mt-pub01.jpg
> 
> The %20 pieces in there are url-encoded spaces. In a "location" or a
> "rewrite", you would have to match a single space character each.
> 
> However, there is also a ? in the url; that marks the start of the query
> string. A "location" or "rewrite" in nginx will *not* consider that part
> of the url.


Ah! Thanks. I didn't spot that one amongst all the other odd chars there.
Yes, nginx does indeed treat the ? differently from perl. The hex codes
decode to "2d vs 3d ?", i.e. the ? was in the title of the post and is not
the start of the args. I think that's it, and it makes sense now; I can
understand what is going on.
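
A minimal sketch of one way to see that split directly: a throwaway
location that echoes back the variables nginx derives from the request.
Using "return 200" as a debugging aid is an assumption on my part, not
something from the thread, and the block would temporarily stand in for the
real /members/ location.

```nginx
# Temporary debugging block (hypothetical; swap it in for the real
# /members/ location while testing, then remove it).
location /members/ {
    default_type text/plain;
    # $uri         : decoded path, which is what "location" and "rewrite" match
    # $args        : everything after the first ? in the request
    # $request_uri : the original request target, still %-encoded, ? included
    return 200 "uri=[$uri]\nargs=[$args]\nrequest_uri=[$request_uri]\n";
}
```

For the example url, the uri line should end in "2d vs 3d " and the args
line should begin with "-picture12345", which is why the original rewrite
never sees the picture number.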

It could have been mighty hard to fix this case, since we are not going to
know in advance whether the ? was part of the url or the start of args, but
in our case we know we're going to dump all the args anyway and substitute
our own. So there may be the possibility of appending the args back onto the
url before we do the match. Not sure at this point.
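
One possible way to get the "url plus args" matching described above,
sketched under assumptions: instead of rebuilding the string, match against
$request_uri, which already holds the path and the args together in their
original %-encoded form. This is not tested against the real site; the
pattern is the one from the existing rewrite with the trailing ".*" dropped,
and it relies on the incoming args being discarded anyway.

```nginx
location /members/ {
    # $request_uri is the raw request target, so the stray ? and the %20's
    # are ordinary characters here and ".+" matches across them.
    # Assumption: any /members/ request containing "-albums-" and
    # "-picture<digits>-" should be redirected, wherever the ? falls.
    if ($request_uri ~ ^/members/.+-albums-.+-picture(\d+)-) {
        # 302 is the same status the "redirect" flag on rewrite produces;
        # "return" does not append the original args.
        return 302 /attachment.php?attachmentid=$1;
    }
}
```

Because $request_uri is not decoded, a pattern that needs to match a space
would have to spell it as %20 here, unlike in "location" or "rewrite".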

But thanks Francis - I think you have solved it.



> 
> > it needs to go to
> > 
> > /attachment.php?attachmentid=12345
> 
> It is not immediately clear to me which parts of the original url are
> important in deciding whether the request should be redirected or not.
> 
> > we have:
> > 
> > location /members/ {
> > rewrite ^/members/.+-albums-.+-picture(\d+)-.*
> > /attachment.php?attachmentid=$1? redirect;
> > }
> 
> That suggests that just those three words matter. You might be able
> to put something together involving "$args" matching "-picture(\d+)-"
> if the request matches "^/members/.*-albums-", perhaps?
> 
> Alternatively, perhaps the thing that created the url in the first place,
> incorrectly did not url-encode the ? to %3F.
> 
> > and this particular one is not working. It works with many others where the
> > original url did not have the %20's in them. So there is something about
> > those %20's that are causing these to fail.
> 
> I suspect that it is the ? rather than the %20, from the one example
> you have given.
> 
> > I can write a perl script and run that url through its regex and it does
> > change them.
> > 
> > So what does the nginx regex do different from perl regex with regard to %
> > signs.
> 
> With regard to % signs, nginx regex uses the %-unencoded version. With
> regard to ?, some nginx parts do not consider anything after the ? when
> matching.
> 
> Good luck with it,
> 
> 	f
> -- 
> Francis Daly        francis at daoine.org
> 

Posted at Nginx Forum: https://forum.nginx.org/read.php?2,267039,267044#msg-267044


