Regular Expression global redirect

Max nginxyz at mail.ru
Mon Feb 27 06:33:39 UTC 2012


On 27 February 2012, 04:41, António P. P. Almeida <appa at perusio.net> wrote:
> On 27 Feb 2012 00h39 CET, nginx-forum at nginx.us wrote:
> 
> > I still can't seem to get this working. I upgraded my PCRE libraries
> > and recompiled/reinstalled a fresh nginx 1.0.12
> >
> > # pcrecheck
> > PCRE version 8.21 2011-12-12
> >
> > Here are my server sections. Notice I have 2 server sections...the
> > 1st section catches the WWW site and redirects it to the 2nd,
> > non-www...right? I'm still getting: nginx: [emerg] unknown "domain"
> > variable
> >
> > server {
> > listen 80;
> > server_name ^~www\.(?<domain>.*)$;
> > return 301 http://$domain;
> > }
> >
> > server {
> > listen 80;
> > server_name ^~(?<domain_name>[^\.]*)\.(?<tld>[^\.]*)$;
> > location / {
> > proxy_pass http://websites;
> > }
> > }
> >
> > When I try it with the P, everything (www and non-www) gets a white
> > 301 nginx page:
> >
> > server {
> > listen 80;
> > server_name ^~www\.(?P<domain>.*)$;
> > return 301 $scheme://$domain$request_uri;;
> > }
> >
> > server {
> > listen 80;
> > server_name _;
> > location / {
> > proxy_pass http://websites;
> > }
> > }
> >
> > I tried making server_name in the 2nd block:
> > server_name ^~(?P<domain_name>[^\.]*)\.(?<tld>[^\.]*)$;
> >
> > but I get this:
> > nginx: [emerg] invalid server name or wildcard
> > "^~(?p<domain_name>[^\.]*)\.(?<tld>[^\.]*)$" on 0.0.0.0:80
> > (fyi, the error has a lowercase p, server_name has it capitalized)
> >
> > Is there some other dependency I'm missing or am I just mangling the
> > syntax?
> 
> Oops. I erroneously switched the '^' and '~'. It's ~^ not ^~. Solly :(
> 
> Ok. It seems that your PCRE library has problems with the non P syntax
> for named captures. So you cannot mix both.
> 
> server {
> listen 80;
> server_name ~^www\.(?P<domain>.*)$;
> return 301 $scheme://$domain$request_uri;
> }
> 
> server {
> listen 80;
> server_name ~^(?P<domain_name>[^\.]*)\.(?P<tld>[^\.]*)$;
> location / {
> proxy_pass http://$domain_name.$tld;
> }
> }
> 
> This should work [1]. 

Your solution, while syntactically correct, is wrong by design.
What you created there is an open anonymizing proxy that will pass
any request from anyone to any host:port combination that contains
only the domain name and the TLD, if a functional resolver has been
set up using the resolver directive. Take a guess what this would do:

$ nc frontend 80
GET /a/clue HTTP/1.0
Host: fbi.gov:22

You should never pass unsanitized user input to proxy_pass, unless
you want people to abuse your open anonymizing proxy for illegal
activities that will get you in trouble. Good luck convincing
the FBI that your incompetence was the real culprit.

Moreover, unless you happen to have split-horizon DNS set up, with
the resolver pointing at an internal DNS server that maps
"$domain_name.$tld" to an internal IP, the frontend server will
resolve "http://$domain_name.$tld" to itself and pass every request
meant for the backend server back to itself, creating a nasty loop.
But if you had that kind of setup, you'd use it to do the mapping in
the first place instead of doing what you've been trying to do.

This is what your solution does if a functional resolver has been
set up:

http://www.domain.tld -> status code 301 with "Location: http://domain.tld"

http://own-domain.tld -> proxy_pass LOOP to http://own-domain.tld

http://foreign-domain.tld:port -> OPEN ANONYMIZING PROXY to
                                  foreign-domain.tld:port


If no resolver has been set up, proxy_pass will fail because it
cannot resolve the value of "$domain_name.$tld" for any request
that contains only the domain name and the TLD.


Here's one of the correct ways to do what the OP wants to do:

map $http_host $wwwless_http_host {
    hostnames;
    default                 $http_host;
    ~^www\.(?P<domain>.*)$  $domain;
}

server {
    listen 80 default_server;
    server_name _;

    location / {
        proxy_set_header Host $wwwless_http_host;
        proxy_pass http://backend;
    }
}
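
The "backend" in proxy_pass above is assumed to be an upstream group
defined in the http context (the OP calls his "websites"). With a
fixed upstream, the proxy target no longer depends on anything the
client sends, and no resolver is needed. A minimal sketch, with a
made-up address standing in for the real web server:

upstream backend {
    # hypothetical internal address; replace with your own backend
    server 10.0.0.10:8080;
}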


It would be a good idea to also allow only hosts and domains that you
actually host, which could be done like this:


map $http_host $own_http_host {
    hostnames;
    default    0;
    include    nginx.own-domains.map;
}

server {
    listen 80 default_server;
    server_name _;

    if ($own_http_host = 0) {
        # Not one of our hosts / domains, so terminate the connection
        return 444;
    }

    location / {
        proxy_set_header Host $own_http_host;
        proxy_pass http://backend;
    }
}

The nginx.own-domains.map file would contain entries such as:

.domain.org       domain.org;   # map domain.org and *.domain.org to domain.org
www.another.net   another.net;  # map only www.another.net to another.net

This file could be generated automatically from DNS zone files,
so it would be easy to maintain.
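
If a visible 301 redirect to the canonical name is wanted rather than
a silent Host rewrite, the same $own_http_host map could probably be
reused. This is an untested sketch, not a drop-in config. It assumes
every canonical name also maps to itself in nginx.own-domains.map
(for example "another.net another.net;"), otherwise requests for the
bare name hit the default and get a 444:

server {
    listen 80 default_server;
    server_name _;

    if ($own_http_host = 0) {
        # Not one of our hosts / domains, so terminate the connection
        return 444;
    }

    if ($http_host != $own_http_host) {
        # Known host in non-canonical form (e.g. www.another.net)
        return 301 $scheme://$own_http_host$request_uri;
    }

    location / {
        proxy_set_header Host $own_http_host;
        proxy_pass http://backend;
    }
}

The redirect target comes from the map file, not from the request,
so a client cannot make it point at a foreign domain.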

Max

