How to do substitutions (like Perl's s/// operator) in rewrites?

bartschipper nginx-forum at nginx.us
Sun Jan 10 17:08:48 MSK 2010


I recently migrated a web site from Apache to Nginx (and from CMS x to CMS y).
Almost all of the rewrites work in the new CMS, except for one older class of article URLs:
http://example.com/News/Articlepage-News/This-is-the-best-News-EVER.htm (example)

The direct mapping from these literal URLs to the new URLs is lost.
I do, however, have a file (200k+ entries) that maps the titles to article IDs like so:

# old-class-urls.txt: (first two commented lines are not actually in the file)
# all lowercase title without spaces and hyphens     article-id
thisisthebestnewsever                                123456;
...

In Apache I used a RewriteMap to rewrite these URLs:

# httpd.conf:
...
RewriteMap articles prg:/etc/httpd/rewrites/old-class-article-urls.pl
...


# old-class-article-urls.pl:

#!/usr/bin/perl

$| = 1;

###############################################
# code to be executed at startup of webserver #
###############################################

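open(TEXTFILE,"</path/to/old-class-urls.txt") or die "Cannot open map file: $!";   # placeholder path to the map file
@lines = <TEXTFILE>;                                        # read all entries into memory
close TEXTFILE;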

foreach $line (@lines) {                                    # load the data in an associative array for fast lookup
   ($keyword,$article_id) = split(/\s+/,$line);
   $keys{$keyword} = $article_id;
}

##############################################
# code to be executed for each requested URL #
##############################################

while (<STDIN>) {
   $url = $_;
   chomp($url);
   if($url =~ /\/([^\/]+)\.htm$/) {                         # a match could be made
      $keyword = lc($1);
      $keyword =~ s/-//g;
      if($keys{$keyword}) {
         print  'old-class-url.php?articleid=' . $keys{$keyword}."\n";
         next;
      }
   }
   print "Not found\n";                                     # no match could be made
}
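
To make the key derivation concrete, here is a minimal standalone sketch of just the normalization step from the loop above. The helper name url_to_keyword is only for illustration and is not part of the RewriteMap script; the URL is the example from the top of this post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (illustration only): derive the lookup key from an
# old-class article URL, or return nothing if the URL does not match.
sub url_to_keyword {
    my ($url) = @_;
    return unless $url =~ m{/([^/]+)\.htm$};   # last path segment before ".htm"
    my $keyword = lc($1);                      # lowercase the title
    $keyword =~ s/-//g;                        # strip all hyphens
    return $keyword;
}

print url_to_keyword('/News/Articlepage-News/This-is-the-best-News-EVER.htm'), "\n";
# prints: thisisthebestnewsever
```

The printed key is what gets looked up in the title-to-article-id file.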

I was hoping something similar or even easier could be done with Nginx:


map $uri $old_class_url {
        include /etc/nginx/rewrites/old-class-urls.txt;
}
...
server {
...
    location ~* ^/News/Articlepage-News/.*\.htm$ {
        rewrite ^/News/Articlepage-News/(.*)\.htm$ $1;
###
### change $uri to lowercase and remove the hyphens...
### I am looking for something equivalent to these Perl substitutions:
### s/-//g;
### s/(.*)/\L$1/;
###
        if ($old_class_url) {
                rewrite  ^    /old-class-url.php?articleid=$old_class_url   permanent;
        }
    }
}


I have seen questions about similar functionality in the forums, but not with a solution:
http://forum.nginx.org/read.php?2,34788
http://forum.nginx.org/read.php?9,2511

Is it possible to solve this issue this way or do you recommend a different solution?

Thanks in advance,
Bart

Posted at Nginx Forum: http://forum.nginx.org/read.php?2,39425,39425#msg-39425



