GeoIP Question - Speed & efficiency

Chris Savery chrissavery at gmail.com
Sat Aug 16 18:11:00 MSD 2008


I wrote a quick PHP script to amalgamate the IP ranges into regions 
larger than countries. It maps each country to a user-defined group 
(for me NA, EU, AS) and merges consecutive ranges that fall in the same 
group. Doing this drops the line count from 104K to about 33K, and after 
running it through the Perl script the conf file is 1.5MB instead of 
over 3MB, so that's not a bad saving. I checked a lot of the regions 
manually to be sure it was working, so I think it's OK.

I'll post the code here just in case anyone else can use it. Sorry it's 
not Perl - I learned it a decade ago but never use it, so I didn't want 
to brush up. This works. I just want to set the correct image server for 
a visitor so they get faster photos.
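
As a rough idea of how the PHP side could use this: assuming nginx passes 
the merged region code through with "fastcgi_param COUNTRY $geo;" (as in 
the quoted thread below), something like the following could pick the 
image host. The host names and the function name image_server_for() are 
made up for illustration.

```php
<?php
// Sketch only: the host names below are placeholders, not real servers.
$image_servers = array(
    'NA' => 'img-na.example.com',
    'EU' => 'img-eu.example.com',
    'AS' => 'img-as.example.com',
);

// Return the image host for a region code, falling back to a default
// region when the code is missing or unknown.
function image_server_for($region, $servers, $default = 'NA')
{
    return isset($servers[$region]) ? $servers[$region] : $servers[$default];
}

// COUNTRY is assumed to be set by nginx via: fastcgi_param COUNTRY $geo;
$region = isset($_SERVER['COUNTRY']) ? $_SERVER['COUNTRY'] : '';
$host = image_server_for($region, $image_servers);
```

The fallback to 'NA' mirrors the $other setting in the script below.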

I guess the best thing would be to do a set of GETs from the client to 
each server on demand and then choose the image server with the best 
times - then it adapts in real time. Didn't think of that till now...
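
If the client did report its timings back, picking the winner on the 
server would be simple. A minimal sketch, assuming the timings arrive as 
an array of host => milliseconds (how the browser measures and sends them 
is left out, and the host names and fastest_server() are placeholders):

```php
<?php
// Sketch only: $timings maps a host name to its measured fetch time in ms.
// Returns the host with the lowest time, or null if nothing was measured.
function fastest_server($timings)
{
    if (empty($timings)) {
        return null;
    }
    asort($timings);        // sort by time, keeping the host keys
    reset($timings);
    return key($timings);   // first key is now the fastest host
}

$best = fastest_server(array(
    'img-na.example.com' => 180,
    'img-eu.example.com' => 95,
    'img-as.example.com' => 240,
));
```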

Chris :)

<?php  // Combine regions in GeoIP Database

$regions = array(
    'NA' => 'US CA MX PR VI BM BO BS DM AR BZ BR CL PN AD AI AG AW AT BB '.
        'BA BG KY CO '.
        'CR CU DM EC SV GQ GP GT HT HN JM NR NI PY PE PL RU RO TT TC ',
    'EU' => 'EU GB DE FR IT ES SE IR NL BE IE IL CH AL AM BY HR CY CZ DK '.
        'EE FI GE GI '.
        'GR GL GG HU IS LB LY LI LT LU MC ME MS NO PT RS SK SI TR UA VA '.
        'ZA GA EG NA NG ZW BJ GH CG MW UG SC TZ TM KE RW TZ SO SR SY TM '.
        'AE UZ AF DZ AO AZ BI '.
        'CV CF TD IQ JE LV MR MQ MU MN SA SL CI NE LS SZ MG SL AO BF MU '.
        'TG LY SN SD RE CV GQ '.
        'ZM BW CD TN BJ TG BT BW DJ ER ET JO KZ KW KG LB OM QA ',
    'AS' => 'JP IN AU NZ TH CN HK MY PK KR HK SG BD ID TW PH LK VN AP AS '.
        'AQ TO KH '.
        'CK FJ GN LA MO MM NP NC PG PN WS ST'
    );
$other = 'NA';

$geo = fopen('GeoIPCountryWhois.csv', 'r');
$r = $w = 0;
$last = fgetcsv($geo);
$last_region = region($last);

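while(($line = fgetcsv($geo)) !== false)
    {
    $r++;
    if(($region = region($line)) != $last_region)
        {
        // Region changed: write out the merged range collected so far.
        print '"'.join('","', array($last[0], $last[1], $last[2],
            $last[3], $last_region, '-'))."\"\n";
        $last = $line;
        $last_region = $region;
        $w++;
        }
    else
        {
        // Same region: extend the end of the current range (IP and number).
        $last[1] = $line[1];
        $last[3] = $line[3];
        }
    }
// Write out the final range, which the loop above never prints.
print '"'.join('","', array($last[0], $last[1], $last[2],
    $last[3], $last_region, '-'))."\"\n";
$w++;
fclose($geo);
//print "$r => $w\n";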

function region($vars)
{
    global $regions, $other;
   
    $found = false;
    foreach($regions as $r => $codes)
        if(strpos($codes, $vars[4]) !== false)
            {
            $found = $r;
            break;
            }
    return $found ? $found : $other;
}

?>
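
To see what the merge does without the full CSV, here is a small 
self-contained sketch of the same idea on a few made-up rows. The 
function name merge_rows(), the lookup array and the sample addresses 
are only for illustration; the rows use the six-column layout of 
GeoIPCountryWhois.csv (start IP, end IP, start number, end number, 
country code, country name). Like the script, it joins consecutive rows 
in the same region even if there is a gap between the ranges.

```php
<?php
// Sketch only: merge consecutive rows whose country codes map to the
// same region. $map is a hypothetical country-code => region lookup;
// codes not in the map fall into $other.
function merge_rows($rows, $map, $other = 'NA')
{
    $out = array();
    $last = null;
    foreach ($rows as $row) {
        $region = isset($map[$row[4]]) ? $map[$row[4]] : $other;
        if ($last !== null && $region === $last[4]) {
            // Same region as the previous row: extend the end of the range.
            $last[1] = $row[1];
            $last[3] = $row[3];
        } else {
            if ($last !== null) {
                $out[] = $last;
            }
            $last = array($row[0], $row[1], $row[2], $row[3], $region, '-');
        }
    }
    if ($last !== null) {
        $out[] = $last;   // the final range has to be written too
    }
    return $out;
}

$map = array('US' => 'NA', 'CA' => 'NA', 'DE' => 'EU');
$rows = array(
    array('10.0.0.0', '10.0.0.255', '167772160', '167772415', 'US', 'United States'),
    array('10.0.1.0', '10.0.1.255', '167772416', '167772671', 'CA', 'Canada'),
    array('10.0.2.0', '10.0.2.255', '167772672', '167772927', 'DE', 'Germany'),
);
$merged = merge_rows($rows, $map);
```

Here the US and CA rows collapse into one NA range ending at 10.0.1.255, 
and the DE row becomes a separate EU range.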

Igor Sysoev wrote:
> On Sat, Aug 16, 2008 at 05:27:47PM +0700, Chris Savery wrote:
>
>   
>>> All nginx variables are evaluated on demand only, therefore geo variables
>>> are looked up only if they are really used in a request.
>>>       
>> Ok. Excellent, so if I only include the fastcgi param line for one 
>> location, say for index.php then it would only evaluate under that 
>> condition to pass thru to php, like this:
>>
>> fastcgi_param  COUNTRY      $geo;
>>
>> Which is easy then...
>>     
>
> Yes. Actually even if you set fastcgi_param on http level, it will eventually
> be inherited on all locations level (unless overridden), but it will execute
> only when fastcgi_pass directive will start to work.
>
>
>   
