GeoIP Question - Speed & efficiency

Igor Sysoev is at rambler-co.ru
Sat Aug 16 13:13:11 MSD 2008


On Sat, Aug 16, 2008 at 03:05:56PM +0700, Chris Savery wrote:

> Thanks Maxim. Sounds cool, fast but perhaps a bit of a memory hog. For 
> loading time I would think the way to improve that is to compile a 
> binary representation on disk that can be loaded as a "pre-made tree" 
> into memory so that no insertion scan need be done. Or pre-sort the data 
> to insert with minimum searches.
> 
> Anyway, I may write a small script to see if I can amalgamate countries 
> into big blocks as that would help both speed and memory.

We at Rambler use a geo base with countries and Russian regions:

>wc geo.conf 
  141240  282480 2979471 geo.conf

Your base will probably be even smaller (as Russia will be a single country).
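
The amalgamation script Chris mentions could look roughly like the sketch below. It assumes the input is a list of (CIDR, country code) pairs and uses a hypothetical `CONTINENT` mapping; the sample networks are documentation ranges, not real allocations. Merging is done with the standard library's `ipaddress.collapse_addresses`, and the output is printed as `network value;` lines of the kind a geo block accepts.

```python
import ipaddress
from collections import defaultdict

# Hypothetical country-to-continent mapping; a real one would cover every code.
CONTINENT = {"RU": "EU", "DE": "EU", "US": "NA", "CA": "NA"}

def aggregate(entries):
    """Group (cidr, country) pairs by continent and merge adjacent networks."""
    by_continent = defaultdict(list)
    for cidr, country in entries:
        by_continent[CONTINENT[country]].append(ipaddress.ip_network(cidr))
    result = {}
    for continent, nets in by_continent.items():
        # collapse_addresses merges adjacent and overlapping networks.
        result[continent] = [str(n) for n in ipaddress.collapse_addresses(nets)]
    return result

# Sample data: documentation address ranges only, chosen for illustration.
sample = [
    ("192.0.2.0/25", "RU"),
    ("192.0.2.128/25", "DE"),
    ("198.51.100.0/24", "US"),
    ("203.0.113.0/24", "CA"),
]

merged = aggregate(sample)
for continent, nets in sorted(merged.items()):
    for net in nets:
        print(f"{net} {continent};")
```

In this sample the two /25 blocks share a continent and are adjacent, so they collapse into one /24; the two North American blocks are not adjacent and stay separate.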

> I gather that being at the http level config this means it is "always 
> on". I could see it being useful to be able to put it in a location 
> specifier so that only certain requests go through the lookup. For 
> example, I've no need for static images to get country codes but my 
> index page would be great as I would set a "best choice" value for 
> serving the user for all further requests in the session. It sounds like 
> it doesn't use much cpu time but I expect to be serving vast amounts 
> of small thumbnails so reducing cycles on that is always a good thing. 
> (25 thumbs/page/user ad nauseam photo app).

All nginx variables are evaluated on demand only, therefore geo variables
are looked up only if they are really used in a request.
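
The on-demand behaviour can be illustrated with a small sketch. This is not nginx's implementation, only a model of the idea: each variable has a handler that runs the first time a request asks for the value. The names `RequestVariables`, `HANDLERS` and `lookup_country` are hypothetical, and the lookup itself is a stand-in.

```python
class RequestVariables:
    """Per-request variables whose handlers run only on first access."""

    def __init__(self, handlers, request):
        self._handlers = handlers  # variable name -> function(request)
        self._request = request
        self._cache = {}
        self.calls = []            # records which handlers actually ran

    def get(self, name):
        if name not in self._cache:
            self.calls.append(name)
            self._cache[name] = self._handlers[name](self._request)
        return self._cache[name]

def lookup_country(request):
    # Stand-in for the geo lookup; the address range is a documentation block.
    return "RU" if request["addr"].startswith("192.0.2.") else "unknown"

HANDLERS = {"geo_country": lookup_country}

# A thumbnail request never reads the variable, so no lookup happens.
thumb = RequestVariables(HANDLERS, {"addr": "192.0.2.7", "uri": "/thumb/1.jpg"})

# The index page reads it, which triggers exactly one lookup.
index = RequestVariables(HANDLERS, {"addr": "192.0.2.7", "uri": "/index.html"})
country = index.get("geo_country")
index.get("geo_country")  # second read is served from the cache

print(thumb.calls, index.calls, country)
```

In Chris's terms, a request that never touches the geo variable pays nothing for the geo base being defined at the http level.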

> Cheers, for excellent info.
> Chris :)
> 
> Maxim Dounin wrote:
> >Hello!
> >
> >On Sat, Aug 16, 2008 at 07:43:45AM +0700, Chris Savery wrote:
> >
> >>I am thinking of using the GeoIP module with input from the MaxMind 
> >>database converted with the perl script as described through the link 
> >>on the nginx site.
> >>
> >>I'm curious if the country-ip pairs are managed efficiently so that 
> >>the lookup/conversion is very fast or not? That is, does the module 
> >>do something like sort the list and then use a binary tree to 
> >>quickly locate the country? Is the whole thing loaded in memory?
> >
> >The geo module builds an in-memory radix tree when loading configs.  This 
> >is the same data structure as used in routing, and lookups are really fast.
> >
> >>This country database is quite huge and if this process happens on 
> >>every hit or even on only a selected entry page then it could be 
> >>very slow. Does anyone here have experience with this?
> >
> >The only inconvenience of using really large geobases is config 
> >reading time.  Mine currently takes about 30 seconds to load - but 
> >that's for more than 30 MB of data, and not only countries.
> >
> >>For my purposes I only really need to detect continents for deciding 
> >>if visitors should pull from one of a few server locations. So 
> >>presumably it may be possible to combine many countries into larger 
> >>blocks so that there are fewer steps in the lookup. Any input on how 
> >>speedy or efficient this has shown to be would be super helpful here.
> >
> >Aggregating blocks is a good thing to do if you don't need detailed 
> >information, but you'll hardly notice any difference.
> >
> >Maxim Dounin
> >
> >
> 
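
The radix tree lookup Maxim describes in the quoted message can be modelled with a simple binary trie over address bits. The sketch below is an illustration only, not nginx's code: it handles IPv4 only, the class name `PrefixTree` is hypothetical, and the sample networks are documentation ranges. It shows why a lookup costs at most 32 steps no matter how many networks are loaded, which fits Maxim's remark that aggregating blocks makes little noticeable difference to speed.

```python
import ipaddress

class PrefixTree:
    """Binary trie keyed by IPv4 address bits; lookup is longest-prefix match."""

    def __init__(self):
        self.root = [None, None, None]  # [child for bit 0, child for bit 1, value]

    def insert(self, cidr, value):
        net = ipaddress.ip_network(cidr)
        bits = int(net.network_address)
        node = self.root
        for i in range(net.prefixlen):
            bit = (bits >> (31 - i)) & 1
            if node[bit] is None:
                node[bit] = [None, None, None]
            node = node[bit]
        node[2] = value

    def lookup(self, addr, default=None):
        bits = int(ipaddress.ip_address(addr))
        node, best = self.root, default
        for i in range(33):
            if node[2] is not None:
                best = node[2]      # remember the most specific match so far
            if i == 32:
                break               # all 32 address bits consumed
            node = node[(bits >> (31 - i)) & 1]
            if node is None:
                break               # no more specific prefix exists
        return best

tree = PrefixTree()
tree.insert("192.0.2.0/24", "RU")
tree.insert("192.0.2.128/25", "DE")

print(tree.lookup("192.0.2.5"))               # matched by the /24 only
print(tree.lookup("192.0.2.200"))             # the more specific /25 wins
print(tree.lookup("198.51.100.1", "default")) # no match, falls back
```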

-- 
Igor Sysoev
http://sysoev.ru/en/




