NGINX - traffic log / graphing / billing per destination ip

mike mike503 at gmail.com
Sun Mar 8 08:52:28 MSK 2009


i've had to resort to a mixture of basic logging and then processing
those logs later on. it -seems- to be reliable enough for now.

in my nginx.conf:

log_format traffic
'$http_host|$bytes_sent|$time_local|$remote_addr|$request_uri';
access_log /var/log/nginx/traffic traffic;
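
For anyone adapting this, here is a minimal sketch of splitting one line of that log format back into fields. `parse_traffic_line()` is a hypothetical helper, not part of my script below (which uses fgetcsv), and the sample line is made up. It uses explode() with a limit so that a '|' inside the request URI stays in the last field.

```php
<?php
// Hypothetical helper: split one line of the "traffic" log format into named fields.
// Field order follows the log_format above: host|bytes|time|ip|uri
function parse_traffic_line($line) {
    // A limit of 5 keeps any '|' inside the request URI in the last field.
    $parts = explode('|', rtrim($line, "\r\n"), 5);
    if (count($parts) < 5 || !ctype_digit($parts[1])) {
        return null; // blank, truncated or otherwise malformed line
    }
    return array(
        'host'  => $parts[0],
        'bytes' => (int)$parts[1],
        'time'  => $parts[2],
        'ip'    => $parts[3],
        'uri'   => $parts[4],
    );
}

// Made-up sample line, not taken from a real log.
$sample = "www.example.com|5120|08/Mar/2009:00:01:02 +0300|192.0.2.10|/index.html?a=1|2";
$row = parse_traffic_line($sample);
if ($row !== null) {
    echo $row['host'], ' ', $row['bytes'], ' ', $row['uri'], "\n";
}
```

Returning null for bad lines lets the caller skip them instead of counting an empty host.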

then a php script that i run once per night, at around 12:05am, grabs
the previous day's log (traffic.YYYYMMDD) from each of my 3 webservers
and totals bytes and requests per host

it also normalizes the host, and it can look at the URI to reassign a
request's traffic from one host to another (i use a couple of rules
like that to attribute bandwidth to the right site based on the uri or
the host)
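
A rough sketch of that host handling pulled out into two functions, in case it is easier to follow on its own. `normalize_host()`, `billing_host()` and the `$rules` table are hypothetical names used for illustration; the script below does the same thing inline. The foo2/foo3 to foo.com rule is the one from the script.

```php
<?php
// Hypothetical helper: reduce a Host header value to a canonical billing key.
function normalize_host($host) {
    $host = strtolower(trim($host));
    $host = preg_replace('/:(80|443)$/', '', $host); // drop default ports only
    $host = preg_replace('/\.$/', '', $host);        // trailing dot of an FQDN
    $host = preg_replace('/^www\./', '', $host);     // leading "www."
    return $host;
}

// Hypothetical rule table format: URI prefix => (host substring => billed host).
function billing_host($host, $uri, $rules) {
    $host = normalize_host($host);
    foreach ($rules as $prefix => $map) {
        if (strncmp($uri, $prefix, strlen($prefix)) !== 0) {
            continue; // this rule does not apply to the URI
        }
        foreach ($map as $needle => $target) {
            if (strpos($host, $needle) !== false) {
                return $target; // bill this request to another host
            }
        }
    }
    return $host;
}

$rules = array(
    '/rating/video/get.php' => array('foo3' => 'foo.com', 'foo2' => 'foo.com'),
);
echo billing_host('WWW.Foo3.example.:80', '/rating/video/get.php?id=1', $rules), "\n";
echo billing_host('site4:80', '/index.html', $rules), "\n";
```

Stripping the port with an anchored regex leaves other ports such as :8080 in place, so they show up as separate hosts.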

*** YOUR MILEAGE MAY VARY ***

#!/usr/local/bin/php -q
<?php

$servers = Array('web01', 'web02', 'web03');

$daysback = 1;
for($x=$daysback; $x>0; $x--) {
        $day = date('Ymd', time()-(60*60*24*$x));
        echo "$day\n";
        runstats($day);
}
system("rm -f /tmp/log??????");

function runstats($day) {
        global $servers;

        $tmpfile = tempnam('/tmp', 'log');
        $logfile = '/var/log/nginx/traffic.'.$day;

        $hosts = array();
        $requests = array();
        $transfer = array();

        foreach($servers as $server) {
                echo "\t{$server}\n";
                system("/usr/bin/scp {$server}:{$logfile} {$tmpfile} 2>/dev/null");
                if(file_exists($tmpfile)) {
                        $fp = fopen($tmpfile, 'r');
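                        while(($row = fgetcsv($fp, 1024, '|')) !== false) {
                                // skip blank or malformed lines, and lines whose
                                // byte count is not purely digits
                                if(count($row) < 5 || !ctype_digit($row[1])) {
                                        continue;
                                }
                                list($host, $bytes, $date, $ip, $uri) = $row;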
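                                // normalize: lowercase, then drop port, trailing dot and leading "www."
                                $host = strtolower($host);
                                // rtrim() takes a character list, not a suffix, so use a regex for the port
                                $host = preg_replace('/:(80|443)$/', '', $host);
                                $host = preg_replace('/\.$/', '', $host);
                                $host = preg_replace('/^www\./', '', $host);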
                                if(substr($uri, 0, 21) == '/rating/video/get.php') {
                                        if(strstr($host, 'foo3')) {
                                                $host = 'foo.com';
                                        } elseif(strstr($host, 'foo2')) {
                                                $host = 'foo.com';
                                        }
                                }
                                if(!isset($requests[$host])) {
                                        $requests[$host] = 1;
                                } else {
                                        $requests[$host]++;
                                }
                                if(!isset($transfer[$host])) {
                                        $transfer[$host] = $bytes;
                                } else {
                                        $transfer[$host] = $transfer[$host] + $bytes;
                                }
                                $hosts[$host] = 1;

                        }
                        fclose($fp);
                        unlink($tmpfile);
                }
        }

        $day = date('Y-m-d', strtotime($day));
        db_query("DELETE FROM stats WHERE day='{$day}'");
        foreach(array_keys($hosts) as $host) {
                if(!empty($host)) {
                        $vhost = db_escape($host);
                        $t = isset($transfer[$host]) ? $transfer[$host] : 0;
                        $r = isset($requests[$host]) ? $requests[$host] : 0;
                        db_query("REPLACE INTO stats (vhost,day,bytes,requests) VALUES('$vhost', '$day', $t, $r)");
                }
        }
}
?>




On Sat, Mar 7, 2009 at 8:55 PM, Jeffrey 'jf' Lim <jfs.world at gmail.com> wrote:
> On Sun, Mar 8, 2009 at 9:40 AM, Payam Chychi <pchychi at gmail.com> wrote:
>> Hi Guys,
>>
>> I've spent some time going through google trying to find an elegant
>> solution for traffic logging/ graphing(rrd) / billing(based on rrd?)
>> but have come up empty handed. Right now here is what I am doing:
>>
>> <snip>
>
> how about log parsing? would that help?
>
> -jf
>
> --
> In the meantime, here is your PSA:
> "It's so hard to write a graphics driver that open-sourcing it would not help."
>    -- Andrew Fear, Software Product Manager, NVIDIA Corporation
> http://kerneltrap.org/node/7228
>
>




