Log Parsing - Near Real Time

Neil Mckee neil.mckee.ca at gmail.com
Fri Aug 5 00:29:54 UTC 2011


On Aug 4, 2011, at 12:17 PM, Dennis Jacobfeuerborn wrote:

> The problem is that sFlow currently lacks practical documentation and a library that can be used to develop agents and collectors.

The http://sflow.org website has some overview documentation, links to developer tools, plus agent and collector source code.

> It took me a while to realize why I couldn't find a collector daemon that I could set up to use with the sflowtool or sFlowTrend. These tools *are* the collectors. I would have expected to find some kind of management daemon akin to the snmp world.

Open-source collectors include Ganglia, ntop and pmacct, but there are so many possible applications for this data that no single collector package is going to encompass them all. I think there may be some collectors that load data into relational database tables, but that may not always make sense for what you want to do. Hence the starting point is usually a C, Perl or Java decoder that unpacks the data for you as a real-time feed; then you are free to do whatever you want.

> sFlow looks really interesting but it is unnecessarily obscure and the developer resources could use a face lift and present things in a way that introduces concepts, terms and methodology to newcomers.
> Put out a C library with an agent and a collector API and throw the code on github and you no doubt will see a pickup in interest from developers.

Each agent is very different because it gets embedded in the device or application that it is monitoring, so the best thing is probably just to list some of the open-source projects:

http://nginx-sflow-module.googlecode.com
http://mod-sflow.googlecode.com
https://github.com/sflow/memcached
http://host-sflow.sourceforge.net
http://openvswitch.org

Perhaps there should be a page on sflow.org with this list? Would that have been helpful for you?

If you have more questions about sFlow in general then it might make sense to post them to the sFlow mailing list instead:
http://groups.google.com/group/sflow

But getting back to the question of real-time monitoring of nginx servers, the nginx-sflow-module is a complete sFlow agent that offers centralized, real-time monitoring of large clusters. sflowtool can turn the feed into a piped stream of ASCII CLF (common log format) data, so it represents one way to avoid all that log-file tailing.
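
To make the "piped stream of CLF" idea concrete, here is a minimal Python sketch of a script that could sit on the receiving end of that pipe. It is only an illustration under stated assumptions: the function name parse_clf_line and the field names are made up for this example, the regex covers the plain common log format only (extra trailing fields are ignored), and the exact sflowtool option that produces CLF output should be checked against the sflowtool usage text for your version.

```python
import re
import sys

# Common log format: host ident authuser [date] "request" status bytes
# (Assumption: the feed uses this layout; trailing fields are ignored.)
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_clf_line(line):
    """Return a dict of CLF fields, or None if the line does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    # A "-" in the size column means no body was sent; treat it as 0 bytes.
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields


if __name__ == "__main__":
    # Read records one line at a time as they arrive on the pipe.
    for raw_line in sys.stdin:
        record = parse_clf_line(raw_line)
        if record is None:
            continue  # skip anything that is not a CLF record
        # Replace this print with a database insert, a counter update, etc.
        print(record["time"], record["status"], record["request"], flush=True)
```

Saved under a hypothetical name such as clf_feed.py, it would be run as "sflowtool <CLF output option> | python clf_feed.py". Because it only reads stdin, the same script also works behind "tail -F access.log" if you want to compare the two approaches.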

Neil


> 
> Regards,
>  Dennis
> 
> On 08/04/2011 06:59 PM, Neil Mckee wrote:
>> Not sure what you mean about sFlow needing to be open source? Here are
>> links to the relevant open-source projects:
>> 
>> http://nginx-sflow-module.googlecode.com
>> http://host-sflow.sourceforge.net
>> http://www.inmon.com/technology/sflowTools.php
>> 
>> With a more complete "developer resources" description here:
>> http://blog.sflow.com/2010/01/developer-resources.html
>> 
>> If you use sflowtool to turn sFlow-HTTP into common-log format at the
>> collector, that opens up a whole ecosystem of open-source
>> perl/python/bash/PHP tools for the analysis, such as AWStats.
>> http://awstats.sourceforge.net/
>> 
>> The sFlow-HTTP feed also sends performance counters every N seconds. I
>> don't yet know of an open-source adaptor to feed that into something like
>> Nagios, Ganglia or Graphite, but I know there are options to do that with
>> the sFlow-HOST performance counters so it shouldn't be hard to add. In
>> fact, Ganglia now has native support for the sFlow-HOST counters.
>> http://ganglia.info/?p=430
>> 
>> This sFlow-HOST (http://host-sflow.sourceforge.net) part is helpful because
>> it provides telemetry on the underlying CPU/mem/disk/network stats in a
>> light and scalable way, and supports zero-config (DNS-SD) to make sFlow
>> easier to roll out on a large cluster/farm.
>> 
>> Neil
>> 
>> 
>> On Aug 1, 2011, at 6:09 PM, SplitIce wrote:
>> 
>>> sflow would be great if it was open source and had an easily customizable
>>> server (perl/python/bash or PHP)
>>> 
>>> On Tue, Aug 2, 2011 at 5:08 AM, Harold Sinclair <haroldsinclair at gmail.com> wrote:
>>> 
>>>    I cobbled something like this together with open source tools and
>>>    have been using it on hundreds of servers.. pls contact me offline if
>>>    you'd like a copy :)
>>> 
>>>    -Harold
>>> 
>>> 
>>>    On Mon, Aug 1, 2011 at 2:57 PM, Dennis Jacobfeuerborn <dennisml at conversis.de> wrote:
>>> 
>>>        An alternative is to tail -F (aka. "--follow=name --retry") the
>>>        log file and pipe the output into a script. This allows you to
>>>        parse the entries as they come in and rotate the log file as
>>>        often as you want independently of the parsing script.
>>> 
>>>        Regards,
>>>        Dennis
>>> 
>>>        On 08/01/2011 04:57 PM, Randy Parker wrote:
>>> 
>>>            My app has a request that opens the log file, fseeks to the
>>>            end, backs up as many bytes as it takes to get to the size
>>>            the log file was on the last similar request by that user,
>>>            and runs a regex over the novel part to get interesting
>>>            metrics before closing the file. Since this happens less
>>>            than once per minute, I have not done anything fancy to
>>>            optimize.
>>> 
>>>            - Randy
>>> 
>>>            On Mon, Aug 1, 2011 at 10:39 AM, Reinis Rozitis <r at roze.lv> wrote:
>>> 
>>>            I'm looking for a near real-time script to parse log files and
>>>            insert interesting data into a db.
>>>            Does anyone know of an existing script to do this?
>>> 
>>> 
>>>            You can check/try http://www.splunk.com
>>> 
>>>            rr
>>> 
>>> 
>>>            ___________________________________________________
>>>            nginx mailing list
>>>            nginx at nginx.org
>>>            http://mailman.nginx.org/mailman/listinfo/nginx
>>> 
>>> 
>>> 
>>> 
>>>            --
>>>            http://mobiledyne.com <http://mobiledyne.com/>
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
>> 
>> 
> 


