Options for selective logging

Peter Booth peter_booth at me.com
Tue Sep 19 21:23:00 UTC 2017


What is your ultimate goal? You say that you want to replay 0.05% of traffic into a test environment.
Do you want to capture real-world data as a one-off or on an ongoing basis?

You say that this particular proxy is very busy. How busy? Is it hosted on a physical host or a virtual machine?
If physical, do you own the physical environment? If you do, then you can capture the entire traffic with a network tap or by configuring a SPAN (mirror) port on your switch, without affecting your proxy.

Assuming you don't have that control, I see from the docs that the access_log directive in ngx_http_log_module has an if= parameter: a request is not logged when the condition evaluates to "0" or an empty string. This implies that if you define a variable derived from the request ID or the time in milliseconds, reduced mod X, then you can sample by request or by time.
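
A minimal sketch of that idea, untested and based on my reading of the docs. Rather than computing a modulus by hand, it leans on split_clients (ngx_http_split_clients_module) to hash $request_id and select about 0.05% of requests. The variable name $sample_log and the log path are made up for illustration, and $request_id needs nginx 1.11.0 or later.

```nginx
# http context: hash the per-request ID and mark ~0.05% of requests.
# $sample_log is a hypothetical variable name.
split_clients "${request_id}" $sample_log {
    0.05%   1;   # sampled: condition is non-zero, so the request is logged
    *       0;   # everything else: "0" suppresses logging
}

server {
    listen 80;

    location / {
        # Hypothetical log path; only requests with $sample_log = 1 are written.
        access_log /var/log/nginx/sample.log combined if=$sample_log;
    }
}
```

Because the selection is by hash rather than by a counter, you get roughly 1 in 2000 requests, not exactly every 2000th. To capture the POST bodies you would likely need your own log_format that includes $request_body in place of combined.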


> On Sep 19, 2017, at 2:17 PM, mblancett <nginx-forum at forum.nginx.org> wrote:
> 
> I am looking for ways to target every Nth request into a very busy proxy
> within an nginx configuration. This particular proxy is extremely busy and
> receives POSTs to a single URI, and taking an approach like sharding by IP
> would not be the kind of traffic sample we’re after.
> 
> The long term goal here is to replay some small amount (like 0.05%) of
> requests into a separate test environment. Currently I’m logging the entire
> request to ramdisk and using an every minute logrotation script in python to
> get the small proportion of requests I need, then using python ‘requests’ to
> replay them against the separate environment. This works, but the proxy
> underperforms its neighbors in the dns pool noticeably, and the RAM
> requirement is just too high for this to be sustainable long-term.
> 
> I’d much prefer to find some way to have nginx only log the data that is
> necessary. I’ve seen that there is an http_mirror command that came out very
> recently which is nearly perfect for my needs, but that leaves the problem
> of only mirroring a percentage of the traffic.
> 
> Thanks for your suggestions.
> 
> Posted at Nginx Forum: https://forum.nginx.org/read.php?2,276452,276452#msg-276452

