<div dir="ltr">Hello, Joshua. Thanks a lot for your explanation,<br><br>Agentzh says in the documentation about testing Nginx and OpenResty<br><br> <a href="https://openresty.gitbooks.io/programming-openresty/content/" target="_blank">https://openresty.gitbooks.<wbr>io/programming-openresty/<wbr>content/</a><div class="gmail_quote"><br>that there are two different Perl modules Test::Nginx. It seems that the one you were using isn't the same one I supposed you were using. I think it isn't a good idea having given the same name to two different Perl modules :-(<br><br></div><div class="gmail_quote">Oscar<br></div><div class="gmail_quote"><br><br></div><div class="gmail_extra"><div class="gmail_quote">On Tue, Jan 3, 2017 at 10:01 PM, Joshua Spence <span dir="ltr"><<a href="mailto:josh@joshuaspence.com" target="_blank">josh@joshuaspence.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">%%TEST_GLOBALS%% and %%TEST_GLOBALS_HTTP%% are macros inserted by Test::Nginx. I don't know if it is the same Test::Nginx from CPAN, it is vendored in the nginx-tests repository.</div><div class="gmail_extra"><span class="gmail-m_-4613019892692904241gmail-"><br clear="all"><div><div class="gmail-m_-4613019892692904241gmail-m_-4270344897646853370gmail_signature"><font size="4"><b>Joshua Spence</b></font><br><br><div><b>mobile:</b> <a href="tel:+61%20422%20990%20263" value="+61422990263" target="_blank">+61 422 990 263</a></div><b>email:</b> <a href="mailto:josh@joshuaspence.com" target="_blank">josh@joshuaspence.com</a><br><div style="padding:0px;margin-left:0px;margin-top:0px;overflow:hidden;color:black;font-size:10px;text-align:left;line-height:130%"></div><div><b>web:</b> <a href="http://www.joshuaspence.com" target="_blank">http://www.joshuaspence.c<wbr>om</a></div></div></div>
<br></span><div><div class="gmail-m_-4613019892692904241gmail-h5"><div class="gmail_quote">On 3 January 2017 at 23:57, oscaretu . <span dir="ltr"><<a href="mailto:oscaretu@gmail.com" target="_blank">oscaretu@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><div><div><div>Hello, Joshua.<br><br></div>What values take %%TEST_GLOBALS%% and %%TEST_GLOBALS_HTTP%%?<br><br></div>I've searched them in Google and I didn't find anything related. Nor I was unable to find anything about them in the documentation for Test::Nginx (<a href="http://search.cpan.org/%7Eagent/Test-Nginx-0.25/lib/Test/Nginx.pm" target="_blank">http://search.cpan.org/~agent<wbr>/Test-Nginx-0.25/lib/Test/Ngin<wbr>x.pm</a>) neither in <a href="https://openresty.gitbooks.io/programming-openresty/content/" target="_blank">https://openresty.gitbooks.io/<wbr>programming-openresty/content/</a><br><br></div>Kind regards,<br></div>Oscar<br><div><div><br><br></div></div></div><div class="gmail_extra"><br><div class="gmail_quote"><div><div class="gmail-m_-4613019892692904241gmail-m_-4270344897646853370h5">On Tue, Jan 3, 2017 at 12:14 PM, Joshua Spence <span dir="ltr"><<a href="mailto:josh@joshuaspence.com" target="_blank">josh@joshuaspence.com</a>></span> wrote:<br></div></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div class="gmail-m_-4613019892692904241gmail-m_-4270344897646853370h5"><div dir="ltr">I tested this change with the following script:<div><br></div><div>```</div><div><div>#!/usr/bin/perl</div><div><br></div><div>##############################<wbr>##############################<wbr>###################</div><div><br></div><div>use warnings;</div><div>use strict;</div><div><br></div><div>use Test::More;</div><div><br></div><div>BEGIN { use FindBin; chdir($FindBin::Bin); }</div><div><br></div><div>use lib 'lib';</div><div>use Test::Nginx;</div><div><br></div><div>##############################<wbr>##############################<wbr>###################</div><div><br></div><div>select STDERR; $| = 1;</div><div>select STDOUT; $| = 1;</div><div><br></div><div>my $t = Test::Nginx->new()</div><div> ->has(qw/http limit_req/)</div><div> ->plan(2)</div><div> ->write_file_expand('nginx.con<wbr>f', <<'EOF');</div><div><br></div><div>%%TEST_GLOBALS%%</div><div><br></div><div>daemon off;</div><div><br></div><div>events {}</div><div><br></div><div>http {</div><div> %%TEST_GLOBALS_HTTP%%</div><div><br></div><div> limit_req_zone $binary_remote_addr zone=one:1m rate=1r/m;</div><div> limit_req zone=one;</div><div><br></div><div> server {</div><div> listen <a href="http://127.0.0.1:8080" target="_blank">127.0.0.1:8080</a>;</div><div> server_name localhost;</div><div><br></div><div> location / {</div><div> error_log stderr;</div><div> }</div><div> }</div><div>}</div><div><br></div><div>EOF</div><div><br></div><div>open OLDERR, '>&', \*STDERR;</div><div>open STDERR, '>', $t->testdir() . '/stderr' or die "Can't reopen STDERR: $!";</div><div>open my $stderr, '<', $t->testdir() . '/stderr' or die "Can't open stderr file: $!";</div><div><br></div><div>$t->run();</div><div><br></div><div>open STDERR, '>&', \*OLDERR;</div><div><br></div><div>##############################<wbr>##############################<wbr>###################</div><div><br></div><div># charge limit_req</div><div>http_get('/');</div><div><br></div><div>http_get('/"foo bar"');</div><div>is(lines($t, 'request: "GET /\"foo bar\" HTTP/1.0"'), 1);</div><div><br></div><div>http_get_with_referrer('/', '<a href="https://www.google.com.au/search?hl=en&q=" target="_blank">https://www.google.com.au/sea<wbr>rch?hl=en&q=</a>"BauGutachter" +41 2016 .ch&num=100&start=0');</div><div>is(lines($t, 'referrer: "<a href="https://www.google.com.au/search?hl=en&q=%5C" target="_blank">https://www.google.com.au/sea<wbr>rch?hl=en&q=\</a>"BauGutachter\" +41 2016 .ch&num=100&start=0"'), 1);</div><div><br></div><div>##############################<wbr>##############################<wbr>###################</div><div><br></div><div>sub http_get_with_referrer {</div><div> my ($url, $referer) = @_;</div><div> return http(<<EOF);</div><div>GET $url HTTP/1.0</div><div>Host: localhost</div><div>Referer: $referer</div><div><br></div><div>EOF</div><div>}</div><div><br></div><div>sub lines {</div><div> my ($t, $pattern) = @_;</div><div> return map { $_ =~ /\Q$pattern\E/ } (<$stderr>);</div><div>}</div><div><br></div><div>##############################<wbr>##############################<wbr>###################</div></div><div>```</div></div><div class="gmail_extra"><br clear="all"><div><div class="gmail-m_-4613019892692904241gmail-m_-4270344897646853370m_-5120982133995323042m_3273610685956422162gmail_signature"><font size="4"><b>Joshua Spence</b></font><br><br><div><b>mobile:</b> <a href="tel:+61%20422%20990%20263" value="+61422990263" target="_blank">+61 422 990 263</a></div><b>email:</b> <a href="mailto:josh@joshuaspence.com" target="_blank">josh@joshuaspence.com</a><br><div style="padding:0px;margin-left:0px;margin-top:0px;overflow:hidden;color:black;font-size:10px;text-align:left;line-height:130%"></div><div><b>web:</b> <a href="http://www.joshuaspence.com" target="_blank">http://www.joshuaspence.c<wbr>om</a></div></div></div>
<br><div class="gmail_quote">On 3 January 2017 at 17:50, Joshua Spence <span dir="ltr"><<a href="mailto:josh@joshuaspence.com" target="_blank">josh@joshuaspence.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span># HG changeset patch<br>
# User Joshua Spence <<a href="mailto:josh@joshuaspence.com" target="_blank">josh@joshuaspence.com</a>><br>
# Date 1483420178 -39600<br>
# Tue Jan 03 16:09:38 2017 +1100<br>
</span># Branch error_log_escape<br>
# Node ID 30a9ce42aadfcf135f3ad95a296c95<wbr>e7044d5f9d<br>
<span># Parent a42afc225e98afe1f7c3b428c81c8a<wbr>f7eba362c0<br>
Escape strings produced by the `%V` format specification<br>
<br>
Currently nginx error logs loosely adhere to the following pattern:<br>
<br>
```<br>
%{TIMESTAMP} [%{SEVERITY}] %{PROCESS_ID}#%{THREAD_ID}:( *%{CONNECTION_ID})?%{MESSAGE}(<wbr>, %{KEY}: "%{VALUE}")*<br>
```<br>
<br>
In our environment we are using Filebeat to ship nginx error logs into Logstash. Our Logstash filter for parsing nginx error logs is as follows:<br>
<br>
```<br>
filter {<br>
grok {<br>
match => {<br>
"message" => "(?<timestamp>%{YEAR}/%{MONTHN<wbr>UM2}/%{MONTHDAY} %{TIME}) \[%{LOGLEVEL:severity}\] %{POSINT:process_id:int}#%{NON<wbr>NEGINT:thread_id:int}:(?: \*%{NONNEGINT:connection_id:in<wbr>t})? (?<message>[^,]*)(?:, %{GREEDYDATA:nginx_keyvalue_da<wbr>ta})?"<br>
}<br>
overwrite => ["message"]<br>
}<br>
<br>
date {<br>
match => ["timestamp", "yyyy/MM/dd HH:mm:ss"]<br>
remove_field => ["timestamp"]<br>
}<br>
<br>
kv {<br>
field_split => ", "<br>
remove_field => "nginx_keyvalue_data"<br>
source => "nginx_keyvalue_data"<br>
value_split => ": "<br>
}<br>
}<br>
```<br>
<br>
The problem, however, is that it is difficult to unambigiously parse the key-value data from the nginx error log format because of a lack of data sanitization. Specifically, consider the following entry from our nginx error logs:<br>
<br>
```<br>
2016/11/15 03:45:51 [error] 23770#23770: *21535000 open() "/REDACTED/+"BauGutachter"++41<wbr>+2016+.ch&tbo=1" failed (2: No such file or directory), client: REDACTED, server: REDACTED, request: "GET /job/+%22BauGutachter%22++41+2<wbr>016+.ch&tbo=1 HTTP/1.1", host: "REDACTED", referrer: "<a href="https://www.google.com.au/search?hl=en&q=" rel="noreferrer" target="_blank">https://www.google.com.au/sea<wbr>rch?hl=en&q=</a>"BauGutachter" +41 2016 .ch&num=100&start=0"<br>
```<br>
<br>
Specifically, the problem in the above sample is the `referrer` field, the value of which contains double quotes. I think that it makes sense to escape `\` and `"` characters before they are written to the error log.<br>
<br>
</span>diff -r a42afc225e98 -r 30a9ce42aadf src/core/ngx_string.c<br>
<div class="gmail-m_-4613019892692904241gmail-m_-4270344897646853370m_-5120982133995323042m_3273610685956422162HOEnZb"><div class="gmail-m_-4613019892692904241gmail-m_-4270344897646853370m_-5120982133995323042m_3273610685956422162h5">--- a/src/core/ngx_string.c Tue Dec 27 17:23:08 2016 +0300<br>
+++ b/src/core/ngx_string.c Tue Jan 03 16:09:38 2017 +1100<br>
@@ -15,7 +15,7 @@<br>
const u_char *basis, ngx_uint_t padding);<br>
static ngx_int_t ngx_decode_base64_internal(ngx<wbr>_str_t *dst, ngx_str_t *src,<br>
const u_char *basis);<br>
-<br>
+static uintptr_t ngx_escape_string(u_char *dst, u_char *src, size_t size);<br>
<br>
void<br>
ngx_strlow(u_char *dst, u_char *src, size_t n)<br>
@@ -235,7 +235,7 @@<br>
v = va_arg(args, ngx_str_t *);<br>
<br>
len = ngx_min(((size_t) (last - buf)), v->len);<br>
- buf = ngx_cpymem(buf, v->data, len);<br>
+ buf = (u_char *) ngx_escape_string(buf, v->data, len);<br>
fmt++;<br>
<br>
continue;<br>
@@ -1844,6 +1844,43 @@<br>
}<br>
<br>
<br>
+static uintptr_t<br>
+ngx_escape_string(u_char *dst, u_char *src, size_t size)<br>
+{<br>
+ u_char ch;<br>
+ ngx_uint_t len;<br>
+<br>
+ if (dst == NULL) {<br>
+ len = 0;<br>
+<br>
+ while (size) {<br>
+ ch = *src++;<br>
+<br>
+ if (ch == '\\' || ch == '"') {<br>
+ len++;<br>
+ }<br>
+<br>
+ size--;<br>
+ }<br>
+<br>
+ return (uintptr_t) len;<br>
+ }<br>
+<br>
+ while (size) {<br>
+ ch = *src++;<br>
+<br>
+ if (ch == '\\' || ch == '"') {<br>
+ *dst++ = '\\';<br>
+ }<br>
+<br>
+ *dst++ = ch;<br>
+ size--;<br>
+ }<br>
+<br>
+ return (uintptr_t) dst;<br>
+}<br>
+<br>
+<br>
void<br>
ngx_str_rbtree_insert_value(n<wbr>gx_rbtree_node_t *temp,<br>
ngx_rbtree_node_t *node, ngx_rbtree_node_t *sentinel)<br>
</div></div></blockquote></div><br></div>
<br></div></div>______________________________<wbr>_________________<br>
nginx-devel mailing list<br>
<a href="mailto:nginx-devel@nginx.org" target="_blank">nginx-devel@nginx.org</a><br>
<a href="http://mailman.nginx.org/mailman/listinfo/nginx-devel" rel="noreferrer" target="_blank">http://mailman.nginx.org/mailm<wbr>an/listinfo/nginx-devel</a><span class="gmail-m_-4613019892692904241gmail-m_-4270344897646853370HOEnZb"><font color="#888888"><br></font></span></blockquote></div><span class="gmail-m_-4613019892692904241gmail-m_-4270344897646853370HOEnZb"><font color="#888888"><br><br clear="all"><br>-- <br><div class="gmail-m_-4613019892692904241gmail-m_-4270344897646853370m_-5120982133995323042gmail_signature">Oscar Fernandez Sierra<br><a href="mailto:oscaretu@gmail.com" target="_blank">oscaretu@gmail.com</a></div>
</font></span></div>
<br>______________________________<wbr>_________________<br>
nginx-devel mailing list<br>
<a href="mailto:nginx-devel@nginx.org" target="_blank">nginx-devel@nginx.org</a><br>
<a href="http://mailman.nginx.org/mailman/listinfo/nginx-devel" rel="noreferrer" target="_blank">http://mailman.nginx.org/mailm<wbr>an/listinfo/nginx-devel</a><br></blockquote></div><br></div></div></div>
<br>______________________________<wbr>_________________<br>
nginx-devel mailing list<br>
<a href="mailto:nginx-devel@nginx.org" target="_blank">nginx-devel@nginx.org</a><br>
<a href="http://mailman.nginx.org/mailman/listinfo/nginx-devel" rel="noreferrer" target="_blank">http://mailman.nginx.org/mailm<wbr>an/listinfo/nginx-devel</a><br></blockquote></div><br><br clear="all"><br>-- <br><div class="gmail-m_-4613019892692904241gmail_signature">Oscar Fernandez Sierra<br><a href="mailto:oscaretu@gmail.com" target="_blank">oscaretu@gmail.com</a></div>
</div></div>