From morgan at doveria.com Wed Mar 2 13:55:05 2022
From: morgan at doveria.com (Morgan Kisienya)
Date: Wed, 2 Mar 2022 16:55:05 +0300
Subject: Session Persistence
Message-ID:

Hi,

We are running open-source nginx with ModSecurity. nginx is a proxy server.

We are also running an application (which we proxy through nginx) that
creates reports and downloads images.

We are facing an issue with nginx session persistence.

During report creation, not all images are downloaded to the report. When
the page is refreshed, images different from the initial ones are
displayed.

The nginx access.log shows the following:

GET /prod/reportImage?rnd=1661411659&image=img_0_0_5 HTTP/1.1" 500 1692

The ModSecurity log shows the following:

!doctype html>HTTP Status 500 – Internal Server Error

HTTP Status 500 – Internal Server Error


Type Exception Report

Message No JasperPrint documents found on the HTTP session.

Description The server encountered an unexpected condition that prevented it from fulfilling the request.

Exception

javax.servlet.ServletException: *No JasperPrint documents found on the HTTP session.*
    net.sf.jasperreports.j2ee.servlets.ImageServlet.service(ImageServlet.java:95)
    javax.servlet.http.HttpServlet.service(HttpServlet.java:742)
    org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
    com.ltc.app.server.ClickjackFilter.doFilter(ClickjackFilter.java:117)
    org.apache.logging.log4j.web.Log4jServletFilter.doFilter(Log4jServletFilter.java:71)

Note The full stack trace of the root cause is available in the server logs.


Apache Tomcat/8.5.41

Appreciate your help

*Morgan Kisienya*
*Managed Security Services*
*PO Box 139 Wahroonga NSW 2076*
*Mobile: +254 733 698 394*
*Web: www.doveria.com Email: morgan at doveria.com*

The content of this email is confidential and intended for the recipient
specified in message only. It is strictly forbidden to share any part of
this message with any third party without a written consent of the sender.
If you received this message by mistake, please reply to this message and
follow with its deletion, so that we can ensure such a mistake does not
occur in the future. Doveria puts the security of the client at a high
priority. Therefore, we have put efforts into ensuring that the message is
error and virus-free. Unfortunately, full security of the email cannot be
ensured as, despite our efforts, the data included in emails could be
infected, intercepted, or corrupted. Therefore, the recipient should check
the email for threats with proper software, as the sender does not accept
liability for any damage inflicted by viewing the content of this email.
Please do not print this email unless it is necessary. Every un-printed
email helps the environment.

From mdounin at mdounin.ru Wed Mar 2 15:37:55 2022
From: mdounin at mdounin.ru (Maxim Dounin)
Date: Wed, 2 Mar 2022 18:37:55 +0300
Subject: Session Persistence
In-Reply-To:
References:
Message-ID:

Hello!

On Wed, Mar 02, 2022 at 04:55:05PM +0300, Morgan Kisienya wrote:

> We are running open-source nginx with ModSecurity. nginx is a proxy server.
>
> We are also running an application (which we proxy through nginx) that
> creates reports and downloads images.
>
> We are facing an issue with nginx session persistence.
>
> During report creation, not all images are downloaded to the report. When
> the page is refreshed, images different from the initial ones are
> displayed.

The nginx-devel@ mailing list is about nginx development.
Please refrain from posting user-level questions to it. Instead,
please use the nginx@ mailing list, which is designated for
user-level questions.

Thank you for understanding.

--
Maxim Dounin
http://mdounin.ru/

From morgan at doveria.com Wed Mar 2 15:40:14 2022
From: morgan at doveria.com (Morgan Kisienya)
Date: Wed, 2 Mar 2022 18:40:14 +0300
Subject: Session Persistence
In-Reply-To:
References:
Message-ID:

Thanks for the guidance.

*Morgan Kisienya*

From luhliari at redhat.com Thu Mar 3 01:11:26 2022
From: luhliari at redhat.com (Lubos Uhliarik)
Date: Thu, 3 Mar 2022 02:11:26 +0100
Subject: [PATCH] Fix resource leak - sockaddr is not properly freed
Message-ID:

# HG changeset patch
# User Lubos Uhliarik
# Date 1646269812 -3600
#      Thu Mar 03 02:10:12 2022 +0100
# Node ID 317e1e4b0c7343c49e0e13fc59ac75a565521b67
# Parent  a736a7a613ea6e182ff86fbadcb98bb0f8891c0b
Fix resource leak - sockaddr is not properly freed

sockaddr variable is allocated by ngx_resolver_calloc function but then it is
going out of scope leaking the storage it points to.

diff -r a736a7a613ea -r 317e1e4b0c73 src/core/ngx_resolver.c
--- a/src/core/ngx_resolver.c	Tue Feb 08 17:35:27 2022 +0300
+++ b/src/core/ngx_resolver.c	Thu Mar 03 02:10:12 2022 +0100
@@ -4260,6 +4260,8 @@
     }
 #endif
 
+    ngx_resolver_free(r, sockaddr);
+
     return dst;
 }

From marcus.ball at live.com Thu Mar 3 19:04:21 2022
From: marcus.ball at live.com (Marcus Ball)
Date: Thu, 3 Mar 2022 14:04:21 -0500
Subject: [PATCH 0 of 1] Fix for Nginx hanging on systems without EPOLLRDHUP
Message-ID:

Hello,

I recently encountered an issue where Nginx would hang for a very long
time, if not indefinitely, on responses which exceeded the FastCGI
buffer size (> ~4000 bytes) from an upstream source which, in this case,
was PHP-FPM. This issue appeared to only be happening on DigitalOcean's
App Platform service; I couldn't reproduce it locally. After a lot of
testing and digging around, I eventually tracked it back to
DigitalOcean's system not supporting the `EPOLLRDHUP` event. After much
debugging and reading through Nginx's source code, I believe I found the
source to be two conditions which were missing a check for
`ngx_use_epoll_rdhup`. I made the changes and rebuilt nginx and
everything appears to be working fine now.

If anyone needs to reproduce the issue, I've published a basic example
at https://github.com/marcusball/nginx-epoll-bug. There are also
corresponding Docker Hub images which should be able to demonstrate an
example project with the bug and with the fix if they are deployed to
App Platform: `marcusball/nginx-rdhup-bug:without-fix` and
`marcusball/nginx-rdhup-bug:with-fix` respectively.

This is my first time contributing to Nginx, as well as the first time
trying to contribute via mailing list, so let me know if anything else
is needed. I also can't get the Mercurial Patchbomb extension working,
so I'm sending this manually; my apologies if anything gets formatted
incorrectly.

Marcus Ball

From marcus.ball at live.com Thu Mar 3 19:07:27 2022
From: marcus.ball at live.com (Marcus Ball)
Date: Thu, 3 Mar 2022 14:07:27 -0500
Subject: [PATCH 1 of 1] Fix for Nginx hanging on systems without EPOLLRDHUP
In-Reply-To:
References:
Message-ID:

# HG changeset patch
# User Marcus Ball
# Date 1646329482 18000
#      Thu Mar 03 12:44:42 2022 -0500
# Node ID 395afc438f3ed064f78c6d8f1c3e5abe4d6294fc
# Parent  a736a7a613ea6e182ff86fbadcb98bb0f8891c0b
Add missing `ngx_use_epoll_rdhup` condition.

This fixes an issue where Nginx hangs when a response exceeding the
page size is being sent from an upstream server with fastcgi_buffering
enabled (such as from PHP-FPM). This issue occurs on systems which
support epoll, but which do not support EPOLLRDHUP, like
Digital Ocean's App Platform.

diff -r a736a7a613ea -r 395afc438f3e src/os/unix/ngx_readv_chain.c
--- a/src/os/unix/ngx_readv_chain.c	Tue Feb 08 17:35:27 2022 +0300
+++ b/src/os/unix/ngx_readv_chain.c	Thu Mar 03 12:44:42 2022 -0500
@@ -55,7 +55,9 @@
 
 #if (NGX_HAVE_EPOLLRDHUP)
 
-    if (ngx_event_flags & NGX_USE_EPOLL_EVENT) {
+    if ((ngx_event_flags & NGX_USE_EPOLL_EVENT)
+        && ngx_use_epoll_rdhup)
+    {
         ngx_log_debug2(NGX_LOG_DEBUG_EVENT, c->log, 0,
                        "readv: eof:%d, avail:%d",
                        rev->pending_eof, rev->available);
diff -r a736a7a613ea -r 395afc438f3e src/os/unix/ngx_recv.c
--- a/src/os/unix/ngx_recv.c	Tue Feb 08 17:35:27 2022 +0300
+++ b/src/os/unix/ngx_recv.c	Thu Mar 03 12:44:42 2022 -0500
@@ -52,7 +52,9 @@
 
 #if (NGX_HAVE_EPOLLRDHUP)
 
-    if (ngx_event_flags & NGX_USE_EPOLL_EVENT) {
+    if ((ngx_event_flags & NGX_USE_EPOLL_EVENT)
+        && ngx_use_epoll_rdhup)
+    {
         ngx_log_debug2(NGX_LOG_DEBUG_EVENT, c->log, 0,
                        "recv: eof:%d, avail:%d",
                        rev->pending_eof, rev->available);

From mdounin at mdounin.ru Sun Mar 6 01:58:16 2022
From: mdounin at mdounin.ru (Maxim Dounin)
Date: Sun, 6 Mar 2022 04:58:16 +0300
Subject: [PATCH] Fix resource leak - sockaddr is not properly freed
In-Reply-To:
References:
Message-ID:

Hello!

On Thu, Mar 03, 2022 at 02:11:26AM +0100, Lubos Uhliarik wrote:

> # HG changeset patch
> # User Lubos Uhliarik
> # Date 1646269812 -3600
> #      Thu Mar 03 02:10:12 2022 +0100
> # Node ID 317e1e4b0c7343c49e0e13fc59ac75a565521b67
> # Parent  a736a7a613ea6e182ff86fbadcb98bb0f8891c0b
> Fix resource leak - sockaddr is not properly freed
>
> sockaddr variable is allocated by ngx_resolver_calloc function but then it is
> going out of scope leaking the storage it points to.
>
> diff -r a736a7a613ea -r 317e1e4b0c73 src/core/ngx_resolver.c
> --- a/src/core/ngx_resolver.c	Tue Feb 08 17:35:27 2022 +0300
> +++ b/src/core/ngx_resolver.c	Thu Mar 03 02:10:12 2022 +0100
> @@ -4260,6 +4260,8 @@
>      }
>  #endif
>
> +    ngx_resolver_free(r, sockaddr);
> +
>      return dst;
>  }

Could you please clarify why you think there is a leak? Note
that sockaddr is referenced in the dst array, which is being
returned.

--
Maxim Dounin
http://mdounin.ru/

From mdounin at mdounin.ru Sun Mar 6 04:05:46 2022
From: mdounin at mdounin.ru (Maxim Dounin)
Date: Sun, 6 Mar 2022 07:05:46 +0300
Subject: [PATCH 0 of 1] Fix for Nginx hanging on systems without EPOLLRDHUP
In-Reply-To:
References:
Message-ID:

Hello!

On Thu, Mar 03, 2022 at 02:04:21PM -0500, Marcus Ball wrote:

> I recently encountered an issue where Nginx would hang for a very long
> time, if not indefinitely, on responses which exceeded the FastCGI
> buffer size (> ~4000 bytes) from an upstream source which, in this case,
> was PHP-FPM. This issue appeared to only be happening on DigitalOcean's
> App Platform service; I couldn't reproduce it locally. After a lot of
> testing and digging around, I eventually tracked it back to
> DigitalOcean's system not supporting the `EPOLLRDHUP` event. After much
> debugging and reading through Nginx's source code, I believe I found the
> source to be two conditions which were missing a check for
> `ngx_use_epoll_rdhup`. I made the changes and rebuilt nginx and
> everything appears to be working fine now.
>
> If anyone needs to reproduce the issue, I've published a basic example
> at https://github.com/marcusball/nginx-epoll-bug. There are also
> corresponding Docker Hub images which should be able to demonstrate an
> example project with the bug and with the fix if they are deployed to
> App Platform: `marcusball/nginx-rdhup-bug:without-fix` and
> `marcusball/nginx-rdhup-bug:with-fix` respectively.

Thanks for the investigation.

The rev->available shouldn't be 0 unless it was set to 0 due to
reading more than (or equal to) the amount of data reported via
ioctl(FIONREAD) during the particular event loop iteration. And
it will be set to -1 again as soon as an additional event is
reported on the socket. That is, it shouldn't hang when epoll()
is working properly and reports all data additionally received
after all the data available at the time of the previous
epoll_wait() return were read by nginx.

I suspect this doesn't work due to issues with DigitalOcean's App
Platform's / gVisor's epoll() emulation layer. Most likely, it
fails to report additional events once nginx reads the amount of
data reported by ioctl(FIONREAD). Or ioctl(FIONREAD) simply
reports an incorrect amount of data (or just 0).

A debug log might be helpful to further investigate what goes on
here. It would be great if you could provide one for additional
analysis.

As far as I understand, the proper workaround for this would be to
compile nginx with --with-cc-opt="-DNGX_HAVE_FIONREAD=0", that is,
with ioctl(FIONREAD) explicitly disabled. Please test if it works
for you.

--
Maxim Dounin
http://mdounin.ru/

From marcus.ball at live.com Mon Mar 7 23:12:41 2022
From: marcus.ball at live.com (Marcus Ball)
Date: Mon, 7 Mar 2022 18:12:41 -0500
Subject: [PATCH 0 of 1] Fix for Nginx hanging on systems without EPOLLRDHUP
In-Reply-To:
References:
Message-ID:

On 3/5/22 23:05, Maxim Dounin wrote:
> Hello!
>
> On Thu, Mar 03, 2022 at 02:04:21PM -0500, Marcus Ball wrote:
>
>> I recently encountered an issue where Nginx would hang for a very long
>> time, if not indefinitely, on responses which exceeded the FastCGI
>> buffer size (> ~4000 bytes) from an upstream source which, in this case,
>> was PHP-FPM. This issue appeared to only be happening on DigitalOcean's
>> App Platform service; I couldn't reproduce it locally. After a lot of
>> testing and digging around, I eventually tracked it back to
>> DigitalOcean's system not supporting the `EPOLLRDHUP` event. After much
>> debugging and reading through Nginx's source code, I believe I found the
>> source to be two conditions which were missing a check for
>> `ngx_use_epoll_rdhup`. I made the changes and rebuilt nginx and
>> everything appears to be working fine now.
>>
>> If anyone needs to reproduce the issue, I've published a basic example
>> at https://github.com/marcusball/nginx-epoll-bug. There are also
>> corresponding Docker Hub images which should be able to demonstrate an
>> example project with the bug and with the fix if they are deployed to
>> App Platform: `marcusball/nginx-rdhup-bug:without-fix` and
>> `marcusball/nginx-rdhup-bug:with-fix` respectively.
> Thanks for the investigation.
>
> The rev->available shouldn't be 0 unless it was set to 0 due to
> reading more than (or equal to) the amount of data reported via
> ioctl(FIONREAD) during the particular event loop iteration. And
> it will be set to -1 again as soon as an additional event is
> reported on the socket. That is, it shouldn't hang when epoll()
> is working properly and reports all data additionally received
> after all the data available at the time of the previous
> epoll_wait() return were read by nginx.
>
> I suspect this doesn't work due to issues with DigitalOcean's App
> Platform's / gVisor's epoll() emulation layer. Most likely, it
> fails to report additional events once nginx reads the amount of
> data reported by ioctl(FIONREAD). Or ioctl(FIONREAD) simply
> reports an incorrect amount of data (or just 0).
>
> A debug log might be helpful to further investigate what goes on
> here. It would be great if you could provide one for additional
> analysis.
>
> As far as I understand, the proper workaround for this would be to
> compile nginx with --with-cc-opt="-DNGX_HAVE_FIONREAD=0", that is,
> with ioctl(FIONREAD) explicitly disabled. Please test if it works
> for you.
>

Hi,

Thank you for the suggestion. I tried using the
--with-cc-opt="-DNGX_HAVE_FIONREAD=0" compilation option, and
that also appeared to fix the issue with the response hanging.
However, I'm not sure that is the best solution.

First, this would be less than ideal for this circumstance,
as needing to compile a custom build would make it a lot more
difficult to use Nginx within off-the-shelf components,
such as Heroku buildpacks (heroku/php), which is how I initially
encountered this problem. As a result, it would greatly reduce
the main selling point of a service like DigitalOcean's App Platform,
which is being able to simply deploy code without needing to worry
about the configuration or maintenance of the server.

More importantly, while disabling FIONREAD appears to address the
issue of the request hanging, I *think* it may be more of a
workaround. I certainly may be mistaken, but I still believe
the patch I submitted addresses an actual bug which should be fixed.

Basically, if I'm understanding everything properly, I think
that `rev->available` is working correctly. I'm going to use
`ngx_readv_chain.c` with FIONREAD enabled as an example:

As Nginx reads the last of the response from upstream (PHP-FPM),
`readv` receives the remaining body (line 121), `rev->available`
is >= 0, so `n` is subtracted and `rev->available = 0` (L# 171).
The problem is that once `ngx_readv_chain` is called again,
the condition at line 63 is encountered:

        if (rev->available == 0 && !rev->pending_eof) {
            return NGX_AGAIN;
        }

`rev->available` and `rev->pending_eof` are both `0`, so
`NGX_AGAIN` is returned. This is essentially where the "hang"
occurs: `rev->available` and `rev->pending_eof` always
remain zero, and Nginx keeps attempting to read again.

The reason I'm fairly confident this is a bug is that,
with the exception of one line in `ngx_kqueue_module` which
I think is not applicable here, `rev->pending_eof` is
only set to `1` in exactly one place:

ngx_epoll_module.c:887:

#if (NGX_HAVE_EPOLLRDHUP)
            if (revents & EPOLLRDHUP) {
                rev->pending_eof = 1;
            }
#endif

Obviously, this line will never be hit if EPOLLRDHUP is not
available. This is why `pending_eof` is always zero and Nginx
will continue to try to read from upstream and never complete
the response. This is why I think adding the check for
`ngx_use_epoll_rdhup`, mirroring other uses of
`rev->pending_eof` (like `ngx_readv_chain:205`), appears
to make sense.

Here are a few different logs. Note that these are the full logs from
the containerized application, so they also include some
Nginx access logs, as well as PHP-FPM logs mixed in, but the
vast majority is the Nginx debug log, which can be differentiated
via the full timestamp.

WITH THE ISSUE: https://pastebin.com/SKerUPX8
Also note, this contains two requests: the first succeeds,
then the second experiences the issue of the response never
completing; it only "completes" when I killed `curl` and
terminated the client connection.

WITH -DNGX_HAVE_FIONREAD=0: https://pastebin.com/xGvxwbLz

WITH MY PATCH: https://pastebin.com/GtmdLviz

Marcus Ball

From luhliari at redhat.com Tue Mar 8 09:33:56 2022
From: luhliari at redhat.com (Lubos Uhliarik)
Date: Tue, 8 Mar 2022 10:33:56 +0100
Subject: [PATCH] Fix resource leak - sockaddr is not properly freed
In-Reply-To:
References:
Message-ID:

Hi Maxim,

I'm sorry, it's my mistake. I missed that

    sin = &sockaddr[d].sockaddr_in;

and

    dst[d].sockaddr = (struct sockaddr *) sin;

Sorry for the confusion.

Best,
Lubos

From robm at fastmail.fm Wed Mar 9 04:19:40 2022
From: robm at fastmail.fm (Robert Mueller)
Date: Wed, 09 Mar 2022 15:19:40 +1100
Subject: Support for OAUTHBEARER and XOAUTH2 in nginx mail proxy module
Message-ID: <665c76f2-e656-4730-b058-8b7765cbc025@www.fastmail.com>

Hi

I've been working on adding support for OAUTHBEARER (RFC7628) and also the legacy XOAUTH2 (the pre-RFC version, still usable at Google, Microsoft and others, and unfortunately still the most commonly supported protocol in client libraries) to the nginx mail proxy module.

Mostly this has been fairly straightforward: it's just adding ngx_*_auth_{oauthbearer,xoauth2} states, constants, handlers, etc. I'm passing the bearer token provided by the client to the backend auth process in the `Auth-Pass` header.

There is one change, however: an additional optional response header from the backend auth process is supported, `Auth-Error-Sasl`. In the failure case, the backend auth server is expected to put a base64 encoded JSON object in this header that conforms to the error reporting in https://datatracker.ietf.org/doc/html/rfc7628#section-3.2.2. If present, the value in this header is prefixed with a `+ ` and returned as the SASL response. We then wait for any line from the client (which we ignore), exit the SASL mode, and return to standard protocol parsing. There's an example of what this looks like in https://datatracker.ietf.org/doc/html/rfc7628#section-4.3

Anyway, I'd appreciate it if someone could look over these changes to see that they all look reasonable and like something that would be accepted back into the nginx upstream.

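The `Auth-Error-Sasl` handling described above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not code from the linked branch: `sasl_error_challenge()` is a hypothetical helper name, and the trailing CRLF is an assumption based on line-oriented mail protocols. Only the `+ ` prefix comes from the message itself.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not from the nginx tree): format the SASL failure
 * challenge from the value of the Auth-Error-Sasl header, which is assumed
 * to be base64 text already. Returns the number of bytes written, not
 * counting the terminating NUL, or -1 if the output buffer is too small.
 */
static int
sasl_error_challenge(const char *auth_error_sasl, char *out, size_t out_len)
{
    /* "+ " prefix as described in the message; CRLF is an assumption. */
    int n = snprintf(out, out_len, "+ %s\r\n", auth_error_sasl);

    if (n < 0 || (size_t) n >= out_len) {
        return -1;
    }

    return n;
}
```

After sending this line, the proxy as described would read and discard one line from the client before reporting the authentication failure in the normal protocol syntax.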
There's a bit of a push within the mail community to try and bring more modern OAUTH2 style authentication to more services (not just the big players), but for that to be possible services need to be able to actually handle OAUTHBEARER/XOAUTH2 authentication, which as a first step means support on the server side in things like nginx, and then obviously adding support in the auth systems behind that.

https://github.com/robmueller/nginx/commits/add-xaouth2-oauthbearer-auth

Thanks in advance

Rob Mueller
robm at fastmail.fm

From ygk.kmr at gmail.com Fri Mar 11 09:18:50 2022
From: ygk.kmr at gmail.com (Gk Gk)
Date: Fri, 11 Mar 2022 14:48:50 +0530
Subject: Need information
Message-ID:

Hi,

We work on cloud platforms and we have recently come across an nginx
vulnerability described at
https://mailman.nginx.org/pipermail/nginx-announce/2021/000300.html?_ga=2.60788846.2132221914.1646979909-1951211776.1640153145

We are using Ubuntu 20.04, which ships nginx 1.18. We are trying to
upgrade nginx to version 1.20.1, where this vulnerability is remediated.
But we need nginx-extras as well, and we can't find an nginx-extras
package for version 1.20. Only 1.18 is available. Can you suggest the
best way to install nginx 1.20.1 with nginx-extras?

Thanks
Kumar

From osa at freebsd.org.ru Fri Mar 11 09:29:58 2022
From: osa at freebsd.org.ru (Sergey A. Osokin)
Date: Fri, 11 Mar 2022 12:29:58 +0300
Subject: Need information
In-Reply-To:
References:
Message-ID:

Hi Kumar,

hope you're doing well.

On Fri, Mar 11, 2022 at 02:48:50PM +0530, Gk Gk wrote:
> Hi,
>
> We work on cloud platforms and we have recently come across an nginx
> vulnerability described at
> https://mailman.nginx.org/pipermail/nginx-announce/2021/000300.html?_ga=2.60788846.2132221914.1646979909-1951211776.1640153145
>
> We are using Ubuntu 20.04, which ships nginx 1.18. We are trying to
> upgrade nginx to version 1.20.1, where this vulnerability is remediated.
> But we need nginx-extras as well, and we can't find an nginx-extras
> package for version 1.20. Only 1.18 is available. Can you suggest the
> best way to install nginx 1.20.1 with nginx-extras?

It seems that CVE-2021-23017 has been fixed in the recent
package update,
http://changelogs.ubuntu.com/changelogs/pool/main/n/nginx/nginx_1.18.0-0ubuntu1.2/changelog

Also, I'd recommend addressing your question to the maintainer of
the corresponding Ubuntu Linux packages.

Hope that helps.

--
Sergey Osokin

From ygk.kmr at gmail.com Fri Mar 11 12:21:21 2022
From: ygk.kmr at gmail.com (Gk Gk)
Date: Fri, 11 Mar 2022 17:51:21 +0530
Subject: Need information
In-Reply-To:
References:
Message-ID:

Thanks Sergey. One question: which package exactly is affected by this CVE?
Is it the base nginx package, nginx-extras, or nginx-common?
Also, where is the affected resolver.c file located on an installed
Ubuntu server?

From osa at freebsd.org.ru Fri Mar 11 15:56:27 2022
From: osa at freebsd.org.ru (Sergey A. Osokin)
Date: Fri, 11 Mar 2022 18:56:27 +0300
Subject: Need information
In-Reply-To:
References:
Message-ID:

That should be the nginx-common package. The resolver.c file
isn't part of the package; it is a C source file, which is
usually not part of a binary distribution.

--
Sergey A. Osokin

From mdounin at mdounin.ru Fri Mar 11 19:59:08 2022
From: mdounin at mdounin.ru (Maxim Dounin)
Date: Fri, 11 Mar 2022 22:59:08 +0300
Subject: [PATCH 0 of 1] Fix for Nginx hanging on systems without EPOLLRDHUP
In-Reply-To:
References:
Message-ID:

Hello!

On Mon, Mar 07, 2022 at 06:12:41PM -0500, Marcus Ball wrote:

> On 3/5/22 23:05, Maxim Dounin wrote:
> > Hello!
> >
> > On Thu, Mar 03, 2022 at 02:04:21PM -0500, Marcus Ball wrote:
> >
> >> I recently encountered an issue where Nginx would hang for a very long
> >> time, if not indefinitely, on responses which exceeded the FastCGI
> >> buffer size (> ~4000 bytes) from an upstream source which, in this case,
> >> was PHP-FPM. This issue appeared to only be happening on DigitalOcean's
> >> App Platform service; I couldn't reproduce it locally. After a lot of
> >> testing and digging around, I eventually tracked it back to
> >> DigitalOcean's system not supporting the `EPOLLRDHUP` event.
> >> After much debugging and reading through Nginx's source code, I
> >> believe I found the source to be two conditions which were missing
> >> a check for `ngx_use_epoll_rdhup`. I made the changes and rebuilt
> >> nginx and everything appears to be working fine now.
> >>
> >> If anyone needs to reproduce the issue, I've published a basic example
> >> at https://github.com/marcusball/nginx-epoll-bug. There are also
> >> corresponding Docker Hub images which should be able to demonstrate an
> >> example project with the bug and with the fix if they are deployed to
> >> App Platform: `marcusball/nginx-rdhup-bug:without-fix` and
> >> `marcusball/nginx-rdhup-bug:with-fix` respectively.
> > Thanks for the investigation.
> >
> > The rev->available shouldn't be 0 unless it was set to 0 due to
> > reading more than (or equal to) the amount of data reported via
> > ioctl(FIONREAD) during the particular event loop iteration. And
> > it will be set to -1 again as soon as an additional event is
> > reported on the socket. That is, it shouldn't hang when epoll()
> > is working properly and reports all data additionally received
> > after all the data available at the time of the previous
> > epoll_wait() return were read by nginx.
> >
> > I suspect this doesn't work due to issues with DigitalOcean's App
> > Platform's / gVisor's epoll() emulation layer. Most likely, it
> > fails to report additional events once nginx reads the amount of
> > data reported by ioctl(FIONREAD). Or ioctl(FIONREAD) simply
> > reports an incorrect amount of data (or just 0).
> >
> > A debug log might be helpful to further investigate what goes on
> > here. It would be great if you could provide one for additional
> > analysis.
> >
> > As far as I understand, the proper workaround for this would be to
> > compile nginx with --with-cc-opt="-DNGX_HAVE_FIONREAD=0", that is,
> > with ioctl(FIONREAD) explicitly disabled. Please test if it works
> > for you.
> >
>
> Hi,
>
> Thank you for the suggestion. I tried using the
> --with-cc-opt="-DNGX_HAVE_FIONREAD=0" compilation option, and
> that also appeared to fix the issue with the response hanging.
> However, I'm not sure that is the best solution.
>
> First, this would be less than ideal for this circumstance,
> as needing to compile a custom build would make it a lot more
> difficult to use Nginx within off-the-shelf components,
> such as Heroku buildpacks (heroku/php), which is how I initially
> encountered this problem. As a result, it would greatly reduce
> the main selling point of a service like DigitalOcean's App Platform,
> which is being able to simply deploy code without needing to worry
> about the configuration or maintenance of the server.
>
> More importantly, while disabling FIONREAD appears to address the
> issue of the request hanging, I *think* it may be more of a
> workaround. I certainly may be mistaken, but I still believe
> the patch I submitted addresses an actual bug which should be fixed.
>
> Basically, if I'm understanding everything properly, I think
> that `rev->available` is working correctly. I'm going to use
> `ngx_readv_chain.c` with FIONREAD enabled as an example:
>
> As Nginx reads the last of the response from upstream (PHP-FPM),
> `readv` receives the remaining body (line 121), `rev->available`
> is >= 0, so `n` is subtracted and `rev->available = 0` (L# 171).
> The problem is that once `ngx_readv_chain` is called again,
> the condition at line 63 is encountered:
>
>         if (rev->available == 0 && !rev->pending_eof) {
>             return NGX_AGAIN;
>         }
>
> `rev->available` and `rev->pending_eof` are both `0`, so
> `NGX_AGAIN` is returned. This is essentially where the "hang"
> occurs: `rev->available` and `rev->pending_eof` always
> remain zero, and Nginx keeps attempting to read again.
>
> The reason I'm fairly confident this is a bug is that,
> with the exception of one line in `ngx_kqueue_module` which
> I think is not applicable here, `rev->pending_eof` is
> only set to `1` in exactly one place:
>
> ngx_epoll_module.c:887:
>
> #if (NGX_HAVE_EPOLLRDHUP)
>             if (revents & EPOLLRDHUP) {
>                 rev->pending_eof = 1;
>             }
> #endif
>
> Obviously, this line will never be hit if EPOLLRDHUP is not
> available. This is why `pending_eof` is always zero and Nginx
> will continue to try to read from upstream and never complete
> the response. This is why I think adding the check for
> `ngx_use_epoll_rdhup`, mirroring other uses of
> `rev->pending_eof` (like `ngx_readv_chain:205`), appears
> to make sense.
>
> Here are a few different logs. Note that these are the full logs from
> the containerized application, so they also include some
> Nginx access logs, as well as PHP-FPM logs mixed in, but the
> vast majority is the Nginx debug log, which can be differentiated
> via the full timestamp.
>
> WITH THE ISSUE: https://pastebin.com/SKerUPX8
> Also note, this contains two requests: the first succeeds,
> then the second experiences the issue of the response never
> completing; it only "completes" when I killed `curl` and
> terminated the client connection.

Thanks. After looking into the debug logs, I think you are right.
Indeed, rev->available can reach 0, stopping further processing,
and if EOF happens to be already reported along with the last
event, it is not reported again by epoll(), leading to a timeout.

I was able to reproduce this on a recent Linux with a simple test
patch which removes EPOLLRDHUP from events, thus emulating older
kernel versions without EPOLLRDHUP support (and the emulation
layer in DigitalOcean's App Platform / gVisor).
I see the following tests failures: t/access_log.t (Wstat: 256 Tests: 21 Failed: 1) Failed test: 19 Non-zero exit status: 1 t/fastcgi.t (Wstat: 256 Tests: 10 Failed: 1) Failed test: 5 Non-zero exit status: 1 t/limit_rate.t (Wstat: 512 Tests: 9 Failed: 2) Failed tests: 6-7 Non-zero exit status: 2 t/ssi_delayed.t (Wstat: 256 Tests: 3 Failed: 1) Failed test: 1 Non-zero exit status: 1 t/proxy_unfinished.t (Wstat: 512 Tests: 17 Failed: 2) Failed tests: 11, 13 Non-zero exit status: 2 t/stream_tcp_nodelay.t (Wstat: 256 Tests: 4 Failed: 1) Failed test: 3 Non-zero exit status: 1 This seems to be an issue introduced in 7583:efd71d49bde0 along with ioctl(FIONREAD) support (nginx 1.17.5). Before the change, rev->available was never set to 0 unless ngx_use_epoll_rdhup was also set (that is, runtime test for EPOLLRDHUP introduced in 6536:f7849bfb6d21 succeeded). With the patch, I still see the following test failures: t/access_log.t (Wstat: 256 Tests: 21 Failed: 1) Failed test: 19 Non-zero exit status: 1 t/limit_rate.t (Wstat: 1280 Tests: 9 Failed: 5) Failed tests: 3-7 Non-zero exit status: 5 These failures seems to be due to race conditions in the tests though, and to be fixed in tests. Below is the patch with the commit log updated to reflect the above details, please take a look. # HG changeset patch # User Marcus Ball # Date 1647028678 -10800 # Fri Mar 11 22:57:58 2022 +0300 # Node ID 16762702cd6e949b448f43f3e209102ef60c7c2e # Parent aec3b1f8ae0c46a032c1bfcbe2c1d89981064993 Fixed runtime handling of systems without EPOLLRDHUP support. In 7583:efd71d49bde0 (nginx 1.17.5) along with introduction of the ioctl(FIONREAD) support proper handling of systems without EPOLLRDHUP support in the kernel (but with EPOLLRDHUP in headers) was broken. Before the change, rev->available was never set to 0 unless ngx_use_epoll_rdhup was also set (that is, runtime test for EPOLLRDHUP introduced in 6536:f7849bfb6d21 succeeded). 
After the change, rev->available might reach 0 on systems without runtime EPOLLRDHUP support, stopping further reading in ngx_readv_chain() and ngx_unix_recv(). And, if EOF happened to be already reported along with the last event, it is not reported again by epoll_wait(), leading to connection hangs and timeouts on such systems. This affects Linux kernels before 2.6.17 if nginx was compiled with newer headers, and, more importantly, emulation layers, such as DigitalOcean's App Platform's / gVisor's epoll emulation layer. Fix is to explicitly check ngx_use_epoll_rdhup before the corresponding rev->pending_eof tests in ngx_readv_chain() and ngx_unix_recv(). diff --git a/src/os/unix/ngx_readv_chain.c b/src/os/unix/ngx_readv_chain.c --- a/src/os/unix/ngx_readv_chain.c +++ b/src/os/unix/ngx_readv_chain.c @@ -55,7 +55,9 @@ ngx_readv_chain(ngx_connection_t *c, ngx #if (NGX_HAVE_EPOLLRDHUP) - if (ngx_event_flags & NGX_USE_EPOLL_EVENT) { + if ((ngx_event_flags & NGX_USE_EPOLL_EVENT) + && ngx_use_epoll_rdhup) + { ngx_log_debug2(NGX_LOG_DEBUG_EVENT, c->log, 0, "readv: eof:%d, avail:%d", rev->pending_eof, rev->available); diff --git a/src/os/unix/ngx_recv.c b/src/os/unix/ngx_recv.c --- a/src/os/unix/ngx_recv.c +++ b/src/os/unix/ngx_recv.c @@ -52,7 +52,9 @@ ngx_unix_recv(ngx_connection_t *c, u_cha #if (NGX_HAVE_EPOLLRDHUP) - if (ngx_event_flags & NGX_USE_EPOLL_EVENT) { + if ((ngx_event_flags & NGX_USE_EPOLL_EVENT) + && ngx_use_epoll_rdhup) + { ngx_log_debug2(NGX_LOG_DEBUG_EVENT, c->log, 0, "recv: eof:%d, avail:%d", rev->pending_eof, rev->available); -- Maxim Dounin http://mdounin.ru/ From okaya at kernel.org Mon Mar 14 20:42:15 2022 From: okaya at kernel.org (Sinan Kaya) Date: Mon, 14 Mar 2022 16:42:15 -0400 Subject: [PATCH] fix -Wsign-conversion warning with gcc 8.2 In-Reply-To: <04bf5348-b669-e8ed-b2e2-74b73a626ffa@kernel.org> References: <04bf5348-b669-e8ed-b2e2-74b73a626ffa@kernel.org> Message-ID: <7b02184b-b8b9-73d9-3d5c-839aa3eb9094@kernel.org> # HG changeset patch # 
User Sinan Kaya # Date 1647289518 14400 # Mon Mar 14 16:25:18 2022 -0400 # Node ID f22520b612969dbfa17205129510927519370000 # Parent a736a7a613ea6e182ff86fbadcb98bb0f8891c0b fix -Wsign-conversion warning with gcc 8.2 Getting compiler warning with -Wsign-conversion. /usr/include/nginx/core/ngx_crc32.h:31:47: warning: conversion to 'uint32_t' {aka 'unsigned int'} from 'int' may change the sign of the result [-Wsign-conversion] 31 | crc = ngx_crc32_table_short[(crc ^ (c >> 4)) & 0xf] ^ (crc >> 4); diff -r a736a7a613ea -r f22520b61296 src/core/ngx_crc32.h --- a/src/core/ngx_crc32.h Tue Feb 08 17:35:27 2022 +0300 +++ b/src/core/ngx_crc32.h Mon Mar 14 16:25:18 2022 -0400 @@ -28,7 +28,8 @@ while (len--) { c = *p++; crc = ngx_crc32_table_short[(crc ^ (c & 0xf)) & 0xf] ^ (crc >> 4); - crc = ngx_crc32_table_short[(crc ^ (c >> 4)) & 0xf] ^ (crc >> 4); + crc = ngx_crc32_table_short[(crc ^ (u_char)(c >> 4)) & 0xf]; + crc = crc ^ (crc >> 4); } return crc ^ 0xffffffff; From vaclav at leadspicker.com Wed Mar 16 15:18:48 2022 From: vaclav at leadspicker.com (=?UTF-8?B?VsOhY2xhdiBOw6FkZW7DrcSNZWs=?=) Date: Wed, 16 Mar 2022 16:18:48 +0100 Subject: [QUIC] current ETA Message-ID: In https://www.nginx.com/blog/our-roadmap-quic-http-3-support-nginx/ (from July 12, 2021) it is stated: "Our current target for completing the code merge into the NGINX mainline branch is the end of 2021 [...]" It seems that it is not currently done yet. What is the current ETA on QUIC and HTTP/3 mainline support? -------------- next part -------------- An HTML attachment was scrubbed... URL: From mdounin at mdounin.ru Wed Mar 16 15:40:42 2022 From: mdounin at mdounin.ru (Maxim Dounin) Date: Wed, 16 Mar 2022 18:40:42 +0300 Subject: [QUIC] current ETA In-Reply-To: References: Message-ID: Hello! 
On Wed, Mar 16, 2022 at 04:18:48PM +0100, Václav Nádeníček via nginx-devel wrote:

> In https://www.nginx.com/blog/our-roadmap-quic-http-3-support-nginx/ (from
> July 12, 2021) it is stated:
> "Our current target for completing the code merge into the NGINX mainline
> branch is the end of 2021 [...]"
>
> It seems that it is not currently done yet. What is the current ETA on QUIC
> and HTTP/3 mainline support?

No ETA.

--
Maxim Dounin
http://mdounin.ru/

From marcus.ball at live.com Wed Mar 16 19:08:06 2022
From: marcus.ball at live.com (Marcus Ball)
Date: Wed, 16 Mar 2022 15:08:06 -0400
Subject: [PATCH 0 of 1] Fix for Nginx hanging on systems without EPOLLRDHUP
In-Reply-To:
References:
Message-ID:

Hi Maxim,

Thank you for taking the time to look further into this and confirm the
issue. The new patch details look great, definitely a better description
than I had. Is there anything else needed from me, or any other help I
can offer, to try to get this patch merged?

From mdounin at mdounin.ru Thu Mar 17 22:02:21 2022
From: mdounin at mdounin.ru (Maxim Dounin)
Date: Fri, 18 Mar 2022 01:02:21 +0300
Subject: [PATCH 0 of 1] Fix for Nginx hanging on systems without EPOLLRDHUP
In-Reply-To:
References:
Message-ID:

Hello!

On Wed, Mar 16, 2022 at 03:08:06PM -0400, Marcus Ball wrote:

> Thank you for taking the time to look further into this and confirm the
> issue. The new patch details look great, definitely a better description
> than I had. Is there anything else needed from me, or any other help I
> can offer, to try to get this patch merged?

Thanks for looking, and thanks for finding this. No further
actions are needed, I'll take care of this (might take some time
due to unrelated reasons though).
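
For anyone following the thread without the nginx source at hand, the hang can be modelled outside nginx. What follows is a minimal standalone sketch in C, not nginx code: `model_event_t`, `model_socket_t`, `model_read()`, `model_recv()`, `model_second_recv()` and `MODEL_AGAIN` are hypothetical names, and the bookkeeping is heavily simplified. Only the `available == 0 && !pending_eof` early return and the idea of guarding it with the runtime EPOLLRDHUP test result are taken from the messages above.

```c
#define MODEL_AGAIN  (-2)    /* stands in for NGX_AGAIN; the value is arbitrary */

/* Hypothetical stand-in for the two event fields discussed in the thread. */
typedef struct {
    int  available;      /* bytes reported readable (as if via FIONREAD) */
    int  pending_eof;    /* set only if the kernel reported EPOLLRDHUP */
} model_event_t;

/* Hypothetical socket: `unread` bytes are queued, then the peer has closed. */
typedef struct {
    int  unread;
} model_socket_t;

/* Behaves like read(2) on a peer-closed socket: 0 once data is drained. */
static int
model_read(model_socket_t *s, int want)
{
    int  n = (s->unread < want) ? s->unread : want;

    s->unread -= n;
    return n;
}

/*
 * One receive attempt.  `use_epoll_rdhup` models the runtime test result;
 * `with_fix` selects whether the early-return check is guarded by it.
 */
static int
model_recv(model_event_t *rev, model_socket_t *s, int want,
    int use_epoll_rdhup, int with_fix)
{
    int  n;

    if (!with_fix || use_epoll_rdhup) {
        if (rev->available == 0 && !rev->pending_eof) {
            return MODEL_AGAIN;    /* wait for an event that never comes */
        }
    }

    n = model_read(s, want);

    if (n > 0 && rev->available >= 0) {
        rev->available -= n;
        if (rev->available < 0) {
            rev->available = 0;
        }
    }

    return n;
}

/*
 * Scenario from the thread: data and EOF arrive with a single event and
 * EPOLLRDHUP is unsupported at runtime, so pending_eof stays 0.
 * Returns the result of the second receive attempt.
 */
int
model_second_recv(int with_fix)
{
    model_event_t   rev = { 100, 0 };
    model_socket_t  sock = { 100 };

    (void) model_recv(&rev, &sock, 4096, 0, with_fix);  /* drains 100 bytes */

    return model_recv(&rev, &sock, 4096, 0, with_fix);
}
```

Under these assumptions, `model_second_recv(0)` returns `MODEL_AGAIN`: the reader waits for a further event, and with edge-triggered notification none arrives. `model_second_recv(1)` returns 0, so the second read sees EOF and the response can complete.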

--
Maxim Dounin
http://mdounin.ru/

From spacewanderlzx at gmail.com Fri Mar 25 07:24:06 2022
From: spacewanderlzx at gmail.com (Zexuan Luo)
Date: Fri, 25 Mar 2022 15:24:06 +0800
Subject: [PATCH] Core: copy a NULL string in memcpy is undefined behavior
Message-ID:

# HG changeset patch
# User Zexuan Luo
# Date 1648192098 -28800
#      Fri Mar 25 15:08:18 2022 +0800
# Node ID a94f838a469ed158e421cbc8187db6ae79153921
# Parent  a736a7a613ea6e182ff86fbadcb98bb0f8891c0b
Core: copy a NULL string in memcpy is undefined behavior

diff -r a736a7a613ea -r a94f838a469e src/core/ngx_string.c
--- a/src/core/ngx_string.c	Tue Feb 08 17:35:27 2022 +0300
+++ b/src/core/ngx_string.c	Fri Mar 25 15:08:18 2022 +0800
@@ -81,7 +81,9 @@
         return NULL;
     }
 
-    ngx_memcpy(dst, src->data, src->len);
+    if (src->len > 0) {
+        ngx_memcpy(dst, src->data, src->len);
+    }
 
     return dst;
 }

From thattommyhall at gmail.com Sat Mar 26 20:55:48 2022
From: thattommyhall at gmail.com (Tom Hall)
Date: Sat, 26 Mar 2022 21:55:48 +0100
Subject: Periodically refreshing an allowlist in njs
Message-ID:

Hi

I am looking to replace some lua scripting we have to do an allowlist,
I found https://medium.com/geekculture/building-a-simple-bot-protection-with-nginx-javascript-module-njs-and-typescript-386b2207ba90
and it seems to give me a good idea how to port my lua

const fs = require('fs');
const badReputationIPs = loadFile('/var/lib/njs/ips.txt');

function loadFile(file: string): string[] {
    let data: string[] = [];
    try {
        data = fs.readFileSync(file).toString().split('\n');
    } catch (e) {
        // unable to read file
    }
    return data;
}

function verifyIP(r: NginxHTTPRequest): void {
    if (badReputationIPs.some((ip: string) => ip === r.remoteAddress)) {
        r.return(302, '/block.html');
        return;
    }
    r.internalRedirect('@pages');
}

export default { verifyIP };

except I have the lua reloading every 60s by calling
ngx.timer.at(60, helpers.load_json) in a init_worker_by_lua

Is there an equivalent in njs? It seems not if you are starting a new
lightweight VM for every request.
Is the answer to do a subrequest and cache it or something?

Thanks,
Tom

From osa at freebsd.org.ru Sat Mar 26 21:45:55 2022
From: osa at freebsd.org.ru (Sergey A. Osokin)
Date: Sun, 27 Mar 2022 00:45:55 +0300
Subject: Periodically refreshing an allowlist in njs
In-Reply-To:
References:
Message-ID:

Hi Tom,

hope you're doing well.

On Sat, Mar 26, 2022 at 09:55:48PM +0100, Tom Hall wrote:
>
> I am looking to replace some lua scripting we have to do an allowlist,
> I found https://medium.com/geekculture/building-a-simple-bot-protection-with-nginx-javascript-module-njs-and-typescript-386b2207ba90
> and it seems to give me a good idea how to port my lua
>
> const fs = require('fs');
> const badReputationIPs = loadFile('/var/lib/njs/ips.txt');
>
> Is there an equivalent in njs? It seems not if you are starting a new
> lightweight VM for every request.
> Is the answer to do a subrequest and cache it or something?

I'd recommend tweaking the architecture a bit and using a key-value
store instead of reading from a file.

--
Sergey A. Osokin

From xeioex at nginx.com Sun Mar 27 17:07:09 2022
From: xeioex at nginx.com (Dmitry Volyntsev)
Date: Sun, 27 Mar 2022 10:07:09 -0700
Subject: Periodically refreshing an allowlist in njs
In-Reply-To:
References:
Message-ID:

Hi Tom,

On 26.03.2022 13:55, Tom Hall wrote:
> Is the answer to do a subrequest and cache it or something?
>

Yes, one way is to make a subrequest to a location where caching is enabled.