[PATCH] Do not break downstream connection on proxy_cache write error

Maxim Dounin mdounin at mdounin.ru
Mon May 2 21:01:52 UTC 2022


Hello!

On Mon, May 02, 2022 at 04:36:09PM +0300, Kipras Mancevicius wrote:

> # HG changeset patch
> # User Kipras Mancevicius <kiprasm at gmail.com>
> # Date 1651495240 -10800
> #      Mon May 02 15:40:40 2022 +0300
> # Branch proxy_cache_write_disk_full_fix
> # Node ID e6e708fd7958987027f6de7748d948e5015ed32b
> # Parent  a736a7a613ea6e182ff86fbadcb98bb0f8891c0b
> Do not break downstream connection on proxy_cache write error
> 
> Continue sending response to downstream connection and delete temp cache file on
> upstream connection finalization.
> 
> diff -r a736a7a613ea -r e6e708fd7958 src/core/ngx_file.c
> --- a/src/core/ngx_file.c	Tue Feb 08 17:35:27 2022 +0300
> +++ b/src/core/ngx_file.c	Mon May 02 15:40:40 2022 +0300
> @@ -765,6 +765,13 @@
>                                ngx_delete_file_n " \"%s\" failed", name);
>  
>              }
> +        } else {
> +            /*
> +             * if the file failed to fully copy (e.g. target disk is full) - it
> +             * may remain (partially written) in the target disk. Attempt to
> +             * delete it.
> +             */
> +            ngx_delete_file(name);

Trying to delete a file "just in case" looks clearly wrong.  If at 
all, this is to be done in ngx_copy_file(), the function which 
created the file and knows what in fact happened.  Further, it is 
not clear if it's to be done at all: in many practical cases an 
incomplete file is better than no file at all, and this is a 
generic function.

Also, this looks irrelevant to the issue being addressed.

>          }
>  
>          ngx_free(name);
> diff -r a736a7a613ea -r e6e708fd7958 src/event/ngx_event_pipe.c
> --- a/src/event/ngx_event_pipe.c	Tue Feb 08 17:35:27 2022 +0300
> +++ b/src/event/ngx_event_pipe.c	Mon May 02 15:40:40 2022 +0300
> @@ -753,10 +753,21 @@
>          out = p->writing;
>          p->writing = NULL;
>  
> -        n = ngx_write_chain_to_temp_file(p->temp_file, NULL);
> +        /*
> +         * stop writing to cache on cache write fail, but keep sending response
> +         * downstream. Incomplete temp cache file will be deleted later.
> +         */
> +        if (p->cache_write_fail) {
> +            return NGX_BUSY;
> +        } else {
> +            n = ngx_write_chain_to_temp_file(p->temp_file, NULL);

This looks terribly wrong: if this code will happen actually 
return NGX_BUSY, a thread-based operation won't be completed, 
potentially resulting in a connection hang and/or a socket leak.

>  
> -        if (n == NGX_ERROR) {
> -            return NGX_ABORT;
> +            if (n == NGX_ERROR) {
> +                p->cache_write_fail = 1;
> +                p->cacheable = 0;
> +
> +                return NGX_BUSY;

This looks wrong: if an error happens as a result of a 
thread-based operation, it is incorrect to just ignore this.  Data 
being written needs to be recovered somehow.

> +            }
>          }
>  
>          goto done;
> @@ -777,6 +788,23 @@
>          out = p->in;
>      }
>  
> +    /*
> +     * stop writing to cache on cache write fail, but keep sending response
> +     * downstream. Incomplete temp cache file will be deleted later.
> +     */
> +    if (p->cache_write_fail) {
> +        return NGX_BUSY;
> +    } else {
> +        n = ngx_write_chain_to_temp_file(p->temp_file, out);
> +
> +        if (n == NGX_ERROR) {
> +            p->cache_write_fail = 1;
> +            p->cacheable = 0;
> +
> +            return NGX_BUSY;
> +        }
> +    }
> +

This also looks wrong:

- The code tries to write everything in out, ignoring things like 
  temp_file_write_size and max_temp_file_size.

- In case of success, the following code will write the data 
  again, resulting in file corruption and likely various other 
  issues.

Further, the following writes (which are the only writes 
meaningful for the current request, as you don't try to update 
p->out) are not really protected by any error checking, so the 
initial objective of the patch does not seem to be achieved.

Please also note that the ticket in question says "without 
introducing unreasonable complexity in the source code" for a 
reason.

Hope this helps.

-- 
Maxim Dounin
http://mdounin.ru/



More information about the nginx-devel mailing list