[BF] slab init + http file cache fixes
mdounin at mdounin.ru
Wed Jun 15 18:52:48 UTC 2016
On Wed, Jun 15, 2016 at 07:08:27PM +0200, Sergey Brester wrote:
> >Not sure it's a good idea to reuse min_uses return code here. I
> >would rather not.
> Otherwise the cacheable requests will fail, just while the zone don't get a
> free node (as long as scarce).
> Because in other respects I do not understand why in other scarce cases
> "ngx_http_file_cache_open" will return NGX_HTTP_CACHE_SCARCE and continue
> processing of the request.
The NGX_HTTP_CACHE_SCARCE is intended to signal to the caller that
min_uses was not yet reached, and that's why the request will not
use cache. And that's the only case where NGX_HTTP_CACHE_SCARCE
is used. Semantics of the error case your are trying to add here
is quite different. This will be an immediate problem if/when
we'll add appropriate $upstream_cache_status value for SCARCE.
(Moreover, I'm not really sure this needs fixing at all, as since
version 1.9.13 cache manger monitors number of elements in the
cache and tries to avoid cache zone overflows. That is, overflows
are not expected to happen often.)
> >The condition added looks wrong: fcn->deleting cannot be set
> >here, as it's only set by this function under lock, and
> >fcn->count++ here will prevent the function from being called
> >As far as I understand, you are trying to change processing of the
> >race which happens if the node in question is grabbed by another
> >request while the file is being deleted (fcn->count != 0 at
> >the function end). Please explain what you are trying to chanage
> >and how it's expected to affect various cases possible.
> Do you really understand what the race condition means? And you see a pair
> lines bellow an ngx_shmtx_unlock for pool, so the second worker can wins and
> goes this block again?
> I can only add, that without this fix I had observed:
> 1) negative value of "cache->sh->size", what evidence, that the count of
> increment is fewer as decrement (same response length, so always the same
> "fcn->fs_size" and no cache update);
> 2) sporadically errors by deleting of cache files;
Race condition here is intended: we are dropping the lock to
prevent lock contention during a potentially long syscall. It is
expected that in some cases another worker will be able to grab
the node, and this will prevent its complete removal. But there
is more than one case possible: e.g., another worker can grab the
node and put an updated file into it before the file will be
deleted, and as a result we'll delete different file here. The
question is: what is the case you are trying to improve, and what
it means for other possible cases.
> >+ goto done;
> >This change looks unrelated. If you think it's needed, please
> >explain the reasons.
> But sure. Otherwise in many cases (e.g. case of scarce) the value of
> "cache->files" will be incremented (with ++cache->files).
The "cache->files" is used to count number of files processed by
the cache loader. When it reaches loader_files, cache loader will
sleep for a while. There is nothing wrong that it is incremented -
as the file was processed, it should be incremented. With the
change in question cache loader won't be able to properly respect
loader_files and loader_threshold, making it use all available
resources in case of errors.
> >See above, the description doesn't match style and doesn't
> >describe that this is a win32 problem.
> I would be not so sure by win32. At least the code is very strange and
> error-prone (for example if tomorrow some child processes will be not forked
> and not executed stub initialization...
If you think this problem can be seen on other platforms - feel
free to point them out as well.
> >ngx_slab_core_init ...
> >I can't say I like the name, but I have no better suggestions.
> What thus you want to tell me?
If you can suggest a better name - please do so.
> >This test is useless with the new logic you've introduced.
> >This call is useless with the new logic you've introduced.
> Both are done if someone uses slab module as API (as module pure outside
> nginx process), so not required to call ngx_slab_core_init extra.
> But I can remove it if disturbing (I wanted to leave it just because this
> one will be called once per process).
We certainly don't want to try to preserve compatibility with
someone who in theory can use slab code outside of nginx. If
somebody really does so, it can add appropriate initialization.
> BTW. I've always the same questions if I look how some people take
> Why such people often assume, that:
> - in contrast to them, the contributor have all the time in the world
> (e.g. to revise and resend an update again and again)?
> - do us a favor, if they assume the commit?
> - think to be the power holder of the project, that is an open source
> that in the meantime the community of the whole world has
> - they are always smarter than those who have found the bug and fixed it?
> Merely as the signal to cogitate...
You may find this article interesting:
More information about the nginx-devel