fsync()-in webdav PUT

Nagy, Attila bra at fsn.hu
Fri Mar 2 09:00:26 UTC 2018


On 02/28/2018 11:33 PM, Peter Booth wrote:
> This discussion is interesting, educational, and thought provoking.
> Web architects only learn “the right way” by first doing things “the
> wrong way” and seeing what happens. Attila and Valery asked questions
> that sound logical, and I think there's value in exploring what would
> happen if their suggestions were implemented.
>
> First caveat - nginx is deployed in all manner of different scenarios,
> on different hardware and operating systems. Physical servers and VMs
> behave very differently, as do local and remote storage. When an
> application writes to NFS-mounted storage there's no guarantee that
> even fsync and sync will correctly enforce a write barrier. Still, if
> we consider real numbers:
>
>   * On current model quad socket hosts, nginx can support well over 1
>     million requests per second (see TechEmpower benchmarks)
>   * On the same hardware, a web app that writes to a PostgreSQL DB can
>     do at least a few thousand writes per second.
>   * A SATA drive might support 300 write IOPS, whilst an SSD will
>     support 100x that.
>
> What this means is that doing fully synchronous writes can reduce
> your potential throughput by a factor of 100 or more. So it’s not a
> great way to ensure consistency.
>
> But there are cheaper ways to achieve the same consistency and 
> reliability characteristics:
>
>   * If you are using Linux then your reads and writes will occur
>     through the page cache - so the actual disk itself really doesn’t
>     matter (whilst your host is up).
>   * If you want to protect against loss of physical disk then use RAID.
>   * If you want to protect against a random power failure then use
>     drives with battery backed caches, so writes will get persisted
>     when a server restarts after a power failure
>
Sorry, but this point shows that you don't understand the problem. A
BBWC won't save you from a random power failure, because the data is
still in the host's RAM (the page cache)!
A BBWC will save you only when you do an fsync at the end of the write
(that fsync still writes to RAM, but it is the controller's RAM, which
is protected by the battery).
But nginx doesn't do this today. And that's what this discussion is all
about...
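
To make the point concrete, here is a minimal sketch of what "fsync at the
end of the write" looks like for an upload. It is Python rather than nginx's
C, and it is not nginx code: the function name durable_put and the
temp-file-then-rename layout are assumptions made for illustration, and the
directory fsync assumes a POSIX system.

```python
import os
import tempfile


def durable_put(dest_path: str, body: bytes) -> None:
    """Hypothetical helper: return only after body has been fsynced."""
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    # Temporary file in the same directory, so the final rename is atomic.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(body)
            f.flush()             # userspace buffer -> kernel page cache
            os.fsync(f.fileno())  # page cache -> device (or controller cache)
        os.replace(tmp_path, dest_path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # fsync the directory so the rename itself survives a crash (POSIX only).
    dir_fd = os.open(dest_dir, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Without the two os.fsync calls the function returns as soon as the data is
in the page cache, which is exactly the window a power failure can hit, no
matter what the controller has a battery for.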

>   * If you want to protect against a crazy person hitting your server
>     with an axe then write to two servers ...
>
And still you won't have it reliably on your disks.


> *But the bottom line is separation of concerns.* Nginx should not use 
> fsync because it isn’t nginx's business.
>
Please suggest at least one working solution that is compatible with
nginx's asynchronous architecture and ensures that a successful HTTP
PUT means the data has been written to a reliable store.

There are several filesystems that can be set to "fsync by default",
but that will fail miserably because nginx does the writes in the same
process, in the same thread, so every synchronous write blocks it.
That could be solved by doing at least the fsyncs in different threads,
so they wouldn't block the main thread.
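
The idea can be sketched outside nginx with an event loop and a small
thread pool. This is only an illustration in Python's asyncio, not a patch:
FSYNC_POOL, put_with_offloaded_fsync and the 201 return value are made-up
names and choices, and the plain os.write calls still run on the loop
thread (they normally only touch the page cache).

```python
import asyncio
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

# Hypothetical pool reserved for blocking fsync calls.
FSYNC_POOL = ThreadPoolExecutor(max_workers=4)


async def put_with_offloaded_fsync(path: str, body: bytes) -> int:
    """Write body, fsync it on a worker thread, then report success."""
    loop = asyncio.get_running_loop()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(body)
        while view:
            # os.write may write fewer bytes than asked; loop until done.
            written = os.write(fd, view)
            view = view[written:]
        # The blocking fsync runs in the pool; the event loop keeps
        # serving other connections while this coroutine is suspended.
        await loop.run_in_executor(FSYNC_POOL, os.fsync, fd)
    finally:
        os.close(fd)
    # Only now is it safe to tell the client the PUT succeeded.
    return 201
```

The success status is produced only after the fsync has returned, so the
client never sees a success for data that is still only in the page cache,
and the loop thread never sits inside fsync itself.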

BTW, I'm not proposing this to be the default. It should be an optional
setting, so if somebody wants to keep the current behaviour, they can
do that.