fsync()-in webdav PUT

Aziz Rozyev arozyev at nginx.com
Fri Mar 2 10:42:17 UTC 2018


Attila,

The man page quote was related to Valery’s argument that fsync won’t affect performance; forget it.

It’s nonsense because you’re trying to solve the reliability problem at the wrong level.
As Maxim and Paul have already suggested here multiple times, it’s better
to invest in good server/storage infrastructure instead of fsyncing each PUT.

Regarding the DB server analogy: you’re still not safe from power outages as long as your
transaction isn’t in the transaction log.
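
To make that analogy concrete, a database acknowledges a commit only after the
commit record has been forced to the log. A minimal sketch of the idea in plain
POSIX C (the function and names are illustrative, not any particular DB’s code):

    #include <unistd.h>

    /* Append a commit record to the transaction log and force it to
     * stable storage before the client is told the commit succeeded. */
    int log_commit(int log_fd, const char *record, size_t len)
    {
        if (write(log_fd, record, len) != (ssize_t) len)
            return -1;
        /* Only after fdatasync() returns may the server acknowledge
         * the transaction as durable. */
        if (fdatasync(log_fd) != 0)
            return -1;
        return 0;
    }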

If you’re still set on syncing and ready to sacrifice the time it costs, try mounting the file system
with the ‘sync’ option.
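
That is, add ‘sync’ to the options field of the volume in /etc/fstab. The
per-file equivalent, if remounting the whole file system is too drastic, is
opening the target with O_SYNC; a minimal sketch (the signature is illustrative):

    #include <fcntl.h>
    #include <unistd.h>

    /* With O_SYNC each write() returns only after data and metadata
     * have reached stable storage -- per file, what mounting with
     * 'sync' does for the whole file system. */
    int put_sync(const char *path, const void *buf, size_t len)
    {
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_SYNC, 0644);
        if (fd < 0)
            return -1;
        ssize_t n = write(fd, buf, len);
        close(fd);
        return (n == (ssize_t) len) ? 0 : -1;
    }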


br,
Aziz.





> On 2 Mar 2018, at 12:12, Nagy, Attila <bra at fsn.hu> wrote:
> 
> On 02/28/2018 03:08 PM, Maxim Dounin wrote:
>> The question here is: why do you want the file to be on disk,
>> and not just in a buffer?  Because you expect the server to die in a
>> few seconds without flushing the file to disk?  How probable is
>> that, compared to the probability of the disk itself dying?  A more
>> reliable server can make this probability negligible, hence the
>> suggestion.
> Because the files I upload to nginx servers are important to me. Please step back a little and forget that we are talking about nginx or an HTTP server.
> We have data which we want to write to somewhere.
> Check any of the database servers. Would you accept a DB server which can lose confirmed data, or which couldn't be configured so that whatever operation you use to put data into it (write/insert/update/commit) is reliably written by the time you receive the acknowledgement?
> Now try to use this example. I would like to use nginx to store files. That's what HTTP PUT is for.
> Of course I'm not expecting that the server will die every day. But when that happens, I want to make sure that the confirmed data is there.
> Let's take a look at various object storage systems, like ceph. Would you accept a confirmed write being lost there? They do a great deal of work to make that impossible.
> Now try to imagine that somebody doesn't need the complexity of, for example, ceph, but wants to store data with plain HTTP. And there you are. If you store data, then you want to make sure the data is there.
> If you don't, why do you store it anyway?
> 
>> (Also, another question is what "on the disk" means from a physical
>> point of view.  In many cases this in fact means "somewhere in the
>> disk buffers", and a power outage can easily result in the file
>> being not accessible even after fsync().)
> Not with good software/hardware (and it doesn't really have to be super good, just average).
> 
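One concrete case behind the "not accessible even after fsync()" point: on many
file systems a freshly created file also needs an fsync() of its parent
directory, or the new directory entry itself can be lost in a crash. A minimal
sketch of the usual pattern (illustrative only, not anything the dav module
does today):

    #include <fcntl.h>
    #include <unistd.h>

    /* Persist both the file contents and its directory entry: fsync()
     * on the file alone does not guarantee the new name survives a
     * crash; the parent directory has to be synced as well. */
    int durable_put(int file_fd, const char *parent_dir)
    {
        int dir_fd, rc;

        if (fsync(file_fd) != 0)            /* file data + inode */
            return -1;

        dir_fd = open(parent_dir, O_RDONLY | O_DIRECTORY);
        if (dir_fd < 0)
            return -1;
        rc = fsync(dir_fd);                 /* the directory entry */
        close(dir_fd);
        return rc;
    }
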
>> 
>>> Why is doing this in a thread not a good idea? It wouldn't block nginx
>>> that way.
>> Because even in threads, fsync() is likely to cause performance
>> degradation.  It might be a better idea to let the OS manage
>> buffers instead.
>> 
> Sure, it will cause some (not much, BTW, in a good configuration). But if my primary goal is to store files reliably, why should I care?
> I can solve that by using SSDs for logs, BBWCs and a lot more things. But as it stands, I can't make sure whether an HTTP PUT really succeeded, whether it will succeed in a few seconds, or whether it will fail badly.
> 
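For completeness, the structure being asked about here, pushing the blocking
fsync() off the event loop into a thread and only then sending the response,
would look roughly like this. A sketch with plain pthreads, not nginx's actual
thread-pool machinery (the task struct and callback are illustrative):

    #include <pthread.h>
    #include <unistd.h>

    /* Run the blocking fsync() in a worker thread so the event loop
     * keeps serving other requests; the HTTP response is sent only
     * from the completion callback. */
    struct sync_task {
        int   fd;
        void (*done)(int fd, int rc);   /* must notify the event loop
                                           in a thread-safe way */
    };

    static void *sync_worker(void *arg)
    {
        struct sync_task *t = arg;
        t->done(t->fd, fsync(t->fd));   /* the slow, blocking part */
        return NULL;
    }

    static int sync_in_thread(struct sync_task *t)
    {
        pthread_t tid;
        if (pthread_create(&tid, NULL, sync_worker, t) != 0)
            return -1;
        return pthread_detach(tid);
    }
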
> _______________________________________________
> nginx mailing list
> nginx at nginx.org
> http://mailman.nginx.org/mailman/listinfo/nginx


