fsync()-in webdav PUT

Valery Kholodkov valery+nginxen at grid.net.ru
Thu Mar 1 12:27:43 UTC 2018


You can also apply online:

https://angel.co/qubiq-digital-b-v/jobs

That's more 2018-ish.

On 01-03-18 13:24, Valery Kholodkov wrote:
> I admire your wise approach to this discussion, as well as your technical
> expertise! I see the value in people who know the right way, but I see
> the value in people who dare to explore and want to learn the right way.
> Coincidentally, at Qubiq Labs we're looking for that kind of kick-ass
> Systems and Performance Architect to run and scale our software and
> infrastructure.
>
> If you're challenged by the intricacies of the online marketing industry and
> tons of traffic, we'd love to get your application at info at qubiqlabs.com
>
> It's a funded and growing startup with a lot of interesting projects
> that you always dreamed of if you're into nginx.
>
> So, make sure to shoot us an email and don't forget to mention "I want
> that job" in the subject!
>
> On 28-02-18 23:33, Peter Booth wrote:
>> This discussion is interesting, educational, and thought provoking.
>> Web architects only learn “the right way” by first doing things
>> “the wrong way” and seeing what happens.
>> Attila and Valery asked questions that sound logical, and I think
>> there's value in exploring what would happen if their suggestions
>> were implemented.
>>
>> First caveat - nginx is deployed in all manner of different scenarios
>> on different hardware and operating systems. Physical servers and VMs
>> behave very differently, as do local and remote storage. When an
>> application writes to NFS mounted storage there's no guarantee that
>> even an fsync will correctly enforce a write barrier. Still, if we
>> consider real numbers:
>>
>>   * On current model quad socket hosts, nginx can support well over 1
>>     million requests per second (see TechEmpower benchmarks)
>>   * On the same hardware, a web app that writes to a Postgresql DB can
>>     do at least a few thousand writes per second.
>>   * A SATA drive might support 300 write IOPS, whilst an SSD will
>>     support 100x that.
>>
>> What this means is that doing fully synchronous writes can reduce
>> your potential throughput by a factor of 100 or more. So it’s not a
>> great way to ensure consistency.
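
To make the arithmetic above concrete, here is a rough sketch using only the
round figures quoted in the message. It assumes one synchronous write per
request and that the device's write IOPS is the only limit; the function name
slowdown and the writes_per_request parameter are made up for illustration.

```python
# Round figures quoted in the message above; these are not measurements.
NGINX_RPS = 1_000_000                   # requests/s when nobody waits on the disk
SATA_WRITE_IOPS = 300                   # synchronous writes/s on a SATA drive
SSD_WRITE_IOPS = SATA_WRITE_IOPS * 100  # "100x that"


def slowdown(unsynced_rps: float, write_iops: float, writes_per_request: int = 1) -> float:
    """Factor by which throughput drops if every request waits for its writes.

    Assumption: the device's write IOPS is the only bottleneck.
    """
    synced_rps = write_iops / writes_per_request
    return unsynced_rps / synced_rps


for name, iops in (("SATA", SATA_WRITE_IOPS), ("SSD", SSD_WRITE_IOPS)):
    factor = slowdown(NGINX_RPS, iops)
    print(f"{name}: roughly {factor:,.0f}x fewer requests per second")
```

The factor depends heavily on the device, so the sketch only shows the order
of magnitude involved, not a prediction for any particular server.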
>>
>> But there are cheaper ways to achieve the same consistency and
>> reliability characteristics:
>>
>>   * If you are using Linux then your reads and writes will occur through
>>     the page cache - so the actual disk itself really doesn’t matter
>>     (whilst your host is up).
>>   * If you want to protect against loss of physical disk then use RAID.
>>   * If you want to protect against a random power failure then use
>>     drives with battery backed caches, so writes will get persisted when
>>     a server restarts after a power failure
>>   * If you want to protect against a crazy person hitting your server
>>     with an axe then write to two servers ...
>>
>> *But the bottom line is separation of concerns.* Nginx should not use
>> fsync because it isn’t nginx's business.
>>
>> My two cents,
>>
>> Peter
>>
>>
>>> On Feb 28, 2018, at 4:41 PM, Aziz Rozyev <arozyev at nginx.com
>>> <mailto:arozyev at nginx.com>> wrote:
>>>
>>> Hello!
>>>
>>> On Wed, Feb 28, 2018 at 10:30:08AM +0100, Nagy, Attila wrote:
>>>
>>>> On 02/27/2018 02:24 PM, Maxim Dounin wrote:
>>>>>
>>>>>> Now that nginx supports running threads, are there plans to
>>>>>> convert at least DAV PUTs to run in their own thread (pool), to
>>>>>> make it possible to do non-blocking (from nginx's event loop PoV)
>>>>>> fsync on the uploaded file?
>>>>> No, there are no such plans.
>>>>>
>>>>> (Also, trying to do fsync() might not be the best idea even in
>>>>> threads.  A reliable server might be a better option.)
>>>>>
>>>> What do you mean by a reliable server?
>>>> I want to make sure when the HTTP operation returns, the file is on the
>>>> disk, not just in a buffer waiting for an indefinite amount of time to
>>>> be flushed.
>>>> This is what fsync is for.
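
For reference, the guarantee asked for above (the data is flushed before the
operation reports success) can be sketched roughly as follows. This is not how
nginx's DAV module is implemented; durable_put is a hypothetical helper, and
the sketch assumes a POSIX system where a directory can be opened and fsynced.

```python
import os
import tempfile


def durable_put(path: str, body: bytes) -> None:
    """Write body to path and return only after fsync() has completed.

    Hypothetical helper for illustration; assumes POSIX semantics.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(body)
            f.flush()             # push Python's buffer to the kernel
            os.fsync(f.fileno())  # ask the kernel to flush the file to the device
        os.replace(tmp, path)     # atomically move the file into place
    except BaseException:
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
    # Flush the directory entry too, so the rename itself is persisted.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Even then, as the reply below notes, "flushed" may only mean the data reached
the drive's own cache, so this narrows the window rather than closing it.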
>>>
>>> The question here is - why do you want the file to be on disk, and
>>> not just in a buffer?  Because you expect the server to die in a
>>> few seconds without flushing the file to disk?  How probable is
>>> that, compared to the probability of the disk dying?  A more
>>> reliable server can make this probability negligible, hence the
>>> suggestion.
>>>
>>> (Also, another question is what "on the disk" means from a physical
>>> point of view.  In many cases this in fact means "somewhere in the
>>> disk buffers", and a power outage can easily result in the file
>>> not being accessible even after fsync().)
>>>
>>>> Why is doing this in a thread not a good idea? It wouldn't block nginx
>>>> that way.
>>>
>>> Because even in threads, fsync() is likely to cause performance
>>> degradation.  It might be a better idea to let the OS manage
>>> buffers instead.
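
The thread-offload idea discussed above can be illustrated with a small
sketch. It uses Python's asyncio as a stand-in for an event loop and is not
nginx code; write_and_fsync, handle_put and demo are hypothetical names, and
asyncio.to_thread requires Python 3.9 or later.

```python
import asyncio
import os
import tempfile


def write_and_fsync(path: str, body: bytes) -> None:
    """Blocking helper: write the body and wait for fsync() to return."""
    with open(path, "wb") as f:
        f.write(body)
        f.flush()
        os.fsync(f.fileno())


async def handle_put(path: str, body: bytes) -> None:
    """Hypothetical PUT handler: the blocking work runs in a worker thread,
    so the event loop can keep serving other connections meanwhile."""
    await asyncio.to_thread(write_and_fsync, path, body)


async def demo(directory: str, uploads: int = 4) -> list[str]:
    """Run several uploads concurrently and return the paths written."""
    paths = [os.path.join(directory, f"upload-{i}.bin") for i in range(uploads)]
    await asyncio.gather(*(handle_put(p, b"payload") for p in paths))
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(len(asyncio.run(demo(d))), "uploads synced")
```

This keeps the loop responsive, but each upload still waits for the device,
which is the performance concern raised in the reply above.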
>>>
>>> --
>>> Maxim Dounin
>>> http://mdounin.ru/
>>> _______________________________________________
>>> nginx mailing list
>>> nginx at nginx.org <mailto:nginx at nginx.org>
>>> http://mailman.nginx.org/mailman/listinfo/nginx
>>

