[WIP/RFC v4 0/5] Support abstract Unix sockets

Alejandro Colomar alx.manpages at gmail.com
Wed Aug 24 23:14:48 UTC 2022


Hi Maxim,

On 8/25/22 00:14, Maxim Dounin wrote:
[...]

>>> in most production setups
>>> it makes no difference, yet used to introduce various issues.
>>
>> The only one I can think of is security, since you don't have file
>> permissions, but one could also argue that the security can't be lower
>> than that of TCP sockets.
> 
> Apart from being non-scalable to multiple hosts and various issues

Yeah, that one is obvious.  But in that case it should be trivial to 
change the configuration to use a TCP socket instead.

> with socket files, Unix sockets used to trigger various kernel
> bugs in different OSes.  In particular, there were multiple
> sendfile() issues, at least on FreeBSD and IIRC on Linux.

I hadn't heard of those.  I would guess they are old bugs no longer 
present in current kernels.  Of course, where there have been bugs there 
may be more in the future, but I wouldn't go so far as to stop 
recommending UDS entirely and always use TCP instead.

> 
>>>> Nginx already supports UDS.  But UDS has a problem: the
>>>> kernel creates a file in the fs, and it's not always trivial to clean up
>>>> those files.  Then, if the application is restarted, there's no
>>>> SO_REUSEADDR to allow reusing the socket file, so the application will
>>>> just fail.
>>>>
>>>> This happens in nginx Unit, which creates listener sockets from a
>>>> privileged thread, and then uses them from unprivileged threads.  When
>>>> the unprivileged thread stops using the socket, it can't remove the
>>>> file, and doing so would require huge complexity to implement.  It's
>>>> easier to just tell the kernel we want an abstract UDS (AUDS), so that
>>>> there's no file at all.  Then if the user restarts Unit, it'll be able
>>>> to recreate the AUDS.
>>>>
>>>> A user reported this problem with normal UDS and we concluded that the
>>>> easiest solution would be to add support for AUDS.  His set-up is a
>>>> kubernetes pod, where a container uses nginx and another container uses
>>>> Unit.  Communicating through an AUDS would be trivial and fast.
>>>
>>> So, you are trying to implement abstract namespace sockets as a
>>> bandaid for Unit bug, which is not able to properly remove Unix
>>> sockets in some cases and fails to restart, correct?
>>
>> Yeah, it helps avoid fixing that bug.  But that's why the Linux kernel
>> implemented that feature in the first place: so that programs don't need
>> to cleanup the filesystem in these cases, recognizing that it may not be
>> always trivial.
>>
>>> This does
>>> not look like a valid reason to me, especially given that a)
>>> abstract namespace sockets are Linux-only, and
>>
>> Since other systems can use localhost, I don't think it's so important
>> to support abstract sockets in them (it's not losing a big feature, but
>> rather a minor improvement).  What Unit (and some other web servers do)
>> is just report an error in the configuration.
> 
> The point is that you'll have to fix the bug anyway, since the bug
> is there and, even assuming the workaround is good enough, it's
> Linux-specific.

I would say we don't need to fix it, even where AUDS can't be used: 
localhost is just as usable.  AUDS only helps with performance, compared 
to localhost.

If a user really wants to use file sockets, they can clean up /run with 
a script between runs of the program.  Unit should very rarely need to 
restart if the OS doesn't, so for that rare case localhost or AUDS is 
the simple solution.  Since AUDS support is not hard to implement, I 
thought we could go for it.

In fact, Unit already had a buggy implementation of AUDS, written a 
long time ago by Igor; I just fixed the bugs to make it work rather 
than writing it from scratch.  (BTW, he used the curl(1) approach.)

> 
> If you are looking for a workaround, a readily available one is
> using TCP sockets.

Yep.

In fact, nginx already supports abstract sockets, or I should say one 
abstract socket (singular).  If you proxy_pass to any address starting 
with a NUL character, it will try to connect to an abstract socket 
whose name consists of all zero bytes.  This patch set only makes that 
more useful; for that single socket, it already works.

> 
>>> b) there are
>>> multiple approaches to filing the abstract socket address.
>>
>> I don't understand this last point.  Do you mean the NUL and @
>> alternatives?  NUL is the one supported by the kernel, and so the most
>> obvious one.  Since NUL is not usable by most tools, most programs use
>> the alternative syntax @.  Some programs only allow @; but since it
>> doesn't hurt to allow NUL, and it makes for a very easy implementation,
>> I'd do it that way.
> 
> Since null bytes in the socket address are not special, there is
> more than one way to fill the sockaddr structure.  The patch I've
> referenced fills the struct sockaddr_un's sun_path with null
> bytes, and uses socklen set to sizeof(struct sockaddr_un): this
> approach is compatible with HAProxy's abns@ and socat with
> unix-tightsocklen=0.  In contrast, curl uses different approach
> and instead uses socklen set to exact length of the name provided
> with leading null bytes and no trailing null bytes (and this is
> what you've tried in your patches).

Ahh, I didn't know those used a different convention.  I'd go for the 
curl(1) approach, since it is more flexible: one can still address a 
full-length socket by spelling out every byte of the name.  Not very 
friendly, but possible.

Anyway, if you're not convinced by the feature, let me come back to it 
after I do some performance testing in Unit to see how significant the 
numbers are.

Thanks for the review!

Cheers,
Alex


-- 
Alejandro Colomar
<http://www.alejandro-colomar.es/>