<div dir="ltr"><div>Hi Maxim,</div><div><br></div><div>We have roughly 7k IPs in use (3k IPv6, 4k IPv4) and 52 workers.</div><div>That results in ~364,000 address bindings (7k × 52), and twice that (~728,000 sockets) if I count ports 80 and 443.</div><div><br></div><div>We do indeed have reuseport active. We already thought about using a wildcard address on a socket, but didn't have time to investigate and test it thoroughly.<br></div><div>If it's really only useful for balancing UDP, we might be able to get rid of it.</div><div><br></div><div>We are aware of the need to reduce the number of listening sockets and the config size per server; however, this will be challenging and will involve changes on a lot of levels.<br></div><div>I'll have to look into that again.</div><div><br></div><div>Thank you for your suggestions in any case!<br></div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Dec 6, 2022 at 1:34 AM Maxim Dounin &lt;<a href="mailto:mdounin@mdounin.ru">mdounin@mdounin.ru</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hello!<br>

<br>

On Mon, Dec 05, 2022 at 09:43:18PM +0100, Charlie Kilo wrote:<br>

<br>

> I also know this problem from an environment with many sites and thousands<br>

> of IPs to bind to. For us the problem is that nginx binds every worker to<br>

> every IP sequentially, leading to a restart time of 10-15 minutes. The<br>

> problem can easily be observed using strace on the master process during<br>

> startup. We couldn't find an easy solution so far.<br>

<br>

Could you please share some numbers and details of the <br>

configuration?  Some strace output with timestamps might be also <br>

helpful (something like "strace -ttT" would be great).<br>

<br>

While binding listening sockets indeed happens sequentially, it is <br>

expected to take at most seconds even with thousands of listening <br>

sockets, and even under load, not minutes.  It would be <br>

interesting to dig into what causes 10-15 minutes restart time.<br>

<br>

In particular, in ticket #2188 <br>

(<a href="https://trac.nginx.org/nginx/ticket/2188" rel="noreferrer" target="_blank">https://trac.nginx.org/nginx/ticket/2188</a>), which was about <br>

speeding up "nginx -t" with lots of listening sockets under load, <br>

opening 20k listening sockets (expanded from about 1k sockets in <br>

the configuration with "listen ... reuseport" and multiple worker <br>

processes) was observed to take about 1 second without load (and <br>

up to 15 seconds under load, though this shouldn't affect restart).<br>

<br>

Also note that nginx provides a lot of ways to avoid opening <br>

that many sockets (including using a single socket on a <br>

wildcard address for a given port instead of a socket for each IP <br>

address, and not using reuseport, which is really needed only if <br>

you are balancing UDP).  If the issue you are observing is indeed <br>

due to slow bind() calls, one of the possible solutions might be <br>

to reduce the number of listening sockets being used.<br>

<br>

-- <br>

Maxim Dounin<br>

<a href="http://mdounin.ru/" rel="noreferrer" target="_blank">http://mdounin.ru/</a><br>

_______________________________________________<br>

nginx mailing list -- <a href="mailto:nginx@nginx.org" target="_blank">nginx@nginx.org</a><br>

To unsubscribe send an email to <a href="mailto:nginx-leave@nginx.org" target="_blank">nginx-leave@nginx.org</a><br>

</blockquote></div>
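<div><br></div><div>For reference, the socket counts given at the top of this message can be reproduced with a small sketch. It assumes (as a simplification, not something taken from the nginx source) one listening socket per address/port pair, multiplied by the number of workers when reuseport is on, and two sockets per port (one IPv4, one IPv6) for a wildcard setup. The function name and the wildcard scenario are illustrative only.</div>

```python
def listening_sockets(addresses, ports, workers, reuseport):
    """Estimate the number of listening sockets to open.

    Assumption: one socket per (address, port) pair; with reuseport
    each worker gets its own copy of every socket.
    """
    copies = workers if reuseport else 1
    return addresses * ports * copies


ADDRESSES = 3_000 + 4_000  # IPv6 + IPv4 addresses from the message
PORTS = 2                  # ports 80 and 443
WORKERS = 52

# Current setup: every address on both ports, reuseport enabled.
current = listening_sockets(ADDRESSES, PORTS, WORKERS, reuseport=True)

# Same addresses, reuseport disabled.
no_reuseport = listening_sockets(ADDRESSES, PORTS, WORKERS, reuseport=False)

# Hypothetical wildcard setup: one IPv4 and one IPv6 wildcard per port.
wildcard = listening_sockets(2, PORTS, WORKERS, reuseport=False)

print(current, no_reuseport, wildcard)
```

<div>Under these assumptions, dropping reuseport alone divides the socket count by the number of workers, and wildcard addresses reduce it to a handful.</div>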