<div dir="ltr"><br><div class="gmail_extra"><br><br><div class="gmail_quote">On Tue, Sep 3, 2013 at 10:36 PM, Maxim Dounin <span dir="ltr"><<a href="mailto:mdounin@mdounin.ru" target="_blank">mdounin@mdounin.ru</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hello!<br></blockquote><div><br></div><div><br>Hi,<br></div><div><br> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
> Well, the idea is to keep at least one listen socket open. Maybe I could<br><div class="im">
> find another way in the kernel to make it less tricky. However, that may add<br>
> an extra syscall or socket option.<br>
<br>
</div>I think an extra syscall/socket option will be ok as long as it'll<br>
save us from the hassle of opening sockets. Not sure what to do<br>
with Linux compatibility though.<br></blockquote><div><br><br></div><div>Yeah, this is also my concern.<br></div><div><br> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
Another approach which may be slightly better than the code in your<br>
last patch is to reopen sockets before spawning each worker<br>
process: this way, master may keep listen sockets open (listen<br>
queue is shared with the same socket as inherited by a worker<br>
process then, right?) and worker processes are equal and don't<br>
need to open sockets themselves. It needs careful handling on dead<br>
process respawn codepath though.<br></blockquote><div><br></div><div><br>This may be doable and could be better than my approach. I will take a look at the code and try implementing it.<br></div><div><br> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div class="im">> > (We've also discussed this here in the office several times, and it<br>
> > seems that general consensus is that SO_REUSEPORT for TCP balancing<br>
> > isn't really good interface. It would be much easier for everyone<br>
> > if normal workflow with inherited listen socket descriptors just<br>
> > worked. Especially given the fact that in nginx case it's mostly<br>
> > about benchmarking, since in real life load distribution between<br>
> > worker processes is good enough.)<br>
><br>
><br>
> In DragonFly, SO_REUSEPORT is more than load balancing: it makes the accepted<br>
> sockets' network processing completely CPU localized (from user land to<br>
> kernel land on both RX and TX path). This level of network processing CPU<br>
> localization could not be achieved by the old listen socket inheritance<br>
> usage model (even if I could divide the listen socket's completion queue among<br>
> the CPUs based on the RX hash, the level of CPU localization achieved by<br>
> SO_REUSEPORT still could not be achieved easily).<br>
<br>
</div>Could you please point out how it's achieved?<br>
<br></blockquote><div><br></div><div><br>I have just put something up, which may help in understanding what I have described above. Here it is:<br><a href="http://leaf.dragonflybsd.org/~sephe/netisr_so_reuseport.txt">http://leaf.dragonflybsd.org/~sephe/netisr_so_reuseport.txt</a><br>
</div><div><br> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
We here tend to think that the proper interface from an application<br>
point of view would be to implement a socket option which<br>
basically creates separate listen queues for inherited sockets.<br>
But if this isn't going to work, it's probably better to focus on<br>
SO_REUSEPORT.<br></blockquote><div><br></div><div><br>Well, I think I am going to stick w/ SO_REUSEPORT, mainly because the implementation is simple, straightforward and less invasive, and the result is good. Besides, user space applications only need small changes to their listen socket related code (most of the time, the change is quite simple), which means easy adoption. And in addition to TCP listen sockets, SO_REUSEPORT also helps with UDP socket reception load distribution and processing CPU localization.<br>
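<br>As a rough illustration of how small the listen socket change can be, here is a minimal sketch. It is in Python for brevity rather than nginx's C, the helper name open_reuseport_listener is hypothetical, and it is not the nginx patch discussed in this thread. It only assumes a platform whose socket layer exposes SO_REUSEPORT:<br>
<pre>
```python
import socket


def open_reuseport_listener(port, host="127.0.0.1", backlog=128):
    """Create a TCP listen socket with SO_REUSEPORT set before bind().

    Hypothetical helper for illustration; each worker process would call
    something like this to get its own listen socket on the shared port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The option has to be set before bind(), and on every socket
    # that is going to share the same address and port.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    sock.listen(backlog)
    return sock


if __name__ == "__main__":
    # Port 0 asks the kernel for a free port; the second socket then
    # binds to that same port, which would fail without SO_REUSEPORT.
    first = open_reuseport_listener(0)
    port = first.getsockname()[1]
    second = open_reuseport_listener(port)
    print("two listen sockets bound to port", port)
    first.close()
    second.close()
```
</pre>
The only difference from the usual socket/bind/listen sequence is the single setsockopt call. How connections are then distributed across the sockets is up to the kernel and differs between systems.<br>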
</div><div><br> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
BTW, are you going to be at the upcoming EuroBSDcon? I'm not, but<br>
Igor and Gleb Smirnoff (<a href="mailto:glebius@freebsd.org">glebius@freebsd.org</a>) will be there, and it<br>
will be cool if you'll meet and discuss the SO_REUSEPORT usage for<br>
balancing.<br>
<div class=""><div class="h5"><br></div></div></blockquote><div><br><br></div><div>Sorry, I am not going to attend EuroBSDcon. However, it would be cool if we could discuss SO_REUSEPORT, or whatever you folks are planning, through email.<br>
</div><div><br></div></div>Best Regards,<br>sephe<br clear="all"></div><div class="gmail_extra"><br>-- <br>Tomorrow Will Never Die
</div></div>