[patch] Set SO_REUSEADDR on outgoing TCP connections
majek04 at gmail.com
Wed Apr 9 15:53:04 UTC 2014
Usually, when establishing a connection, the kernel allocates the outgoing
TCP/IP port automatically from the ephemeral port range. Unfortunately,
when the outgoing source IP is selected (using bind before connect), the
kernel needs a unique port number. As a result, it can only establish
a single outgoing connection from a single source port. This can cause
problems with a large number of outgoing proxy connections: it's
possible for the kernel to run out of free ports in the ephemeral range.
The situation can be improved: TCP/IP allows any number of
connections to share an outgoing TCP/IP port and host pair, assuming the
destination addresses differ.
This patch sets the SO_REUSEADDR flag on connections that use bind
before connect to select the outgoing source address. This allows the
kernel to reuse source port numbers, provided that the destination
addresses are different.
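
As an illustration of the bind-before-connect pattern described above, here is a minimal Python sketch. It is not the nginx patch itself (which is C); the function name `connect_from` and its parameters are made up for this example. The socket calls are the standard ones: set SO_REUSEADDR, bind to the chosen source IP with port 0 so the kernel picks the port, then connect.

```python
import socket


def connect_from(source_ip, dest, reuse=True, timeout=5.0):
    """Open a TCP connection to `dest` (host, port) from `source_ip`.

    Hypothetical helper for illustration only.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if reuse:
            # The flag has to be set before bind() to influence port selection.
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        # Port 0 asks the kernel to choose the source port for us.
        sock.bind((source_ip, 0))
        sock.settimeout(timeout)
        sock.connect(dest)
        return sock
    except OSError:
        # Do not leak the descriptor when bind() or connect() fails.
        sock.close()
        raise
```

A caller would use it as `connect_from("192.0.2.10", ("198.51.100.7", 80))`, where both addresses are placeholders.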
The patch will work perfectly well assuming there aren't too many
connections to one destination address and port. If there are, the
kernel may randomly allocate an outgoing port number that is already
used for the given destination, and the attempt to connect() will fail
with EADDRNOTAVAIL. This is fairly easy to detect, and we can just retry
connecting, using another random source port allocated by the
kernel.
Unfortunately, this introduces some nondeterminism: in an extreme
situation a connection attempt may fail while we still have a
theoretical chance of success. This situation is no worse than what
we have right now: currently the number of outgoing ports is tightly
limited by the size of the ephemeral port range. With this patch it's
possible to establish a practically unlimited number of outgoing
connections, assuming there are many destinations.
To work around the situation of thousands of connections to the same
destination address, we will retry the connection a few times before
giving up. The patch hardcodes a retry count of 8, which I believe
strikes the right balance between the probability of success and the
cost of retrying socket allocation.
Assuming 1 connection is already present to exactly the same destination,
the probability of a collision is 1/ephemeral_port_range with no retry.
With 8 retries we get the following numbers:
* If 1% of the ephemeral ports are busy with the given destination address,
all eight attempts will fail for one connection in 10000000000000000 (10^16).
* For 10%: one in 100000000
* For 50%: one in 256
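
The figures above can be checked with a few lines of Python. This sketch assumes that each attempt independently draws a uniformly random ephemeral port, so the chance that all 8 attempts collide is the busy fraction raised to the 8th power. It uses exact rational arithmetic so the results come out as whole numbers; for the 1% case that gives exactly 10^16. The function name `one_in` is made up for this example.

```python
from fractions import Fraction


def one_in(busy_fraction, attempts=8):
    """Return N such that all `attempts` tries collide once in N connections.

    Assumes independent, uniformly random port choices (a simplification).
    """
    all_collide = Fraction(busy_fraction) ** attempts
    return 1 / all_collide


for percent in (1, 10, 50):
    odds = one_in(Fraction(percent, 100))
    print(f"{percent}% busy: one in {odds}")
```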
Finally, during the last retry run we do *not* set the SO_REUSEADDR
flag, making sure the kernel really doesn't have any free port
left. Unfortunately there is a side effect to not setting this flag:
we limit the outgoing port range for further connections, as source
ports without SO_REUSEADDR can't be reused.
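
The retry logic described above might look roughly like this in Python. This is a sketch under assumptions, not the code from the patch: `connect_with_retry` and `MAX_ATTEMPTS` are hypothetical names, and I read "retry count of 8" as eight attempts in total. Every attempt except the last sets SO_REUSEADDR; only EADDRNOTAVAIL triggers another attempt.

```python
import errno
import socket

MAX_ATTEMPTS = 8  # assumption: eight attempts in total, as in the text


def connect_with_retry(source_ip, dest, attempts=MAX_ATTEMPTS, timeout=5.0):
    """Bind to `source_ip`, connect to `dest`, retrying on EADDRNOTAVAIL."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        final_attempt = attempt == attempts - 1
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            if not final_attempt:
                # Let the kernel hand out a port already used for other
                # destinations.
                sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            # On the final attempt the flag stays off, so the kernel has to
            # find a port that is not in use at all.
            sock.bind((source_ip, 0))
            sock.settimeout(timeout)
            sock.connect(dest)
            return sock
        except OSError as exc:
            sock.close()
            if exc.errno != errno.EADDRNOTAVAIL:
                raise  # unrelated failure: do not retry
            last_error = exc
    raise last_error
```

Opening a fresh socket for every attempt keeps the sketch simple; it is also where the "cost of retrying socket allocation" mentioned earlier comes from.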
Copy of the patch: https://gist.github.com/anonymous/10285483