[Patch] SO_REUSEPORT support from master process
yingqi.lu at intel.com
Fri Aug 22 16:59:39 UTC 2014
The "SO_REUSEPORT support for listen sockets support" patches submitted by Sepherosa Ziehau are posted and discussed in  and . Last update on the threads was 09/05/2013 and the patch is not included in the current Nginx code. Reading from the discussion, my understanding is that his patch makes a dedicated listen socket for each of the child process. In order to make sure at any given time there is always a listen socket available, the patch makes the first worker process different/special than the rest.
Here, I am proposing a simpler way to enable the SO_REUSEPORT support. It is just to create and configure certain number of listen sockets in the master process with SO_REUSEPORT enabled. All the children processes can inherit. In this case, we do not need to worry about ensuring 1 available listen socket at the run time. The number of the listen sockets to be created is calculated based on the number of active CPU threads. With big system that has more CPU threads (where we have the scalability issue), there are more duplicated listen sockets created to improve the throughput and scalability. With system that has only 8 or less CPU threads, there will be only 1 listen socket. This makes sure duplicated listen sockets only being created when necessary. In case that SO_REUSEPORT is not supported by the OS, it will fall back to the default/original behavior (this is tested on Linux kernel 3.8.8 where SO_REUSEPORT is not supported).
This prototype patch has been tested on an Intel modern dual socket platform with a three tier open source web server workload (PHP+Nginx/memcached/MySQL). The web server has 2 IP network interfaces configured for testing. The Linux kernel used for testing is 3.13.9. Data show:
Case 1: with single listen statement (Listen 80) specified in the configuration file, there is 46.3% throughout increase.
Case 2: with dual listen statements (for example, Listen 192.168.1.1:80 and Listen 192.168.1.2:80), there is 10% throughput increase.
Both testing cases keep everything the same except the patch itself to get above result.
The reason that Case1 has bigger performance gains is that Case1 by default only has 1 listen socket while Case2 by default already has 2.
Please review it and let me know your questions and comments. Thanks very much for your time reviewing the patch.
1. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the nginx-devel