[PATCH 1 of 2] fix sched_setaffinity(2) call failure
Joshua Zhu
zhuzhaoyuan at gmail.com
Mon Jan 16 05:52:09 UTC 2012
Hi,
On Sat, Jan 14, 2012 at 2:16 AM, Maxim Dounin <mdounin at mdounin.ru> wrote:
> Hello!
>
> On Thu, Jan 12, 2012 at 11:37:00PM +0800, Joshua Zhu wrote:
>
> > Hi,
> >
> > 'cpu_set_t' should be used instead of 'long' when calling
> > sched_setaffinity(2). Otherwise it may fail on some Linux systems.
> > Please see the attachment for more detail.
>
> According to sched_setaffinity(2) at least in kernels >= 2.6.9 it
> should be ok to use any length as long as bits above kernel mask
> used aren't referenced.
>
>
I've tested it on Linux 2.6.9 (Red Hat Enterprise Linux AS release 4) and
unfortunately sched_setaffinity(2) failed.
> I believe the real problem is that code uses 32 as cpusetsize,
> while this is the size in bytes. Hence it references arbitrary
> memory and this causes EINVAL to be returned by glibc if there are
> bits referenced above kernel cpumask size, see
> sysdeps/unix/sysv/linux/sched_setaffinity.c in glibc's sources.
>
> The following patch should be enough:
>
> diff -r e1296af53cc0 src/os/unix/ngx_process_cycle.c
> --- a/src/os/unix/ngx_process_cycle.c Mon Dec 26 00:00:00 2011 +0400
> +++ b/src/os/unix/ngx_process_cycle.c Fri Jan 13 11:58:01 2012 -0500
> @@ -914,7 +914,10 @@
> ngx_log_error(NGX_LOG_NOTICE, cycle->log, 0,
> "sched_setaffinity(0x%08Xl)", cpu_affinity);
>
> - if (sched_setaffinity(0, 32, (cpu_set_t *) &cpu_affinity) == -1) {
> + if (sched_setaffinity(0, sizeof(cpu_affinity),
> + (cpu_set_t *) &cpu_affinity)
> + == -1)
> + {
> ngx_log_error(NGX_LOG_ALERT, cycle->log, ngx_errno,
> "sched_setaffinity(0x%08Xl) failed",
> cpu_affinity);
> }
>
> Could you please test if it works for you?
>
Yes, it works on one of my Linux box which has two cores. But I have not
tested it on more other systems yet.
>
> It passes my limited testing, but I wasn't really able to
> reproduce the problem without modifying the sched_setaffinity()
> call in question to use bigger cpusetsize.
>
> [...]
>
> > diff -uprN nginx-1.1.12/src/core/nginx.c
> nginx-1.1.12-bugfix/src/core/nginx.c
> > --- nginx-1.1.12/src/core/nginx.c 2011-11-16 04:35:42.000000000 +0800
> > +++ nginx-1.1.12-bugfix/src/core/nginx.c 2012-01-12 11:06:00.000000000
> +0800
> > @@ -1245,15 +1245,15 @@ ngx_set_cpu_affinity(ngx_conf_t *cf, ngx
> > ngx_core_conf_t *ccf = conf;
> >
> > u_char ch;
> > - u_long *mask;
> > + cpu_set_t *mask;
> > ngx_str_t *value;
> > - ngx_uint_t i, n;
> > + ngx_uint_t i, j, n;
> >
> > if (ccf->cpu_affinity) {
> > return "is duplicate";
> > }
> >
> > - mask = ngx_palloc(cf->pool, (cf->args->nelts - 1) * sizeof(long));
> > + mask = ngx_palloc(cf->pool, (cf->args->nelts - 1) *
> sizeof(cpu_set_t));
> > if (mask == NULL) {
> > return NGX_CONF_ERROR;
> > }
>
> [...]
>
> While I generally agree that using cpu_set_t and CPU_* macros
> explicitly would be better, I don't think we want to introduce
> linux-centric code here in platform-independent code.
>
> Instead we probably want to leave this code generic enough to be
> useable with other scheduler interfaces, and convert the data
> somewhere near platform-specific whatever_setaffinity() calls if
> needed. (Likely unneeded for the Linux with the above patch.)
>
>
We use cpu_set_t and CPU_* macros in the patch not only because it's more
generic and not depend on the implementation of the kernel (I guess the
code in Nginx now is inspired by some _early_ kernel implementation, i.e.
Linux 2.5.x), but also we will have no 32-core-limitation any more
('"worker_cpu_affinity" supports up to 32 CPU only').
I agree that the set_cpu_affinity related code in Nginx core should be
platform-independent. But the code now is _not_ (please look at those
NGX_HAVE_SCHED_SETAFFINITY macros). And I do recommend move the
implementation of ngx_set_cpu_affinity() to the os/ directory when
worker_cpu_affinity is supported on other platforms.
Thank you very much for the review :)
Regards,
--
Joshua Zhu
Senior Software Engineer
Server Platforms Team at Taobao
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mailman.nginx.org/pipermail/nginx/attachments/20120116/d1b9afcc/attachment.html>
More information about the nginx
mailing list