[PATCH 1 of 2] fix sched_setaffinity(2) call failure

Joshua Zhu zhuzhaoyuan at gmail.com
Mon Jan 16 05:52:09 UTC 2012


Hi,

On Sat, Jan 14, 2012 at 2:16 AM, Maxim Dounin <mdounin at mdounin.ru> wrote:

> Hello!
>
> On Thu, Jan 12, 2012 at 11:37:00PM +0800, Joshua Zhu wrote:
>
> > Hi,
> >
> > 'cpu_set_t' should be used instead of 'long' when calling
> > sched_setaffinity(2). Otherwise it may fail on some Linux systems.
> > Please see the attachment for more detail.
>
> According to sched_setaffinity(2) at least in kernels >= 2.6.9 it
> should be ok to use any length as long as bits above kernel mask
> used aren't referenced.
>
>
I've tested it on Linux 2.6.9 (Red Hat Enterprise Linux AS release 4) and
unfortunately sched_setaffinity(2) failed.


> I believe the real problem is that code uses 32 as cpusetsize,
> while this is the size in bytes.  Hence it references arbitrary
> memory and this causes EINVAL to be returned by glibc if there are
> bits referenced above kernel cpumask size, see
> sysdeps/unix/sysv/linux/sched_setaffinity.c in glibc's sources.
>
> The following patch should be enough:
>
> diff -r e1296af53cc0 src/os/unix/ngx_process_cycle.c
> --- a/src/os/unix/ngx_process_cycle.c   Mon Dec 26 00:00:00 2011 +0400
> +++ b/src/os/unix/ngx_process_cycle.c   Fri Jan 13 11:58:01 2012 -0500
> @@ -914,7 +914,10 @@
>         ngx_log_error(NGX_LOG_NOTICE, cycle->log, 0,
>                       "sched_setaffinity(0x%08Xl)", cpu_affinity);
>
> -        if (sched_setaffinity(0, 32, (cpu_set_t *) &cpu_affinity) == -1) {
> +        if (sched_setaffinity(0, sizeof(cpu_affinity),
> +                              (cpu_set_t *) &cpu_affinity)
> +            == -1)
> +        {
>             ngx_log_error(NGX_LOG_ALERT, cycle->log, ngx_errno,
>                           "sched_setaffinity(0x%08Xl) failed",
> cpu_affinity);
>         }
>
> Could you please test if it works for you?
>

Yes, it works on one of my Linux box which has two cores. But I have not
tested it on more other systems yet.


>
> It passes my limited testing, but I wasn't really able to
> reproduce the problem without modifying the sched_setaffinity()
> call in question to use bigger cpusetsize.
>
> [...]
>
> > diff -uprN nginx-1.1.12/src/core/nginx.c
> nginx-1.1.12-bugfix/src/core/nginx.c
> > --- nginx-1.1.12/src/core/nginx.c     2011-11-16 04:35:42.000000000 +0800
> > +++ nginx-1.1.12-bugfix/src/core/nginx.c      2012-01-12 11:06:00.000000000
> +0800
> > @@ -1245,15 +1245,15 @@ ngx_set_cpu_affinity(ngx_conf_t *cf, ngx
> >      ngx_core_conf_t  *ccf = conf;
> >
> >      u_char            ch;
> > -    u_long           *mask;
> > +    cpu_set_t        *mask;
> >      ngx_str_t        *value;
> > -    ngx_uint_t        i, n;
> > +    ngx_uint_t        i, j, n;
> >
> >      if (ccf->cpu_affinity) {
> >          return "is duplicate";
> >      }
> >
> > -    mask = ngx_palloc(cf->pool, (cf->args->nelts - 1) * sizeof(long));
> > +    mask = ngx_palloc(cf->pool, (cf->args->nelts - 1) *
> sizeof(cpu_set_t));
> >      if (mask == NULL) {
> >          return NGX_CONF_ERROR;
> >      }
>
> [...]
>
> While I generally agree that using cpu_set_t and CPU_* macros
> explicitly would be better, I don't think we want to introduce
> linux-centric code here in platform-independent code.
>
> Instead we probably want to leave this code generic enough to be
> useable with other scheduler interfaces, and convert the data
> somewhere near platform-specific whatever_setaffinity() calls if
> needed.  (Likely unneeded for the Linux with the above patch.)
>
>
We use cpu_set_t and CPU_* macros in the patch not only because it's more
generic and not depend on the implementation of the kernel (I guess the
code in Nginx now is inspired by some _early_ kernel implementation, i.e.
Linux 2.5.x), but also we will have no 32-core-limitation any more
('"worker_cpu_affinity" supports up to 32 CPU only').

I agree that the set_cpu_affinity related code in Nginx core should be
platform-independent. But the code now is _not_ (please look at those
NGX_HAVE_SCHED_SETAFFINITY macros). And I do recommend move the
implementation of ngx_set_cpu_affinity() to the os/ directory when
worker_cpu_affinity is supported on other platforms.

Thank you very much for the review :)

Regards,

-- 
Joshua Zhu
Senior Software Engineer
Server Platforms Team at Taobao
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mailman.nginx.org/pipermail/nginx/attachments/20120116/d1b9afcc/attachment.html>


More information about the nginx mailing list