[PATCH] Allow binary upgrades in Solaris zones
Maxim Dounin
mdounin at mdounin.ru
Thu Jan 6 02:53:36 MSK 2011
Hello!
On Wed, Jan 05, 2011 at 07:22:08PM +0000, doug at hcsw.org wrote:
> Hello nginx-devel,
>
> Thank you very much for nginx.
>
> When running nginx in a Solaris zone, I am unable to do a binary upgrade without
> fully stopping and starting nginx. When I send the master process a USR2 signal,
> it refuses to do the upgrade and writes the following log message:
>
> 2011/01/04 16:00:23 [crit] 3818#0: the changing binary signal is ignored: you should shutdown or terminate before either old or new binary's process
>
> After looking at the code, it seems that nginx assumes if the master process's parent
> does not have PID == 1, then nginx is not running in stand-alone daemon mode and the
> upgrade should not be attempted.
>
> My problem is that in Solaris zones the master process's parent is actually the
> zsched process and this never has PID == 1. The real init process is not visible
> inside the zone at all.
Yes, nginx checks whether the previous upgrade has finished by
checking that the parent pid is 1. This behaviour is indeed not
portable, as POSIX[1] says:
[1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/_Exit.html
% The parent process ID of all of the existing child processes and
% zombie processes of the calling process shall be set to the
% process ID of an implementation-defined system process. That is,
% these processes shall be inherited by a special system process.
...
% Historically, the implementation-defined process that inherits
% children whose parents have terminated without waiting on them is
% called init and has a process ID of 1.
So basically nginx relies on historical behaviour.
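To make the assumption concrete, here is a minimal sketch of the historical check. The helper name parent_is_init() is made up for illustration and this is not nginx's actual code; it only shows the shape of a test that hardcodes pid 1.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from the nginx sources.
 * Returns 1 if the given parent pid is the historical init pid (1),
 * i.e. the caller is assumed to have been orphaned and adopted by init.
 * POSIX only promises adoption by an implementation-defined process,
 * so this returns 0 forever where the adopting process has another pid.
 */
static int
parent_is_init(pid_t ppid)
{
    return ppid == 1;
}
```

A caller would use it as parent_is_init(getppid()). In a Solaris zone the adopting process is zsched, whose pid is not 1, so a check of this shape never succeeds there.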
> I am attaching a patch against 0.9.3 that (only if NGX_SOLARIS is defined) checks to
> see if a root process can send a signal to init and, if not, assumes we are running
> in a zone and goes ahead with the binary upgrade. With this patch I am able to do
> 0-downtime binary upgrades in Solaris zones with no problems. Any other solutions
> would also be appreciated.
I don't really like this approach: it looks fragile and actually
adds another non-portable hack instead of fixing the original
non-portability.
Probably passing the real parent pid from the old binary and
checking whether getppid() [doesn't] match it would be a better
approach (at least, it should be portable).
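A rough sketch of that idea, under stated assumptions: the old master records its own pid in an environment variable before exec'ing the new binary, and the new binary compares it with getppid(). The variable name EXAMPLE_OLD_MASTER_PID and both helper functions are hypothetical, and passing the pid through the environment is just one possible channel. This is an illustration, not a patch.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for setenv() */
#endif

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical variable name, chosen only for this example. */
#define OLD_MASTER_ENV  "EXAMPLE_OLD_MASTER_PID"

/*
 * Old binary side: record the master's pid in the environment that
 * the new binary will inherit. Returns 0 on success, -1 on error.
 */
static int
export_master_pid(pid_t pid)
{
    char  buf[32];

    snprintf(buf, sizeof(buf), "%ld", (long) pid);

    return setenv(OLD_MASTER_ENV, buf, 1);
}

/*
 * New binary side: returns 1 if the recorded old master is no longer
 * our parent (we were reparented, whatever pid the adopting process
 * has), or if nothing was recorded at all. Returns 0 while the
 * recorded old master is still our parent.
 */
static int
old_master_gone(pid_t current_ppid)
{
    const char  *s;
    char        *end;
    long         recorded;

    s = getenv(OLD_MASTER_ENV);
    if (s == NULL) {
        return 1;   /* started normally, no upgrade in progress */
    }

    recorded = strtol(s, &end, 10);
    if (end == s || *end != '\0') {
        return 1;   /* assumption: treat a malformed value as unset */
    }

    return (long) current_ppid != recorded;
}
```

The old master would call export_master_pid(getpid()) before exec, and the new master would accept a further USR2 only when old_master_gone(getppid()) returns 1. No pid value is hardcoded, so the check does not care whether orphans are inherited by init, zsched or anything else.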
Maxim Dounin