I'm having a problem with the Apache web server when running Poplog
processes. It's probably a Unix problem - I'd be grateful for
suggestions. Briefly stated, my CGI script runs a program which
creates a child process. I want that process to live on after the CGI
script exits so that it can be interacted with by later transactions.
Unfortunately, Apache refuses to terminate the first transaction until
the child dies.
More specifically, my CGI script is a shell script which calls a Pop-11
handler program, similar to Steve's. It does this simply with the command
'pop11 handler.p' - I haven't yet bothered to build myself a
light-weight executable.
The handler then creates a child process which runs my Eden AI program.
I don't want the handler to wait until Eden has exited, so I call
pipeout, with a dummy repeater (the child never uses it), a wait=false
parameter, and command and argument parameters which invoke Pop-11 on
Eden. The handler grabs this child's first few results and writes them
to its output stream, and then exits. It leaves the child running so
that later transactions can resume it and get some later results in the
same series. Synchronisation between the handler and the child uses
files as semaphores - not very efficient, but OK for testing.
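
For anyone unfamiliar with the trick, here is the file-as-semaphore idea
sketched in Python rather than Pop-11. This is only an illustration of the
general technique, not my actual code: the function names, the polling
interval, and the timeout are all made up for the example.

```python
import os
import time


def signal(path):
    """Raise the semaphore by creating an empty file at `path` (hypothetical name)."""
    with open(path, "w"):
        pass


def wait_for(path, timeout=5.0, poll=0.05):
    """Poll until the file exists, then remove it to take the semaphore.

    Returns True if the semaphore was taken, False if `timeout` seconds
    passed without the file appearing.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            os.remove(path)  # succeeds only if the other side has signalled
            return True
        except FileNotFoundError:
            time.sleep(poll)  # not signalled yet; sleep rather than spin
    return False
```

One side calls signal() when it wants the other to proceed, and the other
sits in wait_for(). The polling is what makes it inefficient.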
All this works very well if I try it in stand-alone mode, 'exec'ing the
CGI script directly or running the Pop-11 handler directly rather than
via Apache. However, when I start the script via a Web transaction, it
seems that Apache wants to wait until the child has finished. I can
follow this by doing successive calls of 'ps -gx'. If I kill the child,
my browser abruptly returns all the results (which show, incidentally,
that my semaphore mechanism is working).
So here are results from ps -gx just after I invoked the CGI script from
my browser:
13734 ? S 0:00 /bin/csh /usr/local/poplog/www/eden.cgi (CGI script)
13736 ? R 0:00 pop11 handler.p (handler)
13739 ? D 0:00 $popsys/pop11 eden/eden.p (child)
And here's what happens a few seconds later. The handler program has
terminated, as I expected. The child is still alive, but is waiting on a
semaphore, and is not holding up any other process. All the results have
been generated. However, Apache is still hanging on. For some reason,
the CGI process is shown as <defunct>, rather than having disappeared.
13734 ? Z 0:00 <defunct> (CGI script)
13739 ? R 0:13 $popsys/pop11 eden/eden.p (child)
If I then kill the child, my browser suddenly spews out all the results,
and ps shows the CGI script process to have vanished.
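
The same symptom can be reproduced without Poplog or Apache. The sketch below
is in Python and rests on an assumption I have not checked against Apache
itself: that the server reads the script's output through a pipe until
end-of-file. The handler sources and the function name are hypothetical
stand-ins. A reader of a pipe sees end-of-file only once every process holding
the write end has closed it, so a background child that inherits the handler's
standard output keeps the reader waiting.

```python
import subprocess
import sys
import time

# Hypothetical stand-in for the handler: prints some results, starts a
# long-running child that inherits its stdout, then exits.
INHERITING_HANDLER = r"""
import subprocess, sys
print("first results", flush=True)
subprocess.Popen([sys.executable, "-c", "import time; time.sleep(3)"])
"""

# Same handler, but the child's standard streams point at the null device,
# so the child holds no copy of the pipe's write end.
DETACHED_HANDLER = r"""
import subprocess, sys
print("first results", flush=True)
subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(3)"],
    stdin=subprocess.DEVNULL,
    stdout=subprocess.DEVNULL,
    stderr=subprocess.DEVNULL,
)
"""


def time_to_eof(handler_source):
    """Run a handler as a server might and time how long its output takes to reach EOF."""
    start = time.monotonic()
    proc = subprocess.Popen(
        [sys.executable, "-c", handler_source], stdout=subprocess.PIPE
    )
    output = proc.stdout.read()  # blocks until every writer has closed the pipe
    proc.wait()
    return output, time.monotonic() - start


if __name__ == "__main__":
    for name, src in [("inheriting", INHERITING_HANDLER), ("detached", DETACHED_HANDLER)]:
        out, elapsed = time_to_eof(src)
        print(f"{name}: {out!r} after {elapsed:.1f}s")
```

In the first case the read finishes only when the sleeping child exits; in the
second it finishes as soon as the handler does. Whether the same redirection
can be arranged through pipeout or sysfork is something I have not verified.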
Almost certainly, this is some property of Unix which I don't
understand. (Someone recently reported on the Unix servers group that
the same happened with the Netscape Web server.) Any ideas, anyone? The
same happens, incidentally, if I use sysfork to create the child.
Jocelyn Paine