TIP:            240
Title:          An Ensemble Command to Manage Processes
Version:        $Revision: 1.14 $
Author:         Steve Bold <stevebold@hotmail.com>
State:          Draft
Type:           Project
Vote:           Pending
Created:        22-Feb-2005
Post-History:   
Keywords:       Tcl
Obsoletes:      88
Tcl-Version:    8.7

~ Abstract

This TIP proposes some new commands through which Tcl scripts can create and
monitor child processes.

~ Rationale

This TIP is intended to overcome the following limitations of the existing
'''exec''' and '''open''' commands:

 1. While the stderr stream of a child process can be redirected to a file, it
    cannot be directed to a pipe and so cannot be captured progressively as
    the process runs. [202] has partially addressed this issue but only for
    the case where the child's stderr stream is directed to the same pipe as
    its stdout stream. Independent progressive capture of both stdout and
    stderr is still not possible.

 2. In the (admittedly rare) case that a program has a significant delay
    between closing its standard streams and the process itself terminating, a
    Tcl script running that program as a background process cannot determine
    the exit status without blocking until the process terminates.

 3. The existing '''exec''' and '''open''' commands impose a special
    interpretation on the characters ''<>|&''. This causes two kinds of
    problems:

 > * scripts wishing to invoke a command on a remote computer using an ''rsh''
     or similar command will sometimes wish to have characters such as
     ''<>|&'' interpreted on the remote machine

 > * scripts may pass a user entered string as an argument to exec. Such
     scripts may break unexpectedly if the comment string contains one of the
     special characters. Such problems could be considered a security weakness
     in Tcl.

 4. Multiple child processes can be launched together with pipes used to link
    the streams of adjacent processes. However, little flexibility is provided
    in such cases, for example you can only capture the exit status of the
    last process in the pipeline.

A more general problem is that each process related command is a separate
top-level command. This is inconsistent with much else in Tcl, makes it harder
to find the related commands in some forms of documentation and increases the
risk of name clashes as new process related commands are introduced.

The BLT toolkit contains the command '''bgexec''' which addresses items (1)
and (2) in the above list. However, the resulting implementation is complex
and does not appear easy to transfer to the Tcl core. In addition, it is not
clear to the author how '''bgexec''' could be extended to address items (3)
and (4).

A variety of other approaches to addressing these problems are listed on the
Wiki [http://wiki.tcl.tk/1353]. This suggests that it may be difficult to
achieve a consensus on what the ideal command(s) for launching processes
should look like. This TIP provides a basis through which many of these
approaches could be implemented in pure Tcl. The commands specified in this
TIP map easily onto the existing low level process related functions in the
Tcl core, so the implementation cost is low.

~ Specification

There shall be a new ensemble command, '''process''', with at least four
subcommands.

 1. The sub-command '''invoke''' takes 4 arguments and invokes a sub-process,
    returning the process id of the child process. The arguments are (in
    order):

 > * a list containing the program name invoke and its arguments

 > * a channel to be connected to the stdin stream of the child process (or an
     empty string if the channel is to be disconnected in the child process).

 > * a channel to be connected to the stdout stream of the child process (or
     an empty string if the channel is to be disconnected in the child
     process).

 > * a channel to be connected to the stderr stream of the child process (or
     an empty string if the channel is to be disconnected in the child
     process).

 2. The sub-command '''pipe''' takes no arguments and returns a two element
    list containing the input and output channels of the pipe in that order.

 3. The sub-command '''status''' takes a single argument which is a process id
    and returns a two element list. The first element is either ''running'' or
    ''completed'' The second element is the exit status of the process.

 > A process will report an arbitrary exit status of zero while it is running.

 4. The sub-command '''wait''' is similar to '''status''' but blocks until the
    child process has completed.

~ Examples

The following shows how the commands proposed here can be used to produce a
'''bgexec''' like command in pure Tcl. Not all the '''bgexec''' options are
included and the implementation lacks the error handling needed for a robust
implementation.

|proc bgExecCloseHandler {pid cmd} {
|   lassign [process status $pid] status exitCode
|   if {$status eq "running"} {
|      puts "... deferring close handling for $pid"
|      after 1000 [list bgExecCloseHandler $pid $cmd]
|   } else {
|      if {$cmd ne ""} {
|         {expand}$cmd $pid $exitCode
|      }
|   }
|}
|
|proc bgExecReadHandler {chan cmd} {
|   if {[gets $chan line] == -1} {
|      close $chan
|      
|      if {[info exists ::bgExecCloseInfo($chan)]} {
|         lassign $::bgExecCloseInfo($chan) pid cmd
|         after 0 bgExecCloseHandler $pid $cmd
|         unset ::bgExecCloseInfo($chan)
|      }
|   } else {
|      {expand}$cmd $line
|   }
|}
|
|proc bgExecLike {args} {
|   set outChan ""; set errChan ""
|   set i 0
|   set exitCmd ""; set parentOutChan ""
|   while {$i != [llength $args]} {
|      set arg [lindex $args $i]
|      switch -glob -- $arg {
|      	
|      	-onoutput {
|      	   incr i
|      	   set cmd [lindex $args $i]
|      	   lassign [process pipe] parentChan outChan
|      	   fileevent $parentChan readable [list \
|                  bgExecReadHandler $parentChan $cmd]
|      	   set outCmd $cmd
|      	   set parentOutChan $parentChan
|      	}
|      	
|      	-onerror {
|      	   incr i
|      	   set cmd [lindex $args $i]
|      	   lassign [process pipe] parentChan errChan
|      	   fileevent $parentChan readable [list \
|                  bgExecReadHandler $parentChan $cmd]
|      	}
|      	
|      	-onexit {
|      	   incr i
|      	   set exitCmd [lindex $args $i]
|      	}
|      	
|      	-* {
|      	   error "Unknown switch $arg"
|      	}
|      	
|      	* {
|      	   break
|      	}
|      }
|      incr i
|   }
|   
|   set cmdLine [lrange $args $i end]
|
|   # puts [list process invoke $cmdLine "" $outChan $errChan]
|   set pid [process invoke $cmdLine "" $outChan $errChan]
|
|   # Close the child end of the pipes - if we opened them.
|   foreach var {outChan errChan} {
|      if {[set $var] ne ""} {
|         close [set $var]
|      }
|   }
|   
|   if {$parentOutChan eq ""} {
|      # Poll for child process exit then notify client, or at least
|      # clean up the zombie.
|      after 0 bgExecCloseHandler $pid $exitCmd
|   } else {
|      # We copy BLT's trick of deferring polling till the stdout pipe
|      # closes. This is marginally more efficient, more importantly
|      # it stops clients being notified of their process until at stdout
|      # channel has closed.
|      set ::bgExecCloseInfo($parentOutChan) [list $pid $exitCmd]
|   }   
|}
|
|
|# now show bgExecLike in action ...
|
|proc showExit {pid code} {
|   puts "$pid terminated with code $code"
|}
|
|proc showLine {channel line} {
|   puts "$channel: $line"
|}
|
|proc runLs {args} {
|   puts "invoking ls on $args"
|   bgExecLike -onoutput "showLine stdout" -onerror "showLine stderr" \
|           -onexit showExit ls {expand}$args
|}
|
|# Sample invocations: note when running under tclsh, there is no event loop,
|# use 'update' to see the output to see what's happening.
|
|# successful listing
|runLs .
|
|# Unsuccessful listing
|runLs not-found
|
|# Listing of (non existent) files containing exec/open meta characters
|runLs < > | &

~ Limitations

 1. For convenient use, the functionality proposed here needs to be
    supplemented with additional commands providing a higher level interface,
    perhaps one of them being similar to the '''bgExecLike''' example given
    previously. The author has decided to omit this feature from the TIP
    because:

 > * such commands can be implemented in pure Tcl using the commands described
     here

 > * the exact nature of the high level commands may produce lengthy
     discussions

 > * it could even be argued that such commands are more appropriate in tcllib
     rather than the Tcl core

 2. As with the current implementation of '''exec''', each channel passed to
    '''process invoke''' must have a valid underlying OS file handle.
    Consequently when running on Windows:

 > * use of a wish standard channel will be immediately rejected

 > * use of a socket will be accepted but will trigger an error in the child
     process.

 3. Efficiency - The author has not yet attempted a detailed performance
    study, but this proposal does have some theoretical inefficiencies when
    compared to a pure C implementation, such as '''bgexec''':

 > * an intermediate Tcl procedure is used to capture output from a pipe

 > * each end of each pipe has to be wrapped in a ''CommandChannel'' before it
     can be passed back to the calling script, even if the pipe is just going
     to be used to link together two processes in a pipeline.

~ Related Possibilities for Future Enhancements

 1. For Windows, an important limitation that is not addressed by this TIP, is
    the lack of control over the console window settings when invoking a
    process. This will require changes to ''TclpCreateProcess''.

 2. A '''kill''' subcommand would be a useful addition to the '''process'''
    ensemble. On Windows, the ability to kill a child console process cleanly
    is related to the choice of console mode, so this issue would ideally be
    addressed in conjunction with item (1) above.

 3. A command to categorise an exit status obtained from '''process status'''
    or '''process wait''' along similar lines to the data placed in
    ''$errorCode'' by ''TclCleanupChildren()''.

 4. Some aspects of the existing '''exec''' command depend on use of temporary
    files. Since this TIP transfers the high level implementation of process
    launching into Tcl scripts, support for creation of uniquely named
    temporary files, as proposed in [210], would be useful.

 5. The wish console on Windows could be improved, using this mechanism, so
    that program names typed interactively will run in the background,
    allowing output to be seen before the process completes.

 6. The ability to define ''argv[[0]]'', independently from the program name,
    would occasionally be useful. For example, some UNIX shells run as login
    shells when ''argv[[0]]'' begins with a dash.

 7. Public C functions for invoking a process and creating a pipe wrapped in
    command channels.

 8. Support for detaching process and for reaping detached processes.

 9. The existing '''exit''' command could be duplicated in the '''process'''
    ensemble.

 10. The basic form of the existing '''pid''' command, which obtains the
     process id of the current process, could be added to the '''process'''
     ensemble.

 11. In some cases, it is more appropriate to run a child process with its
     streams connected to a null file rather than disconnected. Since the name
     of the null file is platform specific, it would be helpful to have a
     platform independent way of accessing the name.

 12. An option to obtain full status information. On Windows, process exit
     codes are 32 bit. On UNIX, higher bits of a waitpid() status value
     distinguish termination via exit() from termination via an uncaught
     signal.

~ Reference Implementation

Submitted as patch 1315115
[https://sourceforge.net/support/tracker.php?aid=1315115]

~ Copyright

This document has been placed in the public domain.
