TIP 86: Improved Debugger Support

Author:         Peter MacDonald <[email protected]>
State:          Draft
Type:           Project
Vote:           Pending
Created:        08-Feb-2002
Post-History:   
Tcl-Version:    9.1
Implementation-URL: http://pdqi.com/download/tclline-8.4.9.diff.gz

Abstract

This TIP proposes that Tcl store source code file-name and line-numbering information and make it available at script execution time. It also adds trace and info subcommands to make it easier for a debugger to control a Tcl script, much as gdb can control a C program.

Rationale

Currently, although Tcl provides quite reasonable information to users in error traces, the line numbers within those traces are always relative to the evaluation context containing them (often the procedure, but not always) and not to the script file containing the procedure. This is substantially different to virtually every other computer language and makes correlating errors with the source line that caused them much more difficult. This also makes coupling a Tcl interpreter to an external debugging tool more difficult. This TIP proposes adding new interfaces to the Tcl core to make such debugging activity easier.
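As a minimal illustration of the behavior described above, the following sketch uses only stock Tcl (8.5 or later, for the options dictionary of catch) to show that the line reported in an error trace is relative to the procedure body rather than to any file. The procedure name demo is purely illustrative.

```tcl
# Illustrative procedure; the error is raised on the 4th line of the body
# (the body's first line is the remainder of the line holding the open brace).
proc demo {} {
    set a 1
    set b 2
    error "boom"
}

# Catch the error and pull the reported line number out of the error trace.
catch {demo} msg opts
set trace [dict get $opts -errorinfo]
set found [regexp {\(procedure "demo" line (\d+)\)} $trace -> relLine]

puts $trace
puts "line reported for demo: $relLine (relative to the proc body, not the file)"
```

Wherever demo is defined within a script file, the reported number stays the same, which is the correlation problem this TIP sets out to address.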

A new trace execution option enables Tcl to track line number and source file information associated with statements being executed and to call a single callback. A new info line option provides access to line number information. As a result, it becomes a simple matter to implement a debugger for Tcl, in Tcl. Furthermore, the implementation also serves as an example of using the C interface, enabling similar capabilities at the lower level.

A simple Tcl debugger, tgdb, written in Tcl and emulating gdb, is included with this TIP to demonstrate the use of this interface. tgdb runs and controls a Tcl application in a sub-interp using trace execution and interp alias. It supports breakpoints on lines, procs, vars, etc., single-stepping, moving up and down the stack, and evals. It is designed to work both as a command-line tool and as a slave process (see Reference Implementation).

Finally, upon error within a procedure, the file path and absolute (as opposed to relative) line number are printed out when available, even when the procedure was called from an after or callback invocation. Aside from helping the user locate and deal with errors more easily, the message is machine parseable; for example, a tool could automatically bring the user into an editor at the offending line.
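The exact message format produced by the patch is not given here. As a hedged sketch of the machine-parsing idea, the example below parses the (file "..." line N) annotation that stock Tcl already adds to errorInfo when a sourced file fails at its top level. The helper name parseLocation is hypothetical, and file tempfile assumes Tcl 8.6 or later.

```tcl
# Hypothetical helper: extract {file line} from an error trace, or return
# an empty list when no file annotation is present.
proc parseLocation {errorInfo} {
    if {[regexp {\(file "([^"]+)" line (\d+)\)} $errorInfo -> file line]} {
        return [list $file $line]
    }
    return {}
}

# Write a throwaway script whose third line raises an error.
set fh [file tempfile path]
puts $fh "set a 1\nset b 2\nerror boom"
close $fh

# Source it, capture the error trace, and recover the location.
catch {source $path} msg opts
file delete $path
set loc [parseLocation [dict get $opts -errorinfo]]
puts "location: $loc"
```

A front end could then hand the file and line to an editor; that step is omitted because the invocation depends on the editor in use.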

Specification

First, a new execution subcommand to the trace command.

trace execution target ?level?

This arranges for an execution trace to be set up for commands at the given nesting level or above, thereby providing a simple Tcl interface for tracing commands, say, to implement a debugger. With no arguments, the current target is returned. If target is the empty string, the execution trace is removed. The target argument is assumed to be a command string to be executed. When level is not specified, it defaults to 0, meaning that all commands are traced. For each traced command, the following data will be produced:

The target is presumed to be a valid Tcl command, onto which the above arguments are appended before evaluation. Any return from the target other than a normal return results in the traced command not being executed. As with all traces, execution tracing is disabled within a trace handler.
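The proposed trace execution target ?level? form is not available in stock Tcl, and its callback arguments are not reproduced here. For comparison only, the sketch below uses the existing trace add execution ... enterstep mechanism (Tcl 8.4 and later) to log each command run inside one procedure. The names onStep and work are illustrative. Unlike the proposal, this stock mechanism attaches to a named command and passes no file or line information to the callback.

```tcl
set ::steps {}

# Step callback: stock Tcl appends the command string and the operation name.
proc onStep {cmdString op} {
    lappend ::steps $cmdString
}

proc work {} {
    set x 1
    incr x
}

# Trace every command executed while work is running, then clean up.
trace add execution work enterstep onStep
work
trace remove execution work enterstep onStep

puts [join $::steps \n]
```

The callback sees only the command text, so a debugger built on it has no direct way to map a step back to a source line; that is the gap the new subcommand is meant to close.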

Second, a new line subcommand to info gives access to the file path and line number information. It takes subcommands of its own in turn:

These exhibit the following behavior:

Third, a new return subcommand to info.

Fourth, an additional flag option, debug, to trace add variable.

Fifth, a new breakpoint subcommand to the trace command.

trace breakpoint ?line file ?level ...??

The trace breakpoint subcommand manages a list of breakpoints that cause an execution trace to trigger even when the nesting level is exceeded. With no arguments, it returns a list of all breakpoints as triples of line, file, and state. With two arguments, the current state of the breakpoint is returned. With three or more arguments, new breakpoints are created. A breakpoint created with a state of zero is considered inactive. Setting the state of a breakpoint to the empty string effectively deletes it. A state set to N greater than zero causes the breakpoint to trigger every Nth time.
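The state rules above can be modeled in a few lines of plain Tcl (8.5 or later, for dict). This is only an illustrative model of the described semantics, not the patch's implementation: the bpmodel namespace and its procedures are hypothetical, and "every Nth time" is assumed to mean the Nth, 2Nth, ... hit.

```tcl
namespace eval bpmodel {
    # Maps the key {line file} to a dict holding "state" and "hits".
    variable bps [dict create]
}

# Create or update a breakpoint; an empty state deletes it.
proc bpmodel::setState {line file state} {
    variable bps
    set key [list $line $file]
    if {$state eq ""} {
        dict unset bps $key
    } else {
        dict set bps $key [dict create state $state hits 0]
    }
}

# Return all breakpoints as a flat list of line, file, state triples.
proc bpmodel::listAll {} {
    variable bps
    set out {}
    dict for {key info} $bps {
        lassign $key line file
        lappend out $line $file [dict get $info state]
    }
    return $out
}

# Record that execution reached line/file; return 1 if the breakpoint fires.
proc bpmodel::hit {line file} {
    variable bps
    set key [list $line $file]
    if {![dict exists $bps $key]} { return 0 }
    set n [dict get $bps $key state]
    if {$n <= 0} { return 0 }           ;# state 0 means inactive
    set hits [expr {[dict get $bps $key hits] + 1}]
    dict set bps $key hits $hits
    return [expr {$hits % $n == 0}]
}

# Usage: a breakpoint with state 2 fires on every second hit.
bpmodel::setState 12 app.tcl 2
set fired {}
foreach i {1 2 3 4} { lappend fired [bpmodel::hit 12 app.tcl] }
puts "fired: $fired"
```

Keeping the hit counter next to the state makes the inactive, periodic, and deleted cases fall out of a single lookup.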

Changes

Sourced file paths are stored per interp in a hash table. File/line numbering information is also stored in the Interp, Proc, After, and CallFrame structures. Newline counting/shifting code was added to proc, while, for, foreach, and if. All but the most trivial code is active only when the new TRACE_LINE_NUMBERS interp flag is set, which is the case when using trace execution.

Most new variables within Interp are in the struct subfield sourceInfo of type Tcl_SourceInfo, which can be retrieved via the new Tcl_GetSourceInfo(interp) stubbed/public call.

Overhead/Impact

The runtime impact on Tcl should be modest: a few tens of kilobytes of memory, even for moderately large programs. Most of the space impact occurs in storing the file paths. A typical example from a large system:

  100 sourced files * 100 bytes = 10K.

The other space overhead adds up to several words (8 bytes on a 32-bit platform) per defined procedure, plus additional words in the Interp structure.

Runtime processing overhead should be negligible.

However, no benchmarks have been run to validate these assertions.

Reference Implementation

This patch is against Tcl 8.4.9 and represents a complete rework of the approach.

http://pdqi.com/download/tclline-8.4.9.diff.gz

There is a simple demonstration debugger script: tgdb.tcl.

http://pdqi.com/download/tgdb.tcl

Previous/Old Reference Implementation

http://pdqi.com/download/tclline-cvs.diff.gz - Patch against CVS head.

http://pdqi.com/download/tclline-8.4.6.diff.gz - Patch against Tcl 8.4.6

The CVS patch is against the CVS head as of June 13, 2004. These patches have been lightly tested against numerous small Tcl programs.

There is also an initial version of a debugger: tgdb.

http://pdqi.com/download/tgdb-2.0.tar.gz

tgdb emulates the basic commands of gdb (s, n, c, f, bt, break, info locals, etc). This newest version also supports watchpoints and display variables. With load and run commands added, tgdb should probably work even with emacs and ddd.

An additional package pdqi provides tdb, a GUI front-end to gdb, modified to also work with tgdb.

Possible Future Enhancements

Build and store a line number table internally during parse?

Line number lookup via the source string. A simple way to implement this might be to look up the string against codePtr->source+bestSrcOffset as returned by GetSrcInfoForPc().

Add special handling for eval. Cases like eval $str should eventually be changed to report a line number of 0 (or more likely the line number of the original statement) for all statements with any argument involving a sub-eval.

Possibly implement character offsets within a line.
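The lookup ideas in the list above amount to mapping a character offset in a source string to a line number and, optionally, a character offset within that line. A minimal sketch in plain Tcl, assuming the offset is already known (in the core it would come from something like bestSrcOffset); the procedure name lineAndColumn is hypothetical, lines are counted from 1, and columns from 0:

```tcl
# Hypothetical helper: map a character offset in a script to {line column}.
proc lineAndColumn {source offset} {
    # Everything before the offset; empty when offset is 0.
    set prefix [string range $source 0 [expr {$offset - 1}]]
    # One line per newline seen so far, plus the current line.
    set line [expr {[regexp -all {\n} $prefix] + 1}]
    # Column is the distance from the last newline (-1 if none).
    set lastNl [string last "\n" $prefix]
    set column [expr {$offset - $lastNl - 1}]
    return [list $line $column]
}

# Usage: locate the start of the puts command and of its argument.
set src "set a 1\nset b 2\nputs \$a\n"
set cmdPos [lineAndColumn $src [string first "puts" $src]]
set argPos [lineAndColumn $src [string first "\$a" $src]]
puts "puts starts at $cmdPos, its argument at $argPos"
```

Counting newlines on demand avoids storing a table, at the cost of a scan per lookup; the internal line number table mentioned in the first item would trade memory for that scan.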

Notes

A test has been added to tests/trace.test. The test uses a provided utility, trcline.tcl, to give some measure of the accuracy of the line number tracing.

Comments and Feedback

Jeff Hobbs asked about interp alias, etc.

Jeff Hobbs notes filename storage is inefficient and finalization

Neil Madden/Stephen Trier comment on info subcommand names line, file and proc and possible future uses for line

Donal Fellows writes: Is there a way to do an equivalent of #line directives in C?

Donald Porter notes that changing Tcl_Parse breaks binary compatibility

Donald Porter notes that the hash table should be per Interp

Mo DeJong notes: file path should be used in place of file name

Mo DeJong suggests to maybe use TclpObjNormalizePath(fileName)

Donal Fellows objects to no support for procs in subevals and Andreas Kupries suggests defining a line number Tcl_Token type.

Donal Fellows asks if trace is disabled in the execution handler, how tracing to a sub-interp would work, and clarification on the purpose and use of trace variable {debug}.

Copyright

This document has been placed in the public domain.

tgdb and pdqi have a BSD copyright by Peter MacDonald and PDQ Interfaces Inc.