TIP 189: Tcl Modules

Author:         Andreas Kupries <[email protected]>
Author:         Jean-Claude Wippler <[email protected]>
Author:         Jeff Hobbs <[email protected]>
Author:         Don Porter <[email protected]>
Author:         Larry W. Virden <[email protected]>
Author:         Daniel A. Steffen <[email protected]>
State:          Final
Type:           Project
Vote:           Done
Created:        24-Mar-2004
Post-History:   
Tcl-Version:    8.5
Tcl-Ticket:     942881

Abstract

This document describes a new mechanism for the handling of packages by the Tcl Core which differs from the existing system in important details and makes different trade-offs with regard to flexibility of package declarations and to access to the filesystem. This mechanism is called "Tcl Modules".

Background and Motivation

The current mechanism for locating and loading packages employed by the Tcl core is very flexible, but suffers from a number of drawbacks as well. These are at least partially the result of the flexibility, and thus not easily solved without giving up something.

One problem with the current mechanism is that it extensively searches the filesystem for packages, and that it has to actually read a file (pkgIndex.tcl) to get the full information for a prospective package. All of these operations take time. The fact that "index scripts" are able to extend the list of paths searched tends to heighten this cost as it forces rescans of the filesystem. Installations where directories in the auto_path are large or mounted from remote hosts are hit especially hard by this (network delays). All of this together causes a slow startup of tclsh and Tcl-based applications.

"Tcl Modules" on the other hand is designed with less flexibility in mind and to allow implementations to glean as much information as possible without having to perform lots of accesses to the filesystem.

Additional benefits of the proposed design are a simplified deployment of packages, akin to the way starkits made application deployment simple, and from that an easier implementation and management of repositories.

It does not come without penalties however.

Specification

Introduction

Modules are regular Tcl Packages, in a different guise. To ease explanations, first a summary of the existing mechanism:

The above is very flexible, but comes at a price. The filesystem is not only searched, but files have to be read as well to build up the in-memory index of packages. And this is iterated if index files change/extend the list of paths to search.

Tcl Modules simplifies the above considerably, by cutting down on the number of indirections involved. It only searches for module files and records their location, but does not read them. The search is only performed when required, on a limited part of the filesystem. This makes locating and importing packages in module form easier and faster. The price is that packages in module form cannot prevent registration in an interpreter not of their choice, nor can they influence the package search itself before they are actually used.

The remainder of this document will cover the following topics

Module Definition

A Tcl Module is a Tcl Package contained in a single file, with no other files required by it. This file has to be sourceable. In other words, a Tcl Module is always imported via:

 source module_file

The "load" command is not directly used. This restriction is not an actual limitation, as one might believe. Ever since 8.4 the Tcl source command reads only until the first ^Z character. This allows us to combine an arbitrary Tcl script with arbitrary binary data into one file, where the script processes the attached data in any way it chooses to fully import and activate the package. Please read [190] "Implementation Choices for Tcl Modules" for more explanations of the various choices which are possible.
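The combination of a script with attached data can be sketched as follows. This is only a minimal illustration: the package name demo, the file name, the payload, and the command ::demo::payload are all invented for the example, and the layout is not one prescribed by this document or by [190].

```tcl
# Script part of a hypothetical module file demo-1.0.tm.
set script {
package provide demo 1.0
namespace eval ::demo {
    # Remember where this module was sourced from.
    variable self [info script]
}
proc ::demo::payload {} {
    variable self
    set ch [open $self r]
    # Binary translation also disables end-of-file character handling.
    fconfigure $ch -translation binary
    set data [read $ch]
    close $ch
    # Everything after the first ^Z (\x1a) is the attached data.
    return [string range $data [expr {[string first \x1a $data] + 1}] end]
}
}

# Write script + ^Z + attached binary data into one file.
set mf [file join [pwd] demo-1.0.tm]
set ch [open $mf w]
fconfigure $ch -translation binary
puts -nonewline $ch $script\x1a[binary format a3ca4 BIN 0 DATA]
close $ch

# source evaluates only the part in front of the ^Z.
source $mf
set attached [::demo::payload]
file delete $mf
```

The script half never sees the binary data while being sourced; it reopens its own file afterwards and picks up whatever follows the ^Z.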

The name of a module file has to match the regular expression

 ([[:alpha:]_][:[:alnum:]_]*)-([[:digit:]].*)\.tm

The first capturing group provides the name of the package, the second its version. In addition to matching the pattern, the extracted version number must not raise an error when used in the command

 package vcompare $version 0

This additional check has several benefits. The regular expression pattern is a bit simpler, and the full version check is based on the official definition of version numbers used by the Tcl core itself.
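The two-step check can be sketched as below. The helper name tm_parse is hypothetical and not part of the proposal; the pattern is the one given above, anchored at both ends so that it has to match the whole file name.

```tcl
# Hypothetical helper: return {name version} for a valid module file name,
# or an empty list if the name is rejected.
proc tm_parse {filename} {
    set re {^([[:alpha:]_][:[:alnum:]_]*)-([[:digit:]].*)\.tm$}
    # Step 1: the file name has to match the pattern.
    if {![regexp -- $re [file tail $filename] -> name version]} {
        return {}
    }
    # Step 2: the version has to be acceptable to the Tcl core itself.
    if {[catch {package vcompare $version 0}]} {
        return {}
    }
    return [list $name $version]
}
```

A name such as ice-1.x.tm passes the pattern but is rejected by the second step, which is why the pattern itself can stay simple.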

Finding Modules

Remember the check for a valid module in the previous section, and notice that any filename matching this name pattern is going to be treated by the TM system as if it is a Tcl module, whether it really is or not. This means it is a bad idea for any non-Tcl module files that might match that pattern to end up in a directory where TM will be scanning. This suggests that the directory tree for storing Tcl modules ought to be something separate from other parts of the filesystem. This further implies that a new search path over just these separate storage areas would be better than Yet Another Use of $::auto_path.

Therefore: Modules are searched for in all directories listed in the result of the command "::tcl::tm::path list" (See also section 'API to "Tcl Modules"'). This is called the "Module path". Neither "auto_path" nor "tcl_pkgPath" are used.

All directories on the module path have to obey one restriction: for any two directories on the path, neither may be an ancestor directory of the other.

This is required to avoid ambiguities in package naming. If for example the two directories

 foo/
 foo/cool

were on the path, a package named 'cool::ice' could be found via the names 'cool::ice' or 'ice', the latter potentially obscuring a package named 'ice', unqualified.
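The ambiguity in this example arises because one directory on the path is an ancestor of another. A minimal sketch of a check for that situation follows; both helper names are hypothetical and the comparison is done on normalized paths, which is an assumption of the sketch and not something this document specifies.

```tcl
# Hypothetical helper: true if directory a is a proper ancestor of directory b.
proc tm_is_ancestor {a b} {
    set a [file split [file normalize $a]]
    set b [file split [file normalize $b]]
    set n [llength $a]
    expr {$n < [llength $b] && [lrange $b 0 [expr {$n - 1}]] eq $a}
}

# Hypothetical helper: true if no directory in the list is an ancestor
# of another directory in the list.
proc tm_path_ok {dirs} {
    foreach a $dirs {
        foreach b $dirs {
            if {[tm_is_ancestor $a $b]} { return 0 }
        }
    }
    return 1
}
```

Comparing path components instead of raw strings keeps foo from being mistaken for an ancestor of foobar.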

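Before the search is started, the name of the requested package is translated into a partial path by replacing all occurrences of '::' in the name with the directory separator of the platform.

Example: the package name 'cool::ice' translates to the partial path 'cool/ice' on Unix.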

After this translation the package is looked for in all module paths, by combining them one-by-one, first to last, with the partial path to form a complete search pattern. The exact pattern and mechanism is left unspecified, giving the implementation freedom of choice as to what glob searches to perform, how many of them, and when.

Independent of that, the implemented algorithm has to reject all files where the filename does not match the regular expression given in the previous section. For the remaining files "provide scripts" are generated and added to the package ifneeded database.

The algorithm has to fall back to the previous unknown handler when none of the found module files satisfy the request. If the request was satisfied no fall-back is required.
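Since the exact search is left to the implementation, the following is only one possible sketch of it, not the reference implementation. The command name tm_scan and its arguments are hypothetical; it performs a single glob per directory and returns the number of modules it registered, so that a caller could fall back to the previous unknown handler when the result is zero.

```tcl
# Hypothetical sketch of the search; tm_scan is not part of the proposed API.
proc tm_scan {mpaths pkgname} {
    set re {^([[:alpha:]_][:[:alnum:]_]*)-([[:digit:]].*)\.tm$}
    # Translate the package name into a partial path, e.g. a::b -> a/b.
    set partial [string map {:: /} $pkgname]
    set found 0
    foreach dir $mpaths {
        # Non-existent directories on the module path are skipped.
        if {![file isdirectory $dir]} { continue }
        foreach mf [glob -nocomplain -directory $dir -- $partial-*.tm] {
            # Reject file names which do not match the module pattern,
            # or whose version part is not a valid version number.
            if {![regexp -- $re [file tail $mf] -> base version]} { continue }
            if {[catch {package vcompare $version 0}]} { continue }
            # Register the module; the provide script is "source MF".
            package ifneeded $pkgname $version [list source $mf]
            incr found
        }
    }
    return $found
}
```

Note that the module files are only located and recorded here; none of them is read until a later package require selects one.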

Provide and Index Scripts

Packages in module form have no control over the "index script" and "provide script" entered into the package database for them. For a module file MF the "index script" is

 package ifneeded PNAME PVERSION [list source MF]

and the "provide script" embedded in the above is

 source MF

Both package name PNAME and package version PVERSION are extracted from the filename MF according to the definition below:

 MF = /module_path/PNAME'-PVERSION.tm

Where PNAME' is the partial path of the module as defined in section 'Finding Modules' before, and translated into PNAME by changing all directory separators to '::', and module_path is the path (from the list of paths to search) that we found the module file under.

Note that we are here creating a connection between package names and paths. Tcl is case-sensitive when it comes to comparing package names, but there are filesystems which are not, like NTFS. Luckily these filesystems do store the case of the name, despite not using the information when comparing.

Given the above we allow the names for packages in Tcl modules to have mixed case, but also require that there are no collisions when comparing names in a case-insensitive manner. In other words, if a package 'Foo' is deployed in the form of a Tcl Module, packages like 'foo', 'fOo', etc. are not allowed anymore.
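A repository or installer could verify this requirement with a check along the following lines. This is a sketch under the assumption that the list of package names is already known; the command name tm_case_collisions is hypothetical.

```tcl
# Hypothetical helper: return the pairs of package names which differ
# only in letter case, e.g. {Foo foo}.
proc tm_case_collisions {names} {
    set seen [dict create]
    set clashes {}
    foreach n $names {
        set key [string tolower $n]
        if {![dict exists $seen $key]} {
            dict set seen $key $n
        } elseif {[dict get $seen $key] ne $n} {
            lappend clashes [list [dict get $seen $key] $n]
        }
    }
    return $clashes
}
```

The same name appearing twice, as happens when several versions of one package are installed, is not reported as a collision.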

Regular packages have no problem with the names of their files, as their entry point has a standard name ("pkgIndex.tcl") and its contents can be adjusted according to the filesystem they are stored in.

API to "Tcl Modules"

"Tcl Modules" is implemented in Tcl, as a new handler command for package unknown. This command calls the previously installed handler when its own search fails, thereby ensuring proper fall-back to the regular package search.

All code and data structures implementing "Tcl Modules" reside in the namespace "::tcl::tm".

A namespace variable holds the list of paths to search for modules, but is not officially exported. All access to this variable is done through the following public commands:

We do not provide APIs for rescanning directories, clearing internal state and such. The official interface to this functionality is "package forget" and special interfaces are neither required nor desirable.
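As an illustration of how the module path is manipulated, the sketch below uses the add, remove and list subcommands of ::tcl::tm::path as they exist in the implementation shipped with Tcl 8.5 and later. Only the list subcommand is named in the text above, so the other two should be read as taken from that implementation. The directory name is hypothetical.

```tcl
# Hypothetical directory; it does not have to exist to be on the path.
set dir [file join [pwd] tm-demo-modules]

# Extend the module path and confirm the directory is now searched.
::tcl::tm::path add $dir
set listed [expr {$dir in [::tcl::tm::path list]}]

# Take it off the path again.
::tcl::tm::path remove $dir
set gone [expr {$dir ni [::tcl::tm::path list]}]
```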

Discussion

Restriction to "source"

This has already been discussed in the specification above.

For more discussion I again refer to [190] "Implementation Choices for Tcl Modules" which explains the various implementation choices in much more detail.

Preconditions

It has already been mentioned in section 'Background and Motivation' that preconditions in "index scripts" are lost, one of the penalties of the simplified scheme specified here.

Their existence was most important to installations with multiple versions of Tcl coexisting with each other as they could share the directory hierarchy containing packages between the various Tcl cores. This is not possible anymore, at least not in a simple manner.

For the majority of installations however, i.e. those with only one version of Tcl installed, or controlled environments like the inside of starkits and starpacks, this loss is irrelevant and of no consequence.

For more discussion please see [191] "Managing Tcl Packages and Modules in a Multi-Version Environment" which explains the various choices a sysadmin has in much more detail.

Package Metadata

An area possibly made harder by Tcl Modules is the storage and query of package metadata. [59] was one way of handling such information, by storing it in the binary library of packages which have one. Another approach was to store it in the package index script, using a hypothetical "package about" command.

The latter approach has the definite advantage that it was possible to query the database of metadata for a particular package without having to actually load said package, as a load may fail if the Tcl shell used to query the database does not fulfil the preconditions for that package.

Both approaches listed above assume that it makes sense to query the database of metadata for all installed packages from a plain Tcl shell. In other words, to use the standard Tcl shell also as the tool to directly manage an installation.

It is possible to extend the proposal made in this document to handle metadata as well. We already reserved the namespace ::tcl::tm for use by us, so it is no problem to extend the public API with commands to locate all installed packages, their metadata, and to perform queries based on this. This will require an additional specification as to how metadata is stored in/by Tcl Modules, and it will have to be understood that these extended management operations can take considerably more time than a package require, as they will have to scan all defined search paths and all their sub directories for Tcl Modules, and have to extract the metadata itself as well.

Deployment

The fact that a Tcl Module consists only of a single file makes its deployment quite easy. We only have to ensure correct placement in one of the searched directories when installing it locally, but nothing more.

Regarding the usage of Tcl Modules in a wrapped application, please see [190] "Implementation Choices for Tcl Modules". This is highly dependent on the implementation chosen for a specific Tcl Module and thus not discussed here, but in the referenced document.

Package Repositories

At a very basic level, that of physical storage, any directory tree containing properly placed files for a number of modules can serve as a package repository for the modules in it. In other words, from that point of view an installation is virtually indistinguishable from a repository, and their creation and maintenance is very easy.

Note however that the higher levels of a repository, like indexing package metadata in general, or dependency tracking in particular, licensing, documentation, etc. are not addressed by this document.

This requires standards for package metadata, format and content, topics with which this document will not deal.

Defaults

The default list of paths on the module path is computed by a tclsh as follows, where X is the major version of the Tcl interpreter and y is less than or equal to the minor version of the Tcl interpreter.

All the default paths are added to the module path, even those paths which do not exist. Non-existent paths are filtered out during actual searches. This enables a user to create one of the searched paths when needed, and all running applications will automatically pick up any modules placed in it.

The paths are added in the order as they are listed above, and for lists of paths defined by an environment variable in the order they are found in the variable.

Installation

The installation of a Tcl module for a particular interpreter is basically done like this:

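 #! /path/to/chosen/tclsh
 # First argument is the name of the module.
 # Second argument is the base filename
 lassign $argv modname modfile
 # Keep only the module paths the user has write permission for.
 set mpaths {}
 foreach p [::tcl::tm::path list] {
     if {[file writable $p]} {lappend mpaths $p}
 }
 if {![llength $mpaths]} {
     error "no writable directory on the module path"
 }
 # If more than one path is left, let the user select the one to use.
 # ui_select stands for whatever UI the installer provides.
 if {[llength $mpaths] > 1} {
     set selmpath [ui_select $mpaths]
 } else {
     set selmpath [lindex $mpaths 0]
 }
 # Create the namespace directories as needed, then copy the file.
 set target [file join $selmpath \
     [file dirname [string map {:: /} $modname]]]
 file mkdir $target
 file copy $modfile $target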

Glossary

The following terms and definitions are used throughout the document

Reference Implementation

A reference implementation is available in Patch 942881 http://sf.net/tracker/?func=detail&aid=942881&group_id=10894&atid=310894

Questions

Comments


A feature asked for during discussion is to allow a directory as a Tcl Module. I am opposed to this, because behind Tcl Modules is the same idea/vision as for starkits and starpacks, namely that of deploying something in the simplest possible manner, without any overhead. Sometimes I call Tcl Modules package kits, or pakits for short (and then twist that, when spoken, into 'packet' :). http://groups.google.ca/groups?hl=en&lr=&ie=UTF-8&frame=right&th=78764d499cc4e4a&seekm=c6tshf030c6%40news4.newsguy.com#link19

Copyright

This document has been placed in the public domain.