TIP 430: Add basic ZIP archive support to Tcl

Login
Author:         Sean Woods <[email protected]>
Author:         Donal Fellows <[email protected]>
Author:         Poor Yorick <[email protected]>
Author:         Harald Oehlmann <[email protected]>
State:          Final
Type:           Project
Vote:           Done
Created:        03-Sep-2014
Post-History:
Keywords:       virtual filesystem,zip,tclkit,boot,bootstrap
Tcl-Version:    8.7
Votes-For:      DKF, KBK, SL, AK, JD, JN (partially)
Votes-Against:  JN (partially)
Present:        JN (partially)
Tcl-Branch:     core_zip_vfs

Abstract

This proposal will add basic support for mounting zip archive files as virtual filesystems to the Tcl core.

Target Tcl-Version

This TIP targets TCL Version 8.7

Rationale

Tcl/Tk relies on the presence of a file system containing Tcl scripts for bootstrapping the interpreter. When dealing with code packed in a self-contained executable, or a dynamic library, a chicken-and-egg problem arises when developers try to provide this bootstrap from their attached VFS with extensions like TclVfs. TclVfs runs in the Tcl interpreter. The interpreter needs init.tcl, which would mean that the filesystem containing init.tcl is not present until after TclVfs mounts it, yet that mount cannot happen until after init.tcl has been loaded. Bootstrap filesystem mounts require built-in support for the filesystem that they use.

With the inclusion of Zlib in the core (starting with 8.6, [234]), all that is required to implement a zip file system based VFS is to add a C-level VFS implementation to decode the zip archive format. Thus: this project.

Note that we are prioritizing the zip archive format also because it is practical to generate the files without a Tcl installation being present; it is a format with widespread OS support. This makes it much easier to bootstrap a build of Tcl that uses it without requiring a native build of tclsh to be present.

Specification

There shall be new ensemble zipfs added to tcl. That ensemble will contain several commands including:

Outside of a save interpreter, the following additional commands will be available:

VFS Mount Point

On virtually all platforms tcl supports (Unix, Windows) ZipFs will mount all archives under //zipfs:/. Some operating systems (past or future) may have a special meaning for this style path. To that end, it may be changed to address the needs of the specific environment. Which root is being used for the current platform can be accessed via a call to zipfs root. For the remainder of this document, references to //zipfs:/ are also intended to referred to whatever the prefix designated by ZIPFS_ROOT actually is.

Volumes may be mounted at any point under ZIPFS_ROOT, and if a mount point does not start with ZIPFS_ROOT the path will be considered relative to ZIPFS_ROOT. This conventions avoids some confusing interactions between file normalize and glob that differ between Windows and Unix and make building global paths either hop volumes or interact with the native file system.

Having a fixed mount point breaks from the tradition of mounting volumes under / or info nameofexecutable that other zipfs implementations use. However, if a kit builder wishes to retain that capability, all that is required is to load their own zipfs implementation using the conventional shims provided for kit building. The function names for the core implementation have been modified to not conflict with zipfs implementations that are out in the wild.

Generating Task Executables Tclsh/Wish

If tclsh/wish detect that the executable has a zip archive attached, the executable will be mounted as ZIPFS_ROOT/app. If ZIPFS_ROOT/app/main.tcl exists, that file is marked set the shell's startup script. If ZIPFS_ROOT/app/tcl_library/ exists, it will be searched for init.tcl.

The way to produce an executable will be as follows (assuming the source for the application is at ~/myapp/src):

From Tcl:

zipfs mkimg ~/bin/myapp.exe ~/myapp/src ~/myapp/src ~/bin/tclsh87.exe

From Unix:

cd ~/myapp/src
zip -r ~/myapp.zip .
cd ..
cp ~/bin/tclsh87.exe myapp.exe
cat myapp.zip >> myapp.exe

First argument handling for Tclsh/Wish

If the first argument to Tclsh or Wish is detected to be a zipfile, that file will be mounted as ZIPFS_ROOT/app. If ZIPFS_ROOT/app/main.tcl exists, that file is marked set the shell's startup script. If ZIPFS_ROOT/app/tcl_library exists, it will be searched for init.tcl.

New Tclsh features for TEA

To assist in packaging extensions, tclsh will take on a new command install. If install is the first argument, set subsequent arguments are passed to a new file in library install.tcl.

tclsh install with no arguments is designed to return immediately with a normal return code, thus making it easy to test if a tclsh is TIP #430 savvy but running in autoconf:

AS_IF([$TCLSH_PROG install],[
    ZIP_PROG=${TCLSH_PROG}
    ZIP_PROG_OPTIONS="install mkzip"
    ZIP_PROG_VFSSEARCH="."
    AC_MSG_RESULT([Can use Native Tclsh for Zip encoding])
])

This TIP only defines 2 function for install:

Package loading

Calls to tcl_findLibrary will now search through loaded packages to see if the dynamic library for the package in question has an attached zip file system. If that file system exists, it is mounted to ZIPFS_ROOT/lib/PGKNAME, and that mount point is added to the list of directories to search.

Implementation

This work is largely adapted Richard Hipp's work on Tcl As One Big Executable (TOBE). The concept has been modernized, somewhat, as well as heavily influenced by improvements made to it through the FreeWrap and Androwish projects. That implementation consists of one C file (tclZipvfs.c). I have also prepared a set of kit-like behaviors for the core to express when tclAppInit.c is not compiled with a TCL_LOCAL_MAIN_HOOK defined. Those behaviors reside in the TclZipfs_AppHook() function.

This work is checked in as the "core_zip_vfs" branch on both Tcl and Tk.

Modifications to auto.tcl

auto.tcl now has rules for scanning DLLs for zip file systems.

Modifications to minizip.c

minizip has been modified to be able to handle recursive directory arguments.

Modifications to tclAppInit.c

tclAppInit.c will now call TclZipfs_AppHook() if no TCL_LOCAL_MAIN_HOOK was defined.

Modifications to tclBasic.c

tclBasic.c will contain a call to *TclZipfs_Init() which will initialize the portions of C needed to implement zipfs as well as inject the zipfs command into the interpreter.

Modifications to tclFileName.c

tclFileName.c has a minor patch to exclude the prefix // from local file searches.

New C File tclZipFS.c

This file is a self-contained implementation in C of a zip based VFS. It includes all functions needed for implementing zipfs.

Modifications to tclIOUtil.c

tclIOUtil.c has a minor patch to exclude UNC style paths that contain a colon (:) in the server field from being resolved by the operating system. (Which by standard is not allowed anyway.) This allows VFS file systems to use //FSTYPE: namespace with impunity.

Modifications to the Tcl build system

Tcl will now attempt to find a zip encoder in the environment. If a TIP #430 savvy tclsh is discovered, that shell will be used. Failing that, the system will search for an executable named zip. Failing that, tcl will build it's own zip encoder.

When it cannot locate a zip encoded in the environment, Tcl will now build a copy of the minizip program, whose source is currently distributed in /compat/zlib/contrib/minizip. The tcl.m4 macro now detects if the compiler used can produce native native executables, and in cases where it cannot, will search for a C compiler that can, an substitute that value into the Makefile as HOST_CC. The C compiler will generate a native executable minizip which will be compiled in the same directory as tcl, and be used for all archive creation.

New build product libtcl.zip

A new build target libtcl_MAJOR_MINOR_PATCHLEVEL.zip is created from the /library directory in the tcl sources. For static library installs, this archive is copied to the tcl standard install location. For shared library builds this archive is appended to the dynamic library.

Modifications to the /library file system

To reduce the complexity of building archives, init.tcl has been modified to look for the presence of an adjacent file pkgIndex.tcl. That file contains all of the package ifneeded calls to direct the core to find the core distributed packages relative to location of tcl_library. Unlike other pkgIndex.tcl files, this file must be manually maintained and kept up to date as package names and versions change, are added, or removed.

Modifications to the tclConfig.sh and TEA

A new field TCL_ZIP_FILE will indicate the name of the zip archive generated by the build system. If this field is present and the value is non-blank, extensions (for instance Tk) can use this to infer the core was built with ZipFs support.

TEA extensions which detect a non-blank value for TCL_ZIP_FILE will generate a value TCL_ZIPFS_SUPPORT=1 when compiling as a shared library, and TCL_ZIPFS_SUPPORT=2 when compiling as a static library.

Modifications to Tk

Tk will scan tclConfig.sh, and if it detects a non-blank value for TCL_ZIP_FILE, it will make a call to TclZipfs_AppHook() if no TK_LOCAL_MAIN_HOOK was defined.

C API

int TclZipfs_AppHook(int *argc, char ***argv);

  1. If the current executable has an attached zip file system, mount that to ZIPFS_ROOT/app.
  2. If the file ZIPFS_ROOT/app/main.tcl exists, register that file as the process startup script.
  3. If the file ZIPFS_ROOT/app/tcl_library/init.tcl exists, register ZIPFS_ROOT/app/tcl_library/init.tcl as tcl_library
  4. If the file ZIPFS_ROOT/app/tk_library/init.tcl exists, register ZIPFS_ROOT_/app/tk_library/init.tcl* as tk_library
  5. If tcl_library was not set, the function will then scan the local environment for a zipfs file system attached to either the tcl dynamic library or an archive named libtcl\_MAJOR\_MINOR\_PATCHLEVEL.zip (where MAJOR, MINOR and PATCHLEVEL depend on the exact version of Tcl). That file can either be in the present working directory or in the standard system install location for Tcl.

int TclZipfs_Mount(Tcl_Interp *interp, const char *zipname, const char *mntpt, const char *passwd);

Mounts a zip file zipname to the mount point mntpt. If passwd is non-null, that string is used as the password to decrypt the contents. mntpnt will always be relative to zipfs:

int TclZipfs_Unmount(Tcl_Interp *interp, const char *zipname);

Unmount the file system created (from zipname) by a prior call to TclZipfs_Mount().

Creating a wrapped executable

With this TIP, producing a wrapped executable is now a matter of:

mkdir myvfs.vfs
cd myvfs.vfs
echo "puts {hello world}" > main.tcl
zip -r ../hello.zip .
cd ..
cp tclsh8.7 hello
cat hello.zip >> hello
./hello
> hello world

Copyright

This document has been placed in the public domain.