<?xml version="1.0" encoding="ISO-8859-1" ?>
<!DOCTYPE TIP SYSTEM "http://www.tcl.tk/cgi-bin/tct/tip/tipxml.dtd">
<!-- Converted at Thu May 23 06:06:57 GMT 2013 -->
<!-- TIP AutoGenerator - written by Donal K. Fellows -->

<TIP number='192'>
<header><title>Lazy Lists</title><author address="mailto:antirez@invece.org">Salvatore Sanfilippo</author><author address="mailto:theover@tiscali.nl">Theo Verelst</author><status type='project' state='draft' tclversion="9.0" vote='prior'>$Revision: 1.2 $</status><history></history><created day='27' month='mar' year='2004' /><keyword>Tcl</keyword></header>
<abstract>This TIP proposes to add a new command to generate lists of <emph style="italic">N</emph> elements, where the <emph style="italic">i</emph>-th element is computed as the result of a unary Tcl procedure with <emph style="italic">i</emph> as its argument. Implementing special handling for this kind of list inside the Tcl core will allow such lists to be generated in a <emph style="italic">lazy</emph> way. This TIP&apos;s goal is not to change the semantics of Tcl, but just to provide a different space complexity for an (often interesting) subset of Tcl lists.</abstract>
<body><section title="Rationale">
<para>A subset of Tcl lists can be generated by mapping a unary function over the integer sequence in the range [0,n), where <emph style="italic">n</emph> is the length of the list. The following procedure implements this concept:</para>
<verbatim><vline encoding='base64'>cHJvYyBsZ2VuIHtsZW4gZnVuY30gew==</vline><vline encoding='base64'>ICAgIHNldCBsIHt9</vline><vline encoding='base64'>ICAgIGZvciB7c2V0IGkgMH0geyRpIDwgJGxlbn0ge2luY3IgaX0gew==</vline><vline encoding='base64'>ICAgICAgICBsYXBwZW5kIGwgW3VwbGV2ZWwgMSBbbGlzdCAkZnVuYyAkaV1d</vline><vline encoding='base64'>ICAgIH0=</vline><vline encoding='base64'>ICAgIHJldHVybiAkbA==</vline><vline encoding='base64'>fQ==</vline></verbatim>
<para>and the following (using the <emph style="bold">lambda</emph> from <tipref type="text" tip="187"/>) is an example of usage:</para>
<verbatim><vline encoding='base64'>cHJvYyBsYW1iZGEge2FyZ2wgYm9keX0gew==</vline><vline encoding='base64'>ICAgIHNldCBuYW1lIFtpbmZvIGxldmVsIDBd</vline><vline encoding='base64'>ICAgIHByb2MgJG5hbWUgJGFyZ2wgJGJvZHk=</vline><vline encoding='base64'>ICAgIHNldCBuYW1l</vline><vline encoding='base64'>fQ==</vline><vline encoding='base64'></vline><vline encoding='base64'>c2V0IG15bGlzdCBbbGdlbiAxMCBbbGFtYmRhIHgge2luY3IgeDsgZXhwciAkeCokeH1dXQ==</vline></verbatim>
<para>The above code evaluates to the same list as [list 1 4 9 16 25 36 49 64 81 100]. <emph style="bold">lgen</emph> can be used to build a particularly useful Tcl procedure named <emph style="bold">range</emph>, returning a sequence of integers with given <emph style="italic">start</emph>, <emph style="italic">end</emph> and (optionally) <emph style="italic">step</emph> parameters. The <emph style="bold">range</emph> procedure is convenient for rewriting many <emph style="bold">for</emph> loops in terms of <emph style="bold">foreach</emph> iterating over an integer range. So instead of writing:</para>
<verbatim><vline encoding='base64'>Zm9yIHtzZXQgaSAwfSB7JGkgPCAyMH0ge2luY3IgaSAyfSB7</vline><vline encoding='base64'>ICAgIHB1dHMgJGk=</vline><vline encoding='base64'>fQ==</vline></verbatim>
<para>it is possible to write:</para>
<verbatim><vline encoding='base64'>Zm9yZWFjaCBpIFtyYW5nZSAwIDIwIDJdIHs=</vline><vline encoding='base64'>ICAgIHB1dHMgJGk=</vline><vline encoding='base64'>fQ==</vline></verbatim>
<para>This form is more convenient for the programmer to write and to read. Of course it&apos;s possible to use <emph style="bold">foreach</emph> to iterate over any sequence that <emph style="bold">lgen</emph> is able to generate, but <emph style="bold">range</emph> is probably one of the most common of the possible usages.</para>
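<para>The <emph style="bold">range</emph> procedure itself is not defined in this TIP. A minimal sketch on top of <emph style="bold">lgen</emph> and <emph style="bold">lambda</emph> as defined above might look as follows; excluding the <emph style="italic">end</emph> value, the default <emph style="italic">step</emph> of 1 and the restriction to positive steps are assumptions chosen to match the <emph style="bold">for</emph> loop above:</para>
<verbatim><vline encoding='base64'>cHJvYyByYW5nZSB7c3RhcnQgZW5kIHtzdGVwIDF9fSB7</vline><vline encoding='base64'>ICAgICMgQXNzdW1lcyBzdGVwID4gMDsgZW5kIGlzIGV4Y2x1ZGVkLg==</vline><vline encoding='base64'>ICAgIHNldCBuIFtleHByIHsoJGVuZCAtICRzdGFydCArICRzdGVwIC0gMSkgLyAkc3RlcH1d</vline><vline encoding='base64'>ICAgICMgRWxlbWVudCBpIG9mIHRoZSByZXN1bHQgaXMgc3RhcnQgKyBpKnN0ZXAu</vline><vline encoding='base64'>ICAgIGxnZW4gJG4gW2xhbWJkYSBpICJleHByIHskc3RhcnQgKyBcJGkgKiAkc3RlcH0iXQ==</vline><vline encoding='base64'>fQ==</vline></verbatim>
<para>This eager version still builds the whole list in memory; it only illustrates the mapping from an index to a range element that a lazy implementation would evaluate on demand.</para>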
<para>The TIP proposes to implement the ability to handle this kind of list in a special way directly inside the <emph style="italic">List</emph> object implementation. For these common usage patterns, the list object then does not need to hold the real sequence, just the length and the unary element generator function.</para>
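<para>As a rough script-level model of this idea (an illustration only; the TIP proposes doing this in C inside the <emph style="italic">List</emph> object), the hypothetical procedure <emph style="bold">lazyindex</emph> below takes the two values a lazy list would store, the length and the generator, and computes a single element on demand without ever building the sequence. The name, the argument order and the out-of-range behaviour are assumptions made for this sketch:</para>
<verbatim><vline encoding='base64'>cHJvYyBsYXp5aW5kZXgge2xlbiBmdW5jIGl9IHs=</vline><vline encoding='base64'>ICAgICMgTWltaWMgbGluZGV4OiBvdXQtb2YtcmFuZ2UgZ2l2ZXMgYW4gZW1wdHkgcmVzdWx0Lg==</vline><vline encoding='base64'>ICAgIGlmIHskaSA8IDAgfHwgJGkgPj0gJGxlbn0geyByZXR1cm4ge30gfQ==</vline><vline encoding='base64'>ICAgICMgQ29tcHV0ZSBqdXN0IHRoaXMgb25lIGVsZW1lbnQsIG9uIGRlbWFuZC4=</vline><vline encoding='base64'>ICAgIHVwbGV2ZWwgMSBbbGlzdCAkZnVuYyAkaV0=</vline><vline encoding='base64'>fQ==</vline></verbatim>
<para>With the squaring <emph style="bold">lambda</emph> shown earlier as the generator, asking this sketch for index 3 would evaluate only that one call and yield 16.</para>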
<para>The interface to the Tcl programmer is a single command similar in semantics to <emph style="bold">lgen</emph>, but possibly with a more suitable name.</para>
</section>
<section title="Proposed Change">
<para>The <emph style="italic">List</emph> object should be modified in order to have the lazy-list as a subtype or alternate internal implementation. All of the core should use the proper <emph style="italic">List</emph> object API instead of accessing the <emph style="italic">List</emph> object via its internal representation.</para>
<para>The two calls that should be guaranteed not to alter the <emph style="italic">laziness</emph> of the list are <emph style="bold">Tcl_ListObjLength()</emph> and <emph style="bold">Tcl_ListObjIndex()</emph>. Other calls like <emph style="bold">Tcl_ListObjReplace()</emph> may be optimized for the lazy-list case when possible, and the <emph style="bold">lrange</emph> command may be optimized (particularly when the start index is zero).</para>
<para>The <emph style="italic">List</emph> will be converted into a non-lazy version if the user tries to modify it, for example using the <emph style="bold">Tcl_ListObjAppendElement()</emph> function. The <emph style="italic">List</emph> will also be converted into a non-lazy version on <emph style="bold">Tcl_ListObjGetElements()</emph> calls.</para>
<para>It&apos;s possible to handle the <emph style="bold">range</emph> command as a special case of lazy-list in order to provide a very fast implementation of <emph style="bold">foreach</emph> iterating over an integer range (probably much faster than today&apos;s <emph style="bold">for</emph>, since <emph style="bold">foreach</emph> is already faster when iterating over a literal list of numbers).</para>
<subsection title="Consequences: Reference Management">
<para>The author of this TIP tried to implement the proposed changes in the HEAD, discovering that the <emph style="bold">Tcl_ListObjIndex()</emph> interface creates a serious problem due to the assumption that the <emph style="italic">List</emph> object holds at least one reference to the returned element. The implementation of <emph style="bold">Tcl_ListObjIndex()</emph> in the lazy case can&apos;t just create the element object and return it with a reference count of zero, because it will leak if the caller does not increment the reference count itself.</para>
<para>It&apos;s also not safe to store a reference to the last few elements created in the lazy way inside the <emph style="italic">List</emph> object, and to release these references in order to create more elements, because in theory the caller may request a large number of elements, storing the pointers in an array, and only then increment the reference counts in a single pass.</para>
<para>To avoid this problem, the semantics of <emph style="bold">Tcl_ListObjIndex()</emph> should be changed so that it always returns the element with an already incremented reference count. It will be up to the caller to decrement the reference count if the object is to be discarded. (This is why this change is proposed for Tcl 9.0 and not 8.5, as it has a significant impact on both the core and on extensions.)</para>
<para>An alternative change to <emph style="bold">Tcl_ListObjIndex()</emph> (in order to make it &quot;compatible&quot; with the semantics of lazy lists) is to disallow a further call against the same list while an object returned by a previous call, which the caller plans to keep, has not yet had its reference count incremented.</para>
<para>So:</para>
<verbatim><vline encoding='base64'>VGNsX0xpc3RPYmpJbmRleChpbnRlcnAsIG15TGlzdFB0ciwgMCwgJmEpOw==</vline><vline encoding='base64'>VGNsX0xpc3RPYmpJbmRleChpbnRlcnAsIG15TGlzdFB0ciwgMCwgJmIpOw==</vline><vline encoding='base64'>bXlzdHJ1Y3QtPmEgPSBhOw==</vline><vline encoding='base64'>bXlzdHJ1Y3QtPmIgPSBiOw==</vline><vline encoding='base64'>VGNsX0luY3JSZWZDb3VudChhKTs=</vline><vline encoding='base64'>VGNsX0luY3JSZWZDb3VudChiKTs=</vline></verbatim>
<para>will be invalid, while:</para>
<verbatim><vline encoding='base64'>VGNsX0xpc3RPYmpJbmRleChpbnRlcnAsIG15TGlzdFB0ciwgMCwgJmEpOw==</vline><vline encoding='base64'>VGNsX0luY3JSZWZDb3VudChhKTs=</vline><vline encoding='base64'>bXlzdHJ1Y3QtPmEgPSBhOw==</vline><vline encoding='base64'>VGNsX0xpc3RPYmpJbmRleChpbnRlcnAsIG15TGlzdFB0ciwgMCwgJmIpOw==</vline><vline encoding='base64'>VGNsX0luY3JSZWZDb3VudChiKTs=</vline><vline encoding='base64'>bXlzdHJ1Y3QtPmIgPSBiOw==</vline></verbatim>
<para>is valid. This avoids the problem because the <emph style="italic">List</emph> object can simply hold a reference to the last generated object, which prevents any leak. A study of existing code in the core and extensions is required to see whether this will allow the majority of code to operate unchanged.</para>
</subsection>
<subsection title="Consequences: Non-constant Lists">
<para>The TIP also poses another problem when the unary function has side effects. In this case the behaviour can, most of the time, be described in terms of variable traces without violating Tcl semantics, but it is possible to write code showing that the list value is not stable even without using variables (because Tcl currently has no &quot;object trace&quot; concept).</para>
<para>If this is considered a problem, the TIP may be reduced to allow only lazily generated lists composed of integer ranges (which are in any case one of the most interesting advantages of this TIP). With integer ranges there are no side effects, so the semantic problem does not arise, but the <emph style="bold">Tcl_ListObjIndex()</emph> problem remains exactly the same.</para>
<para>Of course, this is arguably just an unavoidable consequence, and it shows that not all possible unary element generator functions are reasonable choices, only those with a functional denotation.</para>
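<para>A small illustration of the issue, using the eager <emph style="bold">lgen</emph> from the Rationale and a hypothetical generator <emph style="bold">counter</emph> that ignores its argument and increments a global variable. The remark about the lazy case describes what an implementation that regenerates elements on every access could do; it is an assumption, not observed behaviour:</para>
<verbatim><vline encoding='base64'>c2V0IGNhbGxzIDA=</vline><vline encoding='base64'>cHJvYyBjb3VudGVyIHtpfSB7</vline><vline encoding='base64'>ICAgIGdsb2JhbCBjYWxscw==</vline><vline encoding='base64'>ICAgIGluY3IgY2FsbHM=</vline><vline encoding='base64'>fQ==</vline><vline encoding='base64'>IyBFYWdlciBsZ2VuOiBjb3VudGVyIHJ1bnMgdGhyZWUgdGltZXM7IGwgaXMgezEgMiAzfS4=</vline><vline encoding='base64'>c2V0IGwgW2xnZW4gMyBjb3VudGVyXQ==</vline><vline encoding='base64'>IyBJZiBsYXp5LCBbbGluZGV4ICRsIDBdIGNvdWxkIGRpZmZlciBiZXR3ZWVuIGNhbGxzLg==</vline></verbatim>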
</subsection>
</section>
<section title="Copyright">
<para>This document has been placed in the public domain.</para>
</section>
</body></TIP>
