<?xml version="1.0" encoding="ISO-8859-1" ?>
<!DOCTYPE TIP SYSTEM "http://www.tcl.tk/cgi-bin/tct/tip/tipxml.dtd">
<!-- Converted at Thu May 23 23:38:39 GMT 2013 -->
<!-- TIP AutoGenerator - written by Donal K. Fellows -->

<TIP number='115'>
<header><title>Making Tcl Truly 64-Bit Ready</title><author address="mailto:donal.k.fellows@man.ac.uk">Donal K. Fellows</author><status type='project' state='draft' tclversion="9.0" vote='prior'>$Revision: 1.3 $</status><history></history><created day='23' month='oct' year='2002' /></header>
<abstract>This TIP proposes changes to Tcl to make it operate more effectively on 64-bit systems.</abstract>
<body><section title="Rationale">
<para>It is a fact of life that 64-bit platforms are becoming more common. While the assumption that virtually everything was a 32-bit machine (where not smaller) was once valid, this is no longer the case. Particularly on modern supercomputers (though increasingly on workstations and high-end desktop systems too), the amount of memory that the machine contains exceeds 2GB, and scientific and engineering applications certainly need to address very large amounts of memory. Where they lead, consumer systems will probably follow.</para>
<para>At the moment, Tcl is ill-prepared for this. In particular, the type used for expressing sizes of entities in Tcl (whether strings, lists or undifferentiated blocks of memory) is <emph style="italic">int</emph>, and it cannot be made into an <emph style="italic">unsigned int</emph> in most of those places where it is not already an unsigned value. On the majority of 64-bit platforms <emph style="italic">int</emph> is still a 32-bit type, which is a major restriction. However, on the vast majority of those platforms <emph style="italic">long</emph> is a 64-bit type, and so a suitable replacement. (The exceptions to this are the Alpha, which is unusual in that both <emph style="italic">int</emph> and <emph style="italic">long</emph> are 64-bit types there, meaning that the platform will be unaffected by such an alteration, and Win64, which has a 32-bit <emph style="italic">long</emph> but 64-bit pointers.)</para>
<para>Luckily, standards like POSIX have already dealt with this problem, and the types <emph style="italic">size_t</emph> (which is unsigned) and <emph style="italic">ssize_t</emph> (which is signed) exist for the sorts of uses we&apos;re interested in. Both types are the same size as each other, and <emph style="italic">size_t</emph> is large enough to describe the size of any allocatable memory chunk.</para>
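<para>The width mismatch described above can be inspected directly. The following is a minimal sketch, assuming a POSIX system that declares <emph style="italic">ssize_t</emph> in sys/types.h; the helper names <emph style="italic">FitsInInt</emph> and <emph style="italic">ReportTypeWidths</emph> are hypothetical and not part of Tcl.</para>
<para><![CDATA[
```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>   /* ssize_t (POSIX, not ISO C) */

/* Hypothetical helper: can a length of this many bytes be held in an int? */
int FitsInInt(size_t length)
{
    return length <= (size_t) INT_MAX;
}

/* Print the widths that matter for this discussion on the current platform. */
void ReportTypeWidths(void)
{
    /* The property the text relies on: both size types have the same width. */
    assert(sizeof(size_t) == sizeof(ssize_t));

    printf("int:     %zu bits\n", sizeof(int) * CHAR_BIT);
    printf("long:    %zu bits\n", sizeof(long) * CHAR_BIT);
    printf("void *:  %zu bits\n", sizeof(void *) * CHAR_BIT);
    printf("size_t:  %zu bits\n", sizeof(size_t) * CHAR_BIT);
    printf("ssize_t: %zu bits\n", sizeof(ssize_t) * CHAR_BIT);
}
```
]]></para>
<para>Calling <emph style="italic">ReportTypeWidths</emph> on a given platform shows whether <emph style="italic">int</emph> is narrower than a pointer there; any length for which <emph style="italic">FitsInInt</emph> returns zero cannot be expressed in an <emph style="italic">int</emph>-typed length field.</para>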
</section>
<section title="Details of Changes">
<para>The key changes will be to change the types of the following structure members from <emph style="italic">int</emph> to <emph style="italic">ssize_t</emph> in all appropriate places, and likewise from <emph style="italic">unsigned int</emph> to <emph style="italic">size_t</emph> (mainly in memory allocation routines).</para>
<itemize><item.i><para><emph style="italic">Tcl_Obj</emph> - the <emph style="italic">length</emph> member. (Potentially the <emph style="italic">refCount</emph> member needs updating as well, but that&apos;s less critical.)</para></item.i><item.i><para><emph style="italic">Tcl_SavedResult</emph> - the <emph style="italic">appendAvl</emph> and <emph style="italic">appendUsed</emph> members.</para></item.i><item.i><para><emph style="italic">Tcl_DString</emph> - the <emph style="italic">length</emph> and <emph style="italic">spaceAvl</emph> members.</para></item.i><item.i><para><emph style="italic">Tcl_Token</emph> - the <emph style="italic">size</emph> and <emph style="italic">numComponents</emph> members.</para></item.i><item.i><para><emph style="italic">Tcl_Parse</emph> - the <emph style="italic">commentSize</emph>, <emph style="italic">commandSize</emph>, <emph style="italic">numWords</emph>, <emph style="italic">numTokens</emph> and <emph style="italic">tokensAvailable</emph> members.</para></item.i><item.i><para><emph style="italic">CompiledLocal</emph> - the <emph style="italic">nameLength</emph> member.</para></item.i><item.i><para><emph style="italic">Interp</emph> - the <emph style="italic">appendAvl</emph>, <emph style="italic">appendUsed</emph> and <emph style="italic">termOffset</emph> members.</para></item.i><item.i><para><emph style="italic">List</emph> - the <emph style="italic">maxElemCount</emph> and <emph style="italic">elemCount</emph> members.</para></item.i><item.i><para><emph style="italic">ByteArray</emph> - the <emph style="italic">used</emph> and <emph style="italic">allocated</emph> members.</para></item.i><item.i><para><emph style="italic">SortElement</emph> - the <emph style="italic">count</emph> member.</para></item.i><item.i><para><emph style="italic">SortInfo</emph> - the <emph style="italic">index</emph> member.</para></item.i><item.i><para><emph style="italic">CopyState</emph> - the <emph style="italic">toRead</emph> and <emph style="italic">total</emph> members.</para></item.i><item.i><para><emph style="italic">GetsState</emph> - the <emph style="italic">rawRead</emph>, <emph style="italic">bytesWrote</emph>, <emph style="italic">charsWrote</emph> and <emph style="italic">totalChars</emph> members.</para></item.i><item.i><para><emph style="italic">ParseInfo</emph> - the <emph style="italic">size</emph> member.</para></item.i><item.i><para><emph style="italic">String</emph> - the <emph style="italic">numChars</emph> member (see also the <emph style="italic">TestString</emph> structure.)</para></item.i></itemize>
<para>Changes to the bytecode-related structures might be worthwhile doing too, though there are more backward-compatibility issues there.</para>
<para>These changes will force many of the types used in the public API to change as well. Notable highlights:</para>
<itemize><item.i><para><emph style="italic">Tcl_Alloc</emph> will now take a <emph style="italic">size_t</emph>.</para></item.i><item.i><para><emph style="italic">Tcl_GetByteArrayFromObj</emph> will now take a pointer to a <emph style="italic">ssize_t</emph>.</para></item.i><item.i><para><emph style="italic">Tcl_GetStringFromObj</emph> will now take a pointer to a <emph style="italic">ssize_t</emph>.</para></item.i><item.i><para><emph style="italic">Tcl_ListObjLength</emph> will now take a pointer to a <emph style="italic">ssize_t</emph>.</para></item.i><item.i><para><emph style="italic">Tcl_GetUnicodeFromObj</emph> will now take a pointer to a <emph style="italic">ssize_t</emph>.</para></item.i></itemize>
<para>In the internal API, the following notable change will happen:</para>
<itemize><item.i><para><emph style="italic">TclGetIntForIndex</emph> will now take a pointer to a <emph style="italic">ssize_t</emph>.</para></item.i></itemize>
<para>There are probably other similar API changes required.</para>
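<para>The effect of these API changes on calling code can be sketched as follows. This is only an illustration under stated assumptions: <emph style="italic">DemoObj</emph>, <emph style="italic">DemoGetStringFromObj</emph> and <emph style="italic">DemoCountSpaces</emph> are hypothetical names that mimic the shape of a length-returning accessor, not the real Tcl functions, and a POSIX system is assumed for <emph style="italic">ssize_t</emph>.</para>
<para><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>   /* ssize_t (POSIX) */

/* Hypothetical, simplified value type; not the real Tcl_Obj. */
typedef struct DemoObj {
    const char *bytes;
    ssize_t length;
} DemoObj;

/* Hypothetical accessor: the length is reported through an ssize_t pointer
 * instead of an int pointer. */
const char *DemoGetStringFromObj(const DemoObj *objPtr, ssize_t *lengthPtr)
{
    assert(objPtr != NULL);
    if (lengthPtr != NULL) {
        *lengthPtr = objPtr->length;
    }
    return objPtr->bytes;
}

/* Caller side: the length variable and the loop index must both become
 * ssize_t, otherwise the pointer types no longer match. */
ssize_t DemoCountSpaces(const DemoObj *objPtr)
{
    ssize_t length, i, count = 0;
    const char *s = DemoGetStringFromObj(objPtr, &length);

    for (i = 0; i < length; i++) {
        if (s[i] == ' ') {
            count++;
        }
    }
    return count;
}
```
]]></para>
<para>A caller that still declares its length as <emph style="italic">int</emph> would pass a pointer of the wrong type, which a C compiler is expected to diagnose; this is why the change is a source-level incompatibility for existing extensions.</para>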
</section>
<section title="What This TIP Does Not Do">
<para>This TIP does not rearrange structure orderings. Although this would be very useful for some common structures (notably <emph style="italic">Tcl_Obj</emph>) if the common arithmetic types were smaller than the word size, it turns out that the changes in types required to deal with larger entities will make these rearrangements largely unnecessary and/or pointless. (Inefficiency in statically-allocated structures won&apos;t matter as the number of instances will remain comparatively small, even in very large programs.) Once the changes are applied, there is typically at most a single <emph style="italic">int</emph> field per structure, usually holding either a reference count, a set of flags, or a Tcl result code.</para>
<para>It should also be noted that all structures are always going to be correctly aligned internally as we never use C&apos;s bitfield support, so structure alignment is purely an issue of efficiency, and not of correct access to the fields.</para>
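<para>The layout argument above can be checked with a small sketch. The two structures below are hypothetical and deliberately simplified; they are not the real <emph style="italic">Tcl_Obj</emph> definition. A POSIX system is assumed for <emph style="italic">ssize_t</emph>. On a typical platform with a 32-bit <emph style="italic">int</emph> and 64-bit <emph style="italic">long</emph> and pointers, the widened member would be expected to occupy space that was previously padding, but the sketch prints the actual figures rather than assuming them.</para>
<para><![CDATA[
```c
#include <assert.h>
#include <stddef.h>      /* offsetof */
#include <stdio.h>
#include <sys/types.h>   /* ssize_t (POSIX) */

/* Hypothetical, simplified object header; NOT the real Tcl_Obj layout. */
typedef struct DemoObjBefore {
    int refCount;
    char *bytes;
    int length;          /* narrow length, typically followed by padding */
    void *typePtr;
} DemoObjBefore;

typedef struct DemoObjAfter {
    int refCount;        /* the single remaining int field */
    char *bytes;
    ssize_t length;      /* widened length */
    void *typePtr;
} DemoObjAfter;

/* Print size and member offsets so the effect of the change is visible. */
void ReportLayout(void)
{
    /* Assumes ssize_t is at least as wide as int, as on common platforms. */
    assert(sizeof(DemoObjAfter) >= sizeof(DemoObjBefore));

    printf("before: size %zu, length at %zu, typePtr at %zu\n",
            sizeof(DemoObjBefore),
            offsetof(DemoObjBefore, length),
            offsetof(DemoObjBefore, typePtr));
    printf("after:  size %zu, length at %zu, typePtr at %zu\n",
            sizeof(DemoObjAfter),
            offsetof(DemoObjAfter, length),
            offsetof(DemoObjAfter, typePtr));
}
```
]]></para>
<para>In both variants every member is accessed through its naturally aligned offset, so the comparison concerns only how much space is spent on padding.</para>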
</section>
<section title="Copyright">
<para>This document has been placed in the public domain.</para>
</section>
</body></TIP>
