TIP:            132
Title:          Revised Floating-Point Conversions in Tcl
Version:        $Revision: 1.14 $
Author:         Kevin Kenny <kennykb@acm.org>
Author:         Donal K. Fellows <donal.k.fellows@man.ac.uk>
State:          Final
Type:           Project
Vote:           Done
Created:        31-Mar-2003
Post-History:   
Keywords:       floating point,IEEE,precision
Tcl-Version:    8.5

~ Abstract

This TIP proposes several changes to the conversion between floating
point numbers and character strings. The changes are made to restore
the "everything is a string" contract that Tcl implicitly makes;
without them, there are observable differences in the behavior of
floating point numbers, depending on the state of the internal
representation.

~ Rationale

In today's (8.4) Tcl, there are several gaffes that make
floating-point arithmetic less useful than it might be, and cause
confusion for the users.  Chief among these is that string equality
does ''not'' imply value equality for floating point numbers:

|    % set tcl_precision 12
|    12
|    % set a [expr 1.00000000000123]
|    1.0
|    % set b [expr 1.0]
|    1.0
|    % expr { $a == $b }
|    0
|    % expr { $a eq $b }
|    1

This behavior flies in the face of Tcl's "everything is a string"
mantra.  In the interaction above, it is visible that ''a'' and ''b''
are ''not'' entirely strings; they have identical string
representations, but still test to be unequal with "==".  The
underlying cause of the behavior is the fact that the default setting
for ''tcl_precision'' loses precision when converting the
floating-point number to a string.  Behaviors like this have caused
Tcl's ''cognoscenti'' to recommend that all programs set
''tcl_precision'' to 17; once this setting is made, double-to-string
conversions are invertible, and everything is once again a string.

(Why 17? With IEEE-754 arithmetic, 17 decimal digits suffice to
distinguish between any two floating point numbers, no matter how
small their difference. No smaller number of digits suffices.)

Why is ''tcl_precision'' not 17 by default? The reason appears to be
that when the precision is set that high, the default IEEE-754
semantics for floating point conversions cause a certain degree of
trouble.  They require that the decimal representation be the nearest
decimal representation to the value of the floating-point number that
has the given number of significant digits.  This conversion gives
rise to peculiarities:

|    % set tcl_precision 17; expr 0.2
|    0.20000000000000001

The peculiar behavior is, for the most part, suppressed by a lower
value for ''tcl_precision'':

|    % set tcl_precision 16; expr 0.2
|    0.2

The lower value, nevertheless, introduces the trouble above.  This TIP
proposes a solution to both problems.

~ Specification

This TIP proposes the following changes to the floating point
conversions in the Tcl library:

 1. The default value of ''::tcl_precision'' shall be changed to 0.
    A value of 0 (currently an error) shall be interpreted to mean
    that a number shall be formatted using
    the smallest number of decimal digits required to distinguish
    it from the next floating point number above it and the next
    floating point number below.  Other values of ''tcl_precision''
    shall continue to work as they do in Tcl 8.4.
    The documentation shall formally deprecate changing
    ''::tcl_precision'' to any other value, warning that doing so
    risks both loss of precision and inconsistency between string
    equality and "==".

 1. The default input conversion of floating-point numbers,
    ''SetDoubleFromAny'' shall be documented to guarantee precise
    rounding (generally to within one-half a unit of the least
    significant place [[1/2 ULP]]).  IEEE-754 rounding semantics are
    correct for this input.  The ''strtod'' procedure from the
    standard C library shall ''not'' be used for this conversion,
    since so many implementations are buggy; instead, a Tcl
    implementation shall be developed from scratch based on
    the algorithms developed by Burger and Dybvig
    [http://citeseer.ist.psu.edu/28233.html].
    (It is worthy of note that several
    platforms already eschew the native ''strtod'' in favour of
    one provided in the ''compat/'' library, because of known bugs.)

 1. When ''tcl_precision'' is zero, the output conversion 
    of floating-point numbers, ''UpdateStringOfDouble,'' shall convert a
    floating-point number to the unique string that satisfies the
    following constraints:

 > * if reconverted to a binary floating point number, it yields a
     number that is the closest among all strings having the given
     number of significant digits.

 > * if there is more than one string that is equally close but
     neither string yields the given number exactly, then the string
     with the even digit in the least significant place is
     chosen. 

 > * if there is more than one string with at most the given number of
     significant digits that yields the given floating-point number,
     but one has fewer significant digits than the other, then the
     shorter one is chosen. For example,

|              % expr 1e23

 > > returns ''1e+23'', not ''9.9999999999999992e+22'', even though
     the latter is a nearer representation of the exact floating-point
     value.

 > * if there is more than one string with at most the given number of
     significant digits that reconverts exactly to the same floating
     point number, and all such strings are equally long, then the one
     closest to the given floating point number is chosen.

 > * if a floating point number lies exactly at the midpoint of two
     strings with the same number of significant digits, the one
     with an even digit in the least significant place is chosen.

 1. The test suite for Tcl shall be upgraded to include suitable test
    vectors for assessing correct rounding behavior.  The paper by
    Verdonk, Cuyt and Verschaeren in the References, and the
    associated software, present a suitable data set for inclusion.

 1. The input and output conversions shall allow for the IEEE special
    values +Inf, -Inf, and NaN (and for denormalized numbers).  The
    [[expr]] command shall be changed to allow +Inf and -Inf as
    the result of an expression; NaN shall still cause an error.
    Tcl_GetDoubleFromObj shall treat +Inf and -Inf as it does any
    ordinary floating point number, and return an error for NaN.

 1. The [[binary scan]] and [[binary format]] commands shall explicitly
    permit infinities and NaN as values.

''Donal K. Fellows notes that...'' the '''expr''' command will handle Inf and NaN quite well if asked nicely; it just won't return them.

| % expr {1 / Inf}
| 0.0

~ References

The basic principles of correctly-rounded floating point conversions
have been known for a good many years.  Perhaps the two most seminal
papers on modern floating point conversion are:

 > William D. Clinger, ''How to Read Floating Point Numbers
   Accurately'', Proceedings of the ACM Conference on Programming
   Language Design and Implementation, June 20-22 1990, pp. 92-101.
   [http://citeseer.nj.nec.com/clinger90how.html]

 > Guy L. Steele Jr. and Jon L. White, ''How to print floating-point
   numbers accurately''. In Proceedings of the ACM Conference on
   Programming Language Design and Implementation, June 20-22 1990,
   pp. 112-126.  [http://doi.acm.org/10.1145/93542.93559]

Both of these papers inspired David Gay to implement a library to do
correct rounding in floating point input and output conversions:

 > David M. Gay, ''Correctly rounded binary-decimal and decimal-binary
   conversions'', Numerical Analysis Manuscript 90-10, AT&T Bell
   Laboratories, Murray Hill, New Jersey, November 1990.
   [http://citeseer.nj.nec.com/gay90correctly.html]

The algorithms for output conversion that appear in the reference
implmentation are a hybrid of Gay's and the ones presented in:

 > Robert G. Burger and R. Kent Dybvig, ''Printing Floating-Point Numbers
   Quickly and Accurately'', SIGPLAN Conf. on Programming Language Design
   and Implementation, 1996, pp. 108-116.
   [http://citeseer.ist.psu.edu/28233.html]

A reasonably comprehensive set of test vectors detailing problem cases
for rounding conversions is documented in:

 > Brigitte Verdonk, Annie Cuyt, Dennis Verschaeren, ''A precision and
   range independent tool for testing floating-point arithmetic II:
   Conversions'', ACM Transactions on Mathematical Software 27:2
   (March 2001), pp. 119-140.
   [http://citeseer.nj.nec.com/verdonk99precision.html]

~ Reference Implementation

A partial reference implementation of this TIP is available in CVS on the
''kennykb-tcl-numerics'' branch.  Since it requires additional third-party
code that is not yet in the core, a different module name is required
to check it out:

| export CVSROOT=:ext:yourname@cvs.sourceforge.net:/cvsroot/tcl
| cvs -q checkout -rkennykb-tip-132 tcl_numerics

The implementation is essentially complete; it includes changes to
the documentation and test suite needed for this TIP.

~ Acknowledgments

The discussion of just why 17 is magical was prompted by a suggestion
from Lars Hellstroem.

~ Copyright

Copyright (c) 2005 by Kevin B. Kenny.  All rights reserved.

This document may be distributed subject to the terms and conditions
set forth in the Open Publication License, version 1.0
[http://www.opencontent.org/openpub/].

