TIP:            170
Title:          Better Support for Nested Lists
Version:        $Revision: 1.6 $
Author:         Sergey Babkin <babkin@freebsd.org>
Author:         Don Porter <dgp@users.sf.net>
Author:		Donal K. Fellows <dkf@users.sf.net>
State:          Draft
Type:           Project
Vote:           Pending
Created:        30-Jan-2004
Post-History:   
Tcl-Version:    8.7

~ Abstract

Nested lists are easy to create with Tcl but then manipulating them is not
easy. For example, think about how to change a value nested in a list 2 levels
deep? How about 4 levels deep? The proposed new commands make such
manipulation easy, and make the nested lists a great replacement for arrays
and structures in C-like languages.

~Rationale and Specification

The new proposed commands start with the prefix "ldeep". They are desgined to
resemble the classic list commands with "l" names. In the cases when the
meaning of "ldeep" command differs substantially from the "l" command, the
name has been selected to be different (such as "ldeeprep", not
"ldeepreplace").

The commands have been extensively used in the Not A Commander project
[http://nac.sf.net/] and have been adjusted to the observed needs of practical
use. All these commands use the concept of "path" (see [22]) and handle the
concatenation of paths for reasons of convenience (e.g. a "base" path followed
by a "local" path).

~~ Commands

The new proposed commands are:

 > '''ldeepset''' ''listvar path'' ?''path''?... ''value''

Set a ''value'' in a nested list variable. If the variable did not exist
previously, it is created. If the intermediate lists specified in the path did
not exist, they are created as empty lists. This includes the "filler"
elements: for example, if ''listvar'' contained a list of one element and the
path starts with "5 ...", the elements with indexes 1, 2, 3, 4 and 5 will all
be created (and will contain empty strings) and the the further creation
within the element at index 5 will proceed.

A special meaning is assigned by this command to the negative indexes in the
path: they mean "add a new element at the end of the list". So this command
also doubles as a nested version of lappend. For example,

|   ldeepset listvar -1 value

means the same thing as

|   lappend listvar value

This merging has happened because it's often neccessary to add elements to the
lists in the middle of the path. The particular value used to indicate the
addition of an element can be changed to something more symbolic, for example
to "append" instead of a negative value.

The ''ldeepset'' command returns nothing. Since the version without value as
in the common set can not be used, returning the value did not seem to make
sense.  Also when experimenting with large lists from the command line,
returning the value that is a large list itself would cause a long and
unpleasant printout of it.

''(This is only partially superceded by [33] and [331].)''

 > '''ldeepincr''' ''lstvar path'' ?''path''?... ''int-value''

Increase a value within a nested list by ''int-value''. Note that since the
amount of increase has to be differentiated from the ''path'', it's mandatory
even for the value of 1. This is a convenient and often used shortcut for the
'''ldeepindex'''-'''expr'''-'''ldeepset''' sequence.  It returns the value of
the element after increase.

 > '''ldeeprep''' ''lstvar path'' ?''path''?... ''first last element-list''

Replace a range of elements in a sublist identified by the ''path'' with the
elements from the ''element-list''. It returns nothing.

This command is different from '''lreplace''' in two ways, hence the name
change. First, it acts on data in a variable, not on a list as an argument.
Second, the elements for replacement are contained in a list, not as separate
elements on command line. Both differences were created for convenience of
practical use, plus to allow the path to pick up the variable number of
arguments. I have found that I always need to replace elements in a variable,
not as a pass-through operator, and that I almost always need to insert
elements from another list, not just some fixed set of elements.

 > '''ldeeppop''' ''lstvar path'' ?''path''?... ''count''

Remove ''count'' elements from the end of the sublist identified by ''path''
in the variable and return them in a list.

This command was inspired by the pop operator in Perl. Somehow I've never has
a very string need for the other similar commands (which would be
''ldeeppush'' to add elements, and ''ldeepshift'' and ''ldeepunshift'' for
operations on the start of the list) but they can be easily added as well for
completeness.

The command returns the list of the popped elements in the original order. For
example, if ''lstvar'' contained {0 1 2 3 4 5}, "ldeeppop lstvar {} 2" would
return {4 5}, NOT {5 4}.

~Other Extensions for List Support

In my practice I have found that a few other commands make working with lists
much more convenient. They are not directly related to the nested lists but to
the lists in general.

 > '''lconcat''' ''sublist'' ?''sublist''?...

Concatenate the argument lists and return the resulting list. This command is
similar to '''concat''' but avoids converting the values to strings,
concatenating the strings and then re-parsing the strings. When the lists
involved grow to a few megabytes, '''concat''' can become very inefficient
both in the sense of time and memory usage; '''lconcat''' resolved this
inefficiency. Note that it does ''not'' replace '''concat''', which can be
used to assemble lists from pieces in different argument strings. The command
returns the concatenated list.

''(Note that this is largely possible through using'' '''list''' ''and
[157]/[293], and that'' '''concat''' ''is more likely to handle the lists
where this matters efficiently anyway right now due to efficiency tweaks to
its implementation. It remains to be seen whether there is still a need for''
'''lconcat''' ''in other areas...)''

~Reference Implementation

The reference implementation is available as part of the Not A Commander
project [http://nac.sf.net/], the source file ''cutil.c''. To include the new
commands into Tcl, the error messages will have to be adjusted to match the
style used in Tcl, and the man pages will have to be written. The current
implementation has been tested in fair amounts for both correctness and
efficiency by usage in the Not A Commander project, the formal test suite
would have to be written. Further progress in this direction depends on
acceptance of this proposal.

~Comments

Messages on the TCLCORE mailing list have pointed out that nearly everything
proposed here is either equivalent to, or trivially created by composition of
the existing commands '''lindex''', '''lrange''', '''lset''', '''lassign''',
and '''lrepeat'''.  The only novel thing proposed is the ability of '''lset'''
to create new list elements.  If that's still desired, a separate new TIP
proposing that alone would be the best way to deal with that.

I call on the author to withdraw this TIP.

~Copyright

This document has been placed in the public domain.

----

~ Appendix: Removed Commands

''These commands were originally part of this TIP, but have been removed to
this appendix on the grounds that the functionality they describe is now part
of Tcl via other TIPs.''

 > '''ldeepindex''' ''list path'' ?''path''?...

Extract an element from a nested list. The element is identified by the
logical concatenation of the paths. It returns the extracted element.
''(Obsoleted by [22].)''

 > '''ldeeplen''' ''list path'' ?''path''?...

Find the length of a nested sublist. The sublist is identified by the logical
concatenation of the paths. It returns the found length. A non-existing
sublist is assumed to have the length of 0.
''(Obsoleted by [22] and normal llength.)''

 > '''ldeeprange''' ''list path'' ?''path''?... ''first last''

Extract a sublist from a nested list. This command is a convenient equivalent
of

|   lrange [ldeepindex list path ?path?] first last]

"end" is supported as the first or last index. It returns the extracted
sublist.
''(Obsoleted by [22] and normal lrange.)''

 > '''mset''' ''list-of-variable-names list-of-values''

Set the values from the value list to the variable in the variable list at the
same index. The command name stands for "multiple set". This command is
inspired by assignments in Perl.

If there were more variables than the values, the rest of variables are set to
an empty value. If there were more values than variables, then the last
variable is assigned the whole end of a list (as a list).

A special variable name "'''-'''" can be used to throw away a value, or the
whole end of the value list if it's specified as the last one in the variable
list.

The command returns nothing. It can be argued that it would make sense to
return the length of the original list, or the difference between the length
of the values list and the length of the variables list. I don't know which
one is better.

This command is particularly convenient for returning multiple values from a
procedure. For example:

|   proc xxx {a b} {
|      return [list [expr $b+$a] [expr $b*$a]]
|   }
|   mset {sum product} [xxx 1 2]

''(Obsoleted by [57].)''

 > '''ldup''' ''count element'' ?''element''?...

This returns a list produced by duplicating the sequence of elements ''count''
times. This command is inspired by the operator "x" in Perl, and it is a
logical equivalent of:

|   proc ldup {count args} {
|      set res {}
|      for {set i 0} {$i < $count} {incr i} {
|         set res [lconcat $res $args]
|      }
|      return $res
|   }

''(Obsoleted by [136].)''

 > '''lvdup''' ''count element-list''

This returns a list produced by duplicating the ''element-list count'' times.
This command is inspired by the operator "x" in Perl, and it is a logical
equivalent of:

|   proc lvdup {count list} {
|      set res {}
|      for {set i 0} {$i < $count} {incr i} {
|         set res [lconcat $res $list]
|      }
|      return $res
|   }

''(Obsoleted by [136] and [157]/[293].)''

