TIP:		341
Title:		Multiple 'dict filter' Patterns
Version:	$Revision: 1.5 $
Author:		Lars Hellström <Lars.Hellstrom@residenset.net>
State:		Final
Type:		Project
Vote:		Done
Tcl-Version:	8.6
Created:	27-Nov-2008
Keywords:	Tcl, set intersection
Post-History:	

~ Abstract

The '''key''' and '''value''' forms of '''dict filter''' are generalised to
allow an arbitrary number of patterns.

~ Specification

The two '''dict filter''' command forms

 > '''dict filter''' ''dictionary'' '''key''' ''pattern''

 > '''dict filter''' ''dictionary'' '''value''' ''pattern''

are generalised to

 > '''dict filter''' ''dictionary'' '''key''' ?''pattern'' ...?

 > '''dict filter''' ''dictionary'' '''value''' ?''pattern'' ...?

and the result in each case is the sub-dictionary of those entries whose key
(respectively value) matches at least one of the patterns.
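
As a small illustration of the generalised forms, here is a sketch that
assumes a Tcl 8.6 interpreter; the dictionary contents and variable names are
made up for the example:

```tcl
# Illustrative option dictionary (contents are assumptions for this example).
set opts {-foo 1 -bar off -baz 42 -qux yes}

# Entries whose key matches -foo or the glob pattern -ba*.
set byKey [dict filter $opts key -foo -ba*]

# Entries whose value matches 1 or 42.
set byValue [dict filter $opts value 1 42]

# No patterns at all: nothing can match, so the result is empty.
set none [dict filter $opts key]
```

Here byKey holds the -foo, -bar, and -baz entries, byValue holds the -foo and
-baz entries, and none is the empty dictionary.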

~ Rationale

Although there are '''dict''' subcommands which allow deleting some keys from
a dictionary ('''dict remove''') and inserting some keys into a dictionary
('''dict replace'''), there is no direct way of requesting the sub-dictionary
which only has keys from a given list; if we think of only the set of keys in
the dictionary, then we have subcommands for set minus and set union, but none
for set intersection. This would be useful, for example, when the option
dictionary for a high-level procedure contains options meant to be passed on
to lower-level commands, and it is necessary to extract the sub-dictionary of
options that the lower-level command accepts (since passing an unsupported
option would cause it to throw an error).

There is of course already the '''dict filter''' command, which indeed returns
a subdictionary of an existing dictionary, but its '''key''' form only accepts
one '''string match''' pattern and therefore cannot be used to e.g. select all
three of -foo, -bar, and -baz (it could select both -bar and -baz through the
pattern -ba[rz], but that's neither common nor particularly readable).
However, in many instances where this kind of pattern is used (notably
'''glob''', '''namespace export''', and '''switch'''), it is possible to give
several such patterns and have it interpreted as the union of the patterns.
Were that the case with '''dict filter''', the "-foo, -bar, and -baz" problem
could be solved as easily as

|  dict filter $opts key -foo -bar -baz

which is comparable to

|  dict remove $opts -foo -bar -baz
|  dict replace $opts -foo 1 -bar off -baz 42

and much nicer than the '''script''' counterpart

|  dict filter $opts script {key val} {
|     ::tcl::mathop::in $key {-foo -bar -baz}
|  }

If the '''key''' form is generalised like this, then it seems appropriate to
also generalise the '''value''' form in the same way to keep the symmetry,
even though I have no immediate use-case for that feature.

Since it is generally good to Do Nothing Gracefully, the command syntax is
also generalised to allow the case of no patterns at all.
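
The zero-pattern case matters mainly when the list of patterns is computed
rather than written out. A minimal sketch, assuming Tcl 8.6; the helper name
selectOptions and the option names are hypothetical:

```tcl
# Hypothetical helper: keep only those entries of $opts whose key matches
# one of the patterns in the list $accepted, which is allowed to be empty.
proc selectOptions {opts accepted} {
    return [dict filter $opts key {*}$accepted]
}

set opts {-foo 1 -bar off -width 80}

# Typical use: pick out what a lower-level command understands.
set forLow [selectOptions $opts {-foo -bar}]

# Degenerate use: an empty list of accepted options yields an empty
# dictionary instead of a "wrong # args" error.
set forNone [selectOptions $opts {}]
```

Without the zero-pattern form, the caller would have to special-case an empty
list before expanding it with {*}.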

~ Rejected Alternatives

A more direct way of meeting the motivating need would be a command '''dict
select''' with the same syntax as '''dict remove''' (no pattern matching) but
logic reversed. This would however be so close to '''dict filter''' ...
'''key''' that extending the syntax of the latter seemed more appropriate.

An alternative to allowing multiple patterns with '''dict filter''' could be
to allow a regular expression pattern, since the union of two regular
languages is again a regular language. Any syntax that could be picked for
that would however on one hand already be rather close to

|  dict filter $opts script {key val} {regexp $RE $key}

and on the other it would be rather difficult to read, as the regular
expression corresponding to "-foo or -bar or -baz" is

|  ^(-foo|-bar|-baz)$

which it is tempting but incorrect to simplify to "-foo|-bar|-baz".
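
The pitfall can be checked directly. In this sketch the key -foobar is a
made-up example, and the "--" is there because the unanchored pattern itself
begins with a dash and would otherwise be taken for a '''regexp''' switch:

```tcl
set loose  {-foo|-bar|-baz}
set strict {^(-foo|-bar|-baz)$}

# The unanchored pattern matches any string merely containing -foo.
set looseHit [regexp -- $loose -foobar]     ;# 1

# The anchored, grouped pattern only accepts the three exact keys.
set strictHit [regexp -- $strict -foobar]   ;# 0
set strictOk  [regexp -- $strict -bar]      ;# 1
```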

~ Implementation Notes

An implementation exists (it is trivial to modify '''dict filter'''
... '''value''' to work this way: just add an inner loop over the list of
patterns); see SF patch #2370575.
[https://sourceforge.net/support/tracker.php?aid=2370575]

What might be tricky is the case of '''dict filter''' ... '''key''', since
this currently has an optimisation for the case of a pattern without glob
metacharacters that would be very desirable to keep for the motivating
use-case of selecting specific keys from a dictionary. The natural way to do
that would be to make the loop over patterns the outer loop and the loop over
dictionary entries the inner loop, which is only entered if the current
pattern contains metacharacters. Such an optimisation would however have the
script-level-visible consequence of having the keys show up in the order of
the patterns rather than the order of the original dictionary, so it may be a
good idea to also explicitly specify that '''dict filter''' does not guarantee
keys in the result to be in the same order as in the input dictionary.

Indeed, a '''dict filter''' ... '''key''' that reorders keys according to its
pattern arguments could sometimes be useful in interactive situations, as a
way of getting selected keys up front in a dictionary:

|  set D {-baz 0 -bar 1 -foo 2}
|  dict filter $D key -foo -bar *

On the other hand, this effect can mostly be obtained through use of '''dict
merge''' already:

|  dict merge {-foo x -bar x} $D
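
The two loop orders discussed in these notes can be modelled at script level
to make the ordering consequence visible. This is only an illustrative sketch,
not the C implementation: the procedure names are hypothetical, and the test
for glob metacharacters is an approximation.

```tcl
# Approximate test for "no glob metacharacters" (an assumption of this model).
proc isTrivialPattern {p} {
    foreach c [list * ? \[ \\] {
        if {[string first $c $p] >= 0} { return 0 }
    }
    return 1
}

# Model 1: dictionary entries in the outer loop, patterns in the inner loop.
# Keys come out in the order of the input dictionary.
proc filterEntriesOuter {d args} {
    set result {}
    dict for {k v} $d {
        foreach p $args {
            if {[string match $p $k]} {
                dict set result $k $v
                break
            }
        }
    }
    return $result
}

# Model 2: patterns in the outer loop. A pattern without metacharacters is
# handled by direct lookup; the dictionary is only scanned for real globs.
# Keys come out in the order in which patterns first select them.
proc filterPatternsOuter {d args} {
    set result {}
    foreach p $args {
        if {[isTrivialPattern $p]} {
            if {[dict exists $d $p]} {
                dict set result $p [dict get $d $p]
            }
        } else {
            dict for {k v} $d {
                if {[string match $p $k]} {
                    dict set result $k $v
                }
            }
        }
    }
    return $result
}

set D {-baz 0 -bar 1 -foo 2}
set inOrder   [filterEntriesOuter  $D -foo -bar *]
set reordered [filterPatternsOuter $D -foo -bar *]
```

With these definitions, inOrder keeps the keys in the order -baz, -bar, -foo,
while reordered lists them as -foo, -bar, -baz; both contain the same entries.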

~ Copyright

This document has been placed in the public domain. 
