TIP 341: Multiple 'dict filter' Patterns

Author:		Lars Hellström <[email protected]>
State:		Final
Type:		Project
Vote:		Done
Tcl-Version:	8.6
Created:	27-Nov-2008
Keywords:	Tcl, set intersection
Post-History:	
Tcl-Ticket:	2370575

Abstract

The key and value forms of dict filter are generalised to allow an arbitrary number of patterns.

Specification

The two dict filter command forms

dict filter dictionary key pattern

dict filter dictionary value pattern

are generalised to

dict filter dictionary key ?pattern ...?

dict filter dictionary value ?pattern ...?

and the result is the sub-dictionary consisting of those entries whose key (for the key form) or value (for the value form) matches at least one of the patterns.
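As an illustration of the generalised forms, here is a minimal sketch, assuming a Tcl 8.6 interpreter in which this TIP is implemented. The dictionary contents and variable names are invented for the example.

```tcl
package require Tcl 8.6

# Invented sample data.
set d {apple 1 avocado 2 banana 3 cherry 1}

# Key form: keep the entries whose key matches a* or c*.
set byKey [dict filter $d key a* c*]

# Value form: keep the entries whose value matches 1 or 3.
set byValue [dict filter $d value 1 3]

# No patterns at all: nothing can match, so the result is empty.
set none [dict filter $d key]
```

Here byKey holds the entries for apple, avocado and cherry, byValue holds those for apple, banana and cherry, and none is the empty dictionary.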

Rationale

Although there are dict subcommands for deleting some keys from a dictionary (dict remove) and for inserting some keys into a dictionary (dict replace), there is no direct way of requesting the sub-dictionary which only has keys from a given list. If we think only of the set of keys in the dictionary, then we have subcommands for set minus and set union, but none for set intersection. A situation where this would be useful is when the option dictionary for a high-level procedure contains options meant to be passed on to lower-level commands, and it is necessary to extract the sub-dictionary of options that the lower-level command accepts (since passing one which is not supported would cause it to throw an error).

There is of course already the dict filter command, which indeed returns a subdictionary of an existing dictionary, but its key form only accepts one string match pattern and therefore cannot be used to e.g. select all three of -foo, -bar, and -baz (it could select both -bar and -baz through the pattern -ba[rz], but that's neither common nor particularly readable). However, in many instances where this kind of pattern is used (notably glob, namespace export, and switch), it is possible to give several such patterns and have it interpreted as the union of the patterns. Were that the case with dict filter, the "-foo, -bar, and -baz" problem could be solved as easily as

  dict filter $opts key -foo -bar -baz

which is comparable to

  dict remove $opts -foo -bar -baz
  dict replace $opts -foo 1 -bar off -baz 42

and much nicer than the script counterpart

  dict filter $opts script {key val} {
     ::tcl::mathop::in $key {-foo -bar -baz}
  }
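To connect the multi-pattern form with the motivating use-case, the following sketch shows a high-level procedure forwarding only the options that a lower-level command accepts. Both procedures (lowLevel, highLevel) and the -title option are hypothetical and exist only for this illustration. Tcl 8.6 is assumed.

```tcl
package require Tcl 8.6

# Hypothetical lower-level command: rejects any option it does not know.
proc lowLevel {args} {
    foreach {opt val} $args {
        if {$opt ni {-foo -bar -baz}} {
            error "unknown option \"$opt\""
        }
    }
    return $args
}

# Hypothetical high-level procedure: -title is its own option, while
# -foo, -bar and -baz are meant for lowLevel.
proc highLevel {args} {
    # Extract the sub-dictionary of options that lowLevel accepts.
    set passOn [dict filter $args key -foo -bar -baz]
    return [lowLevel {*}$passOn]
}

set forwarded [highLevel -title Hello -foo 1 -baz 42]
```

Without the filtering step, the -title option would reach lowLevel and trigger its error.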

If the key form is generalised like this, then it seems appropriate to also generalise the value form in the same way to keep the symmetry, even though I have no immediate use-case for that feature.

Since it is generally good to Do Nothing Gracefully, the command syntax is also generalised to allow the case of no patterns at all.
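One case where the no-pattern form plausibly matters is when the patterns come from a list that may be empty. A small sketch, assuming Tcl 8.6; the helper restrictOptions is hypothetical.

```tcl
package require Tcl 8.6

# Hypothetical helper: keep only the entries of opts whose keys match one
# of the patterns in the list accepted, which is allowed to be empty.
proc restrictOptions {opts accepted} {
    return [dict filter $opts key {*}$accepted]
}

set opts {-foo 1 -bar off -baz 42}
set some [restrictOptions $opts {-foo -ba?}]   ;# all three entries match
set none [restrictOptions $opts {}]            ;# empty dictionary, no error
```

If at least one pattern were required, the caller would have to special-case the empty list before expanding it.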

Rejected Alternatives

A more direct way of meeting the motivating need would be a command dict select with the same syntax as dict remove (no pattern matching) but logic reversed. This would however be so close to dict filter ... key that extending the syntax of the latter seemed more appropriate.
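For comparison, the rejected command could be modelled at script level roughly as follows. This is only a sketch: dictSelect is a hypothetical procedure name, and no dict select subcommand is being claimed to exist.

```tcl
# Hypothetical script-level model of the rejected "dict select":
# exact key comparison, no pattern matching.
proc dictSelect {d args} {
    set result [dict create]
    foreach key $args {
        if {[dict exists $d $key]} {
            dict set result $key [dict get $d $key]
        }
    }
    return $result
}

set opts {-foo 1 -bar off -qux 7 a*b 9}
set exact [dictSelect $opts -foo a*b -missing]
```

The visible difference from dict filter ... key is that an argument such as a*b is compared literally here, whereas dict filter would treat it as a glob pattern.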

An alternative to allowing multiple patterns with dict filter could be to allow a regular expression pattern, since the union of two regular languages is again a regular language. Any syntax that could be picked for that would however on one hand already be rather close to

  dict filter $opts script {key val} {regexp $RE $key}

and on the other it would be rather difficult to read, as the regular expression corresponding to "-foo or -bar or -baz" is

  ^(-foo|-bar|-baz)$

which it is tempting but incorrect to simplify to "-foo|-bar|-baz".
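The difference between the two expressions can be checked directly. In this sketch the variable names are invented, and -foobar stands for any key that merely contains one of the option names.

```tcl
set anchored   {^(-foo|-bar|-baz)$}
set unanchored {-foo|-bar|-baz}

# The "--" marks the end of regexp's own switches; it is needed here
# because the unanchored expression itself begins with a dash.
set r1 [regexp -- $anchored   -foo]      ;# 1: exact option name
set r2 [regexp -- $anchored   -foobar]   ;# 0: the whole string must match
set r3 [regexp -- $unanchored -foobar]   ;# 1: "-foo" matches as a substring
```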

Implementation Notes

An implementation exists (it is trivial to modify dict filter ... value to work this way: just add an inner loop over the list of patterns); see SF patch #2370575. https://sourceforge.net/support/tracker.php?aid=2370575

What might be tricky is the case of dict filter ... key, since this currently has an optimisation for the case of a pattern without glob metacharacters that would be very desirable to keep for the motivating use-case of selecting specific keys from a dictionary. The natural way to do that would be to make the loop over patterns the outer loop and the loop over dictionary entries the inner loop, which is only entered if the current pattern contains metacharacters. Such an optimisation would however have the script-level-visible consequence of having the keys show up in the order of the patterns rather than the order of the original dictionary, so it may be a good idea to also explicitly specify that dict filter does not guarantee keys in the result to be in the same order as in the input dictionary.
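The loop structure described above, and its effect on key order, can be modelled at script level. This is a sketch only: filterKeysPatternOuter is a hypothetical procedure, and the regular expression used to detect glob metacharacters is a rough stand-in for whatever test the C implementation performs.

```tcl
# Hypothetical model of the "patterns outer, entries inner" strategy.
proc filterKeysPatternOuter {d args} {
    set result [dict create]
    foreach pat $args {
        if {![regexp -- {[*?\[\\]} $pat]} {
            # No glob metacharacters: a direct lookup replaces the scan.
            if {[dict exists $d $pat]} {
                dict set result $pat [dict get $d $pat]
            }
        } else {
            # Metacharacters present: scan every entry of the dictionary.
            dict for {key val} $d {
                if {[string match $pat $key]} {
                    dict set result $key $val
                }
            }
        }
    }
    return $result
}

set D {-baz 0 -bar 1 -foo 2}
set reordered [filterKeysPatternOuter $D -foo -bar *]
```

In this model the keys of reordered come out as -foo, -bar, -baz: the order of the patterns, not the order of D.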

Indeed, a dict filter ... key that reorders keys according to its pattern arguments could sometimes be useful in interactive situations, as a way of getting selected keys up front in a dictionary:

  set D {-baz 0 -bar 1 -foo 2}
  dict filter $D key -foo -bar *

On the other hand, this effect can mostly be obtained through use of dict merge already:

  dict merge {-foo x -bar x} $D
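The word "mostly" matters here, as the following sketch suggests. The second dictionary E and the variable names front and leftover are invented for the example.

```tcl
set D {-baz 0 -bar 1 -foo 2}
set front [dict merge {-foo x -bar x} $D]
# -foo and -bar now come first, carrying their values from D.

set E {-baz 0 -bar 1}
set leftover [dict merge {-foo x -bar x} $E]
# E has no -foo, so the placeholder value x survives in the result.
```

A reordering dict filter ... key would not introduce such a placeholder entry, since it only returns entries present in the input dictionary.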

Copyright

This document has been placed in the public domain.