Tuesday, June 12, 2012

TCL - Dictionaries as alternative to arrays


Tcl arrays are collections of variables, rather than values. This has advantages in some situations (e.g., you can use variable traces on them), but also has a number of drawbacks:
  • They cannot be passed directly to a procedure as a value. Instead you have to use the array get and array set commands to convert them to a value and back again, or else use the upvar command to create an alias of the array.
  • Multidimensional arrays (that is, arrays whose index consists of two or more parts) have to be emulated with constructions like:
    set array(foo,2) 10
    set array(bar,3) 11
    The comma used here is not a special piece of syntax, but instead just part of the string key. In other words, we are using a one-dimensional array, with keys like "foo,2" and "bar,3". This is quite possible, but it can become very clumsy (there can be no intervening spaces for instance).
  • Arrays cannot be included in other data structures, such as lists, or sent over a communications channel, without first packing and unpacking them into a string value.
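The two workarounds from the first point can be sketched as follows. The procedure names sumByValue and sumByName and the array name score are made up for this illustration; only array get, array set, array names and upvar are standard Tcl commands.

```tcl
# Hypothetical helper: receives the array contents as a flat key/value list.
proc sumByValue {pairs} {
    array set local $pairs          ;# rebuild a private copy of the array
    set total 0
    foreach key [array names local] {
        incr total $local($key)
    }
    return $total
}

# Hypothetical helper: receives the *name* of the array and aliases it.
proc sumByName {arrName} {
    upvar 1 $arrName arr            ;# arr now refers to the caller's array
    set total 0
    foreach key [array names arr] {
        incr total $arr($key)
    }
    return $total
}

set score(foo,2) 10
set score(bar,3) 11

set total1 [sumByValue [array get score]]   ;# pass a converted copy
set total2 [sumByName score]                ;# pass the array's name
puts "$total1 $total2"
```

Either way the caller has to do something extra: convert the array to a list first, or pass its name instead of its contents.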
Tcl 8.5 introduced the dict command. This provides efficient access to key-value pairs, just like arrays, but dictionaries are pure values. This means that you can pass them to a procedure just as you would a list or a string, without the need for upvar or for array get and array set. Tcl dictionaries are therefore much more like Tcl lists, except that they represent a mapping from keys to values, rather than an ordered sequence.
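A minimal sketch of what "pure value" means in practice. The procedure names describe and withSurname are invented for this example; the dict subcommands themselves are standard.

```tcl
# Hypothetical procedure: takes a dictionary by value, no upvar needed.
proc describe {person} {
    return "[dict get $person forenames] [dict get $person surname]"
}

# Hypothetical procedure: modifies its own copy and returns it.
proc withSurname {person newSurname} {
    dict set person surname $newSurname   ;# changes the local copy only
    return $person
}

set client      [dict create forenames Joe surname Schmoe]
set description [describe $client]
set changed     [withSurname $client Bloggs]

puts $description
puts [dict get $client surname]    ;# the caller's dictionary is untouched
puts [dict get $changed surname]
```

Because the dictionary is passed as a value, the change made inside withSurname is only visible in the value it returns.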
Unlike arrays, you can nest dictionaries, so that the value for a particular key consists of another dictionary. That way you can elegantly build complicated data structures, such as hierarchical databases. You can also combine dictionaries with other Tcl data structures. For instance, you can build a list of dictionaries that themselves contain lists.
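As a rough illustration of such combinations, the sketch below builds a list of dictionaries whose values are lists, and a small nested dictionary. The variable names orders and db and all the keys are made up for the example.

```tcl
# A list of dictionaries; each dictionary holds a list under the key "items".
set orders [list \
    [dict create customer Anne items {apple pear}] \
    [dict create customer Joe  items {plum}]]

foreach order $orders {
    set count [llength [dict get $order items]]
    puts "[dict get $order customer]: $count item(s)"
}

# A nested dictionary: each extra key goes one level deeper.
dict set db clients 1 address city Paris
set city [dict get $db clients 1 address city]
puts $city
```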
Here is an example (adapted from the man page):
#
# Create a dictionary:
# Two clients, known by their client number,
# with forenames, surname
#
dict set clients 1 forenames Joe
dict set clients 1 surname   Schmoe
dict set clients 2 forenames Anne
dict set clients 2 surname   Other

#
# Print a table
#
puts "Number of clients: [dict size $clients]"
dict for {id info} $clients {
    puts "Client $id:"
    dict with info {
        puts "   Name: $forenames $surname"
    }
}
What happens in this example is:
  • We fill a dictionary, called clients, with the information we have on two clients. The dictionary has two keys, "1" and "2" and the value for each of these keys is itself a (nested) dictionary — again with two keys: "forenames" and "surname". The dict set command accepts a list of key names which act as a path through the dictionary. The last argument to the command is the value that we want to set. You can supply as many key arguments to the dict set command as you want, leading to arbitrarily complicated nested data structures. Be careful though! Flat data structure designs are usually better than nested for most problems.
  • The dict for command then loops through each key and value pair in the dictionary (at the outer-most level only). dict for is essentially a version of foreach that is specialised for dictionaries. We could also have written this line as:
        foreach {id info} $clients { ... }
    This takes advantage of the fact that, in Tcl, every dictionary is also a valid Tcl list, consisting of a sequence of name and value pairs representing the contents of the dictionary. The dict for command is preferred when working with dictionaries, however, as it is both more efficient, and makes it clear to readers of the code that we are dealing with a dictionary and not just a list.
  • To get at the actual values stored under each client ID we use the dict with command. This command takes the dictionary and unpacks it into a set of local variables in the current scope. For instance, in our example, the "info" variable on each iteration of the outer loop will contain a dictionary with two keys: "forenames" and "surname". The dict with command unpacks these keys into local variables with the same name as the key and with the associated value from the dictionary. This allows us to use a more convenient syntax when accessing the values, instead of having to use dict get everywhere. A related command is dict update, which lets you specify exactly which keys you want to convert into variables. Be aware that any changes you make to these variables will be copied back into the dictionary when the dict with command finishes.
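The write-back behaviour of dict with and dict update is easy to overlook, so here is a small sketch of it. It reuses the clients layout from the example above; the variable name s is arbitrary.

```tcl
dict set clients 1 forenames Joe
dict set clients 1 surname   Schmoe

# dict with can follow a key path: this unpacks the dictionary stored
# under key 1, and writes changed variables back when the body ends.
dict with clients 1 {
    set forenames Joseph
}
puts [dict get $clients 1 forenames]

# dict update maps only the keys you name to variables of your choosing.
set info [dict get $clients 1]      ;# a separate copy of the inner dictionary
dict update info surname s {
    set s [string toupper $s]
}
puts [dict get $info surname]
puts [dict get $clients 1 surname]  ;# unchanged, since info is a copy
```

If you only want to read the values and not risk changing the dictionary, plain dict get avoids the write-back altogether.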
The order in which elements of a dictionary are returned during a dict for loop is defined to be the chronological order in which keys were added to the dictionary. If you need to access the keys in some other order, then it is advisable to explicitly sort the keys first. For example, to retrieve all elements of a dictionary in alphabetical order, based on the key, we can use the lsort command:
foreach name [lsort [dict keys $mydata]] {
    puts "Data on \"$name\": [dict get $mydata $name]"
}
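To see the difference between the two orders, consider this small sketch. The contents of mydata are invented for the illustration.

```tcl
dict set mydata pear  3
dict set mydata apple 1
dict set mydata mango 2

# dict keys (like dict for) reports keys in the order they were added.
set inserted [dict keys $mydata]

# lsort gives an alphabetical ordering of the same keys.
set sorted [lsort [dict keys $mydata]]

puts "insertion order: $inserted"
puts "sorted order:    $sorted"
```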

Example

In this example, we convert the simple database of the previous lessons to work with dictionaries instead of arrays.
#
# The example of the previous lesson revisited - using dicts.
#

proc addname {dbVar first last} {
    upvar 1 $dbVar db

    # Create a new ID (stored in the name array too for easy access)
    dict incr db ID
    set id [dict get $db ID]

    # Create the new record
    dict set db $id first $first
    dict set db $id last  $last
}

proc report {db} {

    # Loop over the last names: make a map from last name to ID
    dict for {id name} $db {
        # Create a temporary dictionary mapping from
        # last name to ID, for reverse lookup
        if {$id eq "ID"} {
            continue
        }
        set last [dict get $name last]
        dict set tmp $last $id
    }

    #
    # Now we can easily print the names in the order we want!
    #
    foreach last [lsort [dict keys $tmp]] {
        set id [dict get $tmp $last]
        puts "   [dict get $db $id first] $last"
    }
}

#
# Initialise the array and add a few names
#
dict set fictional_name  ID 0
dict set historical_name ID 0

addname fictional_name Mary Poppins
addname fictional_name Uriah Heep
addname fictional_name Frodo Baggins

addname historical_name Rene Descartes
addname historical_name Richard Lionheart
addname historical_name Leonardo "da Vinci"
addname historical_name Charles Baudelaire
addname historical_name Julius Caesar

#
# Some simple reporting
#
puts "Fictional characters:"
report $fictional_name
puts "Historical characters:"
report $historical_name
Note that in this example we use dictionaries in two different ways. In the addname procedure, we pass the dictionary variable by name and use upvar to make a link to it, as we did previously for arrays. We do this so that changes to the database are reflected in the calling scope, without having to return a new dictionary value. (Try changing the code to avoid using upvar). In the report procedure, however, we pass the dictionary as a value and use it directly. Compare the dictionary and array versions of this example (from the previous lesson) to see the differences between the two data structures and how they are used.
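One possible way to do the suggested exercise is sketched below. The name addnameValue is made up here to keep it apart from the original addname. It takes the database as a value and returns the updated dictionary, so the caller has to store the result.

```tcl
# Hypothetical variant of addname: no upvar, the new dictionary is returned.
proc addnameValue {db first last} {
    # db is a local copy here, so these commands do not affect the caller
    dict incr db ID
    set id [dict get $db ID]

    dict set db $id first $first
    dict set db $id last  $last

    return $db
}

set fictional_name [dict create ID 0]
set fictional_name [addnameValue $fictional_name Mary Poppins]
set fictional_name [addnameValue $fictional_name Uriah Heep]

puts "Records stored: [dict get $fictional_name ID]"
puts "Second record:  [dict get $fictional_name 2 first] [dict get $fictional_name 2 last]"
```

The trade-off is that every call site must assign the result back; forgetting to do so silently discards the new record.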
