[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

user hash tables



I have needed these badly.  I have two suggestions:

I frequently want to store objects that have their retrieval keys in them
already; for example, a structure representing a symbol in a table often
has the string for the symbol sitting in the structure already for various
reasons.  It is pointlessly inefficient to represent this in the hash
table as a pair (key . thing) when the thing contains the key already.

What you need here is a set implemented as a hash table, where the
definition of equality does comparison on some piece of the structure
of the object stored in the table, while hash tables as currently
designed are obviously intended to be like association lists.  If a hash
set could be created with user-supplied :hash and :equality routines,
it would be very easy to do this.  I believe that a hashed set
abstraction is also useful in its own right.

Now, in the example of a symbol table it is often the case that you
have a string in hand, and you want to find out if it is already in
the symbol table.  If it is, you do nothing; if not, you want to create
a symbol structure with various auxilliary fields and store it in the
table.  Hash sets as proposed above can be extended to do this very nicely
by adding a :insert routine, which is called when something is added
("adjoined", I guess) to the table.  In this example, the insert routine
would take the string and table as an argument and create the symbol
table structure just before stuffing into the appropriate hash bucket.

Given hash sets in this form, it would be very easy to implement hash
tables by having the insert routine cons up a (key . object) pair.
(Actually, a lookup would return the pair and I don't remember whether
CL hash tables return this or the object.)

I have used a hash set implementation like this at CMU on the PQCC
project and at Tartan Labs (the :insert routine idea was John Nestor's).

Tables of this form don't quite do everything: sometimes I have wanted the
key to be a combination of several fields of the structure being stored
(say "name" field and "type" or "package").  I suppose the routines could
be extended to take &REST arguments or whatever.

One of the properties I usually want from a hash table is speed.  Pulling
hash and equality routines out of a hash table object and calling them
is obviously going to add a fair amount of overhead to the whole process.
It would be nice if users could define a new type of hash table by
some method similar to DEFSTRUCT, and have these routines defined as
macros that could be expanded at compile time (but please don't suggest
hairing up DEFSTRUCT even more to do this!).