[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

EQUALP hash tables

To: Fahlman@CMU-CS-C.ARPA
Subject: EQUALP hash tables
From: Jon White <JLW@SU-AI.ARPA>
Date: Fri, 17 May 1985 12:07:00 -0000
Cc: common-lisp@SU-AI.ARPA

In-reply-to: your message of 11-May-85  10:59 EDT

As for names for the "equivalence" and "numericalizing" functions for
hash tables, well, the :test argument is already the "equivalence" argument;
how about :sxhash for the numericalizing argument?  This has the mnemonic
value that the default for it could simply be #'SXHASH.

I see no reason to restrict the possible values that :sxhash may have --
even when the :test argment is #'EQ; as my note last week was intending
to point out, EQ-type hash tables needn't be constrained to use pointer
bits for the :sxhash -- it's only a vague general feeling that we don't
want to have to re-hash whenever some object in th table is updated that
prevents use of #'SXHASH here.  But for many (maybe even most?) applications
there is really no need to worry about the updating problem;  after all,
who really worrys about it with EQUAL type tables?  and it's just as
serious a problm there too, as Hoey pointed out.

How  would you feel about this?  The default :sxhash value will be an internal
version of MAKNUM *** when the :test arg is #'EQ o #'EQL *** rather than
SXHASH (appropriate conversions from 'EQ and 'EQL).  iven this, is there any
need to have the user supply the information that the table will have to be
re-hash'd after a relocating GC?  This condition should only occur when the
:sxhash function is somehow a function of the entry's pointer address -- and
by fiat you are ruling out any user-level function that returns that address.
Can anyone think of a case where this isn't true?

Finally, I'd say your suggestion for a REHASH function is quite timely.
Interlisp  has such a function:
  (REHASH hash-array new-array)
where both argments are hash-arrays, and the first one is merely re-hashed
into the second; a NIL for the second arg means something like (I think)
"cons up a fresh hash array just big enough to hold all the elements of
the first argument, and it's :rehash-threshold size"; and a T for the
second arg means something like (I think) "cons up a fresh hash array big
enough to be the growth by the :rehash-size of the first argument".
[well, if it doesn't operate exactly like that, it should].  

A useful scenario is to make a huge hash table, make many entries over a
period of time, and then finally decide that not may more entries will
be made; thus one will want to "shrink" the table to the minimum feasible
size for holding the current entries.  REHASH, as defined above, with a
second arg of NIL is just the right thing.

-- JonL --

Prev by Date: SXHASH on circular structures?
Next by Date: Re: Poor excuse for packages
Previous by thread: SXHASH on circular structures?
Next by thread: SXHASH and EQ hash tables
Index(es):
- Date
- Thread