[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Re: Hashables and GC]



   From:	IN%"goldman@vaxa.isi.EDU"  4-AUG-1988 17:57
   To:	ELIOT@cs.umass.EDU
   Subj:	[Re: Hash Tables and GC]
             Not only does the GC have to prove that the reference
       count for BOTH key and VALUE is one, but the GC also has to prove
       that there are no other objects, x, such that:
       	(<Equality-Predicate> x KEY)
   
   There is no particular concern about whether other pointers to the VALUE
   exist; only pointers to members of the KEY's equivalence class.  

Agreed, but this has nothin gto do with my main point.

   Look at
   the problem not as one of GC, but as one of "when is it safe to remove
   an entry from a hash table?"  [once it has been removed that may make
   the value garbage (reduce its reference count to 0)].  It is certainly
   SAFE to remove a set of entries Ei= [Ki Vi] if, once the Ei are removed,
   there is no way to construct a reference to an object equivalent to any Ki.

Everything you say is precisely correct.  It is a consequence of what you
say that requires more emphasis.  Consider computationally implementing your
phrase "there is no way to construct a reference to an object equivalent to
any Ki."  For EQ hash tables this is somewhat straightforward because of:
	for-all x, y if (EQ x y) then (x = y)
i.e. there is only one object EQ to any given object.  Hence the size of the
set of objects that can be constructed which are equivalent (EQ) to Ki is one.

Now consider any hash table other than an EQ hash table.  In this case the
size (cardinality) of the set of objects that can be constructed which are
equivalent (#'TEST) to Ki is (potentially) infinite, except when the Ki are
some type of object for which EQUAL and EQ are the same (most atomic types)
in which case it is still one.

Now whatever unspecified process determines that it is safe to remove the
keys and values must prove that there is no reference to any element
of this potentially infinite set.  This is clearly impractical except in
some erratic and implementation specific special cases.  

What I am saying is that this feature is basically meaningless whenever
the test is not (effectively) EQ.  I don't think Common Lisp should have
features which are never really going to work in any implementation.

On the other hand, when restricted to EQ and EQL hash tables (excluding
numbers/characters) I think this is a reasonable feature.  It must
be specified what happens when a Key is an interned but GCTWA symbol.

In fact I think this feature could reasonably be driven down a level, into
MAKE-ARRAY.  Consider this proposal, upon which the kind of hash-tables
you want could be implemented.

PROPOSAL:  Add to MAKE-ARRAY the keywords :GC-PROTECTING (default T) and :KEY
(default #'IDENTITY).  If GC-PROTECTING is NIL and (the KEY of) any element
in the array cannot otherwise be referenced then the GC is allowed to
replace the entire array element with NIL.
=====

The :key argument makes it easy to construct hash tables of the type you
want.  The lack of a :test argument avoids the problems I am bothered by.
Arrays are a more basic data structure than hash tables, so this might 
have more uses.  (The :GC-PROTECTING keyword can be inherited by 
make-hash-table too.)

Something like this was proposed while I was implementing the editor for NIL.
It would allow the editor to cache data structures after buffers are killed
or large regions deleted, without entirely preventing the GC from reclaiming
the memory.  This cacheing was particularly important because of a
negative temporal displacement in the implementation of the garbage collector.

Christopher Eliot
University of Massachusetts at Amherst


   If there are references to KEYS from inside VALUES, some schemes may fail
   to remove entries that are safe to remove, but this is just like reference
   count GCers that fail to reclaim certain circular structures even though
   they are garbage.  You lose space, but don't get incorrect results.
   
   
       Even EQ hash tables have problems, in any implementation with
       immediate representations for objects.  For example the key 57236
       (a fixnum) may not be interned and therefor there is no way to tell
       how many copies of it exist.
   
   Considering the paragraph of CLTL beginning on pg 77 and continuing to the top
   of pg 78, entries in EQ hash tables having integers or characters as the
   KEY can serve NO PORTABLE purpose except for usage through
   MAPHASH and/or HASH-TABLE-COUNT.
   
   Neil