[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: constant folding/smashing



Since we're in the process of a CL overhaul, I'm not so concerned with
what's allowed as what should be.

Whether the (FOO BAR BAZ) is interpretable as read-only or not by a
compiler in
 (DEFUN FOO-BAR-OR-BAZ-P (X) (MEMBER X '(FOO BAR BAZ))
is not very interesting. Indeed, even in
 (DEFUN FOO-BAR-BAZ () '(FOO BAR BAZ))
I'm happy to have this return a read-only list or a list shared with
other programs because even when sharing doesn't occur, I could get
screwed by modifying the list. Consider:
 (DEFUN FOO-BAR-OR-BAZ-P (X) (MEMBER X (FOO-BAR-BAZ)))
 (DEFUN QUUXIFY (THING) (NCONC THING (LIST 'QUUX)))
 (QUUXIFY (FOO-BAR-BAZ)) => (FOO BAR BAZ QUUX))
 (QUUXIFY (FOO-BAR-BAZ)) => (FOO BAR BAZ QUUX QUUX))

HOWEVER, I am -very- concerned about the following case:
 (DEFVAR *FOO-LIST* (LIST (MAKE-ARRAY 5) (MAKE-ARRAY 5)))
 (DEFUN FOO-0 () '#,(NTH 0 *FOO-LIST*))
 (DEFUN FOO-1 () '#,(NTH 1 *FOO-LIST*))
I want to be programming in a language that guarantees that
if I modify (NTH 0 *FOO-LIST*), then the array returned by
FOO-0 will have been modified. Moreover, I want my language
of choice to guarantee that FOO-0 and FOO-1 will not return
EQ values. In programming terms, I want the following to be
true immediately after loading the preceding three forms:
 (EQ (FOO-0) (NTH 0 *FOO-LIST*)) => T
 (EQ (FOO-1) (NTH 1 *FOO-LIST*)) => T
 (EQ (FOO-0) (FOO-1)) 		 => NIL
I'm concerned that under CLTL it seems to me just as valid
to get:
 (EQ (FOO-0) (NTH 0 *FOO-LIST*)) => NIL
 (EQ (FOO-1) (NTH 1 *FOO-LIST*)) => NIL
 (EQ (FOO-0) (FOO-1)) 		 => T
In particular, if *FOO-LIST* is a registry of arrays or structures
which is expensive to select from but which might need to be
dynamically altered, and my purpose in using #, is to allow me to
do that access once and for all at load time, but I may not wish
to commit to the unchangingness of those arrays. If QUOTE is going to
foist constancy on me, I'm screwed.

There are solutions like using closures, but they are not guaranteed
to be fast, whereas QUOTE is fast in every dialect.

There are clearly two issues involved:
 - is the object evaluated or not
 - is the object constant or not

Long ago, I made a rule of thumb which I've not yet found to be violated:
 If you have two boolean quantities to represent,
 don't try to do it with one bit.
What happens when you violate this rule is that you let a user write
just 0 or 1 and then you waste endless time arguing over whether they
really meant 00, 01, 10, or 11 when instead you should have just let
them write 00, 01, 10, or 11 and been done with it. [Friends who read
drafts of this message have jokingly dubbed this "KMP's two-bit rule".
Whatever you call it, you'll quickly see that it is quite applicable to
this problem.]

People have legitimate reasons for having quoted constants and quoted
non-constants, so should have two primitives to control the orthogonal
features of evaluation and object constancy. Then programmers could
just say what they want to say and overzealous implementors wouldn't
keep trying to second-guess them and keep them from getting work done.

I don't know what these primitives should be called, exactly, but here
are some straw-men for people to discuss:

 #1: (QUOTE x) might mean X is not evaluated and is also read-only
     and (IMMEDIATE x) might mean X is not evaluated but is NOT read-only.

 #2: (QUOTE x) might mean X is not evaluated and NOT read-only
     (CONSTANT x) might mean X is not evaluated and is also read-only.

[The observant reader will note that this is only 1.5 bits (not 2 bits) of
information. You can write either x, (QUOTE x), or (CONSTANT x) but there
is nothing that says this is a constant, evaluated expression. I didn't
propose that just because no one is asking for it. If there were a strong
need, I might have proposed that [...] means like (...) except that it
allocates the list in read-only space. Then '[A B C] would be EQUAL to
'(A B C) but '[A B C] would designate an unchanging expression and '(A B C)
would not. There have been Lisp dialects which took this approach and got
useful mileage out of it, but I'm stopping short of proposing we go that
route.]

Having a facility like this, I could presumably then do either
(CONSTANT #,X) or (IMMEDIATE #,X) as appropriate ... or I could even
still use (QUOTE #,X) -- if only I knew which of CONSTANT and IMMEDIATE
QUOTE meant!