[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

DEFSETF proposal, revised



This replaces the version I mailed out a few days ago.  Some obscurities
and Lisp-machine-specific things have been removed, and the English has
been improved and clarified.  Please let me know if there are any problems
with this.

The DEFSETF documented in the Laser edition is deficient in several
respects.  The UNCONS example (about as obscure as one can get!) fails to
agree with the English.  The manual is inconsistent about who is
responsible for ensuring proper order of evaluation.  It doesn't seem to be
possible to write the DEFSETF for LDB; ignoring order of evaluation and
returned-value issues, (SETF (LDB x y) z) turns into (SETF y (DPB z x y));
note how the "y" argument is used as both an "Lvalue" and an "Rvalue".
This DEFSETF similarly gives no help in implementing such macros as INCF,
SHIFTF, and PUSH, which take a place that is both read and written.  The
function generated by DEFSETF doesn't have a portable name, so the user may
not know how to trace and otherwise debug it.  (This is best attributed to
the omission of function specs from Common Lisp, so I won't mention it
again.)

I would like to propose a modified form of DEFSETF:

There are three forms of DEFSETF, rather than the two in the Laser
manual.  This helps a lot.

The body of DEFSETF is in charge of making sure that the right value is
returned.  In most cases, this works simply by calling a storing
function that is guaranteed to return the right value.  The SETF system
is in charge of evaluating the evaluatable subforms of the SETF (or
other) special form in left-to-right order, and once each (no more
and no less), or generating code whose effect is equivalent to that.
It is aided in this by some information supplied by the body of the
DEFSETF.

A DEFSETF defines the "SETF method" for generalized-variable forms with
a certain symbol in the car.  (The use of the word "method" here has
nothing to do with message-passing or flavors.)  The underlying theory is
that the SETF method for a given form can be expressed as five values:

	- a list of temporary variables
	- a list of values to bind them to (subforms of the given form)
	- a second list of temporary variables ("store variables")
	- a storing form
	- an accessing form

The store variables are to be bound to the values of the form to be stored.
In almost all cases we store single values and there is only one store
variable.

The storing form and the accessing form can be evaluated any number of
times, inside the bindings of the temporary variables (and of the store
variables, in the case of the storing form).  The accessing form returns
the value of the generalized variable.  The storing form modifies the
value of the generalized variable, and guarantees to return the store
variable(s) as its value(s); these are the correct values for SETF,
INCF, etc. to return.  The value returned by the accessing form is
(of course) affected by execution of the storing form.

The temporary variables and the store variables are gensyms in all cases,
so that there is never any issue of name clashes with them.  This is
necessary to make the special forms that do multiple setfs in parallel work
properly; these are PSETF, SHIFTF, and ROTATEF.  Computation of the SETF
method must always cons new gensyms.

To give some examples:

	- for a variable X
		() () (G0001) (SETQ X G0001) X

	- for (CAR exp)
		(G0001) (exp) (G0002) (RPLACA2 G0001 G0002) (CAR G0001)
	  where RPLACA2 is assumed to be a version of RPLACA that returns
	  its second argument.

I guess &OPTIONAL and &REST should be allowed in the store variables, if
we want to keep the features involving them that are in the Laser edition
DEFSETF.  I don't currently support this.

SETF methods are defined via DEFSETF.  There are three forms of DEFSETF.

  - the trivial form (DEFSETF access-function storing-function [documentation])
    The storing-function takes the same arguments as the access-function,
    plus one additional argument, which is the value to store, and returns
    that as its value.

    Example: (DEFSETF SYMBOL-VALUE SET)

  - the functional form (DEFSETF access-function (args...) (store-variables...)
			  [documentation]
			  body...)
    The access-function is truly a function and evaluates its arguments.
    Furthermore a SETF reference evaluates all the arguments (i.e. this is not
    a "destructuring" case).  Args are the lambda-list for the
    access-function; &OPTIONAL and &REST are permitted.  Optional args may
    have defaults, although many cases that need them may prefer to use the
    general form (below).
    A DEFSETF of this form expands into code to return the five SETF-method
    values, including creating gensym temporary variables and gensym store
    variables.  The list of values to bind the temporary variables to is
    simply the cdr of the reference, except when there are optional arguments
    in which case code is generated to plug in the default values as required.
    The accessing form is access-function applied to the temporary variables.
    The result of evaluating the body is the storing form.  During the
    evaluation of the body, the variables in args and in store-variables will
    be bound to the corresponding gensyms.  The gensyms will in turn be bound
    to the arguments to the access function, and the values being stored, at
    run time.

    Example: (DEFSETF SUBSEQ (SEQUENCE START &OPTIONAL END) (NEW-SEQUENCE)
		`(PROGN
		   (REPLACE ,SEQUENCE ,NEW-SEQUENCE :START1 ,START :END1 ,END)
		   ,NEW-SEQUENCE))

    Note how in the functional form most of the "hair" required to evaluate
    things in the correct order is taken care of by DEFSETF and SETF, and the
    user need merely write a simple backquote specifying what is to be done.

  - the general form (DEFINE-SETF-METHOD access-function (subforms...)
			[documentation]
			body...)
    The result of evaluating the body must be five values, as documented
    above.  The subforms variables are bound to a destructuring of the cdr of
    the reference, just as in DEFMACRO.  Note that the general case differs
    from the functional case in that while the body is being executed the
    general case's subforms pattern is bound to parts of the reference form,
    whereas the functional case's args pattern is bound to gensyms.

    I gave this a separate name rather than kludging up DEFSETF's syntax
    somehow with an extra level of parentheses or something.  It's reasonable
    for this to have a longer name since it is more low-level.

    Example: (Sorry, there are by definition no small examples of this.)

    (DEFINE-SETF-METHOD LDB (BYTE WORD)
      (MULTIPLE-VALUE-BIND (WORD-TEMPS WORD-VALS WORD-STORES WORD-STORE-FORM WORD-ACCESS-FORM)
	  (GET-SETF-METHOD-1 WORD)		;Find out how to access WORD
	(LET ((BTEMP (GENSYM))			;Temporary variable for byte specifier
	      (STORE (GENSYM))			;Temporary variable for byte to store
	      (WORD (FIRST WORD-STORES)))	;Temporary variable for word to store
	  (VALUES (CONS BTEMP WORD-TEMPS)
		  (CONS BYTE WORD-VALS)
		  (LIST STORE)
		  `(LDB ,BTEMP			;To preserve value, assume optimizable out
			(LET ((,WORD (DPB ,STORE ,BTEMP ,WORD-ACCESS-FORM)))
			  ,WORD-STORE-FORM))
		  `(LDB ,BTEMP ,WORD-ACCESS-FORM)))))

    GET-SETF-METHOD-1 gets the SETF-method values for a form, dealing
    with all macro-expansion and error-checking issues.  In addition it
    complains if there isn't exactly one store variable.  Like most
    SETF'able things, LDB deals with single values only, not multiple
    values.

Another thing that is useful is

    DEFINE-MODIFY-MACRO name (args) function [documentation]

    where args may contain &OPTIONAL and &REST, and the expansion
    is like the following except that it has correct semantics:

	(DEFMACRO name (location args...)
	  `(SETF ,location (function ,location ,args...)))

    Example: (DEFINE-MODIFY-MACRO INCF (&OPTIONAL (DELTA 1)) +)

    I anticipate that users will want to define many more of these
    in addition to the mere two that Common Lisp has built in.
    It would of course be a loss if it was too hard for users to
    define modify macros that follow proper order of evaluation
    semantics, so that they end up defining inferior ones.
    I have needed the following, for example:

	(DEFINE-MODIFY-MACRO UNIONF (OTHER-SET) UNION)
		
A good test case for whether your SETF system works properly is to try
macro-expanding (INCF (LDB FOO BAR)); if it doesn't generate a SETQ of
BAR, it isn't right.  This exercises most of the complexity, except for
multiple values.


Thanks to Alan Bawden for explaining the underlying theory of the five
values to me.

By the way, this is all implemented and works.  The code is optimized,
removing any temporary-variable bindings that are not actually necessary,
using code-analysis tools that I intend to offer as a yellow-pages
package once I have rewritten them to be portable.  Some implementations
may have smart enough compilers that they don't need this, possibly.
The new LOOP will depend on some of these tools also.