
&Rest Lists

    Date: Tue, 29 Mar 88 12:12 EDT
    From: ELIOT@cs.umass.edu

       From: Barry Margolin <barmar@think.COM>
       To: ELIOT@cs.umass.EDU
	  Date: Thu, 24 Mar 88 17:12 EDT
	  From: ELIOT@cs.umass.edu
	     From: Barry Margolin <barmar@think.COM>
	     To: ELIOT@cs.umass.EDU
	  What I mean by saying that "Using APPLY is probably inefficient
	  anyhow" is that you will incur the overhead of a full extra function
	  call that probably cannot be optimized away. ...
       What extra function call?  I would hope that the compiler will inline
       code calls to APPLY.  (apply #'foo '(1 2 3)) should generate code at
       least as good as (foo 1 2 3), and the only difference between (apply
       #'foo '(1 2 3)) and (apply foo '(1 2 3)) should be whether it extracts
       the function from FOO's function or value cell.  In the implementation
       I use (a Lispm), APPLY is a single instruction.

    Is it an instruction, or a microcoded subroutine?  :-)

I don't have access to the microcode source, so I can't tell.  All I
know is that it is one macroinstruction.  It is probably a microcode
routine, because it still has to do a significant amount of work, such
as checking the number of arguments supplied against the allowed number
of arguments for the function.
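That check is visible at the source level too.  A quick sketch in
portable Common Lisp (my own, not Lispm code); note that the standard
only requires the mismatch to be detected in safe code:

```lisp
;; APPLY spreads its last argument and must still match the callee's
;; lambda list, exactly as an ordinary call would.
(apply #'cons 1 '(2))            ; => (1 . 2)

;; In safe code, a count mismatch is detected at call time:
(handler-case (apply #'cons '(1))
  (error () :wrong-argument-count))
```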

       The only extra
       inefficiency that APPLY might incur is the one YOU are advocating --
       copying its last argument -- which makes your reasoning circular.

    I looked at both VAXLISP and NIL to see what they do.  Both of these
    implementations spread the arguments out on the stack in the calling
    function and collect them back into a list when entering the function
    with the &Rest argument.  Consequently they might use stack lists,
    but they cannot share list structure as you are proposing.

Because these implementors chose not to share the apply argument and the
&rest argument, they must do the spread and collect.  This extra
function call you mentioned is directly caused by the non-sharing.  You
implied that there was some inefficiency already inherent in APPLY, so
that adding the spread/collect wouldn't make it much slower.
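Whether the &rest list shares structure with APPLY's last argument is
precisely what a portable program cannot depend on -- the standard
permits but does not require sharing.  A small probe (my own sketch,
not from either implementation):

```lisp
(defun rest-list (&rest args)
  ;; Simply return the &rest list so the caller can inspect it.
  args)

(let ((l (list 1 2 3)))
  ;; The contents always match:
  (equal (apply #'rest-list l) l)   ; => T
  ;; ...but EQ-ness is implementation-dependent: a spread-and-collect
  ;; implementation like VAXLISP or NIL copies, so this is NIL there,
  ;; while a sharing implementation could return T.
  (eq (apply #'rest-list l) l))
```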

    A normal function call [i.e. (foo 1 2 3)] gets compiled as a series
    of individual PUSHES as the arguments are evaluated.  (Equivalently,
    memory for the entire set of arguments can be allocated on the stack
    and individual values MOVEd into the correct location.)

    A function call using APPLY would PUSH the initial arguments and then
    call a hand-coded assembly routine to push the rest of the arguments.
    Consequently, the state of the stack when FOO is entered is the same
    regardless of how FOO gets called, so FOO cannot differentiate
    calls moderated by APPLY from normal function calls.  Your APPLY
    microprogram might well do something similar; it certainly could.
    Calls made using APPLY are not specially optimized.  However, FOO can
    be compiled using full knowledge of the body of FOO.  So if FOO does
    not return its &Rest list it can be CONSed on the stack.  On a LISPM
    you might find it possible to change some type bits in the stack words
    to make the arguments PUSHed onto the stack into a CDR-coded list.

What I believe the Lispm APPLY does is to push each of the regular
arguments onto the stack, then push the list argument onto the stack,
and set a bit indicating that the last argument should be spread if
necessary.  When entering the callee, if there are fewer regular lambda
variables than regular arguments, the extra regular arguments are turned
into a stack-consed list whose tail is the list argument (as you
described in your last sentence above), and the &rest argument is bound
to this.  If there are more regular lambda variables than regular
arguments, then values are popped from the list argument and stored in
the appropriate stack locations for those lambda variables, and the
&rest arg is bound to what is left of the list argument.
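Both cases are observable at the source level, whatever the underlying
representation.  A sketch (the resulting list contents are what's
portable, not the sharing or stack-consing):

```lisp
;; Fewer regular lambda variables than regular arguments: the extra
;; regular argument 2 is collected onto the front of the list argument.
(defun f (a &rest r) (list a r))
(apply #'f 1 2 '(3 4))     ; => (1 (2 3 4))

;; More regular lambda variables than regular arguments: B and C are
;; filled from the list argument, and R gets what is left.
(defun g (a b c &rest r) (list a b c r))
(apply #'g 1 '(2 3 4 5))   ; => (1 2 3 (4 5))
```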

Except for the use of stack lists for regular arguments that become part
of an &rest variable, no special support is necessary for the above.
And the use of stack lists is independent of the other features.
Stack-consed &rest arguments merely guarantee that they NEVER cons.

    While we are discussing such low-level issues, it is worth thinking about
    how difficult it is to take advantage of the possibility of sharing 
    &Rest lists.  Basically there are two ways to do it.  If you are willing to
    give up or tremendously complicate separate compilation you can compile
    a function CALL using special knowledge of the definition of the Callee.
    For example, a call using APPLY could jump into the middle of the
    function entry sequence, after the point where &Rest lists normally
    get created.  This would be a major break with Lisp tradition.  Alternatively,
    you must use several different calling conventions and pass along 
    enough information so that the Callee knows how its arguments have been 
    encoded.  This will slow down every function call by the amount needed 
    to encode/decode this information, and complicate the implementation
    considerably.  Which option do you prefer?

Well, I guess what the Lispm does is the latter (actually, there is also
some of the former -- optional arguments are implemented by having each
of the first N instructions do the defaulting for the corresponding
argument, and the first M are skipped when M optional arguments have
been supplied).  But I don't think the overhead need be as much as you
think.  I suspect that it could be implemented such that its worst case
behavior is the same expense as NIL's regular case.
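At the source level, the defaulting work those skippable entry
instructions perform looks like this (an illustration of the semantics,
not Lispm code):

```lisp
;; Each optional's init-form compiles to a separate defaulting step;
;; a caller supplying M optionals effectively skips the first M steps.
(defun scale (x &optional (factor 2) (offset (* x factor)))
  (+ (* x factor) offset))

(scale 3)        ; FACTOR defaults to 2, OFFSET to (* 3 2) => 12
(scale 3 10)     ; FACTOR supplied, OFFSET defaults to 30   => 60
(scale 3 10 0)   ; both supplied                            => 30
```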

Here's how I would describe a good argument-processing algorithm (I'm
using Lisp for convenience; this code doesn't come from the Lispm or any
existing implementation):

(defun process-arguments (function arg-array spread-last-arg-p)
  ;; ARG-ARRAY is the portion of the activation record containing the
  ;; arguments, which were pushed by the caller.  SPREAD-LAST-ARG-P is
  ;; true when the caller used APPLY, i.e. the last element of
  ;; ARG-ARRAY is a list to be spread.
  (multiple-value-bind (n-bound-vars spread-last-var-p)
      ;; N-BOUND-VARS doesn't include the &rest var; ignore
      ;; optional/key for now.
      (function-arg-info function)
    (let ((n-ordinary-args (if spread-last-arg-p
			       (1- (length arg-array))
			       (length arg-array))))
      ;; Check for the optimal case first: everything already in place.
      (unless (and (= n-bound-vars n-ordinary-args)
		   (eq spread-last-arg-p spread-last-var-p))
	(let ((spread-arg (if spread-last-arg-p
			      (aref arg-array n-ordinary-args)
			      nil)))
	  (cond ((< n-bound-vars n-ordinary-args)
		 ;; Extra ordinary args: cons them onto the front of
		 ;; the spread arg, rightmost first.
		 (do ((i (1- n-ordinary-args) (1- i)))
		     ((< i n-bound-vars))
		   (push (aref arg-array i) spread-arg)))
		((> n-bound-vars n-ordinary-args)
		 ;; Too few ordinary args: pop values off the spread
		 ;; arg into the remaining bound-variable slots.
		 (do ((i n-ordinary-args (1+ i)))
		     ((= i n-bound-vars))
		   (if (null spread-arg)
		       (error "Not enough arguments supplied")
		       (push-argument (pop spread-arg))))))
	  (cond (spread-last-var-p (push-argument spread-arg))
		(spread-arg (error "Too many arguments supplied"))))))))

The above example doesn't try to do stack-consing.  Therefore its
worst-case performance is in the case of (apply #'(lambda (&rest arg)
...) <n individual args> NIL), consing n elements, just like NIL.  But
in the case where the &rest arg exactly matches the last argument to
APPLY, neither loop executes.  (apply #'(lambda (<n individual args>)
...) <list>) also does O(n) work, but it doesn't cons, so it is cheaper
than the first example.  In general, my algorithm is
O(|n-ord-args - n-ord-vars|), whereas NIL is O(length(last-apply-arg)).

       Well, I know that I use APPLY a whole lot.  Not as often as LIST and
       CAR, certainly, but more often than I would have expected.

    I don't use APPLY very much.  I am sure that neither of us are
    biased because of our programming styles, right?

Actually, I suspect it has to do with the applications more than style.
APPLY is used frequently when writing intermediary routines dealing with
things like files, because you are passed an options list suitable for
OPEN, and you must use APPLY to invoke OPEN.  Much of what I do requires
writing such intermediaries.
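A typical intermediary of that kind might look like this (a
hypothetical wrapper I'm making up for illustration, not from any real
system):

```lisp
(defun open-logged (pathname &rest open-options)
  ;; Forward whatever options the caller gave us straight to OPEN;
  ;; without APPLY there is no way to pass the list through.
  (format *trace-output* "~&Opening ~A~%" pathname)
  (apply #'open pathname open-options))
```

A caller writes (open-logged "foo.txt" :direction :output) exactly as it
would call OPEN itself.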

One nice thing about the modularity of the NIL generated code: it could
easily be converted to an implementation like this without changing the
code generator.  Something equivalent to the above could be implemented
in CL$MS_STACKIFY (it would do almost nothing), CL$MS_HNDL_RS (this
would contain the two loops), and CL$MS_ANDREST_LIST (this would
correspond to my last IF form).  In fact, how do I know from only
looking at the code you supplied that it DOESN'T do that?  (The answer
is that I assume you have looked at the implementation, and that it does
what you say.)