[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
&Rest Lists
Date: Tue, 29 Mar 88 12:12 EDT
From: ELIOT@cs.umass.edu
From: Barry Margolin <barmar@think.COM>
To: ELIOT@cs.umass.EDU
Date: Thu, 24 Mar 88 17:12 EDT
From: ELIOT@cs.umass.edu
From: Barry Margolin <barmar@think.COM>
To: ELIOT@cs.umass.EDU
What I mean by saying that "Using APPLY is probably inefficient
anyhow" is that you will incur the overhead of a full extra function
call that probably cannot be optimized away. ...
What extra function call? I would hope that the compiler will inline
code calls to APPLY. (apply #'foo '(1 2 3)) should generate code at
least as good as (foo 1 2 3), and the only difference between (apply
#'foo '(1 2 3)) and (apply foo '(1 2 3)) should be whether it extracts
the function from FOO's function or value cell. In the implementation
I use (a Lispm), APPLY is a single instruction.
Is it an instruction, or a microcoded subroutine? :-)
I don't have access to the microcode source, so I can't tell. All I
know is that it is one macroinstruction. It is probably a microcode
routine, because it still has to do a significant amount of work, such
as checking the number of arguments supplied against the allowed number
of arguments for the function.
The only extra
inefficiency that APPLY might incur is the one YOU are advocating --
copying its last argument -- which makes your reasoning circular.
I looked at both VAXLISP and NIL to see what they do. Both of these
implementations spread the arguments out on the stack in the calling
function and collect them back into a list when entering the function
with the &Rest argument. Consequently they might use stack lists,
but they cannot share list stucture as you are proposing.
Because these implementors chose not to share the apply argument and the
&rest argument, they must do the spread and collect. This extra
function call you mentioned is directly caused by the non-sharing. You
implied that there was some inefficiency already inherent in APPLY, so
that adding the spread/collect wouldn't make it much slower.
A normal function call [i.e. (foo 1 2 3)] gets compiled as a series
of individual PUSHES as the arguments are evaluated. (Equivalently,
memory for the entire set of arguments can be allocated on the stack
and individual values MOVEd into the correct location.)
A function call using APPLY would PUSH the initial arguments and then
call a hand-coded assembly routine to push the rest of the arguments.
Consequently, the state of the stack when FOO is entered is the same
regardless of how FOO gets called, so FOO cannot differentiate
calls moderated by APPLY from normal function calls. Your APPLY
microprogram might do something similar, and it certainly could.
Calls made using APPLY are not specially optimized. However, FOO can
be compiled using full knowledge of the body of FOO. So if FOO does
not return its &Rest list it can be CONSed on the stack. On a LISPM
you might find it possible to change some type bits in the stack words
to make the arguments PUSHed onto the stack into a CDR-coded list.
What I believe the Lispm APPLY does is to push each of the regular
arguments onto the stack, then push the list argument onto the stack,
and set a bit indicating that the last argument should be spread if
necessary. When entering the callee, if there are fewer regular lambda
variables than regular arguments, the extra regular arguments are turned
into a stack-consed list whose tail is the list argument (as you
described in your last sentence above), and the &rest argument is bound
to this. If there are more regular lambda variables than regular
arguments, then values are popped from the list argument and stored in
the appropriate stack locations for those lambda variables, and the
&rest arg is bound to what is left of the list argument.
Except for the use of stack lists for regular arguments that become part
of an &rest variable, no special support is necessary for the above.
And the use of stack lists is independent of the other features.
Stack-consed &rest arguments merely guarantee that they NEVER cons.
While we are discussing such low-level issues, it is worth thinking about
how difficult it is to take advantage of the possibility of sharing
&Rest lists. Basically there are two ways to do it. If you are willing to
give up or tremendously complicate separate compilation you can compile
a function CALL using special knowledge of the definition of the Callee.
For example, a call using APPLY could jump into the middle of the
function entry sequence, after the point where &Rest lists normally
get created. This would be a major break with Lisp tradition. Alternatively,
you must use several different calling conventions and pass along
enough information so that the Callee knows how its arguments have been
encoded. This will slow down every function call by the amount needed
to encode/decode this information, and complicate the implementation
considerably. Which option do you prefer?
Well, I guess what the Lispm does is the latter (actually, there is also
some of the former -- optional arguments are implemented by having each
of the first N instructions do the defaulting for the corresponding
argument, and the first M are skipped when M optional arguments have
been supplied). But I don't think the overhead need be as much as you
think. I suspect that it could be implemented such that its worst case
behavior is the same expense as NIL's regular case.
Here's how I would describe a good argument processing algorithm (I'm using Lisp for
convenience, this code doesn't come from the Lispm, or any existing implementation):
(defun process-arguments (function arg-array spread-last-arg-p)
;; ARG-ARRAY is the portion of the activation record containing the arguments,
;; which were pushed by the caller.
(multiple-value-bind (n-bound-vars spread-last-var-p) ; n-bound-vars doesn't
; include &rest var
(function-arg-info function) ;; ignore optional/key for now
(let ((n-ordinary-args (if spread-last-arg-p
(1- (length arg-array))
(length arg-array))))
;; Check for optimal case first
(unless (and (= n-bound-vars n-ordinary-args)
(eq spread-last-arg-p spread-last-var-p))
(let ((spread-arg (if spread-last-arg-p
(aref arg-array n-ordinary-args)
nil)))
(cond ((< n-bound-vars n-ordinary-args)
(do ((i (1- n-ordinary-args) (1- i)))
((= i n-bound-vars))
(push (aref arg-array i) spread-arg)))
((> n-bound-vars n-ordinary-args)
(do ((i n-bound-vars (1+ i)))
((= i n-ordinary-args))
(if (null spread-arg) (error "Not enough arguments supplied")
(push-argument (pop spread-arg))))))
(if (and spread-arg spread-last-var-p)
(push-argument spread-arg)
(error "Too many arguments supplied")))))))
The above example doesn't try to do stack-consing. Therefore its
worst-case performance is in the case of (apply #'(lambda (&rest arg)
...) <n individual args> NIL), consing n elements, just like NIL. But
in the case where the &rest arg exactly matches the last argument to
APPLY, neither loop executes. (apply #'(lambda (<n individual args>)
...) <list>) also does O(n) work, but it doesn't cons, so it is cheaper
than the first example. In general, my algorithm is O(n-ord-args -
n-ord-vars), whereas NIL is O(length(last-apply-arg)).
Well, I know that I use APPLY a whole lot. Not as often as LIST and
CAR, certainly, but more often than I would have expected.
I don't use APPLY very much. I am sure that neither of us are
biased because of our programming styles, right?
Actually, I suspect it has to do with the applications more than style.
APPLY is used frequently when writing intermediary routines dealing with
things like files, because you are passed an options list suitable for
OPEN, and you must use APPLY to invoke OPEN. Much of what I do requires
this.
One nice thing about the modularity of the NIL generated code: it could
easily be converted to an implementation like this without changing the
code generator. Something equivalent to the above could be implemented
in CL$MS_STACKIFY (it would do almost nothing), CL$MS_HNDL_RS (this
would contain the two loops), and CL$MS_ANDREST_LIST (this would
correspond to my last IF form). In fact, how do I know from only
looking at the code you supplied that it DOESN'T do that? (The answer
is that I assume you have looked at the implementation, and it does what
you say).
barmar