
*To*: Common-Lisp at sail
*Subject*: Issue #97, Colander page 134: floating-point assembly and disassembly
*From*: MOON at SCRC-TENEX
*Date*: Thu, 30 Sep 1982 09:55:00 -0000

I am not completely happy with the FLOAT-FRACTION, FLOAT-EXPONENT, and SCALE-FLOAT functions in the Colander edition. At the meeting in August I was assigned to make a proposal. I am slow.

A minor issue is that the range of FLOAT-FRACTION fails to include zero (of course it has to), and is inclusive at both ends, which means that there are two possible return values for some numbers. I guess that this ugliness has to stay because some implementations require this freedom for hardware reasons, and it doesn't make a big difference from a numerical analysis point of view. My proposal is to include zero in the range and to add a note about two possible values for numbers that are an exact power of the base.

A more major issue is that some applications that break down a flonum into a fraction and an exponent, or assemble a flonum from a fraction and an exponent, are best served by representing the fraction as a flonum, while others are best served by representing it as an integer. An example of the former is a numerical routine that scales its argument into a certain range. An example of the latter is a printing routine that must do exact integer arithmetic on the fraction.

In the agenda for the August meeting it was also proposed that there be a function to return the precision of the representation of a given flonum (presumably in bits); this would be in addition to the "epsilon" constants described on page 143 of the Colander.

A goal of all this is to make it possible to write portable numeric functions, such as the trigonometric functions and my debugged version of Steele's totally accurate floating-point number printer. These would be portable to all implementations but perhaps not as efficient as hand-crafted routines that avoided bignum arithmetic, used special machine instructions, avoided computing to more precision than the machine really has, etc.

Proposal:

SCALE-FLOAT x e -> y

y = (* x (expt 2.0 e)) and is a float of the same type as x.
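
As an illustration only, the SCALE-FLOAT definition can be sketched in Python, whose standard `math.ldexp` computes x * 2**e for binary floats. The names `scale_float` and `naive_scale` are hypothetical stand-ins introduced for this sketch; they are not part of the proposal or of any library, and the sketch assumes IEEE double-precision floats.

```python
import math


def scale_float(x: float, e: int) -> float:
    """Hypothetical stand-in for the proposed SCALE-FLOAT: x * 2**e."""
    return math.ldexp(x, e)


def naive_scale(x: float, e: int) -> float:
    """The defining expression, computed by exponentiating and multiplying."""
    return x * 2.0 ** e


# For ordinary arguments the two forms agree.
assert scale_float(0.75, 4) == naive_scale(0.75, 4) == 12.0

# With IEEE doubles, 2.0 ** 1024 is not representable, so the naive form
# fails on the intermediate power even though the final result is finite.
try:
    naive_scale(0.5, 1024)
    naive_overflowed = False
except OverflowError:
    naive_overflowed = True

assert naive_overflowed
assert scale_float(0.5, 1024) == 2.0 ** 1023
```
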
SCALE-FLOAT is more efficient than exponentiating and multiplying, and also cannot overflow or underflow unless the final result (y) cannot be represented. x is also allowed to be a rational, in which case y is of the default type (same as the FLOAT function). [x being allowed to be a rational can be removed if anyone objects. But note that this function has to be generic across the different float types in any case, so it might as well be generic across all number types.]

UNSCALE-FLOAT y -> x e

The first value, x, is a float of the same type as y. The second value, e, is an integer such that (= y (* x (expt 2.0 e))). The magnitude of x is zero or between 1/b and 1 inclusive, where b is the radix of the representation: 2 on most machines, but examples of 8 and 16, and I think 4, exist. x has the same sign as y. It is an error if y is a rational rather than a float, or if y is an infinity. (Leave infinity out of the Common Lisp manual, though.) It is not an error if y is zero.

FLOAT-MANTISSA x -> f
FLOAT-EXPONENT x -> e
FLOAT-SIGN x -> s
FLOAT-PRECISION x -> p

f is a non-negative integer, e is an integer, s is 1 or 0. (= x (* (SCALE-FLOAT (FLOAT f x) e) (IF (ZEROP s) 1 -1))) is true. It is up to the implementation whether f is the smallest possible integer (zeros on the right are removed and e is increased), or f is an integer with as many bits as the precision of the representation of x, or perhaps a "few" more. The only thing guaranteed about f is that it is non-negative and the above equality is true. f is non-negative to avoid problems with minus zero. s is 1 for minus zero even though MINUSP is not true of minus zero (otherwise the FLOAT-SIGN function would be redundant). p is an integer, the number of bits of precision in x. This is a constant for each flonum representation type (except perhaps for variable-precision "bigfloats").

[I am amenable to converting these four functions into one function that returns four values if anyone can come up with a name.
EXPLODE-FLOAT is the best so far, and it's not very good, especially since the traditional EXPLODE function has been flushed from Common Lisp. Perhaps DECODE-FLOAT.]

[I am amenable to adding a function that takes f, e, and s as arguments and returns x. It might be called ENCODE-FLOAT or MAKE-FLOAT. It ought to take either a type argument or an optional fourth argument, the way FLOAT takes an optional second argument, which is an example of the type to return.]

FTRUNC x -> fp ip

The FTRUNC function as it is already defined provides the fraction-part and integer-part operations.

These functions exist now in the Lisp machines, with different names and slightly different semantics in some cases. They are very easy to write.

Comments? Suggestions for names?
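
The decomposition and reassembly described above can be sketched as follows, again in Python and only as an illustration under stated assumptions. It assumes radix-2 IEEE doubles and uses the standard `math.frexp`, `math.ldexp`, `math.copysign`, and `sys.float_info.mant_dig`. The function names `unscale_float`, `decode_float`, and `encode_float` are hypothetical, chosen to echo the names floated in the message; the four-value form is one possible reading of the bracketed suggestion, not a settled design. Note that `math.frexp` always returns a fraction magnitude in [1/2, 1), so it picks one of the two values the proposal permits for exact powers of the base, and unlike the proposal it does not signal an error for an infinity.

```python
import math
import sys


def unscale_float(y: float) -> tuple:
    """Hypothetical UNSCALE-FLOAT: (x, e) with y == x * 2**e.

    The magnitude of x is zero or in [1/2, 1); x has the sign of y.
    """
    return math.frexp(y)


def decode_float(x: float) -> tuple:
    """Hypothetical four-value form: (f, e, s, p).

    f is a non-negative integer, e an integer, s is 1 or 0,
    and p is the number of bits of precision.
    """
    p = sys.float_info.mant_dig                # 53 for IEEE doubles
    s = 1 if math.copysign(1.0, x) < 0 else 0  # 1 for minus zero as well
    m, e = math.frexp(abs(x))
    f = int(math.ldexp(m, p))                  # m * 2**p is an exact integer
    return f, e - p, s, p


def encode_float(f: int, e: int, s: int) -> float:
    """Hypothetical ENCODE-FLOAT: rebuild x from f, e, and s."""
    y = math.ldexp(float(f), e)
    return -y if s else y


# Float-valued fraction, as a scaling routine would want it.
assert unscale_float(8.0) == (0.5, 4)

# Integer-valued fraction, as an exact printing routine would want it.
for x in (1.0, 0.1, -3.75, 0.0, 5e-324, sys.float_info.max):
    f, e, s, p = decode_float(x)
    assert f >= 0 and s in (0, 1)
    assert encode_float(f, e, s) == x          # the identity in the proposal

# Minus zero reports s = 1 even though it does not compare less than zero.
assert decode_float(-0.0)[2] == 1 and not (-0.0 < 0)
```

This choice of f always has as many bits as the precision for nonzero arguments; stripping the trailing zero bits and increasing e would be equally valid under the proposal.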
