
long-char, kanji



    Didn't I already suggest how to avoid paying a penalty for Japanese
    the last time this topic came up?  Instead of a two-way partitioning
    of the type space, have a three-way partitioning. ``PLUMP-STRING''
    uses 16 bits to represent the characters, with default style.

I've got no problem with this, but it seems to be what Moon was arguing
against earlier (except that he expanded it to a four-way partitioning
and then decided that THAT was too complex).

    ...Please don't quote CLtL at
    us, we know very well what the book says.  We consider this portion of
    the book to be very poorly thought out, and not suitable as a guide.
    Preserving the status quo here would be a mistake.  Let's not use the
    book as a substitute for doing language design.

First, the reason I quoted the manual at Moon was that I read his note
as saying that any vector of type character was obviously a string.  It
sounded like he was confused about the current state of things.  It is
not uncommon for people whose implementations have lots of extensions to
forget where Common Lisp leaves off and the extensions begin.  Even if
Moon was not confused himself (his note can be read either way), other
readers might be.  When I see something like that go by, I feel that it
is important to flag the problem before the possible confusion spreads
any further.  If Moon's note had clearly indicated that he was proposing
a change or extension to our current definition of string, then I
wouldn't have quoted the book at him.  (I would, however, have wondered
how you use the double-quote syntax to handle characters with bits that
have no compact printable representation, and characters with font
attributes that are perhaps not printable on the machine where the I/O
is occurring.)

For reasons indicated in my "proposed guidelines" message, I think that
we must start with the status quo, however stupid you or I might think
some part of it is.  Proposals to change the language spec are in
order, but they must be very clearly labeled as such, and the costs to
users and implementors must be considered as well as the benefits of
making a change.

I guess it does no harm for Symbolics to extend Strings to hold all
kinds of characters (if the extension is internally consistent), as long
as you don't use this in portable code and as long as you don't let this
extension contaminate any function in the LISP package...but that's
another discussion.

I also agree that the current definition of characters, with their bit
and font attributes, is a total mess, but it's one that we can live
with.  I'd love to make an incompatible change here and clean everything
up, but we have to move very carefully on such things.  There's a lot of
code out there that might be affected.  We should probably begin with a
survey of who would be screwed if char-bits and char-fonts went away.

-- Scott