[ERR5RS] Rationale and ease of use

Sat Sep 8 08:58:05 PDT 2007

On 9/6/07, Andre van Tonder <andre at het.brown.edu> wrote:
>
> On Tue, 4 Sep 2007, Lynn Winebarger wrote:

    Thanks for your thoughtful response, Andre.  I will present an
alternative viewpoint, which I hope will help delineate the subset of
programs that should be defined by ERR5RS.  Otherwise I would just be
venting my spleen without making a positive contribution.
    I disagree with most of your assertions below to the extent they are
unqualified as being for consistency with R6RS.  Consistency with R6RS
(short of bug-for-bug compatibility, to use Will's words) is a reasonable
goal without having to make overly broad claims.  First, I need to make a
digression.

    After some contemplation, I think a significant problem is the nature of
"define".  To my knowledge, it is the only construct in R5RS which falsifies
the "a very small number of rules for forming expressions, with no
restrictions on how they are composed" statement.  In particular, you can't
place defines arbitrarily in a body.  What's more, the chosen semantics for
internal defines are inconsistent with those of top-level defines.  I
suppose the intent was to remove some layers of nesting, but the effect in
R6RS has been taken to the extreme - top-level defines have been redefined
to be consistent with internal defines!
     For example, let's take
(define bar 1)
(define foo
   (lambda (x)
      (if x
          (define bar 100)
          (set! bar (+ bar 1)))
      #t))

There is one sensible, consistent way to interpret this code, which is that
the (define bar 100) establishes a global binding of bar, not a local one.
The requirement derives from lexical scoping.   This is illustrated by the
following sequence of expressions (hopefully it is obvious why this sequence
represents the undesirable alternative):
(foo #f)
(display bar) => 2
(foo #t)
(display bar) => 100
(foo #f)
(display bar) => 101
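
For reference, here is a minimal sketch of the two behaviours R5RS itself
assigns to define (sections 5.2.1 and 5.2.2).  The names counter and bump
are hypothetical, chosen only for illustration:

```scheme
;; Top level: the first define creates the binding; a second define of
;; an already-bound variable behaves essentially like set!.
(define counter 1)
(define counter 2)

;; Internal: define at the head of a body creates a fresh local binding,
;; as if by letrec, and shadows the top-level one.
(define (bump)
  (define counter 100)
  (set! counter (+ counter 1))
  counter)

(bump)    ; => 101
counter   ; => 2, the top-level binding is untouched
```

The same keyword thus mutates an existing binding in one position and
introduces a new one in the other.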

What does this have to do with "shared" versus "separate" bindings?  Only
that the internal-define requirements are all about separating phases, where
the top-level definitions (in the presence of syntax-case, anyway) are
effectively evaluated in phases above the expansion of all subsequent
expressions, with bindings shared in downward phases.   You can write
programs that obey the internal-define requirements with top-level
semantics, but not vice-versa.  Therefore the top-level notion of define
(and hence the implied phasing and down-phase sharing of bindings) is more
desirable for a power-hungry programmer.
    Alas, I do not harbor any hope that the meaning of the define keyword
will be made consistent in any foreseeable version of Scheme.

> > There were a lot of messages in that thread.  In the first message,
> > Andre prefered a single set of bindings for all phases, but then
> > changed his mind a few messages later.
>
> I think it is clear that if one insists on a single set of bindings
> for expand-time and runtime, one would have to recompile a program
> before running it every time one restarts a session, or otherwise
> include the compile-time image in the compiled object file.  I think
> that would be an undesirable constraint to impose on compiler-based
> Scheme systems.
>
      Only the reachable compile-time bindings.  Whether it is desirable
depends on the user of the language, not the compiler writer.
      This also suffers from the internal-vs-top-level define problem - a
programmer can write programs that do not need to share variables between
phases in implementations that do not require it, but not vice-versa.   Note
that I am not suggesting run-time should be able to modify compile time
values, just refer to them.  It seems to me the environment of the higher
phase could be regarded as a module implicitly exporting its bindings to the
run-time, with the corresponding restriction on mutation.

> (For simplicity and performance, my implementation does share a single
> memory location for the bindings of the same identifier in phases
> higher than "expand", but it syntactically prohibits access to
> bindings, even though they may be present in memory, that
> are used outside their declared levels.  So to an extent enforcing
> levels syntactically is orthogonal to whether the actual objects
> are shared or not.  Communication between levels is possible using
> mutation of shared objects, but this is not portable and should not
> be used.  The only way in which information can reliably be
> communicated between levels is through syntax objects and datum->syntax
> on objects that have an external representation, as described in the
> thread you mentioned).

    Yes, I saw Matthew Flatt denigrating 3D macros.  For my taste,
implementations should be required to accept arbitrary data in syntax
objects, in the same spirit that other values are self-quoting [yes, I am
conflating constants with values].  It's not clear why limiting the
programmer to expressions that have an external syntactic representation is
a good idea.  On the one hand, you see what the compiler writer wishes to
provide, and on the other what the program author wishes to write.  If an
expression appears meaningful, it should be meaningful.

> I agree with the premise of R6RS that portable programs be
> written to not depend on the unspecified details of the instantiation
> semantics.  This could be seen as analogous to the constraint that
> portable programs must not depend on the unspecified evaluation order
> of function calls.

    But this is an incorrect statement.  A program whose intended meaning
depends upon the order of evaluation is not portable, but I can certainly
write programs whose output/state varies with the order of evaluation, yet
does not depend on that order for correctness. In the same way, I might have
a use for knowing the order of instantiation of libraries without requiring
any particular order for correct functioning.
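
A minimal sketch of that distinction in R5RS Scheme; the names trace, note
and sum are hypothetical, used only for illustration:

```scheme
;; trace records the order in which operands were evaluated.
(define trace '())

(define (note tag value)
  (set! trace (cons tag trace))   ; side effect: remember who ran
  value)

;; R5RS leaves the order of operand evaluation unspecified.
(define sum (+ (note 'left 1) (note 'right 2)))

sum     ; => 3 under any evaluation order
trace   ; (right left) if the left operand ran first, (left right) otherwise
```

The state of trace varies with the evaluation order, yet the program is
correct either way as long as nothing downstream relies on that order.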

> > I know I have found an analogous feature of PLT's
> > syntax-case incredibly annoying (not being able to directly use
> > variable definitions inside a syntax-case body).
>
> I know it is annoying, but it is also correct.  It avoids the even
> more annoying phenomenon of code that works in the REPL
> suddenly stopping working when you compile the same code (at compilation
> time the runtime variable does not have value, so the reference to it
> from syntax-case will fail).  A good reference on this kind of thing
> can be found in Matthew Flatt's paper whose title ends with
> "you want it when", which I would recommend if you have not already
> read it.

     If you mean "correct" in any sense other than "portable R6RS", you are
not making a defensible claim.  What is annoying is that the compiler
doesn't make the code work like it did at the REPL, not that the REPL does
the right thing.

> >     If this is correct, could you (Andre, Will, or both) provide a
> > single coherent explanation of why the separated binding are the
> > better semantics?  In any case, as a user I would prefer to know that
> > it was going to be one or the other for certain.
>
> As Will said, programs should not depend on unspecified aspects of the
> semantics.  Maybe a good question for err5rs is whether this
> semantics-independence should be enforced by the implementation
> (e.g., providing only syntax-rules) or whether one should come
> up with a simple set of programmer's responsibilities that would,
> if followed, ensure such independence.

       I don't think it should be enforced by the semantics.   That is part
of the problem with R6RS.  The reason for ERR5RS to adopt a strict subset of
the R6RS library syntax and semantics is that no consensus appears reachable
on what the semantics of the unrestricted subset should be.  It is a shame
R6RS will not explicitly state what programs will have a definite meaning,
and explicitly leave the rest undefined.  The more programs you give
definitions, the fewer implementations will support ERR5RS.
      At least, this is how I understand this aspect of the project, and
believe this email demonstrates why the reason is sound.  I have also come
to understand the lament of some rrrs mailing list poster that an
implementor's workshop might encourage implementors to (heaven forfend) be
involved in defining the language they are implementing.

Lynn
PS.  Separated bindings are really a kind of dynamic scoping.  The following
illustrates what I think lexical scoping would require (visibility of
bindings in downward phases):
(define bar 10)
(define-syntax foo
  (lambda (x)
     (define bar 'a)
     (syntax-case x ()
       ((_) #'bar)
       ((_ y) #'(set! bar y)))))
bar => 10
(foo) => a
(foo 'b) =>
bar => 10
(foo) => b