Inlab Scheme implements hygienic macros differently by introducing syntax-lambda expressions.
This appendix describes an extension to Scheme that allows programs to define and use new derived expression types. A derived expression type that has been defined using this extension is called a macro.
Derived expression types introduced using this extension have the syntax
(<keyword> <datum>*)where <keyword> is an identifier that uniquely determines the expression type. This identifier is called the syntactic keyword, or simply keyword, of the macro. The number of the <datum>s, and their syntax, depends on the expression type.
Each instance of a macro is called a use of the macro. The set of rules, or more generally the procedure, that specifies how a use of a macro is transcribed into a more primitive expression is called the transformer of the macro.
The extension described here consists of three parts:
With this extension, there are no reserved identifiers. The syntactic keyword of a macro may shadow variable bindings, and local variable bindings may shadow keyword bindings. All macros defined using the pattern language are "hygienic" and "referentially transparent":
This appendix is divided into three major sections. The first section describes the expressions and definitions used to introduce macros, i.e. to bind identifiers to macro transformers.
The second section describes the pattern language. This pattern language is sufficient to specify most macro transformers, including those for all the derived expression types from section Derived expression types. The primary limitation of the pattern language is that it is thoroughly hygienic, and thus cannot express macros that bind identifiers implicitly.
The third section describes a low-level macro facility that could be used to implement the pattern language described in the second section. This low-level facility is also capable of expressing non-hygienic macros and other macros whose transformers cannot be described by the pattern language, and is important as an example of a more powerful facility that can co-exist with the high-level pattern language.
The particular low-level facility described in the third section is but one of several low-level facilities that have been designed and implemented to complement the pattern language described in the second section. The design of such low-level macro facilities remains an active area of research, and descriptions of alternative low-level facilities will be published in subsequent documents.
Define-syntax
, let-syntax
, and letrec-syntax
are
analogous to define
, let
, and letrec
, but they bind
syntactic keywords to macro transformers instead of binding variables
to locations that contain values. Furthermore, there is no
define-syntax
analogue of the internal definitions described in
section Internal definitions.
Rationale: As discussed below, the syntax and scope rules for definitions
give rise to syntactic ambiguities when syntactic keywords are
not reserved.
Further ambiguities would arise if define-syntax
were permitted at the beginning of a <body>, with scope
rules analogous to those for internal definitions.
These new expression types and the pattern language described in
section Pattern language are added to Scheme by augmenting the
BNF in section Formal syntax with the following new productions. Note
that the identifier ...
used in some of these productions is not
a metasymbol.
<expression> ==> <macro use> | <macro block><macro use> ==> (<keyword> <datum>*) <keyword> ==> <identifier>
<macro block> ==> (let-syntax (<syntax spec>*) <body>) | (letrec-syntax (<syntax spec>*) <body>) <syntax spec> ==> (<keyword> <transformer spec>) <transformer spec> ==> (syntax-rules (<identifier>*) <syntax rule>*) <syntax rule> ==> (<pattern> <template>) <pattern> ==> <pattern identifier> | (<pattern>*) | (<pattern>+ . <pattern>) | (<pattern>* <pattern> <ellipsis>) | <pattern datum> <pattern datum> ==> <vector> | <string> | <character> | <boolean> | <number> <template> ==> <pattern identifier> | (<template element>*) | (<template element>+ . <template>) | <template datum> <template element> ==> <template> | <template> <ellipsis> <template datum> ==> <pattern datum> <pattern identifier> ==> <any identifier except...
> <ellipsis> ==> <the identifier...
>
<command or definition> ==> <syntax definition> <syntax definition> ==> (define-syntax <keyword> <transformer spec>) | (begin <syntax definition>*)
Although macros may expand into definitions in any context that permits definitions, it is an error for a definition to shadow a syntactic keyword whose meaning is needed to determine whether some definition in the group of top-level or internal definitions that contains the shadowing definition is in fact a definition, or is needed to determine the boundary between the group and the expressions that follow the group. For example, the following are errors:
(define define 3) (begin (define begin list))
(let-syntax ((foo (syntax-rules () ((foo (proc args ...) body ...) (define proc (lambda (args ...) body ...)))))) (let ((x 3)) (foo (plus x y) (+ x y)) (define foo x) (plus foo x)))
syntax: syntax {let-syntax} <bindings> <body>
Syntax: <Bindings> should have the form
((<keyword> <transformer spec>) ...)Each <keyword> is an identifier, each <transformer spec> is an instance of
syntax-rules
, and
<body> should be a sequence of one or more expressions. It is an error
for a <keyword> to appear more than once in the list of keywords
being bound.
Semantics: The <body> is expanded in the syntactic environment
obtained by extending the syntactic environment of the
let-syntax
expression with macros whose keywords are
the <keyword>s, bound to the specified transformers.
Each binding of a <keyword> has <body> as its region.
(let-syntax ((when (syntax-rules () ((when test stmt1 stmt2 ...) (if test (begin stmt1 stmt2 ...)))))) (let ((if #t)) (when if (set! if 'now)) if)) => now
(let ((x 'outer)) (let-syntax ((m (syntax-rules () ((m) x)))) (let ((x 'inner)) (m)))) => outer
syntax: syntax {letrec-syntax} <bindings> <body>
Syntax: Same as for let-syntax
.
Semantics: The <body> is expanded in the syntactic environment obtained by
extending the syntactic environment of the letrec-syntax
expression with macros whose keywords are the
<keyword>s, bound to the specified transformers.
Each binding of a <keyword> has the <bindings>
as well as the <body> within its region,
so the transformers can
transcribe expressions into uses of the macros
introduced by the letrec-syntax
expression.
(letrec-syntax ((or (syntax-rules () ((or) #f) ((or e) e) ((or e1 e2 ...) (let ((temp e1)) (if temp temp (or e2 ...))))))) (let ((x #f) (y 7) (temp 8) (let odd?) (if even?)) (or x (let temp) (if y) y))) => 7
syntax: define-syntax <keyword> <transformer spec>
Syntax: The <keyword> is an identifier, and the <transformer
spec> should be an instance of syntax-rules
.
Semantics: The top-level syntactic environment is extended by binding the <keyword> to the specified transformer.
(define-syntax let* (syntax-rules () ((let* () body1 body2 ...) (let () body1 body2 ...)) ((let* ((name1 val1) (name2 val2) ...) body1 body2 ...) (let ((name1 val1)) (let* ((name2 val2) ...) body1 body2 ...)))))
syntax: syntax-rules <literals> <syntax rule> ...
Syntax: <Literals> is a list of identifiers, and each <syntax rule> should be of the form
(<pattern> <template>)where the <pattern> and <template> are as in the grammar above.
Semantics: An instance of syntax-rules
produces a new macro
transformer by specifying a sequence of hygienic rewrite rules. A use
of a macro whose keyword is associated with a transformer specified by
syntax-rules
is matched against the patterns contained in the
<syntax rule>s, beginning with the leftmost <syntax rule>.
When a match is found, the macro use is transcribed hygienically
according to the template.
Each pattern begins with the keyword for the macro. This keyword is not involved in the matching and is not considered a pattern variable or literal identifier.
Rationale: The scope of the keyword is determined by the expression or syntax
definition that binds it to the associated macro transformer.
If the keyword were a pattern variable or literal identifier, then
the template that follows the pattern would be within its scope
regardless of whether the keyword were bound by let-syntax
or by letrec-syntax
.
An identifier that appears in the pattern of a <syntax rule> is
a pattern variable, unless it is the keyword that begins the pattern,
is listed in <literals>, or is the identifier "...
".
Pattern variables match arbitrary input elements and
are used to refer to elements of the input in the template. It is an
error for the same pattern variable to appear more than once in a
<pattern>.
Identifiers that appear in <literals> are interpreted as literal identifiers to be matched against corresponding subforms of the input. A subform in the input matches a literal identifier if and only if it is an identifier and either both its occurrence in the macro expression and its occurrence in the macro definition have the same lexical binding, or the two identifiers are equal and both have no lexical binding.
A subpattern followed by ...
can match zero or more elements of the
input. It is an error for ...
to appear in <literals>.
Within a pattern the identifier ...
must follow the last element of
a nonempty sequence of subpatterns.
More formally, an input form F matches a pattern P if and only if:
(P1 ... Pn)
and F is a
list of n
forms that match P1 through Pn, respectively; or
(P1 P2 ... Pn . Q)
and F is a list or
improper list of n or more forms that match P1 through Pn,
respectively, and whose nth "cdr" matches Q; or
(P1 ... Pn Q <ellipsis>)
where <ellipsis> is the identifier ...
and F is
a proper list of at least n elements, the first n of which match
P1 through Pn, respectively, and each remaining element of F
matches Q; or
equal?
procedure.
It is an error to use a macro keyword, within the scope of its binding, in an expression that does not match any of the patterns.
When a macro use is transcribed according to the template of the
matching <syntax rule>, pattern variables that occur in the
template are replaced by the subforms they match in the input.
Pattern variables that occur in subpatterns followed by one or more
instances of the identifier
...
are allowed only in subtemplates that are
followed by as many instances of ...
.
They are replaced in the
output by all of the subforms they match in the input, distributed as
indicated. It is an error if the output cannot be built up as
specified.
Identifiers that appear in the template but are not pattern variables
or the identifier
...
are inserted into the output as literal identifiers. If a
literal identifier is inserted as a free identifier then it refers to the
binding of that identifier within whose scope the instance of
syntax-rules
appears.
If a literal identifier is inserted as a bound identifier then it is
in effect renamed to prevent inadvertent captures of free identifiers.
(define-syntax let (syntax-rules () ((let ((name val) ...) body1 body2 ...) ((lambda (name ...) body1 body2 ...) val ...)) ((let tag ((name val) ...) body1 body2 ...) ((letrec ((tag (lambda (name ...) body1 body2 ...))) tag) val ...))))
(define-syntax cond (syntax-rules (else =>) ((cond (else result1 result2 ...)) (begin result1 result2 ...)) ((cond (test => result)) (let ((temp test)) (if temp (result temp)))) ((cond (test => result) clause1 clause2 ...) (let ((temp test)) (if temp (result temp) (cond clause1 clause2 ...)))) ((cond (test)) test) ((cond (test) clause1 clause2 ...) (or test (cond clause1 clause2 ...))) ((cond (test result1 result2 ...)) (if test (begin result1 result2 ...))) ((cond (test result1 result2 ...) clause1 clause2 ...) (if test (begin result1 result2 ...) (cond clause1 clause2 ...)))))
(let ((=> #f)) (cond (#t => 'ok))) => ok
The last example is not an error because the local variable =>
is renamed in effect, so that its use is distinct from uses of the top
level identifier =>
that the transformer for cond
looks
for. Thus, rather than expanding into
(let ((=> #f)) (let ((temp #t)) (if temp ('ok temp))))
which would result in an invalid procedure call, it expands instead into
(let ((=> #f)) (if #t (begin => 'ok)))
Although the pattern language provided by syntax-rules
is the
preferred way to specify macro transformers, other low-level
facilities may be provided to specify more complex macro transformers.
In fact, syntax-rules
can itself be defined as a macro using the
low-level facilities described in this section.
The low-level macro facility described here introduces syntax
as a new syntactic keyword analogous to quote
, and allows a
<transformer spec> to be any expression. This is accomplished by
adding the following two productions to the productions in
section Formal syntax and in section Binding syntactic keywords above.
<expression> ==> (syntax <datum>) <transformer spec> ==> <expression>
The low-level macro system also adds the following procedures:
unwrap-syntax identifier->symbol identifier? generate-identifier free-identifier=? construct-identifier bound-identifier=?
Evaluation of a program proceeds in two logical steps. First the program is converted into an intermediate language via macro-expansion, and then the result of macro expansion is evaluated. When it is necessary to distinguish the second stage of this process from the full evaluation process, it is referred to as "execution."
Syntax definitions, either lexical or global, cause an identifier to
be treated as a keyword within the scope of the binding. The keyword
is associated with a transformer, which may be created implicitly
using the pattern language of syntax-rules
or explicitly using
the low-level facilities described below.
Since a transformer spec must be fully evaluated during the course of expansion, it is necessary to specify the environment in which this evaluation takes place. A transformer spec is expanded in the same environment as that in which the program is being expanded, but is executed in an environment that is distinct from the environment in which the program is executed. This execution environment distinction is important only for the resolution of global variable references and assignments. In what follows, the environment in which transformers are executed is called the standard transformer environment and is assumed to be a standard Scheme environment.
Since part of the task of hygienic macro expansion is to resolve identifier references, the fact that transformers are expanded in the same environment as the program means that identifier bindings in the program can shadow identifier uses within transformers. Since variable bindings in the program are not available at the time the transformer is executed, it is an error for a transformer to reference or assign them. However, since keyword bindings are available during expansion, lexically visible keyword bindings from the program may be used in macro uses in a transformer.
When a macro use is encountered, the macro transformer associated with the macro keyword is applied to a representation of the macro expression. The result returned by the macro transformer replaces the original expression and is expanded once again. Thus macro expansions may themselves be or contain macro uses.
The syntactic representation passed to a macro transformer encapsulates information about the structure of the represented form and the bindings of the identifiers it contains. These syntax objects can be traversed and examined using the procedures described below. The output of a transformer may be built up using the usual Scheme list constructors, combining pieces of the input with new syntactic structures.
Syntax: The <datum> may be any external representation of a Scheme object.
Semantics: Syntax
is the syntactic analogue of quote
. It creates a
syntactic representation of <datum> that, like an argument to a
transformer, contains information about the bindings for identifiers
contained in <datum>. The binding for an identifier introduced
by syntax
is the closest lexically visible binding. All
variables and keywords introduced by transformers must be created by
syntax
. It is an error to insert a symbol in the output of a
transformation procedure unless it is to be part of a quoted datum.
(symbol? (syntax x)) => #f
(let-syntax ((car (lambda (x) (syntax car)))) ((car) '(0))) => 0
(let-syntax ((quote-quote (lambda (x) (list (syntax quote) 'quote)))) (quote-quote)) => quote
(let-syntax ((quote-quote (lambda (x) (list 'quote 'quote)))) (quote-quote)) => error
The second quote-quote
example results in an error because two raw
symbols are being inserted in the output. The quoted quote
in the
first quote-quote
example does not cause an error because it will
be a quoted datum.
(let-syntax ((quote-me (lambda (x) (list (syntax quote) x)))) (quote-me please)) => (quote-me please)
(let ((x 0)) (let-syntax ((alpha (lambda (e) (syntax x)))) (alpha))) => 0
(let ((x 0)) (let-syntax ((alpha (lambda (x) (syntax x)))) (alpha))) => error
(let-syntax ((alpha (let-syntax ((beta (syntax-rules () ((beta) 0)))) (lambda (x) (syntax (beta)))))) (alpha)) => error
The last two examples are errors because in both cases a lexically
bound identifier is placed outside of the scope of its binding.
In the first case, the variable x
is placed outside its scope.
In the second case, the keyword beta
is placed outside its
scope.
(let-syntax ((alpha (syntax-rules () ((alpha) 0)))) (let-syntax ((beta (lambda (x) (alpha)))) (beta))) => 0
(let ((list 0)) (let-syntax ((alpha (lambda (x) (list 0)))) (alpha))) => error
The last example is an error because the reference to list
in the
transformer is shadowed by the lexical binding for list
. Since the
expansion process is distinct from the execution of the program,
transformers cannot reference program variables. On the other hand,
the previous example is not an error because definitions for keywords
in the program do exist at expansion time.
Note: It has been suggested that #'<datum>
and
#`<datum>
would be
felicitous abbreviations for (syntax <datum>)
and (quasisyntax <datum>)
, respectively,
where quasisyntax
, which is not described in this
appendix, would bear the same relationship to syntax
that quasiquote
bears to quote
.
procedure: identifier? syntax-object
Returns #t
if syntax-object represents an identifier,
otherwise returns #f
.
(identifier? (syntax x)) => #t (identifier? (quote x)) => #f (identifier? 3) => #f
procedure: unwrap-syntax syntax-object
If syntax-object is an identifier, then it is returned unchanged.
Otherwise unwrap-syntax
converts the outermost structure of
syntax-object into a
data object whose external representation is the same as that of
syntax-object. The result is either an identifier, a pair whose
car
and cdr are syntax objects, a vector whose elements are syntax
objects, an empty list, a string, a boolean, a character, or a number.
(identifier? (unwrap-syntax (syntax x))) => #t (identifier? (car (unwrap-syntax (syntax (x))))) => #t (unwrap-syntax (cdr (unwrap-syntax (syntax (x))))) => ()
procedure: free-identifier=? id1 id2
Returns #t
if the original occurrences of id1
and id2 have
the same binding, otherwise returns #f
.
free-identifier=?
is used to look for a literal identifier in the argument to a
transformer, such as else
in a cond
clause.
A macro
definition for syntax-rules
would use free-identifier=?
to look for literals in the input.
(free-identifier=? (syntax x) (syntax x)) => #t (free-identifier=? (syntax x) (syntax y)) => r#f
(let ((x (syntax x))) (free-identifier=? x (syntax x))) => #f
(let-syntax ((alpha (lambda (x) (free-identifier=? (car (unwrap-syntax x)) (syntax alpha))))) (alpha)) => #f
(letrec-syntax ((alpha (lambda (x) (free-identifier=? (car (unwrap-syntax x)) (syntax alpha))))) (alpha)) => #t
procedure: bound-identifier=? id1 id2
Returns #t
if a binding for one of the two identifiers
id1 and id2 would shadow free references to the other,
otherwise returns #f
.
Two identifiers can be free-identifier=?
without being
bound-identifier=?
if they were introduced at different
stages in the
expansion process.
Bound-identifier=?
can be used, for example, to
detect duplicate identifiers in bound-variable lists. A macro
definition of syntax-rules
would use bound-identifier=?
to look for
pattern variables from the input pattern in the output template.
(bound-identifier=? (syntax x) (syntax x)) => #t
(letrec-syntax ((alpha (lambda (x) (bound-identifier=? (car (unwrap-syntax x)) (syntax alpha))))) (alpha)) => #f
procedure: identifier->symbol id
Returns a symbol representing the original name of id.
Identifier->symbol
is used to examine identifiers that appear in
literal contexts, i.e., identifiers that will appear in quoted
structures.
(symbol? (identifier->symbol (syntax x))) => #t (identifier->symbol (syntax x)) => x
procedure: generate-identifier
procedure: generate-identifier symbol
Returns a new identifier.
The optional argument to generate-identifier
specifies the
symbolic name of the resulting identifier. If no argument is
supplied the name is unspecified.
Generate-identifier
is used to introduce bound identifiers into
the output of a transformer. Since introduced bound identifiers are
automatically renamed, generate-identifier
is necessary only for
distinguishing introduced identifiers when an indefinite number of them
must be generated by a macro.
The optional argument to generate-identifier
specifies the
symbolic name of the resulting identifier. If no argument is
supplied the name is unspecified. The procedure
identifier->symbol
reveals the symbolic name of an identifier.
(identifier->symbol (generate-identifier 'x)) => x
(bound-identifier=? (generate-identifier 'x) (generate-identifier 'x)) => #f
(define-syntax set*! ; (set*! (<identifier> <expression>) ...) (lambda (x) (letrec ((unwrap-exp (lambda (x) (let ((x (unwrap-syntax x))) (if (pair? x) (cons (car x) (unwrap-exp (cdr x))) x))))) (let ((sets (map unwrap-exp (cdr (unwrap-exp x))))) (let ((ids (map car sets)) (vals (map cadr sets)) (temps (map (lambda (x) (generate-identifier)) sets))) `(,(syntax let) ,(map list temps vals) ,@(map (lambda (id temp) `(,(syntax set!) ,id ,temp)) ids temps) #f))))))
procedure: construct-identifier id symbol
Creates and returns an identifier named by symbol that behaves as if it had been introduced where the identifier id was introduced.
Construct-identifier
is used to circumvent hygiene by
creating an identifier that behaves as though it had been
implicitly present in some expression. For example, the
transformer for a structure
definition macro might construct the name of a field accessor
that does not explicitly appear in a use of the macro,
but can be
constructed from the names of the structure and the field.
If a binding for the field accessor were introduced
by a hygienic transformer, then it would be renamed automatically,
so that the introduced binding would fail to capture any
references to the field accessor that were present in the
input and were intended to be
within the scope of the introduced binding.
Another example is a macro that implicitly binds exit
:
(define-syntax loop-until-exit (lambda (x) (let ((exit (construct-identifier (car (unwrap-syntax x)) 'exit)) (body (car (unwrap-syntax (cdr (unwrap-syntax x)))))) `(,(syntax call-with-current-continuation) (,(syntax lambda) (,exit) (,(syntax letrec) ((,(syntax loop) (,(syntax lambda) () ,body (,(syntax loop))))) (,(syntax loop))))))))
(let ((x 0) (y 1000)) (loop-until-exit (if (positive? y) (begin (set! x (+ x 3)) (set! y (- y 1))) (exit x)))) => 3000
The extension described in this appendix is the most
sophisticated macro facility that has ever been proposed
for a block-structured programming language. The main ideas
come from
Eugene Kohlbecker's PhD thesis on hygienic macro expansion
[KOHLBECKER86], written under the direction of Dan
Friedman [HYGIENIC], and from the work by Alan Bawden
and Jonathan Rees on syntactic closures [BAWDEN88].
Pattern-directed macro facilities were popularized by Kent
Dybvig's non-hygienic implementation of extend-syntax
[DYBVIG87].
At the 1988 meeting of this report's authors at Snowbird,
a macro committee consisting of Bawden, Rees, Dybvig,
and Bob Hieb was charged with developing a hygienic macro
facility akin to extend-syntax
but based on syntactic closures.
Chris Hanson implemented a prototype and wrote a paper on his
experience, pointing out that an implementation based on
syntactic closures must determine the syntactic roles of some
identifiers before macro expansion based on textual pattern
matching can make those roles apparent. William Clinger
observed that Kohlbecker's algorithm amounts to a technique
for delaying this determination, and proposed a more efficient
version of Kohlbecker's algorithm. Pavel Curtis spoke up for
referentially transparent local macros. Rees merged syntactic
environments with the modified Kohlbecker's algorithm and
implemented it all, twice [MACROSTHATWORK].
Dybvig and Hieb designed and implemented the low-level macro facility described above. Recently Hanson and Bawden have extended syntactic closures to obtain an alternative low-level macro facility. The macro committee has not endorsed any particular low-level facility, but does endorse the general concept of a low-level facility that is compatible with the high-level pattern language described in this appendix.
Several other people have contributed by working on macros over the years. Hal Abelson contributed by holding this report hostage to the appendix on macros.