Lexical Conventions

Go to the first, previous, next, last section, table of contents.

Lexical conventions

This section gives an informal account of some of the lexical conventions used in writing Scheme programs. For a formal syntax of Scheme, see section Formal syntax.

Upper and lower case forms of a letter are never distinguished except within character and string constants. For example, Foo is the same identifier as FOO, and #x1AB is the same number as #X1ab.

Identifiers

Most identifiers allowed by other programming languages are also acceptable to Scheme. The precise rules for forming identifiers vary among implementations of Scheme, but in all implementations a sequence of letters, digits, and "extended alphabetic characters" that begins with a character that cannot begin a number is an identifier. In addition, +, -, and ... are identifiers. Here are some examples of identifiers:

lambda                   q
list->vector             soup
+                        V17a
<=?                      a34kTMNs
the-word-recursion-has-many-meanings

Extended alphabetic characters may be used within identifiers as if they were letters. The following are extended alphabetic characters:

+ - . * / < = > ! ? : $ % _ & ~ ^

See section Lexical structure for a formal syntax of identifiers.

Identifiers have several uses within Scheme programs:

  • Certain identifiers are reserved for use as syntactic keywords (see below).
  • Any identifier that is not a syntactic keyword may be used as a variable (see section Variables and regions).
  • When an identifier appears as a literal or within a literal (see section Literal expressions), it is being used to denote a symbol (see section Symbols).

The following identifiers are syntactic keywords, and should not be used as variables:

=>           do            or
and          else          quasiquote
begin        if            quote
case         lambda        set!
cond         let           unquote
define       let*          unquote-splicing
delay        letrec

Some implementations allow all identifiers, including syntactic keywords, to be used as variables. This is a compatible extension to the language, but ambiguities in the language result when the restriction is relaxed, and the ways in which these ambiguities are resolved vary between implementations.

Whitespace and comments

Whitespace characters are spaces and newlines. (Implementations typically provide additional whitespace characters such as tab or page break.) Whitespace is used for improved readability and as necessary to separate tokens from each other, a token being an indivisible lexical unit such as an identifier or number, but is otherwise insignificant. Whitespace may occur between any two tokens, but not within a token. Whitespace may also occur inside a string, where it is significant.

A semicolon (;) indicates the start of a comment. The comment continues to the end of the line on which the semicolon appears. Comments are invisible to Scheme, but the end of the line is visible as whitespace. This prevents a comment from appearing in the middle of an identifier or number.

;;; The FACT procedure computes the factorial
;;; of a non-negative integer.
(define fact
  (lambda (n)
    (if (= n 0)
        1        ;Base case: return 1
        (* n (fact (- n 1))))))

Inlab Scheme accepts additionally C-Style comments, see General Usage - C Style Comments

Other notations

For a description of the notations used for numbers, see section Numbers.

. + -
These are used in numbers, and may also occur anywhere in an identifier except as the first character. A delimited plus or minus sign by itself is also an identifier. A delimited period (not occurring within a number or identifier) is used in the notation for pairs (section Pairs and lists), and to indicate a rest-parameter in a formal parameter list (section Lambda expressions). A delimited sequence of three successive periods is also an identifier.
( )
Parentheses are used for grouping and to notate lists (section Pairs and lists).
'
The single quote character is used to indicate literal data (section Literal expressions).
`
The backquote character is used to indicate almost-constant data (section Quasiquotation).
, ,@
The character comma and the sequence comma at-sign are used in conjunction with backquote (section Quasiquotation).
"
The double quote character is used to delimit strings (section Strings).
\
Backslash is used in the syntax for character constants (section Characters) and as an escape character within string constants (section Strings).
[ ] { }
Left and right square brackets and curly braces are reserved for possible future extensions to the language.
#
Sharp sign is used for a variety of purposes depending on the character that immediately follows it:
#t #f
These are the boolean constants (section Booleans).
#\
This introduces a character constant (section Characters).
#(
This introduces a vector constant (section Vectors). Vector constants are terminated by `)' .
#e #i #b #o #d #x
These are used in the notation for numbers (section Syntax of numerical constants).


Go to the first, previous, next, last section, table of contents.