~ubuntu-branches/ubuntu/precise/r5rs-doc/precise : contents of struct.tex at revision 4

~ubuntu-branches/ubuntu/precise/r5rs-doc/precise : (revision 4)
% 1. Structure of the language

\chapter{Overview of Scheme}

\section{Semantics}
\label{semanticsection}

This section gives an overview of Scheme's semantics.  A
detailed informal semantics is the subject of
chapters~\ref{basicchapter} through~\ref{builtinchapter}.  For reference
purposes, section~\ref{formalsemanticssection} provides a formal
semantics of Scheme.

\vest Following Algol, Scheme is a statically scoped programming
language.  Each use of a variable is associated with a lexically
apparent binding of that variable.

\vest Scheme has latent as opposed to manifest types.  Types
are associated with values (also called objects\mainindex{object}) rather than
with variables.  (Some authors refer to languages with latent types as
weakly typed or dynamically typed languages.)  Other languages with
latent types are APL, Snobol, and other dialects of Lisp.  Languages
with manifest types (sometimes referred to as strongly typed or
statically typed languages) include Algol 60, Pascal, and~C.

\vest All objects created in the course of a Scheme computation, including
procedures and continuations, have unlimited extent.
No Scheme object is ever destroyed.  The reason that
implementations of Scheme do not (usually!)\ run out of storage is that
they are permitted to reclaim the storage occupied by an object if
they can prove that the object cannot possibly matter to any future
computation.  Other languages in which most objects have unlimited
extent include APL and other Lisp dialects.

\vest Implementations of Scheme are required to be properly tail-recursive.
This allows the execution of an iterative computation in constant space,
even if the iterative computation is described by a syntactically
recursive procedure.  Thus with a properly tail-recursive implementation,
iteration can be expressed using the ordinary procedure-call
mechanics, so that special iteration constructs are useful only as
syntactic sugar.  See section~\ref{proper tail recursion}.

\vest Scheme procedures are objects in their own right.  Procedures can be
created dynamically, stored in data structures, returned as results of
procedures, and so on.  Other languages with these properties include
Common Lisp and ML. \todo{Rozas: Scheme had them first.}

\vest One distinguishing feature of Scheme is that continuations, which
in most other languages only operate behind the scenes, also have
``first-class'' status.  Continuations are useful for implementing a
wide variety of advanced control constructs, including non-local exits,
backtracking, and coroutines.  See section~\ref{continuations}.

\vest Arguments to Scheme procedures are always passed by value, which
means that the actual argument expressions are evaluated before the
procedure gains control, whether the procedure needs the result of the
evaluation or not.  ML, C, and APL are three other languages that always
pass arguments by value.
This is distinct from the lazy-evaluation semantics of Haskell,
or the call-by-name semantics of Algol 60, where an argument
expression is not evaluated unless its value is needed by the
procedure.

\todo{Lisp's call by value should be explained more
accurately.  What's funny is that all values are references.}

\vest Scheme's model of arithmetic is designed to remain as independent as
possible of the particular ways in which numbers are represented within a
computer. In Scheme, every integer is a rational number, every rational is a
real, and every real is a complex number.  Thus the distinction between integer
and real arithmetic, so important to many programming languages, does not
appear in Scheme.  In its place is a distinction between exact arithmetic,
which corresponds to the mathematical ideal, and inexact arithmetic on
approximations.  As in Common Lisp, exact arithmetic is not limited to
integers.

\section{Syntax}

Scheme, like most dialects of Lisp, employs a fully parenthesized prefix
notation for programs and (other) data; the grammar of Scheme generates a
sublanguage of the language used for data.  An important
consequence of this simple, uniform representation is the susceptibility of
Scheme programs and data to uniform treatment by other Scheme programs.
For example, the {\cf eval} procedure evaluates a Scheme program expressed
as data.

The {\cf read} procedure performs syntactic as well as lexical decomposition of
the data it reads.  The {\cf read} procedure parses its input as data
(section~\ref{datumsyntax}), not as program.

The formal syntax of Scheme is described in section~\ref{BNF}.


\section{Notation and terminology}


\subsection{Primitive, library, and optional features}
\label{qualifiers}

It is required that every implementation of Scheme support all
features that are not marked as being \defining{optional}.  Implementations are
free to omit optional features of Scheme or to add extensions,
provided the extensions are not in conflict with the language reported
here.  In particular, implementations must support portable code by
providing a syntactic mode that preempts no lexical conventions of this
report.

To aid in understanding and implementing Scheme, some features are marked
as \defining{library}. These can be easily implemented in terms of the other,
primitive, features.  They are redundant in the strict sense of
the word, but they capture common patterns of usage, and are therefore
provided as convenient abbreviations.

\subsection{Error situations and unspecified behavior}

\mainindex{error}
When speaking of an error situation, this report uses the phrase ``an
error is signalled'' to indicate that implementations must detect and
report the error.  If such wording does not appear in the discussion of
an error, then implementations are not required to detect or report the
error, though they are encouraged to do so.  An error situation that
implementations are not required to detect is usually referred to simply
as ``an error.''

\vest For example, it is an error for a procedure to be passed an argument that
the procedure is not explicitly specified to handle, even though such
domain errors are seldom mentioned in this report.  Implementations may
extend a procedure's domain of definition to include such arguments.

\vest This report uses the phrase ``may report a violation of an
implementation restriction'' to indicate circumstances under which an
implementation is permitted to report that it is unable to continue
execution of a correct program because of some restriction imposed by the
implementation.  Implementation restrictions are of course discouraged,
but implementations are encouraged to report violations of implementation
restrictions.\mainindex{implementation restriction}

\vest For example, an implementation may report a violation of an
implementation restriction if it does not have enough storage to run a
program.

\vest If the value of an expression is said to be ``unspecified,'' then
the expression must evaluate to some object without signalling an error,
but the value depends on the implementation; this report explicitly does
not say what value should be returned. \mainindex{unspecified}

\todo{Talk about unspecified behavior vs. unspecified values.}

\todo{Look at KMP's situations paper.}


\subsection{Entry format}

Chapters~\ref{expressionchapter} and~\ref{builtinchapter} are organized
into entries.  Each entry describes one language feature or a group of
related features, where a feature is either a syntactic construct or a
built-in procedure.  An entry begins with one or more header lines of the form

\noindent\pproto{\var{template}}{\var{category}}\unpenalty

for required, primitive features, or

\noindent\pproto{\var{template}}{\var{qualifier} \var{category}}\unpenalty

where \var{qualifier} is either ``library'' or ``optional'' as defined
 in section~\ref{qualifiers}.

If \var{category} is ``\exprtype'', the entry describes an expression
type, and the template gives the syntax of the expression type.
Components of expressions are designated by syntactic variables, which
are written using angle brackets, for example, \hyper{expression},
\hyper{variable}.  Syntactic variables should be understood to denote segments of
program text; for example, \hyper{expression} stands for any string of
characters which is a syntactically valid expression.  The notation
\begin{tabbing}
\qquad \hyperi{thing} $\ldots$
\end{tabbing}
indicates zero or more occurrences of a \hyper{thing}, and
\begin{tabbing}
\qquad \hyperi{thing} \hyperii{thing} $\ldots$
\end{tabbing}
indicates one or more occurrences of a \hyper{thing}.

If \var{category} is ``procedure'', then the entry describes a procedure, and
the header line gives a template for a call to the procedure.  Argument
names in the template are \var{italicized}.  Thus the header line

\noindent\pproto{(vector-ref \var{vector} \var{k})}{procedure}\unpenalty

indicates that the built-in procedure {\tt vector-ref} takes
two arguments, a vector \var{vector} and an exact non-negative integer
\var{k} (see below).  The header lines

\noindent%
\pproto{(make-vector \var{k})}{procedure}
\pproto{(make-vector \var{k} \var{fill})}{procedure}\unpenalty

indicate that the {\tt make-vector} procedure must be defined to take
either one or two arguments.

\label{typeconventions}
It is an error for an operation to be presented with an argument that it
is not specified to handle.  For succinctness, we follow the convention
that if an argument name is also the name of a type listed in
section~\ref{disjointness}, then that argument must be of the named type.
For example, the header line for {\tt vector-ref} given above dictates that the
first argument to {\tt vector-ref} must be a vector.  The following naming
conventions also imply type restrictions:
\newcommand{\foo}[1]{\vr{#1}, \vri{#1}, $\ldots$ \vrj{#1}, $\ldots$}
$$
\begin{tabular}{ll}
\var{obj}&any object\\
\foo{list}&list (see section~\ref{listsection})\\
\foo{z}&complex number\\
\foo{x}&real number\\
\foo{y}&real number\\
\foo{q}&rational number\\
\foo{n}&integer\\
\foo{k}&exact non-negative integer\\
\end{tabular}
$$

\todo{Provide an example entry??}


\subsection{Evaluation examples}

The symbol ``\evalsto'' used in program examples should be read
``evaluates to.''  For example,

\begin{scheme}
(* 5 8)      \ev  40%
\end{scheme}

means that the expression {\tt(* 5 8)} evaluates to the object {\tt 40}.
Or, more precisely:  the expression given by the sequence of characters
``{\tt(* 5 8)}'' evaluates, in the initial environment, to an object
that may be represented externally by the sequence of characters ``{\tt
40}''.  See section~\ref{externalreps} for a discussion of external
representations of objects.

\subsection{Naming conventions}

By convention, the names of procedures that always return a boolean
value usually end
in ``\ide{?}''.  Such procedures are called predicates.

By convention, the names of procedures that store values into previously
allocated locations (see section~\ref{storagemodel}) usually end in
``\ide{!}''.
Such procedures are called mutation procedures.
By convention, the value returned by a mutation procedure is unspecified.

By convention, ``\ide{->}'' appears within the names of procedures that
take an object of one type and return an analogous object of another type.
For example, {\cf list->vector} takes a list and returns a vector whose
elements are the same as those of the list.


	
\todo{Terms that need defining: thunk, command (what else?).}