CONTENTS - (Use <ENTER> g to access required sections)
-- Introduction
-- A minimal concept of compilation
-- Forms of interaction at run time
-- -- Case a: compile then execute.
-- -- Case b: alternating compilation and execution:
-- -- Case c: nested as well as alternated compilations and executions:
-- -- Case d: Merging compilation streams
-- -- Case e: allowing multiple interleaved compilation streams.
-- -- Case f: Time-shared concurrent streams:
-- To summarise:
-- Notes and qualifications
-- Introduction
There has been considerable discussion recently about dynamic vs lexical
binding of variables in comp.lang.pop.
My impression is that there are more distinctions between different
variable binding schemes than have so far been clearly defined in the
discussion, and I thought it might be useful to list them, if only
because Pop-11 offers more choices to the programmer than any other
language I know of. (Maybe this should be a chapter in the Pop-11
primer.)
Before I present the options regarding variable bindings I need to
clarify some distinctions concerning compile time, run time, and
compilation streams, as some issues about variable binding concern only
compile time, some concern only run time, and some involve relations
between the two. These distinctions get
complicated when there are different compilation streams in the same
process.
This article merely goes into the compile-time/run-time distinction
and the notion of a compilation stream. In a later article, when I have
time, I'll return to variable binding.
I've cross-posted to comp.compilers in case someone who knows about
other languages than Pop wishes to comment on the generality of my
remarks.
-- A minimal concept of compilation
In what follows I shall talk about source code being "compiled",
independently of whether it is compiled all the way to machine code, or
to some intermediate interpreted form (e.g. lisp-like parse trees, or
some interpreted intermediate machine language), or whether the source
code is stored directly in the machine as strings of text and
interpreted, or whether it is compiled into an object file which has to
be linked and loaded before it can be run.
In this sense, compiling is just the process of taking source code from
a file or from the terminal and storing it in the machine in some form
in which it can then be run. This process can include compiling whole
procedure definitions, or compiling individual commands or declarations
such as "let X = 99" in BASIC or "vars X = 99" in Pop-11.
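
As a rough illustration of this minimal sense of "compiling", here is a sketch in Python rather than Pop-11 (an assumption made only because Python's built-in compile and exec are widely available). The one-line source text and the name "namespace" are hypothetical stand-ins for a command typed at the terminal and for the state of the machine.

```python
# Hypothetical one-line "program", analogous to `vars X = 99` in Pop-11.
source = "X = 99"

# "Compile": take source text and store it in some runnable form
# (here, a Python code object).
code = compile(source, "<terminal>", "exec")

# "Run": execute the stored form. The dictionary stands in for the
# state of the machine in which the declaration takes effect.
namespace = {}
exec(code, namespace)

print(namespace["X"])
```

Nothing here depends on what the stored form actually is; the point is only that the text is first taken in and then run.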
-- Forms of interaction at run time
Before I go on to distinguish various questions about access to
variables, I need to clarify an important notion, namely that the
language allows the programmer to interact with programs at run time
using the original programming language. Unfortunately this is not a
simple concept, as there are several slightly different cases to be
considered.
The question is whether there is a clearly separate phase in which
source-code programs are read in ("compile time"), after which programs
can only be executed ("run time"), or whether new source code can be
added to the running system at any time, including "top level" source
code instructions such as "print the value of X" or instructions to
alter the contents of datastructures. Different cases will be
distinguished.
-- -- Case a: compile then execute.
In most programming language systems (e.g. C, Pascal, Fortran) there is
a sharp distinction between compile time and run time and they cannot be
alternated or interleaved: for any program there is first a process of
compilation, and then later on the program is run, perhaps on many
occasions in different contexts. Within this paradigm, no program
execution is possible till the whole program has been compiled (whether
to machine code or an interpreted intermediate form), and once program
execution has begun no more code can be added.
In such a system, it is not possible for the user to interact with the
running system unless the program contains a procedure that reads in
text typed by the user (or responds to mouse actions, or whatever), and
performs appropriate actions. I.e. a special "command interpreter" may
be part of the program, and it will generally allow a command language
that is different from the programming language used to create the
program.
-- -- Case b: alternating compilation and execution:
By contrast, in BASIC, and many AI languages (Lisp, Prolog, Pop-11) and
functional languages (Scheme, ML, Miranda) a procedure to support
interaction is provided by the language system itself in the form of a
built in command interpreter or an incremental compiler that remains
available at run time.
In its simplest form the compiler accepts, either from the user or from
a specified file, commands which can either add declarations or
procedure definitions or give commands to run system or user procedures.
Such a compiler simply reads in definitions, declarations and commands
and obeys them, always returning to the top level read loop whenever
something has been completed. Sometimes this is referred to in
connection with Lisp as a "read-eval-print" loop. In this case,
compilation and execution are alternated.
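
The alternation can be sketched as follows, again in Python as a stand-in and under stated assumptions: the function top_level_loop, the list "session" (playing the part of the terminal or a file) and the name "env" are all hypothetical, and real systems distinguish declarations from commands in their own ways.

```python
def top_level_loop(lines, env):
    """Read each item, compile it, obey it, then return to the loop."""
    results = []
    for text in lines:
        try:
            # Try to compile the item as an expression to be evaluated.
            code = compile(text, "<stream>", "eval")
        except SyntaxError:
            # Not an expression: treat it as a declaration or definition
            # and execute it for its effect on the environment.
            exec(compile(text, "<stream>", "exec"), env)
        else:
            # An expression: run it and record its value (the "print" step).
            results.append(eval(code, env))
    return results


env = {}
session = [
    "x = 99",                       # a declaration
    "def double(n): return n * 2",  # a procedure definition
    "double(x)",                    # a command that runs user code
]
results = top_level_loop(session, env)
print(results)
```

Each item is compiled and then executed before the next one is read, so compile time and run time alternate within a single stream.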
-- -- Case c: nested as well as alternated compilations and executions:
In more sophisticated systems, compiling and running can be nested as
well as alternated. For instance many AI systems allow the compiler to
be invoked as a subroutine by user procedures. Then besides the
top-level invocation, there is a new invocation of the compiler. That in
turn can read in new commands which cause procedures to run which invoke
the compiler. In that case there can be several coexisting "compilation
streams", a point I'll return to later. By contrast if the compiler is
always a top-level process, which continues running after each command
is completed, then there is always only one compilation stream.
Nesting is useful in several different contexts, e.g.
o The decision to compile some additional files can be taken
by a user procedure at run time, depending on task requirements.
o The process of compiling one file can cause the compiler to be
invoked to compile another file. The latter process will then have
its own compilation stream, in most languages.
o A learning program can create and compile new procedures on the
basis of what it has learnt.
o When testing programs it is sometimes useful to enter a "break" in
which the compiler is invoked to give the user a chance to type in
commands to interrogate values of variables, contents of
datastructures, the current call stack, etc. Alternatively this may
be required for the application, e.g. if the user is allowed to
tailor the application by typing in procedures to be invoked in
certain circumstances.
When compilation streams are nested, compilation process C1 may cause
procedure P1 to be invoked which then invokes compilation process C2,
which may run P2, etc. P1 will not resume execution until C2 is
complete. Similarly C1 will not continue until procedure P1 is complete.
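
The ordering described above (C1 runs P1, which invokes C2, and each waits for the one it started) can be traced with a small Python sketch. The names compile_stream, p1, trace and env are hypothetical, and treating the compiler as an ordinary function is the assumption that makes nesting possible here.

```python
trace = []  # records the order in which things start and finish


def compile_stream(name, lines, env):
    """A compilation process: read each item of the stream and obey it."""
    trace.append(f"{name} starts")
    for text in lines:
        exec(compile(text, name, "exec"), env)
    trace.append(f"{name} ends")


def p1(env):
    """A user procedure that invokes the compiler as a subroutine (C2)."""
    trace.append("P1 starts")
    compile_stream("C2", ["y = x + 1"], env)
    trace.append("P1 resumes")


env = {"p1": p1}
env["env"] = env  # lets commands in the stream hand the environment to p1

# C1 is the outer stream; its second command runs P1, which starts C2.
compile_stream("C1", ["x = 1", "p1(env)", "z = y + 1"], env)

for event in trace:
    print(event)
```

The trace shows C2 running to completion before P1 resumes, and P1 finishing before C1 reads its next command.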
-- -- Case d: Merging compilation streams
Note that Poplog Pop-11 allows different compilation streams to be
merged into a single stream using #_INCLUDE (or the macro "include"
which copes with search lists for directories of files to be
"included").
The advantage of allowing merged compilation streams as well as nested
streams is that declarations that are normally local to a compilation
stream, but which need to be shared between different libraries, need
not be copied into all those libraries. Instead a single file contains
the declarations, and it can be invoked by several different files. This
is analogous to the use of #include in C, and the use of "source" in the
Unix C shell.
Note that the reason why this facility is required in Pop-11 is that
there are certain types of variables that cannot be shared across
different compilation streams, namely file-local, or stream-local
lexical variables. Thus if such declarations are encountered in a
file whose compilation is nested, they will not be inherited by the
calling environment.
Similarly, if a Unix shell script is run, the calling environment will
not pick up any environment variables declared in the shell script, nor
any directory changes, whereas using "source" to merge the script with
the current shell overcomes this.
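
The difference between nesting and merging can be modelled in Python as follows. This is only a sketch of the stream-local case: the functions compile_nested and compile_merged, the declaration LIMIT and the dictionaries are hypothetical, and copying a dictionary is merely one way of imitating a scope that disappears when the nested stream ends.

```python
# Hypothetical contents of a shared declarations file.
declarations = "LIMIT = 10"


def compile_nested(text, caller_env):
    """Nested stream: declarations land in a separate scope and are lost."""
    local_env = dict(caller_env)  # can see the caller's names, but is a copy
    exec(compile(text, "<nested>", "exec"), local_env)


def compile_merged(text, caller_env):
    """Merged stream: the text is treated as part of the caller's stream."""
    exec(compile(text, "<merged>", "exec"), caller_env)


env_a = {}
compile_nested(declarations, env_a)
nested_visible = "LIMIT" in env_a   # the caller did not inherit LIMIT

env_b = {}
compile_merged(declarations, env_b)
merged_visible = "LIMIT" in env_b   # the caller now has LIMIT

print(nested_visible, merged_visible)
```

In the nested case the caller never sees the declaration, which is why a file of shared stream-local declarations has to be merged rather than compiled as a nested stream.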
It is interesting that the need for the merging facility became apparent
only after the introduction of lexical scoping at the level of whole
files. The need would have been apparent earlier had there been other
things local to the current compilation stream, such as the current
directory.
-- -- Case e: allowing multiple interleaved compilation streams.
A yet more sophisticated system can support multi-threading, or
lightweight processes, which can resume or suspend one another. In that
case procedure P1, instead of invoking C2 as a sub-routine, may start it
up as an independent compilation process. Then at some point C2 may be
suspended, and P1 resumed. If P1 exits, then compilation process C1 is
resumed. After a while it may be suspended and C2 resumed. For example,
C1 and C2 may run in different windows, and a mouse or keyboard event
could be used to determine which compilation stream to use next to read
in commands. So the processes C1 and C2 are then interleaved. (This is
used, for example, by "immediate mode" in the Poplog editor, VED, where
interleaving allows you to suspend a procedure definition or command,
switch to another command stream to give commands to gain information
about the language or the current state of the system, then return and
complete the previous command.) Having different concurrent compilation
streams in different windows for different purposes can be helpful to
the user, even if, in principle, it could all be done in one window
using nested commands. Another case is where the system (like Poplog)
actually supports different high level languages, in which case
different concurrent compilation streams can be used for the different
languages.
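
Interleaving can be imitated in Python with generators standing in for lightweight processes. This is a sketch under assumptions: the function compilation_stream and the list of "events" are hypothetical, and a stream here can be suspended only between complete items, which is coarser than suspending in the middle of a definition as described above.

```python
def compilation_stream(name, lines, env, log):
    """Obey one item, then suspend until this stream is resumed."""
    for text in lines:
        exec(compile(text, name, "exec"), env)
        log.append((name, text))
        yield  # suspend here; another stream may run before we continue


env, log = {}, []
c1 = compilation_stream("C1", ["a = 1", "b = a + c"], env, log)
c2 = compilation_stream("C2", ["c = 10"], env, log)

# A hypothetical sequence of events (e.g. mouse clicks in two windows),
# each selecting which stream should read its next command.
for chosen in [c1, c2, c1]:
    next(chosen, None)

print(env["b"])
```

C1's second command uses a value supplied by C2 in the meantime, so neither stream had to finish before the other could proceed.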
-- -- Case f: Time-shared concurrent streams:
A yet more sophisticated system will allow "backgrounding" - i.e. while
one process continues running "in the background", e.g. compiling large
files, another process reacts to user commands. This requires some kind
of scheduler which allocates time-slices between the different
compilation streams and the processes they generate. In Pop-11 a
scheduler is not built in, but the combination of timed interrupts and
the process mechanism makes it fairly easy to implement. (See HELP
ACTOR.)
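
A scheduler of roughly this kind can be sketched in Python. It is not the Pop-11 mechanism: there are no timed interrupts here, and a "time slice" is simply one item of a stream, given up voluntarily. The names stream, run_scheduler, background and foreground are hypothetical.

```python
from collections import deque


def stream(name, lines, env, log):
    """A compilation stream that gives up control after each item."""
    for text in lines:
        exec(compile(text, name, "exec"), env)
        log.append(name)
        yield  # end of this stream's time slice


def run_scheduler(streams):
    """Round-robin: give each live stream one slice, then requeue it."""
    queue = deque(streams)
    while queue:
        current = queue.popleft()
        try:
            next(current)
        except StopIteration:
            continue  # this stream has finished; drop it
        queue.append(current)


env, log = {}, []
# A long-running "background" compilation and a short "foreground" session.
background = stream("background", [f"v{i} = {i}" for i in range(4)], env, log)
foreground = stream("foreground", ["total = 0", "total += 5"], env, log)
run_scheduler([background, foreground])

print(log)
```

The foreground commands are obeyed between slices of the background work instead of waiting for it to finish.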
-- To summarise:
a. Simple systems allow only compilation then execution.
b. Interactive language systems allow alternating compilation and
execution processes.
c. Systems in which the compiler is itself a procedure allow nested
compilations as well as alternating compilation and execution.
d. Some systems also allow compilation streams involving different files
to be merged into a single stream.
e. Systems that support multi-processing permit interleaved compilation
streams, as well as nested and alternating compilation and execution
processes.
f. A multi-processing system with a scheduler allows process switching
on a time-sliced basis instead of waiting for an active stream to
complete a sub-task and pause before another one can be resumed.
-- Notes and qualifications
[Note 1: A non-interactive system that allows object files to be
dynamically linked in while a program is running blurs the distinction
between programs that allow interaction at run time and those that
don't. If, for example, a running program checks every now and again
whether you have created a new object file called "commands.o" and if so
takes steps to link it in (possibly after giving it a unique name), and
then runs its top level procedure, you have a sort of run time command
interpreter, though it may feel clumsy to use.]
[Note 2: The availability of a debugger, especially if it can be invoked
after a program has started running, can also blur the distinction, if
it provides commands for interrogating variables and data-structures,
or calling procedures. If the debugger allows commands to be given in
the same language as the original program, then it fits one of the
previous categories. Conversely, languages that are fully interactive
and allow nested compilation streams provide their own debugging
facilities.]
[Note 3: There are operating system command languages like the Unix
shell languages that support all the above facilities. For instance, a
shell command file can spawn a new shell. Two windows on the same
machine can run two shells concurrently.]
[Note 4: I have so far written as if variable binding environments, or
identifier scopes, are tied entirely to compilation streams. But that is
not the case, since when variables are local to a procedure the scope
may be restricted to the portion of the compilation stream during which
the procedure is being compiled, and in addition, if the same procedure
is concurrently active more than once in a procedure calling hierarchy
different bindings for those variables can be associated with each
activation. Thus if the variable n is local to a recursive definition of
factorial, then calling factorial(5) may create 5 different activations,
in which the value of n is respectively 5, 4, 3, 2 and 1. This point
is common to both lexical and dynamic binding, though shallow dynamic
binding allows temporary sharing of activation values.]
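
The factorial example in Note 4 can be checked with a short Python sketch. The list "seen" is a hypothetical addition used only to record the binding of n in each activation, and the base case at n = 1 is an assumption chosen to match the five activations mentioned.

```python
seen = []  # records the value of n in each activation


def factorial(n):
    seen.append(n)  # each activation has its own binding of n
    if n <= 1:
        return 1
    return n * factorial(n - 1)


result = factorial(5)
print(seen, result)
```

All five activations are alive at once when the innermost call is reached, each with a different value bound to the same local variable.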
Later, I hope to turn back to the discussion of different variable
binding regimes. Meanwhile I'd welcome comments on all this. Have I
missed anything important?
--
Aaron Sloman,
School of Computer Science, The University of Birmingham, B15 2TT, England
EMAIL A.Sloman@cs.bham.ac.uk OR A.Sloman@bham.ac.uk
Phone: +44-(0)21-414-3711 Fax: +44-(0)21-414-4281