pop@dcs.glasgow.ac.uk writes:
>
>
[ stuff deleted ]
> level (that is you are discouraged from seeing raw store). I have made a
> little table comparing these:
>
> GC = built-in garbage collector (this may or may not be a bonus - e.g. the
> Glasgow Haskell compiler has its own ideas about tags and garbage which can
> be matched by Poplog at a modest cost). INC = Incremental
> definition/redefinition of program.
>
>
>
> Code        Speed         Availability           Price      Size     GC   INC
> ------------------------------------------------------------------------------
> C           excellent     near-universal         cheap      Modest   no   no
>
> Poplog VM   good but      Unix&VMS - most        expensive  Big      yes  yes
>             poor floats   major architectures
>
> Forth       poor?                                cheap?     Modest?  no   yes
>
> Now, there are various gaps to close. How good can Forth code be made? Is
> a built in garbage collector desirable?

FORTH generally does not need garbage collection because all
memory allocation is under the user's control. That is, FORTH
does not often use linked lists (other than the dictionary), and
if a user defines a linked list, then it is his responsibility to
see that the trash is taken out.
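
For instance (a minimal sketch; the names are mine and purely
illustrative): storage is normally carved out of the dictionary with
ALLOT, and list nodes built that way stay allocated until the
programmer reclaims them himself, say with FORGET or a marker word.
No collector ever runs behind his back.

    CREATE buffer  256 ALLOT        \ reserve 256 bytes, permanently
    VARIABLE head  0 head !         \ an empty linked list
    : push ( x -- )                 \ cons a node = [link][value]
        HERE  head @ ,  SWAP ,      \ lay down old head, then the value
        head ! ;                    \ the new node becomes the list head
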
What about "code quality"? There are, of course, many measures of
this: speed, succinctness, clarity, maintainability, ease of
reading, ease of programming, debugging, complexity, etc.
Following the Biblical injunction ("...and the last shall be first")
let me address complexity. By most measures, FORTH programs that do
fancy things exhibit low to zero complexity. (An article in DDJ
surveyed this some years ago.) Perhaps more modern measures have
been developed that do not bomb out on FORTH, but I do not know
about them. The reason is that many subtle and interesting FORTH
programs do everything with defining words, so the business end
gets hidden in the DOES> ... ; portion (which is often CODEd).
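
A minimal sketch of the style (the word ARRAY here is illustrative,
not from any particular standard or system):

    : ARRAY ( n -- )               \ defining word for 1-d cell arrays
        CREATE  CELLS ALLOT        \ each child word gets n cells
        DOES> ( i -- addr )        \ the business end: index into it
        SWAP CELLS + ;
    10 ARRAY data                  \ defines a 10-cell array named DATA
    42 3 data !   3 data @ .       \ prints 42
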
Another reason for the apparent lack of complexity is that the
FORTH philosophy avoids compile-time decisions, letting execution
take their place. That is, the compiler has nothing to decide when
IF...ELSE...THEN, DO...LOOP, BEGIN...UNTIL or BEGIN...WHILE...REPEAT
are encountered in the input stream. These words are IMMEDIATE and
execute the appropriate actions to install the branching primitives.
Real FORTH programmers (and despite my 8 years' experience and
authorship of a FORTH book, I do not yet presume to call myself one)
tend to use this power so naturally that their programs appear
branchless, hence devoid of complexity. An example is a typical
FORTH assembler. Complete assemblers for CISC chips are often no
more than a few hundred lines long and compile to a couple of
kilobytes, depending on the rationality of the machine language.
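
To see what IMMEDIATE means in practice, here is a toy sketch of my
own (not drawn from any assembler): an IMMEDIATE word executes while
the surrounding definition is being compiled, just as IF and DO do
when they lay down the branch primitives.

    : note   ." compiling... " ;  IMMEDIATE
    : demo   note  1 2 + . ;       \ "compiling... " prints right now,
                                   \ while DEMO is still being compiled
    demo                           \ prints only 3; NOTE left no code
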
What about debugging? Well, all incrementally compiled languages
that can be tested interactively debug quickly. I have used
QuickBasic for that reason (and interpreted BASIC before that
in preference to FORTRAN). But FORTH is easier than anything else
I have tried, including single-stepping debuggers, etc. etc.
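
For instance, a word can be compiled and exercised at the keyboard
in the same breath (SQUARED is a made-up example; "ok" is the
system's prompt in a typical console session):

    : squared ( n -- n*n )  DUP * ;  ok
    5 squared .  25  ok
    -3 squared .  9  ok
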
Ease of programming: depends on the task. If I have thought it
out and decomposed it into orthogonal sub-tasks, then it is
usually pretty quick. A case in point (that my book treats in
detail, including false steps) is solving linear equations:
Gaussian elimination with pivoting requires triangularizing
the matrix in place, then back-solving. Triangularizing has
several steps--finding the pivot element, transposing rows,
and eliminating leading elements (by combining rows). Each of
these steps is given a name and tested individually. The
entire job takes very little time--perhaps a couple of hours
starting from scratch.
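
For a flavor of what one of those individually testable steps looks
like, here is a sketch of row transposition for an N x N matrix;
the names and the use of integer cells are illustrative only, not
taken from the book.

    4 CONSTANT N
    CREATE mat  N N * CELLS ALLOT          \ the matrix, row-major

    : row       ( r -- addr )  N * CELLS  mat + ;
    : exchange  ( a1 a2 -- )               \ swap the contents of two cells
        2DUP @ SWAP @  ROT !  SWAP ! ;
    : swap-rows ( r1 r2 -- )               \ transpose two whole rows
        row SWAP row                       \ addr2 addr1
        N 0 DO
            2DUP  I CELLS +  SWAP I CELLS +  exchange
        LOOP  2DROP ;

Each word can be tried at the keyboard (e.g. 0 1 swap-rows) before
the next one is written on top of it.
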
What about ease of reading and maintainability? The old canard that
FORTH is "write-only" holds mainly for undocumented code with poorly
chosen word names and poorly factored definitions. When programmers
take the trouble with comments, names, factoring, layout, etc., then
FORTH can be easy to read. Perhaps the biggest problem (assuming the
above conditions are met) lies in the fact that FORTH is so freely
extensible that programmers often develop idiosyncratic styles and
conventions that convey little to others. That is, one man's
self-documenting code is often another's hieroglyphics. This is no
different, really, from the professor who, having forgotten how
little he once knew, says "Obviously..." and the cure is the same:
lots and lots of low-level explanation.

Clarity is in the mind of the beholder. But FORTH tends toward short
subroutines that can generally be understood at a glance. Good
naming helps, too. Thus my program for linear equations reads
: solve setup triangularize backsolve ;

FORTH excels at succinctness, as noted above. For example, my
little FORmula TRANslator, which allows embedded formulas in
definitions to be translated on the fly to FORTH, as in
: test F" a = (b+c)/(c-d) - COSH(pi*a) " ;
compiles to about 7 Kbytes of which 1 Kb is a buffer; another
0.5 K or so is the function library. Amazingly, the
source is only some 450 lines of (heavily commented!) code.

Finally, speed: high-level FORTH is typically 5x to 10x slower
than C compiled with an optimizing compiler, depending on the
method of threading. However, many FORTHs nowadays come with
some form of inline optimizer that will convert selected words
to inlined machine code. Certainly the FORTH I use does this.
Although said code is not super-optimized, it generally runs
within 2x of pretty good machine code. I have had long experience
at hand-optimizing machine language (since 1961) so my comparisons
(for both Intel and Motorola families) are probably good rules of
thumb for scalar processors.
Some FORTHs now compile directly to optimized machine code, giving
the best of all possible worlds (except that they then lose the
memory parsimony that has always been FORTH's hallmark--personally
I prefer the peephole optimizing of the preceding paragraph).

I hope you find these comments responsive. --jvn
--
Julian V. Noble
jvn@virginia.edu