[To reply replace "Aaron.Sloman.XX" with "A.Sloman"]
Jeff Best asked a question by email that I thought would merit a
public reply.
> Did Poplog ever get an incremental garbage collector, to disperse the
> load over time? I believe this works by maintaining a "collectable"
> list, which won't work in an architecture that requires mark-and-sweep.
John Gibson worked out a scheme for incremental GC when I was still
at Sussex, in the early 80s I think. Lots of people were talking about
incremental GC because of the need to maintain predictable performance,
especially as at that time there were a lot of AI systems which took
long enough per garbage collection to justify going for coffee, or
even lunch, instead of waiting.
John's scheme was partly like the "stop and copy" garbage collector
(one of two currently available in poplog, the other being mark and
sweep and shuffle; see section 9 of REF SYSTEM).
It would have meant permanently keeping two areas in the virtual address
space and using "spare" time to copy data from one space to the other
until everything that was not garbage had been moved, and then copying
back. So each copy process copied only non-garbage, effectively
compacting the new space. There are other possible methods, including
use of reference counts.
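
To make the copying idea concrete, here is a minimal sketch in Python (not Pop-11, and not Poplog's actual collector) of a single non-incremental copy from one space to the other. The representation is an assumption made for illustration: each object is a dict with a "data" field and a "refs" list of indices into the space, and the name copy_collect is hypothetical. An incremental version would interleave the scanning loop with the running program, which is where the extra cost comes from.

```python
def copy_collect(from_space, roots):
    """Copy everything reachable from `roots` into a fresh space.

    from_space: list of objects, each {"data": ..., "refs": [indices]}
    roots:      indices into from_space that the program holds directly
    Returns (to_space, new_roots). Garbage is simply never copied, so the
    new space comes out compacted.
    """
    to_space = []   # filled contiguously, in the order objects are reached
    forward = {}    # old index -> new index, for objects already moved

    def move(old):
        # Copy an object the first time it is seen; afterwards just
        # return its forwarding address.
        if old not in forward:
            forward[old] = len(to_space)
            obj = from_space[old]
            to_space.append({"data": obj["data"], "refs": list(obj["refs"])})
        return forward[old]

    new_roots = [move(r) for r in roots]

    # Scan the copied objects in order, moving whatever they point to and
    # rewriting their references to addresses in the new space.
    scan = 0
    while scan < len(to_space):
        obj = to_space[scan]
        obj["refs"] = [move(r) for r in obj["refs"]]
        scan += 1

    return to_space, new_roots
```

For example, with four objects of which only two are reachable from the roots, the new space holds just those two, stored next to each other, and the old space can be reused as a whole.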
We thought hard about this and decided against implementing this or any
other form of incremental garbage collection on the grounds that
(a) the poplog garbage collector was already much faster than most of
the others available then (one user at HP research labs was reported to
have switched from a dedicated Lisp machine to Poplog lisp on a unix
workstation simply because of the reduction in GC time, and having seen
a GC in operation on a lisp machine at MIT around that time, I can
believe it).
(b) Any incremental garbage collection process slows everything down, so
you have to be prepared to trade overall average or total speed for
uniformity/smoothness of performance. I think John calculated that the
extra check on each reference, to see whether an item had or had not yet
been moved from one space to the other, could produce an
overall reduction in performance of up to 50%. It would also double the
*constant* space requirement of the heap, with possible consequential
extra paging and swapping, whereas poplog now has its "heap" allocation
increased only for the duration of the garbage collection, which is
typically a small fraction of the total time (partly controllable by
popgcratio).
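
The per-reference check mentioned in (b) can be pictured with a small Python sketch. The Cell class, its forward field and the deref function are hypothetical names invented for this illustration; they are not part of Poplog or of John's design. The point is only that every access has to go through the test, whether or not a collection is in progress.

```python
class Cell:
    """A heap item that may have been moved to the other space."""

    def __init__(self, value):
        self.value = value
        self.forward = None  # set by the collector once the item is moved


def deref(cell):
    # This test runs on every reference; it is the cost that an
    # incremental copying scheme adds to ordinary program execution.
    while cell.forward is not None:
        cell = cell.forward
    return cell
```

A program using such a scheme would call deref(x).value wherever it would otherwise use x.value, so the slowdown is spread over all the code rather than concentrated in the collector.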
We decided on balance that it was not worth moving to an incremental
garbage collector. John did however introduce heap locking as an option,
to reduce GC time. After compiling lots of things that you know
will never become garbage because they are needed throughout the run of
a program (e.g. compiled procedures, and maybe some large
data-structures), you can call sysgarbage() to compact everything, then
sys_lock_heap to lock it (see the help files for details). The
garbage collector will then not attempt to decide whether things in the
locked part are garbage, and will not move them around. (The non-garbage
things still have to be looked at, however, to see what they refer to,
since they can point to things outside the locked area.)
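
As a rough illustration of why the locked part still has to be scanned, here is a toy marking pass in Python. It is a sketch under simplifying assumptions, not Poplog's implementation: the heap is a dict from object names to the names they refer to, the function name surviving_objects is hypothetical, and every locked object is treated as an extra root.

```python
def surviving_objects(heap, roots, locked):
    """Return the set of objects that survive a collection.

    heap:   dict mapping each object name to the names it refers to
    roots:  names the program holds directly
    locked: names in the locked part of the heap
    """
    marked = set()
    # Locked objects are never tested for being garbage, but they are
    # still followed, because they may point outside the locked area.
    pending = list(roots) + list(locked)
    while pending:
        name = pending.pop()
        if name in marked:
            continue
        marked.add(name)
        pending.extend(heap[name])
    return marked
```

With heap = {"proc": ["table"], "table": [], "tmp": []}, no roots, and locked = {"proc"}, the result contains "proc" and "table" but not "tmp": the unlocked table survives only because a locked procedure refers to it.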
Heap locking works well, though if you repeatedly lock and unlock
the heap while creating X windows you will lose memory at present,
since non-relocatable stuff, such as C data structures, apparently
does not become re-usable after garbage collections if it has once
been part of a locked heap. I don't know whether that's a minor bug
in the implementation or a hard problem to avoid. But that just
means you should delay calling sys_lock_heap till the last moment
if you are using X.
Some "generational" garbage collectors try to do automated locking
and unlocking of different parts of the heap, using something like
dynamically computed estimates of volatility (I think), but this has
never been considered for Poplog, as far as I know.
Living with poplog now, I think the decision not to bother with
incremental garbage collection was the right one. Because of the
increase in CPU, bus, memory and disc speeds, garbage collections
on current systems, even for quite large processes, take a tiny fraction
of the total time, and for many applications they are so quick as not to
be noticed. E.g. if you are doing a lot of interactive work (using Ved
for example) you can try
    true -> popgctrace;
to find out when GCs occur, and how long they take. For me they tend to
be a small fraction of a second and relatively rare. I've just checked
and with my 2MWord (8Mbyte) heap at present on a multi-user 450 MHz Sun
UltraSPARC, each GC takes about 0.05 seconds. So I never notice them.
This kind of delay is within the limits of what can happen in any
case on a Unix system doing lots of things.
This means that if you want to use pop-11 for real-time applications
where the time tolerances are very much smaller, you have to use a
special version of Unix and you have to try to write your programs so as
not to generate garbage collections, e.g. re-using vectors, lists, etc.,
or to prevent them occurring during critical intervals by making sure
there's a GC just before the interval and the heap is big enough, etc.
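
The two tactics just mentioned (re-using structures, and forcing a collection just before a critical interval) can be sketched in Python as an analogy. This is not Pop-11: VectorPool and run_critical are hypothetical names, and Python's gc module controls only CPython's cyclic collector, so the sketch shows the shape of the approach rather than Poplog behaviour.

```python
import gc


class VectorPool:
    """Hands out fixed-size lists and takes them back for re-use."""

    def __init__(self, size, count):
        self.size = size
        self.free = [[0] * size for _ in range(count)]

    def acquire(self):
        # Re-use a returned vector if there is one; allocate only if not.
        return self.free.pop() if self.free else [0] * self.size

    def release(self, vec):
        # Clear the vector in place so no new list is created.
        for i in range(self.size):
            vec[i] = 0
        self.free.append(vec)


def run_critical(action):
    """Collect now, then keep the collector off while `action` runs."""
    was_enabled = gc.isenabled()
    gc.collect()   # do the work before the interval starts
    gc.disable()   # no automatic collections during the interval
    try:
        return action()
    finally:
        if was_enabled:
            gc.enable()
```

The pool has to be big enough for the critical interval, which is the same condition as having a big enough heap: if it runs out, acquire falls back to allocating and the guarantee is lost.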
OOPS! My answer was much longer than I intended it to be when I
started.
Aaron
====
Aaron Sloman, ( http://www.cs.bham.ac.uk/~axs/ )
School of Computer Science, The University of Birmingham, B15 2TT, UK
EMAIL A.Sloman AT cs.bham.ac.uk (ReadATas@please !)
PAPERS: http://www.cs.bham.ac.uk/research/cogaff/
FREE TOOLS: http://www.cs.bham.ac.uk/research/poplog/freepoplog.html