Date: Mon Dec 29 23:09:26 2003
Subject: Re: Comparing Garbage Collectors
From: Aaron Sloman
Volume-ID: 1031229.01


Hi Julian,

Julian Clinton (SPSS) wrote (following up a discussion started by Steve
Leach, comparing Pop11 and Java):

> Date: Thu, 18 Dec 2003 10:38:51 +0000 (UTC)
 
> It's an interesting comparison. A few years ago we re-wrote our
> Clementine client application in Java, which had started life as a
> Poplog/Pop-11 and ObjectClass application. This gave us an insight into
> the behaviour of Java and Pop-11, or at least the 2001 version of
> Pop-11, under similar circumstances. Both applications usually had heap
> sizes greater than 150Mb and our latest client uses a default maximum of
> 256Mb (although the internal architectures are quite different and these
> figures don't represent any comparison of the way objects are
> represented).
>
> A few things we noted:
>
> 1. The client has a fair degree of interactivity (e.g., dragging objects
> around on a canvas). In Poplog, garbage collection would cause the
> client to "lock" for a few seconds. We used the non-copying GC because
> the copying GC generally caused much greater paging and making the whole
> machine unresponsive.

As you know, these things are highly context sensitive in pop11.

Here is some relevant information that may not be known to all pop11
experts. (Some of it is in HELP EFFICIENCY.)

Perhaps the most important single fact is that if there is constantly
as much spare memory available as the maximum heap size required,
pop11 can avoid swapping and paging, including the paging caused by
garbage collections using the 'stop and copy' garbage collector. If
there is less spare memory than the heap size, it may be better to turn
off the stop-and-copy GC by assigning false to pop_gc_copy. This selects
a non-copying garbage collector which is slower per collection, but
which may be faster overall when the copying method would generate a
lot of paging. (This was mentioned in Julian's message.)

It is useful also to initialise both popmemlim and popminmemlim to be
as large as you can manage without causing paging. This requires some
experimentation, and can depend on what else is using the computer.

Increasing popminmemlim forces the heap to expand before it has to, and
can reduce the frequency of garbage collections, by having more space to
'turn over': the memory allocator runs out of free space less often.
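
The effect of a larger minimum heap on GC frequency can be illustrated
with a rough model, written here in Python rather than Pop-11. It is
only a sketch under stated assumptions: a fixed amount of live data
survives every collection, so each cycle can absorb (heap - live) words
of new allocation. The function name and all the figures are
illustrative, not measurements.

```python
def gc_count(total_alloc_words, heap_words, live_words):
    """Estimate how many collections a run triggers.

    Assumes every collection frees everything except `live_words`,
    so each cycle can absorb (heap_words - live_words) of allocation.
    """
    free_per_cycle = heap_words - live_words
    if free_per_cycle <= 0:
        raise ValueError("heap must be larger than the live data")
    # Integer ceiling division: number of times the free space fills up.
    fills = -(-total_alloc_words // free_per_cycle)
    # No collection is needed after the final fill.
    return max(0, fills - 1)

live = 40_000_000           # hypothetical live data, in words
total = 1_000_000_000       # hypothetical total allocation, in words
small_heap = gc_count(total, 50_000_000, live)
large_heap = gc_count(total, 75_000_000, live)
print(small_heap, large_heap)
```

In this model the benefit is largest when the live data occupies most
of the smaller heap, since the space available to 'turn over' then
grows by a much bigger factor than the heap itself.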

I recently ran a test on a PC running Linux, a 1 GHz AMD Athlon with 768
Mbytes of memory. The pop11 program required a lot of memory and
generated many garbage collections. According to 'top', the pop11
process grew to about 400 Mbytes and ran very slowly, mainly because it
got only a small percentage of CPU time, i.e. less than 10%. Presumably
this is because it was constantly waiting for system calls to complete,
which in turn was probably due to heavy paging, so that the program
kept losing its scheduler slot to a process that was ready to run. (The
super-user may be able to change priorities to reduce this effect.)

I then changed the numbers in the program so that the process size shown
by top varied between 196M and about 380M, i.e. using a heap of about 50
million words. The program then used less memory, needed no paging, and
was registered by top as using 99% of CPU time. In this case each
garbage collection took about 4.8 seconds. When I made pop_gc_copy false
the process size remained fixed at 196M but garbage collections took
about 7.3 seconds.

I increased both popmemlim and popminmemlim to 75 million (i.e. about
300 Mbytes). As expected, garbage collections then occurred less often.
Moreover, with pop_gc_copy set true, the time required for each
collection dropped, for some reason, to about 1.3 seconds. If I made
pop_gc_copy false, so that it did not need to copy the heap, the GC
times rose to about 3.3 seconds. Top showed a minimum process size of
291 Mbytes, rising to about double that occasionally when copying was
turned on. Because there was no paging, this program still used about
99% of the CPU. However, simply by increasing popminmemlim until the
size of the program forced paging, I could make the percentage of CPU
usage drop by a factor of 10 or more, and the program ran very slowly.
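
The process sizes above are consistent with a word size of 4 bytes.
That is an assumption here, although it matches the conversion of 75
million words to about 300 Mbytes. The following Python calculation is
offered only as a check on the arithmetic; it also assumes that a
stop-and-copy collection needs a second space of up to the same size as
the heap.

```python
BYTES_PER_WORD = 4  # assumption: 32-bit words

def words_to_mib(words, bytes_per_word=BYTES_PER_WORD):
    """Convert a heap size in words to mebibytes (2**20 bytes)."""
    return words * bytes_per_word / 2**20

for words in (50_000_000, 75_000_000):
    heap = words_to_mib(words)
    # Peak size while copying: the heap plus an equally large copy space.
    print(f"{words:>11,} words: heap ~{heap:.0f} MiB, "
          f"peak while copying ~{2 * heap:.0f} MiB")
```

This gives heaps of roughly 191 and 286 MiB, close to the 196M and 291
Mbytes minimum process sizes reported by top, with peaks of roughly
double those values while copying.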


> We initially noticed similar locking behaviour
> using the default settings in Sun's JVM. However, this was largely
> eliminated by using Java's incremental GC (-Xincgc option). Having an
> incremental or generational GC would have been useful in Poplog to try
> to avoid these long pauses.

As explained below, the cost of incremental GC would have been very
high. The use of sys_lock_heap makes possible something a bit like
generational GC, though it is not automated.

John Gibson worked out a detailed scheme for incremental garbage
collection at some time in the mid 1980s. This required significant
changes to the memory accessing code in Pop-11, and a considerable
additional memory requirement. It used two heap spaces permanently
allocated and the mechanism gradually moved non-garbage structures from
one heap space to the other, compacting in the process, until everything
had been moved, then it started moving things in the reverse direction.
This meant that every memory access required a check as to whether the
item accessed had been moved or not, along with updating of pointers if
it had been moved.
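
The per-access cost can be seen in a toy model. The Python sketch below
is not John's design, whose details are not given here; it is a generic
illustration of a two-space scheme in which structures are moved a few
at a time and every access goes through a check for a forwarding
pointer. All class and method names are hypothetical.

```python
class Cell:
    """A heap object: a payload plus references to other cells."""
    def __init__(self, value, refs=()):
        self.value = value
        self.refs = list(refs)
        self.forward = None    # set to the new copy once the cell has moved

class IncrementalHeap:
    def __init__(self, roots):
        self.roots = list(roots)
        self.pending = []      # copies whose references still point at old cells
        self.barrier_checks = 0

    def read(self, cell):
        """Every access checks whether the cell has been moved."""
        self.barrier_checks += 1
        return cell.forward if cell.forward is not None else cell

    def _move(self, cell):
        # Copy the cell once; later calls return the existing copy.
        if cell.forward is None:
            cell.forward = Cell(cell.value, cell.refs)
            self.pending.append(cell.forward)
        return cell.forward

    def start_cycle(self):
        self.roots = [self._move(r) for r in self.roots]

    def step(self, budget=1):
        """Do a bounded amount of moving, then return to the program."""
        while budget > 0 and self.pending:
            cell = self.pending.pop()
            cell.refs = [self._move(r) for r in cell.refs]
            budget -= 1
        return not self.pending    # True when everything has been moved

leaf = Cell("leaf")
root = Cell("root", [leaf])
heap = IncrementalHeap([root])
heap.start_cycle()
mid_cycle = heap.read(leaf)    # not yet moved: the original is returned
while not heap.step():
    pass
after = heap.read(leaf)        # moved: the copy is returned
```

The check in read() is tiny, but it is paid on every access, which is
why code that does little except follow pointers suffers most.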

If I remember correctly, John said that his experiments or calculations
indicated that converting poplog's memory management to support
incremental garbage collection would slow down some programs (e.g.
programs doing a lot of list manipulation) by a factor of about 2 (i.e.
they would take about twice as long), in addition to requiring twice as
much memory for the same size heap.

In those days physical memory was very scarce and very expensive. E.g.
in the mid 1980s it was rare to have more than a few megabytes of RAM
on a computer. Also CPUs were hundreds of times slower than now.

After considerable discussion we concluded that the costs of supporting
incremental garbage collection outweighed the benefits, and instead
focused on mechanisms to allow users to reduce the frequency of garbage
collections and the total amount of garbage generated by certain
programs. These mechanisms included the following:

o sys_lock_heap, to enable data-structures known not to be garbage to be
locked so that they were scanned but never copied or moved.
(Unfortunately, if the heap is locked and then unlocked, space allocated
for external structures appears not to be reclaimed by the garbage
collector, so sys_lock_heap should not be used if a program is creating
many temporary X windows. I don't know if this is just a bug or an
unavoidable feature of how external structures are used.)

o popminmemlim, forcing expansion of the allocated heap size to reduce
frequency of GCs, instead of using only the automated expansion
controlled by popgcratio

o mechanisms for maintaining 'free lists' so that items known by the
programmer to be garbage could be returned to a free list for re-use,
e.g. sys_grbg_destpair, sys_grbg_list, sys_grbg_closure, sys_grbg_fixed.
(Users can implement their own versions to some extent, e.g. a mechanism
for handling free lists of vectors of different sizes.)

o 'destroy properties' (see REF PROPS/Destroy), which allowed
non-garbage-collectable space (e.g. space occupied by external
structures such as X widgets) to be reclaimed when certain pop11 objects
(e.g. the corresponding Pop-11 data-structures) are garbage collected.

o subscr_stack which allows part of the user stack to be used in place
of a temporary list or vector.
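
The user-level free lists mentioned in the list above (free lists of
vectors of different sizes) can be sketched as follows. This is a
Python illustration with hypothetical names, not the sys_grbg_*
procedures and not Pop-11 code. It assumes the caller really does know
that a released vector is no longer referenced anywhere else; if that
is wrong, two parts of a program end up sharing one vector.

```python
from collections import defaultdict

class VectorPool:
    """Free lists of fixed-size vectors, keyed by length."""
    def __init__(self):
        self._free = defaultdict(list)
        self.fresh_allocations = 0

    def get(self, size):
        """Reuse a released vector of this size if one exists, else allocate."""
        if self._free[size]:
            return self._free[size].pop()
        self.fresh_allocations += 1
        return [None] * size

    def release(self, vec):
        """Return a vector the caller knows is no longer in use."""
        for i in range(len(vec)):
            vec[i] = None      # drop the contents so they can be collected
        self._free[len(vec)].append(vec)

pool = VectorPool()
a = pool.get(8)
pool.release(a)
b = pool.get(8)     # reuses the vector just released
c = pool.get(16)    # no free vector of this size, so a new one is allocated
```

Reusing vectors in this way creates no new garbage, so a program that
churns through many temporary vectors of a few standard sizes triggers
fewer collections.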

It was also possible for user programs to invoke the garbage collector
at non-critical times, e.g. immediately before starting a procedure
which should run without being interrupted by a garbage collection.

I am still fairly sure that when we decided not to go for incremental
GC we took the right decision, if John's calculations of the costs of
incremental garbage collection were correct. There may have been
techniques available with different costs. (If I remember correctly,
the use of reference counters would have slowed things down even more.)

At present the copying garbage collector expands the amount of memory in
use (physical or virtual) to make space for the copying process. When
finished it releases that amount of memory. I wonder whether it would
save time if it did not repeatedly request and then release the extra
space. I expect that doing this a lot can make the operating system do a
lot more work, especially if it is paging, but I don't know.

> 2. We also noticed we could improve overall performance if we increased
> Java's "nursery" size (the area of memory where objects are first
> created). By increasing this to 4Mb from the default 600K, many
> operations that produce large numbers of short-lived objects (such as
> parsing XML) went 2-3 times faster. Again being able to tune Java's
> memory management behaviour was useful, although in this case it was an
> issue brought about by the additional memory management features.

This appears to be analogous to increasing popminmemlim in pop11.

> 5. On a more trivial basis, when Poplog GC'd, it put a custom cursor up.

This happens only if XVed is running, though I suppose it could be made
more general.

> Java doesn't which is a pity as it is useful feedback to the user. We
> also found the pop_after_gc hook quite useful for checking memory
> behaviour. The closest thing we have in Java is to create a temporary
> object whose "finalize()" method (which is called when the object is
> GC'd) updates the memory stats. Unfortunately, not all Java GC types run
> the "finalize()" method so the actual memory usage isn't always
> reflected in the UI.
>
> I guess the summary is that Java has greater flexibility in controlling
> the effect of memory management (short but frequent, infrequent but
> long, memory segment size etc.) than Poplog

Though you do have control over this to some extent, as explained above.

> but applications built on
> Java can produce less effective feedback during GC and can be less
> robust when memory becomes tight.

One thing you did not mention was whether there was any difference in
overall speed between the different versions of Clementine with similar
functionality.

Aaron
====
Aaron Sloman, ( http://www.cs.bham.ac.uk/~axs/ )
School of Computer Science, The University of Birmingham, B15 2TT, UK
EMAIL A.Sloman AT cs.bham.ac.uk   (ReadATas@please !)
PAPERS: http://www.cs.bham.ac.uk/research/cogaff/ (And free book on Philosophy of AI)
FREE TOOLS: http://www.cs.bham.ac.uk/research/poplog/freepoplog.html