Julian wrote:
> ...
> However, the typical end-user (at least of our software) will be running
> our client on their desktop machine along with Outlook, Word, PowerPoint,
> IE (insert your favourite(!) application here) etc. This means the ideal
> situation for us is the ability to configure the memory management to give
> a balance of low memory footprint, little or no latency during user
> interaction and the ability to scale up for "power users" who push the
> software.
This is partly handled in pop11 by popgcratio.
If you don't set popminmemlim, but set popmemlim high, you can alter
popgcratio to affect how sensitive the poplog memory manager should be
to the need to expand the heap (up to the maximum set by popmemlim).
When I made my original comment about using popminmemlim I thought the
problem was one where it was known in advance that the heap size
required would be large and would stay large: that was my reading of
Steve's problem.
In that case you probably don't want to let the system gradually
increase the heap size, causing lots of garbage collections on the way.
If the heap needs to expand and contract according to what the program
is doing, then by altering popgcratio (desired maximum ratio of GC time
to total CPU time) you can partly control how the system expands and
contracts the heap. E.g. it will contract the heap if there are
infrequent short garbage collections with lots of spare space, and will
grow it again if the situation changes.
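The kind of feedback loop popgcratio controls can be illustrated with a toy model. This is a Python sketch with invented names and invented growth rules, not the Poplog algorithm: it just shows a manager that expands the heap (up to a ceiling, playing the role of popmemlim) when GC time exceeds a target fraction of CPU time, and contracts it when collections are cheap and there is lots of spare space.

```python
# Toy model of ratio-driven heap sizing, in the spirit of popgcratio.
# Illustrative only: names and adjustment rules are invented here.

class RatioHeapSizer:
    def __init__(self, target_ratio=0.1, heap_size=1000, max_heap=16000):
        self.target_ratio = target_ratio  # desired max GC time / CPU time
        self.heap_size = heap_size        # current heap limit (words)
        self.max_heap = max_heap          # ceiling, like popmemlim
        self.gc_time = 0.0
        self.cpu_time = 0.0

    def after_gc(self, gc_cost, mutator_cost, live_words):
        """Adjust the heap limit after a collection and return it."""
        self.gc_time += gc_cost
        self.cpu_time += gc_cost + mutator_cost
        ratio = self.gc_time / self.cpu_time
        if ratio > self.target_ratio:
            # GCs are eating too much time: expand, up to the ceiling
            self.heap_size = min(self.heap_size * 2, self.max_heap)
        elif ratio < self.target_ratio / 2 and live_words < self.heap_size // 4:
            # Cheap, infrequent GCs with lots of spare room: contract
            self.heap_size = max(self.heap_size // 2, live_words * 2)
        return self.heap_size
```

Raising the target ratio makes the manager tolerate more GC overhead before expanding; lowering it trades memory footprint for fewer pauses.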
(You can also set pop_malloc_min_alloc to reduce excessive segmentation
of the heap on unix/linux versions of poplog.)
Incidentally I used to find GC times more intrusive on suns than on PCs
running linux. I have no idea why. I don't think it was just a matter
of CPU and memory speeds. Perhaps the architecture difference somehow
affects the speed of a garbage collection?
(It could have been related to cache behaviour.)
Alternatively, there may have been some general improvement to the
Poplog system at about the time I switched from mostly using suns to
mostly using linux on PCs.
> We found we were able to do this with Java which was something
> we hadn't been able to do with Poplog.
>
> (BTW, I'm not saying that the Poplog GC is not good. What I am saying
> is that it currently doesn't provide sufficient control over the
> memory management behaviour for certain types of application.)
Agreed.
There are things you can do by dividing up the heap into different
segments and treating them differently, but I had bad experiences in the
late 1970s when using a version of poplog (on a PDP-10) which did that
for other reasons, namely using a 'caged' heap to put different kinds of
data-structure into different areas of virtual memory to save space.
Data-types could then be inferred from an item's address, instead of
each data-item needing an extra word of memory (or two bits, in the case
of integers and decimals) to specify its type.
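The space-saving trick is simple to state: with each type confined to its own address range ("cage"), the type test becomes a range lookup rather than a per-item tag. A minimal Python sketch, with invented addresses and cage boundaries:

```python
# Sketch of the 'caged heap' idea: each data type lives in its own
# region of the address space, so an item's type can be inferred from
# its address alone, with no per-item tag word.  The addresses and
# cage boundaries below are invented for illustration.

CAGES = [
    (0x0000, 0x3FFF, "integer"),
    (0x4000, 0x7FFF, "pair"),       # list cells
    (0x8000, 0xBFFF, "string"),
    (0xC000, 0xFFFF, "procedure"),
]

def type_of(addr):
    """Infer a datum's type from which cage its address falls in."""
    for lo, hi, tname in CAGES:
        if lo <= addr <= hi:
            return tname
    raise ValueError("address outside every cage: %#x" % addr)
```

The cost, as described below, is that the cage boundaries are fixed: exhausting one cage means no more items of that type, however much space remains in the others.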
It was very hard to get the various segment sizes right for different
programs, and if, for instance, you ran out of space available for
lists, you had no option but to restart the program, as that
implementation was not flexible enough to reorganise the partitioning of
memory dynamically.
There were so many difficult tradeoffs that I ended up believing that in
that case the benefits of the segmented memory were outweighed by the
costs.
I don't know if the complications of keeping parts of the heap with
different ages in different places would cause similar problems. I can
see how for some programs that could be very useful. For others, the
pattern of life-cycles of different kinds of data might cause such a
scheme to waste a lot of space, because things judged to be non-garbage
on grounds of age are actually garbage.
Alternatively, the scheme could waste a lot of time checking and
shifting things around to avoid that.
My hunch is that for programs where this really matters it may be
possible to make good use of free lists and explicitly allocate items
from free lists where possible.
Provided that the use of free lists is conservative (i.e. if unsure, you
never put anything on a free list, leaving the general-purpose garbage
collector to handle the difficult cases), you can do it safely and gain
significant benefits, without the nasty risk of memory leaks that comes
with systems lacking garbage collectors, or the nasty risk of corrupting
the program by attempting to deal with the hard cases and making
mistakes.
This is the sort of thing David Young did in his 'oldarray' library
package (which is not yet included in the version of his popvision
library available at Birmingham but will be, when I have some time to
install the latest version).
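The conservative free-list discipline can be sketched in a few lines. This is an invented Python illustration of the pattern, not David Young's code: items are recycled only when the caller is certain no other reference survives, and in any doubtful case the reference is simply dropped and the GC takes over.

```python
# Conservative free-list allocation, sketched in Python.  A procedure
# that *knows* an item cannot have escaped returns it to the free list
# for reuse; in any doubtful case it leaves reclamation to the
# general-purpose garbage collector.  All names are invented.

class FreeList:
    def __init__(self, make_item):
        self.make_item = make_item   # constructor for a fresh item
        self.free = []               # recycled items awaiting reuse

    def allocate(self):
        # Reuse a recycled item if one is available, else make a new one
        return self.free.pop() if self.free else self.make_item()

    def release(self, item, certainly_unshared):
        # Conservative rule: recycle only when the caller is sure no
        # other reference to the item survives; otherwise do nothing
        # and let the GC handle the hard case safely.
        if certainly_unshared:
            self.free.append(item)

vectors = FreeList(lambda: [0.0] * 3)
v = vectors.allocate()                          # fresh allocation
vectors.release(v, certainly_unshared=True)     # safe to recycle
w = vectors.allocate()                          # reuses v's storage
```

The worst that conservatism costs is an occasional extra allocation and a later GC; it never risks two live references to one recycled item.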
However, making good use of free lists requires more analysis by the
programmer, which in turn demands a lot of training and experience,
including in some cases a good understanding of dlocal, e.g. to ensure
that opportunities to return items to free lists are not lost because of
abnormal procedure exits.
(By 'programmer' of course, I don't mean end-user of a system like
Clementine.)
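The abnormal-exit hazard is worth making concrete. In Pop-11, dlocal exit actions run whether a procedure exits normally or abnormally; the closest Python analogue of that pattern is try/finally. A sketch with invented names:

```python
# The hazard: if a procedure returns its scratch item to a free list
# only on the normal exit path, an exception (the analogue of a Pop-11
# abnormal exit) skips that step and the recycling opportunity is
# lost.  A dlocal exit action covers both paths in Pop-11; here the
# same guarantee comes from try/finally.  Names are invented.

free_list = []

def with_scratch_buffer(work):
    """Run work(buffer), recycling the buffer even on abnormal exit."""
    buf = free_list.pop() if free_list else bytearray(64)
    try:
        return work(buf)
    finally:
        # Runs on normal return *and* on exception, like the exit
        # action of a dlocal expression in Pop-11.
        free_list.append(buf)
```

Without the finally clause, every exception raised inside work would leak the recycling opportunity (though not the memory itself, since the GC still sees the buffer as garbage).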
[Julian]
> The "documents" themselves can vary in size from a few hundred K to
> many megabytes, depending on what the user is doing and there's no
> problem releasing the resources when a document is closed. However
> as memory usage increased in the old Poplog implementation, we could
> not program our way round the fact that a GC scans the whole heap
> which always caused a significant pause.
In poplog, sys_lock_heap does address this in some cases.
I think we once discussed allowing multiple locks. I.e. when you lock
the heap you get back a tag saying how far it has been locked, then
you can unlock back to a specified tag. But I don't think that ever got
implemented.
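That never-implemented scheme is easy to sketch. The following Python model is invented for illustration (it is not a Poplog API): each lock records a boundary and returns a tag, the GC need not scan below the highest boundary, and unlocking back to a tag discards that lock and every later one.

```python
# Sketch of the discussed-but-unimplemented idea of multiple heap
# locks: locking returns a tag saying how far the heap was locked,
# and you can later unlock back to any earlier tag.  Modelled as a
# stack of lock boundaries; all names are invented.

class LockableHeap:
    def __init__(self):
        self.allocated = 0      # high-water mark of allocation (words)
        self.locks = []         # stack of lock boundaries

    def alloc(self, nwords):
        self.allocated += nwords

    def lock(self):
        """Lock everything allocated so far; return a tag for it."""
        self.locks.append(self.allocated)
        return len(self.locks) - 1

    def unlock_to(self, tag):
        """Discard the lock identified by tag and all later locks."""
        del self.locks[tag:]

    def locked_words(self):
        # A GC would not need to scan below the highest lock boundary
        return self.locks[-1] if self.locks else 0
```

The attraction is that long-lived data locked early stays out of every subsequent collection, while later, more speculative locks can still be retracted.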
The only way to avoid scanning all non-garbage (which is not the same as
scanning the whole heap) before treating data-structures as garbage is
either to use something like reference counts (which is fine if you
don't mind the extra overhead they produce) or to do compile-time
analysis of when things become garbage, which I think is impossible to
infer in general in a system as rich as pop11. However, there are
special cases where it could be done (similar to the automatic analysis
of lexical variables discussed in REF VMCODE).
In those special cases a compiler could, in principle, automatically put
certain no-longer needed items on free lists on exit from a procedure.
I wonder how that would compare with a generational GC.
Of course it would also have costs, since putting things back onto free
lists takes time, and in some contexts (with lots of memory and a very
fast system) it may be better to do one big GC occasionally than very
frequent 'return to freelist' operations.
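The reference-counting alternative mentioned above has the same trade-off in miniature: reclamation is immediate and local, with no heap scan, but every pointer copy and drop pays a bookkeeping cost. A minimal Python sketch (names invented; cyclic structures deliberately ignored):

```python
# Minimal reference-counting sketch: each object carries a count, and
# reclamation happens the instant the count reaches zero, without
# scanning the rest of the heap.  The add_ref/drop_ref bookkeeping on
# every pointer operation is the "extra overhead" mentioned above.
# Names are invented; cycles are ignored for simplicity.

reclaimed = []

class Counted:
    def __init__(self, payload):
        self.payload = payload
        self.refcount = 1        # the creating reference

def add_ref(obj):
    obj.refcount += 1
    return obj

def drop_ref(obj):
    obj.refcount -= 1
    if obj.refcount == 0:
        reclaimed.append(obj.payload)   # immediate, local reclamation
```

For a document-style workload like Julian's, this gives prompt release on close, at the price of per-operation costs that a tracing GC only pays at collection time.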
> In the Java implementation,
> the generational GC and the additional controls over memory management
> means the application can have a very large heap without most GCs being
> noticable.
I've experienced that with sys_lock_heap. I suspect using
sys_lock_system (using large saved images, where much of the locked part
is permanently non-writeable) could help even more, but would be more
awkward during development.
It would be interesting to see studies of various 'life-histories' of
programs with analysis of how different sorts of data migrate between
different segments, so that we can do a deep analysis of the tradeoffs
for different classes of programs and investigate options for automatic
optimisation of garbage handling.
However, if deciding how an arbitrary program should be optimised is as
difficult as deciding whether an arbitrary procedure will ever halt,
then we have problems.
Aaron