Daniel J Birks <dqb@cs.bham.ac.uk> writes:
> Date: Sun, 12 Mar 2000 17:51:06 +0000
> ...
> Does anyone have any idea what this means, it occurs at run time and
> destroys everything.....
>
> ;;;
> ;;; <<<<<<< System Error: Signal = 10, PC = 00122EE0 >>>>>>>
> ;;;
> ;;;
> ;;; FATAL ERROR - serr: SYSTEM ERROR (see above)
> ;;; FILE : <SYSTEM_OBJECT 005F4140> LINE NUMBER: 61
> ;;; DOING : 00110A30 sys_exception_handler
> ;;; sys_raise_exception() .....
> ......
> .... vedinsertstring
> ....
> ;;; .... charout systrace_pr systrace_proc
> ;;; systrace sim_run_sensors(movement_agent,_) run_sim_test <false>
> ;;; sysEXECUTE pop11_exec_stmnt_seq_to sysCOMPILE pop11_comp_stream
> ;;; pop11_compile 000B61F8 000B6DF0 ved_lmr <SYSTEM_OBJECT 006FE118>
> ;;; ved_apply_action 000CEA88 vedprocesschar vedprocess runproc
> ;;; vedprocess
> ;;; _try_input 000AAF30 syshibernate 0053D8F0 sysCOMPILE setpop
>
Somehow objects in the heap have been corrupted. This is indicated by
the occurrences of things like <SYSTEM_OBJECT 006FE118>. You'll have to
find out why.
First do some tests to check repeatability and sensitivity to how
you run the program.
Do you have a guaranteed way to generate this error?
To elicit informed help try to describe as best you can what your
program is doing, and what you have done before this occurs.
E.g. did you use the mouse to interact with control panels, menus, etc?
You are clearly running some complex program inside ved_lmr. You
could see whether the error arises outside Ved.
I.e. start pop11, then when you get the pop-11 prompt (a colon), give
the command to compile your files and then the command to start the
program. Does the error still occur? (The DOING list will be a bit
shorter).
If the mishap occurs only in Xved but not in Ved or when run direct from
Pop-11 then it could be that there's some unfortunate interaction with
Xved.
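As a rough sketch of the test outside Ved described above: start pop11
from the shell and type something like the following at the colon
prompt. The file name mysim.p and the procedure start_my_program are
hypothetical stand-ins for your own file and entry point.

```pop11
;;; Typed at the Pop-11 colon prompt, without going through Ved or Xved.
;;; 'mysim.p' and start_my_program are hypothetical names.

;;; Compile the program file(s)
load mysim.p

;;; Start the program
start_my_program();
```

If the system error still appears, Ved and Xved are probably not the
cause.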
A common cause of this sort of error is doing something which
corrupts some object in the heap. The corruption may not be
noticed at the time, but can cause an access violation, or some
other obscure error, later on.
Several things can cause your heap to be corrupted.
1. One is using updaters of fast_ (i.e. non-checking) accessing
procedures to change contents of lists, vectors, objectclass
instances, etc. etc. You can rule this out by putting at the top of
your file
    uses slowprocs
Make sure that it is compiled before any of your programs are
compiled (see REF FASTPROCS and HELP EFFICIENCY, both of which have
warnings about fast_ procedures).
Using slowprocs (if compiled before compiling your programs) will
convert fast_ procedures back to normal checking procedures.
Occasionally fi_ arithmetic operations can also cause problems of this
sort.
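To illustrate the point about fast_ updaters, here is a minimal sketch
of a file that loads slowprocs first. The procedure set_first is a
hypothetical example, not something from your program. My
understanding is that with slowprocs loaded the call to fast_front
below is compiled as the checking version, so a bad argument should
produce an ordinary mishap instead of silently overwriting memory.

```pop11
;;; Must come before any of your own code is compiled.
uses slowprocs

;;; Hypothetical procedure that uses the updater of a fast_ procedure.
;;; Without slowprocs, giving this a non-pair (e.g. []) is not checked
;;; and can corrupt the heap.
define set_first(item, list);
    item -> fast_front(list);
enddefine;

;;; A legitimate use: replace the first element of a list.
vars mylist = [a b c];
set_first("z", mylist);
```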
2. Another likely cause is use of callbacks e.g. with Propsheet or RCLIB
panels or graphic windows.
If your callback procedures (event handlers) cause garbage collections
while the system event handler is still running, datastructures in your
process can become corrupted. This may not show up until later on,
possibly at a later garbage collection, or when something innocent tries
to access a previously corrupted data-structure. (If you were using a
Microsoft OS, such events might cause the operating system to crash, I
suppose, but on Unix only your process crashes.)
To avoid this, make sure that when you define a callback which does
anything more complex than simple arithmetic operations or simply
assigning to a variable, you always invoke the callback actions via the
system procedure external_defer_apply. That will make sure that your
procedures run in a "safe" (well, safer) context.
The propsheet TEACH and REF files do not warn you about this, and they
should.
Most RCLIB buttons and sliders etc. are designed so that this will not
occur because external_defer_apply is used by default for most of the
actions. See the information about event handling in HELP RCLIB.
If you create your own event handlers using procedures like
XptAddCallback or XtAddCallback then make sure you use
external_defer_apply to invoke your procedures.
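The pattern can be sketched roughly as follows. The names do_real_work
and my_callback are hypothetical, and I assume the usual three-argument
form for callback procedures (widget, client data, call data); check
the REF documentation for the callback type you are using. The callback
itself does almost nothing: it hands a closure to external_defer_apply,
so that the real work runs later, outside the system event handler.

```pop11
;;; Hypothetical procedure doing the real work. It builds a list and
;;; prints it, so it could trigger a garbage collection.
define do_real_work(client);
    [button pressed ^client] =>
enddefine;

;;; Hypothetical callback, to be registered with XtAddCallback or
;;; XptAddCallback. It only schedules the work and returns at once.
define my_callback(widget, client, call);
    external_defer_apply(do_real_work(% client %));
enddefine;
```

Only the client data is passed on in this sketch, on the assumption
that the call data may no longer be valid by the time the deferred
procedure runs.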
Any slight loss in speed is worth tolerating for the sake of robustness.
There are probably other things you can inadvertently do which can cause
corruption and trigger system errors, but the above are among the most
frequent.
The occurrence of things like this in the DOING list:
0010F264 0010F420 00122E90 00123688 00122648
indicates invocation of anonymous system procedures. Those are the
(hexadecimal) addresses of the procedures (I think).
This bit of your DOING list:
0012A82C(*32)
indicates that one of the system procedures has been called recursively
32 times. I find this very surprising, given that the event occurred
inside a call of vedinsertstring (caused by trace printing into a file)
and, as far as I know, vedinsertstring uses only iterative procedures,
not recursion.
Maybe vedinsertstring triggered the garbage collector (which is
recursive). If you assign true (or 1) to popgctrace before your program
starts you can see if it has entered but not completed a garbage
collection before the error occurs.
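For example, something like this before the program is started (a
sketch only; put it wherever suits your setup, as long as it runs
first):

```pop11
;;; Switch on garbage collection tracing before starting the program.
true -> popgctrace;

;;; ... then compile and run the program as usual.
```

If the trace output shows a garbage collection starting just before the
system error, with no sign of it finishing, that supports the idea that
the crash happened inside the garbage collector.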
Aaron
--
Aaron Sloman, ( http://www.cs.bham.ac.uk/~axs/ )
School of Computer Science, The University of Birmingham, B15 2TT, UK
EMAIL A.Sloman AT cs.bham.ac.uk (NB: Anti Spam address)
TOOLS: http://www.cs.bham.ac.uk/research/poplog/freepoplog.html