hebisch@math.uni.wroc.pl (Waldek Hebisch) writes:
> Organization: Politechnika Wroclawska
>
> Aaron Sloman (A.Sloman@cs.bham.ac.uk) wrote:
>
> : I wondered whether the recent changes adding extra indirection through a
> : C procedure to access errno might slow down poplog.
....
> : To my surprise the old version consistently took about 11 %
> : longer -- using the shell 'time' command, and using the real time.
> : The 'user cpu' time ratio was about the same.
I've now tried this on the Sun/solaris version of poplog also:
Solaris 8 running on a 4-CPU 450 MHz UltraSPARC.
The old version took about 5% longer than the new version.
It's a multi-user machine, but the load was very light at the time.
However, on an old Dell laptop, running Red Hat 9 on a 400 MHz
Celeron CPU, there was barely any difference in speed.
(This may be partly due to the slow laptop disk drive masking
other differences. The machine on which the time difference
was most marked (1 GHz Athlon) also had the fastest disk.)
>...
> : Can anyone explain why invoking Waldek's C procedure should have
> : been faster?
>
Waldek responded:
> I think that the speedup is "random". I would be very surprised
> if `errno' access was time critical. I am not able to do more
> refined test, but starting poplog in the debugger shows that
> `get_libc_errno' is called 18 times before I get to the Poplog prompt.
> I can see a delay between hitting <enter> and getting the Poplog prompt,
> and I estimate that startup takes around 50 or 100 milliseconds.
> The C functions I wrote should execute in a few machine clocks, so
> to have a measurable influence on runtime they would have to be called
> a million times. On the other hand, adding/removing a small piece of code
> changes the exact memory layout of the final program. The change in layout
> may easily give the observed difference. Namely, layout affects:
>
> 1) alignment (access to unaligned data is much slower)
> 2) cache hit ratio (if two memory cells have addresses differing
> only in high order bits they compete for the same cache line)
> 3) delay caused by cache misses
>
> Except for alignment (compilers include special padding to align
> data), the two other effects are almost 'random' (that is, without
> info about the frequency of access of various memory locations there
> is no way to predict which layout is better).
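To put rough numbers on Waldek's argument, here is a minimal sketch in C. It is only an illustration: the function names are hypothetical, the figure of 10 clocks per call is an assumed stand-in for "a few machine clocks", and the cache geometry (line size, number of sets) is a parameter because I do not know the real values for any of the machines tested. The second part uses the usual set-index mapping for a direct-mapped or set-associative cache, which is my reading of "addresses differing only in high order bits".

```c
#include <assert.h>
#include <stdint.h>

/* Estimated total time, in seconds, spent in a small wrapper that is
 * called `calls` times at `clocks_per_call` clocks each on a CPU
 * running at `clock_hz`. All three inputs are assumptions supplied
 * by the caller, not measurements. */
double wrapper_overhead_seconds(long calls, double clocks_per_call,
                                double clock_hz)
{
    assert(clock_hz > 0.0);
    return (double)calls * clocks_per_call / clock_hz;
}

/* Cache set that an address maps to: drop the offset within the line,
 * then keep only the low-order index bits. */
uintptr_t cache_set_index(uintptr_t addr, uintptr_t line_size,
                          uintptr_t num_sets)
{
    assert(line_size > 0 && num_sets > 0);
    return (addr / line_size) % num_sets;
}

/* Returns 1 if two addresses lie in different memory lines that map
 * to the same cache set, so that they can evict one another. */
int compete_for_same_set(uintptr_t a, uintptr_t b,
                         uintptr_t line_size, uintptr_t num_sets)
{
    int different_lines = (a / line_size) != (b / line_size);
    return different_lines &&
           cache_set_index(a, line_size, num_sets) ==
           cache_set_index(b, line_size, num_sets);
}
```

With the 18 calls reported above, an assumed 10 clocks per call and a 1 GHz clock, wrapper_overhead_seconds(18, 10.0, 1e9) comes to well under a microsecond, against a startup time of 50 to 100 milliseconds. A million calls under the same assumptions would cost about 10 milliseconds.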
The effect does not seem to be random. I averaged over several runs
in all cases and the variance in times on each machine was not high.
On the other hand I did use only one type of test.
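For anyone who wants to repeat the comparison, the averaging can be sketched as follows. This is a minimal illustration, not the procedure I actually used (I averaged the output of the shell 'time' command by hand). The function names are hypothetical and the run times have to be supplied by the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n run times (in seconds). */
double mean_time(const double *times, size_t n)
{
    assert(n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += times[i];
    return sum / (double)n;
}

/* Unbiased sample variance of n run times; needs at least two runs. */
double sample_variance(const double *times, size_t n)
{
    assert(n > 1);
    double m = mean_time(times, n);
    double sum_sq = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = times[i] - m;
        sum_sq += d * d;
    }
    return sum_sq / (double)(n - 1);
}

/* How much longer, in percent, the old version took than the new one. */
double percent_longer(double old_mean, double new_mean)
{
    assert(new_mean > 0.0);
    return (old_mean - new_mean) / new_mean * 100.0;
}
```

If the spread of times within each version is small compared with the gap between the two means, the difference is unlikely to be run-to-run noise. That would not rule out a systematic layout effect of the kind Waldek describes.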
I asked a colleague about the speed difference and he offered the
suggestion that inline access to a variable might be compiled in a
location-independent fashion which would make access slow compared
with a call to a procedure at an absolute address.
Aaron
====
Aaron Sloman, ( http://www.cs.bham.ac.uk/~axs/ )
School of Computer Science, The University of Birmingham, B15 2TT, UK
EMAIL A.Sloman AT cs.bham.ac.uk (ReadATas@please !)
PAPERS: http://www.cs.bham.ac.uk/research/cogaff/ (And free book on Philosophy of AI)
FREE TOOLS: http://www.cs.bham.ac.uk/research/poplog/freepoplog.html