Anthony and Jonathan, thanks for your comments.
Anthony wrote:
> Maybe we should agree a set of Poplog benchmarks and put the on an
> ftp server so people can apply them to lots of different machines.
I have now put my tests in the directory
ftp://ftp.cs.bham.ac.uk/pub/dist/poplog/benchmarks
The README file explains what the other files are. The benchtar.Z
file is a compressed tar file containing all the other files, so you
can get them in one go.
The tests include some shell scripts for running the Prolog tests
N times in parallel (N = 1, 2, 4, 8, 12, 16), which are useful for seeing
how performance degrades on a machine under heavy multi-user load.
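The idea behind the parallel runs can be sketched as follows. This is not the shell script from the ftp directory, only a minimal Python illustration of the same approach: start N copies of a command together and time how long the whole batch takes. The function name `run_parallel` and the placeholder workload are assumptions made for the example; substitute the real benchmark command.

```python
import subprocess
import sys
import time


def run_parallel(cmd, n):
    """Start n copies of cmd together; return wall-clock seconds until all exit."""
    start = time.perf_counter()
    procs = [subprocess.Popen(cmd, stdout=subprocess.DEVNULL) for _ in range(n)]
    for p in procs:
        p.wait()
    return time.perf_counter() - start


if __name__ == "__main__":
    # Placeholder workload (an assumption): replace with the real benchmark command.
    cmd = [sys.executable, "-c", "sum(range(200000))"]
    for n in (1, 2, 4, 8, 12, 16):
        print(f"N={n:2d}  elapsed={run_parallel(cmd, n):.2f}s")
```

Plotting elapsed time against N gives a rough picture of how the machine copes as the load increases.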
> The question is what shall we call them POPmarks?
My tests are very simple. The floating point test is particularly
stupid, though it gives a very rough indication of relative performance
of machines.
> PS Arron I assume by SS/672 you mean a SpacServer 4/670 with two
> Cyprus 102 CPUs as used in the SparcStation 2.
Yes, except that they are a bit faster than the SparcStation 2.
Steve Knight has now run my tests on his workstation, an HP 735, running
HP-UX. It comes out faster than all the other machines. He has added
his numbers to my table as column J:
                    A      B      C      D      E      F      G      H      I      J
                    DEC           HP     Sun*   Sun    Sun    Sun    Sun    Sun    HP
                    3100   MIPS   HP-UX  SS2    SS/    SS10   SS10   SS10   Ross   HP-UX
                           2000   M68040        672    /30    /41    /52    HS66   735
                                                                     Sol2.3 MHz
----------------------------------------------------------------------------------------
Prolog test KLips   56.1   68.9   107.8  117.0  134.8  203.3  238    295    335    387
(Simple reverse)
----------------------------------------------------------------------------------------
Pop-11 Factorial(1000)
three times (Secs)
recursive           0.78   0.80   0.55   0.81   0.75   0.55   0.46   0.42   0.38   0.11
iterative           0.77   0.80   0.57   0.78   0.71   0.51   0.45   0.40   0.38   0.11
----------------------------------------------------------------------------------------
Floating point
single (secs)       0.88   0.67   0.71   0.38   0.38   0.21   0.19   0.17   0.13   0.13
double              1.05   0.85   0.60   0.43   0.41   0.25   0.22   0.18   0.16   0.17
----------------------------------------------------------------------------------------
* Replacing SS2 processor with Weitek chip gave
  177 KLips, Factorial: 0.5, 0.45 secs, Float: 0.23, 0.3 secs
  I.e. not quite SS10/30
He may be able to try out the tests on bigger HP machines later.
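To read relative performance off the table, it can help to normalise a row against one column. The following is a small Python sketch using the Prolog KLips row, keyed by the column letters A to J; the helper name `relative_to` is made up for this illustration.

```python
# Prolog KLips figures from the table above, keyed by column letter.
klips = {"A": 56.1, "B": 68.9, "C": 107.8, "D": 117.0, "E": 134.8,
         "F": 203.3, "G": 238, "H": 295, "I": 335, "J": 387}


def relative_to(values, baseline):
    """Express every column as a multiple of the baseline column."""
    base = values[baseline]
    return {col: round(v / base, 2) for col, v in values.items()}


rel = relative_to(klips, "A")
for col, ratio in rel.items():
    print(f"{col}: {ratio:.2f}x")
```

For the timing rows (factorial and floating point) smaller is better, so the ratio would need to be inverted to express a speed-up.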
Jonathan wrote:
> Out of curiosity, I tried factorial(1000) in Franz
> Allegro CL/PC 1.0[1] on the Pentium 90 machine I am
> using at the moment. Computing it 3 times takes 0.22
> secs plus 0.17 secs garbage collection time (= 0.39
> secs) although if I run the loop 30 times it takes
> 1.92 plus 1.54 = 3.46, averaging a bit faster than the
> Sparc running at 66MHz (I assume Prof. Sloman's times
> included GC).
Yes. However, in Poplog with popmemlim set to 500000, GC times are
very much lower. E.g. on a SparcServer with dual 66 MHz Hypersparcs,
with several other people logged in:
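    ;;; Naive recursive factorial
    define procedure fac(n);
        lvars n;
        if n == 0 then 1
        else fac(n - 1)*n
        endif
    enddefine;

    ;;; Set both the memory limit and the minimum to 500000
    500000 ->> popmemlim -> popminmemlim;
    ;;; Print a trace line for every garbage collection
    true -> popgctrace;
    ;;; Force a GC, then reset the timer (discarding its result)
    sysgarbage(),timediff() ->;
    repeat 30 times
        fac(1000) ->;
    endrepeat;
    ;;; Print the time used since the timer was reset
    timediff() =>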
Gives:
;;; GC-user(C) 0.12 MEM: l 69632 + u 117756 + f 326659 + s 1 = 514048
;;; GC-auto(C) 0.15 MEM: l 69632 + u 117958 + f 328505 + s 1 = 516096
;;; GC-auto(C) 0.17 MEM: l 69632 + u 117820 + f 328643 + s 1 = 516096
;;; GC-auto(C) 0.15 MEM: l 69632 + u 118017 + f 328446 + s 1 = 516096
;;; GC-auto(C) 0.15 MEM: l 69632 + u 117901 + f 328562 + s 1 = 516096
;;; GC-auto(C) 0.12 MEM: l 69632 + u 118065 + f 328398 + s 1 = 516096
;;; GC-auto(C) 0.15 MEM: l 69632 + u 117961 + f 328502 + s 1 = 516096
;;; GC-auto(C) 0.17 MEM: l 69632 + u 118113 + f 328350 + s 1 = 516096
;;; GC-auto(C) 0.13 MEM: l 69632 + u 118016 + f 328447 + s 1 = 516096
;;; GC-auto(C) 0.15 MEM: l 69632 + u 118160 + f 328303 + s 1 = 516096
;;; GC-auto(C) 0.15 MEM: l 69632 + u 118070 + f 330441 + s 1 = 518144
;;; GC-auto(C) 0.14 MEM: l 69632 + u 118210 + f 330301 + s 1 = 518144
** 5.85
I.e. GC time is 1.75 secs: 1.75/5.85 = 30% of total time.
If I increase the memory allocation by a factor of 4:
    2000000 ->> popmemlim -> popminmemlim;
I get (on the same machine) only 3 garbage collections:
;;; GC-user(C) 0.15 MEM: l 69632 + u 119803 + f 1811460 + s 1 = 2000896
;;; GC-auto(C) 0.13 MEM: l 69632 + u 119086 + f 1814225 + s 1 = 2002944
;;; GC-auto(C) 0.15 MEM: l 69632 + u 119136 + f 1814175 + s 1 = 2002944
** 5.1
I.e. GC time is 0.43 secs: 0.43/5.1 = 8.4% of total time.
This shows how you have to control heap allocation when running
benchmarks.
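The GC percentages above can be reproduced mechanically from the popgctrace output. Here is a minimal Python sketch, assuming trace lines of exactly the form shown in this message; the regular expression and the function name `gc_share` are my own for this illustration and are not part of Poplog.

```python
import re

# Matches lines such as ";;; GC-auto(C) 0.15 MEM: ..." and captures the GC time.
GC_LINE = re.compile(r"^;;; GC-\w+\(\w\)\s+([\d.]+)\s+MEM:")


def gc_share(trace, total_secs):
    """Return (number of GCs, total GC seconds, GC fraction of total_secs)."""
    gc_times = [float(m.group(1))
                for line in trace.splitlines()
                if (m := GC_LINE.match(line.strip()))]
    gc_total = sum(gc_times)
    return len(gc_times), gc_total, gc_total / total_secs


# Trace from the run with popmemlim set to 2000000; timediff() printed 5.1.
trace = """\
;;; GC-user(C) 0.15 MEM: l 69632 + u 119803 + f 1811460 + s 1 = 2000896
;;; GC-auto(C) 0.13 MEM: l 69632 + u 119086 + f 1814225 + s 1 = 2002944
;;; GC-auto(C) 0.15 MEM: l 69632 + u 119136 + f 1814175 + s 1 = 2002944
"""
count, gc_total, share = gc_share(trace, 5.1)
print(f"{count} collections, {gc_total:.2f} s GC, {share:.1%} of total")
```

As in the arithmetic above, the sum includes the initial GC-user line produced by the explicit sysgarbage() call.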
> I don't think the compiler technology is dramatically
> better, so these figures reflect processor speed.
I have reason to believe, from some tests Harry Barrow did a few years
ago, that the American Lisp systems (certainly Lucid Lisp) do a lot more
compile time optimisation than Poplog does. (It's easier in Lisp,
because there's a parse tree, and also the open stack in Poplog means
that you can't do such clever things with registers.) But maybe Allegro
is not so heavily optimised. In Lucid, compilation took very much
longer than in Poplog Pop-11 on the same machine. My impression is
that the Poplog garbage collector is one of the fastest there is,
but for user code other compilers do much better optimisation.
(E.g. I don't know whether Poplog uses the new integer operations on the
latest Sparc systems. The original SPARC did not include integer
multiply and divide instructions. Can anyone at Sussex or ISL comment?)
Here's a more realistic comparison of processor speed, from a table
prepared by John DiMarco at Toronto, available via
ftp://ftp.cdf.toronto.edu/pub/spectable
I've merely selected a subset of the values. Pentium is at the end.
System            CPU        ClkMHz  Cache      SPECint SPECfp  Info  Source
Name              (NUMx)Type ext/in  Ext+I/D    92      92      Date  Obtained
================= ========== ======= ========== ======= ======= ===== ==========
-- -- HP
HP 425t           68040      25      4/4        12.3    10.3    Jun93 DECinfo
HP 425e           68040      25      4/4        12.2    9.3     Jun93 DECinfo
HP 730            PA1.1      66      128/256    47.8    75.4    May92 c.s.sun.hw
HP 7[35]5         PA7100     99      256/256    109.1   167.9   Jan94 HP
HP 7[35]5/125     PA7150     125     256/256    136     201     Apr94 HP
HP 750            PA1.1      66      256/256    48.1    75.0    Oct92 c.arch
-- -- SGI (fastest machines only here)
SGI PowerChl,Onyx R8000      75      4M+16/16   108.7   310.6   Jun94 c.arch
SGI Indigo2       R4400      100/200 1M+16/16   119     131     Nov94 SGI PdcTbl
SGI PowerIndigo2  R8000      75      2M+16/16   107     265     Nov94 SGI PdcTbl
SGI IndySC        R4600      44/133  512+16/16  113.5   73.7    Feb95 SGI anno
SGI IndySC        R4400      44/175  1M+16/16   122.6   115.5   Feb95 SGI anno
-- -- Sun Sparc (inc Hypersparc)
Sun SS/IPC        FJMB86902  25      64         13.8    11.1    Nov92 Sunflash
Sun SS/IPX        FJMB86903  40      64         21.8    21.5    Nov92 Sunflash
Sun SS2           RT601      40      64         21.8    22.8    Oct92 c.arch
Sun SS2/PowerUp   WeitekPwUP 40/80   16/8       32.2    31.1    Jun93 c.s.sun.an
Sun SS10/20       SuprSP     33      20/16      39.8    46.6    Nov92 Sunflash
Sun SS10/30       SuprSP     36      20/16      45.2    54.0    Apr93 Cockcroft
Sun SS10/40       SuprSP     40      20/16      50.2    60.2    Apr93 Sunflash
Sun SS10/41       SuprSP     40/40.3 1M+20/16   53.2    67.8    Apr93 Cockcroft
Sun SS10/51       SuprSP     40/50   1M+20/16   65.2    83.0    Apr93 Sunflash
Sun Classic,LX    MicroSP    50      4/2        26.4    21.0    Nov92 Sunflash
Sun Voyager       MicroSP2   60      16/8       43.2    36.2    Mar94 Sun
Sun SS4/70        MicroSP2   70      16/8       59.6    46.8    Jan95 Sunflash
Sun SS4/85        MicroSP2   85      16/8       65.3    53.1    May95 SunIntro
Sun SS5/70        MicroSP2   70      16/8       57.0    47.3    Mar94 Sunflash
Sun SS5/85        MicroSP2   85      16/8       65.3    53.1    May95 SunIntro
Sun SS5/110       MicroSP2   110     16/8       78.6    65.3    May95 SunIntro
Sun SS20/50       SuprSP     50      20/16      76.9    80.1    May95 SunIntro
Sun SS20/51       SuprSP     40/50   1M+20/16   81.8    89.0    May95 SunIntro
Sun SS20/61       SuprSP     50/60   1M+20/16   98.2    107.2   May95 SunIntro
Sun SS20/71       SuprSP2    50/75   1M+20/16   125.8   121.2   Jan95 SunIntro
Sun SS20/612      2xSuprSP   50/60   1M+20/16   ?       127.1   Sep94 SPEC newsl
Sun SS20/HS11     HyperSP    50/100  256+8/0    104.5   127.6   Nov94 SunIntro
Sun SS20/HS21     HyperSP    50/125  256+8/0    131.2   153.0   May95 SunIntro
-- -- Ross hypersparc
RT 100S-66        HyperSP    40/66   256+8/0    67      87      Aug94 Ross
RT 100S-72        HyperSP    40/72   256+8/0    75      96      Aug94 Ross
RT 200S-66        HyperSP    50/66   256+8/0    72      94      Aug94 Ross
RT 200S-72        HyperSP    50/72   256+8/0    80      105     Aug94 Ross
RT 200S-90        HyperSP    50/90   256+8/0    101     120     Apr95 Ross
RT 200S-110       HyperSP    50/110  256+8/0    122     142     Apr95 Ross
-- -- Intel pentium
Intel Xpress      Pentium    60/90   512+8/8    106.5   81.4    Mar95 www.intel
Intel Xpress      Pentium    60/90   1M+8/8     110.1   84.4    Mar95 www.intel
> It
> really is a pity that, as recent postings reveal, there
> is no pop11 available at the moment for Window's users.
I understand that NT Poplog should be usable with Windows, in 32bit
compatibility mode. But I don't know if anyone has tried it.
Aaron